
Wiretap

conneticut-wiretap - Writeup

Challenge

  • Name: Wiretap
  • Category: Forensics / Audio
  • Points: 998
  • Author: hypnos
  • Artifact: beep_beep_boop.wav
  • Provided SHA1: fb8ef1616ef3e993e81d7f23f9d56b76d51175be

Final Flag

CIT{g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t}

TL;DR

The WAV contains a modem-style FSK stream. Demodulating it recovers an HTTP response carrying an HTML page. The page embeds three pixel-art SVG strips that together spell:

g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t

Wrap this string in the CIT{...} flag format to get the flag.


1) Verify integrity

cd /Users/elenaeftimie/Desktop/CTFs/hens_ctf/conneticut-wiretap
shasum -a 1 beep_beep_boop.wav

Expected:

fb8ef1616ef3e993e81d7f23f9d56b76d51175be  beep_beep_boop.wav
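
The same check can be scripted. The following is a minimal sketch, not part of the original solve; the function names are illustrative, and the expected digest is the one given in the challenge description.

```python
import hashlib

# SHA-1 published with the challenge artifact.
EXPECTED_SHA1 = "fb8ef1616ef3e993e81d7f23f9d56b76d51175be"


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA1):
    """True if the file's SHA-1 equals the expected digest."""
    return sha1_of_file(path) == expected.lower()
```

Calling matches_expected("beep_beep_boop.wav") should return True for an unmodified download.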

2) Inspect basic audio metadata

file beep_beep_boop.wav
soxi beep_beep_boop.wav

Observed:

  • RIFF/WAV PCM
  • mono, 44100 Hz, 16-bit
  • duration around 3:57
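
If soxi is not installed, Python's standard wave module reports the same fields. This is a small sketch with a hypothetical helper name, wav_info; it assumes an uncompressed PCM WAV, which matches the observation above.

```python
import wave


def wav_info(path):
    """Return basic PCM parameters of a WAV file as a dict."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }
```

For this artifact the result should show one channel, 44100 Hz, 16 bits, and a duration of roughly 237 seconds.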

3) Visualize frequencies (spectrogram)

ffmpeg -y -i beep_beep_boop.wav -lavfi showspectrumpic=s=2048x1024:legend=1 spectrogram.png

The spectrogram helps confirm that the signal consists of narrow-band, telephone-modem-like tones rather than speech.
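
A rough numeric cross-check of the spectrogram is to estimate the dominant frequency of short windows from their zero crossings. For a two-tone FSK signal the estimates should cluster around two values. This is only a sketch: the function names and the 20 ms default window (882 samples at 44100 Hz) are assumptions, and the method is reliable only when a single tone dominates each window.

```python
def zero_crossing_freq(samples, sample_rate):
    """Estimate the dominant frequency of a window from its sign changes."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    duration_s = len(samples) / sample_rate
    # A sine wave crosses zero twice per cycle.
    return crossings / (2.0 * duration_s)


def window_freqs(samples, sample_rate, window=882):
    """Return one frequency estimate per non-overlapping window."""
    return [
        zero_crossing_freq(samples[i:i + window], sample_rate)
        for i in range(0, len(samples) - window + 1, window)
    ]
```

A histogram of the values returned by window_freqs gives candidate mark and space frequencies to try in the demodulator.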

4) Demodulate FSK payload

A custom script, decode_fsk.py, demodulates the modem-like stream and carves bytes from the recovered UART frames.

Run:

python3 decode_fsk.py

This writes:

  • decoded_raw.bin
  • decoded_carved.bin
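
The source of decode_fsk.py is not reproduced in this writeup. The sketch below shows one common way to do the two steps it describes: per-bit tone comparison with the Goertzel recurrence, then 8N1 UART framing (start bit 0, eight data bits LSB first, stop bit 1). The baud rate, mark and space frequencies, and the 8N1 framing are all assumptions to be tuned against the actual signal, and the sketch assumes the bit clock is aligned with the first sample, which a real capture will not guarantee.

```python
import math


def tone_power(samples, sample_rate, freq_hz):
    """Signal power near freq_hz, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s1, s2 = x + coeff * s1 - s2, s1
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def fsk_to_bits(samples, sample_rate, baud, mark_hz, space_hz):
    """Slice samples into bit periods; emit 1 where the mark tone dominates."""
    bits = []
    n_bits = len(samples) * baud // sample_rate
    for i in range(n_bits):
        start = round(i * sample_rate / baud)
        end = round((i + 1) * sample_rate / baud)
        window = samples[start:end]
        mark = tone_power(window, sample_rate, mark_hz)
        space = tone_power(window, sample_rate, space_hz)
        bits.append(1 if mark > space else 0)
    return bits


def uart_decode(bits, data_bits=8):
    """Decode asynchronous frames (assumed 8N1) from a list of 0/1 bits."""
    out = bytearray()
    i = 0
    while i + data_bits + 2 <= len(bits):
        # A valid frame has a 0 start bit and a 1 stop bit.
        if bits[i] == 0 and bits[i + data_bits + 1] == 1:
            value = 0
            for k in range(data_bits):
                value |= bits[i + 1 + k] << k  # LSB is sent first
            out.append(value)
            i += data_bits + 2
        else:
            i += 1  # idle line or framing error: resynchronize
    return bytes(out)
```

A production decoder would add band-pass filtering and clock recovery; this version is meant only to make the structure of the step concrete.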

Quick check:

strings -n 6 decoded_carved.bin | head -n 80

You should see HTTP headers, e.g.:

  • HTTP/1.0 200 OK
  • Server: Apache/1.3.6 (Unix)
  • Content-Type: text/html
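
Carving can be as simple as locating the HTTP status line in the demodulated bytes and splitting the header block from the body. The sketch below is an assumption about what the carving step does, not the original script; both function names are illustrative.

```python
def carve_http_response(raw: bytes) -> bytes:
    """Return raw from the first HTTP status line onward, or b'' if absent."""
    start = raw.find(b"HTTP/1.")
    return raw[start:] if start != -1 else b""


def split_http_response(resp: bytes):
    """Split a response into (status_line, headers dict, body bytes)."""
    head, _, body = resp.partition(b"\r\n\r\n")
    lines = head.decode("latin-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body
```

Header names are lower-cased so that lookups such as headers["content-type"] do not depend on the server's capitalization.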

5) Reconstruct HTML body

python3 -c "from pathlib import Path; b=Path('decoded_carved.bin').read_bytes(); i=b.find(b'\r\n\r\n'); Path('decoded.html').write_bytes(b[i+4:] if i!=-1 else b)"

Generated:

  • decoded.html

6) Extract embedded SVG strips

The recovered HTML contains three inline SVG pixel strips.

Run:

python3 extract_svgs.py

Generated:

  • strip1.svg
  • strip2.svg
  • strip3.svg
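
extract_svgs.py is not shown in this writeup. A minimal equivalent, assuming the SVGs appear as plain inline <svg>...</svg> elements, could look like the sketch below. The function names are hypothetical. Because inline SVG in HTML often omits the xmlns attribute and some standalone renderers expect it, the sketch adds it when missing.

```python
import re
from pathlib import Path

SVG_RE = re.compile(rb"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)
SVG_NS = b' xmlns="http://www.w3.org/2000/svg"'


def find_svgs(html: bytes):
    """Return every inline <svg>...</svg> element found in the HTML."""
    return SVG_RE.findall(html)


def ensure_xmlns(svg: bytes) -> bytes:
    """Add the SVG namespace to the root tag if it is not declared."""
    head_end = svg.find(b">")
    if b"xmlns=" in svg[:head_end]:
        return svg
    return svg[:4] + SVG_NS + svg[4:]  # svg[:4] is the "<svg" prefix


def write_svgs(html: bytes, out_dir=".", prefix="strip"):
    """Write each SVG to <prefix>N.svg and return the written paths."""
    paths = []
    for n, svg in enumerate(find_svgs(html), start=1):
        path = Path(out_dir) / f"{prefix}{n}.svg"
        path.write_bytes(ensure_xmlns(svg))
        paths.append(path)
    return paths
```

Running write_svgs(Path("decoded.html").read_bytes()) on the recovered page should produce strip1.svg through strip3.svg if the page matches these assumptions.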

7) Render SVGs for manual reading

rsvg-convert -w 1200 strip1.svg -o strip1.png
rsvg-convert -w 1200 strip2.svg -o strip2.png
rsvg-convert -w 1200 strip3.svg -o strip3.png

Optional: run OCR on the renders. Results can be noisy on tiny retro pixel fonts:

tesseract strip1.png stdout --psm 7
tesseract strip2.png stdout --psm 7
tesseract strip3.png stdout --psm 7
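
When OCR misreads the glyphs, an alternative is to print the pixel grid as text and read it in a terminal. The sketch below assumes each lit pixel is a <rect> element with numeric x, y, width, and height attributes on a regular grid; the writeup does not describe the SVG structure, so this may need adapting (for example, to skip a background rectangle or to filter by fill color). The function name and the cell parameter are illustrative.

```python
import xml.etree.ElementTree as ET


def rects_to_ascii(svg_text, cell=1.0, on="#", off="."):
    """Render the <rect> elements of an SVG as a character grid."""
    root = ET.fromstring(svg_text)
    cells = set()
    for el in root.iter():
        # Strip any "{namespace}" prefix from the tag name.
        if el.tag.rsplit("}", 1)[-1] != "rect":
            continue
        x = float(el.get("x", 0))
        y = float(el.get("y", 0))
        w = float(el.get("width", 0))
        h = float(el.get("height", 0))
        # Mark every grid cell the rectangle covers.
        for cy in range(round(y / cell), round((y + h) / cell)):
            for cx in range(round(x / cell), round((x + w) / cell)):
                cells.add((cx, cy))
    if not cells:
        return ""
    max_x = max(cx for cx, _ in cells)
    max_y = max(cy for _, cy in cells)
    rows = [
        "".join(on if (cx, cy) in cells else off for cx in range(max_x + 1))
        for cy in range(max_y + 1)
    ]
    return "\n".join(rows)
```

Set cell to the pixel size used by the SVG (for example, 10 if each pixel is a 10x10 rectangle) so that one rectangle maps to one character.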

8) Decode the SVG text

Reading the three pixel strips yields:

g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t

Apply flag format:

CIT{g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t}


Notes

  • An easy transcription mistake is reading the last word as 1nern3t; the correct ending is 1ntern3t.
  • The key insight is to treat the audio as data, not speech.