Wiretap
Solution
conneticut-wiretap - Writeup
Challenge
- Name: Wiretap
- Category: Forensics / Audio
- Points: 998
- Author: hypnos
- Artifact: beep_beep_boop.wav
- Provided SHA1: fb8ef1616ef3e993e81d7f23f9d56b76d51175be
Final Flag
CIT{g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t}
TL;DR
The WAV contains a modem-style FSK stream. Demodulating it recovers an HTTP response with an HTML page. Inside the page are 3 embedded pixel-art SVG strips that decode to:
g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t
Wrap it in the CIT{...} format to get the flag.
1) Verify integrity
cd /Users/elenaeftimie/Desktop/CTFs/hens_ctf/conneticut-wiretap
shasum -a 1 beep_beep_boop.wav
Expected:
fb8ef1616ef3e993e81d7f23f9d56b76d51175be beep_beep_boop.wav
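If shasum is not available, the same check can be scripted. This is a minimal sketch using only the standard library; the helper name sha1_of_file is made up for illustration, and the expected digest is the one given with the challenge.

```python
import hashlib
from pathlib import Path

# Digest provided with the challenge artifact.
EXPECTED_SHA1 = "fb8ef1616ef3e993e81d7f23f9d56b76d51175be"


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


wav = Path("beep_beep_boop.wav")
if wav.exists():  # only runs when the artifact is in the current directory
    actual = sha1_of_file(wav)
    print("OK" if actual == EXPECTED_SHA1 else f"MISMATCH: {actual}")
```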
2) Inspect basic audio metadata
file beep_beep_boop.wav
soxi beep_beep_boop.wav
Observed:
- RIFF/WAV PCM
- mono, 44100 Hz, 16-bit
- duration around 3:57
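The same fields can be read without soxi. A small sketch with Python's built-in wave module; wav_info is a hypothetical helper name, and it only handles plain PCM WAV files like this one.

```python
import os
import wave


def wav_info(source):
    """Read basic PCM metadata from a WAV path or an open binary file object."""
    with wave.open(source, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "sample_rate": rate,
            "bits": w.getsampwidth() * 8,   # sample width is reported in bytes
            "duration_s": frames / rate,
        }


if os.path.exists("beep_beep_boop.wav"):
    print(wav_info("beep_beep_boop.wav"))
```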
3) Visualize frequencies (spectrogram)
ffmpeg -y -i beep_beep_boop.wav -lavfi showspectrumpic=s=2048x1024:legend=1 spectrogram.png
This helps confirm narrow-band telephone/modem-like tones.
4) Demodulate FSK payload
A custom script was used to demodulate the modem-like stream and carve bytes from the recovered UART frames.
Run:
python3 decode_fsk.py
This writes:
decoded_raw.bin
decoded_carved.bin
Quick check:
strings -n 6 decoded_carved.bin | head -n 80
You should see HTTP headers, e.g.:
HTTP/1.0 200 OK
Server: Apache/1.3.6 (Unix)
Content-Type: text/html
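decode_fsk.py itself is not reproduced here. The sketch below shows the general technique only: compare tone energy per bit period (Goertzel), then unpack 8N1 UART frames. Everything in it is an assumption for illustration: the function names, the 300 baud rate, and the 1270/1070 Hz mark/space pair (Bell 103 originate values used as placeholders). Read the real tones and bit rate off the spectrogram before reusing it. It runs on a synthetic signal, not on the challenge file.

```python
import math

# Placeholder parameters (assumptions, not measured from the challenge audio).
RATE, BAUD, MARK_HZ, SPACE_HZ = 44100, 300, 1270.0, 1070.0


def goertzel_power(samples, rate, freq):
    """Signal power near `freq`, computed with the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def fsk_to_bits(samples, rate, baud, mark_hz, space_hz):
    """One decision per bit period: 1 if the mark tone is stronger, else 0."""
    bits = []
    for k in range(int(len(samples) * baud / rate)):
        window = samples[round(k * rate / baud):round((k + 1) * rate / baud)]
        mark = goertzel_power(window, rate, mark_hz)
        space = goertzel_power(window, rate, space_hz)
        bits.append(1 if mark > space else 0)
    return bits


def uart_bytes(bits):
    """Decode 8N1 frames: start bit 0, eight data bits LSB first, stop bit 1."""
    out = bytearray()
    i = 0
    while i + 10 <= len(bits):
        if bits[i] == 0 and bits[i + 9] == 1:
            out.append(sum(b << n for n, b in enumerate(bits[i + 1:i + 9])))
            i += 10
        else:
            i += 1  # idle line or framing error: slide forward by one bit
    return bytes(out)


def synth_fsk(payload, rate, baud, mark_hz, space_hz):
    """Test-signal generator: frame `payload` as 8N1 and FSK-modulate it."""
    bits = [1] * 4  # idle (mark) before the first start bit
    for byte in payload:
        bits += [0] + [(byte >> n) & 1 for n in range(8)] + [1]
    samples, phase = [], 0.0
    for k, bit in enumerate(bits):
        count = round((k + 1) * rate / baud) - round(k * rate / baud)
        step = 2.0 * math.pi * (mark_hz if bit else space_hz) / rate
        for _ in range(count):
            samples.append(math.sin(phase))
            phase += step  # continuous phase across bit boundaries
    return samples


signal = synth_fsk(b"HTTP/1.0 200 OK", RATE, BAUD, MARK_HZ, SPACE_HZ)
print(uart_bytes(fsk_to_bits(signal, RATE, BAUD, MARK_HZ, SPACE_HZ)))
```

The synthetic signal is bit-aligned from sample 0. A real capture is not, so a working decoder also needs to locate start-bit edges (or oversample the decisions) before framing, and to skip the modem handshake that precedes the data.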
5) Reconstruct HTML body
python3 -c "from pathlib import Path; b=Path('decoded_carved.bin').read_bytes(); i=b.find(b'\r\n\r\n'); Path('decoded.html').write_bytes(b[i+4:] if i!=-1 else b)"
Generated:
decoded.html
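To also look at the headers rather than only dropping them, the split can be written out as a function. A sketch under the assumption that the carved stream starts at the status line and uses CRLF line endings; split_http_response is a hypothetical name. Like the one-liner, it returns the input unchanged as the body when no blank line is found.

```python
def split_http_response(raw):
    """Split raw bytes into (status_line, headers, body).

    Header names are lower-cased. If there is no CRLF CRLF separator,
    the whole input is returned as the body.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        return "", {}, raw
    lines = head.decode("latin-1").split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:  # ignore malformed header lines
            headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


sample = (b"HTTP/1.0 200 OK\r\nServer: Apache/1.3.6 (Unix)\r\n"
          b"Content-Type: text/html\r\n\r\n<html></html>")
status, headers, body = split_http_response(sample)
print(status, headers, body)
```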
6) Extract embedded SVG strips
The recovered HTML contains three inline SVG pixel strips.
Run:
python3 extract_svgs.py
Generated:
strip1.svg
strip2.svg
strip3.svg
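extract_svgs.py is not shown in this writeup. One way such a script might work is a regex carve of each inline svg element, sketched below. The function names are hypothetical. The xmlns fix-up is there because inline SVG in HTML may omit the namespace, while a standalone .svg file generally needs it to render.

```python
import re
from pathlib import Path

SVG_RE = re.compile(rb"<svg\b.*?</svg>", re.DOTALL | re.IGNORECASE)
SVG_NS = b' xmlns="http://www.w3.org/2000/svg"'


def extract_svgs(html):
    """Return every inline <svg>...</svg> element found in the HTML bytes."""
    return SVG_RE.findall(html)


def ensure_xmlns(svg):
    """Add the SVG namespace to the root tag if the opening tag lacks it."""
    head_end = svg.find(b">")
    if b"xmlns=" in svg[:head_end]:
        return svg
    return svg[:4] + SVG_NS + svg[4:]  # svg[:4] is the literal "<svg"


def write_strips(html, out_dir="."):
    """Write each carved element to strip1.svg, strip2.svg, ... and return the paths."""
    paths = []
    for n, svg in enumerate(extract_svgs(html), start=1):
        path = Path(out_dir) / f"strip{n}.svg"
        path.write_bytes(ensure_xmlns(svg))
        paths.append(path)
    return paths


src = Path("decoded.html")
if src.exists():  # only runs when the recovered page is present
    print(write_strips(src.read_bytes()))
```

A regex is enough here because the elements are not nested; for nested or malformed markup an HTML parser would be the safer choice.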
7) Render SVGs for manual reading
rsvg-convert -w 1200 strip1.svg -o strip1.png
rsvg-convert -w 1200 strip2.svg -o strip2.png
rsvg-convert -w 1200 strip3.svg -o strip3.png
(Optional OCR, can be noisy on tiny retro fonts):
tesseract strip1.png stdout --psm 7
tesseract strip2.png stdout --psm 7
tesseract strip3.png stdout --psm 7
8) Decode the SVG text
Reading the three pixel strips yields:
g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t
Apply flag format:
CIT{g3t_0ff_th3_ph0n3_1m_0n_th3_1ntern3t}
Notes
- A close typo is 1nern3t; the correct ending is 1ntern3t.
- The key insight is to treat the audio as data, not speech.