awk...wardddd
Challenge
Content preserved from the original writeup source. Minimal normalization was applied to fit platform format.
Solution
Original Writeup Content (Preserved)
awk...wardddd Writeup
Challenge
- Name: awk...wardddd
- Points: 100
- Difficulty: Medium-Hard
Prompt hint:
- "Most of the contents appear to be redundant or stale"
- "a few records still reflect the system’s original processing format"
- "Focus on what remains consistent"
Objective
Recover the real flag from a noisy recovered directory where many records are decoys.
Environment
- Root: s0rry_in_4dv4nc3
- High-noise data spread across: archive, logs, tmp, users
- File types include: .rec, .txt, .dat, .log, .cache, .tmp
Recon Strategy
The key was to avoid solving by extension and instead solve by consistency:
- Measure file-type distribution to understand noise shape.
- Sample file contents across extensions.
- Identify fixed record schema.
- Detect obvious decoys.
- Hunt for the smallest, most stable family of records.
- Reassemble ordered payload and decode.
Step 1: Validate General Structure
Most files, regardless of extension, contained key-value records with fields like:
- timestamp
- profile
- uid
- state
- part
- data
- note
- sometimes comment
Example structure found repeatedly:
timestamp=185793
profile=alpha
uid=6474
state=pending
part=13
data=xKCVMhWYXq=Q
note=-xJ639P7U6EhUQXVmmTziSFWd
This showed that extensions were mostly cosmetic and could not be trusted alone.
Step 2: Identify Decoy Pattern
Some records contained:
comment=VURDVEZ7ZmFrZV9mbGFnfQ==
Decoding gives:
`UDCTF{fake_flag}`
This is a deliberate trap.
Step 3: Follow the Hint (Consistency)
The useful anomaly was a tiny set of files following a stable naming convention:
- archive/sys_we9bk_04.rec
- archive/sys_eXAuo_05.rec
- logs/sys_gL6JX_02.rec
- logs/sys_tBxJ4_06.rec
- tmp/sys_t-Bfc_03.rec
- tmp/sys_lYOTr_07.rec
- users/sys_FtSns_01.rec
Unlike random names everywhere else, these all:
- use sys_* naming
- end in ordered part-like suffixes
- contain the same semantic tags:
- profile=delta
- state=active
- note=retained
These are the "original processing format" remnants referenced by the prompt.
Step 4: Extract Ordered Data Shards
Each sys record has:
- part=NN
data=<base64-like chunk>
Collected chunks ordered by part:
- part=01 -> VURDVEZ7
- part=02 -> dzNsbF83
- part=03 -> aDQ3X3c0
- part=04 -> NW4nN183
- part=05 -> MF9oNHJk
- part=06 -> X3c0NV8x
- part=07 -> Nz99
Concatenated payload:
VURDVEZ7dzNsbF83aDQ3X3c0NW4nN183MF9oNHJkX3c0NV8xNz99
Base64 decode result:
`UDCTF{w3ll_7h47_w45n'7_70_h4rd_w45_17?}`
Final Flag
`UDCTF{w3ll_7h47_w45n'7_70_h4rd_w45_17?}`
Reproducible Code
One-liner (bash)
Run from s0rry_in_4dv4nc3:
find . -type f -name 'sys_*.rec' | while read -r f; do
p=$(awk -F= '$1=="part"{print $2}' "$f")
d=$(awk -F= '$1=="data"{print $2}' "$f")
echo "$p $d"
done | sort | awk '{printf "%s",$2} END{print ""}' | base64 -D
Expected output:
`UDCTF{w3ll_7h47_w45n'7_70_h4rd_w45_17?}`
Python Reassembler (clean and portable)
#!/usr/bin/env python3
from pathlib import Path
import base64
root = Path(".")
chunks = {}
for path in root.rglob("sys_*.rec"):
fields = {}
for line in path.read_text(encoding="utf-8").splitlines():
if "=" in line:
k, v = line.split("=", 1)
fields[k.strip()] = v.strip()
# optional sanity check on stable metadata
if fields.get("profile") != "delta":
continue
if fields.get("state") != "active":
continue
if fields.get("note") != "retained":
continue
part = int(fields["part"])
chunks[part] = fields["data"]
payload = "".join(chunks[i] for i in sorted(chunks))
flag = base64.b64decode(payload).decode("utf-8")
print("payload:", payload)
print("flag:", flag)
Why This Works
- The challenge intentionally floods the corpus with plausible-looking records.
- Extension-based filtering fails because noise shares the same schema.
- Decoy comments provide fake flag bait.
- The only robust signal is consistency of naming + metadata + ordered parts.
- Reassembling by part and decoding yields the authentic result.
Analyst Notes
If automating similar CTFs:
- rank groups by naming regularity and metadata stability
- prefer smallest coherent clusters over majority patterns
- verify byte-accurate decoded output when terminal prompt artifacts appear