awk...wardddd

awk...wardddd Writeup

Challenge

  • Name: awk...wardddd
  • Points: 100
  • Difficulty: Medium-Hard

Prompt hint:

  • "Most of the contents appear to be redundant or stale"
  • "a few records still reflect the system’s original processing format"
  • "Focus on what remains consistent"

Objective

Recover the real flag from a noisy recovered directory where many records are decoys.

Environment

  • Root: s0rry_in_4dv4nc3
  • High-noise data spread across: archive, logs, tmp, users
  • File types include: .rec, .txt, .dat, .log, .cache, .tmp

Recon Strategy

The key was to filter by consistency rather than by file extension:

  1. Measure file-type distribution to understand noise shape.
  2. Sample file contents across extensions.
  3. Identify fixed record schema.
  4. Detect obvious decoys.
  5. Hunt for the smallest, most stable family of records.
  6. Reassemble ordered payload and decode.
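Step 1 of the list above can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used during the solve; the function name `extension_histogram` is hypothetical.

```python
from collections import Counter
from pathlib import Path


def extension_histogram(root):
    """Count regular files under root by lowercase suffix ('' when there is none)."""
    return Counter(
        p.suffix.lower() for p in Path(root).rglob("*") if p.is_file()
    )


def format_histogram(histogram):
    """Render the counts as text lines, most common suffix first."""
    return [
        f"{suffix or '(none)':>8} {count}"
        for suffix, count in histogram.most_common()
    ]
```

Calling `format_histogram(extension_histogram("."))` from the challenge root gives a quick view of how the noise is spread across extensions.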

Step 1: Validate General Structure

Most files, regardless of extension, contained key-value records with fields like:

  • timestamp
  • profile
  • uid
  • state
  • part
  • data
  • note
  • sometimes comment

Example structure found repeatedly:

timestamp=185793
profile=alpha
uid=6474
state=pending
part=13
data=xKCVMhWYXq=Q
note=-xJ639P7U6EhUQXVmmTziSFWd

This showed that extensions were mostly cosmetic and could not be trusted alone.
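Because values such as data can themselves contain "=", a parser for this schema should split on the first "=" only. A minimal sketch, using the sample record above; the helper name `parse_record` is hypothetical:

```python
def parse_record(text):
    """Parse key=value lines into a dict, splitting on the first '=' only."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            fields[key.strip()] = value.strip()
    return fields


# Sample record copied from the writeup.
sample = """timestamp=185793
profile=alpha
uid=6474
state=pending
part=13
data=xKCVMhWYXq=Q
note=-xJ639P7U6EhUQXVmmTziSFWd
"""

record = parse_record(sample)
print(record["data"])  # xKCVMhWYXq=Q (the embedded '=' is preserved)
```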

Step 2: Identify Decoy Pattern

Some records contained:

comment=VURDVEZ7ZmFrZV9mbGFnfQ==

Decoding gives:

`UDCTF{fake_flag}`

This is a deliberate trap.
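The decoy can be confirmed by decoding the comment value directly. A minimal sketch; the helper name `decode_comment` is hypothetical, and strict validation is an added assumption so that non-Base64 comments are reported as None instead of raising:

```python
import base64
import binascii


def decode_comment(value):
    """Return the decoded text of a Base64 comment, or None if it is not valid Base64/UTF-8."""
    try:
        return base64.b64decode(value, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return None


decoy = decode_comment("VURDVEZ7ZmFrZV9mbGFnfQ==")
print(decoy)  # UDCTF{fake_flag}
```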

Step 3: Follow the Hint (Consistency)

The useful anomaly was a tiny set of files following a stable naming convention:

  • archive/sys_we9bk_04.rec
  • archive/sys_eXAuo_05.rec
  • logs/sys_gL6JX_02.rec
  • logs/sys_tBxJ4_06.rec
  • tmp/sys_t-Bfc_03.rec
  • tmp/sys_lYOTr_07.rec
  • users/sys_FtSns_01.rec

Unlike random names everywhere else, these all:

  • use sys_* naming
  • end in ordered part-like suffixes
  • contain the same semantic tags:
    • profile=delta
    • state=active
    • note=retained

These are the "original processing format" remnants referenced by the prompt.
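One way to surface such a family without knowing its name in advance is to group records by the metadata they share and look at the groups that repeat. The sketch below is illustrative only: the function names are hypothetical, the two sys entries come from the list above, and the noise entries (names and field values) are made up, on the assumption that noise records carry effectively random notes.

```python
from collections import defaultdict


def cluster_by_metadata(records, keys=("profile", "state", "note")):
    """Group record names by the tuple of values they hold for the given keys."""
    clusters = defaultdict(list)
    for name, fields in records.items():
        signature = tuple(fields.get(k) for k in keys)
        clusters[signature].append(name)
    return clusters


def repeated_clusters(clusters):
    """Return (signature, names) pairs with more than one member, smallest first."""
    repeated = [(sig, names) for sig, names in clusters.items() if len(names) > 1]
    return sorted(repeated, key=lambda item: len(item[1]))


records = {
    "users/sys_FtSns_01.rec": {"profile": "delta", "state": "active", "note": "retained"},
    "logs/sys_gL6JX_02.rec": {"profile": "delta", "state": "active", "note": "retained"},
    # Hypothetical noise records, invented for illustration.
    "tmp/noise_a.dat": {"profile": "alpha", "state": "pending", "note": "q1"},
    "tmp/noise_b.log": {"profile": "beta", "state": "pending", "note": "q2"},
}

for signature, names in repeated_clusters(cluster_by_metadata(records)):
    print(signature, sorted(names))
```

In this toy input only the delta/active/retained signature repeats, which mirrors the hint to focus on what remains consistent.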

Step 4: Extract Ordered Data Shards

Each sys record has:

  • part=NN
  • data=<base64-like chunk>

Collected chunks ordered by part:

  1. part=01 -> VURDVEZ7
  2. part=02 -> dzNsbF83
  3. part=03 -> aDQ3X3c0
  4. part=04 -> NW4nN183
  5. part=05 -> MF9oNHJk
  6. part=06 -> X3c0NV8x
  7. part=07 -> Nz99

Concatenated payload:

VURDVEZ7dzNsbF83aDQ3X3c0NW4nN183MF9oNHJkX3c0NV8xNz99

Base64 decode result:

`UDCTF{w3ll_7h47_w45n'7_70_h4rd_w45_17?}`
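The decode can be checked offline with the shards hard-coded. Every shard length is a multiple of 4, so each one is also valid Base64 on its own, which makes it easy to see which fragment of the flag each part contributes. A minimal sketch (variable names are illustrative):

```python
import base64

# Shards copied from the ordered list above, keyed by part number.
shards = {
    1: "VURDVEZ7",
    2: "dzNsbF83",
    3: "aDQ3X3c0",
    4: "NW4nN183",
    5: "MF9oNHJk",
    6: "X3c0NV8x",
    7: "Nz99",
}

# Decode the concatenated payload in part order.
payload = "".join(shards[i] for i in sorted(shards))
whole = base64.b64decode(payload).decode("utf-8")

# Decode each shard separately and confirm the pieces join to the same text.
pieces = [base64.b64decode(shards[i]).decode("utf-8") for i in sorted(shards)]
assert "".join(pieces) == whole

print(pieces)
print(whole)
```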

Final Flag

`UDCTF{w3ll_7h47_w45n'7_70_h4rd_w45_17?}`

Reproducible Code

One-liner (bash)

Run from s0rry_in_4dv4nc3:

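# --decode is accepted by both GNU coreutils and macOS base64
find . -type f -name 'sys_*.rec' | while read -r f; do
  p=$(awk -F= '$1=="part"{print $2}' "$f")
  # strip only the "data=" prefix so any "=" padding in the value survives
  d=$(awk '/^data=/{sub(/^data=/,""); print}' "$f")
  echo "$p $d"
done | sort -n | awk '{printf "%s",$2} END{print ""}' | base64 --decode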

Expected output:

`UDCTF{w3ll_7h47_w45n'7_70_h4rd_w45_17?}`

Python Reassembler (clean and portable)

#!/usr/bin/env python3
from pathlib import Path
import base64

root = Path(".")
chunks = {}

for path in root.rglob("sys_*.rec"):
    # parse key=value lines, splitting on the first "=" only
    fields = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if "=" in line:
            k, v = line.split("=", 1)
            fields[k.strip()] = v.strip()

    # optional sanity check on stable metadata
    if fields.get("profile") != "delta":
        continue
    if fields.get("state") != "active":
        continue
    if fields.get("note") != "retained":
        continue

    part = int(fields["part"])
    chunks[part] = fields["data"]

# join the shards in part order, then decode
payload = "".join(chunks[i] for i in sorted(chunks))
flag = base64.b64decode(payload).decode("utf-8")

print("payload:", payload)
print("flag:", flag)

Why This Works

  • The challenge intentionally floods the corpus with plausible-looking records.
  • Extension-based filtering fails because noise shares the same schema.
  • Decoy comments provide fake flag bait.
  • The only robust signal is consistency of naming + metadata + ordered parts.
  • Reassembling by part and decoding yields the authentic result.

Analyst Notes

If automating similar CTFs:

  • rank groups by naming regularity and metadata stability
  • prefer smallest coherent clusters over majority patterns
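  • verify the decoded output byte for byte (for example, with a hex dump) when the terminal adds prompt artifacts, such as the % marker zsh prints after output that lacks a trailing newline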