Extracting Game Music from an Encrypted Archive

April 19, 2026

Years ago, I was obsessed with a hidden-object puzzle game series. The kind you'd play on a lazy afternoon — beautiful hand-painted scenes, clever puzzles, and most importantly, a soundtrack that lived rent-free in my head for years afterward.

The problem? The music was never released separately. No OST on Steam, no album on Bandcamp, nothing. The tracks only existed inside the game's resource archive — a single game.dat file that bundled every asset the game needed.

I tried the obvious things first. Searched forums, checked modding communities, looked for existing extractors. Nothing worked for this particular format. So I opened the file in a hex editor and started reading bytes.

This is the story of how I got those tracks out.

The First Clue: Repeating Bytes

The file started with seemingly random data, but one pattern jumped out immediately. Starting at the fifth byte (offset 4), there was a run of identical bytes: 0xF7, over and over.

Offset 00: 37 BD 37 4D F7 F7 F7 F7 F7 ...

Most binary formats fill padding and reserved fields with zeros. When those bytes hold some other constant value instead, a likely explanation is XOR encryption with a fixed single-byte key: XOR 0x00 with any key K, and you get K back.

Five consecutive 0xF7 bytes where nulls should be. The key had to be 0xF7.
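
The same observation can be automated. Here is a minimal sketch, assuming the plaintext contains enough zero padding that the key is the most frequent byte in the sample. The function name guess_xor_key and the 4096-byte sample size are illustrative choices, and the heuristic will fail on archives whose opening bytes are not dominated by zeros.

```python
from collections import Counter

def guess_xor_key(data: bytes, sample_size: int = 4096) -> int:
    """Return the most common byte in the sample as a candidate single-byte XOR key.

    Assumes the plaintext is dominated by 0x00 padding, so the most
    frequent ciphertext byte equals the key (0x00 ^ K == K).
    """
    sample = data[:sample_size]
    if not sample:
        raise ValueError("no data to analyse")
    byte_value, _count = Counter(sample).most_common(1)[0]
    return byte_value

# The nine encrypted bytes from the hex dump above.
header = bytes.fromhex("37bd374df7f7f7f7f7")
key = guess_xor_key(header)
print(hex(key))  # 0xf7
```

A candidate found this way still needs the sanity check in the next section: decrypt the header and see whether it looks like a header.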

Confirming the Hypothesis

I wrote three lines of Python to test:

raw = open('game.dat', 'rb').read(9)   # 4-byte magic + 5 padding bytes
dec = bytes(b ^ 0xF7 for b in raw)
print(dec.hex())
# c04ac0ba0000000000

The first 4 bytes decoded to something that looked like a magic number (C0 4A C0 BA), followed by 5 clean null bytes. That's exactly what a file header should look like — a signature followed by padding. Hypothesis confirmed in under a minute.

Reading the File Index

With the XOR key known, I could decrypt the entire file in memory and look for structure. Right after the 9-byte header, I found a byte with value 0x14 (20 in decimal), followed by 20 bytes of readable ASCII text — a Windows-style file path.

That's a length-prefixed string. The pattern was:

  • 1 byte: filename length
  • N bytes: the filename itself (ASCII, backslash-separated paths)
  • 4 bytes: file size (little-endian)
  • 9 bytes: reserved (all zeros)

I kept reading these records one after another. Each entry described a file inside the archive — its name and its size. The index continued until the pattern broke: an invalid length byte, non-ASCII characters, or a path that didn't match the expected prefix.
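
To make the record layout concrete, here is a minimal sketch that parses a single record from an already-decrypted buffer. The helper name parse_record, the printable-ASCII check, and the sample path DATA\MUSIC\theme.ogg are assumptions made for illustration; the real archive's paths are not reproduced here.

```python
import struct

HEADER_SIZE = 9      # 4-byte magic + 5 zero bytes
RESERVED_SIZE = 9    # trailing zero bytes in each index record

def parse_record(dec: bytes, pos: int):
    """Parse one index record at `pos` in an already-decrypted buffer.

    Returns (name, size, next_pos), or None if the bytes at `pos`
    do not look like a record.
    """
    if pos >= len(dec):
        return None
    name_len = dec[pos]
    end = pos + 1 + name_len + 4 + RESERVED_SIZE
    if name_len == 0 or end > len(dec):
        return None
    raw_name = dec[pos + 1 : pos + 1 + name_len]
    # Accept printable ASCII only (0x20..0x7E); anything else suggests
    # we have run past the index into file data.
    if not all(0x20 <= b <= 0x7E for b in raw_name):
        return None
    (size,) = struct.unpack_from("<I", dec, pos + 1 + name_len)
    return raw_name.decode("ascii"), size, end

# Synthetic decrypted buffer: header, then one record for a made-up 20-byte path.
name = b"DATA\\MUSIC\\theme.ogg"
record = bytes([len(name)]) + name + struct.pack("<I", 123456) + bytes(RESERVED_SIZE)
buf = bytes.fromhex("c04ac0ba") + bytes(5) + record

print(parse_record(buf, HEADER_SIZE))
```

Calling parse_record repeatedly with the returned next_pos walks the index; the first None marks the candidate start of the data region.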

Where's the Data?

The file data turned out to be packed in the simplest possible way: immediately after the last index entry, with no alignment padding, no gaps. Files were stored sequentially in the same order as the index.

To calculate any file's offset:

data_start = end_of_index
file_offset[i] = data_start + sum(sizes of all files before i)

A linear scan through the index was all it took.
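
The offset formula is a running sum, which Python's itertools.accumulate expresses directly. This is a small sketch with made-up numbers; file_offsets is a hypothetical helper name, and the sizes are not taken from the real archive.

```python
from itertools import accumulate

def file_offsets(data_start: int, sizes: list[int]) -> list[int]:
    """Return the absolute offset of each file, given the sizes in index order."""
    # accumulate(..., initial=0) yields 0, s0, s0+s1, ...; the last value is
    # the total size, which is not the start of any file, so drop it.
    running = list(accumulate(sizes, initial=0))
    return [data_start + r for r in running[:-1]]

# Hypothetical example: data begins at offset 1000, three files follow.
print(file_offsets(1000, [300, 50, 4096]))
# [1000, 1300, 1350]
```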

The Moment of Truth

I wrote a quick extraction loop, pulled out the first few files, and checked their magic bytes:

  • Files ending in .ogg: started with OggS, the Ogg container signature ✓
  • Files ending in .jpg: started with FF D8 FF, as JPEG files do ✓
  • Files ending in .gif: started with GIF8, the start of a GIF header ✓
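
This check is easy to script. Here is a minimal sketch covering only the three signatures listed above; signature_ok and the sample file names are hypothetical. A matching signature shows that the first bytes are plausible, not that the whole file is intact.

```python
# Leading bytes for the three formats checked above.
SIGNATURES = {
    ".ogg": b"OggS",
    ".jpg": b"\xff\xd8\xff",
    ".gif": b"GIF8",
}

def signature_ok(name: str, data: bytes):
    """Return True or False for a known extension, None for an unknown one."""
    lowered = name.lower()
    for ext, magic in SIGNATURES.items():
        if lowered.endswith(ext):
            return data.startswith(magic)
    return None

# Hypothetical file names with hand-written leading bytes.
print(signature_ok("MUSIC\\theme.ogg", b"OggS\x00\x02"))  # True
print(signature_ok("ART\\scene.jpg", b"GIF89a"))           # False
print(signature_ok("readme.txt", b"hello"))                # None
```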

I double-clicked the first extracted .ogg file. The melody that had been stuck in my head for years played back perfectly.

What I Learned

The whole process took maybe two hours. Looking back, the approach generalizes well to many simple game archives:

  1. Look for the key in the noise. XOR encryption with a fixed key is trivially broken by examining null regions. The key literally announces itself.

  2. Decode the header first. If your decryption produces a plausible magic number followed by clean padding, you're on the right track.

  3. Follow the structure greedily. Try interpreting the next byte as a length. If the following N bytes are readable text, you've found the index format. No need for a disassembler or spec sheet.

  4. Validate with known signatures. Extracted files should have recognizable magic bytes. If they don't, something upstream is wrong.

  5. Assume simplicity first. Small game studios rarely implement complex archive formats. Try the dumbest possible layout (sequential packing, fixed-key XOR, no compression) before reaching for anything fancier.

The encryption wasn't meant to be serious protection — it was just enough to keep casual users from dragging and dropping files out of the archive. Against anyone willing to open a hex editor, it's transparent.

Appendix: The Extractor

For anyone in a similar situation — a beloved soundtrack locked inside a game.dat with XOR encryption — here's the script that worked for me:

"""
game.dat extractor for XOR-encrypted archives.
Key: 0xF7 | Format: 9-byte header + length-prefixed index + packed data
"""
import os, struct, sys
 
XOR_KEY = 0xF7
 
def xor_decrypt(data):
    return bytes(b ^ XOR_KEY for b in data)
 
def extract(dat_path, output_dir, filter_func=None):
    os.makedirs(output_dir, exist_ok=True)
 
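    # Only the first 2 MiB are read for index parsing; this assumes the
    # whole index fits inside that window.
    with open(dat_path, 'rb') as f:
        raw = f.read(2 * 1024 * 1024)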
 
    dec = xor_decrypt(raw)
 
    # Parse index
    records = []
    pos = 9
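    while pos < len(dec) - 14:
        name_len = dec[pos]
        # Stop on an implausible length or a record that would run past the buffer.
        if name_len == 0 or name_len > 200 or pos + 1 + name_len + 13 > len(dec):
            break
        try:
            name = dec[pos + 1 : pos + 1 + name_len].decode("ascii")
        except (UnicodeDecodeError, ValueError):
            break
        # Control characters decode as ASCII but are not expected in a file path.
        if not name.isprintable():
            break
        # Record layout: 1-byte length, name, 4-byte little-endian size, 9 reserved bytes.
        file_size = struct.unpack_from("<I", dec, pos + 1 + name_len)[0]
        records.append((name, file_size))
        pos += 1 + name_len + 13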
 
    data_start = pos
    print(f"Found {len(records)} files, data at offset {data_start:,}")
 
    current_offset = data_start
    extracted = 0
    with open(dat_path, "rb") as f:
        for name, size in records:
            if filter_func and not filter_func(name):
                current_offset += size
                continue
            f.seek(current_offset)
            out_data = xor_decrypt(f.read(size))
            # Strip root prefix, convert path separators
            rel = name.split("\\", 1)[-1] if "\\" in name else name
            out_path = os.path.join(output_dir, rel.replace("\\", os.sep))
            os.makedirs(os.path.dirname(out_path), exist_ok=True)
            with open(out_path, "wb") as out:
                out.write(out_data)
            extracted += 1
            current_offset += size
 
    print(f"Extracted {extracted} files")
 
if __name__ == "__main__":
    dat = sys.argv[1] if len(sys.argv) > 1 else "game.dat"
    out = sys.argv[2] if len(sys.argv) > 2 else "extracted"
    mode = sys.argv[3] if len(sys.argv) > 3 else "all"
 
    filters = {
        "music": lambda n: "MUSIC" in n.upper(),
        "audio": lambda n: any(n.upper().endswith(e) for e in (".OGG", ".WAV", ".MP3")),
    }
    extract(dat, out, filters.get(mode))

Usage:

python extract.py game.dat output        # everything
python extract.py game.dat music music   # just the soundtrack
python extract.py game.dat audio audio   # all audio files