Python code to write simple VPK archive files

I'm new to Python, but have a lot of experience with C and C++. I would appreciate some feedback on my implementation of a packer for this archive format. Specifically, any feedback on how idiomatic this code is for Python, and any suggestions about the algorithm used.

VPK is a relatively simple uncompressed archive format. It starts with this header:

struct VPKHeader {
    uint32_t signature;       // Always 0x55aa1234
    uint32_t version;         // Always 1
    uint32_t tree_len_bytes;
};

In this implementation, tree_len_bytes will always be the length of the entire output file minus the length of the header (12 bytes).
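As a rough illustration of the header layout (a sketch, not part of my script), the three fields can be packed in a single `struct` call. The names `pack_header` and `unpack_header` are made up for this example, and the explicit little-endian `<` prefix is an assumption based on the hex dump further down, which starts with `34 12 aa 55`. The sketch is written for Python 3.

```python
import struct

VPK_SIGNATURE = 0x55AA1234
VPK_VERSION = 1
HEADER_FORMAT = "<III"  # assumption: three little-endian uint32 fields, no padding


def pack_header(tree_len_bytes):
    """Return the 12-byte header for a directory tree of the given length."""
    return struct.pack(HEADER_FORMAT, VPK_SIGNATURE, VPK_VERSION, tree_len_bytes)


def unpack_header(data):
    """Validate the signature and return (version, tree_len_bytes)."""
    signature, version, tree_len_bytes = struct.unpack_from(HEADER_FORMAT, data)
    if signature != VPK_SIGNATURE:
        raise ValueError("not a VPK directory file")
    return version, tree_len_bytes


header = pack_header(0x8F)
print(header.hex())
print(unpack_header(header))
```

Packing all three fields at once also means the tree length could be computed first and written once, instead of seeking back to patch it.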

The directory tree is stored in three levels: extension, path, and filename. If any of these are empty (e.g. a file with no extension, or which lives in the root directory) it is represented by a space. For example, for this directory tree:

vpk_test
├── file.txt
├── no_ext
└── some_dir
    ├── another_file.txt
    └── image.jpg

The internal data structure would be:

{
    " ": {                  // no extension
        " ": [              // root directory
            "no_ext"        // filename
        ]
    },
    "txt": {
        " ": [
            "file"
        ],
        "some_dir": [
            "another_file"
        ]
    },
    "jpg": {
        "some_dir": [
            "image"
        ]
    }
}
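To make the mapping concrete, here is a minimal Python 3 sketch that builds this nested structure from a list of relative paths. `build_tree` is a hypothetical helper for illustration only. It assumes forward-slash separators and stores filenames without their extension, as in the structure above.

```python
import posixpath


def build_tree(relative_paths):
    """Group paths as {extension: {directory: [filename stems]}}."""
    tree = {}
    for rel in relative_paths:
        directory, name = posixpath.split(rel)
        stem, ext = posixpath.splitext(name)
        ext = ext[1:] or " "          # drop the leading dot; a space means "no extension"
        directory = directory or " "  # a space means "root directory"
        tree.setdefault(ext, {}).setdefault(directory, []).append(stem)
    return tree


example = build_tree([
    "file.txt",
    "no_ext",
    "some_dir/another_file.txt",
    "some_dir/image.jpg",
])
print(example)
```

Using `dict.setdefault` keeps the "create the level if it is missing" logic to one line per level.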

The binary format this structure is saved in is probably described most succinctly by the pseudocode for reading these files available here. Briefly: each extension appears in plain text as an ASCIIZ string, followed by any number of path strings, each of which is followed by any number of filenames, and each filename is followed by a short struct. Each group (the filenames under a path, the paths under an extension, and the list of extensions itself) is terminated by an additional null byte.
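The nesting described above can be sketched as a small reader. This is my own Python 3 illustration, not the linked pseudocode. `read_cstring` and `read_tree` are hypothetical names, and the reader assumes the per-file struct is a fixed 18 bytes (the size of the entry struct given later in this post), which it simply skips.

```python
ENTRY_SIZE = 18  # assumption: fixed size of the struct that follows each filename


def read_cstring(data, pos):
    """Read a null-terminated ASCII string; return (string, next position)."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii"), end + 1


def read_tree(data, pos=0):
    """Parse the directory tree bytes into {ext: {path: [names]}}."""
    tree = {}
    while True:
        ext, pos = read_cstring(data, pos)
        if not ext:  # empty string: end of the extension list
            break
        while True:
            path, pos = read_cstring(data, pos)
            if not path:  # empty string: end of this extension's paths
                break
            while True:
                name, pos = read_cstring(data, pos)
                if not name:  # empty string: end of this path's filenames
                    break
                pos += ENTRY_SIZE  # skip the per-file struct
                tree.setdefault(ext, {}).setdefault(path, []).append(name)
    return tree


demo = b"txt\x00 \x00file\x00" + bytes(ENTRY_SIZE) + b"\x00\x00\x00"
print(read_tree(demo))
```

Each "additional null byte" reads as an empty string, which is what ends the corresponding loop.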

Here is the binary output archiving the example directory would produce, for reference:

0000000: 3412 aa55 0100 0000 8f00 0000 2000 2000 4..U........ . .
0000010: 6e6f 5f65 7874 0000 0000 0000 0000 0000 no_ext..........
0000020: 0000 0000 0000 00ff ff00 0074 7874 0020 ...........txt.
0000030: 0066 696c 6500 0000 0000 0000 0000 0000 .file...........
0000040: 0000 0000 0000 ffff 0073 6f6d 655f 6469 .........some_di
0000050: 7200 616e 6f74 6865 725f 6669 6c65 0000 r.another_file..
0000060: 0000 0000 0000 0000 0000 0000 0000 00ff ................
0000070: ff00 006a 7067 0073 6f6d 655f 6469 7200 ...jpg.some_dir.
0000080: 696d 6167 6500 0000 0000 0000 0000 0000 image...........
0000090: 0000 0000 0000 ffff 0000 00 ...........

Each filename is followed by one of these structures denoting a file entry:

struct VPKEntry {
    uint32_t CRC32;
    uint16_t unused;    // Always zero in this implementation
    uint16_t index;     // Always zero in this implementation
    uint32_t offset;    // The offset into the data file where this file is stored
    uint32_t file_len;
    uint16_t end;       // Always 0xffff
};
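For reference, the whole entry fits in one `struct` format string. The sketch below (Python 3, with the hypothetical helpers `pack_entry` and `unpack_entry`) assumes little-endian fields with no padding. That gives 18 bytes per entry, which matches the spacing in the hex dump above.

```python
import binascii
import struct

ENTRY_FORMAT = "<IHHIIH"  # assumption: little-endian, packed without padding


def pack_entry(data, offset):
    """Build the entry for a file whose contents start at `offset` in the data file."""
    crc = binascii.crc32(data) & 0xFFFFFFFF
    return struct.pack(ENTRY_FORMAT, crc, 0, 0, offset, len(data), 0xFFFF)


def unpack_entry(raw):
    """Decode one entry and check its 0xffff terminator."""
    crc, _unused, index, offset, file_len, end = struct.unpack(ENTRY_FORMAT, raw)
    if end != 0xFFFF:
        raise ValueError("bad entry terminator")
    return {"crc32": crc, "index": index, "offset": offset, "file_len": file_len}


raw = pack_entry(b"hello", 7)
print(len(raw), unpack_entry(raw))
```

An empty file packs to sixteen zero bytes followed by `ff ff`, which is what every entry in the example dump looks like.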

The file contents are stored in a separate file next to the directory file. The hex dump above is the directory file; the archive file is just the contiguous data of the included files.
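Since the data file is plain concatenation, each entry's `offset` is just the running total of the lengths written so far. A minimal Python 3 sketch of that bookkeeping, using a hypothetical `layout_data` helper and in-memory byte strings instead of real files:

```python
def layout_data(blobs):
    """Return ([(offset, length), ...], concatenated data) for the given file contents."""
    placements = []
    running_offset = 0
    for blob in blobs:
        placements.append((running_offset, len(blob)))
        running_offset += len(blob)
    return placements, b"".join(blobs)


placements, archive = layout_data([b"abc", b"", b"hello"])
for (offset, length), original in zip(placements, [b"abc", b"", b"hello"]):
    # Each file can be recovered by slicing the archive at its recorded offset.
    assert archive[offset:offset + length] == original
print(placements)
```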

And finally, here is my code. It accepts a source directory to archive and an output directory where pak01_dir.vpk (the directory file) and pak01_000.vpk (with the archived files' data) are to be placed.

import sys, os, struct, json, binascii

running_offset = 0
pak01_000 = None

def add_file(ext_path_file, ext, path, file):
    """Add an entry for a file to the extension-path-file map."""
    if path.startswith("./"):
        path = path[2:]
    if ext in ext_path_file:
        xpath = ext_path_file[ext]
    else:
        ext_path_file[ext] = { }
        xpath = ext_path_file[ext]
    if path in xpath:
        xpath[path].append(file)
    else:
        xpath[path] = [file]

def write_file_entry(pak01_dir, srcfile):
    """Write the structure following a file's name and save its data to the data file."""
    global running_offset
    global pak01_000
    # Files in the root directory arrive as " /name"; map them back to "./name"
    if srcfile[:2] == " /":
        srcfile = "." + srcfile[1:]
    with open(srcfile, "rb") as src:
        data = src.read()
        pak01_dir.write(struct.pack('I', binascii.crc32(data) & 0xffffffff)) # CRC32
        pak01_dir.write(struct.pack('H', 0)) # Preload bytes
        pak01_dir.write(struct.pack('H', 0)) # Archive file index
        pak01_dir.write(struct.pack('I', running_offset)) # Offset into archive
        pak01_dir.write(struct.pack('I', len(data))) # File length
        pak01_dir.write(struct.pack('H', 0xffff)) # Entry terminator
        running_offset += len(data)
        pak01_000.write(data) # Add the file contents to the main pak

def make_vpk(srcdir, dstdir):
    """Creates a vpk from srcdir and places the output in dstdir."""
    global running_offset
    global pak01_000
    running_offset = 0
    dstdir = os.path.abspath(dstdir)
    os.chdir(srcdir)
    srcdir = "."
    with open(os.path.join(dstdir, "pak01_dir.vpk"), "wb") as pak01_dir:
        # Write VPK header
        pak01_dir.write(struct.pack('I', 0x55aa1234)) # Magic signature
        pak01_dir.write(struct.pack('I', 1)) # Version
        pak01_dir.write(struct.pack('I', 0)) # Directory length -- filled later
        # Prepare dictionary for VPK directory
        ext_path_file = {}
        for root, dirs, files in os.walk(srcdir):
            for f in files:
                path = os.path.join(root, f)
                ext = os.path.splitext(path)[1]
                if ext == "":
                    ext = " "
                if ext[0] == ".":
                    ext = ext[1:]
                if root == "" or root == ".":
                    root = " "
                add_file(ext_path_file, ext, root, f)
        print "VPK Structure:"
        print json.dumps(ext_path_file, indent=4)
        # Write VPK directory and pak000
        pak01_000 = open(os.path.join(dstdir, "pak01_000.vpk"), "wb")
        for ext, path_map in ext_path_file.iteritems():
            pak01_dir.write(ext)
            pak01_dir.write(struct.pack('B', 0))
            for path, filenames in path_map.iteritems():
                pak01_dir.write(path)
                pak01_dir.write(struct.pack('B', 0))
                for filename in filenames:
                    if ext == " ":
                        filename_noext = filename
                    else:
                        filename_noext = filename[:-(len(ext) + 1)]
                    pak01_dir.write(filename_noext)
                    pak01_dir.write(struct.pack('B', 0))
                    real_path = os.path.join(path, filename)
                    write_file_entry(pak01_dir, real_path)
                pak01_dir.write(struct.pack('B', 0)) # End of this path's filenames
            pak01_dir.write(struct.pack('B', 0)) # End of this extension's paths
        pak01_dir.write(struct.pack('B', 0)) # End of the extension list
        pak01_000.close()
        # Fix VPK header directory length
        size = pak01_dir.tell()
        pak01_dir.seek(8)
        pak01_dir.write(struct.pack('I', size - 3 * 4))

if __name__ == "__main__":
    if len(sys.argv) != 3:
        print "Usage: python vpk.py source-dir out-dir"
        exit(1)
    srcdir = sys.argv[1]
    dstdir = sys.argv[2]
    make_vpk(srcdir, dstdir)