SR3 zone file format

Discussion in 'Guides and Tutorials' started by [V] Knobby, Jul 15, 2013.

  1. [V] Knobby

    [V] Knobby Volition Staff

    The zone file is split up into a header (czh file) and the zone data itself (czn file). We did this to save memory, since we only need the header portion for building streaming containers and at load time. We load and then dump this header data.

    There is a v_file_header on top of the czh that describes the things that the zone references. This will be things like textures and meshes and they end up in the str2 file.

    v_file_header:
    uint16 signature - 0x3854
    uint16 version - 4
    uint32 reference_data_size
    uint32 reference_data_start(offset)
    uint32 reference_count - number of references in the file
    uint8 pad[16] - zero'd before write, padding so we can add to the header
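
    A minimal Python sketch of reading the v_file_header listed above. It assumes the fields are stored little-endian and tightly packed (32 bytes in total), which the thread does not state explicitly. The function name and the returned dictionary keys are illustrative only.

```python
import struct

# Assumption: little-endian, no padding between fields (2+2+4+4+4+16 = 32 bytes).
V_FILE_HEADER = struct.Struct("<HHIII16s")
V_FILE_SIGNATURE = 0x3854


def parse_v_file_header(data: bytes) -> dict:
    """Unpack the v_file_header found at the very top of a czh buffer."""
    sig, version, ref_size, ref_start, ref_count, _pad = V_FILE_HEADER.unpack_from(data, 0)
    if sig != V_FILE_SIGNATURE:
        raise ValueError(f"unexpected v_file_header signature 0x{sig:04X}")
    return {
        "version": version,
        "reference_data_size": ref_size,
        "reference_data_start": ref_start,
        "reference_count": ref_count,
    }
```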

    Reference data always follows the header. That data starts at reference_data_start offset from the top of the v_file_header and to get to the next reference(string) you just advance past the string. Double null termination on this string list, but the size is known as well.
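
    Walking the reference strings as described above could look like the following sketch. It assumes the strings are ASCII and that an empty string marks the second null of the double terminator. The function name and the sample strings are hypothetical.

```python
def read_reference_strings(data: bytes, start: int, size: int) -> list:
    """Split the null-terminated string list stored at data[start:start + size]."""
    blob = data[start:start + size]
    names = []
    pos = 0
    while pos < len(blob):
        end = blob.find(b"\x00", pos)
        if end == -1:
            end = len(blob)  # tolerate a missing final terminator
        if end == pos:
            break  # empty string: the second null of the double terminator
        names.append(blob[pos:end].decode("ascii", errors="replace"))
        pos = end + 1  # advance past the string and its terminator
    return names
```

    Here start and size would come from reference_data_start and reference_data_size in the v_file_header.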

    Immediately following the v_file_header is the world zone header:
    uint32 signature - 'Z3RS'
    uint32 version - 29
    v_file_header *ptr - 8 bytes on PC, points back up into the v_file_header since we use that at load time to find filenames(saving some space)
    fl_vector file_reference_offset - applied to all file references to move them to this zone
    wz_file_reference *m_file_references
    uint16 num_file_references
    uint8 zone_type
    uint8 unused
    interior_trigger *ptr - run-time data
    uint16 number of triggers - run-time data
    uint16 extra objects - only used in test levels(not sure for what)
    uint8 pad[24] - zero'd before write, padding so we can add to the header

    In the zone file itself we find a chunked format. It is a series of sections of data each with a header that describes the content. A section header looks like this:
    uint32 id - highest bit is a flag that tells us if the section has a gpu size. Once we have noted whether a gpu size is to be expected, the flag is masked off the id.
    uint32 cpu size - cpu size of this section (does not include the header which is variable size)
    if (gpu flag set) {
    uint32 gpu_size
    }
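
    A short sketch of reading one section header under the layout above, assuming little-endian fields. The function name and the tuple it returns are illustrative, not the engine's own API.

```python
import struct

GPU_FLAG = 0x80000000  # highest bit of the 32-bit id


def parse_section_header(data: bytes, offset: int = 0):
    """Return (section_id, cpu_size, gpu_size, header_size) for one section."""
    raw_id, cpu_size = struct.unpack_from("<II", data, offset)
    has_gpu = bool(raw_id & GPU_FLAG)
    section_id = raw_id & ~GPU_FLAG & 0xFFFFFFFF  # mask the flag off the id
    gpu_size = 0
    header_size = 8
    if has_gpu:
        # The optional gpu_size follows the two mandatory fields.
        (gpu_size,) = struct.unpack_from("<I", data, offset + 8)
        header_size = 12
    return section_id, cpu_size, gpu_size, header_size
```

    The header_size value reflects the note that the header is variable size: 8 bytes without the gpu flag, 12 bytes with it.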

    struct wz_file_reference {
    et_int16 m_pos_x;
    et_int16 m_pos_y;
    et_int16 m_pos_z;
    et_int16 m_pitch;
    et_int16 m_bank;
    et_int16 m_heading;

    // offset into v_file_header string pool for the filename
    et_uint16 m_str_offset;
    }
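
    The struct above is seven 16-bit values, which matches the 14-byte entries observed later in this thread. A sketch of reading the array in Python, assuming little-endian storage and no padding between entries (the class and function names are made up for the example):

```python
import struct
from typing import NamedTuple


class WzFileReference(NamedTuple):
    pos_x: int
    pos_y: int
    pos_z: int
    pitch: int
    bank: int
    heading: int
    str_offset: int  # offset into the v_file_header string pool


# Six signed 16-bit values followed by one unsigned 16-bit value = 14 bytes.
WZ_FILE_REFERENCE = struct.Struct("<6hH")


def parse_file_references(data: bytes, offset: int, count: int) -> list:
    """Read `count` consecutive wz_file_reference entries starting at `offset`."""
    return [
        WzFileReference(*WZ_FILE_REFERENCE.unpack_from(data, offset + i * WZ_FILE_REFERENCE.size))
        for i in range(count)
    ]
```

    The position and orientation fields are still in their compressed integer form at this point.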

    Section ids:
    0x2233 - crunched reference geometry - transforms and things for level meshes
    0x2234 - objects - nav points, environmental effects, and many, many more things
    0x2235 - navmesh
    0x2236 - traffic data
    0x2237 - world editor generated geometry - things directly made from the editor like terrain
    0x2238 - sidewalk data
    0x2239 - section trailer (??)
    0x2240 - light clip meshes
    0x2241 - traffic signal data
    0x2242 - mover constraint data
    0x2243 - zone triggers(interiors, missions)
    0x2244 - heightmap
    0x2245 - cobject rbb tree - cobjects are things that are not a full on object like tables and chairs
    0x2246 - undergrowth - foliage
    0x2247 - water volumes
    0x2248 - wave killers
    0x2249 - water surfaces
    0x2250 - parking data
    0x2251 - rain killers
    0x2252 - level mesh supplemental lod data
    0x2253 - cobject grid data - object fading
    0x2254 - ae rbb (??)
    0x2255 - havok pathfinding data(SR4 only?)
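
    For a parser it can help to keep the list above as a lookup table. A sketch in Python follows; the short names are paraphrased from the list and are not official identifiers.

```python
SECTION_NAMES = {
    0x2233: "crunched reference geometry",
    0x2234: "objects",
    0x2235: "navmesh",
    0x2236: "traffic data",
    0x2237: "world editor generated geometry",
    0x2238: "sidewalk data",
    0x2239: "section trailer (??)",
    0x2240: "light clip meshes",
    0x2241: "traffic signal data",
    0x2242: "mover constraint data",
    0x2243: "zone triggers",
    0x2244: "heightmap",
    0x2245: "cobject rbb tree",
    0x2246: "undergrowth",
    0x2247: "water volumes",
    0x2248: "wave killers",
    0x2249: "water surfaces",
    0x2250: "parking data",
    0x2251: "rain killers",
    0x2252: "level mesh supplemental lod data",
    0x2253: "cobject grid data",
    0x2254: "ae rbb (??)",
    0x2255: "havok pathfinding data",
}


def section_name(section_id: int) -> str:
    """Map a section id (gpu flag already masked off) to a readable name."""
    return SECTION_NAMES.get(section_id, f"unknown (0x{section_id:04X})")
```

    Note that the ids are not contiguous: 0x2239 is followed by 0x2240, and 0x2249 by 0x2250.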

    Object Section Data
    Object header data on top:
    uint32 signature - 0x574F4246
    uint32 version - 5
    int32 num_objects
    int32 num_handles
    uint32 flags
    uint64 *handle list - run-time
    uint8 *object data - run-time
    uint32 object data size - run-time
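
    A sketch of reading the fixed part of this header in Python. It only covers the first five 32-bit fields (20 bytes) and assumes they are little-endian; the run-time pointer fields that follow are left out because their on-disk width and alignment are not spelled out here. The function name is illustrative.

```python
import struct

OBJECT_SECTION_SIGNATURE = 0x574F4246

# signature, version, num_objects, num_handles, flags
OBJECT_HEADER_FIXED = struct.Struct("<IIiiI")


def parse_object_header(data: bytes, offset: int = 0) -> dict:
    """Unpack the first 20 bytes of the object section header."""
    sig, version, num_objects, num_handles, flags = OBJECT_HEADER_FIXED.unpack_from(data, offset)
    if sig != OBJECT_SECTION_SIGNATURE:
        raise ValueError(f"unexpected object section signature 0x{sig:08X}")
    return {
        "version": version,
        "num_objects": num_objects,
        "num_handles": num_handles,
        "flags": flags,
    }
```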

    Header is followed immediately by handle list(64 bit handles), which is followed by object data.

    The data is a series of property blocks:
    uint32 *handle - offset on disk
    uint32 *parent handle - offset on disk
    uint32 object type hash
    uint16 number_of_properties
    uint16 buffer size
    uint16 name offset - offset into the property list for the name of the object
    uint16 padding
    rest of the data is the property list itself. This is different data for each object, but the property list is just:
    uint16 type
    uint16 size
    uint32 name crc
    followed by data. We store enumerations, flag data, strings, binary buffers, transforms, and other things in here and each object type looks for specific data based on that object.
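
    A sketch of reading a single property from the list, assuming little-endian fields and assuming that size counts only the payload bytes after the 8-byte property header (neither is confirmed here). Any padding between consecutive properties is not handled. The function name is hypothetical.

```python
import struct

# type (uint16), size (uint16), name crc (uint32) = 8 bytes
PROPERTY_HEADER = struct.Struct("<HHI")


def parse_property(data: bytes, offset: int):
    """Return (type, name_crc, payload) for the property starting at `offset`."""
    prop_type, size, name_crc = PROPERTY_HEADER.unpack_from(data, offset)
    start = offset + PROPERTY_HEADER.size
    payload = data[start:start + size]  # assumption: size excludes the header
    return prop_type, name_crc, payload
```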

    type values are:
    0 - string
    1 - data
    2 - compressed transform - pos only(fl_vector)
    3 - compressed transform with quaternion orient (fl_vector pos followed by fl_quaternion for orient)
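
    In a Python parser the four type values could be mirrored as an enumeration. The member names below are invented for the example; only the numeric values come from the list above.

```python
from enum import IntEnum


class PropertyType(IntEnum):
    STRING = 0
    DATA = 1
    TRANSFORM_POS = 2         # compressed transform, position only
    TRANSFORM_POS_ORIENT = 3  # position followed by quaternion orient


def is_transform(prop_type: int) -> bool:
    """True for the two compressed-transform property types."""
    return prop_type in (PropertyType.TRANSFORM_POS, PropertyType.TRANSFORM_POS_ORIENT)
```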

    fl_quaternion is a fancy way to compress data. It is an x, y, z, w float that represents a rotation.
     
    Last edited: Jul 7, 2015
  2. Quantum

    Quantum Administrator Staff Member

    Hi, Knobby. Thanks for all this great information!
    I've been writing a parser for the SR3 zone file header and data files, and I have a few questions. These questions are all based on my parsing and observations of the files "sr3_city~fm06`.czh_pc" and "sr3_city~fm06`.czn_pc":
    1. What is the format of the "fl_vector" structure which is in the world zone header? It appears to contain four 32-bit floating point numbers (16 bytes total), but I'm not sure if this is correct. I am thinking it's x, y, z, and maybe w, but I'm not sure.
    2. In reference to the "uint32 id" (section ID) that begins each zone file section/chunk: After stripping off the high bit, I'm left with a number that seems to range from 0x2233 on up rather than a section ID from 0 to 23 as referenced in your list. After I subtract 0x2233 from the value, it seems to be a valid section ID from your list. Is this correct?
    3. Immediately following the world zone header, there appears to be a list of "num_file_references" 14-byte entries which doesn't appear in your description above. What is the format of each 14-byte entry?
    4. What is the enumeration for the "uint16 type" values in the property list? I've determined that type 0 represents a character string, but I don't know what the others represent.
    5. What name does the "uint32 name crc" in each property refer to? Is this referenced somewhere else?
    Thanks so much for your help!

    EDIT: I've uploaded the output of my parsing program so you can see the progress I've made so far.
     

    Last edited: Jul 6, 2015
  3. [V] Knobby

    [V] Knobby Volition Staff

    1. This is actually just 3 floats for the x, y, z offset. We do have some vectors that are 4-element, but this one is just three. You could be seeing padding in the struct here, maybe?
    2. Found out the ids I gave are just used as an index into an array of section offsets. You are correct that these start at 0x2233, but they are non-contiguous, so I'll update my post above with the information.
    3. Completely missed this. There is a pointer above the num_file_references which gets set to the data right after the header. This is an array of all the meshes in the zone with their locations. We mostly use this data for tools and building the zone containers. For example, we put the high lod mesh into the high lod zone that corresponds to the location in the file. Since we have 4 possible high lod zones per medium lod zone, we need to know where it is so we know the proper one to send it to at crunch time. I'll update and add that struct as well.
    4. I'll update the above list, but it looks like we only have 4 types: string, data, compressed transform with just pos, and compressed transform with quaternion orient.
    5. To save space we will often use a crc instead of a name to identify something. We can still do things like find_prop_block("property_name") or find by crc, but internally everything is smaller.
     
  4. Quantum

    Quantum Administrator Staff Member

    Thank you so much, Knobby! This information is very helpful! I'm sure I will have more questions, but for now I have two:
    1. In the world zone header, what are the enumeration values for the "uint8 zone_type" field?
    2. The coordinates in the world zone mesh file references are of type "et_int16" (16 bit values). I am displaying them as integers, but is there a factor I can apply that will translate them to standard game coordinate units?
    Thanks again!

    EDIT: I edited this post to add a question, and I've uploaded the output of my updated parser program. I still haven't tried it on more files yet, but I will when I get a chance.
     

    Last edited: Jul 7, 2015
  5. [V] Knobby

    [V] Knobby Volition Staff

    // Types of zones
    enum world_zone_types {
    WZT_UNKNOWN = 0,
    WZT_ALWAYS_LOADED,
    WZT_STREAMING,
    WZT_STREAMING_AL,
    WZT_TEST_LEVEL,
    WZT_MISSION,
    WZT_ACTIVITY,
    WZT_INTERIOR,
    WZT_INTERIOR_AL,
    WZT_TEST_LEVEL_AL,
    WZT_MISSION_AL,
    WZT_HIGH_LOD,
    NUM_WORLD_ZONE_TYPES
    };
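
    For a parser written in Python, the enum above could be mirrored as follows. The numeric values follow from standard C enum numbering, which starts at the explicit 0 and increments by one; the Python class name is an assumption.

```python
from enum import IntEnum


class WorldZoneType(IntEnum):
    UNKNOWN = 0
    ALWAYS_LOADED = 1
    STREAMING = 2
    STREAMING_AL = 3
    TEST_LEVEL = 4
    MISSION = 5
    ACTIVITY = 6
    INTERIOR = 7
    INTERIOR_AL = 8
    TEST_LEVEL_AL = 9
    MISSION_AL = 10
    HIGH_LOD = 11


# Mirrors NUM_WORLD_ZONE_TYPES, the count of real zone types.
NUM_WORLD_ZONE_TYPES = len(WorldZoneType)
```

    WorldZoneType(value) raises ValueError for a zone_type byte outside this range, which is a handy sanity check while parsing.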

    Some information about these: We have 2 distinct types of worlds that we load in SR3/4. We have a test level, which is a single zone of arbitrary size, and then we have a streaming level, which is broken up into pieces. We also have a global always-loaded zone (WZT_ALWAYS_LOADED) named after the world, and then individual always-loaded files for each zone (WZT_STREAMING_AL). Missions and activities have a big "zone" that is loaded when the mission or activity is active. They can also have an always-loaded portion.

    We use the following to get an absolute position. Since the position is inside the zone, we add the reference position to it to get world offset:

    // transform of the instance
    fl_matrix43 transform;
    ref->m_transform.get(transform);
    transform.m_translation += header->m_file_reference_offset;
     
  6. Quantum

    Quantum Administrator Staff Member

    Thanks again, Knobby!
    In the file I decoded, the fl_vector file_reference_offset contains (x=-249.6486, y=375.8364, z=-786.5045).
    The first wz_file_reference contains (m_pos_x=-1605, m_pos_y=19174, m_pos_z=804).
    The values in wz_file_reference seem much larger than I would expect, so I assume the units are different (especially since they are integers). There's probably a multiplier to convert fl_vector units to wz_file_reference units. Could you tell me what that multiplier is?

    My goal is to write a conversion tool to convert zone files to XML files which can be edited with any text editor, and then convert the XML files back to zone files. I know it won't be able to parse everything in a zone file, but hopefully people will find a tool like that useful.

    EDIT: One more question I just thought of: is this file format the same across SR3, SR4, and SRGOOH? Thanks!
     
    Last edited: Jul 8, 2015
  7. [V] Knobby

    [V] Knobby Volition Staff

    The compressed pos can be decoded as such : (float)pos / (float)(1 << 6). If you want to convert the orient, use (float)x / (float)(1 << 12).
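
    As a worked example, the two divisions above can be sketched in Python, together with the step of adding the zone's file_reference_offset to get a world position. The helper names are made up, and the sample numbers are the ones quoted earlier in this thread.

```python
POS_SCALE = 1 << 6      # 64
ORIENT_SCALE = 1 << 12  # 4096


def decode_pos(raw: int) -> float:
    """Convert a compressed 16-bit position component to a float."""
    return raw / POS_SCALE


def decode_orient(raw: int) -> float:
    """Convert a compressed 16-bit orientation component to a float."""
    return raw / ORIENT_SCALE


def world_position(raw_xyz, file_reference_offset):
    """Decode a compressed position and add the zone's file_reference_offset."""
    return tuple(decode_pos(r) + o for r, o in zip(raw_xyz, file_reference_offset))


# First wz_file_reference from the file discussed above:
local = tuple(decode_pos(v) for v in (-1605, 19174, 804))
# local == (-25.078125, 299.59375, 12.5625)
world = world_position((-1605, 19174, 804), (-249.6486, 375.8364, -786.5045))
```

    With these numbers the decoded local position is well within the range one would expect for an offset inside a zone, which is consistent with the raw integers being fixed-point values and not world units.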

    I like it and I'm sure people will use it even if you're only able to read certain types of things. For example, it would be great to create a new zone in the AL that has some objects in it for things like activity starts.

    I'm almost certain it is the same for all. I don't think we changed any of it. Outside of binary data that might be in there like havok data it should be the same. Objects like nav points and such should be identical.
     
  8. Quantum

    Quantum Administrator Staff Member

    I really appreciate all your help, Knobby. I created a topic where I've uploaded my tool which parses and displays the contents of a Zone file here:
    https://www.saintsrowmods.com/forum/threads/saints-row-3-4-gooh-zone-file-tools.12631/
    I tested it with various other zone files, and it seems to successfully parse the files I've tried. I'm still working on adding your latest information, and I've started the XML converter tool too.
    I think I've got enough information to make good progress now, but I'm sure I'll have more questions. Thanks again!
     
  9. Quantum

    Quantum Administrator Staff Member

    Hi Knobby! I've made significant progress on my ".czn_pc" <==> "XML" converter program, and I have a couple more questions.

    1. I know property names are stored as CRCs ("uint32 name crc") to save space (as you told me in this post), but is there any way I can obtain a list of the actual text names for the CRCs in the zone file? If I can get a list of text names, I can build a dictionary to perform reverse lookups and display the friendly text name of each property.

    Minimaul, please chime in here if you can help out for this next question:

    2. Related to question #1: What is the algorithm you use to calculate the CRC for property names? I found some CRC calculations in Minimaul's code here: https://github.com/tomjepp/ThomasJepp.SaintsRow/blob/master/SaintsRow/Hashes.cs. If one of these is appropriate, which one is it -- "CrcVolition()" or "HashVolition()"?

    3. In your spec, there's the "uint16 name offset - offset into the property list for the name of the object". In many cases this points to a valid string value in one of the properties, which I display in my "SRReadZone" tool, but in some zone files this is frequently 0 (zero). Does a zero value represent an unnamed object or is there another algorithm I can use to determine the object name (e.g. the first string property in the property list for that object)?

    EDIT: (added question #4)
    4. The "uint32 object type hash": Is there a way to reverse lookup that so I can convert it to a human readable string? (similar to questions #1 and #2)

    EDIT #2: (added question #5. I know, now I'm just being a pest! :p)
    5. Is the "handle list" at the beginning of the object data simply a sorted list of all the handles that appear in the objects themselves? I know your spec says each object handle is "uint32 *handle - offset on disk", but in my observation it's actually a 64-bit handle which matches one of the handles in the handle list. So I'm guessing I may be able to generate the handle list from the object data when I write a new ".czn_pc" file, so I don't have to store the handle list separately. Answered this myself: Yes: it's a list of handles in the objects, but No: it's not sorted. I can generate the list of handles from objects programmatically, but it won't always be in the same order as the original file. Is that a problem, or doesn't the order matter?

    Thanks for all your help, Knobby (and Minimaul)! :)

    P.S. I know double posts are against the rules on this forum, but I'm guessing enough time has passed since my last post and this one is a new topic (so it's okay? :cool:). Minimaul, if you want me to combine this with my last post, I will do that.
     
    Last edited: Jul 21, 2015
  10. Minimaul

    Minimaul Site owner Staff Member

    It's nearly always CrcVolition.

    It's fine in this instance - you're not posting again straight away, and you've added significant new content to the thread rather than just a "bump".
     