It's worth noting that the saved game structure has never been made public, so figuring out what it all means involves a lot of guesswork and experimentation. For this reason, editing .sav files beyond some very simple changes is not really possible, as you would generally need to update all of the relevant primary keys in the file to reflect any added/deleted entries. This is not possible without knowing the full structure of each part of the .sav file. Hence this guide is focussed on exporting data from saved games (such as match stats) rather than editing. That said, this guide would be a helpful starting point for those looking to edit data in saved games, as the same principles apply.
The below is written based on my findings from experimenting with EHM 2007 saved games. However, it looks like the overall structure of the EHM 1 saved game files is the same, so the below should equally apply.
Getting Started
You will need the following tools in order to get started:
- A copy of EHM Editor: https://www.ehmtheblueline.com/editor (at time of writing, only EHM Editor v1 supports saved games; EHM Editor v2 does not yet);
- A hex editor of your choice. There are lots of free options out there. I purchased a copy of HHD Hex Editor Neo as it was cheap and has a lot of useful features. There is also a free version of HHD; and
- Something along the lines of the Calculator that comes with Windows, or access to an online hex to decimal converter (such as here: https://www.thecalculatorsite.com/math/ ...).
A very basic understanding of how databases work is ideal but not essential. The minimum you need to know is that databases use what is known as a primary key:
https://www.techtarget.com/searchdatamanagement/definition/primary-key wrote: A primary key, also called a primary keyword, is a column in a relational database table that's distinctive for each record. It's a unique identifier, such as a driver's license number, telephone number with area code or vehicle identification number (VIN). A relational database must have only one primary key. Every row of data must have a primary key value and none of the rows can be null.
As you will see from the EHM 2007 database structure, primary keys are used throughout the database and I expect the same is the case for all parts of the saved game given that this is an elementary part of database design (from what I can tell, the .sav saved game is just one big database of sorts). An example of a primary key is Club ID. Every club in the database/saved game must have a Club ID. This ID is then referenced in other parts of the database; e.g. to assign a particular club as a player's Club Contracted or Club Playing For field, you would enter the relevant club's Club ID. So when you are looking through the various parts of the saved game, it is generally things such as Staff ID, Club ID and Club Competition ID which you're looking out for.
Knowledge of Data/Coding
A very basic understanding of how binary data is stored in files is also ideal but not essential. The minimum you need to know is that each byte of binary data is represented in hexadecimal format (also known simply as hex). There are different data types which are of varying length in bytes. The main ones are as follows:
True/False:
Bool: 1 byte (0 = false / 1 = true / any non-zero number = true)
Whole numbers:
Char: 1 byte
Short: 2 bytes
Integer (aka Int): 4 bytes
Long Integer (aka Long Int): 8 bytes
Decimal numbers:
Float: 4 bytes
Double: 8 bytes
Given that each byte can represent a finite number of possible values, the larger data types can represent a wider range of numbers. It is possible for chars, shorts, ints and long ints to be unsigned (meaning that the lowest possible value is zero) or signed (meaning that the lowest possible value is a negative number). Details of ranges for each data type (both signed and unsigned) can be found here: https://www.tutorialspoint.com/cplusplu ... _types.htm
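If you'd rather see the ranges for yourself, here is a minimal sketch that prints the size and range of a whole-number type. The function name describe_type is just something I've made up for illustration, and mapping the names above onto the C++ fixed-width types (e.g. Short = std::int16_t, Int = std::int32_t) is my own assumption about how they line up.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Builds a summary line for a whole-number type,
// e.g. "Short: 2 byte(s), -32768 to 32767".
// The unary plus promotes 1-byte types to int so that they are
// converted to text as numbers rather than as characters.
template <typename T>
std::string describe_type(const std::string &name)
{
    return name + ": " + std::to_string(sizeof(T)) + " byte(s), " +
           std::to_string(+std::numeric_limits<T>::min()) + " to " +
           std::to_string(+std::numeric_limits<T>::max());
}
```

For example, describe_type<std::uint8_t>("Unsigned char") gives the 0 to 255 range of a single unsigned byte, and describe_type<std::int32_t>("Int") gives the range available to a signed 4-byte ID.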
As you will see from the EHM 2007 database structure, primary keys (e.g. Club ID, Nation ID, Staff ID, etc.) are signed ints. The first ID in each table is always zero. Usually a -1 (or sometimes a -2) means no ID - for example, a free agent would have their Club Contracted and Club Playing For set to -1 to denote that they are not contracted to or playing for any club.
It is important to note that the .sav file seems to use little endian (as opposed to big endian) format. So take for example the number 45,102 which is B02E in hex code. As an integer (being 4 bytes) this can be represented as 2E B0 00 00 (little endian) or 00 00 B0 2E (big endian). As we are using little endian format, we'd go with 2E B0 00 00. So to decode this, you'd need to read it backwards. If you open the Calculator in Windows and click on the menu and select Programmer, you will see that there is the option to enter hex format. Click on hex and then enter the little endian pairs in reverse and you will see that there is a line which shows the decimal equivalent. Let's take the example of 2E B0 00 00: Enter each pair from right to left into the calculator, so enter "00", "00", "B0", "2E" (you will find that the calculator ignores the initial zeros which is fine). You'll see that the DEC line in the calculator shows 45,102 as the result:

This is a really important concept to understand in order to be able to "read" the hex code in the .sav files and identify patterns. If this isn't clear then try Googling "endianness" or "little vs big endian" for more examples.
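The same decoding can be done in code. The following is a minimal sketch (the function names read_le_u32 and read_le_i32 are my own, not anything from EHM or the Editor) which turns four little endian bytes into a number. The signed version is handy for IDs: the "no ID" value of -1 appears in the file as FF FF FF FF.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Decodes 4 bytes stored in little endian order into an unsigned int.
// bytes[0] is the least significant byte, so 2E B0 00 00 becomes 0x0000B02E.
std::uint32_t read_le_u32(const unsigned char *bytes)
{
    return static_cast<std::uint32_t>(bytes[0]) |
           (static_cast<std::uint32_t>(bytes[1]) << 8) |
           (static_cast<std::uint32_t>(bytes[2]) << 16) |
           (static_cast<std::uint32_t>(bytes[3]) << 24);
}

// The same 4 bytes interpreted as a signed int (as used for primary keys).
std::int32_t read_le_i32(const unsigned char *bytes)
{
    const std::uint32_t raw = read_le_u32(bytes);
    std::int32_t value;
    std::memcpy(&value, &raw, sizeof(value)); // Copy the bit pattern unchanged
    return value;
}
```

Passing the bytes 2E B0 00 00 to read_le_u32 gives 45,102, matching the Calculator result above. Building the value byte by byte like this works regardless of the endianness of the machine running the code.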
In addition to the above, you will find that text strings are either represented as:
- an array of ASCII chars (i.e. one byte per character); or
- a UTF-style string (i.e. two bytes per character)
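For example, the name "Dave" as an array of ASCII chars:
Code:
44 61 76 65
The same name in what appears to be a fixed-length char array (10 bytes here), with the unused bytes filled with zeros:
Code:
44 61 76 65 00 00 00 00 00 00
And as a UTF-style string, which is preceded by an int giving its length:
Code:
04 00 00 00 44 00 61 00 76 00 65 00 0b 00 00 00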
04 00 00 00 = denotes that the string is 4 characters long
44 00 = D
61 00 = a
76 00 = v
65 00 = e
0b 00 = appears to denote the end of the string (note that a true null character would be 00 00)
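As a rough sketch of how both kinds of string could be read in code (the function names are hypothetical and the layout is only what I've described above): the first function reads a char array up to the first zero byte, and the second reads the 4-byte length followed by that many 2-byte characters. It relies only on the length prefix and ignores whatever follows the last character.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Reads an ASCII char array of up to `length` bytes, stopping at the first zero byte.
std::string read_char_array(const std::vector<unsigned char> &data,
                            std::size_t offset, std::size_t length)
{
    std::string text;
    for (std::size_t i = offset; i < offset + length && i < data.size(); ++i) {
        if (data[i] == 0)
            break; // Zero padding reached
        text.push_back(static_cast<char>(data[i]));
    }
    return text;
}

// Reads a string stored as a little endian int length followed by 2-byte characters.
// Returns an empty string if the length prefix does not fit within the data.
std::u16string read_prefixed_string(const std::vector<unsigned char> &data,
                                    std::size_t offset)
{
    std::u16string text;
    if (offset + 4 > data.size())
        return text;
    const std::uint32_t length =
        static_cast<std::uint32_t>(data[offset]) |
        (static_cast<std::uint32_t>(data[offset + 1]) << 8) |
        (static_cast<std::uint32_t>(data[offset + 2]) << 16) |
        (static_cast<std::uint32_t>(data[offset + 3]) << 24);
    std::size_t pos = offset + 4;
    // Stop early if the data runs out before `length` characters have been read
    for (std::uint32_t i = 0; i < length && pos + 2 <= data.size(); ++i, pos += 2)
        text.push_back(static_cast<char16_t>(data[pos] | (data[pos + 1] << 8)));
    return text;
}
```

Feeding the example bytes above into either function returns "Dave".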
Helpfully most hex editors will show decoded char arrays and strings alongside the raw hex, which makes things much easier. Here's an example from the first_names.dat sub-file of a saved game:

The Saved Game Structure
The easiest way to see what is within a saved game is to open it with the EHM Editor. Note that the Editor and this guide only apply to uncompressed saved games (i.e. with the Save Compressed setting disabled in EHM). I have never looked at how compressed saved games are compressed, so you will need to disable the Save Compressed setting in EHM when saving the game you want to look at.
Having opened the saved game in the Editor, click on Data -> Saved Game Index. This lists out the constituent parts of the saved game. You will see that the saved game consists of a number of files stored within the .sav file. I will refer to these as sub-files within this guide for clarity. If you click on File -> Unpack, you can extract the sub-files into a folder of your choosing. This is the easiest way of accessing the sub-files.
The alternative way of accessing the sub-files is to write your own code in something like C++, C# or Python. The first 12 bytes of the .sav file consist of a header as follows:
Code:
int compressed_flag; // 0 = uncompressed / 1 = compressed
int header_flag; // not sure what this means
int sub_file_count; // the number of sub-files contained within the .sav file
The header is followed by an index containing one entry per sub-file. Each index entry is 268 bytes long and is structured as follows:
Code:
unsigned int file_pos; // the sub-file's position/address (in bytes) within the .sav file - e.g. 0 = first byte of the file, 1 = second byte of the file
unsigned int file_size; // the size of the sub-file (in bytes)
char sub_file_name[260]; // an ASCII char array (260 bytes in length) denoting the file name of the sub-file
So in C++ (treat this as a rough sketch rather than finished code), the function to parse the .sav file would be as follows:
Code:
#include <cstddef>
#include <fstream>
#include <vector>

// Each index entry describes one sub-file (this assumes that an int is 4 bytes, as it is on Windows)
struct IndexEntry {
    unsigned int file_pos;
    unsigned int file_size;
    char sub_file_name[260];
};

void parse_saved_game()
{
    std::ifstream file("game.sav", std::ios::in | std::ios::binary);

    // STEP 1: Read the header
    int compressed_flag = 0, header_flag = 0, sub_file_count = 0;
    file.read(reinterpret_cast<char *>(&compressed_flag), sizeof(compressed_flag));
    file.read(reinterpret_cast<char *>(&header_flag), sizeof(header_flag));
    file.read(reinterpret_cast<char *>(&sub_file_count), sizeof(sub_file_count));

    // Abort if the header could not be read or the saved game is compressed
    if (!file || compressed_flag != 0)
        return;

    // STEP 2: Read each index entry
    std::vector<IndexEntry> index;
    for (int i = 0; i < sub_file_count; ++i) {
        IndexEntry index_entry;
        file.read(reinterpret_cast<char *>(&index_entry.file_pos), sizeof(index_entry.file_pos));
        file.read(reinterpret_cast<char *>(&index_entry.file_size), sizeof(index_entry.file_size));
        file.read(index_entry.sub_file_name, sizeof(index_entry.sub_file_name));
        index.push_back(index_entry); // Store the entry for use in step 3
    }

    // STEP 3: Read each sub-file by iterating over each index entry
    // Personally I'd use something like for(const auto &index_item : index) but I've used a simpler form for clarity.
    for (std::size_t i = 0; i < index.size(); ++i) {
        const auto &index_item = index[i]; // Reference to the relevant entry of the index
        // The following code reads the binary data of each sub-file into a temporary buffer
        std::vector<char> buffer(index_item.file_size);
        file.seekg(index_item.file_pos, std::ios::beg); // Navigate to the position of the sub-file
        file.read(buffer.data(), buffer.size()); // Read the sub-file into the buffer
    }
}
The Sub-Files
Once you have extracted the sub-files from the .sav file then this is where the guesswork and experimentation takes place. The sub-files consist of a database.zdb file and a number of .dat and .tmp files. The database.zdb file is basically a copy of the starting database.db file which then appears to be modified by EHM as the game progresses. The EHM Editor parses the database.zdb file when loading a .sav game into the Editor and so the various editing screens within the Editor are populated using the data from the database.zdb file. So you can use the Editor to check what primary keys are assigned to various items (e.g. to check what Club ID is assigned to Anaheim Ducks, etc). It seems that the game doesn't store things such as player career history in the database.zdb file (I suppose it must be stored in another sub-file) and so you will find that no player career history entries appear in the Editor.
My guess is that the .dat sub-files contain permanent data (such as player stats, club histories, club competition histories, etc) whereas the .tmp files presumably store more temporary data relating to playable leagues, etc.
So how do you figure out what is in a sub-file? By opening it up in the hex editor of your choice and trying to identify patterns. I have found that the first 4 bytes of some sub-files is an int specifying the number of entries stored within that sub-file (but some sub-files do not appear to have this). That int is then followed by each entry (aka record). Typically each entry starts with an entry/record ID (i.e. primary key) which is usually an int (i.e. 4 bytes) but can sometimes be a char or possibly IIRC a short. Sometimes however the primary key is located later on within each entry, so it isn't absolutely always the first few bytes of an entry.
My starting point is to look at the first four bytes of the sub-file and convert this to decimal format. If it is a zero then it is probably a record ID (because the first record ID is always zero - i.e. 00 00 00 00 in hex as an int). If it is something else then it might be a record count (i.e. indicating the number of entries in the sub-file). There is always a possibility that the record count and/or record IDs might be a char (e.g. zero = 00 in hex) or short (e.g. zero = 00 00), so it's always worth trying that if no obvious pattern appears from looking at them as ints.
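To save converting the first few bytes by hand each time, something like the following sketch could be used. The function name leading_value is my own; it simply interprets the first 1, 2 or 4 bytes of a sub-file as a little endian number so that you can compare the char, short and int readings side by side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Interprets the first `width` bytes (1 = char, 2 = short, 4 = int) of a sub-file
// as a little endian unsigned number and stores the result in `value`.
// Returns false if `width` is unsupported or the data is too short.
bool leading_value(const std::vector<unsigned char> &data, std::size_t width,
                   std::uint32_t &value)
{
    if ((width != 1 && width != 2 && width != 4) || data.size() < width)
        return false;
    value = 0;
    for (std::size_t i = 0; i < width; ++i)
        value |= static_cast<std::uint32_t>(data[i]) << (8 * i);
    return true;
}
```

If all three readings come back as zero then you are probably looking at a record ID; if they give a plausible-looking count then it is worth testing it as a record count.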
Assuming that the initial 1/2/4 bytes denote a record count as a char/short/int then you can try the following calculation: Take the file size in bytes of the sub-file and subtract the size of the record count (e.g. if the record count is 4 bytes then subtract 4 from your file size). Then divide that figure by the record count. So if you had an int record count of 4 and your file is 48 bytes in size: 48 - 4 = 44. 44 / 4 = 11 bytes. This may indicate the number of bytes per record, assuming that each record is of a fixed size. You can then open up the sub-file in your hex editor, delete the initial record count (e.g. the first 4 bytes) and set your hex editor to arrange the remaining data so that it displays one record per row. This option isn't always possible for every hex editor, but certainly HHD does this (which is one reason I use it). So again taking my example, if I delete the initial four bytes, I'm left with 44 bytes of data and I would then set HHD to show 11 bytes of data per row. This will nicely show one record per row which makes it much easier to interpret patterns.
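If your hex editor can't show a custom number of bytes per row, the calculation and the one-record-per-row view can be sketched in code instead. Both function names below are my own, and the code assumes (as above) that the sub-file is a record count followed by fixed-size records.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Returns the number of bytes per record implied by a leading record count,
// or 0 if the remaining data does not divide evenly (so the guess is probably wrong).
std::size_t guess_record_size(std::size_t file_size, std::size_t count_bytes,
                              std::size_t record_count)
{
    if (record_count == 0 || file_size < count_bytes)
        return 0;
    const std::size_t payload = file_size - count_bytes;
    return (payload % record_count == 0) ? payload / record_count : 0;
}

// Formats the data after the record count as hex text, one record per row.
std::vector<std::string> rows_as_hex(const std::vector<unsigned char> &data,
                                     std::size_t count_bytes, std::size_t record_size)
{
    std::vector<std::string> rows;
    if (record_size == 0)
        return rows;
    for (std::size_t start = count_bytes; start < data.size(); start += record_size) {
        std::string row;
        for (std::size_t i = start; i < start + record_size && i < data.size(); ++i) {
            char hex[4]; // Two hex digits, a space and the terminating null
            std::snprintf(hex, sizeof(hex), "%02X ", static_cast<unsigned>(data[i]));
            row += hex;
        }
        if (!row.empty())
            row.pop_back(); // Drop the trailing space
        rows.push_back(row);
    }
    return rows;
}
```

With the example above, guess_record_size(48, 4, 4) returns 11. A result of 0 doesn't prove anything by itself, but it does suggest that either the count is not what you thought or the records are not of a fixed size.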
The above of course assumes that each record has the same number of bytes per entry. There are some sub-files which appear to have variable sizes per entry or possibly just one very complicated entry. They are going to be particularly challenging, if not impossible, to decipher.
As for decoding each record of a sub-file, it really is just a case of guesswork and trying to identify patterns. I have found that a good starting point is to try to identify potential references to primary keys (which will be ints - i.e. 4 bytes) and cross-reference these to the ID values shown in the Editor when viewing the saved game. For example, if you're looking at club histories there will probably be references to Club IDs and Club Competition IDs. Similarly, player stats will likely include a Staff ID. However, it might be possible that the record ID doubles up as a reference to the Club ID/Staff ID, etc.
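One way to speed up the cross-referencing is to search a record for a known ID taken from the Editor. The following is a minimal sketch (find_id is a made-up name) which returns every byte offset at which a given ID appears as a 4-byte little endian int.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns every byte offset within `record` at which `id` appears
// as a 4-byte little endian int.
std::vector<std::size_t> find_id(const std::vector<unsigned char> &record,
                                 std::int32_t id)
{
    const std::uint32_t raw = static_cast<std::uint32_t>(id);
    const unsigned char pattern[4] = {
        static_cast<unsigned char>(raw & 0xFF),
        static_cast<unsigned char>((raw >> 8) & 0xFF),
        static_cast<unsigned char>((raw >> 16) & 0xFF),
        static_cast<unsigned char>((raw >> 24) & 0xFF)
    };
    std::vector<std::size_t> offsets;
    for (std::size_t pos = 0; pos + 4 <= record.size(); ++pos) {
        if (record[pos] == pattern[0] && record[pos + 1] == pattern[1] &&
            record[pos + 2] == pattern[2] && record[pos + 3] == pattern[3])
            offsets.push_back(pos);
    }
    return offsets;
}
```

Bear in mind that small IDs (such as 0 or 1) will match all over the place by coincidence, so this works best with larger, more distinctive ID values and with matches that show up at the same offset across several records.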
Worked Example: HostCountry.tmp
Opening up HostCountry.tmp from my example 1974/75 saved game looks like this (as an aside, note the right hand margin where there is a very clear repeating pattern which suggests that there is a fixed size of record in this sub-file):

You will see that the first four bytes are 67 00 00 00. Let's assume that is an int denoting the record count, so that's 103 in decimal format. According to Windows the file size is 6,596 bytes (right-click on the file, click on Properties and look for the Size property. Ignore the Size on Disk property). 6,596 - 4 = 6,592 bytes. Dividing this figure by 103 = 64. So it looks like each record is 64 bytes. Deleting the first four bytes from the file and setting HHD to show 64 bytes per row shows this (to do this in HHD click on View -> Columns -> Custom and enter 64):

Now there definitely appears to be a pattern emerging, so we can be pretty confident we've figured out the size of each record. The next task is to figure out the fields in each record. It doesn't look like there is a record ID at the start of each record as otherwise the first row would be 00 00 00 00, the second row 01 00 00 00, the third row 02 00 00 00, etc. The HostCountry.tmp sub-file will inevitably include a Nation ID to indicate the host country and a Club Competition ID to indicate the club competition. There's probably also a short to indicate the season/year.
Let's take the following record as an example:

Seeing as this is a 1974/75 database, it's worth checking what 1974 is in hex (you can use the Windows Calculator to do this). As a short 1974 is B6 07. So if we see anything which looks like B6 07 or thereabouts then it's probably a year. As it happens, the above example includes a D6 07 (at byte positions 06 and 07) which is 2006 in decimal. I suspect that the immediately preceding F9 might be a char representing the day of the year (e.g. 0 = 1 Jan, 1 = 2 Jan, etc). So F9 = 249. Seeing as zero represents the first day of the year, 249 represents the 250th day of the year = 7 September.
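The year and day conversion can be sketched in code as follows. The names here are my own, and the day-of-year interpretation is only my suspicion as described above, so treat the result as something to verify in-game rather than a confirmed field.

```cpp
#include <cassert>
#include <cstdint>

struct CalendarDate {
    int year;
    int month; // 1 = January; 0 means the day index was out of range
    int day;   // Day of the month, starting at 1
};

// Decodes 2 little endian bytes into a short, e.g. D6 07 becomes 2006.
std::uint16_t read_le_u16(const unsigned char *bytes)
{
    return static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
}

bool is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

// Converts a zero-based day of the year (0 = 1 Jan) into a calendar date.
CalendarDate date_from_day_index(int year, int day_index)
{
    const int days_in_month[12] = {31, is_leap_year(year) ? 29 : 28, 31, 30, 31, 30,
                                   31, 31, 30, 31, 30, 31};
    if (day_index < 0)
        return CalendarDate{year, 0, 0};
    for (int month = 0; month < 12; ++month) {
        if (day_index < days_in_month[month])
            return CalendarDate{year, month + 1, day_index + 1};
        day_index -= days_in_month[month]; // Move on to the next month
    }
    return CalendarDate{year, 0, 0}; // Index is beyond the end of the year
}
```

Calling date_from_day_index(2006, 249) gives month 9 and day 7, i.e. 7 September 2006, which matches the manual calculation.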
It looks like there are a few sections which might be ints starting at positions 00, 08 and 0C (amongst others). So that's the following ints in hex which convert to decimal as follows:
42 02 00 00 = 578
35 00 00 00 = 53
61 00 00 00 = 97
By exporting model spreadsheets for Club Competitions and Nations from the Editor, we can look up Club Competition IDs and Nation IDs (go to the Club Competitions screen in the Editor and click on Export -> Model and then do the same from the Nations screen). Interestingly, Club Competition ID 578 = World Junior Championships U-20 Div 1, Nation ID 53 = Denmark and Nation ID 97 = Italy. So it's looking like we've identified the fields for the club competition, date/year and two or more host countries. Obviously if you sim to 2006 (or find an earlier example) in-game then you can verify the findings.
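Putting the findings so far together, a record could be read in code along the following lines. This is only a sketch under my assumptions: the struct and field names are hypothetical, the offsets are the guesses from this worked example (the day field in particular is unconfirmed), and the remaining bytes of each 64-byte record are simply ignored because I don't know what they are.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Guessed fields only; the rest of the 64-byte record is unknown.
struct HostCountryRecord {
    std::int32_t club_competition_id; // Offset 0x00
    std::uint8_t day_of_year;         // Offset 0x05, zero-based (suspected)
    std::uint16_t year;               // Offset 0x06
    std::int32_t nation_id_a;         // Offset 0x08
    std::int32_t nation_id_b;         // Offset 0x0C
};

// Decodes 4 little endian bytes as a signed int.
static std::int32_t le_i32(const unsigned char *p)
{
    const std::uint32_t raw = static_cast<std::uint32_t>(p[0]) |
                              (static_cast<std::uint32_t>(p[1]) << 8) |
                              (static_cast<std::uint32_t>(p[2]) << 16) |
                              (static_cast<std::uint32_t>(p[3]) << 24);
    std::int32_t value;
    std::memcpy(&value, &raw, sizeof(value));
    return value;
}

// Reads record number `record_index` (starting at 0) from the whole sub-file,
// i.e. including the 4-byte record count. Returns false if it is out of range.
bool parse_host_country_record(const std::vector<unsigned char> &sub_file,
                               std::size_t record_index, HostCountryRecord &out)
{
    const std::size_t count_bytes = 4;
    const std::size_t record_size = 64;
    const std::size_t start = count_bytes + record_index * record_size;
    if (start + record_size > sub_file.size())
        return false;
    const unsigned char *p = sub_file.data() + start;
    out.club_competition_id = le_i32(p + 0x00);
    out.day_of_year = p[0x05];
    out.year = static_cast<std::uint16_t>(p[0x06] | (p[0x07] << 8));
    out.nation_id_a = le_i32(p + 0x08);
    out.nation_id_b = le_i32(p + 0x0C);
    return true;
}
```

Looping record_index from 0 up to the record count and writing the fields out as CSV would then give you something you can check against the Editor's exported spreadsheets.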
Further Reading
This thread has some discussion and findings from back in 2013 when we were looking at EHM 2007 saved games: viewtopic.php?t=10423&start=25