Tree
A Tree object in Git represents a directory. Its contents include other tree objects and blob objects, allowing Git to keep track of the names and hierarchical structure of files and folders in a project.
Structure of a Tree Object
A Tree object is structured as follows:
- Object Type: Always
tree
. - Size: The size of the file content in bytes.
- Null Byte Separator: A null byte (
\x00
) separating the header from the content. - Content: The main body containing the file structure.
- Null Byte Terminator: A null byte indicating the end of the content.
Body
The body of the tree file is formatted as follows:
<fileType> <fileName>\x00<sha1Bytes>\x00
Notes on Size Calculation
- The header title (
blob
) and the final null byte do not count towards the total size. - The first null byte (
\x00
) is included in the size calculation.
Key Details
- SHA-1 Representation: The SHA-1 is stored as raw bytes rather than as a hexadecimal string.
- Continuous Data: There is no explicit separation between each TreeLine; the next
fileType
directly follows the SHA-1 of the previous entry. - File Types: There are four file types commonly used:
100644
: Regular file (blob).100755
: Executable file (blob).040000
: Directory (tree) (although having this 0 at the beginning, must of the time this is dropped of the mode).
Example of a Tree File
A tree object might appear as:
tree 122\x00100644file1.txt\x00<20-byte SHA-1>100755script.sh\x00<20-byte SHA-1>040000subdir\x00<20-byte SHA-1>\x00
Readable Breakdown
For clarity, the example above can be visualized as:
tree 122\x00
100644 file1.txt\x00<20-byte SHA-1>
100755 script.sh\x00<20-byte SHA-1>
040000 subdir\x00<20-byte SHA-1>
However, it is crucial to understand that:
- No Line Separators: The data is stored as a continuous stream without line breaks.
- Separations: The only actual separator (
\x20
) is the space between the title (tree
) and the size of the content.
Decoding the Tree Object
To interpret a Tree object:
- Read the Header: Identify the object type (
tree
) and content size. - Parse the Body:
- Extract each file entry based on the file type, file name, and SHA-1.
- Use the null bytes as markers to differentiate between components.
This structure ensures Git's ability to efficiently manage project directories and maintain their integrity through the SHA-1 hashes for version tracking.