Any FAT partition has two main parts: the system area and the data area. The system area contains the FAT boot record (every file system has a boot record), the 1st FAT and the 2nd FAT. FAT12 and FAT16 also keep the root directory in the system area. The data area holds file and subdirectory data in clusters and, in the case of FAT32, the root directory as well.
FAT12
12 bits are available for cluster addressing, which allows 2^12 clusters at most (4096 clusters).
FAT16
16 bits are available for cluster addressing, which allows 2^16 clusters at most (65536 clusters).
FAT32
28 bits (not 32) are available for cluster addressing, which allows 2^28 clusters at most (268 435 456 clusters). The upper 4 bits are reserved. Unlike in FAT12 and FAT16, the root directory is in the data area, so it can grow like any other directory.
exFAT
Uses all 32 bits for cluster addressing. Thus, the maximum number of clusters is 2^32, which gives us 4 294 967 296 clusters to address. First supported by Windows Embedded CE 6.0 (2006). Max volume size is 64 zettabytes ❓. File size limit: 16 exabytes ❓. Cross-platform. Used for large external media.
Volume Boot Record
⚠️ Not the MBR!!!!
Located at sector 0 of the volume (❗️ not the physical sector 0). Starts with a jump instruction 3 bytes long at offset 0x0 (relative to the VBR). Contains information about the volume, listed below as offset (relative to the VBR start) - size in bytes - name and description. Relevant information below:
0x03 - 8 - OEM ID. Most likely MSDOS5.0 for Win2000 and above.
0x0B - 2 - Bytes per sector (512 usually)
0x0D - 1 - Sectors per cluster
0x0E - 2 - Reserved sectors
0x10 - 1 - Number of FATs (2, one of them is for backup purposes)
0x16 - 2 - Sectors per FAT (FAT12 and FAT16)
0x1C - 4 - Hidden sectors 🙈 (preceding the volume)
0x20 - 4 - Total sectors (size of the volume)
0x24 - 4 - Sectors per FAT (FAT32)
0x2C - 4 - Starting cluster of the root dir (2nd usually)
0x32 - 2 - Back-up boot sector location (6th usually)
0x43 - 4 - Volume serial number. In the case of a thumb drive, this serial number can be used to track the device across PCs and other systems.
0x47 - 11 - Volume name/label 🏷 (not user defined, “NO NAME” usually)
0x52 - 8 - FS type
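To make the offsets above concrete, here is a minimal Python sketch that reads those fields from a raw boot sector. The function name `parse_fat32_vbr` and the synthetic sector are made up for illustration; the offsets are the ones from the list, and all multi-byte values are read as little-endian.

```python
import struct

def parse_fat32_vbr(sector: bytes) -> dict:
    """Read the FAT32 boot record fields listed above (offsets relative to the VBR)."""
    return {
        "oem_id": sector[0x03:0x0B].decode("ascii", errors="replace"),
        "bytes_per_sector": struct.unpack_from("<H", sector, 0x0B)[0],
        "sectors_per_cluster": sector[0x0D],
        "reserved_sectors": struct.unpack_from("<H", sector, 0x0E)[0],
        "number_of_fats": sector[0x10],
        "hidden_sectors": struct.unpack_from("<I", sector, 0x1C)[0],
        "sectors_per_fat": struct.unpack_from("<I", sector, 0x24)[0],
        "root_cluster": struct.unpack_from("<I", sector, 0x2C)[0],
        "backup_boot_sector": struct.unpack_from("<H", sector, 0x32)[0],
        "volume_serial": struct.unpack_from("<I", sector, 0x43)[0],
        "volume_label": sector[0x47:0x52].decode("ascii", errors="replace"),
        "fs_type": sector[0x52:0x5A].decode("ascii", errors="replace"),
    }

# Synthetic sector for the demo (not a dump of a real drive).
demo = bytearray(512)
demo[0x03:0x0B] = b"MSDOS5.0"
struct.pack_into("<H", demo, 0x0B, 512)
demo[0x0D] = 8
struct.pack_into("<H", demo, 0x0E, 4110)
demo[0x10] = 2
struct.pack_into("<I", demo, 0x24, 2041)
struct.pack_into("<I", demo, 0x2C, 2)
struct.pack_into("<H", demo, 0x32, 6)
demo[0x47:0x52] = b"NO NAME    "
demo[0x52:0x5A] = b"FAT32   "

info = parse_fat32_vbr(bytes(demo))
print(info["bytes_per_sector"], info["sectors_per_cluster"], info["root_cluster"])
```

On a real image you would pass the first 512 bytes read at the volume start instead of the synthetic buffer.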
Here is the full information available:
The root directory is the highest node in the dir structure of this volume and consists of 32-byte dir entries. It lists files and dirs in the root directory. The FS stops reading these entries when it sees an entry starting with 0x00. So, data written after that point won’t be seen by the OS and this is one of the ways to hide data. Types of root dir entries:
volume name (user created)
short file name (8 uppercase letters + “.” + 3 letters for extension)
- 8 bytes for the file name (uppercase). If the name was shortened from a long file name, it includes a ~ at the -2 offset.
- 3 bytes for extension
- 1 byte for attributes: hidden (0x02), read-only (0x01), system (0x04), volume label (0x08), directory (0x10) and archive (0x20). These attributes can be combined: the flags occupy just one byte, and when more than one flag is set, their values are added together (like the access flags on Unix systems). See the attributes on the picture below. Only the bit for volume is set, which, given its position, has the value 0x08.
- 1 reserved byte: 0x00 for the long file name (not 8.3 compliant) and 0x10 for short file name (8.3 compliant). In my case, however, it was always
- 1 byte for created time in units of 10 milliseconds
- 4 bytes for created date and time. See how the data is converted in the Timestamps section below.
- 2 bytes for last accessed date (⚠️ no time!)
- 2 bytes for the pointer to the first cluster of the file/directory (high word). If the file is somewhere close to the disk start (cluster number below 65536), it will be equal to 00 00.
- 4 bytes for modified date and time
- 2 bytes for the pointer to the first cluster of the file/directory (low word).
- 4 bytes for the file size in bytes. ⚠️ It’s always 00 00 00 00 for directories!!!
long file name. Can consist of several entries (32 bytes each). One long entry holds up to 13 characters, so if the file name is longer than that, there will be more than one long entry. These are called a set. The last one contains the last characters + extension.
- 1 byte for a status byte. When the entry is the last in the set (1 set for each file), the first nibble of this byte is 4. Otherwise, the byte just indicates the entry’s number in the set.
- 10 bytes for the file name (Unicode chars), i.e. for 5 characters of the name since one Unicode character occupies 2 bytes. If the file name is not long enough, unused bytes are filled with 0xFF.
- 1 byte that’s always 0x0F and indicates a long file name.
- 1 byte reserved. In case it’s 0x00, it’s a long file name. If it’s 0x10, it’s a short file name. So, along with the previous byte it can be used to determine that this is a long name entry. In my case, however, it was always
- 1 byte for error correction (checksum)
- 12 bytes for the next part of the file name (6 Unicode characters). If the file name is not long enough, unused bytes are filled with 0xFF.
- 2 bytes of zeroes
- 4 bytes for the next part of the file name (2 Unicode characters). If the file name is not long enough, unused bytes are filled with 0xFF.
See below the example of a long entry set for a file with the name asdjasdlkjasldkjsalkdaskljdjaljdajd.txt. The lowest part is the short file name. Then there are three long file name entries. The first byte in the yellow area is 0x01, meaning it’s the first entry in the set. The next one (green) has the first byte (sequence byte) set to 0x02, indicating it’s the second entry in the set. The last one (on the top, colored in red) has the status byte 0x43. The first nibble (4) indicates that this is the last long entry in the set. The second nibble (3) is the sequence number.
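The status byte from this example can be split with two bit masks. A small sketch (the helper name `decode_lfn_status` is made up; it assumes bit 0x40 marks the last entry of the set and the low bits carry the sequence number, which matches the 0x43 example):

```python
def decode_lfn_status(status: int) -> tuple:
    """Split an LFN status byte into (sequence number, is last entry in the set)."""
    is_last = bool(status & 0x40)   # the "4" in the first nibble of 0x43
    sequence = status & 0x3F        # remaining low bits: position in the set
    return sequence, is_last

for value in (0x01, 0x02, 0x43):
    print(hex(value), decode_lfn_status(value))
```

Reading a set from disk therefore means walking upwards from the short entry: sequence 1 sits right above it, and the entry flagged as last is the topmost one.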
If the entry is a folder, its size will be 00 00 00 00 (the last 4 bytes). In its short file name entry find the first cluster, go to this cluster and you’ll see another list of files, this time the ones that are in this folder. Find the entry for the file you are looking for.
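The short entry layout described above can be checked with a few lines of Python. This is only a sketch: `parse_short_entry` and the sample entry are invented for illustration, and the byte positions are the ones that follow from adding up the field sizes (name 0-7, extension 8-10, attributes 11, first cluster high word 20-21, low word 26-27, size 28-31).

```python
import struct

# Attribute bits as listed in the text.
ATTRIBUTES = {
    0x01: "read-only", 0x02: "hidden", 0x04: "system",
    0x08: "volume label", 0x10: "directory", 0x20: "archive",
}

def parse_short_entry(entry: bytes) -> dict:
    """Decode the most useful fields of a 32-byte short file name entry."""
    if len(entry) != 32:
        raise ValueError("a directory entry is exactly 32 bytes")
    attr = entry[11]
    high_word, = struct.unpack_from("<H", entry, 20)
    low_word, = struct.unpack_from("<H", entry, 26)
    size, = struct.unpack_from("<I", entry, 28)
    return {
        "name": entry[0:8].decode("ascii", errors="replace").rstrip(),
        "extension": entry[8:11].decode("ascii", errors="replace").rstrip(),
        "attributes": [name for bit, name in ATTRIBUTES.items() if attr & bit],
        # The high word counts in units of 65536.
        "first_cluster": (high_word << 16) | low_word,
        "size": size,
    }

# Hypothetical directory entry: attribute 0x10, high word 1, low word 5, size 0.
demo = (b"FOLDER1 " + b"   " + bytes([0x10]) + bytes(8)
        + struct.pack("<H", 1) + bytes(4)
        + struct.pack("<H", 5) + struct.pack("<I", 0))
parsed = parse_short_entry(demo)
print(parsed)
```

An entry whose attribute byte is 0x0F decodes here as read-only + hidden + system + volume label, which is exactly the combination that marks a long file name entry.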
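FAT high word: it counts in steps of 65536, so a high word of 1 means the first cluster is 65536 + the low word, a high word of 2 means 131072 + the low word, and so on.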
❓ How to get there?
From the FAT32 Boot Record get the following information:
- bytes per sector (2 bytes at offset 0x0B) - marked with orange 🍊
- sectors per cluster (1 byte at offset 0x0D) - marked with green 🍏
- number of reserved sectors (2 bytes at offset 0x0E) - marked with dark blue 🫐
- number of FATs (1 byte at offset 0x10) - marked with coral 🦞
- sectors per FAT (4 bytes at offset 0x24) - marked with blue 🌊
All this information is needed to get the offset to the start of the data area (in sectors!!!!). Since the root directory is the first thing in the data area (FAT32 only), this will give us what we are looking for. The root directory usually starts at cluster 2, but better check the value at offset 0x2C (keep in mind the endianness). Then, we need to calculate the number of bytes to the data area/root directory.
int root = (number_of_FATs * sectors_per_fat) + reserved_sectors
Here is an example:
The values on the right pane represent raw data in hex, and on the left is its human-readable interpretation by Active@Disk Editor using a FAT32 Boot Record template. We get the value at 0x0B, which is 0x200 (when converted from little-endian) and equals 512 in decimal (marked in red). These are bytes per sector 🍊.
We then go to offset 0x0D, which holds 0x8, i.e. 8 in decimal (marked in green). That’s sectors per cluster 🍏. Reading at offset 0x0E 🫐 I get the number of reserved sectors (4110 in decimal). The number of FATs 🦞 is 2 (standard for FAT32) at offset 0x10 and the number of sectors per FAT 🌊 is 2041 in decimal.
Let’s use the data in the formula: (2 * 2041) + 4110 = 8192. This is the starting sector of the data area, and usually the root directory starts there too. Now, to get the number of bytes to this root directory, we need to multiply 8192 by bytes per sector (512) = 4194304 in decimal (0x400000 in hex). This offset is relative to the start of the VBR ( ⚠️ not the GUID/MBR header). Our VBR starts at 0x00010000 in hex or 65536 in decimal. Adding 4194304 to 65536 (or 0x10000) will give us the offset of the root directory from the start of the disk. This value is 4259840 (0x410000). Let’s go there in a hex editor or, if you are using Active@Disk Editor, with the Go to offset button on the top pane. Voila!
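The arithmetic of this walkthrough fits in one small function. A sketch, with made-up names (`data_area_offset`, `vbr_offset`) and the numbers taken from the example above:

```python
def data_area_offset(bytes_per_sector: int, reserved_sectors: int,
                     number_of_fats: int, sectors_per_fat: int,
                     vbr_offset: int = 0) -> int:
    """Byte offset of the data area (and usually the FAT32 root directory)."""
    # Sectors between the VBR and the data area: reserved sectors plus all FATs.
    first_data_sector = number_of_fats * sectors_per_fat + reserved_sectors
    # Convert to bytes and shift by where the volume itself starts on the disk.
    return vbr_offset + first_data_sector * bytes_per_sector

relative = data_area_offset(512, 4110, 2, 2041)
absolute = data_area_offset(512, 4110, 2, 2041, vbr_offset=0x10000)
print(hex(relative), hex(absolute))  # 0x400000 0x410000
```

Leaving `vbr_offset` at 0 gives the offset inside a volume image, while passing the volume start gives the offset inside a full disk image.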
You may ask, what’s the use of this information (knowing exactly how to find the offset of the root directory). There might be cases when we need to perform manual file recovery using a hex editor (remember, Active@Disk Editor is for Windows and Linux only) or with limited tools. Also, I find it useful to understand what the tool does, since tools fail too sometimes, as well as humans. We need to verify each other from time to time.
❓ One question that remains for me here: why the hell is the root cluster 0x2? The second cluster is 2 * (8 * 512) = 8192 bytes away from the VBR start and there is no root dir there. However, when clicking on the field in Active@Disk Editor, you’ll get to the root dir.
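A likely answer to this question: cluster numbers are not counted from the VBR. The first two FAT entries are taken by the media descriptor and the FAT type value, so the first cluster that actually holds data gets the number 2 and sits right at the start of the data area. Under that assumption the conversion looks like this (the helper name `cluster_to_offset` is made up; 0x410000 is the data area offset from the example above):

```python
def cluster_to_offset(cluster: int, data_area_offset: int,
                      bytes_per_sector: int, sectors_per_cluster: int) -> int:
    """Byte offset of a cluster, counting cluster 2 as the first data cluster."""
    if cluster < 2:
        raise ValueError("cluster numbers 0 and 1 do not map to the data area")
    cluster_size = bytes_per_sector * sectors_per_cluster
    return data_area_offset + (cluster - 2) * cluster_size

root = cluster_to_offset(2, 0x410000, 512, 8)
third = cluster_to_offset(3, 0x410000, 512, 8)
print(hex(root), hex(third))  # 0x410000 0x411000
```

With this numbering, root cluster 0x2 points at 0x410000, which matches where the disk editor jumps to.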
Directories (not Root)
Entries for directories have 00 00 00 00 as file size, 0x10 as attributes value and 20 20 20 as extension. They point to some location where files are listed. So, basically, a folder is just a pointer.
Root directory is the first in the tree, but not the only one. Each directory on the drive will have its own “table of contents” and its structure is a little different. It also consists of file name entries, each - 32 bytes long for FAT32.
The first 32 bytes start with a 0x2E byte (. in ASCII). The rest of the information has the same structure as an ordinary SFN (short file name) entry. In this case, it describes this directory itself.
The next 32 bytes start with a double 0x2E byte (. in ASCII), i.e. “..”. The first cluster number would be 00 00 if the parent is the root directory. The rest of the information has the same structure as an ordinary SFN (short file name) entry. In this case, it describes the parent directory.
🗒 If you are using a Terminal, console or PowerShell and cd somewhere from time to time, both of these entries are no mystery to you. For those who are not: . denotes the current directory, .. the parent one. For example, say folder4 (the current directory) is inside folder1, which in turn is in the root.
If we opened folder4, the value of . (current directory) is the address of folder4. While remaining in this folder, .. (parent directory) for folder4 is folder1. .. (parent directory) for folder1 is root.
File Allocation Table (FAT)
Keeps track of clusters in use and free ones. There are FAT1 and FAT2 (the same). FAT2 is for backup. Both are located in the system area. The FAT is also a singly-linked list: each entry points to the next cluster of a file (in case the file occupies more than one cluster).
The first four bytes in the FAT32 (two bytes in FAT16, one and a half in FAT12) are the media descriptor (0xF8FFFFFF, usually indicates a fixed disk). The next four bytes are for the FAT type (0xFFFFFFFF in case of FAT32). Then each four bytes tell us about one cluster, ordered sequentially, i.e. CL-2 (contains the root directory in FAT32), then CL-3, then CL-4 etc. For our purposes an entry can have three kinds of values (written here as the bytes appear on disk):
0x00000000 - the cluster is free,
0xffffff0f - the cluster is the last in the chain (end of file, EOF marker),
anything else - a pointer to the next cluster (convert from little-endian and then to decimal to find it).
🛠 Using Active@Disk Editor, press the Navigate > Primary FAT32 > FAT1 button on the top pane. Here is the example from my flash drive:
Discarding the media descriptor and FAT version (the first 8 bytes), in the picture above we can see several clusters, most of them containing the EOF marker (0xFFFFFF0F). That means most of the files occupy one cluster only. Clusters 2-12 are marked with green and orange rectangles for better readability. Cluster 13 is marked in red and its value is 330 (0x14A) after converting from little-endian. That means that cluster 13 is not the last one in the chain and the next cluster is cluster number 330.
Now, this is what a contiguous file would look like in the FAT (a file that occupies more than one cluster, with the clusters following one another on disk):
Let’s read each 4 bytes starting from the entry with the value 0x70 00 00 00. This is the first cluster in the chain and it points to the next cluster (cluster 0x70, i.e. cluster 112 in decimal). Cluster 112 (0x70), in turn, points to the next cluster in the chain (0x71, i.e. 113) and so on and so forth. As you may notice, such chains are easy to spot.
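Following a chain by hand gets tedious, so here is a minimal sketch of the same walk in Python. The function name `follow_chain` and the tiny FAT are invented for the demo. It assumes 4-byte little-endian entries, keeps only the low 28 bits of each value, and treats anything from 0x0FFFFFF8 upwards as an end-of-chain marker (the FF FF FF 0F bytes seen above are 0x0FFFFFFF).

```python
import struct

END_OF_CHAIN_MIN = 0x0FFFFFF8  # values at or above this end a chain

def follow_chain(fat: bytes, first_cluster: int) -> list:
    """Return the list of clusters that make up a file, starting at first_cluster."""
    chain = []
    seen = set()
    cluster = first_cluster
    while 2 <= cluster < END_OF_CHAIN_MIN and cluster not in seen:
        if cluster * 4 + 4 > len(fat):
            break  # pointer leads outside the table we were given
        chain.append(cluster)
        seen.add(cluster)  # guards against loops in a damaged FAT
        entry, = struct.unpack_from("<I", fat, cluster * 4)
        cluster = entry & 0x0FFFFFFF  # only 28 bits address clusters
    return chain

# Demo FAT: entries 0 and 1 are reserved, cluster 2 is a one-cluster file,
# clusters 3 -> 4 -> 5 form a contiguous three-cluster file.
entries = [0x0FFFFFF8, 0xFFFFFFFF, 0x0FFFFFFF, 4, 5, 0x0FFFFFFF]
fat = struct.pack("<6I", *entries)

print(follow_chain(fat, 2))  # [2]
print(follow_chain(fat, 3))  # [3, 4, 5]
```

A free entry (0x00000000) also stops the walk, which is what you would hit when following the first cluster of a deleted file.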
To make it easier to read the FAT32 table (which has 4 bytes for each entry), you may change view preferences in Active@Disk Editor (File > Preferences > Disk Editor > Bytes per line > 4).
Timestamps
Local times, not UTC! For Last Accessed we only have the date, no time!
Four bytes for date and time created (the first two bytes for time 🕰 and the second two bytes for date 📆). The time bytes are first converted from the little-endian notation (the two bytes are flipped). In the example below, for the yellow short entry the created date and time are 0x55 0x6E 0x31 0x53. Bytes 0x55 0x6E are for the time and 0x31 0x53 are for the date. Let’s take the time bytes 0x55 0x6E. Flip them to convert from the little-endian notation: 0x6E 0x55. Convert each nibble to a binary value: 6 = b0110, E = b1110, 5 = b0101, 5 = b0101. Now, write them in a row and split them using the template 5 bits - 6 bits - 5 bits: b01101 b110010 b10101. The first five bits are for hours, the next six bits are for minutes and the last five are for seconds, counted in 2-second steps. Each value is then converted into decimal separately, which gives us 13 hrs, 50 mins and 21 * 2 = 42 seconds (the picture above only shows hours and minutes). For the date 📆 value the template is 7-4-5 (years since 1980, month, day), but the process is the same.
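The same conversion can be done with shifts and masks instead of writing out the bits. A sketch (the name `decode_fat_datetime` is made up) that uses the four bytes from the example; note that the seconds field counts 2-second steps, so a raw value of 21 means 42 seconds, and that the year is stored as an offset from 1980:

```python
import struct

def decode_fat_datetime(raw: bytes) -> tuple:
    """Decode 4 bytes (time word, then date word, both little-endian)."""
    time_word, date_word = struct.unpack("<HH", raw)
    hours = time_word >> 11               # top 5 bits
    minutes = (time_word >> 5) & 0x3F     # middle 6 bits
    seconds = (time_word & 0x1F) * 2      # low 5 bits, 2-second resolution
    year = 1980 + (date_word >> 9)        # top 7 bits
    month = (date_word >> 5) & 0x0F       # middle 4 bits
    day = date_word & 0x1F                # low 5 bits
    return year, month, day, hours, minutes, seconds

created = decode_fat_datetime(bytes([0x55, 0x6E, 0x31, 0x53]))
print(created)  # (2021, 9, 17, 13, 50, 42)
```

The 1-byte "created time in 10 milliseconds" field of the short entry can be added on top of this to recover odd seconds.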
FAT File Creation And Deletion
Steps that are taken when a file is created:
- A directory entry is created and written to the parent directory
- Data is written to the first available cluster
- Entries in the FAT1 and FAT2 are made for all the clusters used
Steps that are taken when a file is deleted:
- The first character of the directory entry is changed to 0xE5.
- The FAT1 and FAT2 entries for the file’s clusters are filled with zeroes.
- The data area remains unchanged (⚠️ data is still out there!).
exFAT also consists of System (Boot sector, backup boot sector, FAT1) and Data areas. It doesn’t have FAT2.
The boot sector is located at sector 0 of the volume. It contains the information about the volume (as usual).
⚠️ Offsets are relative to the start of the volume.
In general, it looks very close to FAT32. Something new: bytes per sector shift (between 9 and 12) and sectors per cluster shift (0-25). The value in such a field is the power to which we need to raise 2 to get the result. For example, if the field bytes per sector shift is set to 9, we raise 2 to the power of 9 (2^9 = 512, which means each sector is 512 bytes long). The same math is applied to the sectors per cluster shift field.
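In code, "raise 2 to the power of the field" is a single bit shift. A small sketch, assuming the second shift field counts sectors per cluster (the exFAT specification names it SectorsPerClusterShift); the function name `exfat_sizes` is made up:

```python
def exfat_sizes(bytes_per_sector_shift: int, sectors_per_cluster_shift: int) -> tuple:
    """Turn the two exFAT shift fields into real sizes."""
    bytes_per_sector = 1 << bytes_per_sector_shift        # 2 ** shift
    sectors_per_cluster = 1 << sectors_per_cluster_shift  # 2 ** shift
    bytes_per_cluster = bytes_per_sector * sectors_per_cluster
    return bytes_per_sector, sectors_per_cluster, bytes_per_cluster

print(exfat_sizes(9, 3))  # (512, 8, 4096)
```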
32-bit entries. Media descriptor is the same: 0xFF FF FF F8. Only tracks file fragmentation and doesn’t track file allocation (Bitmap is used for that instead)!
Types of directory entries:
- volume label (critical primary), 0x83. User-created name for the volume. ⚠️ Must be there!
- file directory (critical primary) 🍇. Tracks attributes, MAC times (UTC), ⚠️ but doesn’t point to the parent directory (no .. entry like in FAT32). 0x85 - in use, 0x05 - free. ⚠️ All files will have this entry!
- stream extension directory (critical secondary) 🍇. Size and start of the file. ⚠️ Size of the filename is here (in characters)! Starts with 0xC0 if in use, 0x40 if not. Has a Not FAT chain flag: if set, it’s not a fragmented file. Also, there are two interesting values: Valid data length (init size) and Data length. If, say, a file was being downloaded, the FS will allocate a certain amount of space. But if that download was interrupted, the file won’t occupy all the space, thus these two values will be different.
- file name (critical secondary) 🍇. Unicode for file name. 0xC1 if in use, 0x41 if not. Up to 15 Unicode chars per entry. There might be more than one (like long file names in FAT, in case it’s a long name).
- system files (critical primary)
  - Bitmap, 0x81. Usually starts at cluster 2.
  - Upcase table, 0x82. Usually starts at cluster 4.
Entries marked with 🍇 are those that make up a directory set. Below is the breakdown of a directory set. The first byte (0x85 in this case) indicates that the file is currently in use. It would be 0x05 if it were deleted.
The next byte is the secondary count, which indicates how many other directory entries we have in this file directory entry set (❓). The next two bytes are for error checking and then the next two, 0x3A 00, are attributes. Then we have MAC times. We have a last accessed time, which we didn’t in FAT32.
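The in use / not in use pairs (0x85 and 0x05, 0xC1 and 0x41) differ only in the top bit, so the first byte of an entry can be decoded with bit masks. A sketch, assuming the entry type layout from the exFAT specification (bit 7 in use, bit 6 secondary, bit 5 benign, low 5 bits type code); the function name is made up:

```python
def decode_entry_type(first_byte: int) -> dict:
    """Split the first byte of an exFAT directory entry into its parts."""
    return {
        "in_use": bool(first_byte & 0x80),
        "category": "secondary" if first_byte & 0x40 else "primary",
        "importance": "benign" if first_byte & 0x20 else "critical",
        "type_code": first_byte & 0x1F,
    }

for value in (0x85, 0x05, 0xC0, 0xC1, 0x41):
    print(hex(value), decode_entry_type(value))
```

Clearing bit 7 is all it takes to mark an entry as deleted; the rest of the 32 bytes can stay as they were.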
exFAT has two additional files that FAT32 does not. They probably come from the NTFS…:
Bitmap (tracks which clusters are allocated).
Upcase (table of Unicode chars, used to convert characters for searching).
⚠️ Both files have an entry in the root directory, but don’t have a filename.
UTC! The timezone offset is in 15-minute increments (see a breakdown below). So, we have created, modified and last accessed date/times. Each of these timestamps has a corresponding UTC offset in the file directory entry. It makes sense that these are always the same. In my case it was -2 in decimal, which would make it UTC-2. We now know (in this case) that the MAC times are for the UTC-2 timezone.
⚠️ We have a last accessed time, which we didn’t in FAT32.
Below is the breakdown of how to convert a UTC byte into a human-readable value.
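As a rough sketch of that conversion in Python: the layout assumed here follows the exFAT specification, where the top bit of the byte says whether the offset is valid and the lower 7 bits are a signed (two's complement) count of 15-minute steps. The function name `decode_utc_offset` is made up.

```python
def decode_utc_offset(offset_byte: int):
    """Return the UTC offset in minutes, or None if the offset is not valid."""
    if not offset_byte & 0x80:      # top bit clear: offset field is not in use
        return None
    steps = offset_byte & 0x7F      # 7-bit value
    if steps >= 0x40:               # sign bit of the 7-bit value is set
        steps -= 0x80               # two's complement: make it negative
    return steps * 15               # each step is 15 minutes

print(decode_utc_offset(0x80))  # 0
print(decode_utc_offset(0x84))  # 60
print(decode_utc_offset(0xF8))  # -120
```

So a byte of 0xF8 corresponds to UTC-2 and 0x84 to UTC+1.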
exFAT File Creation and Deletion
Steps that are taken when a file is created:
- directory set created
- bitmap for allocated clusters set to 1
- FAT updated (if fragmented)
- data written to the allocated clusters
🧪 Are the timestamps updated for a deleted file in the directory entry? exFAT, FAT32 and NTFS as well.
Steps that are taken when a file is deleted:
- first byte of each entry in the directory entry set is changed to show the file is not in use (0x05 for a file directory entry, 0x41 for filename entries).
- bitmap entries for this file’s clusters are set to 0
- FAT may or may not be zeroed out
- file/dir contents remain there until, and if, they are overwritten
⚠️ If the parent folder is deleted, the child entries remain unchanged.