
DIY DataRecovery.nl

Undelete: deleted file recovery

This article describes mechanisms behind file deletion and illustrates how deleted files can be recovered or undeleted.
The focus of the article is on the Microsoft FAT and NTFS file systems.

File deletion

The main purpose of any file system is organizing data on a disk. Files should be safely stored and retrieved. File systems keep structures to remember where free space is available, whether clusters are bad and where files and directories are located. 

The cluster is the smallest addressable unit that can be allocated to a file. Cluster sizes can vary from 1 to 128 sectors (depending on the size of the partition). Fixed variables like the cluster size are set when a partition is formatted and kept in the boot sector of a volume. A volume is a formatted portion of a hard disk that is recognized by Windows as a separate drive.

Files are organized in directories or folders. On-disk structures are maintained to record the names, the creation, modification and access dates, and various other attributes and properties of directories and files.

Typically, when a file is deleted only these structures are updated: the actual file contents remain untouched on the disk.

File deletion and file undelete strategies for FAT based file systems:

If a file is deleted on a FAT file system, the first character of the file name in the directory entry is replaced by a special character (E5h), which causes the operating system (Windows, DOS, etc.) to ignore the file. In addition, all clusters allocated to the file are marked as 'available' in the File Allocation Table (FAT for short). In the following example Myfile.doc is deleted. Assuming a 4 KB cluster size (= 8 sectors of 512 bytes), the 12 clusters previously allocated to Myfile.doc are now marked 'available'.

From this point on the directory entry and clusters previously allocated to the file can be used to store new data! If that happens the file can no longer be undeleted or recovered.

Directory entry on FAT file system (simplified)
File Name   Extension   Attributes   Date       Start Cluster   File Size
~yfile      doc         archive      16.10.06   7000            45596 bytes


File Allocation Table (simplified)
6980           6986 6987 6988 EOF 6990 6991 EOF              just some files
7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 EOF   our deleted file
7040 7041 7042 7064                                          1st part of fragmented deleted file
7060           7065 7066 7067 EOF                            2nd part of fragmented deleted file

(EOF = End Of File)
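
As a rough illustration of the deletion step described above, the following Python sketch models a directory entry and a FAT as plain Python objects. The dictionary layout, the FREE and EOF markers and the function names are assumptions made for this example only; they are not the real on-disk encoding. The chain is built with the 12 clusters mentioned in the text.

```python
import math

CLUSTER_SIZE = 4096      # 4 KB cluster, as assumed in the example
FREE, EOF = 0, -1        # illustrative markers, not real on-disk FAT values
DELETED_MARK = 0xE5      # byte that replaces the first character of the name

def clusters_needed(file_size, cluster_size=CLUSTER_SIZE):
    """Number of whole clusters required to hold file_size bytes."""
    return math.ceil(file_size / cluster_size)

def delete_file(entry, fat):
    """Release the file's cluster chain and mark the directory entry deleted."""
    cluster = entry["start_cluster"]
    while cluster != EOF:
        next_cluster = fat[cluster]
        fat[cluster] = FREE              # cluster is now 'available'
        cluster = next_cluster
    entry["name"] = bytes([DELETED_MARK]) + entry["name"][1:]

# Build the chain for Myfile.doc, starting at cluster 7000.
count = clusters_needed(45596)
fat = {c: c + 1 for c in range(7000, 7000 + count - 1)}
fat[7000 + count - 1] = EOF
entry = {"name": b"MYFILE  ", "ext": b"DOC", "start_cluster": 7000, "size": 45596}

delete_file(entry, fat)
print(count)             # 12
print(entry["name"])     # b'\xe5YFILE  '
```

Note that the start cluster and the file size survive in the entry, while the links between the clusters are gone.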

File undelete software scans the disk structures and locates directory entries marked by the special delete character. 
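
A minimal sketch of such a scan is shown below. It assumes the standard 32-byte FAT short-name directory entry (name at offset 0, attributes at offset 11, high word of the start cluster at offset 20, low word at offset 26, file size at offset 28). The functions find_deleted_entries and make_entry are hypothetical helpers written for this example, and the sample directory is built in memory rather than read from a disk.

```python
import struct

ENTRY_SIZE = 32
DELETED_MARK = 0xE5
ATTR_LONG_NAME = 0x0F    # attribute value used by long file name entries

def find_deleted_entries(directory):
    """Return (name, ext, start_cluster, size) for each deleted short-name entry."""
    found = []
    for offset in range(0, len(directory) - ENTRY_SIZE + 1, ENTRY_SIZE):
        raw = directory[offset:offset + ENTRY_SIZE]
        if raw[0] == 0x00:               # no further entries in this directory
            break
        if raw[0] != DELETED_MARK or raw[11] == ATTR_LONG_NAME:
            continue
        cluster_hi, = struct.unpack_from("<H", raw, 20)
        cluster_lo, size = struct.unpack_from("<HI", raw, 26)
        # The first character is lost, so show a placeholder instead.
        name = "?" + raw[1:8].decode("ascii", "replace").rstrip()
        ext = raw[8:11].decode("ascii", "replace").rstrip()
        found.append((name, ext, (cluster_hi << 16) | cluster_lo, size))
    return found

def make_entry(name, ext, start_cluster, size, attr=0x20):
    """Build a 32-byte short-name entry for the demo (hypothetical helper)."""
    raw = bytearray(ENTRY_SIZE)
    raw[0:8] = name.ljust(8).encode("ascii")
    raw[8:11] = ext.ljust(3).encode("ascii")
    raw[11] = attr
    struct.pack_into("<H", raw, 20, start_cluster >> 16)
    struct.pack_into("<HI", raw, 26, start_cluster & 0xFFFF, size)
    return raw

live = make_entry("README", "TXT", 6986, 9000)
gone = make_entry("MYFILE", "DOC", 7000, 45596)
gone[0] = DELETED_MARK                   # what deletion does to the entry

print(find_deleted_entries(bytes(live + gone)))
# [('?YFILE', 'DOC', 7000, 45596)]
```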

Old-style undelete utilities like Norton Undelete would now prompt the user to enter the first character of the file name of the deleted file. They would then patch (edit) the directory entry and the FAT to undelete the file 'in place'.

Modern undelete software does not modify the file system structures as it is considered bad practice to do so. Instead it will prompt the user for a location where it can recover the file to. Information from the directory entry is used to retrieve file name information. Data as it is found in clusters previously allocated to the deleted file is copied in binary form to a new file: the file is now recovered.

A deleted file can only be recovered intact if its data was stored in consecutive clusters. In the FAT, the entries for the clusters previously allocated to the file have been reset to 'available', so the undelete software has no way to tell which clusters (apart from the very first one, which is recorded in the directory entry) belonged to the file. It can only assume that the file occupied the clusters immediately following the start cluster.
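
The sketch below illustrates this limitation on a tiny in-memory image. It assumes the usual FAT convention that the first data cluster is number 2. The cluster size, the DATA_START offset and the helper names are made up for the demo. The same recovery routine succeeds for a contiguous file and returns wrong data for a fragmented one.

```python
CLUSTER_SIZE = 16        # tiny cluster size to keep the demo readable
DATA_START = 32          # byte offset of cluster 2 in the image (assumed)

def cluster_offset(cluster):
    """Byte offset of a cluster; data clusters are numbered from 2."""
    return DATA_START + (cluster - 2) * CLUSTER_SIZE

def recover_contiguous(image, start_cluster, file_size):
    """Copy file_size bytes from start_cluster on, assuming no fragmentation."""
    offset = cluster_offset(start_cluster)
    return bytes(image[offset:offset + file_size])

def write_clusters(image, clusters, data):
    """Demo helper: store data in the given clusters, in order."""
    for i, cluster in enumerate(clusters):
        chunk = data[i * CLUSTER_SIZE:(i + 1) * CLUSTER_SIZE]
        offset = cluster_offset(cluster)
        image[offset:offset + len(chunk)] = chunk

content = bytes(range(1, 41))            # 40 bytes, needs 3 clusters
image = bytearray(DATA_START + 16 * CLUSTER_SIZE)
write_clusters(image, [2, 3, 4], content)     # contiguous copy
write_clusters(image, [8, 9, 12], content)    # fragmented copy

print(recover_contiguous(image, 2, len(content)) == content)   # True
print(recover_contiguous(image, 8, len(content)) == content)   # False
```

For the fragmented copy the routine reads clusters 8, 9 and 10, but the last part of the file actually lives in cluster 12.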

File deletion and file undelete strategies for the NTFS file system:

The central control structure of the NTFS file system is the Master File Table, or MFT for short. For each file on an NTFS volume an MFT entry, also called a File Record Segment (FRS), is created. All information needed to locate the data allocated to the file can be retrieved from this MFT entry. Because of this, fragmented files that were deleted can usually be recovered intact.

A file consists of multiple attributes: a file name, a data attribute, etc. Attributes are dynamic; they do not have fixed sizes. To determine which attributes a file has and where they are located in the MFT entry, an attribute list is kept.

For small files the data can even be stored directly in the MFT entry. The data of larger files is stored outside the MFT entry and is located using so-called run lists. A run describes a start cluster and a length; a fragmented file has one run for each fragment.
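
To make the idea of a run list concrete, here is a hedged sketch of a decoder for the commonly documented NTFS encoding: each run starts with a header byte whose low nibble gives the size of the length field and whose high nibble gives the size of the offset field, the offset is a signed value relative to the start of the previous run, and a zero byte ends the list. The function name decode_run_list and the sample bytes are made up for illustration. The sample encodes two fragments of 3 and 4 clusters, starting at clusters 7040 and 7064.

```python
def decode_run_list(raw):
    """Decode an NTFS run list into (start_cluster, length) pairs.

    start_cluster is None for a sparse run (no clusters allocated on disk).
    """
    runs = []
    pos = 0
    previous_start = 0
    while pos < len(raw) and raw[pos] != 0x00:
        header = raw[pos]
        length_size = header & 0x0F      # bytes used for the run length
        offset_size = header >> 4        # bytes used for the start offset
        pos += 1
        length = int.from_bytes(raw[pos:pos + length_size], "little")
        pos += length_size
        if offset_size == 0:
            runs.append((None, length))
        else:
            # The offset is signed and relative to the previous run's start.
            delta = int.from_bytes(raw[pos:pos + offset_size], "little", signed=True)
            pos += offset_size
            previous_start += delta
            runs.append((previous_start, length))
    return runs

# Run 1: 3 clusters at 7040 (0x1B80); run 2: 4 clusters, 24 clusters further on.
sample = bytes.fromhex("21 03 80 1B 11 04 18 00")
print(decode_run_list(sample))
# [(7040, 3), (7064, 4)]
```

Because every fragment is described explicitly, recovery software does not have to guess which clusters belonged to the file.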

When a file is deleted, a special flag in the MFT entry is cleared so that the entry is no longer marked 'in use'. The entry itself remains largely intact, and the actual data allocated to the file remains untouched.

From this point on the MFT entry and clusters previously allocated to the file can be used to store new data! If that happens the file can no longer be undeleted or recovered.

MFT Entry or File Record Segment (simplified)
Standard information Attribute List File Name attribute Data attribute  ...  ...

File undelete software scans the MFT for entries that are flagged 'no longer in use'. From such an entry the file name and the data runs can be determined. Although the NTFS file system is more complex and dynamic than FAT-based file systems, the recoverability of deleted data is in general far better.
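
The scan can be sketched roughly as follows. The example assumes the commonly documented record header layout (the signature "FILE" at offset 0 and a 16-bit flags field at offset 0x16, where bit 0x0001 means 'in use') and a record size of 1024 bytes; on a real volume the record size must be derived from the boot sector. The names find_deleted_records and make_record are hypothetical, and the MFT here is a small buffer built in memory.

```python
import struct

RECORD_SIZE = 1024       # common FRS size; assumed here, not read from a volume
FLAGS_OFFSET = 0x16
FLAG_IN_USE = 0x0001
FLAG_DIRECTORY = 0x0002

def find_deleted_records(mft):
    """Return the numbers of valid records whose 'in use' flag is cleared."""
    deleted = []
    for number in range(len(mft) // RECORD_SIZE):
        record = mft[number * RECORD_SIZE:(number + 1) * RECORD_SIZE]
        if record[:4] != b"FILE":        # never used or damaged record
            continue
        flags, = struct.unpack_from("<H", record, FLAGS_OFFSET)
        if not flags & FLAG_IN_USE:
            deleted.append(number)
    return deleted

def make_record(flags):
    """Build a minimal record header for the demo (hypothetical helper)."""
    record = bytearray(RECORD_SIZE)
    record[:4] = b"FILE"
    struct.pack_into("<H", record, FLAGS_OFFSET, flags)
    return bytes(record)

mft = (make_record(FLAG_IN_USE)                      # record 0: live file
       + make_record(0)                              # record 1: deleted file
       + make_record(FLAG_IN_USE | FLAG_DIRECTORY)   # record 2: live directory
       + bytes(RECORD_SIZE))                         # record 3: never used

print(find_deleted_records(mft))   # [1]
```

A real tool would go on to parse the file name and data attributes of each record it finds.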

DIY DataRecovery. All rights reserved.