1 NTFS In Detail High-Performance Database Research Center
School of Computer and Information Sciences Francisco R. Ortega, Ph.D.

2 The Sleuth Kit
File System Forensic Analysis, Chapters 11, 12, 13. The Sleuth Kit, from its site, "is a collection of command line tools and a C library that allows you to analyze disk images and recover files from them. It is used behind the scenes in Autopsy and many other open source and commercial forensics tools."

3 The Sleuth Kit

4 Important Links http://wiki.sleuthkit.org/
Tools overview

5 Examples of Sleuth Kit
istat -f ntfs ntfs.dd 49 (displays the attributes of MFT entry 49)
icat -f ntfs ntfs.dd 49 (outputs the same content as naming entry 49, attribute 48-2 explicitly; we will do demonstrations)
icat -f ntfs ntfs.dd | xxd | more (view the output as hex)
fls -r -f ntfs ntfs.dd | more (recursively lists all the files and their attributes)

6 Let’s review NTFS in more detail
Everything is a file, including the file system's administrative data, which can be located anywhere in the volume, like a normal file can. The entire file system is considered a data area, and any sector can be allocated to a file. The only consistent layout is that the first sectors of the volume contain the boot sector and boot code.

7 File Record

8 Relationship between the Boot Sector and $MFT
The $MFT itself may be fragmented across different clusters.

9 MFT entry
The size of each MFT entry is defined in the boot sector, but all versions from Microsoft have used a size of 1,024 bytes. The first 42 bytes of the data structure contain 12 fields, and the remaining 982 bytes are unstructured and can be filled with attributes.

10 MFT entry The first field in each MFT entry is the signature, and a standard entry will have the ASCII string "FILE." If an error is found in the entry, it may have the string "BAAD." There is also a flag field that identifies if the entry is being used and if the entry is for a directory. The allocation status of an MFT entry also can be determined from the $BITMAP attribute in the $MFT file.
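The fixed header fields described above can be sketched with a short parser. A minimal Python sketch, assuming the commonly documented entry layout (signature at offset 0, sequence number at offset 16, flags at offset 22); the field names are ours:

```python
import struct

def parse_mft_header(entry: bytes):
    """Parse a few fixed fields of a 1,024-byte MFT entry.
    Offsets follow the commonly documented NTFS layout."""
    sig = entry[0:4]                                 # "FILE" or "BAAD"
    if sig not in (b"FILE", b"BAAD"):
        raise ValueError("not an MFT entry")
    seq = struct.unpack_from("<H", entry, 16)[0]     # sequence number
    flags = struct.unpack_from("<H", entry, 22)[0]   # 0x01 in use, 0x02 directory
    return {
        "signature": sig.decode(),
        "sequence": seq,
        "in_use": bool(flags & 0x01),
        "is_directory": bool(flags & 0x02),
    }

# Build a synthetic in-use file entry for illustration.
raw = bytearray(1024)
raw[0:4] = b"FILE"
struct.pack_into("<H", raw, 16, 2)     # sequence number 2
struct.pack_into("<H", raw, 22, 0x01)  # in-use flag set
hdr = parse_mft_header(bytes(raw))
print(hdr)  # {'signature': 'FILE', 'sequence': 2, 'in_use': True, 'is_directory': False}
```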

11 MFT entry If a file cannot fit its attributes into one entry,
it can use multiple entries. When this occurs, the first entry is called the base file record, or base MFT entry, and each of the subsequent entries contains the address of the base entry in one of its fixed fields.

12 MFT entry Each MFT entry is sequentially addressed using a 48-bit value and the first entry has an address of 0. The maximum MFT address changes as the MFT grows and is determined by dividing the size of $MFT by the size of each entry. Microsoft calls this sequential address the file number. sizeof($MFT)/sizeof(MFT_ENTRY)

13 MFT entry Every MFT entry also has a 16-bit sequence number that is incremented when the entry is allocated.

14 MFT entry Example For example, consider MFT entry 313 with a sequence number of 1. The file that allocated entry 313 is deleted, and the entry is reallocated to a new file. When the entry is reallocated, it has a new sequence number of 2.

15 MFT entry The MFT entry address and sequence number are combined, with the sequence number in the upper 16 bits, to form the file reference address.

16 MFT entry NTFS uses the file reference address to refer to MFT entries because the sequence number makes it easier to determine when the file system is in a corrupt state. The sequence number can be useful and it may be explored later.
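The packing described on the last two slides can be sketched in a few lines of Python (a sketch of the layout, not NTFS source):

```python
def make_file_reference(mft_entry: int, sequence: int) -> int:
    """Combine a 48-bit MFT entry address with a 16-bit sequence
    number into the 64-bit NTFS file reference address."""
    assert mft_entry < 2**48 and sequence < 2**16
    return (sequence << 48) | mft_entry

def split_file_reference(ref: int):
    """Recover (entry address, sequence number) from a file reference."""
    return ref & (2**48 - 1), ref >> 48

# Entry 313 after its second allocation (sequence number 2):
ref = make_file_reference(313, 2)
print(hex(ref))  # 0x2000000000139
print(split_file_reference(ref))  # (313, 2)
```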


19 MFT Entry The attribute header identifies the type of attribute, its size, and its name. It also has flags to identify if the value is compressed or encrypted. The attribute type is a numerical identifier based on the data type An MFT entry can have multiple attributes of the same type.

20 MFT Entry Some attributes can be assigned a name, which is stored in UTF-16 Unicode in the attribute header. An attribute also has an identifier value assigned to it that is unique within that MFT entry. If an entry has more than one attribute of the same type, this identifier can be used to differentiate between them.

21 MFT Entry The content of the attribute can have any format and any size; a file could be several MB or GB in size. It is not practical to store this amount of data in an MFT entry, which is only 1,024 bytes. To solve this problem, NTFS provides two locations for attribute content:
A resident attribute stores its content in the MFT entry with the attribute header.
A non-resident attribute stores its content in an external cluster in the file system.


24 MFT attributes A file can have up to 65,536 attributes (because of the 16-bit identifier), so it may need more than one MFT entry to store all the attribute headers (even non-resident attributes need their headers in an MFT entry). When additional MFT entries are allocated to a file, the original MFT entry becomes the base MFT entry. The non-base entries will have the base entry's address in one of their MFT entry fields.

25 MFT attributes The base MFT entry will have an $ATTRIBUTE_LIST type attribute that contains a list with each of the file's attributes and the MFT address in which it can be found. The non-base MFT entries do not have the $FILE_NAME and $STANDARD_INFORMATION attributes in them.

26 MFT attributes NTFS can reduce the space needed by a file by saving some of the non-resident $DATA attribute values as sparse. A sparse attribute is one where clusters that contain all zeros are not written to disk. Instead, a special run is created for the zero clusters. Typically, a run contains the starting cluster location and the size, but a sparse run contains only the size and not a starting location. There is also a flag that indicates if an attribute is sparse.

27 MFT attributes For example, consider a file that should occupy 12 clusters. The first five clusters are non-zero, the next three clusters contain zeros, and the last four clusters are non-zero. When stored as a normal attribute, one run of length 12 may be created for the file. When stored as a sparse attribute, three runs are created and only nine clusters are allocated.
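The example's run lists can be sketched in Python; the tuple representation and the cluster numbers are our own illustration, not the on-disk runlist format:

```python
# A normal run has a starting cluster; a sparse run stores only a
# length (start is None). Cluster number 4000 is made up.
normal = [(4000, 12)]                        # one run of 12 clusters
sparse = [(4000, 5), (None, 3), (4005, 4)]   # 5 data, 3 sparse, 4 data

def clusters_allocated(runs):
    """Clusters that actually consume disk space."""
    return sum(length for start, length in runs if start is not None)

def logical_size(runs):
    """Size of the attribute as the application sees it."""
    return sum(length for _, length in runs)

print(logical_size(normal), logical_size(sparse))          # 12 12
print(clusters_allocated(normal), clusters_allocated(sparse))  # 12 9
```

Both layouts describe the same 12-cluster file, but the sparse version allocates only nine clusters because the zero run costs nothing on disk.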


29 Compression NTFS allows attributes to be written in a compressed format, although the actual algorithm is not given. Note that this is file system-level compression and not the external application-level compression that can be achieved by using zip or gzip. Microsoft says that only the $DATA attribute should be compressed, and only when it is non-resident.

30 Compression NTFS uses both sparse runs and compressed data to reduce the amount of space needed. The attribute header flag identifies whether it is compressed, and the flags in the $STANDARD_INFORMATION and $FILE_NAME attribute also show if the file contains compressed attributes.

31 Compression Before the attribute contents are compressed, the data are broken up into equal sized chunks called compression units. The size of the compression unit is given in the attribute header. There are three situations that can occur with each compression unit:

32 Compression Three situations can occur with each compression unit:
1. All the clusters contain zeros, in which case a run of sparse data is made for the size of the compression unit and no disk space is allocated.
2. When compressed, the resulting data needs the same number of clusters for storage (i.e., the data did not compress much). In this case, the compression unit is not compressed, and a run is made for the original data.
3. When compressed, the resulting data uses fewer clusters. In this case, the data is compressed and stored in a run on the disk. A sparse run follows the compressed run to make the total run length equal to the number of clusters in a compression unit.


34 Simple Example (Compression)
Assume that the compression unit size is 16 clusters and we have a $DATA attribute that is 64 clusters in length. We divide the content into four compression units and examine each.

35 Simple Example (Compression)
The first unit compresses to 16 clusters, so it is not compressed. The second unit is all zeros, so a sparse run of 16 clusters is made for it, and no clusters are allocated. The third unit compresses to 10 clusters, so the compressed data is written to disk in a run of 10 clusters, and a sparse run of six clusters is added to pad the unit out to 16 clusters. The final unit compresses to 16 clusters, so it is not compressed and a run of 16 clusters is created.


37 Another Example This layout is not initially organized using compression units. To process this file, we need to first organize all the data in the six runs and then organize the data into compression units of 16 clusters. After merging the fragmented runs, we see that there is one run of content, one sparse run, more content, and another sparse run.

38 Another Example (2) The merged data are organized into compression units, and we see that the first two units have no sparse runs and are not compressed. The third and fifth units have a sparse run and are compressed. The fourth unit is sparse, and the corresponding data are all zeros.


41 Encryption
In theory, any attribute could be encrypted, but Windows allows only $DATA attributes to be encrypted. When an attribute is encrypted, only the content is encrypted; the attribute header is not. A $LOGGED_UTILITY_STREAM attribute is created for the file, and it contains the keys needed to decrypt the data.

42 Encryption In Windows, a user can choose to encrypt a specific file or a directory. An encrypted directory does not have any encrypted data, but any file or directory that is created in the directory will be encrypted. An encrypted file or directory has a special flag set in the $STANDARD_INFORMATION attribute, and each attribute that is encrypted will have a special flag set in its attribute header.

43 Encryption (How?) When an NTFS $DATA attribute is encrypted, its contents are encrypted with a symmetric algorithm called DESX. One random key is generated for each MFT entry with encrypted data, and it is called the file encryption key (FEK). If there are multiple $DATA attributes in the MFT entry, they are all encrypted with the same FEK.

44 Encryption (How?) The FEK is stored in an encrypted state in the $LOGGED_UTILITY_STREAM attribute. The attribute contains a list of data decryption fields (DDF) and data recovery fields (DRF). A DDF is created for every user who has access to the file, and it contains the user's Security ID (SID), encryption information, and the FEK encrypted with the user's public key.

45 Encryption (How?) A data recovery field is created for each method of data recovery, and it contains the FEK encrypted with a data recovery public key that is used when an administrator, or other authorized user, needs access to the data.

46 Decryption To decrypt a $DATA attribute, the $LOGGED_UTILITY_STREAM attribute is processed and the user's DDF entry is located. The user's private key is used to decrypt the FEK, and the FEK is used to decrypt the $DATA attribute. When access is revoked from a user, her key is removed from the list.

47 Decryption A user's private key is stored in the Windows registry and encrypted with a symmetric algorithm that uses her login password as the key. Therefore, the user's password and the registry are needed to decrypt any encrypted files.

48 Indexes NTFS uses index data structures in many situations, and this section describes them. An index in NTFS is a collection of attributes that is stored in a sorted order. The most common usage of an index is in a directory because directories contain $FILE_NAME attributes.

49 Indexes Prior to version 3.0 of NTFS (which came with Windows 2000), only the $FILE_NAME attribute was stored in an index. Now there are several other uses of indexes, and they contain different attributes. For example, security information is stored in an index, as is quota information.

50 B-Trees An NTFS index sorts attributes into a tree, specifically a B-tree. A tree is a group of data structures called nodes that are linked together such that there is a head node that branches out to the other nodes. Next, we will review B+ trees; NTFS appears to use B+ trees, but other variants exist (B-trees, B*-trees).

51 B+ trees - Motivation
For a clustering index, data records are scattered:
(diagram: a tree with internal keys 6 and 9, branches <6, >6, <9, >9, over records 1 3 6 7 9 13)

52 Solution: B+ trees facilitate sequential ops
They string all leaf nodes together AND replicate keys from non-leaf nodes, to make sure every key appears at the leaf level (vital for a clustering index!).

53 B+ trees
(diagram: internal node with keys 6 and 9; branches <6, >=6 and <9, >=9; linked leaves 1 3 | 6 7 | 9 13)

54 B+ trees
(same diagram as above)

55 B+ trees
More details: next (and textbook). In short, on split:
at the leaf level: COPY the middle key upstairs
at a non-leaf level: PUSH the middle key upstairs (as in a plain B-tree)

56 Example B+ Tree
Search begins at the root, and key comparisons direct it to a leaf (as in ISAM). Search for 5*, 15*, all data entries >= 24* ...
(diagram: root 13 17 24 30; leaves 2* 3* | 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*)
Based on the search for 15*, we know it is not in the tree!

57 B+ Trees in Practice
Typical order: 100. Typical fill-factor: 67%.
average fanout = 2 * 100 * 0.67 = 134
Typical capacities:
Height 4: 134^4 = 322,417,936 entries
Height 3: 134^3 = 2,406,104 entries

58 B+ Trees in Practice
Can often keep top levels in the buffer pool:
Level 1 = 1 page = 8 KB
Level 2 = 134 pages ≈ 1 MB
Level 3 = 17,956 pages ≈ 140 MB
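The arithmetic on the last two slides can be checked directly (assuming 8 KB pages):

```python
# Capacity arithmetic for a B+ tree of order 100 at a 67% fill factor.
fanout = round(2 * 100 * 0.67)          # average children per node: 134
entries_h3 = fanout ** 3                # entries reachable at height 3
entries_h4 = fanout ** 4                # entries reachable at height 4
print(fanout, entries_h3, entries_h4)   # 134 2406104 322417936

# Top levels fit in memory, assuming 8 KB pages:
pages_l3 = fanout ** 2                  # pages at level 3: 17,956
print(pages_l3 * 8 // 1024, "MB")       # 140 MB
```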

59 Inserting a Data Entry into a B+ Tree
Find the correct leaf L. Put the data entry onto L.
If L has enough space, done!
Else, must split L (into L and a new node L2): redistribute entries evenly, copy up the middle key.
The parent node may overflow, but then: push up the middle key.
Splits "grow" the tree; a root split increases the height.
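The leaf-level "copy up" step above can be sketched in Python, assuming a toy leaf capacity of 4 keys (names and structure are ours):

```python
import bisect

LEAF_CAP = 4  # toy capacity: at most 4 keys per leaf

def insert_into_leaf(leaf, key):
    """Insert key in sorted order. On overflow, split the leaf evenly
    and return (left, right, middle_key); the middle key is COPIED up,
    so it also stays in the right leaf."""
    bisect.insort(leaf, key)
    if len(leaf) <= LEAF_CAP:
        return leaf, None, None
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    return left, right, right[0]

# The upcoming example: inserting 8* into the leaf 2* 3* 5* 7*
left, right, up = insert_into_leaf([2, 3, 5, 7], 8)
print(left, right, up)  # [2, 3] [5, 7, 8] 5
```

Note that 5 is both returned upstairs and kept in the right leaf, unlike the non-leaf "push up" case.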

60 Example B+ Tree - Inserting 8*
(diagram: root 13 17 24; leaves 2* 3* | 5* 7* | 14* 16* | 19* 20* 22* 23* | 24* 27* 29*)

61 Example B+ Tree - Inserting 8*
(diagram: the leaf 2* 3* 5* 7* overflows; it splits into 2* 3* and 5* 7* 8*, and 5 is copied up; the root becomes 5 13 17 24 with branches <5 and >=5)

62 Example B+ Tree - Inserting 21*
(diagram: root 5 13 17 24; 21* is inserted into the full leaf 19* 20* 22* 23*)

63 Example B+ Tree - Inserting 21*
(diagram: the leaf splits into 19* 20* and 21* 22* 23*; 21 must be copied up, overflowing the root 5 13 17 24)

64 Example B+ Tree
(diagram: the old root 5 13 17 21 24 splits; 17 is pushed up into a new root, with children 5 13 and 21 24)
Notice that the root was split, increasing the height. Could use defer-split here. (Pros/Cons?)

65 Example: Data vs. Index Page Split
leaf: 'copy'; non-leaf: 'push'. Why not for non-leaves?
(diagram: a data page split of 2* 3* 5* 7* 8* copies 5 up; an index page split of 5 13 17 21 24 pushes 17 up, leaving 5 13 and 21 24)

66 Now you try…
Insert the following data entries (in order): 28*, 6*, 25*
(diagram: a root containing 30, with its right subtree not shown; internal node 5 13 20; leaves 2* 3* | 5* 7* 8* 11* | 14* 16* | 21* 22* 23*)

67 Now you try… After inserting 28*
(diagram: same tree; the leaf 21* 22* 23* becomes 21* 22* 23* 28*, with no split needed)

68 Answer… After inserting 28*, 6*
(diagram: the leaf 5* 7* 8* 11* overflows on inserting 6* and splits into 5* 6* and 7* 8* 11*; 7 is copied up, so the internal node becomes 5 7 13 20 under the root containing 30)

69 Answer… After inserting 28*, 6*; insert 25*:
Q1: Which pages will be affected?
Q2: How will the root look after that?

70 Answer… After inserting 28*, 6*; insert 25*:
Q1: Which pages will be affected? A1: (red arrows in the diagram)
Q2: How will the root look after that? A2: (13; 30; _ ; _ )

71 Answer… After inserting 25*
25* causes a propagated split!
(diagram: the leaf 21* 22* 23* 28* splits; the split propagates, so the internal node 5 7 13 20 splits too, and the root becomes 13 30 with children 5 7 and 20 23; leaves 2* 3* | 5* 6* | 7* 8* 11* | 14* 16* | 21* 22* 23* | 25* 28*)

72 Deleting a Data Entry from a B+ Tree
Start at the root, find the leaf L where the entry belongs. Remove the entry.
If L is at least half-full, done!
If L underflows:
Try to re-distribute, borrowing from a sibling (an adjacent node with the same parent as L).
If re-distribution fails, merge L and the sibling, then update the parent and possibly merge, recursively.

73 Example: Delete 19* & 20*
Deleting 19* is easy: it is simply removed from its leaf.
Deleting 20* -> re-distribution (notice: 27 is copied up).
(diagram: before, root 17 with internal nodes 5 13 and 24 30; after both deletions the right internal node holds 27 30, and the affected leaves are 22* 24* | 27* 29*)

74 ... And Then Deleting 24*
Must merge leaves … but are we done?
(diagram: the leaves 22* and 27* 29* merge into 22* 27* 29*, leaving the internal node holding only 30 underfull)

75 ... Merge Non-Leaf Nodes, Shrink Tree
(diagram: the non-leaf nodes merge, pulling 17 down from the root; the new root is 5 13 17 30 over leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 27* 29* | 33* 34* 38* 39*)

76 Example of Non-leaf Re-distribution
The tree is shown below during the deletion of 24*. Now, we can (and must) re-distribute keys.
(diagram: root 22; internal nodes 5 13 17 20 and 30; leaves 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*)

77 After Re-distribution
We need only re-distribute '20'; we did '17', too. Why would we want to re-distribute more keys?
(diagram: root 17; internal nodes 5 13 and 20 22 30; leaves unchanged)

78 Main observations for deletion
If a key value appears twice (leaf + non-leaf), the above algorithms delete it from the leaf only. Why not from the non-leaf, too?
(diagram: root 5 13 17 24 over leaves 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* 23* | 24* 27* 29*)

79 Main observations for deletion
'Lazy deletions': in fact, some vendors just mark entries as deleted (tolerating underflow) and reorganize/compact later.

80 Main observations for deletion
Q: Now, what?

81 Main observations for deletion
A: 'Merge'

82 Recap: main ideas
On overflow: split (and 'push', or 'copy'), or consider deferred split.
On underflow: borrow keys, or merge, or let it underflow...

83 Outline Motivation ISAM B-trees (not in book) B+ trees duplicates
B+ trees in practice prefix compression; bulk-loading; ‘order’

84 B+ trees with duplicates
Everything so far: assumed unique key values How to extend B+-trees for duplicates? Alt. 2: <key, rid> Alt. 3: <key, {rid list}> 2 approaches, roughly equivalent

85 B+ trees with duplicates
approach #1: repeat the key values, and extend the B+ tree algorithms appropriately - e.g., many '14's.
(diagram: internal keys 13 14 24 over leaves containing repeated 14* entries)

86 B+ trees with duplicates
approach #1 has a subtle problem with deletion: treat the rid as part of the key, thus making it unique.
(same diagram as above)

87 B+ trees with duplicates
approach #2: store each key value once, but store the {rid list} as a variable-length field (and use overflow pages, if needed).
(diagram: internal keys 13 14 24; the leaf entry 14* points to a {rid list}, continued on an overflow page)

89 Outline Motivation ISAM B-trees (not in book) B+ trees duplicates
B+ trees in practice prefix compression; bulk-loading; ‘order’

90 Prefix Key Compression
Important to increase fan-out. (Why?)
Key values in index entries only 'direct traffic'; we can often compress them.
Example separator keys: Papadopoulos, Pernikovskaya

91 Prefix Key Compression
Important to increase fan-out. (Why?)
Key values in index entries only 'direct traffic'; we can often compress them.
Compressed: Pap, Per <room for more separators/keys>
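One simple separator-shortening scheme can be sketched in Python. This picks the shortest prefix of the right key that still sorts after the left key (slightly more aggressive than the slide's three-letter prefixes; the function name is ours):

```python
def shortest_separator(left_max: str, right_min: str) -> str:
    """Shortest prefix of right_min that still sorts strictly after
    left_max: enough to 'direct traffic' in a non-leaf node."""
    for i in range(1, len(right_min) + 1):
        if right_min[:i] > left_max:
            return right_min[:i]
    return right_min

# 'Pe' already separates Papadopoulos... from Pernikovskaya...
print(shortest_separator("Papadopoulos", "Pernikovskaya"))  # Pe
```

Shorter separators mean more keys per index page, hence a larger fan-out and a shallower tree.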

92 Bulk Loading of a B+ Tree
In an empty tree, insert many keys.
Why not one-at-a-time?

93 Bulk Loading of a B+ Tree
Initialization: sort all data entries; scan the list, and whenever there are enough for a page, pack them.
<repeat for the upper level - even faster than the book's algo>
(diagram: sorted pages of data entries, not yet in the B+ tree: 3* 4* 6* 9* 10* 11* 12* 13* 20* 22* 23* 31* 35* 36* 38* 41* 44*)
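The initialization step can be sketched in Python with a toy page size of 4 entries (sizes and names are ours):

```python
# Bulk-load initialization: sort the entries, then pack consecutive
# runs of PAGE_SIZE keys into leaf pages.
PAGE_SIZE = 4

entries = [20, 10, 3, 36, 12, 4, 22, 6, 31, 9, 11, 13, 23, 35, 38, 41, 44]
entries.sort()
leaves = [entries[i:i + PAGE_SIZE] for i in range(0, len(entries), PAGE_SIZE)]
print(leaves[0])   # [3, 4, 6, 9]

# The first key of each leaf after the first seeds the level above,
# which is built the same way (no per-key top-down inserts needed).
separators = [leaf[0] for leaf in leaves[1:]]
print(separators)  # [10, 20, 35, 44]
```

Because the input is sorted, every leaf is filled in one sequential pass, which is why bulk loading beats one-at-a-time insertion.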

94 Let’s get back to NTFS

95 How are they implemented?
Each entry in the tree uses a data structure called an index entry to store the values in each node. There are many types of index entries, but they all have the same standard header fields. For example, a directory index entry contains a few header values and a $FILE_NAME attribute. The index entries are organized into nodes of the tree and stored in a list. An empty entry is used to signal the end of the list.

96 Let’s get back to NTFS

97 Index The index nodes can be stored in two types of MFT entry attributes. The $INDEX_ROOT attribute is always resident and can store only one node that contains a small number of index entries. The $INDEX_ROOT attribute is always the root of the index tree.

98 Index Larger indexes allocate a non-resident $INDEX_ALLOCATION attribute which can contain as many nodes as needed. The content of this attribute is a large buffer that contains one or more index records. An index record has a static size, typically 4,096 bytes, and it contains a list of index entries. Each index record is given an address, starting with 0.


100 Tools If you are interested in viewing the different attributes you have on a Windows system, you can use the nfi.exe tool from Microsoft [Microsoft 2000]. It displays the MFT contents of a live system, including the attribute names and the cluster addresses. This is not useful for a forensic investigation because the system must be live, but it is useful for learning about NTFS.

101 Tools Besides Sleuth Kit
See additional tools in Chapter 1 of the book.

