NTFS Documentation

Thanks

    repeating groups?
    link padding8, padding and other table features to help/tables
    consistent use of padding/alignment fields

    MFT Zone Reservation IS NOT STORED ON DISK
    MFT Zone (reserved space for MFT)
      1 = 12.5%
      2 = 25.0%
      3 = 37.5%
      4 = 50.0%
      Where is this stored on disk?
      volume?  mft?  boot?
      This is the 'system files' space at
      the beginning of the disk.
      NtfsMftZoneReservation
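
    The percentages above step in eighths of the volume, so the reserved size
    can be sketched as below. The function name and the choice to return 0 for
    an out-of-range setting are assumptions made for illustration only; nothing
    here says how Windows itself computes the zone.

```c
#include <stdint.h>

/* Hypothetical helper: clusters reserved for the MFT zone for a given
 * NtfsMftZoneReservation setting (1..4). Each step is one eighth (12.5%)
 * of the volume. Returns 0 for settings outside 1..4. */
uint64_t mft_zone_clusters(uint64_t volume_clusters, unsigned setting)
{
    if (setting < 1 || setting > 4)
        return 0;
    /* Divide first so very large volumes cannot overflow. */
    return volume_clusters / 8 * setting;
}
```

    Dividing before multiplying rounds down slightly on volumes whose cluster
    count is not a multiple of 8, which seems acceptable for an estimate.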

    link in to mft and bitmap

(a) it always points to where the name would be (0x1A)
0x04 record allocation (8 byte alignment)
(c) always seems to be zero, check
(c) no it's only shown the first time for a given attribute type
not sure about sorting by sequence number.  VCN definitely

    8 VCN lowest_vcn;
    Lowest virtual cluster number of this portion of the attribute value. This is usually 0. It
    is non-zero for the case where one attribute does not fit into one mft record and thus
    several mft records are allocated to hold this attribute. In the latter case, each mft
    record holds one extent of the attribute and there is one attribute list entry for each
    extent. NOTE: This is DEFINITELY a signed value! The windows driver uses cmp, followed
    by jg when comparing this, thus it treats it as signed.

    24 __u16 instance;
    If lowest_vcn = 0, the instance of the attribute being referenced; otherwise 0.

    The attribute list is used when a file needs extension FILE records in the
    MFT to be fully described; it makes it possible to find any attribute of
    the file. The attribute list itself may be non-resident because its stream
    is likely to grow.

    The extents of one non-resident attribute (if present) immediately follow
    the initial extent. They are ordered by lowest_vcn and have their instance
    set to zero.

Standard Attribute Header?

    The SID structure is a variable-length structure used to uniquely identify
    users or groups. SID stands for security identifier.

    The standard textual representation of the SID is of the form:
        S-R-I-S-S...
    Where:
       - The first "S" is the literal character 'S' identifying the following
         digits as a SID.
       - R is the revision level of the SID expressed as a sequence of digits
         either in decimal or hexadecimal (if the latter, prefixed by "0x").
       - I is the 48-bit identifier_authority, expressed as digits as R above.
       - S... is one or more sub_authority values, expressed as digits as above.

    Example SID; the domain-relative SID of the local Administrators group on
    Windows NT/2k:
        S-1-5-32-544

    This translates to a SID with:
        revision = 1,
        sub_authority_count = 2,
        identifier_authority = {0,0,0,0,0,5},   SECURITY_NT_AUTHORITY
        sub_authority[0] = 32,                  SECURITY_BUILTIN_DOMAIN_RID
        sub_authority[1] = 544                  DOMAIN_ALIAS_RID_ADMINS
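
    The translation above can be run in the other direction. The following is
    a minimal sketch that prints the textual form in decimal. The struct
    sketch_sid, its fixed array of 15 sub-authorities and the function name are
    assumptions for illustration, not the on-disk SID definition. The
    identifier_authority bytes are treated as most significant first, which is
    what the {0,0,0,0,0,5} example implies.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative in-memory SID; field names follow the text above. */
typedef struct {
    uint8_t  revision;
    uint8_t  sub_authority_count;
    uint8_t  identifier_authority[6];  /* 48-bit value, high byte first */
    uint32_t sub_authority[15];        /* capacity chosen for the sketch */
} sketch_sid;

/* Write the S-R-I-S-S... form (decimal) into buf.
 * Returns the string length, or -1 if buf is too small. */
int sid_to_string(const sketch_sid *sid, char *buf, size_t size)
{
    uint64_t auth = 0;
    int n, used;
    unsigned i;

    /* Fold the six authority bytes into one 48-bit number. */
    for (i = 0; i < 6; i++)
        auth = (auth << 8) | sid->identifier_authority[i];

    used = snprintf(buf, size, "S-%u-%llu", (unsigned)sid->revision,
                    (unsigned long long)auth);
    if (used < 0 || (size_t)used >= size)
        return -1;

    /* Append each sub-authority as "-<decimal>". */
    for (i = 0; i < sid->sub_authority_count && i < 15; i++) {
        n = snprintf(buf + used, size - (size_t)used, "-%lu",
                     (unsigned long)sid->sub_authority[i]);
        if (n < 0 || (size_t)n >= size - (size_t)used)
            return -1;
        used += n;
    }
    return used;
}
```

    Filling the struct with revision 1, authority {0,0,0,0,0,5} and
    sub-authorities 32 and 544 reproduces the example string S-1-5-32-544.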

    ACE Types
    ACCESS_MIN_MS_ACE_TYPE           = 0
    ACCESS_ALLOWED_ACE_TYPE          = 0
    ACCESS_DENIED_ACE_TYPE           = 1
    SYSTEM_AUDIT_ACE_TYPE            = 2
    SYSTEM_ALARM_ACE_TYPE            = 3 Not implemented as of Win2k.
    ACCESS_MAX_MS_V2_ACE_TYPE        = 3

    ACCESS_ALLOWED_COMPOUND_ACE_TYPE = 4
    ACCESS_MAX_MS_V3_ACE_TYPE        = 4

    The following are Win2k only.
    ACCESS_MIN_MS_OBJECT_ACE_TYPE    = 5
    ACCESS_ALLOWED_OBJECT_ACE_TYPE   = 5
    ACCESS_DENIED_OBJECT_ACE_TYPE    = 6
    SYSTEM_AUDIT_OBJECT_ACE_TYPE     = 7
    SYSTEM_ALARM_OBJECT_ACE_TYPE     = 8
    ACCESS_MAX_MS_OBJECT_ACE_TYPE    = 8

    ACCESS_MAX_MS_V4_ACE_TYPE        = 8

    This one is for WinNT&2k.
    ACCESS_MAX_MS_ACE_TYPE           = 8

    The ACE flags (8-bit) for audit and inheritance

    SUCCESSFUL_ACCESS_ACE_FLAG is only used with system audit and alarm ACE
    types to indicate that a message is generated (in Windows!) for successful
    accesses.

    FAILED_ACCESS_ACE_FLAG is only used with system audit and alarm ACE types
    to indicate that a message is generated (in Windows!) for failed accesses.

    The inheritance flags.
    OBJECT_INHERIT_ACE           = 0x01
    CONTAINER_INHERIT_ACE        = 0x02
    NO_PROPAGATE_INHERIT_ACE     = 0x04
    INHERIT_ONLY_ACE             = 0x08
    INHERITED_ACE                = 0x10  Win2k only
    VALID_INHERIT_FLAGS          = 0x1f

    The audit flags.
    SUCCESSFUL_ACCESS_ACE_FLAG   = 0x40
    FAILED_ACCESS_ACE_FLAG       = 0x80

    The access mask defines the access rights.

    The standard rights.
    DELETE                   = 0x00010000
    READ_CONTROL             = 0x00020000
    WRITE_DAC                = 0x00040000
    WRITE_OWNER              = 0x00080000
    SYNCHRONIZE              = 0x00100000

    STANDARD_RIGHTS_REQUIRED = 0x000f0000

    STANDARD_RIGHTS_READ     = 0x00020000
    STANDARD_RIGHTS_WRITE    = 0x00020000
    STANDARD_RIGHTS_EXECUTE  = 0x00020000

    STANDARD_RIGHTS_ALL      = 0x001f0000

    The access system ACL and maximum allowed access types.
    ACCESS_SYSTEM_SECURITY   = 0x01000000
    MAXIMUM_ALLOWED          = 0x02000000

    The generic rights.
    GENERIC_ALL              = 0x10000000
    GENERIC_EXECUTE          = 0x20000000
    GENERIC_WRITE            = 0x40000000
    GENERIC_READ             = 0x80000000

    The object ACE flags (32-bit).
    ACE_OBJECT_TYPE_PRESENT            = 1
    ACE_INHERITED_OBJECT_TYPE_PRESENT  = 2

    ACL_CONSTANTS
    Current revision.
    ACL_REVISION         = 2
    ACL_REVISION_DS      = 4

    History of revisions.
    ACL_REVISION1        = 1
    MIN_ACL_REVISION     = 2
    ACL_REVISION2        = 2
    ACL_REVISION3        = 3
    ACL_REVISION4        = 4
    MAX_ACL_REVISION     = 4

   Absolute security descriptor. Does not contain the owner and group SIDs, nor
   the sacl and dacl ACLs inside the security descriptor. Instead, it contains
   pointers to these structures in memory. Obviously, absolute security
   descriptors are only useful for in memory representations of security
   descriptors. On disk, a self-relative security descriptor is used.

   Attribute: Security descriptor (0x50). A standard self-relative security
   descriptor.

   NOTE: Always resident.
   NOTE: Not used in NTFS 3.0+, as security descriptors are stored centrally
   in FILE_$Secure and the correct descriptor is found using the security_id
   from the standard information attribute.

   On NTFS 3.0+, all security descriptors are stored in FILE_$Secure. Only one
   referenced instance of each unique security descriptor is stored.

   FILE_$Secure contains no unnamed data attribute, i.e. it has zero length. It
   does, however, contain two indexes ($SDH and $SII) as well as a named data
   stream ($SDS).

   Every unique security descriptor is assigned a unique security identifier
   (security_id, not to be confused with a SID). The security_id is unique for
   the NTFS volume and is used as an index into the $SII index, which maps
   security_ids to the security descriptor's storage location within the $SDS
   data attribute. The $SII index is sorted by ascending security_id.

   A simple hash is computed from each security descriptor. This hash is used
   as an index into the $SDH index, which maps security descriptor hashes to
   the security descriptor's storage location within the $SDS data attribute.
   The $SDH index is sorted by security descriptor hash and is stored in a B+
   tree. When searching $SDH (with the intent of determining whether or not a
   new security descriptor is already present in the $SDS data stream), if a
   matching hash is found, but the security descriptors do not match, the
   search in the $SDH index is continued, searching for a next matching hash.

   When a precise match is found, the security_id corresponding to the security
   descriptor in the $SDS attribute is read from the found $SDH index entry and
   is stored in the $STANDARD_INFORMATION attribute of the file/directory to
   which the security descriptor is being applied. The $STANDARD_INFORMATION
   attribute is present in all base mft records (i.e. in all files and
   directories).

   If a match is not found, the security descriptor is assigned a new unique
   security_id and is added to the $SDS data attribute. Then, entries
   referencing this security descriptor in the $SDS data attribute are
   added to the $SDH and $SII indexes.
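
   The lookup-or-add procedure described above can be sketched with an
   in-memory table. This is only an illustration: a linear scan stands in for
   the $SDH B+ tree search, the hash is supplied by the caller because the hash
   function is not described here, and every name (sketch_secure,
   secure_lookup_or_add, MAX_DESCRIPTORS) is hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define MAX_DESCRIPTORS 16   /* arbitrary capacity for the sketch */

/* Stand-in for one stored descriptor together with its two index keys. */
typedef struct {
    uint32_t hash;          /* $SDH key */
    uint32_t security_id;   /* $SII key */
    const void *sd;         /* descriptor bytes (not copied) */
    size_t sd_len;
} sketch_entry;

typedef struct {
    sketch_entry entries[MAX_DESCRIPTORS];
    size_t count;
    uint32_t next_id;       /* next unused security_id, set by the caller */
} sketch_secure;

/* Return the security_id for sd. An existing entry is reused only when both
 * the hash and the descriptor bytes match; a hash match with different bytes
 * continues the search. Returns 0 if the table is full. */
uint32_t secure_lookup_or_add(sketch_secure *s, uint32_t hash,
                              const void *sd, size_t sd_len)
{
    size_t i;

    for (i = 0; i < s->count; i++) {
        const sketch_entry *e = &s->entries[i];
        if (e->hash != hash)
            continue;
        if (e->sd_len == sd_len && memcmp(e->sd, sd, sd_len) == 0)
            return e->security_id;   /* precise match */
    }
    if (s->count == MAX_DESCRIPTORS)
        return 0;

    /* No match: store the descriptor under a new unique security_id. */
    s->entries[s->count].hash = hash;
    s->entries[s->count].security_id = s->next_id;
    s->entries[s->count].sd = sd;
    s->entries[s->count].sd_len = sd_len;
    s->count++;
    return s->next_id++;
}
```

   Two different descriptors that collide on the same hash receive different
   security_ids, while applying the same descriptor twice returns the id that
   was assigned the first time.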

   Note: Entries are never deleted from FILE_$Secure, even if nothing
   references an entry any more.

   The $SDS data stream contains the security descriptors, aligned on 16-byte
   boundaries, sorted by security_id in a B+ tree. Security descriptors cannot
   cross 256 KiB boundaries (this restriction is imposed by the Windows cache
   manager). Each security descriptor is contained in a SDS_ENTRY structure.
   Also, each security descriptor is stored twice in the $SDS stream with a
   fixed offset of 0x40000 bytes (256 KiB, the Windows cache manager's max size)
   between them; i.e. if a SDS_ENTRY specifies an offset of 0x51d0, then the
   first copy of the security descriptor will be at offset 0x51d0 in the
   $SDS data stream and the second copy will be at offset 0x451d0.
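
   The offset arithmetic for the two copies, and the boundary restriction, can
   be expressed as a small sketch. The function and macro names are
   assumptions for illustration.

```c
#include <stdint.h>

#define SDS_MIRROR_DISTANCE 0x40000ULL  /* 0x40000 bytes, as stated above */

/* Offset of the second copy, given the offset stored in the SDS_ENTRY. */
uint64_t sds_second_copy_offset(uint64_t first_offset)
{
    return first_offset + SDS_MIRROR_DISTANCE;
}

/* Nonzero if an entry of `length` bytes starting at `offset` would cross a
 * 0x40000-byte boundary. `length` must be greater than zero. */
int sds_entry_crosses_boundary(uint64_t offset, uint64_t length)
{
    return (offset / SDS_MIRROR_DISTANCE) !=
           ((offset + length - 1) / SDS_MIRROR_DISTANCE);
}
```

   With the example from the text, sds_second_copy_offset(0x51d0) gives
   0x451d0.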

   $SII index. The collation type is COLLATION_NTOFS_ULONG.
   $SDH index. The collation rule is COLLATION_NTOFS_SECURITY_HASH.

    must have (at least empty) unnamed data attr

    Always resident.

    link up below

    silly to have a flag of 0x00, remove

    This is the header for indexes, describing the INDEX_ENTRY records, which
    follow the INDEX_HEADER. Together the index header and the index entries
    make up a complete index.

    This is followed by a sequence of index entries (INDEX_ENTRY structures)
    as described by the index header.

    When a directory is small enough to fit inside the index root then this
    is the only attribute describing the directory. When the directory is too
    large to fit in the index root, on the other hand, two additional attributes
    are present: an index allocation attribute, containing sub-nodes of the B+
    directory tree (see below), and a bitmap attribute, describing which virtual
    cluster numbers (vcns) in the index allocation attribute are in use by an
    index block.

    NOTE: The root directory (FILE_$root) contains an entry for itself.

    struct {
            ATTR_TYPES type;
            Type of the indexed attribute. Is $FILENAME for directories, zero
            for view indexes. No other values allowed.

            COLLATION_RULES collation_rule;
            Collation rule used to sort the index entries. If type is
            $FILENAME, this must be COLLATION_FILENAME.

            __u32 bytes_per_index_block;
            Byte size of each index block (in the index allocation attribute).

            __u8 clusters_per_index_block;
            Size of each index block (in the index allocation attribute) in
            clusters, when an index block is at least as large as a cluster;
            otherwise this is the log of the size (like the encoding of the
            mft record size and the index record size found in the boot
            sector). Has to be a power of 2.
    }  INDEX_ROOT;

    which elements are shared between indexes?
    not relevant for index root

    this attribute is never resident - would use index root instead

    split into two tables, at least

    Always non-resident (doesn't make sense to be resident anyway!).

    This is an array of index blocks. Each index block starts with an
    INDEX_BLOCK structure containing an index header, followed by a sequence of
    index entries (INDEX_ENTRY structures), as described by the INDEX_HEADER.

    When creating the index block, we place the update sequence array at this
    offset, i.e. before we start with the index entries. This also makes sense,
    otherwise we could run into problems with the update sequence array
    containing in itself the last two bytes of a sector which would mean that
    multi sector transfer protection wouldn't work. As you can't protect data
    by overwriting it since you then can't get it back...
    When reading use the data from the ntfs record header.

(a) The structure of the Reparse Data depends on the Reparse Type. There are
    three defined Reparse Data (SymLinks, VolLinks and RSS) + the Generic Reparse. 

These are just the predefined reparse flags

    The reparse point tag defines the type of the reparse point. It also
    includes several flags, which further describe the reparse point.

    The reparse point tag is an unsigned 32-bit value divided in three parts:

    1. The least significant 16 bits (i.e. bits 0 to 15) specify the type of
       the reparse point.
    2. The 13 bits after this (i.e. bits 16 to 28) are reserved for future use.
    3. The most significant three bits are flags describing the reparse point.
       They are defined as follows:
         bit 29: Name surrogate bit. If set, the filename is an alias for
                 another object in the system.
         bit 30: High-latency bit. If set, accessing the first byte of data will
                 be slow. (E.g. the data is stored on a tape drive.)
         bit 31: Microsoft bit. If set, the tag is owned by Microsoft. User
                 defined tags have to use zero here.
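
    The three-part layout above maps directly onto shifts and masks. The
    following is a minimal sketch; the struct and function names are
    hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Decoded view of a reparse point tag, following the bit layout above. */
typedef struct {
    uint16_t type;        /* bits 0 to 15 */
    uint16_t reserved;    /* bits 16 to 28 */
    int name_surrogate;   /* bit 29 */
    int high_latency;     /* bit 30 */
    int microsoft;        /* bit 31 */
} sketch_reparse_tag;

sketch_reparse_tag decode_reparse_tag(uint32_t tag)
{
    sketch_reparse_tag t;

    t.type = (uint16_t)(tag & 0xFFFFu);
    t.reserved = (uint16_t)((tag >> 16) & 0x1FFFu);  /* 13 bits */
    t.name_surrogate = (int)((tag >> 29) & 1u);
    t.high_latency = (int)((tag >> 30) & 1u);
    t.microsoft = (int)((tag >> 31) & 1u);
    return t;
}
```

    For example, a tag value of 0xA0000003 decodes to type 3 with the
    Microsoft and name surrogate bits set and the high-latency bit clear.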

    The system file FILE_$Extend/$Reparse contains an index named $R listing
    all reparse points on the volume. The index entry keys are as defined
    below. Note, that there is no index data associated with the index entries.

    The index entries are sorted by the index key file_id. The collation rule is
    COLLATION_NTOFS_ULONGS. FIXME: Verify whether the reparse_tag is not the
    primary key / is not a key at all. (AIA)

As an attribute it's no different to a named data attribute
Contents depend on the name of the $DATA stream

    Operations on this attribute are logged to the journal ($LogFile) like
    normal metadata changes.

    Used by the Encrypting File System (EFS). All encrypted files have this
    attribute with the name $EFS.

    Can be anything the creator chooses.
    EFS uses it as follows:
    FIXME: Type this info, verifying it along the way. (AIA)

  offset(length)   contents
  0(4)             Magic number 'RCRD'
  1E(12)           Fixup

  struct {
    NTFS_RECORD;              The magic is "RSTR".
    __u64 chkdsk_lsn;         The check disk log file sequence 
                              number for this restart page. 
                              Only used when the magic is changed 
                              to "CHKD". = 0
    __u32 system_page_size;   Byte size of system pages, has to be
                              >= 512 and a power of 2. Use this
                              to calculate the required size of the
                              usa and add this to the
                              ntfs.usa_offset value. Then verify
                              that the result is less than the
                              value of the restart_offset. = 0x1000
    __u32 log_page_size;      Byte size of log file records, 
                              has to be >= 512 and a power of 2.
                              = 0x1000
    __u16 restart_offset;     Byte offset from the start of the
                              record to the restart record. 
                              Value has to be aligned to 8-byte
                              boundary. = 0x30
    __s16 minor_ver;          Log file minor version. Only check if
                              major version is 1. (=1 but >=1 is
                              treated the same and <=0 is also
                              ok)
    __u16 major_ver;          Log file major version (=1 but =0 is
                              ok)
  } RESTART_PAGE_HEADER;
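
  Some of the constraints in the field comments above can be checked
  mechanically. The sketch below covers only the page-size and alignment
  rules; it leaves out the update sequence array and version checks. The
  function names are assumptions for illustration.

```c
#include <stdint.h>

/* Nonzero if v is at least 512 and a power of 2. */
int is_valid_page_size(uint32_t v)
{
    return v >= 512 && (v & (v - 1)) == 0;
}

/* Returns 1 if the page sizes and the restart_offset alignment satisfy the
 * rules in the field comments, 0 otherwise. */
int restart_page_header_ok(uint32_t system_page_size, uint32_t log_page_size,
                           uint16_t restart_offset)
{
    if (!is_valid_page_size(system_page_size))
        return 0;
    if (!is_valid_page_size(log_page_size))
        return 0;
    if (restart_offset % 8 != 0)   /* must sit on an 8-byte boundary */
        return 0;
    return 1;
}
```

  The sample values annotated in the struct (0x1000, 0x1000 and 0x30) pass
  all three checks.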

  struct {
    __u64 current_lsn;        Log file record. = 0x700000, 0x700808
    __u16 log_clients;        Number of log client records 
                              following the restart_area. = 1
    __u16 client_free_list;   How many clients are free(?). If !=
                              0xffff, check that log_clients >
                              client_free_list. = 0xffff
    __u16 client_in_use_list; How many clients are in use(?).
                              If != 0xffff check that log_clients
                              > client_in_use_list. = 0
    __u16 flags;              ??? = 0
    __u32 seq_number_bits;    ??? = 0x2c or 0x2d
    __u16 restart_area_length;Length of the restart area.
                              Following checks required if version
                              matches. Otherwise, skip them.
                              restart_offset + restart_area_length
                              has to be <= system_page_size.
                              Also, restart_area_length has to be
                              >= client_array_offset +
                              (log_clients * 0xa0). = 0xd0
    __u16 client_array_offset;Offset from the start of this record
                              to the first client record if versions
                              are matched. The offset is otherwise
                              assumed to be (sizeof(RESTART_AREA) +
                              7) & ~7, i.e. rounded up to the next
                              8-byte boundary. Either way, the
                              offset to the client array has to be
                              aligned to an 8-byte boundary. Also,
                              restart_offset + offset to the client
                              array have to be <= 510. Also,
                              the offset to the client array +
                              (log_clients * 0xa0) have to be
                              <= system_page_size. = 0x30
    __u64 file_size;          Byte size of the log file. If the
                              restart_offset + the offset of the
                              file_size are > 510 then corruption
                              has occurred. This is the very first
                              check when starting with the
                              restart_area as if it fails it means
                              that some of the above values will be
                              corrupted by the multi sector transfer
                              protection! If the structure is
                              deprotected then these checks are
                              futile of course.
                              Calculate the file_size bits and check
                              that seq_number_bits == 0x43 -
                              file_size bits. = 0x400000
    __u32 last_lsn_data_length;??? = 0, 0x40
    __u16 record_length;       Byte size of this record. If the
                               version matches then check that the
                               value of record_length is a multiple
                               of 8, i.e. ((record_length + 7) &
                               ~7) == record_length. = 0x30
    __u16 log_page_data_offset;??? = 0x40
  }  RESTART_AREA;
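
  The bounds and alignment rules scattered through the comments above can be
  collected into one function. This is a sketch under stated assumptions: the
  parameter names mirror the struct fields, the 0xa0 client record size is
  taken from the comments, and the version-dependent branches and the
  file_size and seq_number_bits checks are left out.

```c
#include <stdint.h>

#define SKETCH_CLIENT_RECORD_SIZE 0xa0u  /* per-client size used above */

/* Round n up to a multiple of 8, i.e. (n + 7) & ~7. */
uint32_t round_up_8(uint32_t n)
{
    return (n + 7u) & ~7u;
}

/* Returns 1 if every check passes, 0 otherwise. */
int restart_area_ok(uint32_t system_page_size, uint16_t restart_offset,
                    uint16_t restart_area_length, uint16_t client_array_offset,
                    uint16_t log_clients, uint16_t record_length)
{
    /* End of the client array, relative to the start of the restart area. */
    uint32_t clients_end = client_array_offset +
                           (uint32_t)log_clients * SKETCH_CLIENT_RECORD_SIZE;

    if (client_array_offset % 8 != 0)
        return 0;
    if ((uint32_t)restart_offset + client_array_offset > 510)
        return 0;
    if (clients_end > system_page_size)
        return 0;
    if ((uint32_t)restart_offset + restart_area_length > system_page_size)
        return 0;
    if (restart_area_length < clients_end)
        return 0;
    if (round_up_8(record_length) != record_length)
        return 0;
    return 1;
}
```

  The sample values annotated in the structs (system_page_size 0x1000,
  restart_offset 0x30, restart_area_length 0xd0, client_array_offset 0x30,
  log_clients 1, record_length 0x30) satisfy every check.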

  struct {
    __u64 oldest_lsn;          Oldest log file sequence number for
                               this client record. = 0xbd16951d
    __u64 client_restart_lsn;  ??? = 0x700000, 0x700827, 0x700d07
    __u16 prev_client;         ??? = 0x808, 0xd07, 0xd5d
    __u16 next_client;         ??? = 0x70
    __u16 seq_number;          ??? = 0, 4 size uncertain, Regis
                               calls this "volume clear flag" and
                               gives a size of one byte.
    __u16 client_name;         ??? = empty string??? size uncertain
  }  RESTART_CLIENT;

   struct {
      NTFS_RECORD;                        The magic is "RCRD".
      union {
         __u64 last_lsn;
         __u32 file_offset;
      }  copy;
      __u32 flags;
      __u16 page_count;
      __u16 page_position;
      union {
         struct {
            __u64 next_record_offset;
            __u64 last_end_lsn;
         }  packed;
      }  header;
   }  RECORD_PAGE_HEADER;

   enum {
      LOG_RECORD_MULTI_PAGE = 1,        ???
      LOG_RECORD_SIZE_PLACE_HOLDER = 0xffff,
            This has nothing to do with the log record. 
            It is only so gcc knows to make the flags 16-bit.
   }  LOG_RECORD_FLAGS;

   struct {
      __u64 this_lsn;
      __u64 client_previous_lsn;
      __u64 client_undo_next_lsn;
      __u32 client_data_length;
      struct {
         __u16 seq_number;
         __u16 client_index;
      }  client_id;
      __u32 record_type;
      __u32 transaction_id;
      LOG_RECORD_FLAGS flags;
      __u16 reserved_or_alignment[3];
  *** Now are at ofs 0x30 into struct. ***
      __u16 redo_operation;
      __u16 undo_operation;
      __u16 redo_offset;
      __u16 redo_length;
      __u16 undo_offset;
      __u16 undo_length;
      __u16 target_attribute;
      __u16 lcns_to_follow;             Number of lcn_list entries
                                        following this entry.
      __u16 record_offset;
      __u16 attribute_offset;
      __u32 alignment_or_reserved;
      __u32 target_vcn;
      __u32 alignment_or_reserved1;
      struct {              Only present if lcns_to_follow is not 0.
         __u32 lcn;
         __u32 alignment_or_reserved;
      }  lcn_list[0];
   }  LOG_RECORD;

Attribute end marker 0xFFFFFFFF

  sorted by security id
  Self-relative? == has 2 * SID
  generally a large file, not all used
  there may be missing entries - test
  large block of ids at start, then junk, then another block at 256KB

  Last padding is always 4 bytes and always appears to be the Unicode string "II".

  The Security Id Index ($SII)

  This file is sorted by the hash.
  The security descriptors are stored in the $SDS data stream.
  surprisingly, the offset (64-bit) isn't 8-byte aligned

Flags?

header & repeating group

  sid may be missing (quota flags = default limit => no SID, just padding)
  padding may not be necessary
  index key - xref to which index?
  change time - date/time
  exceeded time - 10/4/01 (not +5 days)
  in the last (null) entry, the padding at 0x0C = 0x02

  The $Q index contains one entry for each existing user_id on the
  volume. The index key is the user_id of the user/group owning this
  quota control entry, i.e. the key is the owner_id. The user_id of
  the owner of a file, i.e. the owner_id, is found in the standard
  information attribute. The collation rule for $Q is
  COLLATION_NTOFS_ULONG.

  The $O index contains one entry for each user/group who has been
  assigned a quota on that volume. The index key holds the SID of
  the user_id the entry belongs to, i.e. the owner_id. The collation
  rule for $O is COLLATION_NTOFS_SID.

  The $O index entry data is the user_id of the user corresponding
  to the SID.
  This user_id is used as an index into $Q to find the quota control
  entry associated with the SID.

0xA0000003 flags - see $REPARSE_POINT
No data!

repeating group

    name isn't null terminated

    FIXME
    0x40 __s64 compressed_size;
    Byte size of the attribute value after compression.
    Only present when compressed. Always is a multiple of the cluster
    size. Represents the actual amount of disk space being used on the disk.

  fixed order
  height balanced
  during add/remove of keys
  minimal disturbance
  pointers downwards only

      index root
      index allocation
      dummy keys
      data in non-leaf keys
      on-disk pointer only point down

  What we have so far

      ...

  Overview

      ...

  Add Rules

      Find the first key that is larger than the new key
      (this will necessarily be a leaf)
      Insert the new key before this key (in the same node)
      While the node is full
          Split the current node in two
          Promote the median key to the parent
          Now consider the parent
      End

  Delete Rules

      Delete the key
      If the key had children
          Find the successor and move it to this node
          Now consider the successor's old node
      End
      While the node isn't full enough
          If a sibling has enough keys
            steal one
          Else
            Combine with one of the siblings
          End
      End

  flatcap :     hi _Oracle_
  _Oracle_:     hi there
  flatcap :     anything I can do for you?
  _Oracle_:     I was wondering about the B+ trees of ntfs
  _Oracle_:     they seem to be a bit awkward, or at least - not what I expected :)
  flatcap :     they _do_ seem strange, but they are perfect for filesystems
  _Oracle_:     no, i meant their on-disk representation
  _Oracle_:     they have a dummy node of sorts?
  flatcap :     the trees in ntfs aren't proper b+trees
  flatcap :     a dummy key
  _Oracle_:     that's exactly what I was hoping to hear!
  flatcap :     (thinking is still a bit hard this morning, bear with me :-)
  _Oracle_:     no problem ;-)
  flatcap :     the trees consist of a node, which contains keys
  flatcap :     the keys in a real (ideal world) b+tree are just separators, and the data is only stored in the leaves
  _Oracle_:     right
  _Oracle_:     btw - how big is a node under ntfs? i mean, how many keys fit in there?
  flatcap :     the INDX record is 4k, an you can get 10's of filenames in it
  flatcap :     but..., that depends on the lengths of the filenames
  _Oracle_:     i thought the number of keys in a node was a fixed property of a b+ tree?
  flatcap :     hehe, usually, yes
  flatcap :     the keys of ntfs actually contain data and also a pointer to their children
  _Oracle_:     so i noticed
  AntonA  :     one should add that INDX records of 2k size have also been seen in the wild - by me (-:
  _Oracle_:     really? 
  _Oracle_:     what OS?
  AntonA  :     NT4
  flatcap :     because there's one more child than key, there has to be a dummy key (no data, but has children)
  _Oracle_:     interesting...
  AntonA  :     some of my directories (e.g. c:\winnt and c:\program files) have 2k INDX size while other dirs have 4k.
  _Oracle_:     so the dummy key is always the "largest"?
  flatcap :     yes
  _Oracle_:     i see...
  _Oracle_:     so if the non-leaf nodes have data of themselves, wouldn't that make the tree a b-tree?
  flatcap :     I've just written a test program to help me understand the trees, which is on bitkeeper
  _Oracle_:     I'd love to see that
  flatcap :     I read a lots of webpages and I think that the nearest term is a b*tree
  _Oracle_:     and how is it different from a b-tree?
  flatcap :     a b-tree maintains a minimum of 1/2 full nodes (except for the root node)
  flatcap :     a b*tree changes the rules slightly and maintains 2/3 full
  _Oracle_:     so it just changes the rules of combining two nodes to one and such?
  flatcap :     exactly
  _Oracle_:     hmmm...
  _Oracle_:     let me think about that for a moment :)
  flatcap :     in a true b+tree, the data keys (leaves) should also have pointers to the next (for quick sequential accesses), but that's probably just maintained in memory
  flatcap :     I'm going to write up everything I know about ntfs trees soon
  _Oracle_:     let me see if i got that...
  _Oracle_:     the index root points to the root INDX record
  flatcap :     you can see my test prog at:  http://linux-ntfs.bkbits.net:8080/tng-support/src/tree
  _Oracle_:     each INDX record contains keys that have pointers to the files themselves and to the keys with lower values
  flatcap :     yes
  _Oracle_:     I see
  flatcap :     the index root lives in the MFT record
  _Oracle_:     Yeah, this I managed to discover :)
  flatcap :     all the rest (index allocations) are non-res
  _Oracle_:     and the number of keys in a single INDX record is completely flexible?
  AntonA  :     yes
  flatcap :     yes, but there's a minimum
  AntonA  :     a minimum?
  flatcap :     yes, that's part of the tree algorithm
  AntonA  :     surely the minimum is a non-data containing terminator entry?
  _Oracle_:     what's the minimum?
  flatcap :     the minimum for a b+tree is 1/2 full, b* 2/3 full
  flatcap :     only the root node may contain fewer
  _Oracle_:     oh.
  _Oracle_:     yeah
  AntonA  :     and the last node...
  flatcap :     the keys are moved about to keep this true
  flatcap :     even the last node will have the "right number" in it
  AntonA  :     that would mean that in a really large directory deleting one file could take hours?
  flatcap :     no, you might think that, but the balancing doesn't affect many other nodes
  flatcap :     if the tree is 4 deep (NTFS equiv say 10^5 files), you'd only be altering 4 index records
  flatcap :     I'll draw lots of pictures when I have a moment (probably tomorrow)
  _Oracle_:     that should be interesting to read!
  flatcap :     are you on our dev mailing list, _Oracle_
  _Oracle_:     What mailing list? (er... no.)
  AntonA  :     the major question that springs to my mind is what would windows ntfs do if it saw an imbalanced tree (because we messed up or because we simply chose to ignore balancing)
  flatcap :     hehe, I hate to think :-)
  _Oracle_:     I wouldn't want to be there, that's for sure
  flatcap :     chkdsk would probably try and rebalance it and you might find that ntfs.sys would just balance it out as files were created/deleted
  _Oracle_:     how do i join the list?
  flatcap :     http://lists.sourceforge.net/lists/listinfo/linux-ntfs-dev
  AntonA  :     um, it would be a lot easier to get directory operations working while ignoring the existence of the tree (obviously operating on them correctly so we don't kill the fs)
  flatcap :     I'll mail the list and answer questions there
  AntonA  :     if windows is able to pickup the pieces without complaint / failure, it would be worth considering as a first pass of implementation at least.
  flatcap :     yes possibly, but I think I know enough now to build something close enough
  flatcap :     (I just wanted a big project where I could start without tripping over you :-)
  AntonA  :     cool
  _Oracle_:     I've got a few more questions if you have the time
  AntonA  :     As I said before. I am not going anywhere near directories. (-:
  flatcap :     sure
  _Oracle_:     Smaller ones, though

    0x13 ULONGS refers to GUIDs TEST

    #include <ntfs.h>\n
    #include <stdio.h>\n

    for(i=clear_pos-1,lmask=0xFFF,dshift=12;i>=0x10;i>>=1){
            lmask >>= 1; /* bit mask for length */
            dshift--;    /* shift width for delta */
    }

                                >\n(18,10)stdio

    00000100 > \n 0A 90 s t d i o

    00000010 ' ' FC 0F
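
    The loop above decides how the 16 bits of a token are divided between the
    delta and the length. The sketch below wraps it in a function and splits a
    token into its two raw fields. Assumptions: the two token bytes are
    combined little-endian (0A 90 becomes 0x900A), clear_pos is the number of
    bytes already decompressed in the current block, and all names are
    hypothetical. Whether a fixed bias is then added to the raw fields is not
    stated in these notes, so the sketch stops at the raw values.

```c
#include <stdint.h>

typedef struct {
    uint16_t length_mask;   /* lmask in the loop above */
    unsigned delta_shift;   /* dshift in the loop above */
} sketch_token_layout;

/* Same loop as above. clear_pos must be at least 1. */
sketch_token_layout token_layout(unsigned clear_pos)
{
    sketch_token_layout t = { 0xFFF, 12 };
    unsigned i;

    for (i = clear_pos - 1; i >= 0x10; i >>= 1) {
        t.length_mask >>= 1;   /* one bit fewer for the length */
        t.delta_shift--;       /* one bit more for the delta */
    }
    return t;
}

/* Split a 16-bit token into its raw delta and raw length fields. */
void split_token(uint16_t token, unsigned clear_pos,
                 unsigned *raw_delta, unsigned *raw_length)
{
    sketch_token_layout t = token_layout(clear_pos);

    *raw_delta = (unsigned)token >> t.delta_shift;
    *raw_length = (unsigned)(token & t.length_mask);
}
```

    With 18 bytes already decompressed, the token 0x900A splits into raw
    fields 18 and 10, matching the (18,10) pair shown above. With one byte
    decompressed, 0x0FFC splits into 0 and 0xFFC.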

    FIXME: Compression unit's size 2^4 in attribute header.
    The compression method is based on independently compressing blocks of X
    clusters, where X is determined from the compression_unit value found in the
    non-resident attribute record header (more precisely: X = 2^compression_unit
    clusters). On Windows NT/2k, X always is 16 clusters (compression_unit = 4).

      1) The data in the block is all zero (a sparse block):
        This is stored as a sparse block in the run list, i.e. the run list
        entry has length = X and lcn = -1. The mapping pairs array actually
        uses a delta_lcn value length of 0, i.e. delta_lcn is not present at
        all, which is then interpreted by the driver as lcn = -1.
        NOTE: Even uncompressed files can be sparse on NTFS 3.0 volumes, then
        the same principles apply as above, except that the length is not
        restricted to being any particular value.

      2) The data in the block is not compressed:
        This happens when compression doesn't reduce the size of the block
        in clusters. I.e. if compression has a small effect so that the
        compressed data still occupies X clusters, then the uncompressed data
        is stored in the block.
        This case is recognised by the fact that the run list entry has
        length = X and lcn >= 0. The mapping pairs array stores this as
        normal with a run length of X and some specific delta_lcn, i.e.
        delta_lcn has to be present.

      3) The data in the block is compressed:
        The common case. This case is recognised by the fact that the run
        list entry has length L < X and lcn >= 0. The mapping pairs array
        stores this as normal with a run length of L and some specific
        delta_lcn, i.e. delta_lcn has to be present. This run list entry is
        immediately followed by a sparse entry with length = X - L and
        lcn = -1. The latter entry is to make up the vcn counting to the
        full compression block size X.

    In fact, life is more complicated because adjacent entries of the same type
    can be coalesced. This means that one has to keep track of the number of
    clusters handled and work on a basis of X clusters at a time being one
    block. An example: if length L > X this means that this particular run list
    entry contains a block of length X and part of one or more blocks of length
    L - X. Another example: if length L < X, this does not necessarily mean that
    the block is compressed as it might be that the lcn changes inside the block
    and hence the following run list entry describes the continuation of the
    potentially compressed block. The block would be compressed if the
    following run list entry describes at least X - L sparse clusters, thus
    making up the compression block length as described in point 3 above. (Of
    course, there can be several run list entries with small lengths so that the
    sparse entry does not follow the first data containing entry with
    length < X.)

    NOTE: At the end of the compressed attribute value, there most likely is not
    just the right amount of data to make up a compression block, thus this data
    is not even attempted to be compressed. It is just stored as is.

    (1000 A) (0 6)       //(rel.VCN length)

    (1000 10)

    (1000 A) (1040 6)

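    # 16 * 256 distinct two-byte pairs = 8192 bytes (16 clusters of 512 bytes).
    # The inner loop over j is missing from the original notes and is assumed.
    s = bytearray()
    for i in range(0, 16):      # adjust for clusters > 512 bytes if necessary
        for j in range(0, 256):
            s += bytes([i, j])
    with open("uncompressable", "wb") as f:
        f.write(s)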

Runlist:
    21 14 00 01 11 10 18 11 05 15 01 27 11 20 05

Decode
    0x14   at   0x100   21 0x100, 0x14
    0x10   at  + 0x18   11  0x18, 0x10
    0x05   at  + 0x15   11  0x15, 0x05
    0x27   at  + none   01  0x27, none
    0x20   at  + 0x05   11  0x05, 0x20

Absolute LCNs
    0x14   at   0x100
    0x10   at   0x118
    0x05   at   0x12D
    0x27   at   none
    0x20   at   0x132

Regroup
    0x10   at   0x100

    0x04   at   0x110
    0x0C   at   0x118

    0x04   at   0x124
    0x05   at   0x12D
    0x07   at   none

    0x10   at   none

    0x10   at   none

    0x10   at   0x132

    0x10   at   0x142

Compression unit beginning at VCN 0x0
 0x10 clusters at LCN 0x100
 Unit not compressed

Compression unit beginning at VCN 0x10
 0x4 clusters at LCN 0x110
 0xC clusters at LCN 0x118
 Unit not compressed

Compression unit beginning at VCN 0x20
 0x4 clusters at LCN 0x124
 0x5 clusters at LCN 0x12D
 0x7 unused clusters: compressed unit

Compression unit beginning at VCN 0x30
 0x10 zeroed clusters: sparse unit

Compression unit beginning at VCN 0x40
 0x10 zeroed clusters: sparse unit

Compression unit beginning at VCN 0x50
 0x10 clusters at LCN 0x132
 Unit not compressed

Compression unit beginning at VCN 0x60
 0x10 clusters at LCN 0x142
 Unit not compressed
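
The regrouping above can be reproduced with a short Python sketch. It is only an illustration, not the driver's algorithm: `split_into_units` and `classify` are hypothetical names, runs are (length, absolute LCN) pairs with None standing for a sparse run, and X is assumed to be 0x10 clusters.

```python
UNIT = 0x10  # X: clusters per compression unit (assumes compression_unit = 4)

def split_into_units(runs, unit=UNIT):
    """Cut (length, lcn) runs into per-unit pieces; lcn is None for sparse."""
    units, current, filled = [], [], 0
    for length, lcn in runs:
        while length > 0:
            take = min(length, unit - filled)
            current.append((take, lcn))
            filled += take
            length -= take
            if lcn is not None:
                lcn += take
            if filled == unit:
                units.append(current)
                current, filled = [], 0
    if current:
        units.append(current)  # trailing partial unit, stored as is
    return units

def classify(pieces):
    """Label one unit following cases 1-3 of the notes."""
    data = sum(n for n, lcn in pieces if lcn is not None)
    sparse = sum(n for n, lcn in pieces if lcn is None)
    if data == 0:
        return "sparse"
    if sparse == 0:
        return "not compressed"
    return "compressed"

# The "Absolute LCNs" table from the example above.
runs = [(0x14, 0x100), (0x10, 0x118), (0x05, 0x12D), (0x27, None), (0x20, 0x132)]
units = split_into_units(runs)
labels = [classify(u) for u in units]
for number, (pieces, label) in enumerate(zip(units, labels)):
    print(hex(number * UNIT), pieces, label)
```

Working one unit at a time is what lets coalesced run list entries (a run longer than X, or a unit spread over several short runs) be handled uniformly.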

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

file.txt is 31 KB (the disk has a 1 KB cluster size)

it's stored at clusters 10-26, 45-49, 100-108

17 clusters at LCN 10
5  clusters at LCN 45
9  clusters at LCN 100

next make the offsets relative to the previous run

17 clusters at LCN 10
5  clusters at LCN +35
9  clusters at LCN +55

is encoded as

11 11 0A  11 05 23  11 09 37  00

working in unit of 16 clusters
relative offsets (including -ve)
compressed sparse
variable length structures
stored as:
save space implies wherever MFT places
data it's best not to spread it too far.

-ve implies an offset of +129 would have to use two bytes
therefore -10 = 0xF6
0x80 = -128
0XFF7F = -129
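
A minimal Python sketch of the encoding these notes describe: relative offsets, stored as signed values in as few bytes as possible. The names `signed_bytes` and `encode_runlist` are hypothetical. Encoding the length as a signed value and ending the list with a zero byte are assumptions not spelled out in the notes above.

```python
def signed_bytes(value):
    """Smallest little-endian two's-complement encoding of value."""
    size = 1
    while not -(1 << (8 * size - 1)) <= value < (1 << (8 * size - 1)):
        size += 1
    return value.to_bytes(size, "little", signed=True)

def encode_runlist(runs):
    """Encode (length, absolute_lcn) runs; lcn None marks a sparse run."""
    out, prev = bytearray(), 0
    for length, lcn in runs:
        len_bytes = signed_bytes(length)      # assumption: length is signed too
        if lcn is None:
            off_bytes = b""                   # sparse: no offset stored
        else:
            off_bytes = signed_bytes(lcn - prev)
            prev = lcn
        # Header byte: high nibble = offset size, low nibble = length size.
        out.append(len(off_bytes) << 4 | len(len_bytes))
        out += len_bytes + off_bytes
    out.append(0)                             # assumed terminator
    return bytes(out)

def hexdump(data):
    return " ".join("%02X" % b for b in data)

print(hexdump(signed_bytes(-10)))    # one byte
print(hexdump(signed_bytes(-129)))   # two bytes, low byte first
print(hexdump(encode_runlist([(17, 10), (5, 45), (9, 100)])))
```

Because small deltas take fewer bytes, keeping a file's runs close together also keeps its run list short.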

21 14 00 01 11 10 18 11 05 15 01 27 11 20 05

    Example: 21 20 ED 05 22 48 07 48 22 21 28 C8 DB
    First run: 20 clusters starting from 5ED (5ED to 60D)
    2nd run: 748 clusters starting from 5ED+2248 (2835 to 2F7D)
    3rd run: 28 clusters starting from 2835+DBC8 (3FD to 425)
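
    The decoding of this example can be checked with a small Python sketch. `decode_runlist` is a hypothetical helper, and stopping at a zero header byte is an assumption. Lengths and LCNs are printed in hex to match the notes.

```python
def decode_runlist(data):
    """Decode mapping pairs into (length, absolute_lcn); lcn None = sparse."""
    runs, pos, lcn = [], 0, 0
    while pos < len(data) and data[pos] != 0:
        header = data[pos]
        len_size = header & 0x0F      # low nibble: bytes in the length field
        off_size = header >> 4        # high nibble: bytes in the offset field
        pos += 1
        length = int.from_bytes(data[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((length, None))   # no offset stored: sparse run
        else:
            delta = int.from_bytes(data[pos:pos + off_size], "little", signed=True)
            lcn += delta                  # offsets are relative to the previous run
            runs.append((length, lcn))
        pos += off_size
    return runs

example = bytes.fromhex("21 20 ED 05 22 48 07 48 22 21 28 C8 DB")
for length, lcn in decode_runlist(example):
    print(hex(length), "clusters at", hex(lcn))
```

    The third run shows why the offset must be read as signed: DBC8 is negative, so the run lands below the previous one.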

    Run 1 length 30 offset 60 (first run relative to 0)
    Run 2 length 10 offset 100 + 60
    Run 3 length 20 offset 160 - 20 (E0 == -20)
                 ==
                 80

    21 09 F5 47  9 clusters from 47F5
    01 07        7 clusters from nowhere (0)
    11 07 09     7 clusters from 47F5 + 9
      ====
      0x17

    123456789ABCDEFG1234... VCN
    RRRRRRRRRZZZZZZZRRRR... Real/Zero

    VCN0123...
       XXXXXXXXXXOOOOO  X=DATA O=SPACE

    21 0A 10 F6   10 clusters of compressed data at F610
    01 06         6 clusters of nothing to round up this block to 16

    21 10 10 F6   16 clusters of compressed data at F610

A directory can even have a named data stream

    May not exist on Win2K (std info, $secure)

    unnamed data stream compulsory (chkdsk will put it back if missing)
    named data streams optional (any limit to the number?)

    access with "jim.txt:stream"

link table to notes

    The sequence number is a circular counter (skipping 0) describing how many
    times the referenced mft record has been (re)used. This has to match the
    sequence number of the mft record being referenced, otherwise the reference
    is considered stale and removed (FIXME: only ntfsck or the driver itself?).

    If the sequence number is zero it is assumed that no sequence number
    consistency checking should be performed.

    FIXME: The mft zone is defined as the first 12% of the volume. This space is
    reserved so that the mft can grow contiguously and hence doesn't become
    fragmented. Volume free space includes the empty part of the mft zone and
    when the volume's free 88% are used up, the mft zone is shrunk by a factor
    of 2, thus making more space available for more files/data. This process is
    repeated every time there is no more free space except for the mft zone until
    there really is no more free space.

    The mft record header is present at the beginning of every record in the mft.
    This is followed by a sequence of variable length attribute records which
    is terminated by an attribute of type $END which is a truncated attribute
    in that it only consists of the attribute type code $END and none of the
    other members of the attribute structure are present.

    When (re)using the mft record, we place the update sequence array at this
    offset, i.e. before we start with the attributes. This also makes sense,
    otherwise we could run into problems with the update sequence array
    containing in itself the last two bytes of a sector which would mean that
    multi sector transfer protection wouldn't work: you can't protect data
    by overwriting it, since you then can't get it back.
    When reading we obviously use the data from the ntfs record header.

    Size defined in $Boot.
    A FILE record is 1 KB large or the cluster size if larger (as far as Helen is
    concerned, its maximum size is 4 KB, but Windows NT 4 limit is 64 KB). It falls into
    2 parts:

    seq num = inode for 0x00 < i < 0x10 (inode 0 (MFT) has seq num of 1)

    see also attribute id page and file reference page

    flags 1 in use, 2 dir, 4 ???, 8??? (4+8 ARE used)

    mft references (aka file references or file record segment references) are
    used whenever a structure needs to refer to a record in the mft.

    A reference consists of a 48-bit index into the mft and a 16-bit sequence
    number used to detect stale references.
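
    A minimal Python sketch of splitting a reference, assuming the 48-bit index occupies the low bits of the 64-bit value and the sequence number the top 16 bits. The function names are hypothetical.

```python
INDEX_MASK = (1 << 48) - 1

def split_mft_reference(ref):
    """Return (mft index, sequence number) from a 64-bit reference."""
    return ref & INDEX_MASK, ref >> 48

def make_mft_reference(index, sequence):
    return (sequence << 48) | (index & INDEX_MASK)

def is_stale(ref, record_sequence):
    """A zero sequence number in the reference disables the check."""
    _, sequence = split_mft_reference(ref)
    return sequence != 0 and sequence != record_sequence

root = make_mft_reference(5, 5)   # hypothetical reference to record 5
print(hex(root), split_mft_reference(root), is_stale(root, 6))
```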

    when is the seq num incremented

(a) These values are relative to 0x18
(b) Has children

This is only applicable to a file index ($I30)

    indx help describe as "index = key + data"

    given an INDX record, it's difficult to work out what's
    being indexed (that info is in the index root)

    N.B. the filename is not null terminated
    surely the flags can't be 8 bytes long
    table for the flags
    VCN of ib only exists when flags&1
    last entry has a size of 0x10 (just large enough
    for the flags (and a mft ref of zero))

    Index entry flags (16-bit).

    INDEX_ENTRY_NODE = cpu_to_le16(1), This entry contains a sub-node,
                      i.e. a reference to an index
                      block in form of a virtual
                      cluster number (see below).
    INDEX_ENTRY_END  = cpu_to_le16(2), This signifies the last entry in
                      an index block. The index entry
                      does not represent a file but it
                      can point to a sub-node.

    This is an index entry. A sequence of such entries follows each INDEX_HEADER
    structure. Together they make up a complete index. The index follows either
    an index root attribute or an index allocation attribute.

    NOTE: Before NTFS 3.0 only filename attributes were indexed.

    NTFS represents POSIX-style hard links as a single file with multiple filenames.
    This is different to one file with names in different namespaces.
    Delete a name from a hard linked file and only the name will be removed.

Offset(length)         Description
0(4)                   Magic number 'RSTR'
1E(12)                 Fixup
30(4)                  LSNa
58(4)                  LSNb
60(4)                  LSNc (==LSNa?)
6C(1)                  Volume clear flag
78(8)                  Unicode string 'NTFS'

    link back to sec page

    S-1-5-21-646518322-1873620750-619646970-1110
    S for security id
    1 Revision level
    5 Identifier Authority (48 bit) 5 = logon id
    21 Sub-authority (21 = nt non unique)
    646518322        SA
    1873620750        SA domain id
    619646970        SA
    1110        user id

    These relative identifiers (RIDs) are used with the above identifier
    authorities to make up universal well-known SIDs.

    Note: The relative identifier (RID) refers to the portion of a SID, which
    identifies a user or group in relation to the authority that issued the SID.
    For example, the universal well-known SID Creator Owner ID (S-1-3-0) is
    made up of the identifier authority SECURITY_CREATOR_SID_AUTHORITY (3) and
    the relative identifier SECURITY_CREATOR_OWNER_RID (0).

    fix the data runs page for NT4 (old style)
    13 b8 ae 04 ff 00     old
    03 b8 ae 04 00        new
    bad clus on NT4 sparse data runs use -1!

        FIXME
        "BAAD" == corrupt record
        "CHKD" == chkdsk ???
        "FILE" == mft entry
        "HOLE" == ??? (NTFS 3.0+?)
        "INDX" == index buffer
        RSTR & ???

        Dynamic disk SDS, win2k

        FRS = MFT File Record

      The valid format for a GUID is {XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}

      Globally Unique Identifier (GUID)

      GUID structures store globally unique identifiers (GUID). A GUID is a
      128-bit value consisting of one group of eight hexadecimal digits, followed
      by three groups of four hexadecimal digits each, followed by one group of
      twelve hexadecimal digits. GUIDs are Microsoft's implementation of the
      distributed computing environment (DCE) universally unique identifier (UUID).
      Example of a GUID:
              1F010768-5A73-BC91-0010A52216A7

      order stored on disk?

      01020304-0506-0708-090A-0B0C0D0E0F10

      0x00  04030201
      0x04  0605
      0x06  0807
      0x08  090A0B0C0D0E0F10
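
      The byte order listed above matches what Python's `uuid` module calls `bytes_le` (first three groups little-endian, the rest in order). A short sketch, assuming NTFS stores GUIDs in that Windows layout, which these notes still mark as an open question:

```python
import uuid

guid = uuid.UUID("01020304-0506-0708-090a-0b0c0d0e0f10")

# bytes_le swaps the byte order of the first three groups only.
on_disk = guid.bytes_le
print(on_disk.hex())

# Reading it back from the on-disk form gives the original GUID.
restored = uuid.UUID(bytes_le=on_disk)
print("{%s}" % str(restored).upper())
```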

        meta-data

        Data about data. In data processing, meta-data is definitional data
        that provides information about or documentation of other data managed
        within an application or environment.

        For example, meta data would document data about data elements or
        attributes, (name, size, data type, etc) and data about records or
        data structures (length, fields, columns, etc) and data about data
        (where it is located, how it is associated, ownership, etc.). Meta
        data may include descriptive information about the context, quality
        and condition, or characteristics of the data.

        multiple sectors, fixup, safety checks

        partition table...
        SFS Win2K dynamic disk

        standardise 4 time fields name & description concept page?
        refer to 4 times as:
          C creation
          A alter (modification)
          M mft (mft changed)
          R read (last access)

        FIXME:
        NOTE: There is conflicting information about the meaning of each of the time
        fields but the meaning as defined below has been verified to be
        correct by practical experimentation on Windows NT4 SP6a and is hence
        assumed to be the one and only correct interpretation.

        creation_time
        Time file was created. Updated when a filename is changed(?).

        last_data_change_time
        Time the data attribute was last modified.

        last_mft_change_time
        Time this mft record was last modified.

        last_access_time
        Approximate time when the file was last accessed (obviously this is not
        updated on read-only volumes). In Windows this is only updated when
        accessed if some time delta has passed since the last update.

        The Update Sequence Array (usa) is an array of the __u16 values which belong
        to the end of each sector protected by the update sequence record in which
        this array is contained. Note that the first entry is the Update Sequence
        Number (usn), a cyclic counter of how many times the protected record has
        been written to disk. The values 0 and -1 (i.e. 0xffff) are not used. All
        last __u16's of each sector have to be equal to the usn (during reading) or
        are set to it (during writing). If they are not, an incomplete multi sector
        transfer has occurred when the data was written.
        The maximum size for the update sequence array is fixed to:
                maximum size = usa_ofs + (usa_count * 2) = 510 bytes
        The 510 bytes comes from the fact that the last __u16 in the array has to
        (obviously) finish before the last __u16 of the first 512-byte sector.
        This formula can be used as a consistency check in that usa_ofs +
        (usa_count * 2) has to be less than or equal to 510.
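
        The read-side check can be sketched in Python. This is a minimal illustration, assuming 512-byte sectors and that usa_ofs and usa_count are the two __u16 fields at offsets 4 and 6 of the record header. `apply_fixups` is a hypothetical name.

```python
import struct

SECTOR = 512

def apply_fixups(record):
    """Verify each sector tail against the usn, then restore the saved values."""
    usa_ofs, usa_count = struct.unpack_from("<HH", record, 4)
    if usa_ofs + usa_count * 2 > SECTOR - 2:
        raise ValueError("update sequence array runs into the first sector tail")
    usa = struct.unpack_from("<%dH" % usa_count, record, usa_ofs)
    usn, saved = usa[0], usa[1:]
    fixed = bytearray(record)
    for i, original in enumerate(saved):
        tail = (i + 1) * SECTOR - 2
        if struct.unpack_from("<H", record, tail)[0] != usn:
            raise ValueError("incomplete multi sector transfer in sector %d" % i)
        struct.pack_into("<H", fixed, tail, original)
    return bytes(fixed)

# Build a fake two-sector record: usn = 5, saved tails 0xAAAA and 0xBBBB.
rec = bytearray(2 * SECTOR)
rec[0:4] = b"FILE"
struct.pack_into("<HH", rec, 4, 0x30, 3)
struct.pack_into("<3H", rec, 0x30, 5, 0xAAAA, 0xBBBB)
struct.pack_into("<H", rec, SECTOR - 2, 5)
struct.pack_into("<H", rec, 2 * SECTOR - 2, 5)
restored = apply_fixups(bytes(rec))
print(restored[510:512].hex(), restored[1022:1024].hex())
```

        Writing is the reverse: save the real sector tails into the array, then overwrite each tail with the new usn.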

used for logging

Key	Name	Description
12	Fixed	This field is twelve bytes long. Its size is constant.
P8	Padding	P8 means pad the field to an 8 byte boundary. The size of this field could be 0 - 7 bytes. P4 means 4 byte alignment, etc (a)
V	Variable	The length of this field depends on its contents. An example is a SID. To know its length, you must decode the structure.
S	X-Ref	A cross-reference shows that the size is defined elsewhere in the table. The size can be represented by any letter, except P or V.

OS	NTFS	Description
blank	all	Used by all versions of Windows
NT	1.2	Only used in Windows NT
2K	3.0	Windows 2000 and later
XP	3.1	New to Windows XP

Flag	Description
0x0001	Read-Only
0x0002	Hidden
0x0004	System
0x0020	Archive
0x0040	Device
0x0080	Normal
0x0100	Temporary
0x0200	Sparse File
0x0400	Reparse Point
0x0800	Compressed
0x1000	Offline
0x2000	Not Content Indexed
0x4000	Encrypted

Offset	Size	Description
~	~	Standard Attribute Header
0x00	8	File reference to the parent directory.
0x08	8	C Time - File Creation
0x10	8	A Time - File Altered
0x18	8	M Time - MFT Changed
0x20	8	R Time - File Read
0x28	8	Allocated size of the file
0x30	8	Real size of the file
0x38	4	Flags, e.g. Directory, compressed, hidden
0x3c	4	Used by EAs and Reparse
0x40	1	Filename length in characters (L)
0x41	1	Filename namespace
0x42	2L	File name in Unicode (not null terminated)

Flag	Description
0x0001	Read-Only
0x0002	Hidden
0x0004	System
0x0020	Archive
0x0040	Device
0x0080	Normal
0x0100	Temporary
0x0200	Sparse File
0x0400	Reparse Point
0x0800	Compressed
0x1000	Offline
0x2000	Not Content Indexed
0x4000	Encrypted
0x10000000	Directory (copy from corresponding bit in MFT record)
0x20000000	Index View (copy from corresponding bit in MFT record)
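
A flags field from the table above can be decoded with a simple lookup. A minimal Python sketch: `FILE_FLAGS` and `flag_names` are hypothetical names, and bits not listed in the table are reported as a hex remainder.

```python
FILE_FLAGS = {
    0x0001: "Read-Only",
    0x0002: "Hidden",
    0x0004: "System",
    0x0020: "Archive",
    0x0040: "Device",
    0x0080: "Normal",
    0x0100: "Temporary",
    0x0200: "Sparse File",
    0x0400: "Reparse Point",
    0x0800: "Compressed",
    0x1000: "Offline",
    0x2000: "Not Content Indexed",
    0x4000: "Encrypted",
    0x10000000: "Directory",
    0x20000000: "Index View",
}

def flag_names(value):
    """List the names of all set bits, lowest bit first."""
    names = [name for bit, name in FILE_FLAGS.items() if value & bit]
    unknown = value & ~sum(FILE_FLAGS)   # the keys are distinct bits
    if unknown:
        names.append(hex(unknown))
    return names

print(flag_names(0x10000006))
```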

Offset	Size	Description
0x00	2	Offset to data
0x02	2	Size of data
0x04	4	Key	SID
0x08	4	Data	Owner Id
0x0C	4	Data	Hash

Type	OS	Name
0x10		$STANDARD_INFORMATION
0x20		$ATTRIBUTE_LIST
0x30		$FILE_NAME
0x40	NT	$VOLUME_VERSION
0x40	2K	$OBJECT_ID
0x50		$SECURITY_DESCRIPTOR
0x60		$VOLUME_NAME
0x70		$VOLUME_INFORMATION
0x80		$DATA
0x90		$INDEX_ROOT
0xA0		$INDEX_ALLOCATION
0xB0		$BITMAP
0xC0	NT	$SYMBOLIC_LINK
0xC0	2K	$REPARSE_POINT
0xD0		$EA_INFORMATION
0xE0		$EA
0xF0	NT	$PROPERTY_SET
0x100	2K	$LOGGED_UTILITY_STREAM

Offset	Size	Description
~	~	Standard Attribute Header
0x00	4	Type
0x04	2	Record length
0x06	1	Name length (N)
0x07	1	Offset to Name (a)
0x08	8	Starting VCN (b)
0x10	8	Base File Reference of the attribute
0x18	2	Attribute Id (c)
0x1A	2N	Name in Unicode (if N >0)

Offset	Size	Name	Description
~	~	Standard Attribute Header
0x00	16	GUID Object Id	Unique Id assigned to file
0x10	16	GUID Birth Volume Id	Volume where file was created
0x20	16	GUID Birth Object Id	Original Object Id of file
0x30	16	GUID Domain Id	Domain in which object was created

Component			Description
Header			Offsets to various structures
Audit ACL	ACE	SID	ACEs for the Audit ACL
Permissions ACL	ACE	SID	ACEs for the Permissions ACL
	ACE	SID
	ACE	SID
SID (User)			The owner of this object
SID (Group)			The owner of this object

Offset	Size	Description
0x00	1	Revision (a)
0x01	1	Padding
0x02	2	Control Flags (b)
0x04	4	Offset to User SID
0x08	4	Offset to Group SID
0x0C	4	Offset to SACL
0x10	4	Offset to DACL

Offset	Size	Description
0x00	1	ACL Revision
0x01	1	Padding (0x00)
0x02	2	ACL size
0x04	2	ACE count
0x06	2	Padding (0x0000)

Value	Description
0x01	Object inherits ACE
0x02	Container inherits ACE
0x04	Don't propagate 'Inherit ACE'
0x08	Inherit only ACE

Bit(Range)	Meaning	Description / Examples
0 - 15	Object Specific Access Rights	Read data, Execute, Append data
16 - 22	Standard Access Rights	Delete, Write ACL, Write Owner
23	Can access security ACL
24 - 27	Reserved
28	Generic ALL (Read, Write, Execute)	Everything below
29	Generic Execute	All things necessary to execute a program
30	Generic Write	All things necessary to write to a file
31	Generic Read	All things necessary to read a file

S	Security
p	Revision number (currently 1)
q	NT Authority. This number is divided into 6 bytes (48 bit big-endian number).
r-v	NT Sub-authorities (there can be many of these)
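
The S-p-q-r form can be produced from a binary SID with a few lines of Python. This sketch assumes the usual Windows layout, which is not spelled out in these notes: one byte revision, one byte sub-authority count, the 48-bit big-endian authority, then little-endian 32-bit sub-authorities. `sid_to_string` is a hypothetical name.

```python
import struct

def sid_to_string(raw):
    """Format a binary SID as S-revision-authority-subauthorities."""
    revision, count = raw[0], raw[1]
    authority = int.from_bytes(raw[2:8], "big")        # 48-bit big-endian
    subs = struct.unpack_from("<%dI" % count, raw, 8)  # assumed little-endian
    parts = ["S", str(revision), str(authority)] + [str(s) for s in subs]
    return "-".join(parts)

# Binary form of the example SID used elsewhere in these notes.
raw = (bytes([1, 5]) + (5).to_bytes(6, "big")
       + struct.pack("<5I", 21, 646518322, 1873620750, 619646970, 1110))
print(sid_to_string(raw))
```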

Value	Description
0x0001	Dirty
0x0002	Resize LogFile
0x0004	Upgrade on Mount
0x0008	Mounted on NT4
0x0010	Delete USN underway
0x0020	Repair Object Ids
0x8000	Modified by chkdsk

Flag	Description
0x00	Small Index (fits in Index Root)
0x01	Large index (Index Allocation needed)

Flag	Description
0x20000000	Is alias
0x40000000	Is high latency
0x80000000	Is Microsoft
0x68000005	NSS
0x68000006	NSS recover
0x68000007	SIS
0x68000008	DFS
0x88000003	Mount point
0xA8000004	HSM
0xE8000000	Symbolic link

Flag	Description
0x0001	Owner Defaulted
0x0002	Group Defaulted
0x0004	DACL Present
0x0008	DACL Defaulted
0x0010	SACL Present
0x0020	SACL Defaulted
0x0100	DACL Auto Inherit Req
0x0200	SACL Auto Inherit Req
0x0400	DACL Auto Inherited
0x0800	SACL Auto Inherited
0x1000	DACL Protected
0x2000	SACL Protected
0x4000	RM Control Valid
0x8000	Self Relative

Offset	Size	Description
~	~	Standard Attribute Header
0x00	8	Always zero?
0x08	1	Major version number
0x09	1	Minor version number
0x0A	2	Flags
0x0C	4	Always zero?

Offset	Size	Description
~	~	Standard Attribute Header
0x00	4	Attribute Type
0x04	4	Collation Rule
0x08	4	Size of Index Allocation Entry (bytes)
0x0C	1	Clusters per Index Record
0x0D	3	Padding (Align to 8 bytes)

Offset	Size	Description
0x00	4	Offset to first Index Entry
0x04	4	Total size of the Index Entries
0x08	4	Allocated size of the Index Entries
0x0C	1	Flags
0x0D	3	Padding (align to 8 bytes)

Name	Index Of	Used By
$I30	Filenames	Directories
$SDH	Security Descriptors	$Secure
$SII	Security Ids	$Secure
$O	Object Ids	$ObjId
$O	Owner Ids	$Quota
$Q	Quotas	$Quota
$R	Reparse Points	$Reparse

Offset	Size	Description
~	~	Standard Attribute Header
The next field is only valid when the last entry flag is not set
0x00	8	File reference
0x08	2	L = Length of the index entry
0x0A	2	M = Length of the stream
0x0C	1	Flags
The next field is only present when the last entry flag is not set
0x10	M	Stream
The next field is only present when the sub-node flag is set
L - 8	8	VCN of the sub-node in the index allocation attribute

Flag	Description
0x01	Index entry points to a sub-node
0x02	Last index entry in the node
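
The index entry layout and its two flags can be walked with a short Python sketch. It follows the table above literally (offsets relative to the start of the entry). `parse_index_entry` is a hypothetical name.

```python
import struct

INDEX_ENTRY_NODE = 0x01   # entry points to a sub-node
INDEX_ENTRY_END = 0x02    # last entry in the node

def parse_index_entry(buf, offset=0):
    """Parse one index entry starting at offset."""
    file_ref, length, stream_len, flags = struct.unpack_from("<QHHB", buf, offset)
    entry = {"length": length, "flags": flags,
             "file_ref": None, "stream": b"", "vcn": None}
    if not flags & INDEX_ENTRY_END:
        # File reference and stream are only meaningful before the last entry.
        entry["file_ref"] = file_ref
        start = offset + 0x10
        entry["stream"] = bytes(buf[start:start + stream_len])
    if flags & INDEX_ENTRY_NODE:
        # The sub-node VCN occupies the final 8 bytes of the entry.
        entry["vcn"] = struct.unpack_from("<Q", buf, offset + length - 8)[0]
    return entry

# Two made-up entries: one with an 8-byte stream, then a last entry with a sub-node.
first = struct.pack("<QHHB3x8s", 0x0001000000000020, 0x18, 8, 0x00, b"ABCDEFGH")
last = struct.pack("<QHHB3xQ", 0, 0x18, 0, 0x03, 2)
node = first + last
e1 = parse_index_entry(node, 0)
e2 = parse_index_entry(node, e1["length"])
print(e1, e2)
```

Stepping by the entry length until the last-entry flag is seen visits every entry in a node.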

Offset	Size	Description
~	~	Standard Attribute Header
0x00	4	Reparse Type (and Flags)
0x04	2	Reparse Data Length
0x06	2	Padding (align to 8 bytes)
0x08	V	Reparse Data (a)

Offset	Size	Description
0x00	2	Substitute Name Offset
0x02	2	Substitute Name Length
0x04	2	Print Name Offset
0x06	2	Print Name Length
0x08	V	Path Buffer

Offset	Size	Description
~	~	Standard Attribute Header
0x00	2	Size of the packed Extended Attributes
0x02	2	Number of Extended Attributes which have NEED_EA set
0x04	4	Size of the unpacked Extended Attributes

NTFS Documentation

Richard Russon

Yuval Fledel

Chapter 1. Prologue

NTFS Documentation Preface

About the NTFS Documentation

Overview

Documentation Layout

Accuracy

Contact Points

License

Thanks

Tables Legend

Overview

Footnotes

Size Fields

Indexes

Operating System

Volume Layout

Overview

Notes

Other information

MFT Zone

Chapter 2. NTFS Attributes

Overview

Notes

Other Information

Attribute - $STANDARD_INFORMATION (0x10)

Overview

Layout of the Attribute (Resident)

File Permissions

Notes

Other Information

Questions

Attribute - $ATTRIBUTE_LIST (0x20)

Overview

Layout of the Attribute

Notes

$AttrDef

Other Information

To Do

Attribute - $FILE_NAME (0x30)

Overview

Layout of the Attribute (Resident)

File Reference

File Size

Flags

Notes

Other Information

Attribute - $OBJECT_ID (0x40)

Overview

Layout of the Attribute

Birth Volume Id

Birth Object Id

Domain Id

Notes

Other Information

Attribute - $SECURITY_DESCRIPTOR (0x50)

Overview

Layout of the Attribute

Notes

Size

Layout of the stream

Questions

To Do

Header

ACL

ACE

Types

Flags

Access Mask / Access Rights

SID (Security Identifier)

Security Descriptor Control Flags

OWNER DEFAULTED

GROUP DEFAULTED

DACL PRESENT

DACL DEFAULTED

SACL PRESENT

SACL DEFAULTED

SELF RELATIVE

Offset	Size	Description
~	~	Standard Attribute Header
0x00	4	Offset to next Extended Attribute
0x04	1	Flags
0x05	1	Name Length (N)
0x06	2	Value Length (V)
0x08	N	Name
N+0x08	V	Value

Inode	Filename	OS	Description
0	$MFT		Master File Table - An index of every file
1	$MFTMirr		A backup copy of the first 4 records of the MFT
2	$LogFile		Transactional logging file
3	$Volume		Serial number, creation time, dirty flag
4	$AttrDef		Attribute definitions
5	. (dot)		Root directory of the disk
6	$Bitmap		Contains volume's cluster map (in-use vs. free)
7	$Boot		Boot record of the volume
8	$BadClus		Lists bad clusters on the volume
9	$Quota	NT	Quota information
9	$Secure	2K	Security descriptors used by the volume
10	$UpCase		Table of uppercase characters used for collating
11	$Extend	2K	A directory: $ObjId, $Quota, $Reparse, $UsnJrnl
12-15	<Unused>		Marked as in use but empty
16-23	<Unused>		Marked as unused
Any	$ObjId	2K	Unique Ids given to every file
Any	$Quota	2K	Quota information
Any	$Reparse	2K	Reparse point information
Any	$UsnJrnl	2K	Journalling of Encryption
>24	A_File		An ordinary file
>24	A_Dir		An ordinary directory
...	...		...

Offset	Size	Description
0x00	128	Label in Unicode
0x80	4	Type
0x84	4	Display rule
0x88	4	Collation rule
0x8C	4	Flags
0x90	8	Minimum size
0x98	8	Maximum size

Flag	Description
0x00	Binary
0x01	Filename
0x02	Unicode String
0x10	Unsigned Long
0x11	SID
0x12	Security Hash
0x13	Multiple Unsigned Longs

Flag	Description
0x02	Indexed
0x40	Resident (always)
0x80	Non-Resident (allowed to be)

Offset	Size	Description
0x00	4	Size of entry
0x04	4	Flags? (bitfield?)
0x08	2	Offset to UNC Path
0x0A	2	Size of UNC Path
0x0C	2	Offset to data
0x0E	2	Size of data

Offset	Size	Description
0x0000	3	Jump to the boot loader routine
0x0003	8	System Id: "NTFS    "
0x000B	2	Bytes per sector
0x000D	1	Sectors per cluster
0x000E	7	Unused
0x0015	1	Media descriptor (a)
0x0016	2	Unused
0x0018	2	Sectors per track
0x001A	2	Number of heads
0x001C	8	Unused
0x0024	4	Usually 80 00 80 00 (b)
0x0028	8	Number of sectors in the volume
0x0030	8	LCN of VCN 0 of the $MFT
0x0038	8	LCN of VCN 0 of the $MFTMirr
0x0040	4	Clusters per MFT Record (c)
0x0044	4	Clusters per Index Record (c)
0x0048	8	Volume serial number
~	~	~
0x0200		Windows NT Loader
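
The fixed fields in this table can be pulled out with Python's `struct` module. A minimal sketch: `parse_boot_sector` is a hypothetical name, all fields are assumed little-endian, and sectors per cluster is treated as a plain count.

```python
import struct

def parse_boot_sector(sector):
    """Extract the geometry fields needed to locate the $MFT."""
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    total_sectors, mft_lcn, mftmirr_lcn = struct.unpack_from("<QQQ", sector, 0x28)
    serial, = struct.unpack_from("<Q", sector, 0x48)
    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "system_id": bytes(sector[0x03:0x0B]),
        "cluster_size": cluster_size,
        "total_sectors": total_sectors,
        "mft_lcn": mft_lcn,
        "mftmirr_lcn": mftmirr_lcn,
        "serial": serial,
        "mft_offset": mft_lcn * cluster_size,   # byte offset of VCN 0 of $MFT
    }

# A made-up sector: 512-byte sectors, 8 sectors per cluster, $MFT at LCN 4.
boot = bytearray(512)
boot[0x03:0x0B] = b"NTFS    "
struct.pack_into("<HB", boot, 0x0B, 512, 8)
struct.pack_into("<QQQ", boot, 0x28, 2000000, 4, 125000)
struct.pack_into("<Q", boot, 0x48, 0x1122334455667788)
info = parse_boot_sector(bytes(boot))
print(info)
```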

Offset	Size	Description
0x00	4	Hash of Security Descriptor
0x04	4	Security Id
0x08	8	Offset of this entry in this file
0x10	4	Size of this entry
0x14	V	Self-relative Security Descriptor
V+0x14	P16	Padding

Offset	Size	Value	Description
~	~	~	Standard Index Header
0x00	2	0x18	Offset to data
0x02	2	0x14	Size of data
0x04	4	0x00	Padding
0x08	2	0x30	Size of Index Entry
0x0A	2	0x08	Size of Index Key
0x0C	2		Flags
0x0E	2	0x00	Padding
0x10	4		Key	Hash of Security Descriptor
0x14	4		Key	Security Id
0x18	4		Data	Hash of Security Descriptor
0x1C	4		Data	Security Id
0x20	8		Data	Offset to Security Descriptor (in $SDS)
0x28	4		Data	Size of Security Descriptor (in $SDS)
0x2C	P8		Data	Padding

Flag	Description
0x0001	Default Limits
0x0002	Limit Reached
0x0004	Id Deleted
0x0010	Tracking Enabled
0x0020	Enforcement Enabled
0x0040	Tracking Requested
0x0080	Log Threshold
0x0100	Log Limit
0x0200	Out Of Date
0x0400	Corrupt
0x0800	Pending Deletes