Microsoft Access MDB Jet 4 & Jet 5

Posted by Admin on Thu Sep 11, 2008 8:38 pm

Access MDB File Format
This file documents the Microsoft Access MDB file format for Jet3 and Jet4 databases.

General Notes
-------------

Access (Jet) does not in general initialize pages to zero before writing them,
so the file will contain a lot of uninitialized data. This makes the task of
figuring out the format a bit more difficult than it otherwise would be.

This document will, generally speaking, provide all offsets and constants in
hex format.

Most multibyte pointers and integers are stored in little-endian (LSB-MSB)
order. Indexes are an exception; see the section on index pages for details.
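
As a small illustration of the byte order described above, here is a minimal
Python sketch that decodes little-endian integers from a buffer. The helper
names (read_u16_le, read_u32_le) and the sample bytes are made up for this
example and are not part of any MDB tool.

```python
import struct

def read_u16_le(buf, offset):
    """Decode an unsigned 16-bit little-endian integer at the given offset."""
    return struct.unpack_from("<H", buf, offset)[0]

def read_u32_le(buf, offset):
    """Decode an unsigned 32-bit little-endian integer at the given offset."""
    return struct.unpack_from("<I", buf, offset)[0]

# Hypothetical sample bytes: least significant byte comes first.
sample = bytes([0x34, 0x12, 0x78, 0x56, 0x34, 0x12])

print(hex(read_u16_le(sample, 0)))  # 0x1234
print(hex(read_u32_le(sample, 2)))  # 0x12345678
```

Values inside index pages should not be decoded this way without checking the
index page section first.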

Terminology
-----------

This section contains a mix of information about data structures used in the MDB
file format along with general database terminology needed to explain these
structures.

Page          - A fixed-size region within the file on a 2K or 4K boundary. All
                data in the file exists inside pages.
Catalog Table - Tables in Access generally starting with "MSys". See the TDEF
                (table definition) pages for the "System Table" field.
Catalog Entry - A row from the MSysObjects table describing another database
                object. The MSysObjects table definition page is always at
                page 2 of the database, and a phony tdef structure is
                bootstrapped to initially read the database.
Page Split    - A process in which a row is added to a page with no space left.
                A second page is allocated, rows on the original page are
                split between the two pages, and then indexes are updated.
                A variety of algorithms can be used for splitting the rows,
                the most popular being a 50/50 split in which rows are divided
                evenly between pages.
Overflow Page - Instead of doing a full page split with associated index writes,
                a pointer to an "overflow" page can be stored at the original
                row's location. Compacting a database would normally rewrite
                overflow pages back into regular pages.
Leaf Page     - The lowest page on an index tree. In Access, leaf pages are of
                a different type than other index pages.
UCS-2         - A two-byte Unicode encoding used in Jet4 files.
Covered Query - A query that can be satisfied by reading only index pages. For
                instance, if the query
                "SELECT count(*) from Table1 where Column3 = 4" were run and
                Column3 was indexed, the query could be satisfied by reading
                only indexes. Because of the way Access hashes text columns
                in indexes, covered queries on text columns are not possible.

Pages
-----

At its topmost level, an MDB file is organized into a series of fixed-size
pages. These are 2K in size for Jet3 (Access 97) and 4K for Jet4 (Access
2000/2002). All data in MDB files exists within pages, of which there are
a number of types.

The first byte of each page identifies the page type as follows.

0x00 Database definition page. (Always page 0)
0x01 Data page
0x02 Table definition
0x03 Intermediate Index pages
0x04 Leaf Index pages
0x05 Page Usage Bitmaps (extended page usage)
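
To make the page layout concrete, the following Python sketch walks a buffer
in fixed-size steps and reports the type of each page from its first byte.
It is only an illustration built from the table above: the names PAGE_TYPES,
page_type and iter_page_types are hypothetical, and the page size must be
supplied by the caller (0x800 for Jet3, 0x1000 for Jet4).

```python
# Page type codes taken from the table above.
PAGE_TYPES = {
    0x00: "Database definition page",
    0x01: "Data page",
    0x02: "Table definition",
    0x03: "Intermediate index page",
    0x04: "Leaf index page",
    0x05: "Page usage bitmap",
}

PAGE_SIZE_JET3 = 0x800   # 2K pages (Access 97)
PAGE_SIZE_JET4 = 0x1000  # 4K pages (Access 2000/2002)

def page_type(page):
    """Return a description of a page based on its first byte."""
    code = page[0]
    return PAGE_TYPES.get(code, "Unknown (0x%02x)" % code)

def iter_page_types(data, page_size):
    """Yield (page_number, description) for each complete page in data."""
    for start in range(0, len(data) - page_size + 1, page_size):
        yield start // page_size, page_type(data[start:start + page_size])

# Synthetic two-page buffer, not a real MDB file.
fake = bytearray(2 * PAGE_SIZE_JET3)
fake[0] = 0x00                # page 0: database definition
fake[PAGE_SIZE_JET3] = 0x02   # page 1: table definition
for number, kind in iter_page_types(bytes(fake), PAGE_SIZE_JET3):
    print(number, kind)
```

Because pages are not zeroed before being written, a sketch like this should
rely only on the first byte and not on the rest of the page being clean.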

Database Definition Page
------------------------

Each MDB database has a single definition page located at the beginning of the
file. Not a lot is known about this page, and it is one of the least documented
page types. However, it contains things like the Jet version, encryption keys,
and the name of the creating program.

Offset 0x14 contains the Jet version of this database: 0x00 for Jet3, 0x01 for
Jet4. This is used by the mdb-ver utility to determine the Jet version.
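
The version check can be sketched in a few lines of Python. This is not how
mdb-ver is implemented; it is a minimal illustration that assumes only what
is stated above (the byte at offset 0x14 and the two page sizes). The function
names jet_version, page_size_for and read_jet_version are hypothetical.

```python
JET_VERSION_OFFSET = 0x14

def jet_version(first_page):
    """Return 3 or 4 for a known version byte, or None if unrecognized."""
    return {0x00: 3, 0x01: 4}.get(first_page[JET_VERSION_OFFSET])

def page_size_for(version):
    """Map a Jet version to its page size (2K for Jet3, 4K for Jet4)."""
    return {3: 0x800, 4: 0x1000}.get(version)

def read_jet_version(path):
    """Read just enough of the file at path to inspect the version byte."""
    with open(path, "rb") as handle:
        return jet_version(handle.read(JET_VERSION_OFFSET + 1))

# Synthetic header bytes, not a real database.
header = bytearray(JET_VERSION_OFFSET + 1)
header[JET_VERSION_OFFSET] = 0x01
version = jet_version(bytes(header))
print(version, hex(page_size_for(version)))  # 4 0x1000
```

Returning None for any other byte value keeps the sketch from guessing about
versions this document does not describe.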

More >> http://www.etechrecovery.com/001-AccessFileFormat.asp