
Building a Delphi Database engine, part one

Databases come in all shapes and sizes: from blisteringly fast in-memory datasets intended to hold megabytes, to massive distributed systems designed to push terabytes around like Lego. Is it even worth looking into writing your own database engine these days?

Firebird, the TARDIS of database engines!

There are tons of both commercial and free DB engines for Delphi

Let me start by saying: NO. If you need a native database written in Object Pascal, there are several awesome engines available. Engines that have been in production for ages, tried and tested by time, with excellent support, speed and track records. My personal favorite is ElevateDB, the successor to DBISAM, an engine I used in pretty much all my projects before 64 bit became the norm. ElevateDB handles both 32 and 64 bit and is by far my database of choice.

The purpose of this article is not to create an alternative or anything along those lines, quite the opposite. It’s purely to demonstrate some of the ideas behind a database – and just how far we have come from the old “file of record” that is synonymous with the first databases. Like C/C++, Delphi has been around for a while, so you can still find support for these older features, which I think is great, because there are still places where such a “file of record” can be extremely handy. But again, that is not what we are going to do here.
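
If you have never seen it, a classic “file of record” boils down to something like the sketch below. This is just a minimal illustration; the record layout and the filename are made up:

program FileOfRecordDemo;

{$APPTYPE CONSOLE}

type
  (* every entry in the file occupies exactly SizeOf(TPersonRecord) bytes *)
  TPersonRecord = packed record
    Name: string[40];   // fixed-length ShortString
    Age:  integer;
  end;

var
  LFile: file of TPersonRecord;
  LPerson: TPersonRecord;

begin
  AssignFile(LFile, 'people.dat');
  Rewrite(LFile);              // create (or truncate) the file
  LPerson.Name := 'Jane Doe';
  LPerson.Age  := 42;
  Write(LFile, LPerson);       // this becomes record #0
  Seek(LFile, 0);              // fixed-size records make seeking trivial
  Read(LFile, LPerson);        // and read it straight back
  CloseFile(LFile);
end.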

The reason I mentioned “file of record” is because, even though we don’t use it any more, it does summarize the most fundamental idea of a database. What exactly is a database anyway? Is it really such a black box?

A database file (to deal with that first) is supposed to contain the following:

  • One or more table definitions
  • One or more tables (as in data)
  • Indexes and lookup tables (word indexes)
  • Stored procedures
  • Views and triggers

So the question becomes, how exactly do we stuff all these different things into a single file? Does it have a filesystem? Is each of these things defined in a record of sorts? And what about this mysterious “page size” thingy, what is that about?

A file of blocks

All databases face the same problems, but each solves them differently. Most major databases are in fact not a single file anymore; some even place the items above in completely separate files. Personally I don’t see any gain in having a single file with everything stuffed into it – but for the sake of argument we will be looking at exactly that in this article. It has some educational value.

The way databases are organized is directly linked to the problem above, namely that we need to store different types of data – of arbitrary length – in a single file. In other words, you can’t use a fixed record, because you cannot possibly know how many fields a table will have, nor can you predict the number of tables or procedures. So we have to create a system that can take data of any length and “somehow” place it inside the file.

db_file_blocks

So ok, what if we divide the file into blocks, each capable of storing a piece of data? And if a single block is not enough, the data will be spread out over multiple blocks?

Indeed. And that is also where the page-size value comes in. The page-size defines the data capacity of each block, which in turn dictates how much space every block occupies in the file. Pretty cool right?

But conceptually dividing a file into blocks doesn’t solve all our problems. How exactly will we know which blocks represent a record or a table definition? I mean, if some piece of data is spread over 10 or 20 blocks, how do we know that they represent a single piece of data? How do we navigate from block #1 in a sequence to block #2?

Well, each block in the file can house raw data. We have to remember that the whole “block” idea is just conceptual, a way of approaching the file in our code. When we write blocks to the file, we have to do so in a particular way; a block is not just a raw slice of a stream or an array of bytes.

We need a block header so we can recognize that this is indeed a known block; we need the block number of the next block that holds more data – and it’s probably wise to sign each block with the block-number of the first piece as well.

So from a pseudo-code point of view, we end up with something like:

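Roughly like this, in other words. The type name and the fixed page-size constant below are just placeholders for this article; in a real engine the page-size would be read from the file header:

const
  CNT_PAGESIZE = 4096;  // assumed page size, placeholder for this sketch

type
  TQTXBlockDescriptor = packed record
    fbHeader: longword;   // magic signature, e.g. $CAFEBABE
    fbFirst:  longword;   // block number of the first block in the sequence
    fbNext:   longword;   // block number of the next block (0 = end of sequence)
    fbUsed:   longword;   // number of bytes actually used in fbData
    fbData:   packed array [0..CNT_PAGESIZE-1] of byte;  // the payload itself
  end;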

Since our blocks will be hosted inside a stream (so no “file of record” like I said), we have to use the “packed” keyword. Delphi, like most other languages, tries to align record fields to a boundary, so even if a record only needs 4 bytes (for example), it may end up occupying eight (you can control the boundary via the compiler options). That would throw off the offsets we calculate for our blocks, so we mark both the record and its data array as “packed”, which tells the compiler to drop alignment.

Let’s look at what each field in the record (descriptor) means:

  • fbHeader, a unique number that we check for before reading a block. If this number is missing or doesn’t match, something is wrong and the data is not initialized (or it’s not a db file). Let’s use $CAFEBABE since it’s both unique and fun at the same time!
  • fbFirst, the block number of the first block in a sequence (piece of coherent data)
  • fbNext, the next block in the sequence after the current
  • fbUsed: the number of bytes used in the fbData array. At the end of a sequence a block might only use half of what it can store – so we make sure we know how much to extract from fbData in each segment
  • fbData: a packed byte array housing “pagesize” number of bytes

A sequence of blocks

The block system solves the problem of storing variable length data by dividing the data into suitable chunks. As shown above we store the data with just enough information to find the next block, and the next after that, until we have read back the whole sequence.
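
In rough, hypothetical code it could look like the sketch below. ReadBlock() is an imagined helper that seeks to a block’s offset and reads one TQTXBlockDescriptor (from the sketch earlier), 0 is used here as the “no next block” marker, and System.Classes / System.SysUtils are assumed to be in the uses clause:

procedure ReadSequence(const DbStream: TStream; const First: longword;
  const Data: TStream);
var
  LBlock: TQTXBlockDescriptor;
  LCurrent: longword;
begin
  LCurrent := First;
  while LCurrent <> 0 do
  begin
    ReadBlock(DbStream, LCurrent, LBlock);    // imagined helper, not shown here
    if LBlock.fbHeader <> $CAFEBABE then
      raise Exception.Create('Unknown or damaged block');
    if LBlock.fbUsed > 0 then
      Data.WriteBuffer(LBlock.fbData[0], LBlock.fbUsed);
    LCurrent := LBlock.fbNext;                // 0 marks the end of the sequence
  end;
end;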

So the block-sequence is pretty much a file. It’s actually very much like a file, because this is incidentally how hard disks and floppy disks organize files. They have a more complex layout of course, but the idea is the same. A “track-disk-device” stores blocks of data organized in tracks and sectors (sequences). Not a 1:1 comparison, but neat nonetheless.

But ok, we now have some idea about storing larger pieces of data as a chunk of blocks. But why not just store the data directly? Why not just append each record to a file and to hell with block chunks – wouldn’t that be faster?

db_file_sequence

Well yes, but how would you recycle the space? Let’s say you have a database with 10.000 records, each of a different size, and you want to delete record number 5500. If you just append stuff, how would you recycle that space? There is no way of predicting the size of the next sequence, so you could end up with large segments of empty space that can never be recycled.

By conceptually dividing the file into predictable blocks, and then storing data in chunks where each block “knows” its next of kin – and holds a reference to its root (the fbFirst field) – we can suddenly solve the problem of recycling!

Ok, let’s sum up what we have so far:

  • We have solved storing arbitrary length data by dividing the data into conceptual blocks. These blocks don’t have to be next to each other.
  • We have solved reading X number of blocks to re-create the initial data
  • We call the collection of blocks that makes up a piece of data a “sequence”
  • The immediate benefit of block-by-block storage is that space can be recycled. Blocks don’t have to be next to each other; a record can be made up of blocks scattered all over the place and still remain coherent.

Not bad! But we are still not home free. There is another challenge looming: how can we know whether a block is available or occupied?

Keeping track of blocks

This is actually a pretty cool place in the buildup of an engine, because the way we read, write and figure out which blocks can be recycled very much impacts the speed of high-level functions like inserts and navigation. This is where I would normally introduce memory mapped files before moving on, but like I mentioned earlier – we will skip memory mapping because the article would quickly morph into a small essay. I don’t want memory mapping and MMU theory to overshadow the fundamental principles that I’m trying to pass on.

We have now reached the point where we ask the question “how do we quickly keep track of available free blocks?”, which is an excellent question with more than a single answer. Some database vendors use a separate file where each byte represents a block. My first experiment was to use a text file, thinking that functions like Pos() would help me locate blocks faster. It was a nice idea but we need more grunt than that.

What I landed on after some experimentation, which is a good compromise between size and speed, was to use a bit buffer to keep track of things. A single bit is either taken (1) or available (0). You can quickly search for available bits, because if a byte has any value other than $FF (255), you know there is a free bit in it. It’s also very modest with regard to size: you can keep track of 10.000 blocks with only 1250 bytes.
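
To connect this with the block-chunks from earlier, here is a rough sketch of how deleting a record could hand its blocks back to such a bit buffer. It leans on the hypothetical ReadBlock() helper and block descriptor from before, and on the TQTXBitBuffer class listed at the end of this article:

procedure ReleaseSequence(const DbStream: TStream; const First: longword;
  const BlockMap: TQTXBitBuffer);
var
  LBlock: TQTXBlockDescriptor;
  LCurrent: longword;
begin
  LCurrent := First;
  while LCurrent <> 0 do
  begin
    ReadBlock(DbStream, LCurrent, LBlock);  // imagined helper from earlier
    BlockMap.Bits[LCurrent] := false;       // bit cleared = block free for re-use
    LCurrent := LBlock.fbNext;
  end;
end;

Finding room for new data is the opposite operation, and that is exactly what the FindIdleBit() method in the class below is for.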

The code for the bit-buffer was written to be easy to use. In a high-end engine you would not waste CPU time by isolating small calculations in separate methods; you would inline as much as possible. But for educational purposes my simple implementation will be more than adequate.

Note: I will be setting up a github folder for this project, so for our next article you can just fork that. WordPress has a tendency to mess up Delphi code, so if the code looks weird don’t worry, it will all be neatly organized into a project shortly.

The file header

Before we look at the bit-buffer code that keeps track of blocks, you might be thinking “what good is it to keep track of blocks if we have nowhere to store that information?“. You are quite right, and this is where the file header comes in.

The file header is the only fixed part of a database file. Like I mentioned earlier, there are engines that stuff everything into a single file, but in most cases where performance is the highest priority – you want to use several files and avoid mixing apples and oranges. I would store the block-map (bit buffer) in a separate file, because that maps easily into memory under normal use. I would also store the table definitions, indexes and more as separate files, if nothing else to keep repairing and compacting a database sane. But I promised to do a single-file model (me and my big fingers), so we will be storing the meta-data inside the database file. Let’s do just that.

The file-header is just a segment of the database-file (the start of the file) that contains some vital information. When we calculate the stream offset to each block (for either reading or writing), we simply add the size of the header to that. We don’t want to accidentally overwrite that part of the file.

Depending on how we evolve the reading and writing of data sequences, the header doesn’t have to contain much. You probably want to keep track of the page-size, followed by the start block for the table definitions. So when you open a database you would immediately start by reading the block-sequence containing all the definitions the file holds. How we organize the data internally is irrelevant to the block-file and IO scaffolding. Its job is simple: read or write blocks, calculate offsets, avoid killing the header, pay attention to identifiers.
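
As a rough sketch (the field names and layout are again just placeholders, and it re-uses the TQTXBlockDescriptor type from earlier), the header and the offset math could look like this:

type
  TQTXFileHeader = packed record
    fhSignature: longword;   // identifies the file as one of ours
    fhPageSize:  longword;   // capacity of fbData in every block
    fhRootBlock: longword;   // first block of the table-definition sequence
  end;

function BlockOffset(const BlockNo: longword): int64;
begin
  // the header owns the start of the file, so block #0 begins right after it
  Result := SizeOf(TQTXFileHeader) +
    (int64(BlockNo) * SizeOf(TQTXBlockDescriptor));
end;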

Some coders store the db schemas etc. at the end of the file, so when you close the DB the information is appended to the filestream – and the offset is stored in the header. This is less messy, but it also opens the door to corruption. If the DB is not properly closed, you risk the DB information never being streamed out – which is yet another nail in the coffin for single-file databases (at least from my personal view; there are many roads to Rome and database theory can drive you nuts at times).

So I will end this first article of the series there. Hopefully I have given you enough ideas to form a mental image of how the underlying structure of a database file is organized. There are always exceptions to the rule, and a wealth of different database models exists, so please keep in mind that this article has just scratched the surface of a slab of concrete. The bit-buffer unit discussed above follows below.

unit qtx.utils.bits;

interface

uses
  System.SysUtils, System.Classes;

type

  (* Exception classes *)
  EQTXBitBuffer = Class(Exception);

  TBitOffsetArray = packed array of NativeUInt;

  (* About TQTXBitBuffer:
    This class allows you to manage a large number of bits,
    much like TBits in VCL and LCL.
    However, it is not limited by the shortcomings of the initial TBits.

    - The bitbuffer can be saved
    - The bitbuffer can be loaded
    - The class exposes a linear memory model
    - The class exposes methods (class functions) that allow you to
    perform operations on pre-allocated memory (memory you manage in
    your application).

    Uses of TQTXBitBuffer:
    Bit-buffers are typically used to represent something else,
    like records in a database-file. A bit-map is often used in DB engines
    to represent which pages are used (bit set to 1), and which pages can
    be recycled or compacted away later. *)

  TQTXBitBuffer = Class(TObject)
  Private
    FData: PByte;
    FDataLng: NativeInt;
    FDataLen: NativeInt;
    FBitsMax: NativeUInt;
    FReadyByte: NativeUInt;
    FAddr: PByte;
    BitOfs: 0 .. 255;
    FByte: byte;
    function GetByte(const Index: NativeInt): byte;
    procedure SetByte(const Index: NativeInt; const Value: byte);
    function GetBit(const Index: NativeUInt): boolean;
    procedure SetBit(const Index: NativeUInt; const Value: boolean);
  Public
    property Data: PByte read FData;
    property Size: NativeInt read FDataLen;
    property Count: NativeUInt read FBitsMax;
    property Bytes[const Index: NativeInt]: byte Read GetByte write SetByte;
    property bits[const Index: NativeUInt]: boolean Read GetBit
      write SetBit; default;

    procedure Allocate(MaxBits: NativeUInt);
    procedure Release;
    function Empty: boolean;
    procedure Zero;

    function ToString(const Boundary: integer = 16): string; reintroduce;

    class function BitsOf(Const aBytes: NativeInt): NativeUInt;
    class function BytesOf(aBits: NativeUInt): NativeUInt;

    class function BitsSetInByte(const Value: byte): NativeInt; inline;
    class Function BitGet(Const Index: NativeInt; const Buffer): boolean;
    class procedure BitSet(Const Index: NativeInt; var Buffer;
      const Value: boolean);

    procedure SaveToStream(const stream: TStream); virtual;
    procedure LoadFromStream(const stream: TStream); virtual;

    procedure SetBitRange(First, Last: NativeUInt; const Bitvalue: boolean);
    procedure SetBits(const Value: TBitOffsetArray; const Bitvalue: boolean);
    function FindIdleBit(var Value: NativeUInt;
      const FromStart: boolean = false): boolean;

    destructor Destroy; Override;
  End;

implementation

resourcestring
  ERR_BitBuffer_InvalidBitIndex = 'Invalid bit index, expected 0..%d not %d';

  ERR_BitBuffer_InvalidByteIndex = 'Invalid byte index, expected 0..%d not %d';

  ERR_BitBuffer_BitBufferEmpty = 'Bitbuffer is empty error';

  ERR_ERR_BitBuffer_INVALIDOFFSET =
    'Invalid bit offset, expected 0..%d, not %d';

var
  CNT_BitBuffer_ByteTable:  array [0..255] of NativeInt =
  (0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
  4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8);

function QTXToNearest(const Value, Factor: NativeUInt): NativeUInt;
var
  LTemp: NativeUInt;
Begin
  Result := Value;
  LTemp := Value mod Factor;
  If (LTemp > 0) then
    inc(Result, Factor - LTemp);
end;

// ##########################################################################
// TQTXBitBuffer
// ##########################################################################

Destructor TQTXBitBuffer.Destroy;
Begin
  If not Empty then
    Release;
  inherited;
end;

function TQTXBitBuffer.ToString(const Boundary: integer = 16): string;
const
  CNT_SYM: array [boolean] of string = ('0', '1');
var
  x: NativeUInt;
  LCount: NativeUInt;
begin
  Result := '';
  LCount := Count;
  if LCount > 0 then
  begin
    LCount := QTXToNearest(LCount, Boundary);
    x := 0;
    while x < LCount do
    begin
      if x < Count then
        Result := Result + CNT_SYM[GetBit(x)]
      else
        Result := Result + CNT_SYM[false];
      inc(x);
      (* one row of Boundary symbols per line *)
      if (x mod NativeUInt(Boundary)) = 0 then
        Result := Result + sLineBreak;
    end;
  end;
end;

class function TQTXBitBuffer.BitsOf(const aBytes: NativeInt): NativeUInt;
begin
  Result := NativeUInt(aBytes) * 8;
end;

class function TQTXBitBuffer.BytesOf(aBits: NativeUInt): NativeUInt;
begin
  (* round up to the nearest whole byte *)
  Result := QTXToNearest(aBits, 8) shr 3;
end;

class function TQTXBitBuffer.BitsSetInByte(const Value: byte): NativeInt;
begin
  (* a lookup table beats counting the bits one by one *)
  Result := CNT_BitBuffer_ByteTable[Value];
end;

class function TQTXBitBuffer.BitGet(const Index: NativeInt; const Buffer): boolean;
var
  LAddr: PByte;
  BitOfs: NativeInt;
begin
  If (Index >= 0) then
  Begin
    LAddr := @Buffer;
    inc(LAddr, Index shr 3);
    BitOfs := Index mod 8;
    Result := (LAddr^ and (1 shl BitOfs)) <> 0;
  end
  else
    Raise EQTXBitBuffer.CreateFmt(ERR_ERR_BitBuffer_INVALIDOFFSET,
      [maxint - 1, Index]);
end;

class procedure TQTXBitBuffer.BitSet(const Index: NativeInt; var Buffer;
  const Value: boolean);
var
  LAddr: PByte;
  LByte: byte;
  BitOfs: NativeInt;
  LCurrent: boolean;
begin
  If (Index >= 0) then
  Begin
    LAddr := @Buffer;
    inc(LAddr, Index shr 3);

    LByte := LAddr^;
    BitOfs := Index mod 8;
    LCurrent := (LByte and (1 shl (BitOfs mod 8))) <> 0;

    case Value of
      true:
        begin
          (* set bit if not already set *)
          If not LCurrent then
            LByte := (LByte or (1 shl (BitOfs mod 8)));
          LAddr^ := LByte;
        end;
      false:
        begin
          (* clear bit if already set *)
          If LCurrent then
            LByte := (LByte and not(1 shl (BitOfs mod 8)));
          LAddr^ := LByte;
        end;
    end;

  end
  else
    Raise EQTXBitBuffer.CreateFmt(ERR_ERR_BitBuffer_INVALIDOFFSET,
      [maxint - 1, index]);
end;

procedure TQTXBitBuffer.SaveToStream(const stream: TStream);
var
  LWriter: TWriter;
begin
  LWriter := TWriter.Create(stream, 1024);
  try
    LWriter.WriteInteger(FDataLen);
    LWriter.Write(FData^, FDataLen);
  finally
    LWriter.FlushBuffer;
    LWriter.Free;
  end;
end;

Procedure TQTXBitBuffer.LoadFromStream(const stream: TStream);
var
  LReader: TReader;
  LLen: NativeInt;
Begin
  Release;
  LReader := TReader.Create(stream, 1024);
  try
    LLen := LReader.ReadInteger;
    if LLen > 0 then
    begin
      Allocate(BitsOf(LLen));
      LReader.Read(FData^, LLen);
    end;
  finally
    LReader.Free;
  end;
end;

Function TQTXBitBuffer.Empty: boolean;
Begin
  Result := FData = NIL;
end;

Function TQTXBitBuffer.GetByte(const Index: NativeInt): byte;
Begin
  If FData <> NIL then
  Begin
    If (index >= 0) and (Index < FDataLen) then
      Result := PByte(FDataLng + Index)^
    else
      Raise EQTXBitBuffer.CreateFmt(ERR_BitBuffer_InvalidByteIndex,
        [FDataLen - 1, index]);
  end
  else
    Raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

Procedure TQTXBitBuffer.SetByte(const Index: NativeInt; const Value: byte);
Begin
  If FData <> NIL then
  Begin
    If (index >= 0) and (Index < FDataLen) then
      PByte(FDataLng + Index)^ := Value
    else
      Raise EQTXBitBuffer.CreateFmt(ERR_BitBuffer_InvalidByteIndex,
        [FDataLen - 1, index]);
  end
  else
    Raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

Procedure TQTXBitBuffer.SetBitRange(First, Last: NativeUInt;
  const Bitvalue: boolean);
var
  x: NativeUInt;
  LCount: NativeUInt;
  LLongs: NativeInt;
  LSingles: NativeInt;

  (* distance between two offsets, optionally inclusive of both ends *)
  function QTXDiff(const Primary, Secondary: NativeInt;
    const Exclusive: boolean): NativeInt;
  begin
    if Primary <> Secondary then
    begin
      If Primary > Secondary then
        Result := (Primary - Secondary)
      else
        Result := (Secondary - Primary);

      if Exclusive then
      begin
        If (Primary < 1) or (Secondary < 1) then
          inc(Result);
      end;

      If (Result < 0) then
        Result := abs(Result);
    end
    else
      Result := 0;
  end;

Begin
  If (FData <> nil) then
  Begin
    If (First < FBitsMax) then
    Begin
      If (Last < FBitsMax) then
      Begin
        LCount := QTXDiff(First, Last, true);
        x := First;

        (* process 8 bits at a time *)
        LLongs := NativeInt(LCount shr 3);
        while LLongs > 0 do
        Begin
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          SetBit(x, Bitvalue);
          inc(x);
          dec(LLongs);
        end;

        (* process singles *)
        LSingles := NativeInt(LCount mod 8);
        while (LSingles > 0) do
        Begin
          SetBit(x, Bitvalue);
          inc(x);
          dec(LSingles);
        end;

      end
      else
      begin
        if (First = Last) then
          SetBit(First, true)
        else
          Raise EQTXBitBuffer.CreateFmt(ERR_BitBuffer_InvalidBitIndex,
            [FBitsMax, Last]);
      end;
    end
    else
      Raise EQTXBitBuffer.CreateFmt(ERR_BitBuffer_InvalidBitIndex,
        [FBitsMax, First]);
  end
  else
    Raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

Procedure TQTXBitBuffer.SetBits(Const Value: TBitOffsetArray;
  Const Bitvalue: boolean);
var
  x: NativeInt;
  FCount: NativeInt;
Begin
  If FData <> NIL then
  Begin
    FCount := length(Value);
    If FCount > 0 then
    Begin
      for x := low(Value) to High(Value) do
        SetBit(Value[x], Bitvalue);
    end;
  end
  else
    Raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

Function TQTXBitBuffer.FindIdleBit(var Value: NativeUInt;
  Const FromStart: boolean = false): boolean;
var
  FOffset: NativeUInt;
  FBit: NativeUInt;
  FAddr: PByte;
  x: NativeInt;
Begin
  Result := FData <> NIL;
  if Result then
  Begin
    (* Initialize *)
    FAddr := FData;
    FOffset := 0;

    If FromStart then
      FReadyByte := 0;

    If FReadyByte < 1 then
    Begin
      (* find byte with idle bit *)
      While FOffset < NativeUInt(FDataLen) do
      Begin
        If BitsSetInByte(FAddr^) = 8 then
        Begin
          inc(FOffset);
          inc(FAddr);
        end
        else
          break;
      end;
    end
    else
      inc(FOffset, FReadyByte);

    (* Last byte exhausted? *)
    Result := FOffset < NativeUInt(FDataLen);
    if Result then
    begin
      (* scan this byte for its first vacant bit *)
      FAddr := FData;
      inc(FAddr, FOffset);
      FBit := 0;
      while FBit < 8 do
      begin
        if (FAddr^ and (1 shl FBit)) = 0 then
        begin
          Value := (FOffset shl 3) + FBit;

          (* remember this byte for next time, unless it is about to fill up *)
          if BitsSetInByte(byte(FAddr^ or (1 shl FBit))) > 7 then
            FReadyByte := 0
          else
            FReadyByte := FOffset;

          break;
        end;
        inc(FBit);
      end;
    end;

  end;
end;

Function TQTXBitBuffer.GetBit(Const Index: NativeUInt): boolean;
begin
  If FData <> NIL then
  Begin
    If index < FBitsMax then
      Result := (PByte(FDataLng + NativeInt(Index shr 3))^
        and (1 shl (Index mod 8))) <> 0
    else
      Raise EQTXBitBuffer.CreateFmt(ERR_BitBuffer_InvalidBitIndex,
        [Count - 1, index]);
  end
  else
    Raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

Procedure TQTXBitBuffer.SetBit(const Index: NativeUInt; const Value: boolean);
begin
  If FData <> NIL then
  Begin
    If index < FBitsMax then
    Begin
      (* load the byte that holds our bit *)
      FAddr := PByte(FDataLng + NativeInt(Index shr 3));
      FByte := FAddr^;
      BitOfs := Index mod 8;

      If Value then
      Begin
        (* set bit if not already set *)
        If (FByte and (1 shl (BitOfs mod 8))) = 0 then
        Begin
          FByte := (FByte or (1 shl (BitOfs mod 8)));
          PByte(FDataLng + NativeInt(index shr 3))^ := FByte;

          (* if the hinted byte just filled up, forget the hint *)
          if FReadyByte = (Index shr 3) then
          Begin
            if BitsSetInByte(FByte) > 7 then
              FReadyByte := 0;
          end;

        end;
      end
      else
      Begin
        (* clear bit if not already clear *)
        If (FByte and (1 shl (BitOfs mod 8)))  0 then
        Begin
          FByte := (FByte and not(1 shl (BitOfs mod 8)));
          PByte(FDataLng + NativeInt(index shr 3))^ := FByte;

          (* remember this byte pos *)
          FReadyByte := Index shr 3;
        end;
      end;

    end
    else
      Raise EQTXBitBuffer.CreateFmt(ERR_BitBuffer_InvalidBitIndex,
        [Count - 1, index]);
  end
  else
    Raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

Procedure TQTXBitBuffer.Allocate(MaxBits: NativeUInt);
Begin
  (* release buffer if not empty *)
  If FData <> NIL then
    Release;

  If MaxBits > 0 then
  Begin
    (* Allocate new buffer *)
    try
      FReadyByte := 0;
      FDataLen := BytesOf(MaxBits);
      FData := AllocMem(FDataLen);
      FDataLng := NativeInt(FData);
      FBitsMax := BitsOf(FDataLen);
    except
      on e: Exception do
      Begin
        FData := NIL;
        FDataLen := 0;
        FBitsMax := 0;
        FDataLng := 0;
        Raise;
      end;
    end;

  end;
end;

Procedure TQTXBitBuffer.Release;
Begin
  If FData <> NIL then
  Begin
    try
      FreeMem(FData);
    finally
      FReadyByte := 0;
      FData := NIL;
      FDataLen := 0;
      FBitsMax := 0;
      FDataLng := 0;
    end;
  end;
end;

Procedure TQTXBitBuffer.Zero;
Begin
  If FData <> NIL then
    Fillchar(FData^, FDataLen, byte(0))
  else
    raise EQTXBitBuffer.Create(ERR_BitBuffer_BitBufferEmpty);
end;

end.
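
And just to show how easy the class is to work with, here is a tiny usage example (the numbers are arbitrary):

var
  LMap: TQTXBitBuffer;
  LFree: NativeUInt;
begin
  LMap := TQTXBitBuffer.Create;
  try
    LMap.Allocate(10000);            // track 10.000 blocks, roughly 1250 bytes
    LMap.SetBitRange(0, 2, true);    // pretend blocks 0..2 are taken
    if LMap.FindIdleBit(LFree, true) then
      WriteLn('First available block: ', LFree);   // prints 3
  finally
    LMap.Free;
  end;
end;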