
Building a Delphi Database engine, part four

This article is over six months late (gasp!). Work at Embarcadero has been extremely time-consuming, and my free time has been bound up in my ex-Patreon project. So that's why I was unable to finish in a more predictable fashion.

But better late than never — and we have finally reached one of the more exciting steps in the evolution of our database engine design, namely the place where we link our metadata to actual data.

So far we have been busy with the underlying mechanisms: how to split up larger pieces of data, how to collect those pieces and re-assemble them, how to grow and scale the database file, and so on.

We ended our last article with a working persistence layer, meaning that the codebase is now able to write the metadata to itself, read it back when you open the database, and persist sequences (records) – and our humble API is now rich enough to handle tasks like scaling. At present we only support growth, but we can add file compacting later.

Tables and records

In our last article's code, the metadata exposed a Table class. This table class in turn exposed an interface to our field definitions, so that we have a way to define how a table should look before we create the database.

You have probably taken a look at the code (I hope so, or much of this won't make much sense) and noticed that the record class (TDbLibRecord) is used both as a blueprint for a table (field definitions) and as the actual class that holds the values.

If you look at the class again (TDbLibRecord can be found in the file dblib.records.pas), you will notice that it has a series of interfaces attached to it:

  • IDbLibFields
  • IStreamPersist

The first one, which we expose in our Table as the FieldDefs property, simply exposes functions for adding and working with the fields. While somewhat different from Delphi's traditional TFieldDefs class, it's familiar enough. I don't think anyone who has used Delphi with databases would be confused by its members:

  IDbLibFields = interface
    ['{0D6A9FE2-24D2-42AE-A343-E65F18409FA2}']
    function    IndexOf(FieldName: string): integer;
    function    ObjectOf(FieldName: string): TDbLibRecordField;

    function    Add(const FieldName: string; const FieldClass: TDbLibRecordFieldClass): TDbLibRecordField;
    function    AddInteger(const FieldName: string): TDbLibFieldInteger;
    function    AddStr(const FieldName: string): TDbLibFieldString;
    function    AddByte(const FieldName: string): TDbLibFieldByte;
    function    AddBool(const FieldName: string): TDbLibFieldBoolean;
    function    AddCurrency(const FieldName: string): TDbLibFieldCurrency;
    function    AddData(const FieldName: string): TDbLibFieldData;
    function    AddDateTime(const FieldName: string): TDbLibFieldDateTime;
    function    AddDouble(const FieldName: string): TDbLibFieldDouble;
    function    AddGUID(const FieldName: string): TDbLibFieldGUID;
    function    AddInt64(const FieldName: string): TDbLibFieldInt64;
    function    AddLong(const FieldName: string): TDbLibFieldLong;
  end;

But, as you can see, this interface is just a small part of what the class is actually about. The class can indeed hold a list of fields, each with its own datatype – but it can also persist these fields to a stream and read them back again. You can also read and write a value to each field. So it is, for all intents and purposes, a single record in class form.
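To make that concrete, here is a minimal sketch of how the class might be used, based on the interface above and the WriteInt/WriteStr accessors that appear in the insert example later in this article. The exact home of those accessors, and the assumption that TDbLibRecord is reference counted, are mine rather than the repository's:

  uses
    System.Classes, dblib.records;

  procedure SketchRecordUsage;
  var
    LFields: IDbLibFields;
    LBuffer: TMemoryStream;
  begin
    // Assumes TDbLibRecord descends from TInterfacedObject; if it
    // does not, you would manage its lifetime with Create/Free.
    LFields := TDbLibRecord.Create as IDbLibFields;

    // Define the shape of the record (per IDbLibFields above)
    LFields.AddInteger('id');
    LFields.AddStr('name');

    // Write values into the same object. WriteInt/WriteStr are the
    // accessors used in the insert example further down; where they
    // are declared exactly is an assumption here.
    LFields.WriteInt('id', 42);
    LFields.WriteStr('name', 'Jon');

    // ... and persist the whole record via IStreamPersist
    LBuffer := TMemoryStream.Create;
    try
      (LFields as IStreamPersist).SaveToStream(LBuffer);
    finally
      LBuffer.Free;
    end;
  end;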

The term people use for this type of class is property bag, and it was a part of the Microsoft standard components (ActiveX/COM) for ages. It's probably still there, but I prefer my own take on the system.

In this article we are going to finish that work, namely the ability to define a table, create a database based on the metadata, insert a new record, read records, and push the resulting binary data to the database file. And since the persistence layer is already in place, opening the database and reading the records back is pretty straightforward.

So this is where the metadata stops being just a blueprint, and becomes something tangible and real.

Who owns what?

Before we continue, we have to stop and think about ownership. Right now the database file persists a global list of sequences. The database class itself has no interest in who owns each sequence; whether a sequence belongs to a table, whether it contains a picture, a number or whatever else the content might be, it simply keeps track of where each sequence begins.

So the first order of the day is to expand the table metadata to manage whatever records belong to that table. In short, the database class will focus on data within its scope, and the table instances will maintain their own overview.

So the metadata suddenly needs to save a list of longwords with each table. You might say that this is wasteful, that the list maintained by the database should be eliminated and that each table should keep track of its own data. And while that is tempting to do, there is also something to be said for maintenance. Being able to deal with persisted data without getting involved in the nitty-gritty of tables is going to be useful when things like database compacting enter at the end of our tutorial.
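To illustrate what that per-table ledger might look like, here is a hedged sketch. TDbLibTableMetadata, FRecords and the stream layout are all illustrative names and choices of mine, not declarations from the repository:

  uses
    System.Classes, System.Generics.Collections;

  type
    TDbLibTableMetadata = class
    private
      // Byte offset of the first block in each record
      // sequence this table owns
      FRecords: TList<longword>;
    public
      constructor Create;
      destructor Destroy; override;
      procedure SaveToStream(const Stream: TStream);
      procedure LoadFromStream(const Stream: TStream);
    end;

  constructor TDbLibTableMetadata.Create;
  begin
    inherited Create;
    FRecords := TList<longword>.Create;
  end;

  destructor TDbLibTableMetadata.Destroy;
  begin
    FRecords.Free;
    inherited;
  end;

  procedure TDbLibTableMetadata.SaveToStream(const Stream: TStream);
  var
    LCount: integer;
    LOffset: longword;
  begin
    // Count first, then one longword offset per record
    LCount := FRecords.Count;
    Stream.WriteBuffer(LCount, SizeOf(LCount));
    for LOffset in FRecords do
      Stream.WriteBuffer(LOffset, SizeOf(LOffset));
  end;

  procedure TDbLibTableMetadata.LoadFromStream(const Stream: TStream);
  var
    LCount, i: integer;
    LOffset: longword;
  begin
    FRecords.Clear;
    Stream.ReadBuffer(LCount, SizeOf(LCount));
    for i := 1 to LCount do
    begin
      Stream.ReadBuffer(LOffset, SizeOf(LOffset));
      FRecords.Add(LOffset);
    end;
  end;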

Locking mechanism

Delphi has a very user-friendly locking mechanism when it comes to databases. A table or dataset is either in read, edit or insert mode – and various functions are allowed or prohibited depending on that state. It would probably be wise to merge the engine with Delphi's own TDatabase and TTable at some point – but right now I'm more interested in keeping things clean and simple.

When I write "locking mechanism" I am not referring to a file lock or a memory lock. Had we used memory-mapped files the locking mechanism would have been more elaborate. What I mean by a lock is basically placing a table in one of the states I mentioned above. The table needs to know exactly what you want to do. Are you adding a record? Are you editing an existing record? The table code needs to know this to safely take you from one mode to the next.
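To pin the idea down, here is a minimal sketch. The cmInsert value is used in the Lock() example further down; cmRead and cmEdit, and the transition rule itself, are my assumptions:

  type
    // cmInsert appears in the insert example below; cmRead and
    // cmEdit are assumed names for the other two states.
    TDbLibCursorMode = (cmRead, cmEdit, cmInsert);

  // Sketch of the transition rule: you can only enter edit or
  // insert mode from read mode; Post() or Cancel() then drop the
  // cursor back into read mode.
  function CanLock(const Current, Requested: TDbLibCursorMode): boolean;
  begin
    Result := (Current = cmRead) and (Requested in [cmEdit, cmInsert]);
  end;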

Suddenly you realize why each table needs that extra list: how else is the table going to implement methods like First(), Next(), Last() and Previous()? The record list dealt with by the database is just a generic, non-ordered ledger of sequences (a global scope list, if you will). Are you going to read all the records back when you open the database to figure out who owns what?

A call to First() will mean a completely different offset for each table. The logical way to handle this is to give each table its own cursor: a class that keeps track of which records belong to the table, and also keeps track of whatever state the table is in.

The database cursor

Since we are not up against Oracle or MSSQL here, but exploring database theory, I have kept the cursor as simple as I possibly could. It is a humble class, roughly along the lines of the sketch below.

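Reconstructed from the members this article actually exercises, it has roughly this shape; treat it as a sketch rather than the repository's declaration:

  type
    TDbLibCursor = class
    private
      FTable: TDbLibTable;        // owning table (assumed name)
      FMode: TDbLibCursorMode;    // cmRead / cmEdit / cmInsert
      FRecNo: integer;
      function  GetFields: IDbLibFields;
      procedure SetRecNo(const Value: integer);
    public
      constructor Create(const Table: TDbLibTable);

      // State handling
      function  Lock(const Mode: TDbLibCursorMode): boolean;
      procedure Post;
      procedure Cancel;

      // Navigation
      procedure First;
      procedure Next;
      procedure Previous;
      procedure Last;

      property Fields: IDbLibFields read GetFields;
      property RecNo: integer read FRecNo write SetRecNo;
      property Mode: TDbLibCursorMode read FMode;
    end;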

The idea, of course, is that the table defaults to read mode, meaning that you can navigate around, record by record, or jump to a specific record using the traditional RecNo property.

The moment you want to insert or edit a record, you call the Lock() method, passing along the lock mode you need (edit or insert). You can then either cancel the operation or call Post() to push the data down to the file.

The Lock() method is a boolean function, making it easier to write code, as such:

  with Database.GetTableByName('access_log').Cursor do
  begin
    if Lock(cmInsert) then
    begin
      Fields.WriteInt('id', FUserId);
      Fields.WriteStr('name', FUserName);
      Fields.WriteDateTime('access', Now);
      Post();
    end else
      raise Exception.Create('Failed to insert record');
  end;

I'm sure there are better designs, and the classes and layout can absolutely be improved; but for our purposes it should be more than adequate.

Reloading record data

In the previous articles we focused on writing data: taking a stream or a buffer, breaking it into pages, and then storing the pages (or blocks) around the file wherever there was available space.

We cleverly crafted the blocks so that they would contain the offset to the next block in a sequence, making it possible to read back a whole sequence of blocks by just knowing the first one (!)
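As a sketch of that read-back, assuming a block header that stores the next-block offset (with 0 marking the end of the chain) followed by the number of payload bytes used; the actual on-disk layout from the earlier articles may order these differently:

  uses
    System.Classes, System.SysUtils;

  // Illustrative only: follows a block chain from the offset of the
  // first block, appending each block's payload to Target.
  procedure ReadSequence(const DbFile: TStream;
    const FirstBlock: longword; const Target: TStream);
  var
    LOffset, LNext: longword;
    LUsed: integer;
    LData: TBytes;
  begin
    LOffset := FirstBlock;
    while LOffset <> 0 do
    begin
      DbFile.Position := LOffset;
      DbFile.ReadBuffer(LNext, SizeOf(LNext)); // next block, 0 = last
      DbFile.ReadBuffer(LUsed, SizeOf(LUsed)); // payload bytes used
      SetLength(LData, LUsed);
      if LUsed > 0 then
      begin
        DbFile.ReadBuffer(LData[0], LUsed);
        Target.WriteBuffer(LData[0], LUsed);
      end;
      LOffset := LNext; // follow the chain
    end;
  end;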

Part of what the cursor does is to read data back. Whenever the RecNo field changes, meaning that you are moving around the table records using the typical Next(), Previous(), First() etc. functions, and the cursor is in read mode (meaning you are not inserting data, nor editing an existing record), the record has to be read into memory. Otherwise the in-memory fields won't contain the data for that record.
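In cursor terms that boils down to something like the following sketch, tying together the pieces above. FTable.Database.FileStream and FTable.RecordOffsets are illustrative names of mine, not the repository's:

  procedure TDbLibCursor.SetRecNo(const Value: integer);
  var
    LBuffer: TMemoryStream;
  begin
    if (Value = FRecNo) or (FMode <> cmRead) then
      Exit;
    FRecNo := Value;
    LBuffer := TMemoryStream.Create;
    try
      // RecordOffsets is the per-table ledger discussed earlier;
      // ReadSequence() re-assembles the record's block chain.
      ReadSequence(FTable.Database.FileStream,
        FTable.RecordOffsets[FRecNo], LBuffer);
      LBuffer.Position := 0;
      // Load the raw record data into the in-memory fields
      (Fields as IStreamPersist).LoadFromStream(LBuffer);
    finally
      LBuffer.Free;
    end;
  end;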

Creating a cursor

One note before you dive into the code: you have to create a cursor before you can use it! Just creating a table won't be enough.
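A minimal sketch of the idea, with the constructor signature assumed:

  var
    LTable: TDbLibTable;
  begin
    LTable := Database.GetTableByName('access_log');

    // The table does not create its cursor automatically (yet), so
    // we attach one ourselves. The constructor argument and the
    // writable Cursor property are assumptions here.
    if LTable.Cursor = nil then
      LTable.Cursor := TDbLibCursor.Create(LTable);
  end;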

Creating the cursor will eventually be tucked neatly into a function on the table instance; for now we still have other issues to deal with.

What to expect next?

Next time we will be looking at editing records, committing changes and deleting records. And with that in place we will finally have reached the point where we can add more elaborate functionality, starting with expression parsing and filters!

You can check out the code here: https://bitbucket.org/cipher_diaz/dbproject/src/master/

Support the cause

If you like my articles and want to see more libraries and techniques, then consider donating to the project here: https://www.paypal.me/quartexNOR


Those that donate $50 or more automatically get access to the Quartex Web OS repositories, including full access to the QTX replacement RTL (for DWScript and Smart Mobile Studio).

Thank you for your support, projects like Amibian.js and the Quartex Web OS would not exist without my backers!

/Jon
