Smart Pascal record arrays, brute force search
Having introduced the “self-less” class to my readers in an earlier post (the buddha class as I call it), I started thinking about the practical sides of this. Some may just go “oh, so you have an object with no self, so what?”). Well its actually a big deal, especially when talking with servers and dealing with the myriad of non-standard data formats out there.
Array of record
Dictionaries are great because, behind the scenes, they are organized in such a way that the key you use to read or write a value, can be more or less instantaniously verified, looked up and accessed.
Dictionaries can be made in a number of ways, using various formulas for turning a string-key into a meaningful number (which in turn points to the data it represents).

Finding stuff in unstructured data is fun
But this article is not about fancy dictionaries, but rather something resembling an array of unstructured record. Where each item in the array can actually differ from the other both in content and layout (!) Just think about it: under JavaScript libraries throw different structures around like candy, and there is no such thing as a fixed standard.
So let’s say you call some NodeJS webservice and it nukes you back with 100, 10000 or 1000000 records with completely different structure and content — that would pose quite a challenge using ordinary Delphi or FreePascal (well, not difficult but time consuming at least).
Let’s see just how fast we can push JavaScript to search a large mass of unstructured data. Note: I will be using code which is not yet in the RTL here, just so you dont freak out. The Ticks functionality will be introduced in the next Smart Update.
Right, so let’s start with the basics, lets give things some names:
type TW3Associate = variant; TW3Associates = variant;
Variant under Smart Pascal is not the same as a Delphi, C++ or C# variant. It is just a way of saying “we dont know the type” of something. It essentially maps to JObject which is the most basic, empty “object” JavaScript can muster.
Next, Let’s create some data to play with:
procedure TForm1.W3Button2Click(Sender: TObject); var Associates: Array of variant; begin for var x:=1 to 1000000 do begin Associates.add( class cssname := "Item#" + IntToStr(x); typevalue := x; end ); end; end;
This gives us an array of one million JavaScript objects, each with two properties. Next step is to use some old school, brute-force searching. We are going to use loop-expansion, meaning that instead of a simple for/next loop dealing with a single item at a time – we are going to process 8 items per cycle.
function SearchAssociates(const Field:String;const Value:String; const Associates:TW3Associates):TW3Associate; begin result := unassigned; (* Setup look expansion *) var index := 0; var LLongs := associates.length shr 3; var LSingles := associates.length mod 8; (* Search batch of 8 items *) while (LLongs > 0) do begin if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); dec(LLongs); end; (* Search odd single items at the end *) while (LSingles > 0) do begin if Associates[index][Field] = Value then begin result := Associates[index]; exit; end; inc(index); dec(LSingles); end; end;
And then let’s see how fast it can chew through these “untyped” arrays. And please remember that from JavaScripts’s point of view, each of these structures are unique. It cannot, like Delphi, take for granted that the next object is equal to the previous.
The test code now looks like:
procedure TForm1.W3Button2Click(Sender: TObject); var Associates: Array of variant; begin for var x:=1 to 1000000 do begin Associates.add( class cssname := "Item#" + IntToStr(x); typevalue := x; end ); end; writeln('Performing brute force search:'); var StartTime: Integer; var StopTime: Integer; StartTime := TW3Dispatch.Ticks; var data := SearchAssociates("cssname", "Item#999976",Variant(Associates)); StopTime := TW3Dispatch.Ticks; if (data) then begin writeln("Found object:"); writeln(Data); end else writeln("Search failed!"); writeln("Ticks spent:" + (StopTime-StartTime).toString ); end;
Results
I ran the test 10 times, but only the first five produced different values – the engine then settled on 22016.
- 40960
- 22016
- 20992
- 20992
- 24960
Ticks is calculated as the number of ms since 01.01.1970. There are 621355968000000000 ticks from Ist Jan 1900
to Ist Jan 1970 (hence the const). And 10000 ticks represents a millisecond:
const CNT_TICKS_BASE = 621355968000000000; class function TW3Dispatch.Ticks:Integer; begin result := new JDate().getTime() * 1000 + CNT_TICKS_BASE; end;
Conclusion?
Old school pascal beats both ForEach() and Filter(), methods introduced into JavaScript to deal with this kind of scenario. Goes to show, object pascal may be old (C++ is 3 years older actually), but it’s a damn fine language and makes NodeJS and web programming fun!
Try to do the same with an array of record. And you would see that JavaScript would reuse the same object prototype (thanks to how SMS compiler initialize its records). My guess is that it would use less memory and be faster than your code. And IMHO not need to unroll the loop in this case.
See http://www.html5rocks.com/en/tutorials/speed/v8/ and https://www.youtube.com/watch?v=UJPdhx5zTaw
You should ask Eric on such matter.
But the point was non-fixed (unknown) types, not fixed records — that would naturally be a completely different thing.
And Eric and myself debate these things all the time, we are on the same team as you know. Records were added for delphi compatability, but they are compiler managed. Meaning that their internal field names will be altered. Anonymous classes will not have their field names changed.
So there is a reason for this post beyond just array of record 🙂
First of all, and you can ask eric if you feel like it, “records” were added just to help pascal code get ported. If you do a serialization on them, you will find that the names are changed by the compiler.
Anonymous classes does not suffer that change. So ghost classes are better both for trans-language communication and being more flexible.
Second, yes, there are always more ways to skin a car — so i would love to see your version of it, and i might learn something was well.
Im not stating facts carved in stone, just showing that object pascal is a good web language 🙂
But please, expand on the idea and show me some new stuff – i love learning new techniques so im all ears 🙂
Hi, there’s potencial issue when we use static array with an anonymous class (the buddha class), for instance:
var preloaded = class
values : array [0..4] of array [0..1] of Integer = [
[0, 100],
[1000, 200],
[2000, 300],
[3000, 400],
[4000, 500]
];
end;
WriteLn(JSON.stringify(preloaded));
{“values”:[[4000,500],[1000,200],[2000,300],[3000,400],[4000,500]]}
the correct should be…
var
valor: array [0..4] of array [0..1] of Integer = [
[0, 100],
[1000, 200],
[2000, 300],
[3000, 400],
[4000, 500]
];
begin
var preloaded2 = class
values := valor;
end;
WriteLn(JSON.stringify(preloaded2));
{“values”:[[0,100],[1000,200],[2000,300],[3000,400],[4000,500]]}
Hm. post a bugreport on that, that is an important find!
In fact, a modern js engine would reuse the same object slots even for your buddha class code, just like records. You create the same fields so the same object “type” would be reused by the jit. See my links.