LDef parser done
Note: For a quick introduction to LDef click here: Introduction to LDef.
Great news guys! I finally finished the parser and model builder for LDef!
That means we just need to get the assembler ported. This is presently running fine under Smart Pascal (I like to prototype things there since its faster) – and it will be easy to port it over to Delphi and Freepascal after the model has gone through the steps.
I’m really excited about this project and while I sadly don’t have much free time – this is a project I truly enjoy working on. Perhaps not as much as Smart Pascal which is my baby, but still; its turning into a fantastic system.
Thoughts on the architecture
One of the things I added support for, and that I have hoped that Embarcadero would add to Delphi for a number of years now, is support for contract coding. This is a huge topic that I’m not jumping into here, but one of the features it requires is support for entry and exit sections. Essentially that you can define code that executes before the method body and directly after it has finished (before the result is returned if it’s a function).
This opens up for some very clever means of preventing errors, or at the very least give the user better information about what went wrong. Automated tests also benefits greatly from this.
For example, a normal object pascal method looks, for example, like this:
procedure TForm1.MySpecialMethod; begin writeln("You called my-special-method") end;
The basis of contract design builds on the classical and expands it as such:
procedure TForm1.MySpecialMethod; Before() begin writeln("Before my-special-method"); end; After() begin writeln("After my-special-method"); end; begin writeln("You called my-special-method") end;
Note: contract design is a huge system and this is just a fragment of the full infrastructure.
What is cool about the before/after snippets, is that they allow you to verify parameters before the body is even executed, and likewise you get to work on the result before the value is returned (if any).
You mights ask, why not just write the tests directly like people do all the time? Well, that is true. But there will also be methods that you have no control over, like a wrapper method that calls a system library for instance. Being able to attach before/after code for externally defined procedures helps take the edge off error testing.
Secondly, if you are writing a remoting framework where variant data and multi-threaded invocation is involved – being able to check things as they are dispatched means catching potential errors faster – leading to better performance.
As always, coding techniques is a source of argument – so im not going into this now. I have added support for it and if people don’t need it then fine, just leave it be.
Under LDef assembly it looks like this:
public void main() { enter { } leave { } }
Well I guess that’s all for now. Hopefully my next LDef post will be about the assembler being ready – leaving just the linker. I need to experiment a bit with the codegen and linker before the unit format is complete.
The bytecode-format needs to include enough information so that the linker can glue things together. So every class, member, field etc. must be emitted in a way that is easy and allows the linker to quickly look things up. It also needs to write the actual, resulting method offsets into the bytecode.
Have a happy weekend!
N++ context parser, part 2
Right. In my last article on parsing I introduced a basic buffer class and skeleton parser. In this article I introduce a working 2 level parser, meaning recursive, which implements both the program() and execute() code blocks. Being recursive means that you can now nest several execute() sections inside each other. But at this stage that’s pretty much all I’ve had time to implement — so consider it a rough sketch.
Support
At the moment we have a program parser, which naturally parses a program block which looks like this:
program("name") { }
Registered for the program parser, is a single sub-parser for the “execute” keyword, this is the class TQTXExecuteParser class. This allows you to have execute() sections within a program block, as such:
program("name") { execute (scope) { /* Code goes heres */ } }
To avoid recursive stack errors (stack overflow), the execute class does a trick. Namely that it registers support for itself (execute keyword) at parse-time, meaning that an execute block will support execute child blocks, like this:
program("name") { execute (scope) { /* Sub execute block */ execute (scope) { /* your code here */ } } }
Well, basically this is the base system in place. Adding new keywords and features is now a matter of creating sub parser classes and registering them to either the execute class or the program parser class. The benefit of such a simple system is that it’s fast, easy to understand and use – and easy to maintain. Concrete rules for parsing can be defined, and a lot of code re-cycled between the base-class (TQTXCustomParser) and it’s decendants.
With a bit of work you will have a fine-tuned, fast and efficient language parser.
Supporting other languages
As you probably guessed it’s not that hard to add support for other languages. I mean once you take the time to implement the basic form of a language unit or program-file. But it’s always a time consuming task, where even the tiniest detail matters. If you miss a single language rule (like say, not skippig a comma after it’s been consumed) the error will spread into the next parser and so on – until an exception occurs.
But, once you get the hang of chewing through a source-buffer, supporting enough parsers and keywords (and for Delphi also states, like the “Inerface” keyword state and the “implementation” keyword state) you can pretty much mix and match all you want.
Well, without going to far into the code — here is the updated source. You may want to drop memo control on your form and add the following code:
procedure TForm1.W3Button1Click(Sender: TObject); var mContext: TQTXContext; mParser: TQTXLPPParser; begin mContext:=TQTXContext.Create; try mParser:=TQTXLPPParser.Create(mContext); try mContext.Buffer.LoadFromString(w3memo1.text); try mParser.Parse; except on e: exception do writeln(e.message); end; finally mParser.free; end; finally mContext.free; end; end;
And set the content of the memo to:
procedure TForm1.InitializeForm; begin inherited; w3memo1.Text:=#'program("test") { execute (*) { /* Code here */ } }'; end;
Ok — here is the updated parser code:
unit qtxparser; interface uses SmartCL.System; const TOK_ROUNDOPEN = 1; // "(" TOK_ROUNDCLOSE = 2; // ")" TOK_CUROPEN = 3; // "{" TOK_CURCLOSE = 4; // "}" TOK_SPACE = 5; // " " TOK_ADD = 6; // "+" TOK_SUBTRACT = 7; // "-" TOK_DIVIDE = 8; // "/" TOK_POWER = 9; // "^" TOK_COMMA =10; // "," TOK_COLON =11; // ";" Numeric_Operators: Array [0..3] of Integer = (TOK_ADD,TOK_SUBTRACT,TOK_DIVIDE,TOK_POWER); type ENPPException = Class(EW3Exception); TLPPApplicationObject = Class; TQTXBuffer = Class(TObject) private FData: String; FIndex: Integer; FLineNr: Integer; protected function getCurrent:String;virtual; public Property LineNr:Integer read FLineNr; Property Current:String read getCurrent; Function Back:Boolean; function Next:Boolean; function BOF:Boolean; function EOF:Boolean; procedure SkipJunk; function ReadTo(Const aChars:array of string; var text:String):Boolean;overload; function ReadTo(aValue:String):Boolean;overload; function ReadTo(aValue:String;var inner:String):Boolean;overload; function ReadWord(var Text:String):Boolean; function ReadToEOL(var text:String):Boolean; function PeekAhead(aCount:Integer;var Text:String):Boolean; function Compare(aText:String):Boolean; Property TextFromCurrent:String read ( Copy(FData,FIndex,length(FData)-FIndex) ); procedure First; procedure LoadFromString(Text:String); procedure Clear; End; TQTXContext = Class(TObject) Private FBuffer: TQTXBuffer; public Property Buffer:TQTXBuffer read FBuffer; Property ApplicationObject:TLPPApplicationObject; Constructor Create;virtual; Destructor Destroy;Override; end; TQTXCustomParser = Class; TQTXCustomParserClass = Class of TQTXCustomParser; TQTXParserInfo = Class(TObject) public property Instance: TQTXCustomParser; Property Keyword:String; end; TQTXCustomParser = Class(TObject) private FContext: TQTXContext; FParsers: Array of TQTXParserInfo; protected function getParserIndexFor(keyword:String; var aIndex:Integer):Boolean; function getParserInstanceFor(keyword:String; var aParser:TQTXCustomParser):Boolean; procedure RegisterParser(keyword:String; const aParser:TQTXCustomParser); protected procedure ParseAsExecuteBlock;virtual; protected Procedure ClearParserInstances;virtual; public Property Context:TQTXContext read FContext; procedure Parse;virtual; Constructor Create(aContext:TQTXContext);virtual; Destructor Destroy;Override; end; (* Parser for the keyword "program" *) TQTXProgramParser = Class(TQTXCustomParser) protected procedure ParseProgramName; procedure Parse;override; public Constructor Create(aContext:TQTXContext);override; end; TQTXExecuteParser = class(TQTXCustomParser) protected procedure ParseScope; procedure Parse;override; public Constructor Create(aContext:TQTXContext);override; end; TQTXLPPParser = Class(TQTXCustomParser) protected procedure Parse;override; public Constructor Create(aContext:TQTXContext);override; end; //########################################################################## // In-memory program model classes //########################################################################## TLPPObject = Class(Tobject) end; TLPPModule = Class(TLPPObject) public Property Name:String; end; TLPPModuleList = Array of TLPPModule; TLPPApplicationObject = Class(TLPPObject) private FModules: TLPPModuleList; public Property ApplicationName:String; Property Modules:TLPPModuleList read FModules; function IndexOfModule(aModuleName:String):Integer; Procedure Clear; Constructor Create;virtual; Destructor Destroy;Override; end; implementation var TOK_SYM: Array[TOK_ROUNDOPEN..TOK_COLON] of string = ('(',')','{','}',' ','+','-','/','^',',',';'); //########################################################################### // TLPPApplicationObject //########################################################################### Constructor TLPPApplicationObject.Create; begin inherited Create; end; Destructor TLPPApplicationObject.Destroy; Begin clear; inherited Destroy; end; function TLPPApplicationObject.IndexOfModule(aModuleName:String):Integer; var x: Integer; Begin result:=-1; aModulename:=trim(lowercase(aModuleName)); if length(aModuleName)>0 then begin for x:=0 to FModules.Count-1 do Begin if lowercase(FModules[x].Name)=aModuleName then begin result:=x; break; end; end; end; end; Procedure TLPPApplicationObject.Clear; begin ApplicationName:=''; while FModules.Count>0 do begin FModules[0].free; FModules[0]:=NIL; FModules.Delete(0,1); end; end; //########################################################################### // TQTXLPPParser //########################################################################### Constructor TQTXLPPParser.Create(aContext:TQTXContext); begin inherited Create(aContext); (* Register the "program" keyword-parser *) RegisterParser('program',TQTXProgramParser.Create(context)); end; procedure TQTXLPPParser.Parse; var mCache: String; mSubParser: TQTXCustomParser; Begin repeat case Context.Buffer.Current of ' ': Begin self.Context.Buffer.SkipJunk; end; '/': Begin if context.buffer.Compare('//') then begin self.context.buffer.SkipJunk; end; end; '(': Begin if Context.Buffer.Compare('(*') then Context.buffer.SkipJunk else Begin if getParserInstanceFor(mCache,mSubParser) then begin (* Invoke sub parser *) mSubParser.Parse; (* SKip trailing junk if any *) Context.Buffer.SkipJunk; (* clear cache *) mCache:=''; end else raise ENPPException.CreateFmt ('Syntax error [%s]',[mCache]); end; end; else Begin mCache += Context.Buffer.Current; end; end; if not Context.buffer.next then break; until Context.buffer.EOF; end; //########################################################################### // TQTXExecuteParser //########################################################################### Constructor TQTXExecuteParser.Create(aContext:TQTXContext); Begin inherited Create(AContext); end; procedure TQTXExecuteParser.Parse; var mDummy: Integer; Begin if not self.getParserIndexFor('execute',mDummy) then Begin self.RegisterParser('execute',TQTXExecuteParser.Create(Context)); end; (* Expects: Scope param list (), with a bare minumum of (STAR) *) //writeln(context.buffer.TextFromCurrent); writeln('Enters execute parser'); context.buffer.SkipJunk; (* Check for "(" *) if Context.Buffer.Current=TOK_SYM[TOK_ROUNDOPEN] then Begin // Skip the "(" char Context.Buffer.Next; ParseScope; end; if context.buffer.ReadTo('{') then Begin Context.buffer.next; (* Parse content as an execute block *) self.ParseAsExecuteBlock; end else Raise Exception.Create('Syntax error, expected code-block entry error'); writeln('leaves execute parser'); end; procedure TQTXExecuteParser.ParseScope; var mScope: String; Begin (* Keep reading until we find ")", or break on others *) if context.buffer.ReadTo([')',';','{','}'],mScope) then Begin (* Valid stop? or one of the breakers? *) if context.buffer.Current=')' then Begin writeln('Defined scope for this execute block is:' + mScope); //skip ")" char context.buffer.next; end else Raise Exception.Create('Syntax error, expected scope terminator error'); end; end; //########################################################################### // TQTXProgramParser //########################################################################### Constructor TQTXProgramParser.Create(aContext:TQTXContext); begin inherited Create(aContext); RegisterParser('execute',TQTXExecuteParser.Create(context)); end; procedure TQTXProgramParser.ParseProgramName; var mTemp: String; Begin (* Expects: ("NAME") As in Name of program *) mtemp:=''; if Context.Buffer.Current='(' then begin Context.buffer.next; if Context.buffer.current='"' then begin Context.Buffer.next; if Context.buffer.readTo('"',mTemp) then begin Context.ApplicationObject.ApplicationName:=mTemp; Context.Buffer.next; Context.Buffer.SkipJunk; if Context.buffer.Current=TOK_SYM[TOK_ROUNDCLOSE] then Begin Context.Buffer.next; end else Raise Exception.Create('Syntax error, expected ")" param error'); end else Raise Exception.Create('Syntax error, expected string termination error'); //showmessage(mName); end else Raise Exception.Create('Syntax error, expected string identifier'); end else Raise Exception.Create('Syntax error, expected parameter'); end; procedure TQTXProgramParser.Parse; Begin writeln(ClassName); context.buffer.SkipJunk; (* Check for "(" *) if Context.Buffer.Current=TOK_SYM[TOK_ROUNDOPEN] then Begin ParseProgramName; writeln('Program name:' + Context.ApplicationObject.ApplicationName); context.Buffer.SkipJunk; if context.Buffer.Current=TOK_SYM[TOK_CUROPEN] then Begin context.buffer.next; repeat ParseAsExecuteBlock; Context.buffer.SkipJunk; if context.buffer.Current=TOK_SYM[TOK_CURCLOSE] then break; until Context.Buffer.EOF; end else raise exception.create('Syntax error, expected program-block entry'); end; writeln("Exiting TQTXProgramParer"); end; //########################################################################### // TQTXCustomParser //########################################################################### Constructor TQTXCustomParser.Create(aContext:TQTXContext); Begin inherited Create; FContext:=aContext; end; Procedure TQTXCustomParser.ClearParserInstances; Begin (* Release sub-parser instances if any *) while FParsers.Count>0 do Begin if FParsers[0].Instance<>NIL then Begin FParsers[0].Instance.free; FParsers[0].Instance:=NIL; end; FParsers[0].free; FParsers.Delete(0,1); end; end; Destructor TQTXCustomParser.destroy; begin ClearParserInstances; inherited; end; function TQTXCustomParser.getParserIndexFor(keyword:String; var aIndex:Integer):Boolean; var x: Integer; begin result:=False; aIndex:=-1; keyword:=lowercase(trim(keyword)); if keyword.length>0 then Begin //if FParsers.Count>0 then //begin for x:=0 to FParsers.count-1 do Begin if sameText(FParsers[x].Keyword,keyword) then begin result:=true; aIndex:=x; break; end; end; //end; end; end; function TQTXCustomParser.getParserInstanceFor(keyword:String; var aParser:TQTXCustomParser):Boolean; var x: Integer; begin result:=False; aParser:=NIL; keyword:=lowercase(trim(keyword)); if keyword.length>0 then Begin //if FParsers.Count>0 then //begin for x:=0 to FParsers.count-1 do Begin if sameText(FParsers[x].Keyword,keyword) then begin result:=true; aParser:=FParsers[x].Instance; break; end; end; //end; end; end; procedure TQTXCustomParser.RegisterParser(keyword:String; const aParser:TQTXCustomParser); var mIndex: Integer; mInfo: TQTXParserInfo; Begin keyword:=lowercase(trim(keyword)); if keyword.length>0 then Begin if aParser<>NIL then begin if not getParserIndexFor(keyword,mIndex) then Begin (* register sub parser *) mInfo:=TQTXParserInfo.Create; mInfo.Instance:=aParser; mInfo.Keyword:=keyword; FParsers.Add(mInfo); end else raise ENPPException.createFmt ('Parser for keyword [%s] already registered',[keyword]); end else raise ENPPException.CreateFmt ('Parser class for keyword [%s] was NIL error',[keyword]); end; end; procedure TQTXCustomParser.Parse; begin raise ENPPException.CreateFmt ('No parser implemented for class %s',[classname]); end; (* About: This method continues the parsing process as if it's parsing from within an execute() code block. This means that all the rules of normal 'Execute( * ) " applies. After Program() this is the default parsing mode, which makes it the fallback of more or less all parse-objects which allows ordinary code (e.g "case/switch" sections and more) *) procedure TQTXCustomParser.ParseAsExecuteBlock; var mCache: String; mTemp: String; mParser: TQTXCustomParser; Begin while not Context.Buffer.EOF do Begin case Context.Buffer.Current of ' ', #9: begin //Context.buffer.SkipJunk; end; '(': If context.Buffer.Compare('(*') and not Context.buffer.compare('(*)') then Begin Context.buffer.next; context.buffer.next; Context.Buffer.ReadTo('*)'); Context.Buffer.Next; Context.Buffer.Next; end else Begin if mCache.length>0 then Begin writeln('Command is: "' + mCache + '"'); if getParserInstanceFor(lowercase(mCache),mParser) then Begin writeln('Parser ' + mParser.classname + ' is registered for keyword ' + mCache); try mParser.Parse; except on e: exception do raise Exception.Create(e.message); end; end else Raise Exception.create('Syntax error, unknown token:' + mCache); mCache :=''; exit; end else Raise Exception.Create('Syntax error, unsuspected character'); end; '/': If Context.buffer.compare('//') then begin Context.buffer.ReadToEOL(mTemp); end else if Context.buffer.compare('/*') then Begin Context.buffer.next; Context.buffer.Next; Context.buffer.ReadTo('*/'); end; '{': Begin if mCache.length>0 then Begin if getParserInstanceFor(lowercase(mCache),mParser) then Begin showmessage('over here!'); end else Raise Exception.create('Syntax error, unknown token:' + mCache); end else Raise Exception.Create('Syntax error, unexpected character'); end; '}': Begin end; else Begin mCache += Context.buffer.Current; end; end; if not context.buffer.next then break; end; writeln('Exiting CodeBlock parser at:' + mCache); end; //########################################################################### // TQTXContext //########################################################################### Constructor TQTXContext.Create; begin inherited Create; FBuffer:=TQTXBuffer.create; ApplicationObject:=TLPPApplicationObject.Create; end; Destructor TQTXContext.destroy; Begin FBuffer.free; ApplicationObject.free; inherited; end; //########################################################################### // TQTXBuffer //########################################################################### procedure TQTXBuffer.First; Begin If FData.length>0 then begin FLineNr:=1; FIndex:=0; end; end; procedure TQTXBuffer.SkipJunk; var mTemp: String; begin repeat case Current of ' ', #9 : Begin (* We treat TAB as space *) //if not next then //break; end; '/': Begin (* Skip C style remark *) if Compare('/*') then begin if readTo('*/') then Begin Next; Next; end else raise ENPPException.Create('Comment not closed error'); end else Begin (* Skip Pascal style remark *) if Compare('//') then begin if ReadToEOL(mTemp) then next else raise ENPPException.Create('Expected end of line error'); end; end; end; '(': Begin (* Skip pascal style remark *) if compare('(*') and not compare('(*)') then Begin if readTo('*)') then begin Next; Next; end else raise ENPPException.Create('Comment not closed error'); end else break; end; #13: begin // end; else begin Break; end; end; next; until EOF; end; Procedure TQTXBuffer.Clear; begin FData:=''; FLineNr:=1; FIndex:=0; end; procedure TQTXBuffer.LoadFromString(Text:String); Begin FLineNr:=1; FData:=trim(text); FData:=StrReplace(FData,#10,#13); FData:=StrReplace(FData,#13#13,#13); FIndex:=1; if length(FData)<1 then FIndex:=-1; end; function TQTXBuffer.getCurrent:String; Begin result:=FData[FIndex]; end; function TQTXBuffer.ReadWord(var Text:String):Boolean; begin result:=False; Text:=''; if not EOF then begin repeat if (current in ['A'..'Z','a'..'z','0'..'9']) then Text += current else break; until not next; result:=length(Text)>0; end; end; function TQTXBuffer.Compare(aText:String):Boolean; var mData: String; Begin result:=PeekAhead(length(aText),mData) and SameText(lowercase(mData),lowercase(aText)); end; function TQTXBuffer.PeekAhead(aCount:Integer;var Text:String):Boolean; var mPos: Integer; Begin result:=False; Text:=''; if not EOF then Begin mPos:=FIndex; try while aCount>0 do begin Text+=Current; if not Next then break; dec(aCount); end; result:=length(text)>0; finally FIndex:=mPos; end; end; end; function TQTXBuffer.ReadToEOL(var text:String):Boolean; Begin result:=ReadTo([#13,#10],text); end; function TQTXBuffer.ReadTo(aValue:String;var inner:String):Boolean; var mText: String; begin inner:=''; result:=False; aValue:=lowercase(aValue); if aValue.length>0 then begin repeat if PeekAhead(aValue.length,mText) then begin mText:=lowercase(mText); result:=SameText(mText,aValue); if result then break else inner += Current; end else inner += Current; Next; until EOF; end; end; function TQTXBuffer.readTo(aValue:String):Boolean; var mText: String; begin result:=False; aValue:=lowercase(aValue); if aValue.length>0 then begin repeat if PeekAhead(aValue.length,mText) then begin mText:=lowercase(mText); result:=SameText(mText,aValue); if result then break; end; if not Next then break; until EOF; (* repeat if PeekAhead(aValue.length,mText) and SameText(lowercase(mtext),aValue) then Begin result:=true; break; end else if not next then break; until EOF; *) end else Raise ENPPException.Create ('ReadTo() failed, invalid target value error'); end; function TQTXBuffer.ReadTo(Const aChars:Array of string; var text:String):Boolean; var x: Integer; Begin result:=False; text:=''; if aChars.Length>0 then begin for x:=FIndex to FData.length do Begin if (Current in aChars) then Begin result:=true; break; end else text+=Current; if not Next then break; end; end; end; Function TQTXBuffer.Back:Boolean; begin result:=FIndex>1; if result then dec(FIndex); end; function TQTXBuffer.Next:Boolean; begin Result:=FIndex<fdata.length; if="" result="" then="" begin="" inc(findex);="" (current="" in="" [#13,#10])="" inc(flinenr);="" end;="" function="" tqtxbuffer.bof:boolean;="" result:="FIndex=1;" tqtxbuffer.eof:boolean;="">=FData.Length; end; end.
N++ accessing services
Sometimes you come up with solutions which are a thing of pure beauty and elegance. Im falling in love with the simplicity of my N++ programming language, it’s robust boundaries which creates a situation where errors are quickly identified and dealt with.
Below is the syntax for using a web-service. Notice that setting up an WSDL endpoint is no more difficult than setting up a stdout pipe (command-line output).
program("service_test") { handshake { input { /* Consume WSDL Web-Service Endpoint */ service1 @ service("http://www.test.com/SOAP/myService/WSDL/", serviceType:SoapService); } output { myProcess @ process("self"); stdio @ pipe("stdout"); } } /* Execute RPC call */ execute (stdio,service1) { stdio:writeln("Calling webservice"); execute (*) { var int32 result = 0; set result = service1:getUserId("quartex","secret"); stdio:writelnF("UserID on server={0}", result); } fail (e) { stdio.writelnF("Calling soap service failed: {0}",e); proceed; } } /* Exit code for process */ set myProcess:exitCode = 0; }
For those of you interested in creating programming languages, I hope this wet’s your appetite to learn N++. At the moment the proof-of-concept is being written in Smart Pascal for nodeJS, but a native version running on Linux, Unix, Windows and OS X is the final product.
N++ is free to use. It is not open-source but created as a non-profit language which will be managed through an organization (more or less identical to php, perl and python).
To date, it’s the only language which is fully service oriented from the ground up.
You must be logged in to post a comment.