Archive

Archive for May 9, 2019

Understanding a stack

May 9, 2019 Leave a comment

The concept of stacks is an old one, and together with linked-lists and queues – these form the most fundamental programming concepts a developer needs to master.

But, the stack most people use today in languages like object pascal and C++ are not actual stacks; they are more like “conveniently repurposed lists“. Not a huge issue I agree, but the misconception is enough to cause confusion when people dive into low-level programming.

Adventures in assembly-land

stackishIt might seem odd to focus on something as trivial as a stack, but I have my reasons. A friend of mine who is a brilliant coder with plenty of large projects behind him recently decided to have a go at assembly coding. He was doing fine and everything was great, until he started pushing and popping  things off the stack.

After a little chat I realized that the problem was not his code, but rather how he viewed the stack. He was used to high-level versions of stacks, which in most cases are just lists storing arbitrary sized data – so he was looking at the stack as a TList<item> expecting similar behavior. Superficially a real-stack and a list-stack work the same if all you do is clean push and pop operations, but the moment you start designing a stack-scheme and push more elaborate constructs (stack-frames), things can go wrong really fast.

The nature of a real stack

A “real” stack that is a part of a hardware SOC (system on a chip) has nothing to do with lists. It’s actually a solid chunk of memory with a register to keep track of an offset into this memory block.

Let’s for sake of argument say you have 4k of stack space right? It’s clean and contains nothing, so the SP (stack pointer, or offset) is zero. What happens when you push something to the stack? for example:

push EDX

The code above simply writes the content of the EDX register to whatever offset the SP contains. It then updates the SP with the size of the data (EDX is a 32bit register, so the SP is incremented by a longword or 4 bytes). In Delphi pseudocode what happens is something like:

var LAddr := FStackBuffer;
inc(LAddr, SP);
PLongword(LAddr)^ := EDX;
inc(SP, SizeOf(EDX));

The thing about a stack is that it doesn’t manage data-length for you. And that is a big difference to remember. It will push or pop data based on the size of the source (in this case the 32bit EDX register) you use.

If you push 1024 bytes of data to a list based stack, the list keeps track of the size and data for you. So when you pop the data from the stack, you get back that data regardless. But a “real” stack couldn’t care less — which is also why it’s so easy to screw up an entire program if you make a mistake.

In short: The length of what you push – must be matched when you pop the data back (!) If you push a longword, you MUST pop a longword later.

Benefits of a real stack

call stackThe benefit is that the cost of storing values on a stack is almost zero in terms of cpu operations. A list based stack is more expensive; it will allocate memory for a record-item to hold the information about the data, then it will allocate memory to hold the actual data (depends on the type naturally) and finally copy the data into the newly allocated buffer. Hundreds if not thousands of instructions can be involved here.

A real stack will just write whatever you pushed directly into the stack-memory at whatever offset SP is at. Once written it will add the length of the write to the SP – and that’s it! So it’s one of the oldest and fastest mechanisms for lining up data in a predictable way.

Again the rules are simple: when you pop something off the stack, the size must match whatever you used to push it there. So if you pushed a longword (EDX) you also have to make sure you use a 32-bit target when you pop the value back. If you use RDX, which is 64 bit then you will essentially steal 4 bytes from something else using that stack – and all hell will break loose down the line.

Stack schemes and frames

Im not going to dig too deeply into stack-frames here, but instead write a few words about stack-schemes and using the stack to persist data your code relies on. The lines blur between those two topics anyways.

The SP (stack pointer) is not just a simple offset you can read, you can also write and change it (it also serves as a pointer). You can also read from whatever memory the SP is pointing at without polling any data from the stack.

What language developers usually do, is that they design entire structures on the stack that are, when you get into the nitty-gritty, “offset based records”. For example, lets say you have a record that looks like this:

type
PMyRecord ) ^TMyRecord;
TMyRecord = record
  first: Pointer;
  second: integer;
  Third: array[0..255] of longword;
end;

Instead of allocating conventional ram to hold that record, people push it to the stack and then use offsets to read and update the values there. A bit like a super global variable if you like. This is why when you disassemble code, you find stuff like:

mov EDX, (SP)+4

If the above record was on the stack, that pseudo code would move the field “second” into the EDX register. Because that field is 4 bytes from the stack start (providing SP points to zero).

Every programming language has a stack scheme to keep track of things. Local variables, global variables, class instances, type RTTI — most of these things are allocated in conventional ram – but there is a “program record” on the stack that makes it easy to access that information quickly.

This “moving a whole record onto the stack” is basically what a stack-frame is all about. It used to be a very costly affair with a heavy cpu speed penalty. If you look in your Delphi compiler options you will see that there is a checkbox regarding this very topic. Delphi can be told to avoid stack-frames and do register allocation instead, which was super quick compared to stack-frames – but CPU’s today are largely optimized for stack-frame allocation as default, so I doubt there is much to gain by this in 2019.

Note: A stack frame is much more, but its out of scope for this post. Google it for more info.

To sum up

When doing high-level coding you don’t really need to bother with the nuances between a TStack<item> and a “real” stack. But if you plan on digging deeper and learning a few lines of assembly – learning the differences is imperative. Its boring stuff but as fundamental as wheels on a bicycle. There is no way to avoid it, so might as well jump in.

In its absolute raw form, here is roughly the same functionality for Delphi. This was written on the fly in 2 minutes while on the road, so its purely to give you a rough idea of behavior. I would add a secondary field to keep track of the end (next insertion point), that way SP can be changed without overwriting data on new pushes.

And yes, wrapping this in a TObject utterly defeats the purpose of low-level coding, but hopefully it gives you some idea of the differences 🙂

stack_01

stack_02