5. Low-level Code : Pillars – tukoblog

There are three fundamental pillars of coding: sequence, iteration and branching. This section therefore introduces coding by explaining these pillars and their significance.

Sequence

A computer program is in essence a sequence of instructions. This works at two levels. Firstly, the programmer writes code in a high-level programming language. Secondly, that code is compiled into machine code. The source code is a sequence of instructions, but the machine code sequence is what is actually executed.

A computer program is a sequence of instructions in machine code.

So what are the instructions in the sequence made of? Well, let us first of all look at this sequence :

Print out "foo"
Output "foo"
Send the following to the screen: "foo"
the word is now "foo"
show us the word "foo"

This sequence is written in a human language and illustrates how human languages are too imprecise to write computer code in. Look for example at how easy it is to say the same thing in so many different ways. This imprecision is unacceptable if we want to deliver instructions to a computer.

We have already sketched out the early days of computers, of valves and machines as big as castles, but back in the 1950s a more refined aspect of computing was also active courtesy of the theorists. These thinkers were studying the idea of computer intelligence and in doing so investigated the possibility of communicating with the computer and the consequent relationship between computer and language.

The thinkers soon realised that language is a very complex thing. Sure, you can make a computer understand a sentence like ‘The cat sat on the mat’. It is relatively simple to write rules that say a sentence is syntactically incorrect or a word is not in the dictionary (an asterisk here denotes an ungrammatical sentence) :

*cat the on the sat mat [ syntax check fails ]
*sat mat cat the the on
*the clat sat on the mlat [ dictionary lookup fails ]
*dhe cat apped ob dhe mat
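The two checks above can be sketched in a few lines of Python. This is only a toy under loud assumptions: the five-word dictionary, the word categories and the single allowed sentence pattern are all invented here for illustration and are not a real grammar of English.

```python
# Hypothetical toy dictionary: each known word is tagged with a category.
DICTIONARY = {
    "the": "DET",
    "cat": "NOUN",
    "mat": "NOUN",
    "sat": "VERB",
    "on": "PREP",
}

# Hypothetical grammar: the only sentence shape this toy accepts.
PATTERN = ["DET", "NOUN", "VERB", "PREP", "DET", "NOUN"]


def check_sentence(sentence):
    """Return 'ok', 'dictionary lookup fails' or 'syntax check fails'."""
    words = sentence.lower().split()
    # Dictionary lookup: every word must be known.
    for word in words:
        if word not in DICTIONARY:
            return "dictionary lookup fails"
    # Syntax check: the categories must appear in the allowed order.
    if [DICTIONARY[word] for word in words] != PATTERN:
        return "syntax check fails"
    return "ok"
```

With these rules, ‘cat the on the sat mat’ fails the syntax check and ‘the clat sat on the mlat’ fails the dictionary lookup. Note that ‘the mat sat on the cat’ passes both checks, because neither rule knows anything about meaning.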

Semantics was the deal-breaker :

The cat ate on the mat
?*The cat ate the mat 
?*The cat wandered through the mat
?*The cat expressed the mat

The simplest human conversation turns on an immense and ever-expanding database of context. Any human would know that mats don’t sit on cats. That is context. A computer does not know this and has no sense of context. A poet, however, might use the phrase, knowing it is absurd. Cats do crawl under mats and that would be a context to use the phrase ‘The mat sat on the cat’. This deliberate misuse of language is even further beyond a computer’s reach.

The cat sat on the mat
But the cat was very fat
So *~~the mat sat on the cat~~ [SEMERROR 2X43h!]
Said how do you like that?

This was also the era of Noam Chomsky and his revolutionary work ‘Syntactic Structures’ (1957). The basic and revolutionary idea was that language can be seen in mathematical terms. Here a ‘dictionary’ is a set of values that can be transformed by rules (‘grammar’) into phrases. One of his great insights was that there are ‘phrase structure’ grammars that, while inadequate to describe human language, are nevertheless grammars that describe language.

This is where the computer thinkers come in. We have seen how human language is too diffuse to be used to program a computer. But a simpler phrase-structure language is another thing. The key point now is that the new way of thinking about language was mathematical. If you defined a language in a mathematical way, the correctness of its syntax was provable and testable. If you could design a computer language this way, you could read any code written in it and prove whether it was syntactically correct. Note that there could be no semantic problems of the kind found in human language. Such a language would be perfectly unambiguous.

That, in a nutshell, is what the compiler we have often talked about does. It embodies the rules of the language, reads the code and checks the syntax against those rules.

A programming language defined in this way is built around a dictionary of keywords and operators, plus a grammar. A dictionary might include items like this :

keywords: if then while do for break
operators: +-*/%^&()

Even at the simplest level, it can be seen that a program written in that language would be incorrect if it used keywords such as ‘boo’ or ‘yaka’ or ‘sha’.
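As a rough illustration, the dictionary check can be sketched in Python. The keyword set is the one listed above. The function name is made up for this example, and the sketch deliberately ignores variable names, numbers and operators, which a real language would of course also have to allow.

```python
# The keyword dictionary from the text.
KEYWORDS = {"if", "then", "while", "do", "for", "break"}


def unknown_words(code):
    """Return the alphabetic words in `code` that are not in the dictionary."""
    return [
        word
        for word in code.split()
        if word.isalpha() and word not in KEYWORDS
    ]
```

Calling unknown_words("if boo then yaka") returns the list ['boo', 'yaka'], while unknown_words("while do break") returns an empty list.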

The grammar tells the compiler what the rules are for putting keywords and operators in the right order. Say this syntax is correct :

x = 3

The following instructions must therefore be wrong (unless the language is appallingly designed) :

= 3 x
3 x =
3 = x

The last line is worthy of note, for it provides a neat little insight into how code works. The intent of the language here is clearly to say: place the value on the right hand side (rhs) into the memory referenced by the left hand side (lhs). So ‘x = 3’ means ‘place the value 3 (rhs) into the memory referenced by x (lhs)’. If we reverse the ‘x’ and the ‘3’, the left hand side becomes ‘3’, which is a literal and not a memory location, so the code makes no sense. There is nowhere for the value in ‘x’ to go!
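One way to watch a real syntax check at work is to hand these four lines to an existing compiler. The sketch below uses Python's built-in compile function, which parses source text without running it. Python is used here purely as a convenient stand-in for the made-up language in the text, and the helper name is hypothetical.

```python
def is_valid_syntax(source):
    """Return True if Python's own compiler accepts `source` as a program."""
    try:
        # compile() parses and checks the code but does not execute it.
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True
```

Python accepts ‘x = 3’ and rejects ‘= 3 x’, ‘3 x =’ and ‘3 = x’. The last of these is rejected because a literal cannot be the target of an assignment.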

Returning to our first example, we can replace the imprecise English with proper computer code :

print("foo");

Because the syntax can be proved correct, all ambiguity is removed :

sat cat mat [ GREAT! OK! ]
sat mat cat [ NO! 2X43h! SEMERROR! ]

and no poet could argue with that, not even Homer himself.

Iteration

Computer code would be perfectly rotten if it were just an endless sequence of statements. Imagine a million lines of solid uninterrupted statements. The sequence is the first pillar of programming. The second pillar is iteration.

Look at this code from a world with no iteration :

print("foo")
print("foo")
print("foo")
print("foo")
print("foo")

With iteration :

// iterator plus control code in parentheses
for (i = 1; i <= 5; i = i + 1)
{ // code block start
     print("foo"); // code within block
} // code block end

This is a ‘for’ loop (iteration). Note the familiar curly braces. These define a block of code. The block begins with the opening brace and ends with the closing one. All the code in between (all of one line here) belongs to the for loop. Between the ‘for’ and the opening brace you can see how the loop is controlled.

The code is saying ‘loop 5 times’. First it places the value 1 into the memory referenced by ‘i’. This happens once only. Second it tests to see if the value in ‘i’ is less than or equal to 5. If the test fails (the value in ‘i’ is 6), the iteration stops and we pass beyond the closing brace (which means ‘end’). If the test passes, the code within the braces is executed. Here of course, the word ‘foo’ is printed. Only then is the third segment executed, adding 1 to the value in ‘i’, after which the test is made again.

In this code, the value in ‘i’ is initially set to 1, so it will pass into the block and output ‘foo’. At the second point of iteration, ‘i’ will be 2, then 3, then 4, then 5. So ‘foo’ will be output 5 times.
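The order in which a C-style ‘for’ loop runs its three control segments can be traced step by step. The Python sketch below spells the loop out by hand and records each step. The function name and the wording of the recorded events are invented for this illustration.

```python
def trace_for_loop(start=1, limit=5):
    """Record each step of: for (i = start; i <= limit; i = i + 1) { body }"""
    events = []
    i = start                              # first segment: runs once only
    events.append(f"init i={i}")
    while True:
        passed = i <= limit                # second segment: test before each pass
        events.append(f"test i={i} -> {passed}")
        if not passed:
            break                          # test failed: leave the loop
        events.append(f"body i={i}")       # stands in for print("foo")
        i = i + 1                          # third segment: runs after the body
        events.append(f"step i={i}")
    return events
```

The trace begins ‘init i=1’, ‘test i=1 -> True’, ‘body i=1’, ‘step i=2’ and ends with ‘test i=6 -> False’. The body is recorded five times.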

The other common iterator found in programming languages is called a ‘while’ loop:

i = 1;
while (i <= 5)
{
     print("foo");
     i++;
}
     
. . .

i = 1;
do 
{
     print("foo");
} while(++i <= 5);

As you can see, there are two variations, each useful depending on the situation. The general meaning of the code should be clear. Both loops output “foo” five times. In the first loop, the control code is at the start of the iteration, in the second (a ‘do while’ iteration) it is placed at the end.
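The practical difference between the two shows up when the test is false from the very start. Python has no ‘do while’ statement, so the sketch below imitates one with an endless loop and a break at the bottom. Both function names are made up for this example.

```python
def while_runs(i, limit=5):
    """Count block executions when the test comes first."""
    runs = 0
    while i <= limit:          # test at the start of each pass
        runs += 1              # stands in for print("foo")
        i += 1
    return runs


def do_while_runs(i, limit=5):
    """Count block executions when the test comes last."""
    runs = 0
    while True:
        runs += 1              # the block always runs at least once
        i += 1
        if not i <= limit:     # test at the end of each pass
            break
    return runs
```

Starting from 1, both functions return 5. Starting from 6, while_runs returns 0 but do_while_runs returns 1, because the block runs once before the test is ever made.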

This example introduces a neat bit of concision common to C-like languages, the ‘plusplus’ notation. Instead of saying ‘i = i + 1’, we can express the same thing concisely with either i++ or ++i. Both add 1 to the value of ‘i’. However, i++ means ‘use the current value of i in this expression, then increment i’ and ++i means ‘increment i, then use the new value in this expression’. Generally, i++ is better, being more readable. In the do-while loop, though, observe how the value of ‘i’ must be incremented before the test in the control code, otherwise the block would run six times. Here, ++i is the one to use. Anyway, therein lies the witty joke behind the name C++. (Was the beta design called ++C?)
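The effect of choosing ++i or i++ in the do-while test can be checked by counting. Python has neither operator, so this sketch spells out by hand what each one does. The function and its parameter are hypothetical names used only here.

```python
def do_while_count(pre_increment, limit=5):
    """Count block runs of: i = 1; do { block } while (++i <= limit);
    or, when pre_increment is False, of: ... while (i++ <= limit);"""
    i = 1
    runs = 0
    while True:
        runs += 1                  # stands in for print("foo")
        if pre_increment:
            i += 1                 # ++i: increment first...
            tested = i             # ...so the test sees the new value
        else:
            tested = i             # i++: the test sees the old value...
            i += 1                 # ...and the increment follows
        if not tested <= limit:
            break
    return runs
```

do_while_count(True) returns 5, matching the loop in the text. do_while_count(False) returns 6, which is the off-by-one error that using i++ in the test would cause.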

Branching

The third pillar of programming is branching. Consider the following code :

decimal winnings = casino.GetBalance(me); // get my winnings!
myaccount.Invest(winnings); // invest them!

This code uses objects, which we encountered in the last chapter. Don’t worry yet about what exactly an object is; the code here is easy enough to understand in itself. The casino object has code that sends back the user’s balance and the account object has code that invests the specified amount. The code thus retrieves the user’s casino balance and invests it into their bank account.

It should be obvious that there is something profoundly wrong with this code. Since when are casinos associated with winning? Even if people do sometimes win at casinos, this code assumes they do without fail.

This is where branching comes into play. Without branching, the problem with the code above would be unsolvable. With branching, the solution is obvious :

decimal casinoBalance = casino.GetBalance(me);
if (casinoBalance > 0)
{
     myaccount.Invest(casinoBalance);
}
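For readers who want something they can run, here is a minimal Python sketch of the same branch. The Account class and the function name are stand-ins invented for this example, and the casino lookup is replaced by a plain argument.

```python
from decimal import Decimal


class Account:
    """Hypothetical bank account that only tracks the amount invested."""

    def __init__(self):
        self.invested = Decimal("0")

    def invest(self, amount):
        self.invested += amount


def invest_winnings(account, casino_balance):
    """Invest the balance only if there is something to invest."""
    if casino_balance > 0:             # the branch
        account.invest(casino_balance)
        return True
    return False                       # nothing won, nothing invested
```

With a balance of Decimal("50") the function invests the money and returns True. With a balance of zero or less it leaves the account untouched and returns False.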

A branch can be thought of as an if-then statement. A lot of programming languages in fact use these as keywords, Pascal for example :

if (casinoBalance > 0) then
begin
     // code
end;

This shows how elegantly concise C-like languages are, for they remove the need for a ‘then’ keyword whilst retaining full readability.

These three pillars of programming — sequence, iteration and branching — sound so simple, but they are so fundamental that they can express practically anything. Of course, while learning to program merely begins here, you now understand how a program is structured at the most fundamental level.
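As a closing illustration, here is a small Python sketch in which all three pillars appear together. The task (adding up the even numbers from 1 to a limit) and the function name are chosen only for this example.

```python
def sum_of_evens(limit):
    """Add up the even numbers from 1 to `limit` inclusive."""
    total = 0                              # sequence: one statement after another
    for number in range(1, limit + 1):     # iteration: repeat for each number
        if number % 2 == 0:                # branching: only act on even numbers
            total += number
    return total
```

sum_of_evens(10) returns 30, that is 2 + 4 + 6 + 8 + 10.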