ALGOL W

During his stay at Stanford University Wirth designed ALGOL W, his first widely distributed structured language. He first implemented PL/360 , the structured assembler for the IBM 360 architecture, to have a development system for ALGOL-W.

ALGOL W is a programming language, based on a proposal for ALGOL X by Niklaus Wirth and Tony Hoare as a successor to ALGOL 60 in the International Federation for Information Processing (IFIP) IFIP Working Group 2.1. When the committee decided that the proposal was not a sufficient advance over ALGOL 60, the proposal was published as A contribution to the development of ALGOL.
After making small modifications to the language Wirth supervised a high quality implementation for the IBM/360 at Stanford University that was widely distributed.

It represented a relatively conservative modification of ALGOL 60, adding string, bitstring, complex number and reference to record datatypes and call-by-result passing of parameters, introducing the while statement, replacing switch with the case statement, and generally tightening up the language. The implementation was written in PL/360, an ALGOL-like assembly language designed by Wirth. The implementation includes influential debugging and profiling abilities.

ALGOL W’s syntax is built on a subset of the EBCDIC character set. In ALGOL 60 reserved words are distinct lexical items, but in ALGOL W they are merely sequences of characters, and do not need to be stropped. Reserved words and identifiers are separated by spaces. In these ways ALGOL W’s syntax resembles that of Pascal and later languages.

The Algol W Language Description (see below)defines Algol W in an affix grammar that resembles BNF. This grammar was a precursor of the Van Wijngaarden grammar.

Much of Algol W’s semantics is defined grammatically: Identifiers are distinguished by their definition within the current scope. For example, a ⟨procedure identifier⟩ is an identifier that has been defined by a procedure declaration, a ⟨label identifier⟩ is an identifier that is being used as a goto label. The types of variables and expressions are represented by affixes. For example ⟨τ function identifier⟩ is the syntactic entity for a function that returns a value of type τ, if an identifier has been declared as an integer function in the current scope then that is expanded to ⟨integer function identifier⟩. Type errors are grammatical errors. For example “⟨integer expression⟩ / ⟨integer expression⟩” and “⟨real expression⟩ / ⟨real expression⟩” are valid but distinct syntactic entities that represent expressions, but “⟨real expression⟩ DIV ⟨integer expression⟩” (i.e. integer division performed on a floating-point value) is an invalid syntactic entity.

The major part of ALGOL W, amounting to approximately 2700 cards (see listing below), was written in Wirth’s PL360. An interface module for the IBM operating system in use (OS, DOS, MTS, ORVYL) was written in IBM assembler, amounting to fewer than 250 cards. In an OS environment on a 360/67 with spooled input and output files, the compiler will recompile itself in about 25 seconds. The compiler is approximately 2700 card images. Thus, when the OS scheduler time is subtracted from the execution time given above, it is seen that the compiler runs at a speed in excess of 100 cards per second (for dense code). In a DOS environment on a 360/30, the compiler is limited only by the speed of the card reader. The compiler has successfully recompiled itself on a 64K 360/30 at a rate of 1200 cards per minute (the speed of the card reader).

Documents for download:

ALGOL W Reference Manual, Niklaus Wirth, June 1972
ALGOL W Reference Manual, February 1972
ALGOL-W Listing November 1969
Algol W Language Description by Henry Bauer Sheldon Becker Susan L. Graham Edwin Satterthwaite Richard L. Sites June 1972
ALGOL W Language description, Henry R. Bauer, Sheldon Becker, Susan L. Graham
MTS ALGOL-W 1980

If you want to work with ALGOL-W, there is a recent implementation AWE for Linux (and Windows via CYGWIN) .

From https://everything2.com/title/Algol+W

In the early 1960s, the IFIP Working Group 2.1 committee was created with the mandate to design a language to replace Algol+60. By the time of the regular committee meeting in the fall of 1966, there were three competing proposals on the table. One of the proposals, by Niklaus Wirth, was for a language which was a relatively modest improvement over Algol 60. One of the other proposals, by Van Wijngaarden, was chosen by the committee and was to evolve into Algol 68. As a result of the decision to pursue Van Wijngaarden’s proposal, a number of the committee members, including Wirth and C.A.R. Hoare, abandoned the committee (it was, apparently, a pretty intense meeting). Wirth developed his proposal into an Algol-style language which, eventually, came to be known as Algol W (like Algol 60 before it, the Algol W language’s gramma was formally described using BNF notation). Although less ambitious than Van Wijngaarden’s proposal, Wirth’s language introduced a number of new concepts including:

  • double precision floating point data type
  • complex data type
  • bit string data type
  • dynamic data structures (i.e. structs and pointers to structs)
  • block expressions

When it came time to actually compiler for the language, a decision was made to first develop a systems programming language called PL360 (the only real alternatives were FORTRAN and Assembler). Once a PL360 compiler had been written, it was used (in about 1968) to implement an Algol W compiler on the OS/360 and MTS (Michigan Terminal system) operating systems.

Algol W was reasonably successful as a teaching language although it was rarely if ever used in commercial contexts. By about 1980, Algol W was being replaced by languages like Pascal.

The Algol W language

The following is a brief description of the language as it was defined in 1972.

Basic syntax

Algol W dates back to the days of punched cards. Consequently, the language is defined strictly in terms of upper case. For example, the following is a valid Algol W program

BEGIN
WRITE("Hello world");
END.

but the following is not

begin
write("Hello world");
end.

Like C, statements are terminated by a semi-colon and whitespace can appear anywhere except within an identifier or reserved word.

The reserved words are
ABS
ALGOL (used when writing separately compiled ‘modules’)
AND
ARRAY
ASSERT
BEGIN
BITS
CASE
COMMENT
complex number COMPLEX
DIV
DO
ELSE
EEND
FALSE
FOR
FORTRAN (used to call a separately compiled FORTRAN routine)
GO TO
GOTO
IF
INTEGER
IS
LOGICAL” LOGICAL
LONG
NULL
OF
OR
PROCEDURE
REAL
RECORD
REFERENCE
REM
RESULT
SHL
SHORT
SHR
STEP
STRING
THEN
TRUE
UNTIL
VALUE
WHILE

Data types

The following is the list of data types supported by the language. Example declarations and constants are provided for each data type.

  • INTEGER (32-bit signed integer)
    INTEGER I, J;
    10
    1232
    
  • REAL (32-bit floating point)
    REAL W, X;
    10.
    10.0
    0.1
    .1
    
  • LONG REAL (64-bit floating point)
    LONG REAL Y, Z;
    10.L
    10.0L
    0.1L
    .1L
    10L
    
  • COMPLEX (32-bit complex – i.e. a pair of 32-bit floating point values)
    COMPLEX A, B;
    10.I
    10.0I
    0.1I
    .1I
    10I
    

    Note that the complex constant only represents the imaginary part of the number. A specification of a complete complex value would involve the addition of a real value with a complex value. e.g. 5.0 + 3I or 7 + 3I (the integer constant 7 would be automatically converted to the real value 7.0 before it was added to the complex constant value 3i).

  • LONG COMPLEX (64-bit complex – i.e. a pair of 64-bit floating point values)
    COMPLEX C, D;
    10.IL 10.0IL 0.1IL .1IL 10IL
    
  • LOGICAL (a boolean value – i.e. either TRUE or FALSE)
    LOGICAL E, F;
    TRUE
    FALSE
    
  • BITS (a 32 element sequence of bits)
    BITS G, H;
    #001248CF
    

    Note that bits constants are represented in hexadecimal. The above constant is equivalent to the binary value 00000000000100100100100011001111.

  • STRING (fixed length string)
    STRING S; STRING(40) T;
    "hello"        a string containing the 5 characters h, e, l, l and o
    """"           a string containing a single "
    

    All strings are fixed length with a minimum length of 1 and a maximum length of 256. If the length isn’t specified in a STRING declaration then it defaults to 16. Also note that there is no backslash convention as is found in many “modern” languages. Use two double quotes in a row if you want to have a single double quote within a string constant.

  • RECORD (essentially a C struct)
    RECORD PERSON( STRING(40) NAME; INTEGER AGE; REAL HEIGHT );
    PERSON( "DANIEL BOULET", 45, 180.3 )
    PERSON
    

    The PERSON( “DANIEL BOULET”, 45, 180.3 ) example defines
    a new PERSON RECORD and initializes all the fields. The second PERSON example creates a new PERSON RECORD with the field
    values uninitialized. Neither of these examples are really constants.

  • REFERENCE (essentially a Java-style pointer – i.e. no pointer-to-pointer or pointer arithmetic operations are supported)
    REFERENCE (PERSON) DANIEL; REFERENCE (PERSON,THING) IT;
    NULL
    
    • a REFERENCE variable can be declared to point at either a single
      RECORD type (e.g. DANIEL above) or a list of alternative record types
      (e.g. IT above).
      If a REFERENCE variable can point at more than one RECORD type then
      the IS operator (see below) can be used to determine which type it
      actually points at.
    • the only REFERENCE constant is the NULL value.

Algol W also supports arrays which can be constructed using any of the other data types. Here are a few example declarations:

INTEGER ARRAY VECTOR(1::10);         COMMENT ten element array of integers;
REAL    ARRAY MATRIX(1::10,1::10);   COMMENT a square 10x10 array of reals;
INTEGER ARRAY RANGE(-10::10);        COMMENT a 21 element array (-10 to +10);
REFERENCE(THING) ARRAY THINGS(1::N); COMMENT an array of references to 
                                             things with a size determined
					     at run-time;

Any of the bounds of an array declaration can be an expression which is evaluated at run-time. If the result is an array in which the lower bound is greater than the upper bound (for any dimension) then the array has no elements – i.e. any attempt to access the array will result in a run-time array bounds error.

Examples for each of the above arrays are:

WRITE( VECTOR(1) );           COMMENT print out first element of VECTOR;
MATRIX(10,1) := MATRIX(1,10); COMMENT copy an element of MATRIX;
RANGE(0) := 0;                COMMENT set the middle element of RANGE to 0;
WRITE(NAME(THINGS(1)));       COMMENT write out the value of the NAME field of the first THING;

Operators and expressions

Algol W supports the usual set of operators although it distinguishes between integer and non-integer division operators.
The operators are:

OR
AND
¬
< <= = ¬= >= > IS
+ -
* / DIV REM
SHL SHR **
LONG SHORT ABS

A few notes are in order:

  • The operators on each line in the above list are of equal precedence (i.e.
    priority) with
    the first line being the lowest precedence and the last line having the
    highest precedence.
  • ¬ is the logical negation operator (equivalent to the C programming language’s ! operator) and ¬= is the inequality comparison operator (equivalent to C’s != operator).
  • the IS operator can be used to determine which RECORD type a REFERENCE variable is pointing at. e.g.
    IF IT IS PERSON THEN
       WRITE("IT is pointing at a PERSON record")
    ELSE
       WRITE("IT is not pointing at a PERSON record");
    
  • DIV and REM are the integer divide and remainder operators (if / is applied to a pair of integer values then the result is a real value).
  • SHL and SHR are the bit-wise shift left and right operators.
  • LONG ‘expands’ REAL and COMPLEX values to LONG REAL and LONG COMPLEX.
    SHORT ‘contracts’ LONG REAL and LONG COMPLEX values to
    REAL and COMPLEX. It’s an error to apply LONG to something which is already LONG or to apply SHORT to something which is not LONG.
  • ABS returns the absolute value of it’s operand.
  • with the exception of the OR and AND operators, the operands of binary operators were evaluated in an unpredictable order. For example, the expression F(X) * G(X) might result in F(X) being called either before or after G(X) gets called. The order would be always the same for a given compilation of a given program but could change if the program was recompiled. Consequently, an operator which combined the result of two function calls which each modified the same global variable could cause unpredictable results.
  • the AND and OR operators are McCarthy operators -i.e. they use what today is often called lazy evaluation: If the left operand of an OR is true then the right operand isn’t evaluated as the result can only be TRUE. If the left operand of an AND is false then the right operand isn’t evaluated as the result can only be FALSE.

Parentheses can, of course, be used to override the operator precedence. Missing from the above list is the unary – and + operators which have higher precendence than any of the binary operators. Missing from this entire writeup is a description of the reasonably large (for the era) set of built-in library routines. Also missing from the above list is a mechanism to access fields of a RECORD. The syntax for this is identical to a function call where the name of the function is the name of the field and the single parameter to the call is the REFERENCE variable. For example,

AGE(DANIEL) := AGE(DANIEL) + 1;

would increment the AGE field of the PERSON record referenced by DANIEL.

An example program

Here’s a very short example program

BEGIN
INTEGER I, J, PRODUCT;
READ(I,J);  COMMENT read two integers from the input device;
PRODUCT := I * J; WRITE("The product is ",PRODUCT);
END.

A few notes are in order

  • The entire program must be enclosed in a BEGIN-END block and the final END must have a period after it.
  • The READ statement is a call to the built-in READ procedure which reads a single input line and assigns successive values to successive parameters (input is free format in this example and additional input lines are read if necessary).
  • Comments are inserted by using the COMMENT reserved word.
  • Everything up to the next semi-colon is ignored

  • Multiple statements can appear on a single line and a statement can span any number of lines.
  • The assignment operator is :=.
  • This allows it to be distinguished from the comparison for equality operator which is =.

  • Lower case letters are allowed within comments and string constants.

Block structured

Like all Algol languages, Algol W is a block structured language. This means that the scope of an identifier is determined by the block within which it is declared. For example, the output of the following program

BEGIN
INTEGER I;
I := 10;
WRITE("I is ",I);
   BEGIN
   INTEGER I;
   I := 20;
   WRITE("this I is ",I);
   END;
WRITE("I is still ",I);
   BEGIN
   I := 30;
   WRITE("I is now ",I);
   END;
WRITE("I's final value is ",I);
END.

would be

I is 10
this I is 20
I is still 10
I is now 30
I's final value is 30

The scope of the I declared on the second line is the entire program except for the first inner block which has a different I declared within it.

Procedures

Algol W has function procedures and proper procedures. A function procedure returns a value and can be used in any context in which an expression can appear. A proper procedure does not return a value. Today, these are generally called functions and procedures respectively. Procedures of either kind can be defined to accept parameters. Here are a pair of examples:

BEGIN
INTEGER P, Q;

PROCEDURE PROC( INTEGER VALUE I, J; INTEGER RESULT PRODUCT );
   BEGIN
   WRITE("call to PROC - I is ",I,", J is ",J,".");
   PRODUCT := I * J;
   END;

INTEGER PROCEDURE FUNC( INTEGER VALUE I, J );
    BEGIN
    WRITE("call to FUNC - I is ",I,", J is ",J,".");
    I * J
    END;

PROC(10, 20, P);
Q := FUNC(10, 20);
END.

The first is a proper procedure called PROC which takes two integer parameters which are initialized based on the first two arguments passed on the call to PROC (i.e. I is initialized to 10 and J is initialized to 20 in this example). It returns a result parameter PRODUCT (i.e. P will be assigned 200 just as the call to PROC returns).
The second is a function procedure called FUNC which takes two integer parameters and returns their product as the value of the procedure (the last thing before the closing END in a function procedure’s body must be an expression which is evaluated to compute the value of the procedure).
Here’s an example of a function procedure that takes no parameters:

BEGIN
INTEGER Q;

REAL PROCEDURE ZERO;
   BEGIN
   0
   END;

Q := ZERO;
END.

This is an expensive (and absurd) way to initialize Q to 0. Short procedures consisting of a single statement can dispense with the BEGIN-END block:

PROCEDURE HELLO;
    WRITE("Hello");

Algol W, like its predecessor Algol 60, also supports the call by name parameter passing mechanism:

BEGIN
INTEGER J;
PROCEDURE BYNAME(INTEGER MAGIC);
   BEGIN
   WRITE("MAGIC is ",MAGIC);
   J := J * 2;
   WRITE("MAGIC is now ",MAGIC);
   END;
J := 10;
BYNAME(J * 3);
END.

A call by name parameter is re-evaluated every time it is used. Consequently, the output of the above program would be:

MAGIC is 30
MAGIC is now 60

The potential for, shall we say, strange program behaviour when using call by name is great enough that the mechanism is rarely used.
A binary operator which combines the value of two function calls, at least one of which modifies a global variable and the other of which relies upon the same global variable, produces unpredictable results. For example, the output of the following program

BEGIN
INTEGER J;
INTEGER PROCEDURE LEFT;
   BEGIN
   J := J + 1;
   J
   END;
INTEGER PROCEDURE RIGHT;
   BEGIN
   J := J * 3;
   J
   END;
J := 2;
WRITE( LEFT * RIGHT );
END.

might be 27 or 42 depending on which of LEFT or RIGHT is called first when the operands of the * operator in the WRITE call are evaluated.
Like all the Algol languages, Algol W supports recursion. Here’s an excellent way to consume vast quantities of CPU time:

INTEGER PROCEDURE ACKERMANN(INTEGER VALUE X, Y);
    IF X = 0 THEN
	Y + 1
    ELSE IF Y = 0 THEN
	ACKERMANN(X - 1, 1)
    ELSE
	ACKERMANN(X - 1, ACKERMANN(X, Y - 1));

If you want to find out if your computer can stay up for a few thousand millennium, call this function with

ACKERMANN(1000,1000)

Block expressions

Although apparently not part of the original language, the 1972 version of Algol W supports block expressions. This is a generalization of how function procedure bodies are written.
It allows the following sort of nonsense to be written:

BEGIN
WRITE("the answer is ",
   BEGIN
   INTEGER X;
   X := 2 * 3;
   X
   END
   * 7
);
END.

In practical terms, block expressions aren’t used very often as they tend to make the program harder to read without adding any real value. Block expressions which modify global variables can also produce unpredictable results:

BEGIN
INTEGER J;
J := 2;
WRITE(
   BEGIN
   J := J + 1;
   J
   END
   *
   BEGIN
   J := J * 3;
   J
   END
);
END.

This program is equivalent to the example given above involving the function procedures LEFT and RIGHT.

Other statements

The IF statement is used to conditionally execute a single statement:

IF I > 10 THEN
   I := I * 2;

An ELSE clause can be included to deal with the alternative case:

IF I > 10 THEN
   I := I * 2
ELSE
   I := I - 3;

Notice that the statement preceding the ELSE isn’t terminated by a semi-colon. The reason is that there is actually only one statement here and it’s terminated by a semi-colon after the 3. If an ELSE clause exists then the statement after the THEN must be a simple statement (e.g. an assignment statement, a call to a proper procedure or a BEGIN-END block). Since an IF statement is not a simple statement, the following is invalid:

IF A < B THEN
    IF X > Y THEN
	X := 2
    ELSE
	Y := 2
ELSE
    Q := 2;

The following is a valid way of expressing what was intended above:

IF A < B THEN
    BEGIN
    IF X > Y THEN
	X := 2
    ELSE
	Y := 2;
    END
ELSE
    Q := 2;

An IF statement without an ELSE clause can have any statement for its body. Consequently, the following is valid:

IF A < B THEN
    IF X > Y THEN
	X := 2
    ELSE
	Y := 2;

as is

IF A < B THEN
    IF X > Y THEN
	X := 2
    ELSE
	IF X < Y THEN
	    X := 3
	ELSE
	    X := 5;

IMPORTANT: don’t let the indentation fool you.
This

IF A < B THEN IF X > Y THEN
X := 2
ELSE
IF X < Y THEN X := 3 ELSE X := 5;

is identical in all respects (except meaningless indentation) to the previous example! Just to be clear, the last two examples are also equivalent to this:

IF A < B THEN
    BEGIN
    IF X > Y THEN
	X := 2
    ELSE
	BEGIN
	IF X < Y THEN
	    X := 3
	ELSE
	    X := 5;
	END;
    END;

If the body of the THEN part or the ELSE part needs to be more than a single statement then a BEGIN-END block is required:

IF I > 10 THEN
   BEGIN
   I := I * 2;
   J := 2;
   END
ELSE
   BEGIN
   I := I - 3;
   J := 5;
   END;

The notion of block expressions is also available in what is called a conditional expression:

J :=
IF I > 10 THEN
   BEGIN
   I := I * 2;
   2
   END
ELSE
   BEGIN
   I := I - 3;
   5
   END;

This is functionally equivalent to the previous IF-THEN-ELSE example.

Here's an IF statement that takes advantage of the fact that the AND and OR operators are McCarthy-style (i.e. lazy evaluation like the C || and && operators):

IF PTR ¬= NULL AND XVAR(PTR) > 10 THEN
   BEGIN
   WRITE("XVAR is too big");
   ASSERT FALSE;
   END;

This also demonstrates the ASSERT statement which can be used to abort the program if an assumption fails. A shorter but more meaningful example of ASSERT would be:

ASSERT NOT ( PTR = NULL OR XVAR(PTR) <= 10 );

The program will terminate with a run-time assertion-failure if the logical (i.e. boolean) expression on the ASSERT statement is false. The first example is, arguably, better as the user gets a better error message prior to termination. The FOR statement is used when the number of iterations required is known prior to the start of the first iteration. For example, the following would print out all the integers from 1 to 10:

FOR I := 1 UNTIL 10 DO
   WRITE(I);

FOR loops can go backwards as well:

FOR I := 10 STEP -1 UNTIL 1 DO
   WRITE(I);

This gives you the integers from 10 down to 1. Once the direction has been determined (i.e. the sign of the STEP), the loop starts at the initial value and iterates until the termination value. If the initial value is after the termination value in an upwards loop or before the termination value in a downwards loop then the loop body is not executed. The initial, step and termination values are computed exactly once (i.e. changing the variables involved in their computation once the loop starts has no effect on how many times the loop goes around or what value is assigned to the iterating variable on each iteration).

The iterating variable, I in the above example, is implicitly declared by the FOR statement as an INTEGER (i.e. it exists only for the duration of the FOR loop) and it may not be assigned to within the body of the loop. There is no exact equivalent to the C language's break or continue statements although the language supports go to statements for the truly foolish (or very brave). There is also nothing equivalent to C's return statement.

The WHILE loop is used when the FOR statement isn't suitable (e.g. number
of iterations not known in advance or irregular step size).
A single example should suffice:

J := 1;
WHILE A(J) < 0 AND J <= 10 DO
   BEGIN
   J := J + 1;
   IF J = 3 THEN
      J := J + 1;
   END;

The body of a FOR or a WHILE statement can be any valid single statement (use a BEGIN-END block if a multi-statement body is required).

The CASE statement is somewhat analogous to the C language's switch statement although it operates rather differently. Here's an example:

CASE I - 1 OF
    BEGIN
    J := 2;
    J := 3;
    BEGIN J := 5; K := 8; END;
    K := J
    END;

The case selector expression (i.e. I - 1 in this example) is used to select a single statement within the body of the BEGIN-END block which is then executed. The value of the expression determines the ordinal number of the statement to execute. i.e. if I - 1 is 1 then the J := 2 statement is executed, if I - 1 is 2 then the J := 3 statement is executed, etc.
Note the use of an inner BEGIN-END block which allows the third statement to actually consists of two statements. It is a fatal run-time error if the value of the case selector expression is less than 1 or greater than the number of statements in the BEGIN-END block.