Pascal-S: A Subset and its Implementation N. Wirth 1. Aims and motivation Several years ago, the Computer Science Department of ETH Zurich had started to use the programming language Pascal in its introductory programming courses [1, 2] . These courses are taught mainly to engineers, physicists, and mathematicians in their first year. The large number of participants dictates the use of an efficient, and economical system; economical with regard to the students learning effort, to computing time, and to storage requirements. The fIrst demand requires a system with comprehensive syntax and runtime error checking and the provision of meaningful, well-explained diagnostics. Machine economy was realized through a combination of a compiler and a sub-batch monitor, the latter alternately invoking compilation and program execution. This scheme requires an absolutely watertight protection against errors in compiled programs and, of course, an entirely error-free compiler and monitor. The system developed for this purpose by R. Schild proved to be highly suc- cessful, and it turned out to be so economical with respect to computing time that the system's extensive use by about 350 students amounted to less than 0.5 per cent. of the entire computing services provided by the computation centre averaged over the entire term, although the student job batch was collected, run, and retur- ned four times per day. Nevertheless, there were also some disadvantages. First of all, there were only four fixed times when jobs were collected (and sometimes fewer due to machine failures). Then, the student jobs had to be handled separately from all other jobs separate batching and special job cards to be provided by the operators). These drawbacks and the advent of a separate medium-speed batch terminal located in the students' program preparation room indicated that a system with different characteristics might be more appropriate. The students' batch terminal allows for self-service. It is therefore highly desirable-if not mandatory-that each student's job be scheduled and run independently of other jobs. Fast turnaround can-under the prospect of large numbers of jobs- only be guaranteed, if the jobs use relatively little store. In fact, storage space is at a much higher premium than processor time. Under these conditions, a compact system is mandatory. Without compromising on the demand for extensive error checking, our interpretive solution appears as most promising because it allows for a simple compiler and dense code. Program size can be further .reduced if the system is restricted to handling that subset of Pascal which is actually taught in these introductory courses. Hence, the new system was intentionally designed to process a subset. The resulting reduction of development labour was an additonal incentive for this decision. In choosing an interpretive approach, one must be aware of and willing to accept a substantial loss of efficiency in program execution. A factor of 30 compared to reasonably good compiled code is not unusual, and a factor of 20 must be called 'very good'. Such factors can of course, only be accepted if the gains expected elsewhere are equally substantial. They can only be compensated if the execution effort of the compiled program is relatively small, say at least 20 times smaller than job initiation, compiler loading, compilation, and program loading together. In view of the everyday performance figures of common operating systems, this condition is indeed satisfied for very many problems that can successfully be asssigned as programming exercises. The most important single factor is gained by eliminating the need for a relocation loader. This is achieved by directly depositing the compiled code in store. As programs tend to be small, even the demand for storage economy cannot be used as a strong counter argument agains this strategy. The system described in this report consists of a compiler and an interpreter for a subset of Pascal called Pascal-S. Section 2 defmes that subset; it contains a complete syntax speeification in terms of concise syntax diagrams. These diagrams directly mirror the structure of the compiler. Section 3 provides an overview of the entrie system which is described as a single Pascal program. Some figures are provided concerning system size and performance. Section 4 explains the architecture of the hypothetical computer that executes compiled Pascal-S programs. Some typical program constructs are listed together with the code generated for them. The principles of operation of the computer become understandable through these sample constructs which at the same time pciture the task of translating Pascal-S into this code. Section 5 discusses the compiler itself, and it starts out with an explanation of the tables and their structure. used to represent the information given in a program's declarations. The explanations are necessarily terse and brief and incomplete. They are intended for people who already have some background on compilers; they are in particular referred to by Wirth [2], where the principles along which this sytem is constructed are taught and developed. For all details, the reader is referred to section 6 which is a full listing of the entire system. One may wonder about the value of including a program listing in extenso. But I think it is important and hope it will be useful. The primary value of the langu~ge and system Pascal is that it allows one to construct large programs that are useful and highly efficient in a form that can be read and cornrnunicated. The listing of thePascal-S system is intended to support this claim. It also proves that compiler and interpreter can be described in a machine-independent, well-structured form that nevertheless is effectively machine translatable. The relative brevity of the program (25 pages) also raises a new aspect of compiler portability; it is entirely possible to transport such a system by hand coding. The effort is at most one of a few man-months, even for a computer where nothing but symbolic assembly code or FORTRAN are available. 2. The language Pascal-S The choice of features to be included in the subset now called Pascal-S was mainly guided by the contents of traditional introductory prograrnrning courses. Beyond this it is subject to personal experience, judgement, and prejudice. A firm guideline was provided by the demand that the system must process a strict subset of Pascal, i.e. that every Pascal-S program must also be acceptable by the compiler of standard Pascal without being subjected to the slightest change. This rule makes it possible for students to switch over to the regular system in later courses 'without noticing'. The power of a language and its range of applications largely depend on its data types and associated operators. They also determine the amount of effort required to master a language. Pascal-S adheres in this respect largely to the tradition of ALGOL 60. Its primitive data types are the integers, the real numbers, and the Boolean truth values. They are augmented in a most important and crucial way by the type char, representing the available set of printable characters. Omitted from Pascal are the scalar types and sub range types. Pascal-S includes only two kinds of data strnctures: the array and the record (without variants). Omitted are the set and the file structure. The exceptions are the two standard textflles input and output which are declared implicitly (but must be listed in the program heading). A very essential omission is the absence of pointer types and thereby of all dynamic structures. Of course, also all packing options (packed records, packed arrays) are also omitted. The choice of data types and structures essentially determines the complexity of a processing system. Statement and control strnctures contribute very little to it. Hence, Pascal-S includes most of Pascal's statement structures (compound, conditional, selective, and repetitive statements). The only omissions are the with and the goto statement. The latter was omitted very deliberately because of the principal use of Pascal-S in teaching the systematic design of well-structured programs. Procedures and functions are included in their fUll generality. The only exception is that procedues and functions cannot be used as parameters. The detailed syntax of Pascal-S can be seen from the syntax diagrams which are included in the Appendix. They reveal a simple and consistent language that can be learned in toto in a very short time, yet encompasses many of the truly fundamental concepts of programming. Teaching experience over many years has shown that the concept of the sequence (sequential file) is of fundamental importance for the understanding of many computing practices and techniques. In spite of the absence of declarable files, it can be taught quite well with Pascal-S because of the presence of the two standard textflles. We have deliberately decided to exclude the primitive operators put and get, and have restricted file operations to read (on input) and write (on output). Also included are the simple but flexible 'formatting' facilities of the Pascal write statement. They proved to be not only useful and desirable from the point of view of utility but, indeed, also quite simple to teach. (Actually, they do not even need to be taught; students learn to use them quite naturally, as they are entriely free of pitfalls.) The standard objects available in Pascal-S are: Constants: true, false Types: integer, real, boolean, char Functions: abs, sqr, odd, chr, ord, succ, pred, round, tronc, sin, cos, exp, In, sqrt, arctan, eof. eoln, Procedures: read, readln, write, writeln, Functions pred and succ are only applicable to arguments of type char. The argument of ord can be of type char, boolean, or integer. For further details concerning the language, the reader is.referred to the literature [1,3]. It is noteworthy that the subset corresponds largely to that part of the language Pascal which is covered in a textbook for an introductory programming course [3,4]. 3. The implementation The Pascal-S system is described as a compiler that translates Pascal-S programs into code for a hypothetical stack computer especially designed for this purpose. This comouter is itself defmed as an algorithm, called the interpreter of the compiled code. Bothe compiler and interpreter are described in a largely machine-independent way by using the high-level language Pascal exclusively. In fact, these two parts form a single Pascal program. It is listed in section 6. The advantages of a description using a high-level language are particularly apparent during the development of a system, but are equally significant if it has to be transported and adapted to a different computer. In fact, Pascal-S can be implemented immediately on all machines where a full Pascal compiler is available. Of course, the success of such an automatically generated system crucially depends on the quality of the tool compiler. The Pascal 6000-3.4 compiler used at ETH on the CDC 6400 computer generates high-quality code, and recompiles the entire Pascal-S sytem in 20 sec. The disadvantage of an interpretive system is its relatively large overhead during program execution. Experience has shown, however, that for small programs the expense for central processor utilization for interpretation is anyway quite small. Some actual figures are given for a few sample programs in section 7. Exercises requiring less than 4 sec of computer time (costing less than SFr. 2.-) are predominant. Of course, exercises will have to be carefully chosen, particularly in numerical mathematics. Apart from its machine-independent specification, the following characteristics are noteworthy: (1) The system resides on the disk store as an absolute binary overlay file; hence, loading is fast. (2) The storage space needed is reasonably small. On the CDC 6000 computer it requires lO,OOO 60-bit words, including data and I/O buffers. (3) The compiler is designed to recover from syntax errors and to proceed after emitting a diagnostic keyword. This policy usually allows many errors to be detected from a single compilation. A significant effort is made to suppress so-called 'spurious' error messages, i.e. indication of irregular situations that are due to previously reported errors. (4) Control is not returned to the operating system between compilation and execution. No loader is invoked nor is secondary storage accessed to deposit j the generated code. This resulted in a very significant saving of overhead and cost. (5) A copy of the input data is made on the output file immediately after compilation. This is often an invaluable aid to consultants and tutors. (6) The interpretation steps needed for program exection are counted, and provide a precise, reproducible hardware-independent measure for the actual computational effort expended. (7) By exchanging a single operating system control card the student may switch over to the regular, full Pascal compiler. Pascal-S is a true subset of standard Pascal. (8) If an error is detected at run time, execution is aborted and a listing of the names and values of all currently accessible variables is printed, together with the coordinate of the point of interruption and an indication of the reason for the interruption (post-mortem dump). (9) The system requires no access to secondary store except for the standard input and output files. The amount of these data is usually so small that a single access (block transfer) is sufficient for each file. 4. The interpreter The Pascal-S system consists of two main parts: compiler and interpreter. Their principal interface is the array variable to which the compiler asssigns the generated code. The interpreter itself is formulated as a procedure which is called after succesful compilation. The interpreter describes a straightforward stack computer, consisting of a store S organized as a stack, two index registers T and B which control the stack, a program counter PC, an instruction register IR, a program status register PS, and a DISPLA Y used to speed up the addressing mechanism. Each element of the stack represents either an integer, a real number, a logical value, or a character. The principal structure is procedure interpret; begin initialize registers and auxiliary counters; repeat ir;= code (pc] ;pc ;= pc+1; interpret (ir) until ps =1= mn; if ps =1= fin then postmortemdump end Each instruction (order) is characterized by an order code f with values between 0 and 63. Orders with values, 0,1,2,3 have two parameters x andy. Instructions 0, 1, 2 generate an address of the data element on the stack with offset y in the currently active data segment on level x. Orders with codes from 8 to 30 have a single parameter y whose meaning differs in the individual cases. Orders with codes 31...63 have no parameters and are operators whose arguments are the elements on the top of the stack. We refrain from introducing a complete set of mnemonics of the orders. Instead, short key words are given as comments in the listing of the interpreter where necessary. The individual routines are in most cases simple enough to make any commentary superfluous. We can therefore restrict ourselves to a presentation of the general layout of the stack and of the patterns of emitted code for specific language constructs. These will show where individual orders are used, and thereby make their principles of operation understandable. 4.1 Storage layout and procedure calls Each stack element may either be an integer, a real number, a logical value, or a character. Integers are also used as stack indices. Each activated procedure reserves a stack section on the top. The beginning of each section is designated by a mark which contains a pointer to the previous section. These pointers form the so-called dynamic link, as they record the dynamic history of procedure activations. There is also a static link, which connects those sections that belong to procedures declared within each other. This chain designates all segments which are to be currently accessible. The static chain starting with the currently active procedure is also copied into the short array called display. This is done to speed up data access. The first five locations of each stack section are occupied by the so-called section mark. The mark contains the two links, the return address, a pointer (index) to the symbol table, and the result table (used by function procedures only). All data are accessed by an offset address relative to the section origin, or via the top stack pointer. Subsequent locations are used for procedure parameters, followed by local variables. The top of the stack is used for intermediate results in evaluating expressions. Stack sections are 'allocated' when procedures are activated by changing the stack pointers and setting up the two links. First, the stack is 'marked' i.e. a space for the block mark is reserved. Then the actual parameters are processed. In the case of value parameters the actual values are loaded onto the stack; in the case ofvariable parameters, stack addresses are loaded instead of values. Finally, the procedure call order changes the Band T registers and assigns the links and return address to the section mark. Upon procedure exit, the return order reverses the operations performed by the call order. If the static level of the current procedure is lower than the one of the procedure to which control returns, then the display has to be updated. This is done by a separate order which indicates the two levels. 4.2 Control structures Control structures are translated into sequences "of instructions containing jump orders to the following patterns: (1) if b then s (2) if b then sl else s2 (3) while b do s (4) repeat s until b (5) for i := a to b do s (6) for i := a downto b do s (7) case e of v1: s1;v2: s2; ... ;vn: sn end (1) code (b) (2) code (b) conditional jump to L conditional jump to L 1 code (s) code (sl) L: ... jump to L2 L1: code (s2) L2: ... (3) L1: code (b) (4) L: code (s) conditional jump to L2 code (b) code (s) conditional jump to L jump to L 1 L2: ... (5) load address i (6) load address i code (a) code (a) code (b) code (b) for1upL2 for1downL2 L1: code (s) L1: code (s) for2up L1 for2down L1 L2: ... L2: .,. (7) code (e) search switchlist L Ll: code (sl) jump to K L2: code (s2) jump to K Ln: code (sn) jump to K L: (vl,Ll) (v2,L2) (vn,Ln) K: ... In the case of the for statement, the address of the control variable and the two limit values are left on the stack during execution of the repeated statement. The switch order used in the case statement performs a simple, linear search through the switch list, comparing the value on top of the stack with entries vl...vn. If a match vi is found, a jump to Li is executed. 4.3 Post-mortem dump If interpretation of the code leads to an error condition, execution is terminated and a symbolic post-mortem dump is generated. Detected error conditions are: (a) division by zero, (b) selector value of a case statement out of range, (c) array index out of bounds, (d) stack overflow, (e) line limit exceeded (too many lines), (f) output line too long, and (g) attempt to read beyond the end of the input file. The post-mortem consists of a list of the currently active procedures with an indication of their activation points, and (for each procedure) a list of its local variables and their current values. In order to keep the amount of information reasonably small, only unstructured variables are listed, as they usually contain the information relevant for detecting the cause of a trap. An example of a dump follows: 0 program run error (output); 0 var i: integer; b: boolean; x: real; 0 0 functionf(m,n: integer): integer; 0 begin f:= f(n, m mod n) end; 9 9 begin x:= 9.87654321; b:= true, i:= f(511,31) 20 end. (eof) Halt at 6 because of division by 0 f called at 7 n = 0 m = 1 f called at 7 n = 1 m = 15 f called at 7 n = 15 m = 31 f called at 21 n = 31 m = 511 x = 9.8765432100000E+OOO b = true i = 8506 38 steps 5. The compiler Pascal is a language that can be parsed with a lookahead of a single symbol. The compiler therefore uses the simple and efficient method of top-down parsing with one-symbol lookahead. It is organized as a set of procedures, each representing a specific sentential construct and parsing goal. These procedures may activate each other recursively, just as certain sentential constructs occur recursively. The parsers obtain their input through a scanner called insymbol. This scanner reads the input file character by character, and delivers the next Pascal symbol each time it is called. For this purpose, the scanner requires a lookahead of one character. The total lookahead of the compiler is therefore one symbol plus character. For further information on the principles of operation of a top-down, recursive descent compiler the reader is referred to Wirth[5]. When working his way through the compiler listing, the reader is advised to start with the scanner called insymbol. The next symbol read is assigned to the global I variable sy (which represents the symbollookahead). The scanner receives its input by calling procedure nextch, which assigns the next character read to the global r variable ch (which represents the character lookahead). If an identifier is encountered, the actual identifier is assigned to the global variance id, and if it is a number, its value is assigned to the global variables inum or rnum. If the symbol is a string, the string is directly assigned to the string table stab (note that strings may occur only as parameters in write statements). The set of procedures used to parse and translate Pascal-S programs closely mirrors the syntactic structure of the- language. The reader is advised to consult the syntax diagrams, as they represent abstract flow-charts of the parser procedures. It is useful to keep the following compiler excerpt in mind, which mirrors the way in which the compiler is partitioned and exhibits the interdependence of the principal procedures (see also the procedure-dependence diagram in the Appendix). block constant typ arraytyp parame terlist constantdeclaration typedeclaration variabledeclaration proceduredeclaration statement selector call expression simpleexpression term factor standard functions assignment compound statement if statement case statement repeat statement while statement for statement standard procedures Much effort is spent providing robustness against ill-formed input recovery and obtaining sensible error diagnostics. To achieve this aim, a systematic approach to syntax error handling is used [5, 6]. Its main principle is that each parser always returns control after having advanced up to a symbol that may legally follow the sentential construct that the parser is supposed to process. If the input program contains errors, this goal is usually achieved by skipping text until such an acceptable symbol is encountered. For this purposed, procedures skip and test are used. The scheme requires that each parser know the set of symbols that may legally follow its sentential construct in the current context. To this aim, each parser is provided with a parameter indicating that set of so-called follow symbols (called fsys). This set is, however, augmented by certain key symbols which are never to be ignored. Typically, these key symbols are those which head a specific sentential construct, such as begin, if, type, etc. Hence, these parameters do not necessarily specify legal follow symbols, but rather the symbols where a possible skip has to terminate. The following test program shows the compiler's handling of syntactically illformed texts. Adequate recovery from syntactic errors is indeed a crucial criterion for a system to be used in an environment where errors occur frequently. A list of brief explanations to the error numbers is included in the Appendix. 0 program syntaxerror (output); 0 const m = 10, n := 20 **** ^14 ^16 0 type t = array 1..10 of real; **** ^14 ^11 ^12 0 r := record x: real, **** ^16 ^14 0 b,c: boolean 0 end 0 var i: integer; **** ^14 0 p,q: boolean;x,y: real;; **** ^ 6 0 i: integer, ch: char **** ^ 1 ^14 0 a: array (l..m) of integer; **** ^14 ^ll ^12 0 const y = 3.14159; **** ^56 0 begin i := x m := i **** ^ 6 3 if b do p=(porq); **** ^14 ^52 ^51 ^14 12 while j < 10 **** ^ 0 14 begin k:= .5(x-y; y:=x) **** ^35 ^14 ^ 6 18 end 19 if p then p = 1; else i := 2; **** ^14 ^51 ^14 24 repeat x := p + i*(x>y); **** ^33 30 for x = 1 to q **** ^18 31 begin i:=a[2 I **** ^19 36 until i=j **** ^28 ^ 0 39 for j ;= 1 to n while x > 0 do **** ^35 ^54 50 begin a(j] :=a[j+1);read(i) **** ^11 ^ 0 ^28 ^20 59 end. **** ^14 program incomplete key words 0 undef id 1 multi def 4 ) 6 syntax 11 [ 12 ] 14 ; 16 = 18 convar typ 19 type 20 prog.param 28 no array 33 arith type 35 types 51 := 52 then 54 do 56 begin The compiler can functionally be subdivided into two main parts: the part processing declarations and the part processing statements and expressions. Their common interface is the symbol table tab and further associated tables. They are constructed by the declaration processing part, and constitute the necessary context in which the program statements are to be compiled. Knowledge of the struc- ture of these tables is therefore of fundamental importance. The key table is tab. Each declared identifier causes one entry. All entries of identifiers local to the same procedure (block) are linked together. Note that the compiler program is written without using any dynamic structures and pointers. Hence, a linked chain is represented by explicit array indices. The field called obi indicates whether the identifier denotes a constant, a variable, a type, a procedure, or a function. Its type is specified by the field called typo The meaning of the remaining fields varies according to the object and type. If the type of an entry is an array type, then the ref field is an index. to the table of array structures called aref; if it is a record type, the ref field contains an index to the table of records and blocks called btab. The Boolean field nonnal specifies whether an entry is an actual (normal) variable to be addressed directly or a formal parameter to be addressed indirectly. The fields lev and adr specify the address pair of a variable or procedure. If the entry denotes a constant and the constant is of type integer, Boolean, or char, then the adr field indicates its value. If its type is real, the adr field specifies an index of the table of real numbers called rconst. The array table atab specifies for each array structure its index type and index bounds, its element type (eltyp and elref, where the latter is used in analogy to the field ref above), and its size in terms of storage elements. For the sake of convenience only, the size of an element is also present, although it could easily be derived via eltyp and elref. Each procedure and each record type defmition causes an entry in the table of 'blocks' called btab. It contains pointers to the last identifier (in tab) defmed local I to the block and to the last parameter of the corresponding procedure. Note that all previous entities can be accessed through the linked chain. Morever, the entry specifies the storage size needed to represent the set of variables belonging to the respective record of procedure. A sample program and the structure of the tables constructed during its compilation are shown below. These tables are derived from the auxiliary output generated by the compiler itself. Note that these tables are not released (collapsed) after exit from a block, as the information gathered may be required to generate a post- mortem dump. They are also accessed by some orders during program execution, e.g. by procedure calls and index orders. This makes it unnecessary to copy certain information into the code (e.g. array-index bounds and data-segment lengths), thereby contributing to code density. 0 program testO(output); 0 const ten = 10; plus = '+'; 0 type row = array [I..ten] of real; 0 complex = record re,im: real end; 0 var i, j: integer; 0 p: boolean; 0 z: complex; r 0 matrix: array [-3..+3] of row; 0 pattern: array [1..5,1..5] of char; 0 0 procedure dummy(var i: integer; var z: complex); 0 var u,v: row; 0 hl,h2: record c: complex;r: row 0 end; 0 0 function null(x,y:real; z: complex): boolean; 0 var a: array ['a'..'z'] of complex; 0 u: char; 0 begin while x j then i := i-j else j := j-i 46 until i = j; 53 writeln(i) 55 end. Identifiers: link obj typ ref nrm lev adr 29 ten 0 0 1 0 0 1 10 30 plus 29 0 4 0 0 1 37 31 row 30 2 5 1 0 1 10 32 complex 31 2 6 3 0 1 2 33 re 0 1 2 0 1 2 0 34 im 33 1 2 0 1 2 1 35 i 32 1 1 0 1 1 5 36 j 35 1 1 0 1 1 6 37 p 36 1 3 0 1 1 7 38 z 37 1 6 3 1 1 8 39 matrix 38 1 5 2 1 1 10 40 pattern 39 1 5 3 1 1 80 41 dummy 40 3 0 4 1 1 16 42 i 0 1 1 0 0 2 5 43 z 42 1 6 3 0 2 6 44 u 43 1 5 1 1 2 7 45 v 44 1 5 1 1 2 17 46 hI 45 1 6 5 1 2 27 47 h2 46 1 6 5 1 2 39 48 c 0 1 6 3 1 3 0 49 r 48 1 5 1 1 3 2 50 null 47 4 3 6 1 2 0 51 x 0 1 2 0 1 3 5 52 Y 51 1 2 0 1 3 6 53 z 52 1 6 3 1 3 7 54 a 53 1 5 5 1 3 9 55 u 54 1 4 0 1 3 61 Blocks: last 1par psze vsze 1 28 1 0 0 2 41 28 5 105 3 34 0 0 2 4 50 43 7 51 5 49 0 0 12 6 55 53 9 62 Arrays: xtyp etyp eref low high elsz size 1 1 2 0 1 10 1 10 2 1 5 1 -3 3 10 70 3 1 5 4 1 5 5 25 4 1 4 0 1 5 1 5 5 4 6 3 1 26 2 52 Code: 0 13 5, 13 6, 41 , 11 10, 03 5, 5 13 5, 25 1, 54 , 38 , 10 0, 10 03 0, 13 5, 13 6, 39 , 38 , 15 33 , 01 7, 18 50, 02 27, 34 , 20 02 39, 9 1, 34 , 12 6, 22 2, 25 19 8, 38 , 32 , 01 5, 24 85, 30 38 , 01 6, 24 51, 38 , 11 5, 35 11 6, 49 , 11 44, 0 5, 11 5, 40 11 6, 53 , 38 , 10 49, 01 6, 45 11 6, 11 5, 53 , 38 , 11 5, 50 11 6, 45 , 11 34, 1 15, 29 1, 55 63 , 31 , (eof) 17 51 steps 6. Machine-dependencies Every program must be tailored to the facilities that are available in the language and the computing system used. It is desirable to restrict these considerations to the high-level language (in this case Pascal) and to ignore the actual hardware. In the case of a program to be used thousands of times and every day by many people, this is only possible and economically justifiable if the language successfully hides the hardware without causing appreciable loss in efficiency. Although the Pascal 6000- 3.4 compiler satisfies this requirement to a high degree, we nevertheless had to resort to facilities that are not available in standard Pascal in a few instances. These facilities are particular to the Pascal implementation on the CDC computer and are listed in detail below. Their use introduces what may be called First-order machine- dependencies. The only such facility used is the segmented file. It allows one to recognize a substructure of the file called segment (in Pascal terminology) or 'logical record' (in CDC terminology). The fact that the file input is to be treated like a segmented file is indicated by a plus sign in the program parameter list (implemented through Update 10 of Pascal 6000-3.4). A job is represented in the CDC operating system as an input file consisting of three segments; control statements for the operating system, program, and data. The recognition of this substructure is essential for Pascal-S in order to skip backwards to list the input data segment, and it is desirable in order to keep the rules for setting up a Pascal-S job deck identical to those for all other jobs, particularly normal Pascal jobs. The reader is referred to Jensen and Wirth [1], manual section 13.A.1, for an explanation of the procedure getseg and the predicate eos. But even when strictly adhering to a machine-independent language such as standard Pascal, a second kind of machine-dependent consideration creeps in if a program is carefully planned. I shall call them second-order machine-dependencies. They are due to the use of knowledge about limitations and characteristics of the underlying system and the desire to use it optimally. They may cause another implementation to reject the program (if its limitations are more severe) or merely to process the program less efficiently. The first category concerns, for instance, the range of available integers or-much more problematic in Pascal-the size of allowable sets. The second category includes considerations of storage structure. In the present system program, such considerations played an important role in achieving high efficiency and economy, and are manifested in the choice of several constants which are all defmed in the beginning of the program. The choice of these constants, explained below, must be reconsidered if Pascal-S is reimplemented on a different computer, be it by hand translation or by recompilation through an already available Pascal compiler. (The reader not concerned with this problem may easily skip the rest of this section.) alng defines the number of characters in the array type alfa. In a word-oriented computer the choice of this value is critical, and should be the number of characters packable into a word or a small number of words. ling defines the maximum length of an input line delivered by the operating system. emax is the maximum value of the decimal exponent of a real number acceptable by the computer. emax = log(maxreal). emin is the minimum value of the decimal exponent. (Smaller numbers are considered as identical to O. For most computers emin = -emax.) kmax is the number of significant digits in a real number, i.e. kmax = m log 2, if m is the number of bits of the mantissa of the binary floating-point number. ermax is the number of error messages available. It is chosen such that type e"s was acceptable to the available Pascal compiler. omax,lmax,nmax were chosen such that the packed record type order could be represented in a single 60-bit word of the CDC computer. xmax is chosen such that the sub range type index occupies a reasonably small part of a word to provide high storage economy in packed tables, yet encompasses a sufficiently large subset of the integers to cover all array-index values. lineleng is equal to the maximum number of characters permissable in a line to be printed. Further comments about second-order machine-dependencies follow. The type order allows for negative values of components f and x, although this is not needed. The reason is that access to signed fields of packed records is more efficient in the Pascal 6000-3.4 system, and because these accesses are very frequent in the interpreter. The number of basic symbols much be such that the type symset is acceptable to the available Pascal compiler. The array sps is a constant table used by the scanner. Its index range must be such that it covers all characters which are neither letters nor digits. For the sake of character set independence, its range is indicated as being the entire character set. The set constants used in the scanner depend on the assertion that ch in ['a'..'z'] = ch is a letter ch in ['0'..'9'] = ch is a digit Most available character sets satisfy these equivalences. Several statements in the scanner rely on them by computing the numeric value of a digit x as ord(x )-ord ('0'). This is the only r~quirement imposed on the ordering of character sets. The present version of Pascal-S assumes the use of the restricted ASCII character set. Unused characters cause an error indication when encountered. In the interpreter, the functions chr and ord are implemented as dummy procedures, because we postulate for characters the ordering given by the collating sequence of the given character set (thereby accepting an implicit machine-dependence). The constants 0 and 63 are the ordinal numbers of the first and the last character in the given set. Note that an element in the stack may represent a value of any of the four standard types integer, real, boolean, or char. This implies that the same amount of storage is allocated for values of these types. This results in uneconomical utilization of store, particularly for Boolean and character values, but it simplifies both compiler and interpreter considerably. With regard to real numbers, a floating-point representation should be chosen that uses the same number of words or bytes as integers, since a very high numerical precision is usually not required in the types of problems for which Pascal-S is intended. A particular problem of practical importance is the regaining of control in the case of a trap performed by the hardware or the underlying operating system. No solution for this problem is indicated here, as it is inherently dependent on the environment. 7. The compiler-interpreter program program Pascals(input+,output); (* 1.6.75 *) (* N. Wirth, E.T.H CH-8092 Zurich *) label 99; const nkw = 27; (* no. of key words *) alng = 10; (* no. of significant chars in identifiers *) llng = 120; (* input line length *) emax = 322; (* max exponent of real numbers *) emin = -292; (* min exponent *) kmax = 15; (* max no. of significant digits *) tmax = 100; (* size of table *) bmax = 20; (* size of block-table *) amax = 30; (* size of array-table *) c2max = 20; (* size of real constant table *) csmax = 30; (* max no. of cases *) cmax = 850; (* size of code *) lmax = 7; (* maximum level *) smax = 600; (* size of string table *) ermax = 58; (* max error no. *) omax = 63; (* highest order code *) xmax = 131071; (* 2**17 - 1 *) nmax = 281474976710655; (* 2**48 - 1 *) lineleng = 136; (* output line length *) linelimit = 200; stacksize = 1500; type symbol = (intcon, realcon, charcon, stringt, notsy, plus, minus, times, idiv, rdiv, imod, andsy, orsy, egl, neg, gtr, geg, lss, leg, lparent, rparent, lbrack, rbrack, comma, semicolon, period, colon, becomes, constsy, typesy, varsy, functionsy, proceduresy, arraysy, recordsy, programsy, ident, beginsy, ifsy, casesy, repeatsy, whilesy, forsy, endsy, elsesy, untilsy, ofsy, dosy, tosy, downtosy, thensy); index = -xmax .. +xmax; alfa = packed array [1..alng] of char; object = (konstant, variable, typel, prozedure, funktion); types = (notyp, ints, reals, bools, chars, arrays, records); symset = set of symbol; typset = set of types; item = record typ: types; ref: index; end; order = packed record f: -omax..+omax; x: -lmax..+lmax; y: -nmax..+nmax; end; var sy: symbol; (* last symbol read by insymbol *) id: alfa; (* identifier from insymbol *) inum: integer; (* integer from insymbol *) rnum: real; (* real number from insymbol *) sleng: integer; (* string length *) ch: char; (* last character read from source program *) line: array [1..llng] of char; cc: integer; (* character counter *) lc: integer; (* program location counter *) ll: integer; (* length of current line *) errs: set of 0..ermax; errpos: integer; progname: alfa; iflag, oflag: boolean; constbegsys, typebegsys, blockbegsys, facbegsys, statbegsys: symset; key: array [1..nkw] of alfa; ksy: array [1..nkw] of symbol; sps: array [char] of symbol; (* special symbols *) t, a, b, sx, c1, c2: integer; (* indicies to tables *) stantyps: typset; display: array [0..lmax] of integer; tab: array [0..tmax] of (* identifier table *) packed record name: alfa; link: index; obj: object; typ: types; ref: index; normal: boolean; lev: 0..lmax; adr: integer; end; atab: array [1..amax] of (* array-table *) packed record inxtyp, eltyp: types; elref, low, high, elsize, size: index; end; btab: array [1..bmax] of (* block table *) packed record last, lastpar, psize, vsize: index end; stab: packed array [0..smax] of char; (* string table *) rconst: array [1..c2max] of real; code: array [0..cmax] of order; procedure errormsg; var k: integer; msg: array [0..ermax] of alfa; begin msg[ 0] := 'undef id '; msg[ 1] := 'multi def '; msg[ 2] := 'identifier'; msg[ 3] := 'program '; msg[ 4] := ') '; msg[ 5] := ': '; msg[ 6] := 'syntax '; msg[ 7] := 'ident, var'; msg[ 8] := 'of '; msg[ 9] := '( '; msg[10] := 'id, array '; msg[11] := '[ '; msg[12] := '] '; msg[13] := '.. '; msg[14] := '; '; msg[15] := 'func. type'; msg[16] := '= '; msg[17] := 'boolean '; msg[18] := 'convar typ'; msg[19] := 'type '; msg[20] := 'prog.param'; msg[21] := 'too big '; msg[22] := '. '; msg[23] := 'typ (case)'; msg[24] := 'character '; msg[25] := 'const id '; msg[26] := 'index type'; msg[27] := 'indexbound'; msg[28] := 'no array '; msg[29] := 'type id '; msg[30] := 'undef type'; msg[31] := 'no record '; msg[32] := 'boole type'; msg[33] := 'arith type'; msg[34] := 'integer '; msg[35] := 'types '; msg[36] := 'param type'; msg[37] := 'variab id '; msg[38] := 'string '; msg[39] := 'no.of pars'; msg[40] := 'type '; msg[41] := 'type '; msg[42] := 'real type '; msg[43] := 'integer '; msg[44] := 'var, const'; msg[45] := 'var, proc '; msg[46] := 'types (:=)'; msg[47] := 'typ (case)'; msg[48] := 'type '; msg[49] := 'store ovfl'; msg[50] := 'constant '; msg[51] := ':= '; msg[52] := 'then '; msg[53] := 'until '; msg[54] := 'do '; msg[55] := 'to downto '; msg[56] := 'begin '; msg[57] := 'end '; msg[58] := 'factor '; k := 0; writeln; writeln(' key words'); while errs <> [] do begin while not (k in errs) do k := k+1; writeln(k,' ',msg[k]); errs := errs - [k] end end (* errormsg *); procedure nextch; (* read next character; process line end *) begin if cc = ll then begin if eos(input) then begin writeln; writeln(' program incomplete'); errormsg; goto 99 end; if errpos <> 0 then begin writeln; errpos := 0 end; write(lc:5, ' '); ll := 0; cc := 0; while not eoln(input) do begin ll := ll+1; read(ch); write(ch); line[ll] := ch end; writeln; ll := ll+1; read(line[ll]); end; cc := cc+1; ch := line[cc]; end (* nextch *); procedure error(n: integer); begin if errpos = 0 then write(' ****'); if cc > errpos then begin write(' ': cc-errpos, '^', n:2); errpos := cc+3; errs := errs + [n] end end (* error *); procedure fatal(n: integer); var msg: array [1..7] of alfa; begin writeln; errormsg; msg[ 1] := 'identifier'; msg[ 2] := 'procedures'; msg[ 3] := 'reals '; msg[ 4] := 'arrays '; msg[ 5] := 'levels '; msg[ 6] := 'code '; msg[ 7] := 'strings '; writeln(' compiler table for ', msg[n], ' is too small'); goto 99 (* terminate compilation *) end (* fatal *); procedure insymbol; (* reads next symbol *) label 1, 2, 3; var i, j, k, e: integer; procedure readscale; var s, sign: integer; begin nextch; sign := 1; s := 0; if ch = '+' then nextch else if ch = '-' then begin nextch; sign := -1 end; while ch in ['0'..'9'] do begin s := 10*s + ord(ch) - ord('0'); nextch end; e := s*sign + e end (* readscale *); procedure adjustscale; var s: integer; d, t: real; begin if k+e > emax then error(21) else if k+e < emin then rnum := 0 else begin s := abs(e); t := 1.0; d := 10.0; repeat while not odd(s) do begin s := s div 2; d := sqr(d) end; s := s-1; t := d*t until s = 0; if e >= 0 then rnum := rnum*t else rnum := rnum/t end end (* adjustscale *); begin (* insymbol *) 1: while ch = ' ' do nextch; if ch in ['a'..'z'] then begin (* word *) k := 0; id := ' '; repeat if k < alng then begin k := k+1; id[k] := ch end; nextch until not (ch in ['a'..'z', '0'..'9']); i := 1; j := nkw; (* binary search *) repeat k := (i+j) div 2; if id <= key[k] then j := k-1; if id >= key[k] then i := k+1 until i > j; if i-1 > j then sy := ksy[k] else sy := ident end else if ch in ['0'..'9'] then begin (* number *) k := 0; inum := 0; sy := intcon; repeat inum := inum*10 + ord(ch) - ord('0'); k := k+1; nextch until not (ch in ['0'..'9']); if (k > kmax) or (inum > nmax) then begin error(21); inum := 0; k := 0 end; if ch = '.' then begin nextch; if ch = '.' then ch := ':' else begin sy := realcon; rnum := inum; e := 0; while ch in ['0'..'9'] do begin e := e-1; rnum := 10.0*rnum + (ord(ch)-ord('0')); nextch end; if ch = 'e' then readscale; if e <> 0 then adjustscale end end else if ch = 'e' then begin sy := realcon; rnum := inum; e := 0; readscale; if e <> 0 then adjustscale end; end else case ch of ':': begin nextch; if ch = '=' then begin sy := becomes; nextch end else sy := colon end; '<': begin nextch; if ch = '=' then begin sy := leg; nextch end else if ch = '>' then begin sy := neg; nextch end else sy := lss end; '>': begin nextch; if ch = '=' then begin sy := geg; nextch end else sy := gtr end; '.': begin nextch; if ch = '.' then begin sy := colon; nextch end else sy := period end; '''': begin k := 0; 2: nextch; if ch = '''' then begin nextch; if ch <> '''' then goto 3 end; if sx+k = smax then fatal(7); stab[sx+k] := ch; k := k+1; if cc = 1 then begin (* end of line *) k := 0; end else goto 2; 3: if k = 1 then begin sy := charcon; inum := ord(stab[sx]) end else if k = 0 then begin error(38); sy := charcon; inum := 0 end else begin sy := stringt; inum := sx; sleng := k; sx := sx+k end end; '(': begin nextch; if ch <> '*' then sy := lparent else begin (* comment *) nextch; repeat while ch <> '*' do nextch; nextch until ch = ')'; nextch; goto 1 end end; '+', '-', '*', '/', ')', '=', ',', '[', ']', '#', '&', ';': begin sy := sps[ch]; nextch end; '$', '%', '@', '\', '~', '{', '}', '^': begin error(24); nextch; goto 1 end end end (* insymbol *); procedure enter(x0: alfa; x1: object; x2: types; x3: integer); begin t := t+1; (* enter standard identifier *) with tab[t] do begin name := x0; link := t-1; obj := x1; typ := x2; ref := 0; normal := true; lev := 0; adr := x3 end end (* enter *); procedure enterarray(tp: types; l, h: integer); begin if l > h then error(27); if (abs(l)>xmax) or (abs(h)>xmax) then begin error(27); l := 0; h := 0; end; if a = amax then fatal(4) else begin a:= a+1; with atab[a] do begin inxtyp := tp; low := l; high := h end end end (* enterarray *); procedure enterblock; begin if b = bmax then fatal(2) else begin b := b+1; btab[b].last := 0; btab[b].lastpar := 0 end end (* enterblock *); procedure enterreal(x: real); begin if c2 = c2max-1 then fatal(3) else begin rconst[c2+1] := x; c1 := 1; while rconst[c1] <> x do c1 := c1+1; if c1 > c2 then c2 := c1 end end (* enterreal *); procedure emit(fct: integer); begin if lc = cmax then fatal(6); code[lc].f := fct; lc := lc+1 end (* emit *); procedure emit1(fct, b: integer); begin if lc = cmax then fatal(6); with code[lc] do begin f := fct; y := b end; lc := lc+1 end (* emit1 *); procedure emit2(fct, a, b: integer); begin if lc = cmax then fatal(6); with code[lc] do begin f := fct; x := a; y := b end; lc := lc+1 end (* emit2 *); procedure printtables; var i: integer; o: order; begin writeln('0identifiers link obj typ ref nrm lev adr'); for i := btab[1].last +1 to t do with tab[i] do writeln(i, ' ', name, link:5, ord(obj):5, ord(typ):5, ref:5, ord(normal):5, lev:5, adr:5); writeln('0blocks last lpar psze vsze'); for i := 1 to b do with btab[i] do writeln(i, last:5, lastpar:5, psize:5, vsize:5); writeln('0arrays xtyp etyp eref low high elsz size'); for i := 1 to a do with atab[i] do writeln(i, ord(inxtyp):5, ord(eltyp):5, elref:5, low:5, high:5, elsize:5, size:5); writeln('0code:'); for i := 0 to lc-1 do begin if i mod 5 = 0 then begin writeln; write(i: 5) end; o := code[i]; write(o.f:5); if o.f < 31 then if o.f < 4 then write(o.x:2, o.y:5) else write(o.y:7) else write(' '); write(',') end; writeln end (* printtables *); procedure block(fsys: symset; isfun: boolean; level: integer); type conrec = record case tp: types of ints, chars, bools: (i: integer); reals: (r: real); notyp, arrays, records: (); end; var dx: integer; (* data allocation index *) prt: integer; (* t-index of this procedure *) prb: integer; (* b-index of this procedure *) x: integer; procedure skip(fsys: symset; n: integer); begin error(n); while not (sy in fsys) do insymbol end (* skip *); procedure test(s1, s2: symset; n: integer); begin if not (sy in s1) then skip(s1+s2, n) end (* test *); procedure testsemicolon; begin if sy = semicolon then insymbol else begin error(14); if sy in [comma, colon] then insymbol end; test([ident]+blockbegsys, fsys, 6) end (* testsemicolon *); procedure enter(id: alfa; k: object); var j, l: integer; begin if t = tmax then fatal(1) else begin tab[0].name := id; j := btab[display[level]].last; l := j; while tab[j].name <> id do j := tab[j].link; if j <> 0 then error(1) else begin t := t+1; with tab[t] do begin name := id; link := l; obj := k; typ := notyp; ref := 0; lev := level; adr := 0 end; btab[display[level]].last := t end end end (* enter *); function loc(id: alfa): integer; var i, j: integer; (* locate id in table *) begin i := level; tab[0].name := id; (* sentinel *) repeat j := btab[display[i]].last; while tab[j].name <> id do j := tab[j].link; i := i-1; until (i<0) or (j<>0); if j = 0 then error(0); loc := j end (* loc *); procedure entervariable; begin if sy = ident then begin enter(id, variable); insymbol end else error(2) end (* entervariable *); procedure constant(fsys: symset; var c: conrec); var x, sign: integer; begin c.tp := notyp; c.i := 0; test(constbegsys, fsys, 50); if sy in constbegsys then begin if sy = charcon then begin c.tp := chars; c.i := inum; insymbol end else begin sign := 1; if sy in [plus, minus] then begin if sy = minus then sign := -1; insymbol end; if sy = ident then begin x := loc(id); if x <> 0 then if tab[x].obj <> konstant then error(25) else begin c.tp := tab[x].typ; if c.tp = reals then c.r := sign*rconst[tab[x].adr] else c.i := sign*tab[x].adr end; insymbol end else if sy = intcon then begin c.tp := ints; c.i := sign*inum; insymbol end else if sy = realcon then begin c.tp := reals; c.r := sign*rnum; insymbol end else skip(fsys, 50) end; test(fsys, [], 6) end end (* constant *); procedure typ(fsys: symset; var tp: types; var rf, sz: integer); var x: integer; eltp: types; elrf: integer; elsz, offset, t0, t1: integer; procedure arraytyp(var aref, arsz: integer); var eltp: types; low, high: conrec; elrf, elsz: integer; begin constant([colon, rbrack, rparent, ofsy]+fsys, low); if low.tp = reals then begin error(27); low.tp := ints; low.i := 0 end; if sy = colon then insymbol else error(13); constant([rbrack, comma, rparent, ofsy]+fsys, high); if high.tp <> low.tp then begin error(27); high.i := low.i end; enterarray(low.tp, low.i, high.i); aref := a; if sy = comma then begin insymbol; eltp := arrays; arraytyp(elrf, elsz) end else begin if sy = rbrack then insymbol else begin error(12); if sy = rparent then insymbol end; if sy = ofsy then insymbol else error(8); typ(fsys, eltp, elrf, elsz) end; with atab[aref] do begin arsz := (high-low+1)*elsz; size := arsz; eltyp := eltp; elref := elrf; elsize := elsz end; end (* arraytyp *); begin (* typ *) tp := notyp; rf := 0; sz := 0; test(typebegsys, fsys, 10); if sy in typebegsys then begin if sy = ident then begin x := loc(id); if x <> 0 then with tab[x] do if obj <> typel then error(29) else begin tp := typ; rf := ref; sz := adr; if tp = notyp then error(30) end; insymbol end else if sy = arraysy then begin insymbol; if sy = lbrack then insymbol else begin error(11); if sy = lparent then insymbol end; tp := arrays; arraytyp(rf, sz) end else begin (* records *) insymbol; enterblock; tp := records; rf := b; if level = lmax then fatal(5); level := level+1; display[level] := b; offset := 0; while sy <> endsy do begin (* field section *) if sy = ident then begin t0 := t; entervariable; while sy = comma do begin insymbol; entervariable end; if sy = colon then insymbol else error(5); t1 := t; typ(fsys+[semicolon, endsy, comma, ident], eltp, elrf, elsz); while t0 < t1 do begin t0 := t0+1; with tab[t0] do begin typ := eltp; ref := elrf; normal := true; adr := offset; offset := offset + elsz end end end; if sy <> endsy then begin if sy = semicolon then insymbol else begin error(14); if sy = comma then insymbol end; test([ident, endsy, semicolon], fsys, 6) end end; btab[rf].vsize := offset; sz := offset; btab[rf].psize := 0; insymbol; level := level-1 end; test(fsys, [], 6) end end (* typ *); procedure parameterlist; (* formal parameter list *) var tp: types; rf, sz, x, t0: integer; valpar: boolean; begin insymbol; tp := notyp; rf := 0; sz := 0; test([ident, varsy], fsys+[rparent], 7); while sy in [ident, varsy] do begin if sy <> varsy then valpar := true else begin insymbol; valpar := false end; t0 := t; entervariable; while sy = comma do begin insymbol; entervariable; end; if sy = colon then begin insymbol; if sy <> ident then error(2) else begin x := loc(id); insymbol; if x <> 0 then with tab[x] do if obj <> typel then error(29) else begin tp := typ; rf := ref; if valpar then sz := adr else sz := 1 end; end; test([semicolon, rparent], [comma, ident]+fsys, 14) end else error(5); while t0 < t do begin t0 := t0+1; with tab[t0] do begin typ := tp; ref := rf; normal := valpar; adr := dx; lev := level; dx := dx + sz end end; if sy <> rparent then begin if sy = semicolon then insymbol else begin error(14); if sy = comma then insymbol end; test([ident, varsy], [rparent]+fsys, 6) end end (* while *); if sy = rparent then begin insymbol; test([semicolon, colon], fsys, 6) end else error(4) end (* parameter list *); procedure constantdeclaration; var c: conrec; begin insymbol; test([ident], blockbegsys, 2); while sy = ident do begin enter(id, konstant); insymbol; if sy = egl then insymbol else begin error(16); if sy = becomes then insymbol end; constant([semicolon, comma, ident]+fsys, c); tab[t].typ := c.tp; tab[t].ref := 0; if c.tp = reals then begin enterreal(c.r); tab[t].adr := c1 end else tab[t].adr := c.i; testsemicolon end end (* constantdeclaration *); procedure typedeclaration; var tp: types; rf, sz, t1: integer; begin insymbol; test([ident], blockbegsys, 2); while sy = ident do begin enter(id, typel); t1 := t; insymbol; if sy = egl then insymbol else begin error(16); if sy = becomes then insymbol end; typ([semicolon, comma, ident]+fsys, tp, rf, sz); with tab[t1] do begin typ := tp; ref := rf; adr := sz end; testsemicolon end end (* typedeclaration *); procedure variabledeclaration; var t0, t1, rf, sz: integer; tp: types; begin insymbol; while sy = ident do begin t0 := t; entervariable; while sy = comma do begin insymbol; entervariable; end; if sy = colon then insymbol else error(5); t1 := t; typ([semicolon, comma, ident]+fsys, tp, rf, sz); while t0 < t1 do begin t0 := t0+1; with tab[t0] do begin typ := tp; ref := rf; lev := level; adr := dx; normal := true; dx := dx + sz end end; testsemicolon end end (* variabledeclaration *); procedure procdeclaration; var isfun: boolean; begin isfun := sy = functionsy; insymbol; if sy <> ident then begin error(2); id := ' '; end; if isfun then enter(id, funktion) else enter(id, prozedure); tab[t].normal := true; insymbol; block([semicolon]+fsys, isfun, level+1); if sy = semicolon then insymbol else error(14); emit(32+ord(isfun)) (* exit *) end (* proceduredeclaration *); procedure statement(fsys: symset); var i: integer; procedure expression(fsys: symset; var x: item); forward; procedure selector(fsys: symset; var v: item); var x: item; a, j: integer; begin (* sy in [lparent, lbrack, period] *) repeat if sy = period then begin insymbol; (* field selector *) if sy <> ident then error(2) else begin if v.typ <> records then error(31) else begin (* search field identifier *) j := btab[v.ref].last; tab[0].name := id; while tab[j].name <> id do j := tab[j].link; if j = 0 then error(0); v.typ := tab[j].typ; v.ref := tab[j].ref; a := tab[j].adr; if a <> 0 then emit1(9, a) end; insymbol end end else begin (* array selector *) if sy <> lbrack then error(11); repeat insymbol; expression(fsys+[comma, rbrack], x); if v.typ <> arrays then error(28) else begin a := v.ref; if atab[a].inxtyp <> x.typ then error(26) else if atab[a].elsize = 1 then emit1(20, a) else emit1(21, a); v.typ := atab[a].eltyp; v.ref := atab[a].elref end until sy <> comma; if sy = rbrack then insymbol else begin error(12); if sy = rparent then insymbol end end until not (sy in [lbrack, lparent, period]); test(fsys, [], 6) end (* selector *); procedure call(fsys: symset; i: integer); var x: item; lastp, cp, k: integer; begin emit1(18, i); (* mark stack *) lastp := btab[tab[i].ref].lastpar; cp := i; if sy = lparent then begin (* actual parameter list *) repeat insymbol; if cp >= lastp then error(39) else begin cp := cp+1; if tab[cp].normal then begin (* value parameter *) expression(fsys+[comma, colon, rparent], x); if x.typ = tab[cp].typ then begin if x.ref <> tab[cp].ref then error(36) else if x.typ = arrays then emit1(22, atab[x.ref].size) else if x.typ = records then emit1(22, btab[x.ref].vsize) end else if (x.typ = ints) and (tab[cp].typ = reals) then emit1(26, 0) else if x.typ <> notyp then error(36); end else begin (* variable parameter *) if sy <> ident then error(2) else begin k := loc(id); insymbol; if k <> 0 then begin if tab[k].obj <> variable then error(37); x.typ := tab[k].typ; x.ref := tab[k].ref; if tab[k].normal then emit2(0, tab[k].lev, tab[k].adr) else emit2(1, tab[k].lev, tab[k].adr); if sy in [lbrack, lparent, period] then selector(fsys+[comma, colon, rparent], x); if (x.typ <> tab[cp].typ) or (x.ref<>tab[cp].ref) then error(36) end end end end; test([comma, rparent], fsys, 6) until sy <> comma; if sy = rparent then insymbol else error(4) end; if cp < lastp then error(39); (* too few actual parameters *) emit1(19, btab[tab[i].ref].psize-1); if tab[i].lev < level then emit2(3, tab[i].lev, level) end (* call *); function resulttype(a, b: types): types; begin if (a>reals) or (b>reals) then begin error(33); resulttype := notyp end else if (a=notyp) or (b=notyp) then resulttype := notyp else if a=ints then if b=ints then resulttype := ints else begin resulttype := reals; emit1(26, 1) end else begin resulttype := reals; if b=ints then emit1(26, 0) end end (* resulttype *); procedure expression; var y: item; op: symbol; procedure simpleexpression(fsys: symset; var x: item); var y: item; op: symbol; procedure term(fsys: symset; var x: item); var y: item; op: symbol; procedure factor(fsys: symset; var x: item); var i, f: integer; procedure standfct(n: integer); var ts: typset; begin (* standard function no. n *) if sy = lparent then insymbol else error(9); if n < 17 then begin expression(fsys+[rparent], x); case n of (* abs, sqr *) 0, 2: begin ts:= [ints, reals]; tab[i].typ := x.typ; if x.typ = reals then n := n+1 end; (* odd, chr *) 4, 5: ts := [ints]; (* ord *) 6: ts := [ints, bools, chars]; (* succ, pred *) 7, 8: ts := [chars]; (* round, trunc *) 9, 10, 11, 12, 13, 14, 15, 16: (* sin, cos, ...*) begin ts := [ints, reals]; if x.typ = ints then emit1(26, 0) end; end; if x.typ in ts then emit1(8, n) else if x.typ <> notyp then error(48) end else (* eof, eoln *) begin (* n in [17, 18] *) if sy <> ident then error(2) else if id <> 'input ' then error(0) else insymbol; emit1(8, n); end; x.typ := tab[i].typ; if sy = rparent then insymbol else error(4) end (* standfct *); begin (* factor *) x.typ := notyp; x.ref := 0; test(facbegsys, fsys, 58); while sy in facbegsys do begin if sy = ident then begin i := loc(id); insymbol; with tab[i] do case obj of konstant: begin x.typ := typ; x.ref := 0; if x.typ = reals then emit1(25, adr) else emit1(24, adr) end; variable: begin x.typ := typ; x.ref := ref; if sy in [lbrack, lparent, period] then begin if normal then f := 0 else f := 1; emit2(f, lev, adr); selector(fsys, x); if x.typ in stantyps then emit(34) end else begin if x.typ in stantyps then if normal then f := 1 else f := 2 else if normal then f := 0 else f := 1; emit2(f, lev, adr) end end; typel, prozedure: error(44); funktion: begin x.typ := typ; if lev <> 0 then call(fsys, i) else standfct(adr) end end (* case, with *) end else if sy in [charcon, intcon, realcon] then begin if sy = realcon then begin x.typ := reals; enterreal(rnum); emit1(25, c1) end else begin if sy = charcon then x.typ := chars else x.typ := ints; emit1(24, inum) end; x.ref := 0; insymbol end else if sy = lparent then begin insymbol; expression(fsys+[rparent], x); if sy = rparent then insymbol else error(4) end else if sy = notsy then begin insymbol; factor(fsys, x); if x.typ=bools then emit(35) else if x.typ<>notyp then error(32) end; test(fsys, facbegsys, 6) end (* while *) end (* factor *); begin (* term *) factor(fsys+[times, rdiv, idiv, imod, andsy], x); while sy in [times, rdiv, idiv, imod, andsy] do begin op := sy; insymbol; factor(fsys+[times, rdiv, idiv, imod, andsy], y); if op = times then begin x.typ := resulttype(x.typ, y.typ); case x.typ of notyp: ; ints : emit(57); reals: emit(60); end end else if op = rdiv then begin if x.typ = ints then begin emit1(26, 1); x.typ := reals end; if y.typ = ints then begin emit1(26, 0); y.typ := reals end; if (x.typ=reals) and (y.typ=reals) then emit(61) else begin if (x.typ<>notyp) and (y.typ<>notyp) then error(32); x.typ := notyp end end else if op = andsy then begin if (x.typ=bools) and (y.typ=bools) then emit(56) else begin if (x.typ<>notyp) and (y.typ<>notyp) then error(32); x.typ := notyp end end else begin (* op in [idiv, imod] *) if (x.typ=ints) and (y.typ=ints) then if op=idiv then emit(58) else emit(59) else begin if (x.typ<>notyp) and (y.typ<>notyp) then error(34); x.typ := notyp end end end end (* term *); begin (* simpleexpression *) if sy in [plus, minus] then begin op := sy; insymbol; term(fsys+[plus, minus], x); if x.typ > reals then error(33) else if op = minus then emit(36) end else term(fsys+[plus, minus, orsy], x); while sy in [plus, minus, orsy] do begin op := sy; insymbol; term(fsys+[plus, minus, orsy], y); if op = orsy then begin if (x.typ=bools) and (y.typ=bools) then emit(51) else begin if (x.typ<>notyp) and (y.typ<>notyp) then error(32); x.typ := notyp end end else begin x.typ := resulttype(x.typ, y.typ); case x.typ of notyp: ; ints: if op = plus then emit (52) else emit(53); reals: if op = plus then emit(54) else emit(55) end end end end (* simpleexpression *); begin (* expression *) simpleexpression(fsys+[egl, neg, lss, leg, gtr, geg], x); if sy in [egl, neg, lss, leg, gtr, geg] then begin op := sy; insymbol; simpleexpression(fsys, y); if (x.typ in [notyp, ints, bools, chars]) and (x.typ = y.typ) then case op of egl: emit(45); neg: emit(46); lss: emit(47); leg: emit(48); gtr: emit(49); geg: emit(50); end else begin if x.typ = ints then begin x.typ := reals; emit1(26, 1) end else if y.typ = ints then begin y.typ := reals; emit1(26, 0) end; if (x.typ=reals) and (y.typ=reals) then case op of egl: emit(39); neg: emit(40); lss: emit(41); leg: emit(42); gtr: emit(43); geg: emit(44); end else error(35) end; x.typ := bools end end (* expression *); procedure assignment(lv, ad: integer); var x,y: item; f: integer; (* tab[i].obj in [variable, prozedure] *) begin x.typ := tab[i].typ; x.ref := tab[i].ref; if tab[i].normal then f := 0 else f := 1; emit2(f, lv, ad); if sy in [lbrack, lparent, period] then selector([becomes, egl]+fsys, x); if sy = becomes then insymbol else begin error(51); if sy = egl then insymbol end; expression(fsys, y); if x.typ = y.typ then if x.typ in stantyps then emit(38) else if x.ref <> y.ref then error(46) else if x.typ = arrays then emit1(23, atab[x.ref].size) else emit1(23, btab[x.ref].vsize) else if (x.typ=reals) and (y.typ=ints) then begin emit1(26, 0); emit(38) end else if (x.typ<>notyp) and (y.typ<>notyp) then error(46) end (* assignment *); procedure compoundstatement; begin insymbol; statement([semicolon, endsy]+fsys); while sy in [semicolon]+statbegsys do begin if sy = semicolon then insymbol else error(14); statement([semicolon, endsy]+fsys) end; if sy = endsy then insymbol else error(57) end (* compoundstatement *); procedure ifstatement; var x: item; lc1, lc2: integer; begin insymbol; expression(fsys+[thensy, dosy], x); if not (x.typ in [bools, notyp]) then error(17); lc1 := lc; emit(11); (* jmpc *) if sy = thensy then insymbol else begin error(52); if sy = dosy then insymbol end; statement(fsys+[elsesy]); if sy = elsesy then begin insymbol; lc2 := lc; emit(10); code[lc1].y := lc; statement(fsys); code[lc2].y := lc end else code[lc1].y := lc end (* if statment *); procedure casestatement; var x: item; i, j, k, lc1: integer; casetab: array [1..csmax] of packed record val, lc: index end; exittab: array [1..csmax] of integer; procedure caselabel; var lab: conrec; k: integer; begin constant(fsys+[comma, colon], lab); if lab.tp <> x.typ then error(47) else if i = csmax then fatal(6) else begin i := i+1; k := 0; casetab[i].val := lab.i; casetab[i].lc := lc; repeat k := k+1 until casetab[k].val = lab.i; if k < i then error(1); (* multiple definition *) end end (* caselabel *); procedure onecase; begin if sy in constbegsys then begin caselabel; while sy = comma do begin insymbol; caselabel end; if sy = colon then insymbol else error(5); statement([semicolon, endsy]+fsys); j := j+1; exittab[j] := lc; emit(10) end end (* onecase *); begin insymbol; i := 0; j := 0; expression(fsys+[ofsy, comma, colon], x); if not (x.typ in [ints, bools, chars, notyp]) then error(23); lc1 := lc; emit(12); (* jmpx *) if sy = ofsy then insymbol else error(8); onecase; while sy = semicolon do begin insymbol; onecase end; code[lc1].y := lc; for k := 1 to i do begin emit1(13, casetab[k].val); emit1(13, casetab[k].lc) end; emit1(10, 0); for k := 1 to j do code[exittab[k]].y := lc; if sy = endsy then insymbol else error(57) end (* casestement *); procedure repeatstatement; var x: item; lc1: integer; begin lc1 := lc; insymbol; statement([semicolon, untilsy]+fsys); while sy in [semicolon]+statbegsys do begin if sy = semicolon then insymbol else error(14); statement([semicolon, untilsy]+fsys) end; if sy = untilsy then begin insymbol; expression(fsys, x); if not (x.typ in [bools, notyp]) then error(17); emit1(11, lc1) end else error(53) end (* repeatstement *); procedure whilestatement; var x: item; lc1, lc2: integer; begin insymbol; lc1 := lc; expression(fsys+[dosy], x); if not (x.typ in [bools, notyp]) then error(17); lc2 := lc; emit(11); if sy = dosy then insymbol else error(54); statement(fsys); emit1(10, lc1); code[lc2].y := lc end (* whilestatement *); procedure forstatement; var cvt: types; x: item; i, f, lc1, lc2: integer; begin insymbol; if sy = ident then begin i := loc(id); insymbol; if i = 0 then cvt := ints else if tab[i].obj = variable then begin cvt := tab[i].typ; emit2(0, tab[i].lev, tab[i].adr); if not (cvt in [notyp, ints, bools, chars]) then error(18) end else begin error(37); cvt := ints end end else skip([becomes, tosy, downtosy, dosy]+fsys, 2); if sy = becomes then begin insymbol; expression([tosy, downtosy, dosy]+fsys, x); if x.typ <> cvt then error(19); end else skip([tosy, downtosy, dosy]+fsys, 51); f := 14; if sy in [tosy, downtosy] then begin if sy = downtosy then f := 16; insymbol; expression([dosy]+fsys, x); if x.typ <> cvt then error(19) end else skip([dosy]+fsys, 55); lc1 := lc; emit(f); if sy = dosy then insymbol else error(54); lc2 := lc; statement(fsys); emit1(f+1, lc2); code[lc1].y := lc end (* forstatement *); procedure standproc(n: integer); var i, f: integer; x, y: item; begin case n of 1, 2: begin (* read *) if not iflag then begin error(20); iflag := true end; if sy = lparent then begin repeat insymbol; if sy <> ident then error(2) else begin i := loc(id); insymbol; if i <> 0 then if tab[i].obj <> variable then error(37) else begin x.typ := tab[i].typ; x.ref := tab[i].ref; if tab[i].normal then f := 0 else f := 1; emit2(f, tab[i].lev, tab[i].adr); if sy in [lbrack, lparent, period] then selector(fsys+[comma, rparent], x); if x.typ in [ints, reals, chars, notyp] then emit1(27, ord(x.typ)) else error(40) end end; test([comma, rparent], fsys, 6); until sy <> comma; if sy = rparent then insymbol else error(4) end; if n = 2 then emit(62) end; 3, 4: begin (* write *) if sy = lparent then begin repeat insymbol; if sy = stringt then begin emit1(24, sleng); emit1(28, inum); insymbol end else begin expression(fsys+[comma, colon, rparent], x); if not (x.typ in stantyps) then error(41); if sy = colon then begin insymbol; expression(fsys+[comma, colon, rparent], y); if y.typ <> ints then error(43); if sy = colon then begin if x.typ <> reals then error(42); insymbol; expression(fsys+[comma, rparent], y); if y.typ <> ints then error(43); emit(37) end else emit1(30, ord(x.typ)) end else emit1(29, ord(x.typ)) end until sy <> comma; if sy = rparent then insymbol else error(4) end; if n = 4 then emit(63) end; end(* case *) end (* standproc *); begin (* statement *) if sy in statbegsys+[ident] then case sy of ident: begin i:= loc(id); insymbol; if i <> 0 then case tab[i].obj of konstant, typel: error(45); variable: assignment(tab[i].lev, tab[i].adr); prozedure: if tab[i].lev <> 0 then call(fsys, i) else standproc(tab[i].adr); funktion: if tab[i].ref = display[level] then assignment(tab[i].lev+1, 0) else error(45) end end; beginsy: compoundstatement; ifsy: ifstatement; casesy: casestatement; whilesy: whilestatement; repeatsy: repeatstatement; forsy: forstatement; end; test(fsys, [], 14) end (* statement *); begin (* block *) dx := 5; prt := t; if level > lmax then fatal(5); test([lparent, colon, semicolon], fsys, 7); enterblock; display[level] := b; prb := b; tab[prt].typ := notyp; tab[prt].ref := prb; if sy = lparent then parameterlist; btab[prb].lastpar := t; btab[prb].psize := dx; if isfun then if sy = colon then begin insymbol; (* function type *) if sy = ident then begin x := loc(id); insymbol; if x <> 0 then if tab[x].obj <> typel then error(29) else if tab[x].typ in stantyps then tab[prt].typ := tab[x].typ else error(15) end else skip([semicolon]+fsys, 2) end else error(5); if sy = semicolon then insymbol else error(14); repeat if sy = constsy then constantdeclaration; if sy = typesy then typedeclaration; if sy = varsy then variabledeclaration; btab[prb].vsize := dx; while sy in [proceduresy, functionsy] do procdeclaration; test([beginsy], blockbegsys+statbegsys, 56) until sy in statbegsys; tab[prt].adr := lc; insymbol; statement([semicolon, endsy]+fsys); while sy in [semicolon]+statbegsys do begin if sy = semicolon then insymbol else error(14); statement([semicolon, endsy]+fsys) end; if sy = endsy then insymbol else error(57); test(fsys+[period], [], 6) end (* block *); procedure interpret; (* global code, tab, btab *) var ir: order; (* instruction buffer *) pc: integer; (* program counter *) ps: (run, fin, caschk, divchk, inxchk, stkchk, linchk, lngchk, redchk); t: integer; (* top stack index *) b: integer; (* base index *) lncnt, ocnt, blkcnt, chrcnt: integer; (* counters *) h1, h2, h3, h4: integer; fld: array [1..4] of integer; (* default field widths *) display: array [1..lmax] of integer; s: array [1..stacksize] of (* blockmark: *) record case types of (* s[b+0] = fct result *) ints: (i: integer); (* s[b+1] = return adr *) reals: (r: real); (* s[b+2] = static link *) bools: (b: boolean); (* s[b+3] = dynamic link *) chars: (c: char); (* s[b+4] = table index *) notyp, arrays, records: () end; begin (* interpret *) s[1].i := 0; s[2].i := 0; s[3].i := -1; s[4].i := btab[1].last; b := 0; display[1] := 0; t := btab[2].vsize - 1; pc := tab[s[4].i].adr; ps := run; lncnt := 0; ocnt := 0; chrcnt := 0; fld[1] := intfld; fld[2] := relfld; fld[3] := bolfld; fld[4] := chrfld; repeat ir := code[pc]; pc := pc+1; ocnt := ocnt + 1; case ir.f of 0: begin (* load address *) t := t+1; if t > stacksize then ps := stkchk else s[t].i := display[ir.x] + ir.y end; 1: begin (* load value *) t := t+1; if t > stacksize then ps := stkchk else s[t] := s[display[ir.x] + ir.y] end; 2: begin (* load indirect *) t := t+1; if t > stacksize then ps := stkchk else s[t] := s[s[display[ir.x] + ir.y].i] end; 3: begin (* update display *) h1 := ir.y; h2 := ir.x; h3 := b; repeat display[h1] := h3; h1 := h1-1; h3 := s[h3+2].i until h1 = h2 end; 8: case ir.y of 0: s[t].i := abs(s[t].i); 1: s[t].r := abs(s[t].r); 2: s[t].i := sqr(s[t].i); 3: s[t].r := sqr(s[t].r); 4: s[t].b := odd(s[t].i); 5: begin (* s[t].c := chr(s[t].i); *) if (s[t].i < 0) or (s[t].i > 63) then ps := inxchk end; 6: (* s[t].i := ord(s[t].c); *) 7: s[t].c := succ(s[t].c); 8: s[t].c := pred(s[t].c); 9: s[t].i := round(s[t].r); 10: s[t].i := trunc(s[t].r); 11: s[t].r := sin(s[t].r); 12: s[t].r := cos(s[t].r); 13: s[t].r := exp(s[t].r); 14: s[t].r := ln(s[t].r); 15: s[t].r := sqrt(s[t].r); 16: s[t].r := arctan(s[t].r); 17: begin t := t+1; if t > stacksize then ps := stkchk else s[t].b := eof(input) end; 18: begin t := t+1; if t > stacksize then ps := stkchk else s[t].b := eoln(input) end; end; 9: s[t].i := s[t].i + ir.y; (* offset *) 10: pc := ir.y; (* jump *) 11: begin (* conditional jump *) if not s[t].b then pc := ir.y; t := t-1 end; 12: begin (* switch *) h1 := s[t].i; t := t-1; h2 := ir.y; h3 := 0; repeat if code[h2].f <> 13 then begin h3 := 1; ps := caschk end else if code[h2].y = h1 then begin h3 := 1; pc := code[h2+1].y end else h2 := h2 + 2 until h3 <> 0 end; 14: begin (* forlup *) h1 := s[t-1].i; if h1 <= s[t].i then s[s[t-2].i].i := h1 else begin t := t-3; pc := ir.y end end; 15: begin (* for2up *) h2 := s[t-2].i; h1 := s[h2].i + 1; if h1 <= s[t].i then begin s[h2].i := h1; pc := ir.y end else t := t-3; end; 16: begin (* for1down *) h1 := s[t-1].i; if h1 >= s[t].i then s[s[t-2].i].i := h1 else begin pc := ir.y; t := t-3 end end; 17: begin (* for2down *) h2 := s[t-2].i; h1 := s[h2].i - 1; if h1 >= s[t].i then begin s[h2].i := h1; pc := ir.y end else t := t-3; end; 18: begin (* mark stack *) h1 := btab[tab[ir.y].ref].vsize; if t+h1 > stacksize then ps := stkchk else begin t := t+5; s[t-1].i := h1-1; s[t].i := ir.y end end; 19: begin (* call *) h1 := t - ir.y; (* h1 points top base *) h2 := s[h1+4].i; h3 := tab[h2].lev; display[h3+1] := h1; h4 := s[h1+3].i + h1; s[h1+1].i := pc; s[h1+2].i := display[h3]; s[h1+3].i := b; for h3 := t+1 to h4 do s[h3].i := 0; b := h1; t := h4; pc := tab[h2].adr end; 20: begin (* index *) h1 := ir.y; (* h1 points to atab *) h2 := atab[h1].low; h3 := s[t].i; if h3 < h2 then ps := inxchk else if h3 > atab[h1].high then ps := inxchk else begin t := t-1; s[t].i := s[t].i + (h3-h2) end end; 21: begin (* index *) h1 := ir.y; (* h1 points to atab *) h2 := atab[h1].low; h3 := s[t].i; if h3 < h2 then ps := inxchk else if h3 > atab[h1].high then ps := inxchk else begin t := t-1; s[t].i := s[t].i + (h3-h2)*atab[h1].elsize end end; 22: begin (* load block *) h1 := s[t].i; t := t-1; h2 := ir.y + t; if h2 > stacksize then ps := stkchk else while t < h2 do begin t := t+1; s[t] := s[h1]; h1 := h1+1 end end; 23: begin (* copy block *) h1 := s[t-1].i; h2 := s[t].i; h3 := h1 + ir.y; while h1 < h3 do begin s[h1] := s[h2]; h1 := h1+1; h2 := h2+1 end; t := t-2 end; 24: begin (* literal *) t := t+1; if t > stacksize then ps := stkchk else s[t].i := ir.y end; 25: begin (* load real *) t := t+1; if t > stacksize then ps := stkchk else s[t].r := rconst[ir.y] end; 26: begin (* float *) h1 := t - ir.y; s[h1].r := s[h1].i end; 27: begin (* read *) if eof(input) then ps := redchk else case ir.y of 1: read(s[s[t].i].i); 2: read(s[s[t].i].r); 4: read(s[s[t].i].c) end; t := t-1 end; 28: begin (* write string *) h1 := s[t].i; h2 := ir.y; t := t-1; chrcnt := chrcnt+h1; if chrcnt > lineleng then ps := lngchk; repeat write(stab[h2]); h1 := h1-1; h2 := h2+1 until h1 = 0 end; 29: begin (* write1 *) chrcnt := chrcnt + fld[ir.y]; if chrcnt > lineleng then ps := lngchk else case ir.y of 1: write(s[t].i: fld[1]); 2: write(s[t].r: fld[2]); 3: write(s[t].b: fld[3]); 4: write(s[t].c); end; t := t-1 end; 30: begin (* write2 *) chrcnt := chrcnt + s[t].i; if chrcnt > lineleng then ps := lngchk else case ir.y of 1: write(s[t-1].i: s[t].i); 2: write(s[t-1].r: s[t].i); 3: write(s[t-1].b: s[t].i); 4: write(s[t-1].c: s[t].i); end; t := t-2 end; 31: ps := fin; 32: begin (* exit procedure *) t := b-1; pc := s[b+1].i; b := s[b+3].i end; 33: begin (* exit function *) t := b; pc := s[b+1].i; b := s[b+3].i end; 34: s[t] := s[s[t].i]; 35: s[t].b := not s[t].b; 36: s[t].i := - s[t].i; 37: begin chrcnt := chrcnt + s[t-1].i; if chrcnt > lineleng then ps := lngchk else write(s[t-2].r: s[t-1].i: s[t].i); t := t-3 end; 38: begin (* store *) s[s[t-1].i] := s[t]; t := t-2; end; 39: begin t := t-1; s[t].b := s[t].r = s[t+1].r end; 40: begin t := t-1; s[t].b := s[t].r <> s[t+1].r end; 41: begin t := t-1; s[t].b := s[t].r < s[t+1].r end; 42: begin t := t-1; s[t].b := s[t].r <= s[t+1].r end; 43: begin t := t-1; s[t].b := s[t].r > s[t+1].r end; 44: begin t := t-1; s[t].b := s[t].r >= s[t+1].r end; 45: begin t := t-1; s[t].b := s[t].i = s[t+1].i end; 46: begin t := t-1; s[t].b := s[t].i <> s[t+1].i end; 47: begin t := t-1; s[t].b := s[t].i < s[t+1].i end; 48: begin t := t-1; s[t].b := s[t].i <= s[t+1].i end; 49: begin t := t-1; s[t].b := s[t].i > s[t+1].i end; 50: begin t := t-1; s[t].b := s[t].i >= s[t+1].i end; 51: begin t := t-1; s[t].b := s[t].b or s[t+1].b end; 52: begin t := t-1; s[t].i := s[t].i + s[t+1].i end; 53: begin t := t-1; s[t].i := s[t].i - s[t+1].i end; 54: begin t := t-1; s[t].r := s[t].r + s[t+1].r; end; 55: begin t := t-1; s[t].r := s[t].r - s[t+1].r; end; 56: begin t := t-1; s[t].b := s[t].b and s[t+1].b; end; 57: begin t := t-1; s[t].i := s[t].i * s[t+1].i end; 58: begin t := t-1; if s[t+1].i = 0 then ps := divchk else s[t].i := s[t].i div s[t+1].i end; 59: begin t := t-1; if s[t+1].i = 0 then ps := divchk else s[t].i := s[t].i mod s[t+1].i end; 60: begin t := t-1; s[t].r := s[t].r * s[t+1].r; end; 61: begin t := t-1; s[t].r := s[t].r / s[t+1].r; end; 62: if eof(input) then ps := redchk else readln; 63: begin writeln; lncnt := lncnt + 1; chrcnt := 0; if lncnt > linelimit then ps := linchk end end (* case *); until ps <> run; if ps <> fin then begin writeln; write('0halt at', pc:5, ' because of '); case ps of caschk: writeln('undefined case'); divchk: writeln('division by 0'); inxchk: writeln('invalid index'); stkchk: writeln('storage overflow'); linchk: writeln('too much output'); lngchk: writeln('line too long'); redchk: writeln('reading past end of file'); end; h1 := b; blkcnt := 10; (* post mortem dump *) repeat writeln; blkcnt := blkcnt - 1; if blkcnt = 0 then h1 := 0; h2 := s[h1+4].i; if h1 <> 0 then writeln(' ',tab[h2].name, ' called at', s[h1+1].i: 5); h2 := btab[tab[h2].ref].last; while h2 <> 0 do with tab[h2] do begin if obj = variable then if typ in stantyps then begin write(' ', name, ' = '); if normal then h3 := h1+adr else h3 := s[h1+adr].i; case typ of ints: writeln(s[h3].i); reals: writeln(s[h3].r); bools: writeln(s[h3].b); chars: writeln(s[h3].c); end end; h2 := link end; h1 := s[h1+3].i until h1 < 0; end; writeln; writeln(ocnt, ' steps') end (* interpret *); begin writeln; { main program } key[ 1] := 'and '; key[ 2] := 'array '; key[ 3] := 'begin '; key[ 4] := 'case '; key[ 5] := 'const '; key[ 6] := 'div '; key[ 7] := 'do '; key[ 8] := 'downto '; key[ 9] := 'else '; key[10] := 'end '; key[11] := 'for '; key[12] := 'function '; key[13] := 'if '; key[14] := 'mod '; key[15] := 'not '; key[16] := 'of '; key[17] := 'or '; key[18] := 'procedure '; key[19] := 'program '; key[20] := 'record '; key[21] := 'repeat '; key[22] := 'then '; key[23] := 'to '; key[24] := 'type '; key[25] := 'until '; key[26] := 'var '; key[27] := 'while '; ksy[ 1] := andsy; ksy[ 2] := arraysy; ksy[ 3] := beginsy; ksy[ 4] := casesy; ksy[ 5] := constsy; ksy[ 6] := idiv; ksy[ 7] := dosy; ksy[ 8] := downtosy; ksy[ 9] := elsesy; ksy[10] := endsy; ksy[11] := forsy; ksy[12] := functionsy; ksy[13] := ifsy; ksy[14] := imod; ksy[15] := notsy; ksy[16] := ofsy; ksy[17] := orsy; ksy[18] := proceduresy; ksy[19] := programsy; ksy[20] := recordsy; ksy[21] := repeatsy; ksy[22] := thensy; ksy[23] := tosy; ksy[24] := typesy; ksy[25] := untilsy; ksy[26] := varsy; ksy[27] := whilesy; sps['+'] := plus; sps['-'] := minus; sps['*'] := times; sps['/'] := rdiv; sps['('] := lparent; sps[')'] := rparent; sps['='] := egl; sps[','] := comma; sps['['] := lbrack; sps[']'] := rbrack; sps['#'] := neg; sps['&'] := andsy; sps[';'] := semicolon; constbegsys := [plus, minus, intcon, realcon, charcon, ident]; typebegsys := [ident, arraysy, recordsy]; blockbegsys := [constsy, typesy, varsy, proceduresy, functionsy, beginsy]; facbegsys := [intcon, realcon, charcon, ident, lparent, notsy]; statbegsys := [beginsy, ifsy, whilesy, repeatsy, forsy, casesy]; stantyps := [notyp, ints, reals, bools, chars]; lc := 0; ll := 0; cc := 0; ch := ' '; errpos := 0; errs := []; insymbol; t := -1; a := 0; b := 1; sx := 0; c2 := 0; display[0] := 1; iflag := false; oflag := false; if sy <> programsy then error(3) else begin insymbol; if sy <> ident then error(2) else begin progname := id; insymbol; if sy <> lparent then error(9) else repeat insymbol; if sy <> ident then error(2) else begin if id = 'input ' then iflag := true else if id = 'output ' then oflag := true else error(0); insymbol end until sy <> comma; if sy = rparent then insymbol else error(4); if not oflag then error(20) end end; enter(' ', variable, notyp, 0); (* sentinel *) enter('false ', konstant, bools, 0); enter('true ', konstant, bools, 1); enter('real ', typel, reals, 1); enter('char ', typel, chars, 1); enter('boolean ', typel, bools, 1); enter('integer ', typel, ints , 1); enter('abs ', funktion, reals, 0); enter('sqr ', funktion, reals, 2); enter('odd ', funktion, bools, 4); enter('chr ', funktion, chars, 5); enter('ord ', funktion, ints, 6); enter('succ ', funktion, chars, 7); enter('pred ', funktion, chars, 8); enter('round ', funktion, ints, 9); enter('trunc ', funktion, ints, 10); enter('sin ', funktion, reals, 11); enter('cos ', funktion, reals, 12); enter('exp ', funktion, reals, 13); enter('ln ', funktion, reals, 14); enter('sqrt ', funktion, reals, 15); enter('arctan ', funktion, reals, 16); enter('eof ', funktion, bools, 17); enter('eoln ', funktion, bools, 18); enter('read ', prozedure, notyp, 1); enter('readln ', prozedure, notyp, 2); enter('write ', prozedure, notyp, 3); enter('writeln ', prozedure, notyp, 4); enter(' ', prozedure, notyp, 0); with btab[1] do begin last := t; lastpar := 1; psize := 0; vsize := 0 end; block(blockbegsys+statbegsys, false, 1); if sy <> period then error(22); emit(31); (* halt *) if btab[2].vsize > stacksize then error(49); if progname = 'test0 ' then printtables; if errs = [] then begin if iflag then begin if eof then writeln(' input data missing') else begin writeln(' (eor) '); (* copy input data *) while not eof do begin write(' '); while not eoln do begin read(ch); write(ch) end; writeln; read(ch) end; end end; writeln(' (eof) '); interpret end else errormsg; 99: end. Appendix A: Syntax diagrams Notes: (1) Round boxes denote symbols of the language Pascal, rectangular boxes denote syntactic constructs represented by diagrams. (2) Separators may be inserted between any two symbols. However, no separators must occur within numbers and identifiers. (3) At least one separator must occur between consecutive identifiers, numbers, and word symbols (such as BEGIN, END). (4) Separators are blanks, ends of lines, and comments. A comment is an arbitrary sequence of characters enclosed within a pair of comment brackets (* and *). (5) The occurrence of the non-qualified word identifier in a syntax diagram implies that at this point an arbitrary, new identifier may be chosen. This iden- tifier thereby becomes a constant, type, variable, field, function, or procedure identifier. Appendix C: Explanations to error codes O. The designated identifier has not been declared. 1. The indicated identifier is declared more than once in the same scope. 2. An identifier is expected. 3. Every program must begin with the symbol program. 4. A closing parenthesis is expected. 5. A colon is expected. In declarations, the colon is followed by a type. 6. At this point, the indicated symbol is incorrectly used. The compiler skips this and possibly several following symbols. 7. In a formal parameter list each section must begin with an identifier or the symbol var, depending on whether the parameter is a value or a variable parameter. 8. The symbol of is expected. 9. An opening parenthesis is expected. 10. A type definition must begin with an identifier, the symbol array, or the symbol record. 11. An opening bracket is expected ([). 12. A closing bracket is expected (]). 13. The symbol .. is expected (no blank between the dots). 14. A semicolon is expected. 15. The result of a function must be of type integer, real, boolean, or char. 16. An equal sign is expected. The symbol := is used in assignment statements only, but not in declarations. 17. The expression following the symbol if, while, or until must be of type boolean. 18. The control variable following the symbol for must be of type integer, char, or boolean. 19. The expressions which specify the initial and fmal values of the control variable in a for statement must be of the same type as the control variable. 20. The parameter 'output' must be included in the program heading. 21. The indicated number is too large. The maximum number of digits is 14; the absolute value must not exceed 10**323 (on the CDC 6000 implementation). 22. A dot is expected at the end of the program. Check corresponding begin and end symbols! 23. The expression following the symbol case must be of type integer, char, or boolean. (In the latter case, an if statement is recommended.) 24. The designated character is not acceptable. 25. In a constant defmition, the equal sign must be followed by a constant. If an identifier is used, it must denote a constant. 26. The type of an index expression must be identical to the index type specified in the array declaration. 27. In an array declaration, the lower bound must not exceed the upper bound. They must be within a permissible range of values (less than 2** 17). Also, their types must be identical, either integers, logical values, or characters. Real numbers are not acceptable. 28. Every indexed variable must be declared as an array. 29. A type identifier is expected here. 30. This type is not defmed. (Recursive type defmitions are not allowed.) 31. Every variable with a field selector must be declared as a record. 32. The operands of the operators not, and, and or must be of type boolean. 33. The specified type of this arithmetic expression is illegal. Note also that entire arrays cannot occur as operands to arithmetic or logical operators. 34. Operands of div and mod must be of type integer. 35. The types of the comparands are incompatible. They must be identical, except if one comparand is of type integer and the other of type real. Arrays must be compared element by element. 36. The types of corresponding actual and formal parameters must be identical. An exception is made if the formal parameter is a value parameter of type real. Then the actual parameter may also be of type integer. 37. A variable is expected. 38. A string must contain at least one character. 39. The number of actual parameters must be equal to the number of specified formal parameters. 40. The parameters of the procedure read must be of type char, integer, or real. 41. The parameters of the procedure write must be of type char, integer, real, or boolean. 42. If a statement has the form write(x:m:n), then x must be an expression of type real. 43. If a statement has the form write(x:n) or write(x:m:n),then m and n must be expressions of type integer. 44. No type or procedure identifiers may occur as part of an expression. 45. A statement cannot begin with a type or a function identifier. An exception is the assignment of a result value to a function. In this case, it must be part of the function body. 46. In an assignment x := y, the types of the variable x and the expressiony must be identical. An exception is the case when x is real. Then y may also be of type integer. 47. Every case label must be a constant of the same type as the expression in the case clause. 48. The indicated argument of the standard function is of an illegal type. 49. The program requires too much storage. 50. A constant cannot begin with the indicated symbol. 51. The symbol := is expected (no space between : and =). 52. The symbol then is expected. 53. The symbol until is expected. 54. The symbol do is expected. 55. The symbol to (or downto) is expected. 56. The symbol begin is expected. 57. The symbol end is expected. 58. A factor must begin with an identifier, a constant, the symbol not, or with a left parenthesis. References 1. Jensen, K., and Wirth, N., Pascal-User Manual and Report, Lecture Notes in Computer Science, No. 18, Springer-Verlag, Berlin, Heidelberg, New York, 1974. 2. Wirth, N., The programming language Pascal, Acta Informatica, 1, 35-63, 1971. 3. Wirth, N., Systematisches Programmieren, Teubner-Verlag, Stuttgart, 1972. 4. Wirth, N., Systematic Programming, Prentice-Hall, Englewood Cliffs, New Jersey, 1973. 5. Wirth, N., Algorithms + Data Structures = Programs, Prentice-Hall, Englewood Cliffs, New Jersey, 1975. 6. Ammann, U., The Method of Structured Programming Applied to the Development of a Compiler (Eds. A. Guenther et al.), International Computing Symposium 1973, pp. 93-99, North-Holland, 1974.