The p-System Stack and Heap 9th July 1983 Alastair Milne I. p-System versions I.4, I.5, II.0, II.1, all subversions of III: These versions (from I.3 up to the latest version III system) all arrange memory in the same way: at the p-System interpreter is loaded into the lowest memory addresses (so it occupies the low end of memory). After it and the p-System globals have been allocated, the bottom of the heap is placed immediately above them. So that both heap and stack will have the greatest possible room to grow when data are added to them, the bottom of the stack is placed at the other end of memory. When anything is pushed onto it, it grows down (toward the low addresses); and when anything is allocated on the heap, it grows up (toward the high addresses). This leaves all the memory in the system between them, for them to grow into. Low Memory High Memory --------------------------------------------------------------------------- INT|Pas.Sys.| glbls | heap (growing -->) (<-- growing) stack --------------------------------------------------------------------------- ^ ^ ^ | <---------------- USER MEMORY ------------------> | | | | | | | | | |- p-System globals (directory pointer, IOstatus, etc.) | | | |- PascalSystem, main segment of SYSTEM.PASCAL. (in p-code) | |- p-machine interpreter (in native assembly code) {$P} THE STACK: The p-System is a stack-oriented machine, which means that all its operations are performed using the stack. This means that it uses the stack for arithmetic (and boolean) evaluations; but the system also uses the stack for passing parameters, and for allocating the local variables of the individual procedure when it is called (so that the memory they use can simply be popped off the stack when the procedure returns). The p-System even loads the code it is to execute on the stack. Everything is organised around it. When a code segment (such as a program) is executed, the p-System finds the segment in its file, and pushes it onto the stack. It then calls a procedure (number 1 for most segments) in that segment, which results in a space (called an "activation record") being pushed onto the stack for that procedure's variables and parameters. Now that procedure is executed. If it involves any arithmetic or logical operations, their operands are pushed onto the stack, evaluated, and the results are popped as they are used; in an assignment, they are popped into the variables in the activation record. If the first procedure calls a second one, passing parameters to it, the calling arguments are pushed onto the stack, then the second procedure is called. This causes another activation record to be pushed onto the stack. It contains space for the second procedure's locals; but it also overlaps onto the arguments that the first procedure pushed. In this way, the second procedure can regard the parameters as local variables (NOTE: for VAR parameters, the calling procedure pushes pointers to the actual arguments, and the second procedure follows those pointers). In order to find out how much space to push onto the stack for each procedure call, the p-System consults the code segment itself. There, the compiler has written, for each procedure, how large an activation record that procedure requires. The compiler itself finds out by totalling the sizes of all the local declarations (plus parameters) of each procedure. Therefore, when the system pushes an activation record onto the stack, it may be regarded as a system allocation. It may also be considered a "static" allocation, because it doesn't change between runs of the program: a given procedure will always allocate (or have allocated for it) the same amount of space on the stack. Routines Manipulating the Stack: None: only the p-System itself manipulates the stack. {$P} THE HEAP: A heap is simply an area of memory from which a user program may request chunks for its own use. Under most (non-UCSD) operating systems, this area is unordered, and requires a certain amount of housekeeping to make sure that it doesn't become fragmented, and that previously requested chunks of memory which are no longer being used are returned to the heap for other users to request. (In fact, this is a major part of an operating system's job). Under p-System versions previous to IV.0, the heap serves the same purpose, but it is organised as a stack: every time the (single) user program requests that a chunk of memory be reserved for a particular use, that chunk is pushed onto the heap. Later, the program may restore that chunk to the system (for other use) by releasing it, which simply pops it off the heap. The dangerous aspect to this is that if other allocations have been made on the heap above this particular one, they will also be restored to the system. The information in them won't be destroyed yet, but as soon as the system has a need for memory on the heap, it will use the memory still being used by those other allocations. Some care is therefore required in the use of the heap. Since memory on the heap is allocated in response to specific requests from the user program, the pattern of heap allocations is determined much more by the course of the program's execution than by its construction. It can therefore be called "dynamic allocation": what is allocated (and when) may be very different from one execution to another of the same program. The way a Pascal program requests a chunk of memory for its own use is to call the standard procedure NEW (see declaration at the end of this section). NEW takes a pointer variable, pushes onto the heap an area the size of whatever type the particular pointer points to, and sets the pointer variable to point at the new allocation. The top of the heap is now the top of that allocation, and the next call to NEW will push an area on top of it. The pointer is essential; without it, the program would have no way of accessing the newly allocated memory. {$P} How Memory Looks after a Call to NEW: <-- toward Low Memory High Memory ----------------------------------- | | |- activation V | V record. =========================================================================== PREVIOUS ALLOCATION | MyAllocation| |O | p-Code segment =========================================================================== | ^ pointer to | MyAllocation. |---------- HEAP -----------------| |------- STACK ------| ^ top of heap (MARK would return this) NOTE: the relative sizes are NOT to scale. Being able to reserve more memory during the course of execution (rather than having the amount available permanently set by the compiler) enormously enhances the power of Pascal. It is not necessary to foresee all possible memory needs; nor would the declarations for them be practical (or even possible) if one could. When the need for more memory is found (to allocated another item in a linked list, for example, or to add another entry to a table), that amount can be requested from the system. The program therefore need not waste memory with massive overestimates of the amount needed; nor must it lack needed memory because of underestimates. The minimum amount of physical memory that the p-System requires to set up this organisation is 64K bytes. On machines which are byte- addressed (such as the Terak), 64K bytes is as much they can address. Word-addressed machines, such as the PDQ-3, can address twice that much (because each address refers to a word, rather than a byte). On word-addressed machines with 128K bytes, the p-System shifts the stack and the heap so that the extra memory lies between them, giving the whole system much more room to work. Memory beyond that is inaccessible to the p-System. Routines Manipulating the Heap: NEW(VAR Creation: PointerToAnything); MARK(VAR TopOfHeapMark: PointerToInteger); RELEASE(OldJunk: PointerToAnything); {$P} II. p-System versions IV.0 and IV.1: THE STACK: The stack works essentially as it does under the previous versions, with one major exception: code segments are NOT loaded onto it. Instead, they are placed in a new structure, called the "code pool" (see below). THE HEAP: The heap is no longer organised simply as a stack: it has become more like the heap supported by large operating systems. Memory can now be allocated and released anywhere within it, not only at the top. The extent to which it has been changed is not clear: although there now exist new routines (VARNEW and VARDISPOSE) which handle allocations and disposals anywhere in the heap, there still exist the old ones (NEW, MARK, and RELEASE) which work only on the top of the heap. A particular change in MARK and RELEASE must be noted. Under previous versions of the p-System, any allocation made by a call to NEW could be eliminated by a call to RELEASE. The procedure MARK did nothing whatever to the heap; it simply returned the pointer to the heap's current top. All three procedures can take any kind of pointer as a parameter. Under version IV, MARK and RELEASE must be paired (NEW and RELEASE must NOT), and they can only take pointers to integers as parameters. Passing anything else to them results in the run-time error "Illegal heap operation". MARK no longer simply returns a pointer; every call to it allocates a new record in a linked list through the heap, which the p-System apparently uses for keeping track of allocations. This change is a major discrepancy between version IV and the previous versions, and can lead to serious difficulty in transporting programs between p-System versions. Routines Manipulating the Heap: NEW(VAR Creation: PointerToAnything); MARK(VAR HeapTopPointer: PointerToInteger); RELEASE(OldJunk: PointerToInteger); <-- note version IV change. VARNEW(VAR Creation: PointerToAnything; CreateSize: INTEGER): INTEGER; VARDISPOSE(OldJunk: PointerToAnything; DisposeSize: INTEGER): INTEGER; Note that VARNEW and VARDISPOSE require the user to state how large the allocation (or disposal) is to be; they do not automatically allocate for the type the pointer points to. They are functions; they return the amount of memory actually allocated or disposed. THE CODE POOL: Starting with p-System version IV.0, an new entity enters the memory organisation: the code pool. Whereas all previous versions of the p-System load code segments onto the stack as they are called (from previous segments), IV.0 and IV.1 place them in an area called the "code pool", which floats between the tops of the heap and the stack. The systems moves it up or down in memory as required to give the stack and the heap room to grow. The size of the code pool is intentionally very variable. Almost any segment whose code is not actually in the process of being executed can be dropped from memory (under the previous versions, a segment is dropped as soon as it returns to the previous segment; if it is called again immediately afterward, it is re-loaded from disc). This even applies to segments which have called other segments. When the time comes for control to return to them, they will be re- loaded from disc. Note that the term "segment" includes units, which, under verion IV, can be dropped from memory just as segment routines can. However, if there is no demand for new memory, the p-System keeps resident every segment it loads. Several calls to a particular segment routine will therefore usually read the disc only once, unless the program requires a lot of memory. This organisation makes the version IV p-System much more effective than the previous versions in satisfying user memory requirements. AVAILABLE MEMORY FUNCTIONS: Under all previous versions of the p-System, the function MEMAVAIL returns the number of words of memory currently unused. Although this function still exists in IV, its return is almost meaningless. It is usually a gross underestimate of the memory that the system could make available if it had to. There is a new function, VARAVAIL, which takes as a parameter a string containing the names of segments. It returns the amount of memory which would now be free if those segments were the ones currently in memory. Although it requires more work by the programmer, to work which of the program's and the system's segments should be considered resident to make the return meaningful, it does at least return a result more useful than MEMAVAIL's. Example: WordsLeft := VARAVAIL('MYPROG, MYSEG1, MYSEG2, SCREENOPS, REALOPS, STRINGOPS'); {ScreenOps, RealOps, and StringOps are 3 units in SYSTEM.PASCAL} EXTENDED MEMORY: The same addressing restrictions apply to IV as apply to all previous versions (since the restrictions are a characteristic of the machines themselves). However, version IV has two ways of permitting extra memory to be used (even on byte-addressed machines). External Code Pools: In the usual 64K byte organisation, the code pool is "internal": it floats between the stack and the heap; therefore, some competition for memory still exists between code, stack, and heap. However, if memory is extended to 128K bytes, an option may be selected which places the code pool in the upper 64K bytes, and leaves the stack and the heap in the lower 64K (still placed at the opposite ends of it, and still growing toward each other, with all of addressable memory in between). This not only permits the stack and the heap much more growth, it also improves the system's performance in running code files, because much more code can remain resident. There is seldom any need to re-read a segment from the disc. RAM Disc: Extra memory beyond this may be used as if it were another on-line disc; this is called RAM disc (for Random Access Memory). The p-System gives it its own directory and name (usually RAM:, or RAMDISK:); and it may be set up so that when the system boots, all the files from the boot disc (particularly the system files) are loaded onto the RAM disc, which not only frees a drive for use with a different disc, but also greatly improves the system's performance when it runs those system files. The danger with RAM disc is that it is not permanent memory, as a true disc is; so anything written to it must also be written to a real disc if it is to be kept. Furthermore, if the system should abort and require re-booting in the execution of a program, the contents of the RAM disc are, on most systems, lost.