The p-System Stack and Heap

                                 9th July 1983
                                 Alastair Milne


I.  p-System versions I.4, I.5, II.0, II.1, all subversions of III:

     These versions (from I.3 up to the latest version III system) all arrange 
         memory in the same way: at the p-System interpreter is loaded into 
         the lowest memory addresses (so it occupies the low end of memory).  
         After it and the p-System globals have been allocated, the bottom of 
         the heap is placed immediately above them.  So that both heap and 
         stack will have the greatest possible room to grow when data are 
         added to them, the bottom of the stack is placed at the other end of
         memory.  When anything is pushed onto it, it grows down (toward the
         low addresses); and when anything is allocated on the heap, it grows
         up (toward the high addresses). This leaves all the memory in the
         system between them, for them to grow into.


    Low Memory                                                      High Memory
    ---------------------------------------------------------------------------
    INT|Pas.Sys.| glbls |  heap (growing -->)              (<-- growing)  stack
    ---------------------------------------------------------------------------
    ^   ^        ^      | <----------------  USER MEMORY  ------------------> |
    |   |        |
    |   |        |
    |   |        |- p-System globals (directory pointer, IOstatus, etc.)
    |   |
    |   |- PascalSystem, main segment of SYSTEM.PASCAL. (in p-code)
    |
    |- p-machine interpreter (in native assembly code)


  {$P}
  THE STACK:

     The p-System is a stack-oriented machine, which means that all its 
         operations are performed using the stack.  This means that it uses 
         the stack for arithmetic (and boolean) evaluations; but the system
         also uses the stack for passing parameters, and for allocating the
         local variables of the individual procedure when it is called (so 
         that the memory they use can simply be popped off the stack when the 
         procedure returns).  The p-System even loads the code it is to
         execute on the stack.  Everything is organised around it. 

     When a code segment (such as a program) is executed, the p-System finds 
         the segment in its file, and pushes it onto the stack.  It then calls 
         a procedure (number 1 for most segments) in that segment, which 
         results in a space (called an "activation record") being pushed onto 
         the stack for that procedure's variables and parameters.  Now that 
         procedure is executed.  If it involves any arithmetic or logical 
         operations, their operands are pushed onto the stack, evaluated, and 
         the results are popped as they are used; in an assignment, they are 
         popped into the variables in the activation record.

     If the first procedure calls a second one, passing parameters to it, the
         calling arguments are pushed onto the stack, then the second
         procedure is called.  This causes another activation record to be
         pushed onto the stack.  It contains space for the second procedure's
         locals; but it also overlaps onto the arguments that the first
         procedure pushed.  In this way, the second procedure can regard the
         parameters as local variables (NOTE: for VAR parameters, the calling
         procedure pushes pointers to the actual arguments, and the second
         procedure follows those pointers).

     In order to find out how much space to push onto the stack for each
         procedure call, the p-System consults the code segment itself.
         There, the compiler has written, for each procedure, how large an
         activation record that procedure requires.  The compiler itself finds
         out by totalling the sizes of all the local declarations (plus
         parameters) of each procedure.  Therefore, when the system pushes an
         activation record onto the stack, it may be regarded as a system
         allocation.

     It may also be considered a "static" allocation, because it doesn't 
         change between runs of the program: a given procedure will always 
         allocate (or have allocated for it) the same amount of space on the 
         stack.

   Routines Manipulating the Stack:

     None: only the p-System itself manipulates the stack.


  {$P}
  THE HEAP:

     A heap is simply an area of memory from which a user program may request 
         chunks for its own use.  Under most (non-UCSD) operating systems, 
         this area is unordered, and requires a certain amount of housekeeping 
         to make sure that it doesn't become fragmented, and that previously 
         requested chunks of memory which are no longer being used are 
         returned to the heap for other users to request. (In fact, this is a 
         major part of an operating system's job).

     Under p-System versions previous to IV.0, the heap serves the same 
         purpose, but it is organised as a stack: every time the (single) user 
         program requests that a chunk of memory be reserved for a particular 
         use, that chunk is pushed onto the heap.  Later, the program may 
         restore that chunk to the system (for other use) by releasing it, 
         which simply pops it off the heap.  

     The dangerous aspect to this is that if other allocations have been made
         on the heap above this particular one, they will also be restored to
         the system.  The information in them won't be destroyed yet, but as
         soon as the system has a need for memory on the heap, it will use
         the memory still being used by those other allocations.  Some care
         is therefore required in the use of the heap.

     Since memory on the heap is allocated in response to specific requests 
         from the user program, the pattern of heap allocations is determined 
         much more by the course of the program's execution than by its 
         construction.  It can therefore be called "dynamic allocation": what 
         is allocated (and when) may be very different from one execution to 
         another of the same program.

     The way a Pascal program requests a chunk of memory for its own use is 
         to call the standard procedure NEW (see declaration at the end of 
         this section).  NEW takes a pointer variable, pushes onto the heap an
         area the size of whatever type the particular pointer points to, and
         sets the pointer variable to point at the new allocation.  The top of 
         the heap is now the top of that allocation, and the next call to NEW 
         will push an area on top of it.  The pointer is essential; without
         it, the program would have no way of accessing the newly allocated
         memory.

    {$P}
   How Memory Looks after a Call to NEW:

    <-- toward Low Memory                                           High Memory

                        -----------------------------------
                        |                                 |  |- activation
                        V                                 |  V    record.
    ===========================================================================
    PREVIOUS ALLOCATION | MyAllocation|                  |O    | p-Code segment
    ===========================================================================
    |                                                     ^ pointer to
    |                                                       MyAllocation.
    |---------- HEAP -----------------|                  |------- STACK ------|
                                      ^
                            top of heap
                        (MARK would return this)

         NOTE: the relative sizes are NOT to scale.


     Being able to reserve more memory during the course of execution (rather 
         than having the amount available permanently set by the compiler) 
         enormously enhances the power of Pascal.  It is not necessary to 
         foresee all possible memory needs; nor would the declarations for 
         them be practical (or even possible) if one could.  When the need for 
         more memory is found (to allocated another item in a linked list, for 
         example, or to add another entry to a table), that amount can be 
         requested from the system.  The program therefore need not waste 
         memory with massive overestimates of the amount needed; nor must it 
         lack needed memory because of underestimates.

     The minimum amount of physical memory that the p-System requires to set 
         up this organisation is 64K bytes.  On machines which are byte-
         addressed (such as the Terak), 64K bytes is as much they can address. 
         Word-addressed machines, such as the PDQ-3, can address twice that 
         much (because each address refers to a word, rather than a byte).  On 
         word-addressed machines with 128K bytes, the p-System shifts the 
         stack and the heap so that the extra memory lies between them, giving 
         the whole system much more room to work.  Memory beyond that is 
         inaccessible to the p-System.

  Routines Manipulating the Heap:

     NEW(VAR Creation: PointerToAnything);
     MARK(VAR TopOfHeapMark: PointerToInteger);
     RELEASE(OldJunk: PointerToAnything);


{$P}
II. p-System versions IV.0 and IV.1:

  THE STACK:

     The stack works essentially as it does under the previous versions, with
         one major exception: code segments are NOT loaded onto it.  Instead,
         they are placed in a new structure, called the "code pool" (see
         below).


  THE HEAP:

     The heap is no longer organised simply as a stack: it has become more
         like the heap supported by large operating systems.  Memory can now be
         allocated and released anywhere within it, not only at the top. The
         extent to which it has been changed is not clear: although there now
         exist new routines (VARNEW and VARDISPOSE) which handle allocations
         and disposals anywhere in the heap, there still exist the old ones
         (NEW, MARK, and RELEASE) which work only on the top of the heap.

     A particular change in MARK and RELEASE must be noted.  Under previous 
         versions of the p-System, any allocation made by a call to NEW could 
         be eliminated by a call to RELEASE.  The procedure MARK did nothing 
         whatever to the heap; it simply returned the pointer to the heap's 
         current top.  All three procedures can take any kind of pointer as a 
         parameter.

     Under version IV, MARK and RELEASE must be paired (NEW and RELEASE 
         must NOT), and they can only take pointers to integers as parameters. 
         Passing anything else to them results in the run-time error "Illegal 
         heap operation".  MARK no longer simply returns a pointer; every call 
         to it allocates a new record in a linked list through the heap, which 
         the p-System apparently uses for keeping track of allocations.

     This change is a major discrepancy between version IV and the previous 
         versions, and can lead to serious difficulty in transporting
         programs between p-System versions.


    Routines Manipulating the Heap:

       NEW(VAR Creation: PointerToAnything);
       MARK(VAR HeapTopPointer: PointerToInteger);
       RELEASE(OldJunk: PointerToInteger);  <-- note version IV change.
       VARNEW(VAR Creation: PointerToAnything; CreateSize: INTEGER): INTEGER;
       VARDISPOSE(OldJunk: PointerToAnything; DisposeSize: INTEGER): INTEGER;

     Note that VARNEW and VARDISPOSE require the user to state how large the
         allocation (or disposal) is to be; they do not automatically allocate
         for the type the pointer points to.  They are functions; they return
         the amount of memory actually allocated or disposed.


  THE CODE POOL:

     Starting with p-System version IV.0, an new entity enters the memory
         organisation: the code pool.

     Whereas all previous versions of the p-System load code segments onto the 
         stack as they are called (from previous segments), IV.0 and IV.1 
         place them in an area called the "code pool", which floats between 
         the tops of the heap and the stack.  The systems moves it up or down 
         in memory as required to give the stack and the heap room to grow.

     The size of the code pool is intentionally very variable.  Almost any
         segment whose code is not actually in the process of being executed
         can be dropped from memory (under the previous versions, a segment
         is dropped as soon as it returns to the previous segment; if it is
         called again immediately afterward, it is re-loaded from disc). 
         This even applies to segments which have called other segments. 
         When the time comes for control to return to them, they will be re-
         loaded from disc.

     Note that the term "segment" includes units, which, under verion IV, can 
         be dropped from memory just as segment routines can.

     However, if there is no demand for new memory, the p-System keeps
         resident every segment it loads.  Several calls to a particular
         segment routine will therefore usually read the disc only once,
         unless the program requires a lot of memory.

     This organisation makes the version IV p-System much more effective than 
         the previous versions in satisfying user memory requirements.


  AVAILABLE MEMORY FUNCTIONS:

     Under all previous versions of the p-System, the function MEMAVAIL 
         returns the number of words of memory currently unused.  Although 
         this function still exists in IV, its return is almost meaningless.
         It is usually a gross underestimate of the memory that the system
         could make available if it had to.  There is a new function, 
         VARAVAIL, which takes as a parameter a string containing the names of 
         segments.  It returns the amount of memory which would now be free if 
         those segments were the ones currently in memory.  Although it 
         requires more work by the programmer, to work which of the program's 
         and the system's segments should be considered resident to make the 
         return meaningful, it does at least return a result more useful than 
         MEMAVAIL's.

     Example:  
 WordsLeft := VARAVAIL('MYPROG, MYSEG1, MYSEG2, SCREENOPS, REALOPS, STRINGOPS');
            {ScreenOps, RealOps, and StringOps are 3 units in SYSTEM.PASCAL}


  EXTENDED MEMORY:

     The same addressing restrictions apply to IV as apply to all previous 
         versions (since the restrictions are a characteristic of the machines 
         themselves).  However, version IV has two ways of permitting extra 
         memory to be used (even on byte-addressed machines).

   External Code Pools:

     In the usual 64K byte organisation, the code pool is "internal": it
         floats between the stack and the heap; therefore, some competition
         for memory still exists between code, stack, and heap.  However, if
         memory is extended to 128K bytes, an option may be selected which
         places the code pool in the upper 64K bytes, and leaves the stack and
         the heap in the lower 64K (still placed at the opposite ends of it,
         and still growing toward each other, with all of addressable memory
         in between).  This not only permits the stack and the heap much more
         growth, it also improves the system's performance in running code
         files, because much more code can remain resident.  There is seldom
         any need to re-read a segment from the disc.

   RAM Disc:

     Extra memory beyond this may be used as if it were another on-line disc;
         this is called RAM disc (for Random Access Memory).  The p-System
         gives it its own directory and name (usually RAM:, or RAMDISK:); and
         it may be set up so that when the system boots, all the files from
         the boot disc (particularly the system files) are loaded onto the RAM
         disc, which not only frees a drive for use with a different disc, but
         also greatly improves the system's performance when it runs those
         system files.

     The danger with RAM disc is that it is not permanent memory, as a true 
         disc is; so anything written to it must also be written to a real 
         disc if it is to be kept.  Furthermore, if the system should abort 
         and require re-booting in the execution of a program, the contents of 
         the RAM disc are, on most systems, lost.