An Introduction to Scheme and its Implementation

Go to the first, previous, next, last section, table of contents.

`let`

One difference between a C or Pascal block and a Scheme let is that let variable bindings don't necessarily cease to exist when the let is exited, and the bindings therefore can't be allocated on a stack in the general case. (The reason for this will become clear when we talk about lambda and closures.)

One way to visualize the creation of block variables is to see it as the creation of a new table mapping names to storage, like the toplevel environment in our interpreter.

Except for the new variables, the new environment (table) is the same as the one that was in use when the block was entered. We say that the let expression "extends" the "outer" environment with bindings for the let variables.

Suppose we type a let expression at the Scheme prompt, (Assume we we're just doing the usual expression evaluation in the usual top-level environment.)

Scheme>(let ((x 10) (y 20))
          (+ x y))
30

The interpreter maintains a pointer to the "current environment" when evaluating an expression. This pointer always points to the environment the currently-executing code must execute in, i.e., the variable bindings it must see for the variables it uses. When we evaluate a variable, we look for a binding of its name in the current environment, by searching up the environment chain starting from the "current environment" pointer.

Before evaluating the let expression, Scheme's environment pointer points to the top-level environment, which contains the usual bindings holding the built-in Scheme procedures, plus any top-level variables we've defined. Supposing we've defined a variable foo, we can draw the top-level environement like this:

     +-----+        +------+-----+
envt |  * -+------->|  car |  *--+----> #<proc ...>#
     +-----+        +------+-----+
                    | cons |  *--+----> #<proc ...>#
                    +------+-----+
                    |   +  |  *--+----> #<proc ...>#
                    +------+-----+
                    |      *     |
                           * 
                    |      *     |
                    +------+-----+
                    |  foo |  +--+----> #<proc ...>#
                    +------+-----+

(Here I've drawn the environment as a simple table of names and bindings. It might actually be implemented as an association list, as in our simple example interpreter, or more likely as a hash table.)

After entering the let and creating the bindings for x and y, the interpreter changes the environment pointer to point to the resulting new environment. This is typically implemented by representing the environment as a chain of tables called frames, rather than a simple flat table. The newest frame is searched first, and so on down the chain, to find the appropriate bindings. This environment chain is used as a pointer-linked stack, for the most part, with new frames being pushed onto the stack when a let is entered, and popped off the stack when a let is exited.

Each frame holds bindings created by a particular binding construct in the program, e.g., on entering a let. A whole environment is not represented by just one frame, but by the chain of frames starting at a particular frame. In the figure below, for example, the smaller binding frame only holds bindings of x and y, but it represents an environment that includes bindings of car, etc. The environment inherits bindings from the environment it's scoped inside, and this is what the scope link is for.

                    +-------+-----+
                    |  car  |  +--+----> #<proc ...>#
                    +-------+-----+
                    | cons  |  +--+----> #<proc ...>#
                    +-------+-----+
                    |   +   |  +--+----> #<proc ...>#
                    +-------+-----+
                    |       *     |
                            *
                    |       *     |
                    +-------+-----+
                    |  foo  |  +--+----> #<proc ...>#
                    +-------+-----+
                              /|\
                               |
                               |
                               |
     +-----+        +-------+--+--+
envt |  +--+------->|[scope]|  *  |
     +-----+        +-------+-----+
                    |   x   |  10 |
                    +-------+-----+    
                    |   y   |  20 |    
                    +-------+-----+

The link that connects the two frames (tables) is called a scope link. It reflects the nesting of naming scopes in the program. In this case, when a variable is referenced inside the let, the search for a binding begins at the new frame (table). If it is not found, the search follows the scope link to the next frame and looks there. This can continue for as many levels as there are nested scopes in the program.

While we're executing in the new environment, bindings in the new frame shadow (hide) any bindings of variables with the same name in the outer environment. For example, if there's a top-level variable named x bound in the top-level environment, they won't be seen by code executing in the let environment.

When we exit the let, the current environment pointer is set back to point to the same environment frame as before entering the let. In the usual case, that environment becomes garbage because there are no pointers to it, and the garbage collector will eventually reclaim its space.

Keep in mind that an environment is a language-level entity, and just consists of a set of bindings; it is just a mapping from names to storage that can hold values. Our chain of frames is a convenient and efficient data structure we've chosen to implement this abstraction, so that environments nest properly.

The difference between frames and environments is similar to the difference between pairs and lists. A pointer to a pair that begins a list may be thought of as pointer to the pair itself, or as a pointer to the whole list of pairs connected by their cdr's. Likewise, a pointer to an environment frame points to that frame, but also points to the sequence of environment frames connected by their scope links. This corresponds to the scope rule that an expression can see bindings created by enclosing expressions, as long as they're not shadowed by another binding in between.

As with pairs and lists, environment frames can share structure, and the same frame may be part of several (nested) environments. (As we'll see later, this is important for implementing lexical scope correctly in the presence of first-class procedures.)

Unlike pairs and lists, however, a particular environment doesn't necessarily include all of the parts of all of the frames in the sequence. The scope rules of the language allow shadowing of bindings in outer environments by bindings in inner environments; our lookup routine implements this by searching an environment chain in inner-to-outer order, and taking the first binding of the name.

If we think of environments as sets of bindings, the act of pushing an environment frame on the front of an environment creates a new environment with new bindings added--and with any shadowed bindings removed. We don't actually modify the old environments, however--they're not changed by the creation of new environments with scope links to them. What effectively "removes" a binding to create a new envioronment is our search procedures--it searches an environment front-to back, to find the binding of a name, and ignores any bindings after the first one it finds.

As we will see later, it's significant that we never actually modify the structure of an environment chain once it's created--we never clobber the scope links. This allows us to save a pointer to an environment by saving a pointer to the first frame in the chain, and restore that environment later by simply setting our environment pointer to point to that frame. In effect, we can switch from one whole environment (set of bindings) to another just by changing a pointer. It also lets us have environments that share structure--nested environments simply have more frames in their chains, and the chains share structure with they're nested in. These properties of environment chains turn out to be very convenient when implementing procedure calling.

Go to the first, previous, next, last section, table of contents.