let
One difference between a C or Pascal block and a Scheme let
is that let
variable bindings don't necessarily cease
to exist when the let
is exited, and the bindings therefore
can't be allocated on a stack in the general case. (The
reason for this will become clear when we talk about lambda
and closures.)
One way to visualize the creation of block variables is to see it as the creation of a new table mapping names to storage, like the toplevel environment in our interpreter.
Except for the new variables, the new environment (table) is the same
as the one that was in use when the block was entered. We say that
the let
expression "extends" the "outer" environment with
bindings for the let
variables.
Suppose we type a let
expression at the Scheme prompt,
(Assume we we're just doing the usual expression evaluation in
the usual top-level environment.)
Scheme>(let ((x 10) (y 20)) (+ x y)) 30
The interpreter maintains a pointer to the "current environment" when evaluating an expression. This pointer always points to the environment the currently-executing code must execute in, i.e., the variable bindings it must see for the variables it uses. When we evaluate a variable, we look for a binding of its name in the current environment, by searching up the environment chain starting from the "current environment" pointer.
Before evaluating the let
expression, Scheme's environment
pointer points to the top-level environment, which contains
the usual bindings holding the built-in Scheme procedures, plus
any top-level variables we've defined. Supposing we've defined
a variable foo
, we can draw the top-level environement
like this:
+-----+ +------+-----+ envt | * -+------->| car | *--+----> #<proc ...># +-----+ +------+-----+ | cons | *--+----> #<proc ...># +------+-----+ | + | *--+----> #<proc ...># +------+-----+ | * | * | * | +------+-----+ | foo | +--+----> #<proc ...># +------+-----+
(Here I've drawn the environment as a simple table of names and bindings. It might actually be implemented as an association list, as in our simple example interpreter, or more likely as a hash table.)
After entering the let
and creating the bindings for x
and y
, the interpreter changes the environment pointer to
point to the resulting new environment. This is typically
implemented by representing the environment as a chain of
tables called frames, rather than a simple flat table.
The newest frame is searched first, and so on down the chain, to
find the appropriate bindings. This environment chain is used as a
pointer-linked stack, for the most part, with new frames
being pushed onto the stack when a let
is entered, and popped
off the stack when a let
is exited.
Each frame holds bindings created by a particular binding construct
in the program, e.g., on entering a let
. A whole environment
is not represented by just one frame, but by the chain of frames starting
at a particular frame. In the figure below, for example, the smaller
binding frame only holds bindings of x
and y
, but
it represents an environment that includes bindings of car
,
etc. The environment inherits bindings from the environment it's
scoped inside, and this is what the scope link
is for.
+-------+-----+ | car | +--+----> #<proc ...># +-------+-----+ | cons | +--+----> #<proc ...># +-------+-----+ | + | +--+----> #<proc ...># +-------+-----+ | * | * | * | +-------+-----+ | foo | +--+----> #<proc ...># +-------+-----+ /|\ | | | +-----+ +-------+--+--+ envt | +--+------->|[scope]| * | +-----+ +-------+-----+ | x | 10 | +-------+-----+ | y | 20 | +-------+-----+
The link that connects the two frames (tables) is called a scope
link. It reflects the nesting of naming scopes in the program. In this
case, when a variable is referenced inside the let
, the search
for a binding begins at the new frame (table). If it is not
found, the search follows the scope link to the next frame
and looks there. This can continue for as many levels as there
are nested scopes in the program.
While we're executing in the new environment, bindings in the
new frame shadow (hide) any bindings of variables with the same
name in the outer environment.
For example, if there's a top-level variable named x
bound in the
top-level environment, they won't be seen by code executing in the
let
environment.
When we exit the let
, the current environment pointer is set back
to point to the same environment frame as before entering the let
.
In the usual case, that environment becomes garbage because there are no
pointers to it, and the garbage collector will eventually reclaim
its space.
Keep in mind that an environment is a language-level entity, and just consists of a set of bindings; it is just a mapping from names to storage that can hold values. Our chain of frames is a convenient and efficient data structure we've chosen to implement this abstraction, so that environments nest properly.
The difference between frames and environments is similar to the difference between pairs and lists. A pointer to a pair that begins a list may be thought of as pointer to the pair itself, or as a pointer to the whole list of pairs connected by their cdr's. Likewise, a pointer to an environment frame points to that frame, but also points to the sequence of environment frames connected by their scope links. This corresponds to the scope rule that an expression can see bindings created by enclosing expressions, as long as they're not shadowed by another binding in between.
As with pairs and lists, environment frames can share structure, and the same frame may be part of several (nested) environments. (As we'll see later, this is important for implementing lexical scope correctly in the presence of first-class procedures.)
Unlike pairs and lists, however, a particular environment doesn't necessarily include all of the parts of all of the frames in the sequence. The scope rules of the language allow shadowing of bindings in outer environments by bindings in inner environments; our lookup routine implements this by searching an environment chain in inner-to-outer order, and taking the first binding of the name.
If we think of environments as sets of bindings, the act of pushing an environment frame on the front of an environment creates a new environment with new bindings added--and with any shadowed bindings removed. We don't actually modify the old environments, however--they're not changed by the creation of new environments with scope links to them. What effectively "removes" a binding to create a new envioronment is our search procedures--it searches an environment front-to back, to find the binding of a name, and ignores any bindings after the first one it finds.
As we will see later, it's significant that we never actually modify the structure of an environment chain once it's created--we never clobber the scope links. This allows us to save a pointer to an environment by saving a pointer to the first frame in the chain, and restore that environment later by simply setting our environment pointer to point to that frame. In effect, we can switch from one whole environment (set of bindings) to another just by changing a pointer. It also lets us have environments that share structure--nested environments simply have more frames in their chains, and the chains share structure with they're nested in. These properties of environment chains turn out to be very convenient when implementing procedure calling.