Suppose we want to make a list of symbols whose print names are the English words for the first five integers. We could do it using quoting, of course, like this:
Scheme>(define firstfive '(one two three four five)) #void Scheme>firstfive (one two three four five)
We don't have to quote each symbol individually. Within a quote
expression, everything is assumed to be literal data, not expressions to evaluate.
We could also do it by calling list
to construct the list, and
handing it each of the five symbols as literals. To do that, we
have to quote them, so that Scheme won't think we're referring to
variables named one
, two
, etc.
Scheme>(define firstfive (list 'one 'two 'three 'four 'five)) #void Scheme>firstfive (one two three four five)
Since list
is a procedure, its argument expressions are evaluated.
We use a quote
around each expression, so that it will return
a pointer to the appropriate symbol, rather than the value of the variable
by the same name.
This works whether or not there is a variable by that name, because names of symbols and names of variables are completely different things.
For example, even after evaluating the above expressions, attempting to
evaluate the expression four
will be an error, unless we've defined
a variable named four
. The existence of a symbol with a given print
name doesn't say anything about the existence of a variable with that name.
Since Scheme is dynamically typed, we can put any kind of object in a list. So far, we've made a list of integers and a list of symbols. We can also make a list of different kinds of things, such as a list of integers, symbols, and lists.
Scheme>(define mixed5 '(one 2 (three and a) "four" 5)) #void Scheme>mixed5 (one 2 (three and a) "four" 5)
Here we've constructed a mixed list whose first element is a symbol, the second is an integer, the third is a list of symbols, the fourth is a string, and the fifth is another integer. (The technical term for a mixed list is a "heterogeneous list.")
We can draw it like this:
+-----+ mixed5 | +--+-->+---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +-----+ | + | +-+->| + | +-+->| + | +-+->| + | +-+->| + | * | +-+-+---+ +-+-+---+ +-+-+---+ +-+-+---+ +-+-+---+ | | | | | \|/ \|/ | \|/ \|/ one 2 | "four" 5 | \|/ +---+---+ +---+---+ +---+---+ | + | +-+->| + | +-+->| + | * | +-+-+---+ +-+-+---+ +-+-+---+ | | | \|/ \|/ \|/ three and a
Notice that we draw the symbols (one
, three
, and
, and
a
) as simple sequences of characters. This is just a drawing
convention. They're really objects, like pairs are. We draw strings
similarly, but with double quotes around them. Don't be fooled--these
are objects on the heap, too. We just draw them this way to keep
the picture from getting cluttered up.
We've already seen two list-processing procedures that you'll use a lot,
car
and cdr
. car
takes a pointer to a pair, and
extracts the value of its first (car
) field. cdr
takes
a pointer to a pair and returns the value of its second (cdr
) field.
(It might have been better if car
had been called first
and cdr
had been called rest
, since that's more
suggestive of how they're used: a pointer to the first item in a
list, and a pointer to the pair that heads the rest of the list.)
Given our list stored in mixed5
, we can extract parts of the
list using car
and cdr
.
Scheme>(car mixed5) one Scheme>(cdr mixed5) (2 (three and a) "four" five)
By using car
and cdr
multiple times, we can extract
things beyond the first element. For example, taking the cdr
of the cdr
of a list skips the first two elements, and returns
the rest:
Scheme>(cdr (cdr mixed5)) ((three and a) "four" 5)
Taking the car of that list (that is, the car
of the cdr
of the cdr
) returns the first item in that list:
Scheme>(car (cdr (cdr mixed5))) (three and a)
We can keep doing this, for example taking the second element of that sublist by taking the car of its cdr.
Scheme>(car (cdr (car (cdr (cdr mixed5))))) and
This starts to get tedious and confusing--too many nestings of procedures
that do too little at each step--so Scheme provides a handful of procedures
that do two list operations at a whack. The two most important ones
are cadr
and cddr
.
cadr
takes the car
of the cdr
, which gives you the
second item in the list. cddr
takes the cdr
of the cdr
,
skipping the first two pairs in a list and returning the rest of the list.
This lets us do the same thing we did above, a little more concisely and readably:
Scheme>(cadr (car (cddr mixed5))) and
With a little practice, it's not hard to read a few nested expressions
like this. In this example, taking the cddr
of mixed5
skips down the list two places, giving us the list that starts with
the sublist we want. Then taking the car
of that gives us
the sublist itself off the front of that list, and taking the cadr
of that gives us the second item in the sublist.
Of course, even if Scheme didn't provide cadr
and cdr
,
you could write them yourself in terms of car
and cdr
:
(define (cadr x) (car (cdr x))) (define (cddr x) (cdr (cdr x)))
Scheme actually provides predefined list operations for all combinations
of up to four car
's and cdr
's. For example, cadadr
takes the cadr
of the cadr
. (The naming scheme is
that the pattern of a
's and d
's reflects the equivalent
nesting of calls to car
and cdr
.)
You probably won't want to bother with most of those, because the
names aren't very intuitive. Two procedures that are worth knowing
are list-ref
and list-tail
.
(list-ref
list n)
extracts the nth
element of a list list
, which is equivalent to n-1
applications of cdr
followed by car
. For example,
(list-ref '(a b c d e) 3)
is equivalent to
(car (cdr (cdr '(a b c d e))))
, and returns d
.)
In effect, you can index into a list as though it were an array, using
list-ref
. (Of course, the access time for an element of a list
is linear in the index of the element. If you need constant-time
access, you can use vectors, i.e., one-dimensional arrays.) Notice that
the numbering is zero-based, which is why (list-ref lis 3)
returns
the fourth element of a lis
. This is consistent with
the indexing of vectors, which are also zero-based, as well as reflecting
the number of cdr
operations.
(list-tail
list n)
skips the first n
elements of a list and returns a pointer to the rest, which is equivalent
to repeated applications of cdr
. (This is standard R4RS Scheme,
but not IEEE Scheme. If your Scheme doesn't provide list-tail
,
you can easily write your own.)
These two procedures can make it much clearer what you're doing
when you extract elements from nested lists. Suppose that we have
a list foo
, which is a triply-nested list--a list of
lists of lists, and we want to extract the second element of the
bottom-level list that is the third element of the middle-level
list that is the fourth element of the outermost list.
We could write (car (cdr (car (cdr (cdr (car (cdr (cdr (cdr foo)))))))))
,
but that's pretty hard to read. If we use cadr
, caddr
, and
cadddr
, we can make it somewhat more readable by using one function
call at each level of structure: (cadr (caddr (cadddr foo)))
.
But it's still clearer to write
(list-ref (list-ref (list-ref foo 4) 3) 2)
or (indented)
(list-ref (list-ref (list-ref foo 4) 3) 2)
list-ref
and list-tail
are much more convenient than
things like caddr
when the indexes into a list vary at run
time. For example, we might use an index variable i
(or
some other expression that returns an integer) to pick out the
ith member of a list: (list-ref foo i)
. Writing
this with car
and cdr
would require writing a loop
or recursion to perform n-1 cdr
's and a car
.
================================================================== This is the end of Hunk N At this point, you should go back to the previous chapter and read Hunk O before returning here and continuing this tutorial. ==================================================================
(Go BACK to read Hunk O, which starts at section Tail Recursion (Hunk O).)