This text is originated as an exercise for an university course about scientific writing at the Beuth University of Applied Sciences Berlin. The assignment was to choose a computer science paper, reproduce the key ideas in own words, and add some own thoughts about that topic as conclusion.
I have selected the classical paper "Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I" by John McCarthy from 1960 ([LISP]), because it permits a fascinating look into the history of programming languages and is the origin of many concepts that are still relevant today.
This text is also influenced by Paul Graham's article "Roots of Lisp" from 2002 ([ROOTS]) about that McCarthy paper. I follow Paul Graham's approach to provide code examples in actual LISP code instead of m-expressions, and I assume that quote and cond are elementary functions.
The paper ([LISP]) describes a dynamic typed and functional programming language called LISP. The name LISP is an abbreviation for LISt Processor, which is a very suitable name, because the whole syntax is completely based on a simple list notation for code and data.
LISP was developed in 1958, two years before the paper was published. The main purpose for the development was the lack of appropriate programming languages for artificial intelligence applications. At this time FORTRAN was the dominant high level programming language, but it was developed for numeric calculations and engineering tasks and therefore no good fit for AI problems.
LISP was influenced by IPL (Information Processing Language), which was an experimental programming language from 1957 (see [IPL]). IPL was dedicated to AI research, but also inappropriate because it was an assembly language. Some of the IPL concepts that LISP had adopted and heavily improved were: list-processing, higher-order functions, recursion and computation with symbols. Some other concepts were new, for example: conditional control flow, garbage collection, lazy evaluation, and dynamic typing.
At first, we will learn something about the mathematical concepts behind LISP. Then, we will see that the early LISP had only two simple data types. After that, we will define 5-7 elementary functions and we will use them as building blocks to create our own functions. Then, we will see how the memory management works. At the end, we will look, how LISP was doing in the past 55 years and how LISP is doing today.
Propositional expressions are expressions whose values are either T "true" or F "false". These expressions are often combined by connectives like ∧ "and", ∨ "or" and ¬ "not". Typical examples are:
The notation of conditional expressions was a new concept, developed by McCarthy in 1960. It is the ancestor of the "if...then...else" condition, who is part of nearly every programming language nowadays. Conditional expressions allow a recursive definition of functions in a convenient way. A conditional expression has the form:
The p’s are propositional expressions that are true or false. The e’s could be any kind of expression. One could read "if p1 then e1, else if p2 then e2, ..., else if pn then en" or "p1 yields e1, ..., pn yields en".
The p’s get evaluate from left to right. When the first true p is found, then the conditional expressions returns the e that belongs to the p.
The whole conditional expressions is undefined:
With the help of conditional expressions it is easy to define recursive functions. The factorial of a non-negative integer n could be described as follows:
The evaluation of 0! returns 1. The evaluation of 2! looks as follows:
The Lambda calculus is a formal notation, which is used in LISP to generate new functions and to use functions as arguments. It was introduced by Alonzo Church in 1941 (see [LAMBDA]).
Church distinguishes between forms and functions. An expression like $y^2 + x$ is a form. An expression like $f(3, 4)$ a function. $y^2 + x$ is not a function because the expression $y^2 + x(3, 4)$ does not determine and could turn into 19 or 13. The problem is that the order, in which the arguments 3 and 4 are inserted into the form, is undefined. To convert a form into a function we can write: is $2.50 for the first one, and $2.00 for each additional one
$\cal E$ is a form and $x_1, \cdots, x_n$ are the ordered parameters for $\cal E$. The λ-expression is a function because the variables in $\cal E$ can be substituted with arguments in the order of the parameter list $x_1, \cdots, x_n$. We say that the variables of a λ-expression are bounded. The example from above looks now like this:
And with arguments like this:
If we want to define a recursive function like
in lambda notation
we found that these definition is inadequate, because the right-hand side $sqrt$ can not serve as an expression for the whole function. Remember, a function would look like $sqrt(a,x,ε)$.
In order to define recursive λ-expressions, we must introduce a new notation.
f can be seen as the function name. The occurrence of f within $\cal E$ will be evaluated to the label-expression as if f is a parameter of the function.
The whole syntax of LISP is made of symbolic-expressions or shorter s-expressions. All code and data can be expressed by the following syntax elements.
S-expressions are formed by these characters: (, ), ., and an infinite set of distinguishable atoms. The only data types in this early version of LISP are atoms and lists.
Atoms are numbers or strings of characters. These strings are often symbols and represent an other atom or list. If LISP code gets interpreted or compiled, symbols get replaced with their referenced data items.
1 12 a ab aaaaabb t nil
Pairs are also called cons cells or cons. They are expressed in this form:
(1 . 2) (a . b)
A pair has two pointers to values. The first pointer is called car and the second pointer is called cdr. We will see later why this is relevant.
Pairs can be used to build more complex data structures like singly linked lists.
(1 . (2 . (3 . nil)))
We can see that the car pointer points to the first element of a pair and that the cdr pointer points to the second element, which is the rest of the list structure. Like (1 . rest), but the rest is the sublist (2 . (3 . nil)). The sublist (2 . (3 . nil)) is also a pair of (2 . rest), where the rest is (3 . nil). The list notation can be abbreviated.
(1 2 3)
Through the missing . dot pairs and lists are distinguishable. Now we give some examples of lists in their abbreviated and their dotted-pair notation.
(1) (1 . nil) (a b) (a . (b . nil)) (1 2 (3 4)) (1 . (2 . ((3 . (4 . nil))))) (1 2 . 3) (1 . (2 . 3)) ((a b) c) ((a . (b . nil)) . (c . nil)))
Some data structures can also be seen as binary trees.
((1 . 2) . (3 . 4))
In the original paper McCarthy didn't used s-expressions to describe LISP functions, but m-expressions, which stands for meta-expressions. M-expressions are very similar to the standard notation of functions. They take s-expressions as data and to avoid ambiguity they have brackets instead of parenthesis. A m-expression function could look like this:
M-expressions, which describes LISP functions, are called s-functions. However, to use these s-functions for computation, they had to get converted into pure s-expressions. Because the converted s-functions are much better readable in their s-expression notation, and because the additional level of indirection is unnecessary, m-expressions were almost never used for real LISP programming.
For this reason all examples from the McCarty paper had been converted into runnable LISP code. Common Lisp was the best choice for this purpose, because it implements, in contrast to Scheme or Clojure, all functions we will take a closer look at in a moment.
Now we give a very short introduction to the usage of functions. This is not part of the McCarthy paper, but helpful to follow the Common Lisp examples.
LISP functions are also lists and looking as follows:
(operator arg1 arg2 ... argn)
Some simple examples are given with arithmetic operators:
> (+ 1 2) 3 > (* 2 (- 6 3)) 6
Operators gets applied from left to right. (+ 1 2 3 4 5) is equal to 1 + 2 + 4 + 5. Nested expressions get evaluated at first. (+ (* 3 1) (* 2 3)) is equal to (+ 3 6). This notation for arithmetic operations is also known as Polish notation and was invented by Jan Łukasiewicz in 1928 (see Wikipedia).
To distinguish data from function calls we use a single quote ' to mark expressions as data. This prevents them from getting evaluated. The ' is an abbreviation for (quote x). x can be every expression.
> (1 2) Error ;; Because 1 is not a function name > '(1 2) (1 2) > (quote (1 2)) (1 2) > s Error ;; Because s is undefined > 's ;; Undefined symbols must be quoted s > (defparameter s 7) ;; Defines variable s > s 7
McCarthy defines in his Paper 5 elementary functions: atom, eq, car, cdr, and cons. This functions are implemented as assembly code and can not become substituted by other expressions. We will see later, how we can use these functions to build new functions with them.
(atom x) returns t "true" if x is an atom. Otherwise it returns nil. nil represents falsity and is equal to the empty list ().
> (atom 1) t > (atom '(1 2)) nil > (atom ()) t
(eq x y) returns t if x and y are the same atom. Otherwise it returns nil.
> (eq 1 4) nil > (eq 'a 'a) t > (eq nil ()) t
(car x) expects that x is not an atom, but a list. It returns the first element of x. car stand for "contents of the address part of register". We have seen in section 3.3.1 Pairs that the first pointer is called car too, because he points to the first element of the list.
> (car '(1 . 2)) 1 > (car '(a b c)) a
(cdr x) expects also that x is not an atom, but a list. It returns all elements after the first element of x. cdr is the abreviation of "contents of the decrement part of register". The cdr pointer of a list points to everything after the car pointer.
> (cdr '(1 . 2)) 2 > (cdr '(a b c)) (b c) > (cdr '('(a b) c)) (c)
(cons x y) creates the new pair (x . y) out of x and y. x and y can be atoms or lists. cons stands for construct.
> (cons 1 2) (1 . 2) > (cons 1 '(a b c)) (1 a b c) > (cons '(1 2) '(a b)) ((1 2) a b) > (cons '(1 c) '(3 . 3)) ((1 c) 3 . 3)
Some examples how car, cdr and cons could interact:
> (car (cons 1 'a)) 1 > (cdr (cons '(1 2 3) '(a b c))) (a b c) > (cons (car '(1 2)) (cdr '(a b c))) (1 b c)
cond is not directly one of the 5 elementary functions from the paper. McCarthy has introduced it implicitly as an conversion rule from m-expressions to s-expressions. According to Paul Graham (see [ROOTS]) cond is so essential that it can be seen as a sixth elementary function.
(cond (p1 e1)...(pn en)) is used to control the program flow. It works like described in section 2.2 Conditional expressions.
> (cond ((atom 'a) 1) ;; if 'a is an atom then return 1 ((eq 1 1) 2) ;; else if 1 is equal to 1 then return 2 (t 0)) ;; else return 0 1
For quote applies the same like for cond. It was only introduced implicitly, but it is also this important that it can be seen as the seventh elementary function. We already know how quote works, but for the sake of completeness:
(quote x) takes every expression x and prevent its evaluation. It can be abbreviate with 'x.
> (quote (cons 1 1)) (cons 1 1) > '(cons 1 1) (cons 1 1)
lambda and label provide a notation to define our own functions.
lambda is used, like described in section 2.4 Lambda calculus, to bind arguments on an anonymous function.
The s-expression of the lambda calculus looks like this: (lambda (p1...pn) e) and a function call with arguments has this form: ((lambda (p1...pn) e) a1...an).
> ((lambda (x y) (cons x y)) 1 2) (1 . 2)
For function calls of lambda we need, like in section 2.4 Lambda calculus, the label notation (label f (lambda (p1...pn) e)).
Since label is unavailable in Common Lisp, we give no code example and just abbreviate (label f (lambda (p1...pn) e)) to (defun f (p1...pn) e). defun is a Common Lisp macro that defines a function that uses these form:
(defun <function-name> (<parameter1><parameter2>...<parameterN>) <function-body>)
Now we have the power to create all computable functions!
McCarthy gives in his paper some examples of new functions like ff, subset, equal, null, cadr, caddr, list, among, pair, assoc, sublis, sub2, apply, appq, eval, maplist, and search.
With his small examples, he prepares the bigger eval/apply example. He proves with this functions that he can build a complete LISP interpreter out of the elementary functions in a few lines of code (see Paul Graham's Common Lisp version of it Listing 8.1).
We will pick some interesting smaller examples, to show the capabilities of this early version of LISP.
With the help of cond it is easy to build (not p), (and p q) and (or p q). p and q can be every expression. not, and, and or will return t or nil.
> (defun not (p) (cond (p nil) (t t))) > (not t) nil > (defun and (p q) (cond (p (cond (q t) (t nil))) (t nil))) > (and t t) t > (defun or (p q) (cond (p t) (t (cond (q t) (t nil))))) > (or nil t) t
(append x y) takes two lists and returns them concatenated.
> (defun null (x) (eq x nil)) > (defun append (x y) (cond ((null x) y) (t (cons (car x) (append (cdr x) y))))) > (append '(1 1) '(a a)) (1 1 a a)
append consists of two conditional clauses. The first condition ((null x) y) is the termination condition and returns y if x is nil. The second condition (t (cons (car x) (append (cdr x) y))) gets always called, except x is nil and does a couple of things:
Now we show a part of the stack for (append '(1 2 3) '(6 6)) to illustrate how the recursion works in detail:
(append '(1 2 3) '(6 6)) (cons 1 (append '(2 3) '(6 6))) (cons 1 (cons 2 (append 3 '(6 6)))) (cons 1 (cons 2 (cons 3 (append nil '(6 6))))) ;; x is nil (cons 1 (cons 2 (cons 3 '(6 6)))) (cons 1 (cons 2 '(3 6 6))) (cons 1 '(2 3 6 6)) '(1 2 3 6 6)
At the point where x is nil, the termination condition ((null nil) '(6 6)) gets invoked, and the last call of append returns '(6 6). Then the stack gets reduced to the resulting list '(1 2 3 6 6).
Higher-order functions are functions that take other functions as arguments, or functions that return functions.
(maplist f x) takes the function f and the list x as arguments, and applies f on every element in x.
For this example McCarthy introduces arithmetic operators like (+ a1...an), and (* a1...an), but he didn't give any further explanations how they are realized internally.
;; In Common Lisp > (defun maplist (f x) (cond ((null x) nil) (t (cons (funcall f (car x)) (maplist f (cdr x)))))) > (defun double (x) (* x 2)) > (maplist #'double '(5 6)) (10 12) ;; "funcall f" and "#'double" are "special" CL syntax ;; and theoretically unnecessary. ;; The syntax of Scheme is much cleaner: ;; In Scheme > (define (maplist f x) (cond ((null? x) null) (#t (cons (f (car x)) (maplist f (cdr x)))))) > (define (double x) (* 2 x)) > (maplist double '(5 6)) (10 12)
We have already seen in section 3.3 Lists how simple s-expressions are represented in the memory. Now we will see how more complex s-expressions can be stored. The subdivided rectangle represents a register that stores two pointers. This notation was introduce by Allen Newell and J.C. Shaw in 1957 (see [IPL]).
Substructures can occur in more than one place. The lists (1 3) and (2 3) share the pair (3 . nil). This has the advantage that subexpressions that occurs in more than one place must only stored once.
In some cases it is possible that a subexpression that occurs more than once is represented in more than one way. This is dependent on the history of the program. The list ((1 . 2) . (1 . 2)) could be an example.
Cycles in list structures are forbidden.
Association lists, also known as property lists, are lists of a special form to represent symbols. The first car field contains a special constant to tell the system that it deals with a symbol. The association list can contain further informations about the symbol like:
McCarthy describes in his paper only how print names are represented by association lists. The association list of the symbol differentiate has a substructure of these form:
Since LISP was developed on an IBM 704, which had a 36-bit architecture, the words in this example consists of six 6 bit characters.
Garbage collection is one of the most important concepts that were introduced within the paper. It is a form of automatic memory management and is used to free the memory from unnecessary list structures without any action of the programmer.
A so called free-storage list is used to catalogue the available free memory. The free-storage list starts with a register called free and holds a pointer to the first free register in the list. If the program requires a register for a new list structure, the first register of the free-storage list is taken, and the free pointer gets attached to the second register of the free-storage list. This mechanism is know today as dynamic memory allocation.
To decide which registers are still in use and which are abandoned, the base register holds pointers to all accessible list structures. If a list structure is not accessible by a chain of car and cdr operations it is garbage.
To free the memory from the unaccessible lists a mark-and-sweep algorithm is used. In the first phase of mark-and-sweep all accessible registers get marked. In the second phase all unmarked registers get appended to the end of the free-storage list and are reusable after that. In the third phase all marked registers get unmarked.
The garbage collection mechanism gets only activated if the program runs out of free storage.
McCarthy also describes a strategy for lazy evaluation of conditional expressions. Only the p’s and e’s that are required get computed.
He also introduces the stack as a new concept. Through the use of stacks recursion is possible. The problem of recursion without a stack is that the program needs the same registers, that are also used by the previous level of recursion, to store the actual recursion result. This conflict is resolved by pushing and popping the results on a stack to store them temporary.
Before we evaluate the quality of LISP as a programming language, we are going to outline the development of LISP until today. After 1958, LISP became successful as AI language. Countless new LISP dialects were developed by universities and companies. Because LISP had efficiency issues the creation of special hardware, so called Lisp machines, was necessary. During the 1970, 1980s, and the early 1990s LISP had a very good period with a lot of funding from the US government, the US military, and the industry. LISP programming was a standard topic at some universities.
In 1975, Scheme, one of the 3 major dialects that are still relevant today, was developed from Guy L. Steele and Gerald Jay Sussman at the MIT. The main characteristics of Scheme are: a very minimalistic syntax and core library, lexical scoping, and tail-call optimization.
In 1984, Common Lisp, the second of the 3 dialects, was developed to standardize some earlier LISPs. Common Lisp is characterized by an inconsistent syntax, an huge number of libraries, and an extension for object oriented programming.
In the mid-1990s the funding and the interest in AI research had dropped because of disappointment about the unsatisfying results in the AI field. During the "AI Winter" the popularity of LISP had decreased dramatically.
LISP had a small "comeback" in the 2000s where a couple of new LISPs were developed. One of them was the third important dialect Clojure, which was developed from Rich Hickey in 2007. Clojure is built upon the JVM and supports some new features like multithreading and immutable data types.
The most concepts from the McCarthy paper are still core concepts of recent LISPs. Over the years, LISP had also gained some new functionality and is today a multi-paradigm and general-purpose programming language. The DNA of almost every LISP can be described as:
It seems to me that there have been two really clean models of programming so far the C model and the Lisp model.
LISP is in many ways different than other programming languages. On the one hand, it has these extreme minimalistic syntax, on the other hand, it has nearly endless expressive power. This means that LISP code can get created or changed directly through other LISP code. These expressiveness is accomplished by the homoiconicity and macros. The most famous example is the CLOS (Common Lisp Object System), which is a language extensions that makes object oriented programming under Common Lisp possible. An other example are DSLs (Domain Specific Languages), which are easy writable in LISP. Basically, LISP is able to adapt every functionality and every paradigm without changing the compiler or interpreter. For example, Clojure supports logic programming (see core.logic on Github) and pattern matching (see core.match on Github) since 2011.
Lisp isn't a language, it's a building material.
But LISP is not as successful as one could expect from a language with all that superior functionality. If we compare the amount of LISP related questions with the amount of questions to other programming languages on stackoverflow.com, we found that LISP is not very popular. The percentage of questions about the three major LISP dialects is approximately on the same low level under 0.3%.
How low this is becomes more obvious when we compare LISP with some mainstream languages.
And even if we consider that functional programming languages are less frequently used than imperative programming languages, we must admit that it is rather unpopular.
Now we are going to discuss briefly some of the possible reasons for this situation.
The already small LISP community is not only divided into 3 main dialects, the dialects itself are also divided into many different implementations and flavors. Some examples are:
The fragmentation is a huge problem, because there is no joint effort to improve the dialects or LISP as a whole.
It appears as if Hacker News is the only relevant LISP application nowadays. Hacker News is an minimalistic social news website, which ranks articles through a voting systems. Because of its simplicity it can not serve as a promotion for the capabilities of LISP, like Dropbox is a promotion for Python or Github is a promotion for Ruby. Therefore companies and developers see no economic or technical advantages in using LISP for their projects.
The most commonly used development tools are often old and do not offer that comfort that modern tools for other languages do. Of course, Emacs is very sophisticated and offers every feature one could ever imagine, but it also seems very unfamiliar to some people, and it needs a lot of effort to learn to use it effectively. A positive exception is LightTable, a modern IDE for Clojure.
There are worldwide almost no LISP programming jobs available. functionaljobs.com lists at the moment 3 jobs and careers.stackoverflow.com lists 4 jobs. So, job opportunities can not serve as a motivation to spend time for LISP development.
We have seen in Fig. 6.1 that none of the major LISP dialects and even Clojure are able to attract new developers. The LISP approach is so different compared with imperative and object oriented languages that some people may have problems with that paradigm shift. Some people also complain that LISP code is hard to read with all the parentheses and recursion. It appears also to be a problem for newbies to find guidance for choosing a dialect and tools. Like LISP it self, a lot of LISP related content in the internet is very old and looks therefore outdated or abandoned. Even if they are not outdated or abandoned, many web sites leave a very poor first impression, like LispWorks and Racket.
However, LISP is still a beautiful programming language with an elegant syntax and endless expressiveness. This is even more impressive if one considers that it was developed 55 years ago.
Alonzo Church, The Calculi of Lambda-Conversion, Princeton University Press, Princeton, 1941
John McCathy, Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I, Massachusetts Institute of Technology, Cambridge, Mass., 1960, Source
; The Lisp defined in McCarthy's 1960 paper, translated into CL. ; Assumes only quote, atom, eq, cons, car, cdr, cond. ; Bug reports to email@example.com. (defun null. (x) (eq x '())) (defun and. (x y) (cond (x (cond (y 't) ('t '()))) ('t '()))) (defun not. (x) (cond (x '()) ('t 't))) (defun append. (x y) (cond ((null. x) y) ('t (cons (car x) (append. (cdr x) y))))) (defun list. (x y) (cons x (cons y '()))) (defun pair. (x y) (cond ((and. (null. x) (null. y)) '()) ((and. (not. (atom x)) (not. (atom y))) (cons (list. (car x) (car y)) (pair. (cdr x) (cdr y)))))) (defun assoc. (x y) (cond ((eq (caar y) x) (cadar y)) ('t (assoc. x (cdr y))))) (defun eval. (e a) (cond ((atom e) (assoc. e a)) ((atom (car e)) (cond ((eq (car e) 'quote) (cadr e)) ((eq (car e) 'atom) (atom (eval. (cadr e) a))) ((eq (car e) 'eq) (eq (eval. (cadr e) a) (eval. (caddr e) a))) ((eq (car e) 'car) (car (eval. (cadr e) a))) ((eq (car e) 'cdr) (cdr (eval. (cadr e) a))) ((eq (car e) 'cons) (cons (eval. (cadr e) a) (eval. (caddr e) a))) ((eq (car e) 'cond) (evcon. (cdr e) a)) ('t (eval. (cons (assoc. (car e) a) (cdr e)) a)))) ((eq (caar e) 'label) (eval. (cons (caddar e) (cdr e)) (cons (list. (cadar e) (car e)) a))) ((eq (caar e) 'lambda) (eval. (caddar e) (append. (pair. (cadar e) (evlis. (cdr e) a)) a))))) (defun evcon. (c a) (cond ((eval. (caar c) a) (eval. (cadar c) a)) ('t (evcon. (cdr c) a)))) (defun evlis. (m a) (cond ((null. m) '()) ('t (cons (eval. (car m) a) (evlis. (cdr m) a)))))