Code Is Data
The fact that Clojure's unconventional syntax asks you to type in the AST directly has a surprising implication: all Clojure code is data. In fact, there is no formal distinction between code and data in Clojure. When code is represented in text files, it exists as a set of nested forms (or S-expressions). When these forms are parsed by the reader (the stage that turns source text into data before it is compiled), they become Clojure data structures, no different in kind from the data structures you create yourself in Clojure code.
We call this property homoiconicity. In a homoiconic language, code and data are the same kind of thing. This is very different from a language like Java, in which variables and the code that manipulates them live in two separate conceptual spaces.
As a result of Clojure's homoiconicity, every Clojure program is a data structure, and every Clojure data structure can potentially be interpreted as a program. The data structure that is the program is available for the program to modify at run time. This arguably allows for the most powerful metaprogramming possible in any language.
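A minimal sketch of what this looks like at the REPL, using only clojure.core functions (read-string, eval, first, rest, cons). The names form and product-form are made up for illustration.

```clojure
;; Read a piece of source text into an ordinary data structure (a list).
(def form (read-string "(+ 2 3 4)"))

(list? form)   ;; => true
(first form)   ;; => +        (a symbol)
(rest form)    ;; => (2 3 4)

;; Evaluate the list as code.
(eval form)    ;; => 9

;; Because the program is just a list, ordinary seq functions can rewrite it.
(def product-form (cons '* (rest form)))

product-form        ;; => (* 2 3 4)
(eval product-form) ;; => 24
```

In everyday Clojure this kind of rewriting is usually done by macros rather than by calling eval directly; eval is used here only because it makes the code-to-data round trip visible.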
Data Types...?
Since all Clojure code is a Clojure data structure, we don't speak of data types and data structures the way we do in a conventional language. Instead, we speak of forms. Forms are text strings processed by the Clojure Reader.
| Form Name | Description | Examples |
|---|---|---|
| String | A string of characters, implemented by java.lang.String. | "angry monkey", "Mutable state considered harmful" |
| Number | A numeric literal that evaluates to itself. | 6.023E23, 42 |
| Ratio | A rational number. | 22/7, 1/3, 24/601 |
| Boolean | A boolean literal form. False and nil evaluate to false; true and everything else evaluate to true. Also returned by predicate functions. | true, false |
| Character | A single character literal, implemented by java.lang.Character. | \z, \3, \space, \tab, \newline, \return |
| Nil | The null value in Clojure. | nil |
| Keyword | A form beginning with a colon that evaluates to itself. Also a function that looks itself up in a map. | :deposed, :royalty |
| Symbol | A name that refers to something. Symbols may be function names, data, Java class names, and namespaces. | str-join, java.lang.Thread, clojure.core, +, *source-path* |
| Set | A collection of unique elements. Also a function that looks up its own elements. | #{:bright :copper :kettles}, #{1 3 5 7 13} |
| Map | A collection of key/value pairs. Note that commas are optional. Also a function that looks up its own keys. | {:species "monkey" :emotion "angry"}, {"A" 23, "B" 83} |
| Vector | An ordered collection with high-performance indexing semantics. Also a function that looks up an element by its position. | [1 1 2 3 5 8] |
| List | An ordered collection, also known as an S-expression, or sexp. The fundamental data structure of Clojure. When the list is not quoted, the first element is interpreted as a function name, and is invoked. | '(13 17 19 23), (map str [13 17 19 23]) |
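Several of the descriptions above say that a form is "also a function". The short sketch below shows what that means in practice, together with the truthiness rule and exact ratio arithmetic. The name monkey and the sample values are illustrative only.

```clojure
(def monkey {:species "monkey" :emotion "angry"})

(:emotion monkey)                      ;; keyword as function => "angry"
(monkey :species)                      ;; map as function     => "monkey"
(#{:bright :copper :kettles} :copper)  ;; set as function     => :copper
(#{:bright :copper :kettles} :dull)    ;; element not present => nil
(["a" "b" "c"] 1)                      ;; vector as function  => "b"

;; Only false and nil are logically false; everything else is true.
(if 0 :truthy :falsey)                 ;; => :truthy
(if nil :truthy :falsey)               ;; => :falsey

;; Ratios stay exact instead of becoming floating-point numbers.
(+ 1/3 1/6)                            ;; => 1/2
```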
Note that all Clojure data structures are immutable, even things like maps and lists that we normally think of as mutable. Any time you perform an operation on a data structure to change it, you are actually creating a new structure that has the modification, and the original is left untouched. If this seems horribly inefficient, don't worry; Clojure's data structures share most of their internal structure with the version they were derived from, so creating a modified version is cheap.
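As a small illustration of immutability, the sketch below "modifies" a map and a vector with assoc and conj from clojure.core and then checks that the originals are unchanged. The names original, calmer, and v are made up for the example.

```clojure
(def original {:species "monkey" :emotion "angry"})

;; assoc returns a new map; it does not touch the one it was given.
(def calmer (assoc original :emotion "calm"))

original   ;; => {:species "monkey", :emotion "angry"}
calmer     ;; => {:species "monkey", :emotion "calm"}

(def v [1 2 3])

(conj v 4) ;; => [1 2 3 4]
v          ;; => [1 2 3]
```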
Mutability in Clojure
Clojure invites you to take a slightly different view of variables from the imperative languages you are used to. Conceptually, Clojure separates identity from value. An identity is a logical entity that has a stable definition over time, and can be represented by one of the reference types (ref, agent, and atom). An identity "points to" a value, which by contrast is always immutable. We could bind a name to a value as follows, which is analogous to assignment in Java, but there is no idiomatic way to change the value bound to that name:
=> (def universal-answer 42)
#'user/universal-answer
=> universal-answer
42
=> ; Doing it wrong
=> (def universal-answer 43)
#'user/universal-answer
In the example above, 42 is the value bound to the name universal-answer. If we wanted the universal answer to be able to change, we might use an atom:
=> (def universal-answer (atom "what do you get when you multiply six by nine"))
#'user/universal-answer
=> (deref universal-answer)
"what do you get when you multiply six by nine"
Note that we access the value of an atom using the deref function. To change the value pointed to by an atom, we must be explicit. For example, to change the universal answer to be a function instead of a number or a string, we use the reset! function:
=> (reset! universal-answer (fn [] (* 6 9)))
#<user$eval11$fn__12 user$eval11$fn__12@d5e92d7>
=> ((deref universal-answer))
54
The double parentheses around the deref call are necessary because the value held by universal-answer is a function. Wrapping that function in parentheses causes Clojure to invoke it, returning the value 54. Note that the number, the string, and the function above are values, and do not change. The atom named by universal-answer is an identity, and the value it points to changes over time.
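Besides reset!, which replaces the value outright, clojure.core also provides swap!, which computes the new value by applying a function to the current one. swap! is not used in the text above; the sketch below is an illustration with a made-up counter atom.

```clojure
(def counter (atom 0))

;; swap! applies the function to the current value and stores the result.
(swap! counter inc)    ;; => 1
(swap! counter + 10)   ;; => 11

;; @ is reader shorthand for deref.
@counter               ;; => 11

;; reset! ignores the current value entirely.
(reset! counter 0)     ;; => 0
```

swap! is usually the better fit when the new value depends on the old one, because the update function is retried if another thread changes the atom in the meantime.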
In traditional concurrent programming, synchronizing access to shared variables is the limiting factor in creating correct programs, and is an intellectually daunting task besides. Clojure provides an elegant solution in its reference types. In addition to refs, we have atoms and agents for concurrently managing mutable state. Together these three types form a significantly improved abstraction over traditional threading and synchronization. You can read more about them here: http://clojure.org/refs, http://clojure.org/atoms, http://clojure.org/agents.
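To give a feel for the ref type, here is a minimal sketch of a coordinated update using ref, dosync, and alter from clojure.core. The account names and the transfer! function are hypothetical, chosen only to show two identities changing together inside one transaction.

```clojure
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to another inside a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)

[@checking @savings]   ;; => [70 30]
```

Both alter calls either take effect together or not at all, which is the property that makes refs suitable for state that must stay consistent across several identities.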
Sequences
Clojure's aggregate forms (i.e., String, Map, Vector, List, and Set) can all be interpreted as sequences, or simply "seqs" (pronounced "seeks"). A seq is an immutable collection on which we can perform three basic operations:
- first: returns the first item in the sequence
=> (first [2 7 1 8 2 8 1 8 2 8 4 5 9 0])
2
- rest: returns a new sequence containing all elements except the first
=> (rest [2 7 1 8 2 8 1 8 2 8 4 5 9 0])
(7 1 8 2 8 1 8 2 8 4 5 9 0)
- cons: returns a new sequence containing a new element added to the beginning
=> (cons 2 [7 1 8 2 8 1 8 2 8 4 5 9 0])
(2 7 1 8 2 8 1 8 2 8 4 5 9 0)
These three functions form the backbone of seq functionality, but there is a rich library of additional seq functions in clojure.core.
See here for more details: http://clojure.org/sequences and http://clojuredocs.org/quickref/Clojure%20Core#Collections+-+SequencesSequences.
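The same three operations work on the other aggregate forms, and the wider seq library in clojure.core builds on them. The sketch below uses only clojure.core functions; the sample values are illustrative.

```clojure
;; Strings are treated as sequences of characters.
(first "monkey")                 ;; => \m
(rest "monkey")                  ;; => (\o \n \k \e \y)

;; Maps are treated as sequences of key/value pairs.
(first {:species "monkey"})      ;; => [:species "monkey"]

;; cons works on lists just as it does on vectors.
(cons 0 '(1 2 3))                ;; => (0 1 2 3)

;; A few of the additional seq functions.
(map inc [1 2 3])                ;; => (2 3 4)
(filter even? [1 2 3 4 5 6])     ;; => (2 4 6)
(reduce + [1 2 3 4])             ;; => 10
```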
Clojure sequences can be infinite, like the sequence of all positive integers:
=> (def positive-integers (iterate inc 1))
Since this sequence would take infinite memory to realize, Clojure provides the concept of a lazy sequence. Lazy sequences are only evaluated when they are needed at run time. If we tried to print the sequence of all primes, we would need infinite memory and time:
=> (use '[clojure.contrib.lazy-seqs :only (primes)])
=> primes
=> ; requires extreme patience
However, we can efficiently reach into that lazy sequence and grab the members we need without constructing the whole thing:
=> (use '[clojure.contrib.lazy-seqs :only (primes)])
=> (take 10 (drop 10000 primes))
(104743 104759 104761 104773 104779 104789 104801 104803 104827 104831)
Of course, lazy sequences are not magical. They will require enough memory and computation to generate the values we request, but they defer that computation until needed, and usually don't attempt eager computation of the entire sequence.
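The deferral can be observed directly. The sketch below counts how many elements of a lazy sequence are actually computed; it uses only clojure.core, and the names computed and squares are made up for the example. It assumes a source built with iterate, which is realized one element at a time.

```clojure
(def positive-integers (iterate inc 1))

(take 5 positive-integers)   ;; => (1 2 3 4 5)

;; Count how many squares are actually computed.
(def computed (atom 0))

(def squares
  (map (fn [n]
         (swap! computed inc)  ; side effect, only here to observe laziness
         (* n n))
       positive-integers))

@computed                    ;; => 0, defining the sequence computes nothing
(doall (take 3 squares))     ;; => (1 4 9)
@computed                    ;; => 3, only the requested elements were computed
```

Some sources, such as vectors, are realized in chunks rather than one element at a time, so the count for those would be larger than the number of elements requested.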