
Chapter 8. Organizing and Building Clojure Projects



Defining and Using Namespaces

As we said in “Namespaces” on page 20,[2] Clojure’s namespaces:

• Are dynamic mappings of symbols to Java class names and vars, the latter containing any value you specify (most often functions, constant data, and reference types)

• Are roughly analogous to packages in Java and modules in Python and Ruby

All Clojure code is defined within namespaces. If you neglect to define your own, any

vars you define will be mapped into the default user namespace. While fine for a lot of

REPL interactions, that’s almost never a good idea once you want to build something

to last and be used by others. We need to know how to define namespaces idiomatically,

how they map onto individual source files, and how they are best used to provide high-level structure and organization for your Clojure codebase. Clojure provides discrete

functions for manipulating the minutiae of namespaces (very useful at the REPL), as

well as a unification of those functions into a single macro that we can use to declare

in one place a namespace’s name, top-level documentation, and dependencies on other

namespaces and Java classes.

in-ns. def and all of its variants (like defn) define vars within the current namespace, which is always bound in *ns*:

*ns*

;= #<Namespace user>

(defn a [] 42)

;= #'user/a



Using in-ns, we can switch to other namespaces (creating them if they don’t already

exist), thereby allowing us to define vars in those other namespaces:

(in-ns 'physics.constants)

;= #<Namespace physics.constants>

(def ^:const planck 6.62606957e-34)

;= #'physics.constants/planck



However, we’ll quickly discover that something is awry in our new namespace:

(+ 1 1)

;= #<CompilerException java.lang.RuntimeException:
;= Unable to resolve symbol: + in this context, compiling:(NO_SOURCE_PATH:1)>



The + function (and all other functions in the clojure.core namespace) aren’t available

as they are in the default user namespace we’ve worked within all along—though they

are accessible using a namespace-qualified symbol:

(clojure.core/range -20 20 4)

;= (-20 -16 -12 -8 -4 0 4 8 12 16)



2. If you’ve not digested that section yet, do so now; that is where we introduce namespaces at the most

basic level, talk about symbols, vars, and how the former resolve to the latter.






Remember that namespaces are mappings of symbols to vars; while in-ns switches us

to the namespace we name, that’s all it does. Special forms remain available (including

def, var, ., and so on), but we need to load code from other namespaces and map vars

named there into our new namespace in order to use that code reasonably succinctly.
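As a rough illustration of how bare such a namespace is, the sketch below inspects a freshly created one. It uses create-ns, ns-interns, ns-refers, and ns-resolve from clojure.core, none of which this section has introduced; the namespace name sketch.empty and the var names are made up for the example, and create-ns is assumed to yield the same kind of empty namespace that in-ns would, only without switching to it.

```clojure
;; create-ns makes a namespace without changing *ns*, so this can be
;; evaluated from any namespace. 'sketch.empty is a hypothetical name.
(def fresh-ns (create-ns 'sketch.empty))

;; Nothing is defined in the new namespace, and nothing is referred into it.
(def interned-count (count (ns-interns fresh-ns)))
(def referred-count (count (ns-refers fresh-ns)))

;; + has no mapping there, while java.lang classes are mapped by default.
(def plus-in-fresh   (ns-resolve fresh-ns '+))
(def string-in-fresh (ns-resolve fresh-ns 'String))
```

Both counts should come out as zero and plus-in-fresh as nil, which is the state that produces the "Unable to resolve symbol" error shown earlier.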

refer. Assuming a namespace is already loaded, we can use refer to add mappings

to its vars for our namespace. We defined a dummy function a in the user namespace

earlier. We can establish mappings in our empty namespace for all of the public vars

in user, allowing us to access a more easily:

user/a

;= #<user$a user$a@…>

(clojure.core/refer 'user)

;= nil

(a)

;= 42



a is now mapped within our current namespace to the var at user/a, and we can use it as if it were defined locally. That’s certainly easier than having to use namespace-qualified symbols everywhere to access vars in other namespaces.

refer can be used to do more than a simple “import” though: you can specify that certain vars be excluded, included, or renamed when they are mapped into the current namespace by using optional keyword args of :exclude, :only, and :rename, respectively. For example, let’s refer to clojure.core, but exclude some functions and map some of the arithmetic operators to different names locally:

(clojure.core/refer 'clojure.core
  :exclude '(range)
  :rename '{+ add
            - sub
            / div
            * mul})

;= nil

(-> 5 (add 18) (mul 2) (sub 6))

;= 40

(range -20 20 4)

;= #<CompilerException java.lang.RuntimeException:
;= Unable to resolve symbol: range in this context, compiling:(NO_SOURCE_PATH:1)>



Now we can use all the public functions[3] from clojure.core (except for range, which

we excluded), and we’re using different names for some of the arithmetic functions.

While clojure.core is always preloaded (and refered to in the user namespace), we’ll

often need more than that, and we’ll want to define multiple namespaces ourselves in

order to organize our codebases sensibly. We need a facility for loading namespaces.
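The :only option is not exercised in the example above, so here is a minimal sketch of it combined with :rename. The target namespace sketch.arith and the var names are hypothetical, and binding *ns* is used only so that the sketch can run without leaving whatever namespace it is evaluated in.

```clojure
;; 'sketch.arith is a made-up, initially empty namespace.
(def target (create-ns 'sketch.arith))

(binding [*ns* target]
  ;; Map only + and - into the target, exposing + under the local name add.
  (refer 'clojure.core :only '[+ -] :rename '{+ add}))

(def add-var   (ns-resolve target 'add))   ; the var #'clojure.core/+
(def minus-var (ns-resolve target '-))     ; the var #'clojure.core/-
(def range-var (ns-resolve target 'range)) ; nil: range was not listed in :only
```

Because :only is a whitelist, nothing else from clojure.core is mapped into the target, which is the opposite default from :exclude.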



3. refer will not bring in any private vars from the source namespace. See “Vars” on page 198 for details on

private vars.



Project Geography | 323






refer is rarely used directly, but its effects and options are available

through use, which is widely used.



require and use. When some code needs to make use of functions or data defined

in public vars in another namespace, require and use are used to:

1. Ensure that the namespaces in question are loaded.

2. Optionally establish aliases for those namespaces’ names.

3. Trigger the implicit use of refer to allow code to refer to other namespaces’ vars

without qualification.

require provides (1) and (2); use is built on top of it and refer to provide (3) in a succinct way.

Let’s start with a new REPL, where we’d like to use the union function in the clojure.set namespace:

(clojure.set/union #{1 2 3} #{4 5 6})

;= #<ClassNotFoundException java.lang.ClassNotFoundException: clojure.set>



Wait, that namespace isn’t loaded yet—only clojure.core is preloaded. We can use

require to load the clojure.set namespace from the classpath;[4] afterward, we can use

any function within that namespace:

(require 'clojure.set)

;= nil

(clojure.set/union #{1 2 3} #{4 5 6})

;= #{1 2 3 4 5 6}



Having to use fully qualified symbols to name vars can be a pain though, especially if

the libraries you are using provide namespaces that are long or have a number of segments. Thankfully, require provides a way to specify an alias for a namespace:

(require '[clojure.set :as set])

;= nil

(set/union #{1 2 3} #{4 5 6})

;= #{1 2 3 4 5 6}



The vector arguments provided to require and use are sometimes called libspecs:

they specify how a library is to be loaded and referred to within the current

namespace.

4. See “Namespaces and files” on page 328 for how Clojure namespaces correspond to files on disk, and “A classpath primer” on page 331 for what the classpath is and why you should care.

When you need to require multiple namespaces that share a common prefix, you can provide to require a sequential collection where the first element is the namespace prefix and the remaining elements are the remaining segments specifying the namespaces you’d like to load. So, if we wanted to require both clojure.set and clojure.string, we would not have to repeat the clojure prefix:

(require '(clojure string [set :as set]))
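A short sketch of what that prefix list leaves you with: clojure.string is loaded without an alias, so its vars still need the full namespace name, while clojure.set is reachable through the set alias. The var names shouted and merged are made up for the example.

```clojure
;; Load clojure.string (no alias) and clojure.set (aliased as set) in one form.
(require '(clojure string [set :as set]))

;; No alias was given for clojure.string, so the full name is required.
(def shouted (clojure.string/upper-case "planck"))

;; The set alias stands in for clojure.set.
(def merged (set/union #{1 2} #{2 3}))
```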



use provides all of the capabilities of require, except that by default, it refers the given

namespace after it is loaded. So, (use 'clojure.xml) is the equivalent of:

(require 'clojure.xml)

(refer 'clojure.xml)



In addition, use passes along all of its arguments to refer, so you can leverage the

latter’s :exclude, :only, and :rename options to their fullest. To illustrate, let’s consider

a scenario where we need to use clojure.string and clojure.set:

1. We’re happy to refer all of the vars in the latter into our current namespace, but…

2. We have a number of local functions whose names conflict with those in clojure.string; a simple namespace alias (using :as with require) will work there, but…

3. We need to use clojure.string/join a lot, and it doesn’t conflict with any functions

in our current namespace, so we’d like to avoid the namespace alias in that case.

4. clojure.string and clojure.set both define a join function; attempting to refer

both of them in will result in an error, so we want to prefer clojure.string/join.

use can accommodate these criteria readily:

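(use '(clojure [string :only (join) :as str]
               [set :exclude (join)]))
;= nil
join
;= #<string$join clojure.string$join@…>
intersection
;= #<set$intersection clojure.set$intersection@…>
str/trim
;= #<string$trim clojure.string$trim@…>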



We can now access clojure.string’s join function without any namespace qualification; every public var in clojure.set except its own join has been refered into our namespace (including intersection); and the entire clojure.string namespace is available via the str alias.



Using require, refer, and use Effectively

These functions in concert provide many subtle options, especially compared to the

blunt instruments that are import in Java and require in Ruby. Using them effectively

and idiomatically can be a tripping point for some new to Clojure.

A good default is to always use require, generally with an alias for each namespace:

(require '(clojure [string :as str]
                   [set :as set]))



This is roughly equivalent to import sys, os in Python. Because namespaces generally have multiple segments (compared to the single-token module names common in Python), Clojure does not provide a default alias for required namespaces, but it does allow you to control the alias that is used. Of course, if the namespace in question is short, or you only use vars from it a few times, then a bare require without any alias is entirely appropriate.

Another commonly recommended pattern is to prefer use in conjunction with a namespace alias and an explicit included list of vars to refer into the current namespace:

(use '[clojure.set :as set :only (intersection)])



Insofar as this form of use provides you with a superset of all of the functionality provided by require and refer, using it means you can consolidate all your namespace

references into a single use form. Even where you might otherwise use aliasing

require forms, the equivalent use form is hardly longer and allows you to add refered

functions to the :only argument very easily.

In any case, it is generally good practice to avoid unconstrained usages of use, that is,

those that do not include an :only option to explicitly name the functions that should

be refered into the current namespace. Doing so makes it clear what parts of other

namespaces your code makes use of, and avoids any name collision warnings that may

crop up as upstream libraries change and add functions that you may have already

declared locally.



import. While Clojure namespaces primarily map symbols to vars, often canonically

defined in multiple other namespaces, they also map symbols to Java classes and interfaces. You can use import to add such mappings to the current namespace.

import expects as arguments the full names of the classes to import, or a sequential collection describing the package and classes to import. Importing a class makes its “short name” available for use within the current namespace:

(Date.)

;= #<CompilerException java.lang.IllegalArgumentException:
;= Unable to resolve classname: Date, compiling:(NO_SOURCE_PATH:1)>

(java.util.Date.)

;= #<Date …>

(import 'java.util.Date 'java.text.SimpleDateFormat)

;= java.text.SimpleDateFormat

(.format (SimpleDateFormat. "MM/dd/yyyy") (Date.))

;= "07/18/2011"



Date is in the java.util package, and so usages of its short name will cause an error

before it is imported into the current namespace.



We can use Java classes and interfaces without any explicit importing at all, but such

usage requires fully qualified classnames, which can be unpleasantly verbose.

You can import classes into the current namespace by providing import with symbols

naming the classes.

After being imported, the classes’ short names can be used to refer to them.




All classes in the java.lang package are always imported into every namespace by default; for example, java.lang.String is available via the String symbol, and does not

need to be imported separately.

When you want to import multiple classes from a single package, you can provide to

import the same kind of package-prefixed collection that require accepts for namespaces with the same prefix:

(import '(java.util Arrays Collections))

;= java.util.Collections

(->> (iterate inc 0)
     (take 5)
     into-array
     Arrays/asList
     Collections/max)

;= 4



It’s a rare case, but be aware that you cannot import two classes with the same short

name into the same namespace:

(import 'java.awt.List 'java.util.List)

;= #<IllegalStateException java.lang.IllegalStateException:
;= List already refers to: class java.awt.List in namespace: user>



The workaround here (as in Java) would be to import the one that you use most frequently within your namespace, and use the other’s fully qualified classname.

While Clojure’s import is conceptually similar to Java’s import statements, there are a couple of important differences.

First, it provides no analogue to the wildcard import used frequently in Java, such as import java.util.*;. If you need to import multiple classes from a single package, you will need to enumerate each of them, most conveniently as part of a package-prefixed list as shown above.

Second, if you need to refer to an inner class (e.g., java.lang.Thread.State, java.util.Map.Entry), you need to use the Java-internal notation for them (e.g., java.lang.Thread$State, java.util.Map$Entry). This applies to any reference to inner classes, not just those provided to import.
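To make the inner-class notation concrete, the following sketch imports the two classes named above and uses their short names. The var names are invented for the example; the only behavior assumed beyond the text is that Clojure map entries implement java.util.Map.Entry and that an unstarted thread reports the NEW state.

```clojure
;; Inner classes are written Outer$Inner, in import forms and everywhere else.
(import '(java.util Map$Entry)
        'java.lang.Thread$State)

;; The entries of a Clojure map implement the java.util.Map.Entry interface.
(def entry? (instance? Map$Entry (first {:a 1})))

;; A thread that has not been started yet is in the NEW state.
(def fresh-thread-state (.getState (Thread.)))
(def is-new? (= Thread$State/NEW fresh-thread-state))
```

After the import, the short names keep the dollar sign: it is Map$Entry, not Entry.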



ns. All of the namespace utility functions we’ve looked at so far in this section should

generally be reserved for use in the REPL. Whenever you are working on code you

would like to reuse outside of a REPL, you should use the ns macro to define your

namespaces.[5]



5. It may be tempting to take a transcript of what you get working within a REPL, paste it all into a .clj file

(complete with bare in-ns, refer, et al. forms), and call it a day. We urge you to fight any such temptation.

As we’ll discuss in the next section, there are some rules of good hygiene when it comes to organizing

Clojure code, and neglecting to fully specify your namespaces by using ns would be running counter to

those guidelines for no benefit.






ns allows you to declaratively specify a namespace’s name along with its top-level documentation and what it needs to have required, refered, used, and imported to load successfully and work properly. It is a very thin wrapper around these functions; thus, this pile of utility function calls:

(in-ns 'examples.ns)
(clojure.core/refer 'clojure.core :exclude '[next replace remove])
(require '(clojure [string :as string]
                   [set :as set])
         '[clojure.java.shell :as sh])
(use '(clojure zip xml))
(import 'java.util.Date
        'java.text.SimpleDateFormat
        '(java.util.concurrent Executors
                               LinkedBlockingQueue))



is equivalent to this ns declaration:

(ns examples.ns
  (:refer-clojure :exclude [next replace remove])
  (:require (clojure [string :as string]
                     [set :as set])
            [clojure.java.shell :as sh])
  (:use (clojure zip xml))
  (:import java.util.Date
           java.text.SimpleDateFormat
           (java.util.concurrent Executors
                                 LinkedBlockingQueue)))



All the semantics for require, refer, and so on remain the same, but since ns is a macro (notice that keywords are being used here, e.g., :use instead of use), the extensive quoting of names is unnecessary.

In the previous examples, we are excluding vars from clojure.core because their names (next, replace, and remove) conflict with same-named

vars defined in clojure.zip, which we use without exclusions a few lines

down. Our use of clojure.zip would override the mappings to the vars

referred from clojure.core (with a warning), but explicitly excluding

them here makes it clear to later maintainers that we’re aware of the

conflict.
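The ns forms above omit the top-level documentation that ns can also carry. A minimal sketch of a documented namespace might look like the following; the namespace sketch.report, its docstring, and the heading function are all hypothetical, and the docstring is assumed to be retrievable from the namespace’s metadata under :doc.

```clojure
(ns sketch.report
  "Hypothetical namespace that formats report headings."
  (:require [clojure.string :as str])
  (:import (java.util Date)))

(defn heading
  "Upper-cases title and appends the epoch milliseconds of date."
  [title ^Date date]
  ;; str the function and str the alias coexist: the alias is only
  ;; consulted for namespace-qualified symbols such as str/upper-case.
  (str (str/upper-case title) " @ " (.getTime date)))

(def example-heading (heading "status" (Date. 0)))  ;= "STATUS @ 0"
```

Note that evaluating an ns form also switches the current namespace, so anything defined after it lands in sketch.report.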



Once defined, namespaces may be inspected and modified at runtime, usually via a

REPL. We talk about the different tools available for working with namespaces at runtime in “The Bare REPL” on page 399.
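As a small taste of that kind of runtime inspection, the sketch below asks the current namespace about its aliases and imports and asks clojure.set about its public vars. It relies on ns-aliases, ns-publics, ns-imports, and ns-name from clojure.core, which this section has not covered; the var names are made up.

```clojure
(require '[clojure.set :as set])
(import 'java.util.Date)

;; Aliases of the current namespace: alias symbol -> namespace object.
(def set-alias-target (ns-name (get (ns-aliases *ns*) 'set)))

;; Public vars that a namespace makes available to refer and use.
(def has-union? (contains? (ns-publics 'clojure.set) 'union))

;; Class mappings established by import: short name -> class.
(def date-class (get (ns-imports *ns*) 'Date))
```

Here set-alias-target should be the symbol clojure.set and date-class the java.util.Date class, mirroring what the require and import calls established.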



Namespaces and files

There are some hard-and-fast rules about how Clojure source files must be organized:[6]

6. Like all rules, most of these can be broken if you have a good reason to do so, but such reasons are rare.






Use one file per namespace. Each namespace should be defined in a separate file, and this file’s location within your project’s Clojure source root must correspond with the namespace’s segments. For example, the code for the com.mycompany.foo namespace should be in a file located at com/mycompany/foo.clj.[7] When that namespace is required or used, e.g., by (require 'com.mycompany.foo), the file at com/mycompany/foo.clj will be loaded, after which the namespace must be defined or an error will result.

Use underscores in filenames when namespaces contain dashes. Very simply,

if your namespace is to be com.my-project.foo, the source code for that namespace

should be in a file located at com/my_project/foo.clj. Only the filename and directories

corresponding to the namespace’s segments are affected—you would continue to refer

to the namespace in Clojure code using its declared name (e.g., (require 'com.my-project.foo), not (require 'com.my_project.foo)). This is necessary because the JVM

does not allow for dashes in class or package names, but it is generally idiomatic to use

dashes instead of underscores when naming Clojure entities, including namespaces,

vars, locals, and so on.
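The two rules above (segments become directories, dashes become underscores) can be sketched as a small function. ns->source-path is a hypothetical helper, not part of Clojure; it leans on clojure.core’s namespace-munge for the dash-to-underscore step and deliberately ignores other file types and AOT-compiled classes that the real loader may also consider.

```clojure
(defn ns->source-path
  "Relative path of the .clj file expected to define the namespace ns-sym."
  [ns-sym]
  ;; namespace-munge turns dashes into underscores; dots then become
  ;; directory separators relative to the source root.
  (str (.replace ^String (namespace-munge ns-sym) \. \/) ".clj"))

(def dashed-path (ns->source-path 'com.my-project.foo))
;= "com/my_project/foo.clj"
(def plain-path (ns->source-path 'com.mycompany.foo))
;= "com/mycompany/foo.clj"
```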

Start every namespace with a comprehensive ns form. The first Clojure form in

every namespace’s “root” (and usually only) file should be a well-tended ns form; bare

usages of namespace-manipulating functions like require and refer are entirely unnecessary outside of a REPL environment. Aside from just being good form, using ns:

1. Encourages the consolidation of what might otherwise be disparate usages of

require, et al.

2. Makes it easy for readers and later maintainers of your code to get an immediate

impression of how a given namespace relates to its dependencies since it is always

positioned at the top of each file.

3. Leaves the door open for refactoring and other code-manipulation tools that need

to modify sets of required namespaces, functions, and imported classes, since ns

is a macro that can accept only unevaluated names for these things.[8] The unrestricted evaluation possible in conjunction with lower-level namespace-modification forms makes such tools infeasible.

Avoid cyclic namespace dependencies. The dependencies among Clojure namespaces within any application must form a directed acyclic graph; meaning, namespace

X cannot require a namespace Y which itself requires namespace X (either directly or

via one of its dependencies). Attempting to do this will result in an error like this:

#<Exception java.lang.Exception:
Cyclic load dependency:

[ /some/namespace/X ]->/some/namespace/Y->[ /some/namespace/X ]>



7. These paths are relative to whatever source root you’re using. We get into the physical layout of Clojure

projects on disk in “Location, Location, Location” on page 332.

8. For example, slamhound, which adjusts which namespaces are required and used and which classes are

imported in an ns form based on the code in a given file: https://github.com/technomancy/slamhound.






Use declare to enable forward references. Clojure loads each form in each namespace’s files sequentially, resolving references to previously defined vars as it goes. This

means that referring to an undefined var will cause an error:

(defn a [x] (+ constant (b x)))

;= #<CompilerException java.lang.RuntimeException:
;= Unable to resolve symbol: constant in this context, compiling:(NO_SOURCE_PATH:1)>



Many languages define compilation units that allow them to find all of the “dangling”

identifiers within a program before resolving references to them; Clojure does not do

this. However, all is not lost if, for the sake of clarity or style, you want to define higher-level functions before the lower-level ones they reference: use declare to intern a var in

the current namespace, define your higher-level function (referring freely to the

declared vars), and then define the vars that you had previously only declared:

(declare constant b)

;= #'user/b

(defn a [x] (+ constant (b x)))

;= #'user/a

(def constant 42)

;= #'user/constant

(defn b [y] (max y constant))

;= #'user/b

(a 100)

;= 142



The one wrinkle to be aware of is that, if you neglect to actually define a previously

declared var, that var will yield an unusable placeholder value when dereferenced at

runtime that will almost surely result in an exception when your higher-level code

attempts to do something with it.
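One way to check for that situation, sketched here under the assumption that clojure.core’s bound? is an acceptable test for "declared but never defined": a var that has only been declared has no root value until a def supplies one. The var name pending-rate and its value are made up.

```clojure
;; declare interns the var but gives it no value.
(declare pending-rate)
(def bound-before? (bound? #'pending-rate))

;; Supplying the definition gives the same var a root value.
(def pending-rate 0.25)
(def bound-after? (bound? #'pending-rate))
```

bound-before? is false and bound-after? is true; a quick check like this at the REPL can confirm that no declared var was left without its definition.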

Avoid single-segment namespaces. Namespaces should have multiple segments;

for example, the com.my-project.foo namespace has three segments. The reason for

this is twofold:

1. If you AOT-compile a single-segment namespace, that process will yield at least

one class file that is in the default package (i.e., is a “bare” class not in a Java

package). This can prevent the namespace from being loaded in some environments, and will always prevent the namespace’s corresponding class from being

usable from Java, due to that language’s restrictions on use of classes in the default

package.

2. Even if you’re absolutely, positively sure you’re never going to want to redistribute

AOT-compiled class files for your single-segment namespace, you still run a higher

risk of namespace clashes than is prudent, no matter how clever you are at naming

things.

Don’t think that we’re recommending that you reach for the heights of absurdity when it comes to namespace segment depth; no one likes names like com.foo.bar.baz.factory.factory.factories.Factory. However, there is some happy middle ground between that and a single-segment, readily clashing namespace like app or util.




Regardless of how you organize your namespaces, they (and all other code and resources your library or application depends upon) will end up being loaded via the classpath.



A classpath primer

For programmers unfamiliar with Java, the classpath can often be a source of confusion.

The classpath is the search path that the JVM will use when looking for user-defined

libraries and resources. This path can include both directories and .zip archives, including .jar files. Clojure being hosted on the JVM, it inherits Java’s classpath system.

The classpath has its own idiosyncrasies, but it is not unique, and has many similarities

to other search path mechanisms you are surely familiar with. For example, shells in

both Unix and Windows environments define a PATH environment variable, which

stores a concatenated set of paths where executables may be found. Ruby and Python

also have search paths: Ruby stores its in the runtime variable $LOAD_PATH,[9] while Python

relies upon the PYTHONPATH environment variable. In all of these cases, the search path

tends to be automatically handled by a combination of system-wide settings and dependency management tools (such as Ruby Gems or Python’s easy_install and pip).

The same autoconfiguration of the classpath is available through Leiningen and Maven,

the tools most often used for managing dependencies in Clojure projects, as well as

most popular Java IDEs and Emacs. For example, once you have defined your dependencies in your project.clj or pom.xml file, starting a REPL through either of these tools

will result in those dependencies being added to the REPL’s classpath automatically.

The same applies if you are using Leiningen or Maven plug-ins that bootstrap full applications, such as when running web applications locally via lein-ring or jetty:run.[10]

However, if you need to start a Java process directly within a shell, you need to construct

the classpath manually. Even if you never use Clojure from the command line, knowing

how the classpath is defined in the most fundamental way will help you understand

what more advanced tools are doing for you.

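Defining the classpath. Whenever you specify a classpath, it contains only the entries you list; the current working directory (.) is included only if you add it yourself. (The JVM falls back to . solely when no classpath is specified at all, which is of little use here, since clojure.jar would then be missing.) This is an inconvenient difference compared to search path mechanisms that add the current working directory implicitly, so that libraries rooted there will be found at runtime.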

To set the classpath for a Java process, specify it on the command line with the -cp flag.

For example, to include the current working directory, the src directory, the clojure.jar archive file, and all .jar files in the lib directory, on Unix-like systems we’d do

this:

java -cp '.:src:clojure.jar:lib/*' clojure.main



9. Also known as $:.

10. See “Running Web Apps Locally” on page 565 for details.






As with all other search path mechanisms, the classpath is defined in a

platform-dependent manner due to differences in filename conventions

on different systems. On Unix-like systems, the classpath is

a :-delimited list of /-defined paths; on Windows, it’s a ;-delimited list

of \-defined paths. So, our example classpath above for Unix-like systems would translate to this one on Windows:

'.;src;clojure.jar;lib\*'



Classpath and the REPL. The classpath can be inspected from Clojure at runtime:



$ java -cp clojure.jar clojure.main

Clojure 1.3.0

(System/getProperty "java.class.path")

;= "clojure.jar"



The primary classpath (held by the java.class.path system property) is defined when

the JVM process starts via command-line parameter or environment variable, but it

unfortunately cannot be changed at runtime. This is at odds with Clojure’s normal

development cycle, which tends to involve opening a persistent REPL session and leaving it open. Changes to the classpath require a JVM restart, and therefore a REPL

restart.[11]
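Since java.class.path is a single delimited string, it can be handy to split it into individual entries when checking what a REPL was started with. The following is a sketch only: classpath-entries is a hypothetical helper, and it uses File/pathSeparator so that the same code copes with both the : and ; delimiters described above.

```clojure
(require '[clojure.string :as str])
(import '(java.io File)
        '(java.util.regex Pattern))

(defn classpath-entries
  "Splits the java.class.path system property into a vector of entries."
  []
  (str/split (System/getProperty "java.class.path")
             ;; Quote the separator so it is matched literally by the regex.
             (re-pattern (Pattern/quote File/pathSeparator))))
```

Calling (classpath-entries) in the session shown earlier should yield a one-element vector containing "clojure.jar".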



Location, Location, Location

There are two predominant project layout conventions used in Clojure projects, the

defaults for which are defined by the predominant build tools used by Clojure

projects.[12]

First, there’s the “Maven style,” which puts all source files under a top-level src directory

but separates source files into separate subdirectories based on language and role within

a project. Primary source code that defines public APIs or shipped features goes in src/

main; code that defines unit and functional tests that isn’t generally distributed goes in

src/test, and so on:



11. There are ways to get around this. Clojure itself provides an add-classpath function, though it is

deprecated and generally not recommended. Another is pomegranate (https://github.com/cemerick/

pomegranate), which provides a maintained replacement for add-classpath that provides a way to

add .jar files and transitive Leiningen/Maven dependencies to a Clojure runtime. Finally, all sorts of JVM

module systems, including OSGi, the NetBeans module system, and JVM application servers of all stripes

provide easy ways to augment or redefine the classpath within applications or individual modules. All of

these mechanisms use facilities built in to the JVM (such as managed ClassLoader hierarchies) in order

to enable such capabilities.

12. All (decent) build tools (including Leiningen and Maven) allow you to put source files wherever you want.

These layouts are just the defaults, although it’s hard to imagine a case where it’d be worth the trouble

to not use those defaults.


