dasomoli/Subversion-Server Configuration

[edit]

Chapter 6. Server Configuration ¶

A Subversion repository can be accessed simultaneously by clients running on the same machine on which the repository resides using the file:/// method. But the typical Subversion setup involves a single server machine being accessed from clients on computers all over the office—or, perhaps, all over the world.

This section describes how to get your Subversion repository exposed outside its host machine for use by remote clients. We will cover Subversion's currently available server mechanisms, discussing the configuration and use of each. After reading this section, you should be able to decide which networking setup is right for your needs, and understand how to enable such a setup on your host computer.

[edit]

1.1. Overview ¶

Subversion was designed with an abstract network layer. This means that a repository can be programmatically accessed by any sort of server process, and the client “repository access” API allows programmers to write plugins that speak relevant network protocols. In theory, Subversion can sport an infinite number of network implementations. In practice, there are only two servers at the time of writing.

Apache is an extremely popular webserver; using the mod_dav_svn module, Apache can access a repository and make it available to clients via the WebDAV/DeltaV protocol, which is an extension of HTTP. In the other corner is svnserve: a small, standalone server program that speaks a custom protocol with clients. Table 6-1 presents a comparison of the two servers.

Note that Subversion, as an open-source project, does not officially endorse any server as “primary” or “official”. Neither network implementation is treated as a second-class citizen; each server has distinct advantages and disadvantages. In fact, it's possible for different servers to run in parallel, each accessing your repositories in its own way, and each without hindering the other (see the section called “Supporting Multiple Repository Access Methods”). Here's a brief overview and comparison of the two available Subversion servers—as an administrator, it's up to you to choose whatever works best for you and your users.

Table 6.1. Network Server Comparison

Feature	Apache + mod_dav_svn	svnserve
Authentication options	HTTP(S) basic auth, X.509 certificates, LDAP, NTLM, or any other mechanism available to Apache httpd	CRAM-MD5 or SSH
User account options	private 'users' file	private 'users' file, or existing system (SSH) accounts
Authorization options	blanket read/write access, or per-directory access control	blanket read/write access
Encryption	via optional SSL	via optional SSH tunnel
Interoperability	partially usable by other WebDAV clients	not interoperable
Web viewing	limited built-in support, or via 3rd-party tools such as ViewCVS	via 3rd-party tools such as ViewCVS
Speed	somewhat slower	somewhat faster
Initial setup	somewhat complex	fairly simple

[edit]

1.2. Network Model ¶

This section is a general discussion of how a Subversion client and server interact with one another, regardless of the network implementation you're using. After reading, you'll have a good understanding of how a server can behave and the different ways in which a client can be configured to respond.

[edit]

1.2.1. Requests and Responses ¶

The Subversion client spends most of its time managing working copies. When it needs information from a repository, however, it makes a network request, and the server responds with an appropriate answer. The details of the network protocol are hidden from the user; the client attempts to access a URL, and depending on the URL schema, a particular protocol is used to contact the server (see Repository URLs). Users can run svn --version to see which URL schemas and protocols the client knows how to use.

When the server process receives a client request, it typically demands that the client identify itself. It issues an authentication challenge to the client, and the client responds by providing credentials back to the server. Once authentication is complete, the server responds with the original information the client asked for. Notice that this system is different from systems like CVS, where the client pre-emptively offers credentials (“logs in”) to the server before ever making a request. In Subversion, the server “pulls” credentials by challenging the client at the appropriate moment, rather than the client “pushing” them. This makes certain operations more elegant. For example, if a server is configured to allow anyone in the world to read a repository, then the server will never issue an authentication challenge when a client attempts to svn checkout.

If the client's network request writes new data to the repository (e.g. svn commit), then a new revision tree is created. If the client's request was authenticated, then the authenticated user's name is stored as the value of the svn:author property on the new revision (see the section called “Unversioned Properties”). If the client was not authenticated (in other words, the server never issued an authentication challenge), then the revision's svn:author property is empty. 19

[edit]

1.2.2. Client Credentials Caching ¶

Many servers are configured to require authentication on every request. This can become a big annoyance to users, who are forced to type their passwords over and over again.

Happily, the Subversion client has a remedy for this: a built-in system for caching authentication credentials on disk. By default, whenever the commandline client successfully authenticates itself to a server, it saves the credentials in the user's private runtime configuration area—in ~/.subversion/auth/ on Unix-like systems or %APPDATA%/Subversion/auth/ on Windows. (The runtime area is covered in more detail in the section called “Runtime Configuration Area”.) Successful credentials are cached on disk, keyed on a combination of hostname, port, and authentication realm.

When the client receives an authentication challenge, it first looks for the appropriate credentials in the disk cache; if not present, or if the cached credentials fail to authenticate, then the client simply prompts the user for the information.

The security-paranoid people may be thinking to themselves, “Caching passwords on disk? That's terrible! You should never do that!” But please remain calm. First, the auth/ caching area is permission-protected so that only the user (owner) can read data from it, not the world at large. If that's still not safe enough for you, you can disable credential caching. To disable caching for a single command, pass the --no-auth-cache option:

$ svn commit -F log_msg.txt --no-auth-cache Authentication realm: <svn://host.example.com:3690> example realm Username: joe Password for 'joe':

Adding newfile Transmitting file data . Committed revision 2324.

# password was not cached, so a second commit still prompts us

$ svn rm newfile $ svn commit -F new_msg.txt Authentication realm: <svn://host.example.com:3690> example realm Username: joe ...

Or, if you want to disable credential caching permanently, you can edit your runtime config file (located next to the auth/ directory). Simply set store-auth-creds to no, and no credentials will be cached on disk, ever.

auth store-auth-creds = no

Sometimes users will want to remove specific credentials from the disk cache. To do this, you need to navigate into the auth/ area and manually delete the appropriate cache file. Credentials are cached in individual files; if you look inside each file, you will see keys and values. The svn:realmstring key describes the particular server realm that the file is associated with:

$ ls ~/.subversion/auth/svn.simple/ 5671adf2865e267db74f09ba6f872c28 3893ed123b39500bca8a0b382839198e 5c3c22968347b390f349ff340196ed39

$ cat ~/.subversion/auth/svn.simple/5671adf2865e267db74f09ba6f872c28

K 8 username V 3 joe K 8 password V 4 blah K 15 svn:realmstring V 45 <https://svn.domain.com:443> Joe's repository END

Once you have located the proper cache file, just delete it.

One last word about client authentication behavior: a bit of explanation about the --username and --password options is needed. Many client subcommands accept these options; however it is important to understand using these options does not automatically send credentials to the server. As discussed earlier, the server “pulls” credentials from the client when it deems necessary; the client cannot “push” them at will. If a username and/or password are passed as options, they will only be presented to the server if the server requests them. 20 Typically, these options are used when:

the user wants to authenticate as a different user than her system login name, or
a script wants to authenticate without using cached credentials.

Here is a final summary that describes how a Subversion client behaves when it receives an authentication challenge:

Check whether the user specified any credentials as commandline options, via --username and/or --password. If not, or if these options fail to authenticate successfully, then
Look up the server's realm in the runtime auth/ area, to see if the user already has the appropriate credentials cached. If not, or if the cached credentials fail to authenticate, then
Resort to prompting the user.

If the client successfully authenticates by any of the methods listed above, it will attempt to cache the credentials on disk (unless the user has disabled this behavior, as mentioned earlier.)

[edit]

1.3. svnserve, a custom server ¶

The svnserve program is a lightweight server, capable of speaking to clients over TCP/IP using a custom, stateful protocol. Clients contact an svnserve server by using URLs that begin with the svn:// or svn+ssh:// schema. This section will explain the different ways of running svnserve, how clients authenticate themselves to the server, and how to configure appropriate access control to your repositories.

[edit]

1.3.1. Invoking the Server ¶

There a few different ways to invoke the svnserve program. If invoked with no options, you'll see nothing but a help message. However, if you're planning to have inetd launch the process, then you can pass the -i (--inetd) option:

$ svnserve -i ( success ( 1 2 ( ANONYMOUS ) ( edit-pipeline ) ) )

When invoked with the --inetd option, svnserve attempts to speak with a Subversion client via stdin and stdout using a custom protocol. This is the standard behavior for a program being run via inetd. The IANA has reserved port 3690 for the Subversion protocol, so on a Unix-like system you can add lines to /etc/services like these (if they don't already exist):

svn 3690/tcp # Subversion svn 3690/udp # Subversion

And if your system is using a classic Unix-like inetd daemon, you can add this line to /etc/inetd.conf:

svn stream tcp nowait svnowner /usr/local/bin/svnserve svnserve -i

Make sure “svnowner” is a user which has appropriate permissions to access your repositories. Now, when a client connection comes into your server on port 3690, inetd will spawn an svnserve process to service it.

A second option is to run svnserve as a standalone “daemon” process. Use the -d option for this:

$ svnserve -d $ # svnserve is now running, listening on port 3690

When running svnserve in daemon mode, you can use the --listen-port= and --listen-host= options to customize the exact port and hostname to “bind” to.

There's still a third way to invoke svnserve, and that's in “tunnel mode”, with the -t option. This mode assumes that a remote-service program such as RSH or SSH has successfully authenticated a user and is now invoking a private svnserve process as that user. The svnserve program behaves normally (communicating via stdin and stdout), and assumes that the traffic is being automatically redirected over some sort of tunnel back to the client. When svnserve is invoked by a tunnel agent like this, be sure that the authenticated user has full read and write access to the repository database files. (See Servers and Permissions: A Word of Warning.) It's essentially the same as a local user accessing the repository via file:/// URLs.

Servers and Permissions: A Word of Warning

First, remember that a Subversion repository is a collection of BerkeleyDB database files; any process which accesses the repository directly needs to have proper read and write permissions on the entire repository. If you're not careful, this can lead to a number of headaches. Be sure to read the section called “Supporting Multiple Repository Access Methods”.

Secondly, when configuring svnserve, Apache httpd, or any other server process, keep in mind that you might not want to launch the server process as the user root (or as any other user with unlimited permissions). Depending on the ownership and permissions of the repositories you're exporting, it's often prudent to use a different—perhaps custom—user. For example, many administrators create a new user named svn, grant that user exclusive ownership and rights to the exported Subversion repositories, and only run their server processes as that user.

Once the svnserve program is running, it makes every repository on your system available to the network. A client needs to specify an absolute path in the repository URL. For example, if a repository is located at /usr/local/repositories/project1, then a client would reach it via svn://host.example.com/usr/local/repositories/project1 . To increase security, you can pass the -r option to svnserve, which restricts it to exporting only repositories below that path:

$ svnserve -d -r /usr/local/repositories …

Using the -r option effectively modifies the location that the program treats as the root of the remote filesystem space. Clients then use URLs that have that path portion removed from them, leaving much shorter (and much less revealing) URLs:

$ svn checkout svn://host.example.com/project1 …

[edit]

1.3.2. Built-in authentication and authorization ¶

When a client connects to an svnserve process, the following things happen:

The client selects a specific repository.
The server processes the repository's conf/svnserve.conf file, and begins to enforce any authentication and authorization policies defined therein.
Depending on the situation and authorization policies,
- the client may be allowed to make requests anonymously, without ever receiving an authentication challenge, OR
- the client may be challenged for authentication at any time, OR
- if operating in “tunnel mode”, the client will declare itself to be already externally authenticated.

At the time of writing, the server only knows how to send a CRAM-MD5 21 authentication challenge. In essence, the server sends a bit of data to the client. The client uses the MD5 hash algorithm to create a fingerprint of the data and password combined, then sends the fingerprint as a response. The server performs the same computation with the stored password to verify that the result is identical. At no point does the actual password travel over the network.

It's also possible, of course, for the client to be externally authenticated via a tunnel agent, such as SSH. In that case, the server simply examines the user it's running as, and uses it as the authenticated username.

As you've already guessed, a repository's svnserve.conf file is the central mechanism for controlling authentication and authorization policies. The file has the same format as other configuration files (see the section called “Runtime Configuration Area”): section names are marked by square brackets ( and ), comments begin with hashes (#), and each section contains specific variables that can be set (variable = value). Let's walk through this file and learn how to use them.

[edit]

1.3.2.1. Create a 'users' file and realm ¶

For now, the general section of the svnserve.conf has all the variables you need. Begin by defining a file which contains usernames and passwords, and an authentication realm:

general password-db = userfile realm = example realm

The realm is a name that you define. It tells clients which sort of “authentication namespace” they're connecting to; the Subversion client displays it in the authentication prompt, and uses it as a key (along with the server's hostname and port) for caching credentials on disk (see the section called “Client Credentials Caching”.) The password-db variable points to a separate file that contains a list of usernames and passwords, using the same familiar format. For example:

users harry = foopassword sally = barpassword

The value of password-db can be an absolute or relative path to the users file. For many admins, it's easy to keep the file right in the conf/ area of the repository, alongside svnserve.conf. On the other hand, it's possible you may want to have two or more repositories share the same users file; in that case, the file should probably live in a more public place. The repositories sharing the users file should also be configured to have the same realm, since the list of users essentially defines an authentication realm. Wherever the file lives, be sure to set the file's read and write permissions appropriately. If you know which user(s) svnserve will run as, restrict read access to the user file as necessary.

[edit]

1.3.2.2. Set access controls ¶

There are two more variables to set in the svnserve.conf file: they determine what unauthenticated (anonymous) and authenticated users are allowed to do. The variables anon-access and auth-access can be set to the values none, read, or write. Setting the value to none restricts all access of any kind; read allows read-only access to the repository, and write allows complete read/write access to the repository. For example:

general password-db = userfile realm = example realm

# anonymous users can only read the repository anon-access = read

# authenticated users can both read and write auth-access = write

The example settings are, in fact, the default values of the variables, should you forget to define them. If you want to be even more conservative, you can block anonymous access completely:

general password-db = userfile realm = example realm

# anonymous users aren't allowed anon-access = none

# authenticated users can both read and write auth-access = write

Notice that svnserve only understands “blanket” access control. A user either has universal read/write access, universal read access, or no access. There is no detailed control over access to specific paths within the repository. For many projects and sites, this level of access control is more than adequate. However, if you need per-directory access control, you'll need to use Apache instead of svnserve as your server process.

[edit]

1.3.3. SSH authentication and authorization ¶

svnserve's built-in authentication can be very handy, because it avoids the need to create real system accounts. On the other hand, some administrators already have well-established SSH authentication frameworks in place. In these situations, all of the project's users already have system accounts and the ability to “SSH into” the server machine.

It's easy to use SSH in conjunction with svnserve. The client simply uses the svn+ssh:// URL schema to connect:

$ whoami harry

$ svn list svn+ssh://host.example.com/repos/project harry@host.example.com's password: *****

foo bar baz …

What's happening here is that the Subversion client is invoking a local ssh process, connecting to host.example.com, authenticating as the user harry, then spawning a private svnserve process on the remote machine, running as the user harry. The svnserve command is being invoked in tunnel mode (-t) and all network protocol is being “tunneled” over the encrypted connection by ssh, the tunnel-agent. svnserve is aware that it's running as the user harry, and if the client performs a commit, the authenticated username will be attributed as the author of the new revision.

When running over a tunnel, authorization is primarily controlled by operating system permissions to the repository's database files; it's very much the same as if Harry were accessing the repository directly via a file:/// URL. If multiple system users are going to be accessing the repository directly, you may want to place them into a common group, and you'll need to be careful about umasks. (Be sure to read the section called “Supporting Multiple Repository Access Methods”.) But even in the case of tunneling, the svnserve.conf file can still be used to block access, by simply setting auth-access = read or auth-access = none.

You'd think that the story of SSH tunneling would end here, but it doesn't. Subversion allows you to create custom tunnel behaviors in your run-time config file (see the section called “Runtime Configuration Area”.) For example, suppose you want to use RSH instead of SSH. In the tunnels section of your config file, simply define it like this:

tunnels rsh = rsh

And now, you can use this new tunnel definition by using a URL schema that matches the name of your new variable: svn+rsh://host/path. When using the new URL schema, the Subversion client will actually be running the command rsh host svnserve -t behind the scenes. If you include a username in the URL (for example, svn+rsh://username@host/path) the client will also include that in its command (rsh username@host svnserve -t.) But you can define new tunneling schemes to be much more clever than that:

tunnels joessh = $JOESSH /opt/alternate/ssh -p 29934

This example demonstrates a couple of things. First, it shows how to make the Subversion client launch a very specific tunneling binary (the one located at /opt/alternate/ssh) with specific options. In this case, accessing a svn+joessh:// URL would invoke the particular SSH binary with -p 29934 as arguments—useful if you want the tunnel program to connect to a non-standard port.

Second, it shows how to define a custom environment variable that can override the name of the tunneling program. Setting the SVN_SSH environment variable is a convenient way to override the default SSH tunnel agent. But if you need to have several different overrides for different servers, each perhaps contacting a different port or passing a different set of options, you can use the mechanism demonstrated in this example. Now if we were to set the JOESSH environment variable, its value would override the entire value of the tunnel variable—$JOESSH would be executed instead of /opt/alternate/ssh -p 29934.

A final note: when using svn+ssh:// URLs to access a repository, remember that it's the ssh program prompting you for authentication, and not the svn program. That means there's no automatic password caching going on (see the section called “Client Credentials Caching”). If you want to prevent ssh from repeatedly asking for your password, you'll need to use a separate memory-caching tool like ssh-agent on a Unix-like system, or pageant on Windows.

[edit]

1.4. httpd, the Apache HTTP server ¶

The Apache HTTP Server is a “heavy duty” network server that Subversion can leverage. Via a custom module, httpd makes Subversion repositories available to clients via the WebDAV/DeltaV protocol, which is an extension to HTTP 1.1 (see http://www.webdav.org/ for more information.) This protocol takes the ubiquitous HTTP protocol that is the core of the World Wide Web, and adds writing—specifically, versioned writing—capabilities. The result is a standardized, robust system that is conveniently packaged as part of the Apache 2.0 software, is supported by numerous operating systems and third-party products, and doesn't require network administrators to open up yet another custom port. 22 While an Apache-Subversion server has more features than svnserve, it's also a bit more difficult to set up. With flexibility often comes more complexity.

Much of the following discussion includes references to Apache configuration directives. While some examples are given of the use of these directives, describing them in full is outside the scope of this chapter. The Apache team maintains excellent documentation, publicly available on their website at http://httpd.apache.org. For example, a general reference for the configuration directives is located at http://httpd.apache.org/docs-2.0/mod/directives.html.

Also, as you make changes to your Apache setup, it is likely that somewhere along the way a mistake will be made. If you are not already familiar with Apache's logging subsystem, you should become aware of it. In your httpd.conf file are directives that specify the on-disk locations of the access and error logs generated by Apache (the CustomLog and ErrorLog directives, respectively). Subversion's mod_dav_svn uses Apache's error logging interface as well. You can always browse the contents of those files for information that might reveal the source of a problem that is not clearly noticeable otherwise.

Why Apache 2?

If you're a system administrator, it's very likely that you're already running the Apache web server and have some prior experience with it. At the time of writing, Apache 1.3 is by far the most popular version of Apache. The world has been somewhat slow to upgrade to the Apache 2.X series for various reasons: some people fear change, especially changing something as critical as a web server. Other people depend on plug-in modules that only work against the Apache 1.3 API, and are waiting for a 2.X port. Whatever the reason, many people begin to worry when they first discover that Subversion's Apache module is written specifically for the Apache 2 API.

The proper response to this problem is: don't worry about it. It's easy to run Apache 1.3 and Apache 2 side-by-side; simply install them to separate places, and use Apache 2 as a dedicated Subversion server that runs on a port other than 80. Clients can access the repository by placing the port number into the URL:

$ svn checkout http://host.example.com:7382/repos/project …

[edit]

1.4.1. Prerequisites ¶

To network your repository over HTTP, you basically need four components, available in two packages. You'll need Apache httpd 2.0, the mod_dav DAV module that comes with it, Subversion, and the mod_dav_svn filesystem provider module distributed with Subversion. Once you have all of those components, the process of networking your repository is as simple as:

getting httpd 2.0 up and running with the mod_dav module,
installing the mod_dav_svn plugin to mod_dav, which uses Subversion's libraries to access the repository, and
configuring your httpd.conf file to export (or expose) the repository.

You can accomplish the first two items either by compiling httpd and Subversion from source code, or by installing pre-built binary packages of them on your system. For the most up-to-date information on how to compile Subversion for use with the Apache HTTP Server, as well as how to compile and configure Apache itself for this purpose, see the INSTALL file in the top level of the Subversion source code tree.

[edit]

1.4.2. Basic Apache Configuration ¶

Once you have all the necessary components installed on your system, all that remains is the configuration of Apache via its httpd.conf file. Instruct Apache to load the mod_dav_svn module using the LoadModule directive. This directive must precede any other Subversion-related configuration items. If your Apache was installed using the default layout, your mod_dav_svn module should have been installed in the modules subdirectory of the Apache install location (often /usr/local/apache2). The LoadModule directive has a simple syntax, mapping a named module to the location of a shared library on disk:

LoadModule dav_svn_module modules/mod_dav_svn.so

Note that if mod_dav was compiled as a shared object (instead of statically linked directly to the httpd binary), you'll need a similar LoadModule statement for it, too. Be sure that it comes before the mod_dav_svn line:

LoadModule dav_module modules/mod_dav.so LoadModule dav_svn_module modules/mod_dav_svn.so

At a later location in your configuration file, you now need to tell Apache where you keep your Subversion repository (or repositories). The Location directive has an XML-like notation, starting with an opening tag, and ending with a closing tag, with various other configuration directives in the middle. The purpose of the Location directive is to instruct Apache to do something special when handling requests that are directed at a given URL or one of its children. In the case of Subversion, you want Apache to simply hand off support for URLs that point at versioned resources to the DAV layer. You can instruct Apache to delegate the handling of all URLs whose path portions (the part of the URL that follows the server's name and the optional port number) begin with /repos/ to a DAV provider whose repository is located at /absolute/path/to/repository using the following httpd.conf syntax:

DAV svn SVNPath /absolute/path/to/repository

</Location>

If you plan to support multiple Subversion repositories that will reside in the same parent directory on your local disk, you can use an alternative directive, the SVNParentPath directive, to indicate that common parent directory. For example, if you know you will be creating multiple Subversion repositories in a directory /usr/local/svn that would be accessed via URLs like http://my.server.com/svn/repos1, http://my.server.com/svn/repos2, and so on, you could use the httpd.conf configuration syntax in the following example:

DAV svn

# any "/svn/foo" URL will map to a repository /usr/local/svn/foo SVNParentPath /usr/local/svn

</Location>

Using the previous syntax, Apache will delegate the handling of all URLs whose path portions begin with /svn/ to the Subversion DAV provider, which will then assume that any items in the directory specified by the SVNParentPath directive are actually Subversion repositories. This is a particularly convenient syntax in that, unlike the use of the SVNPath directive, you don't have to restart Apache in order to create and network new repositories.

Be sure that when you define your new Location, it doesn't overlap with other exported Locations. For example, if your main DocumentRoot is /www, do not export a Subversion repository in <Location /www/repos>. If a request comes in for the URI /www/repos/foo.c, Apache won't know whether to look for a file repos/foo.c in the DocumentRoot, or whether to delegate mod_dav_svn to return foo.c from the Subversion repository.

Server Names and the COPY Request

Subversion makes use of the COPY request type to perform server-side copies of files and directories. As part of the sanity checking done by the Apache modules, the source of the copy is expected to be located on the same machine as the destination of the copy. To satisfy this requirement, you might need to tell mod_dav the name you use as the hostname of your server. Generally, you can use the ServerName directive in httpd.conf to accomplish this.

ServerName svn.example.com

If you are using Apache's virtual hosting support via the NameVirtualHost directive, you may need to use the ServerAlias directive to specify additional names that your server is known by. Again, refer to the Apache documentation for full details.

At this stage, you should strongly consider the question of permissions. If you've been running Apache for some time now as your regular web server, you probably already have a collection of content—web pages, scripts and such. These items have already been configured with a set of permissions that allows them to work with Apache, or more appropriately, that allows Apache to work with those files. Apache, when used as a Subversion server, will also need the correct permissions to read and write to your Subversion repository. (See Servers and Permissions: A Word of Warning.)

You will need to determine a permission system setup that satisfies Subversion's requirements without messing up any previously existing web page or script installations. This might mean changing the permissions on your Subversion repository to match those in use by other things that Apache serves for you, or it could mean using the User and Group directives in httpd.conf to specify that Apache should run as the user and group that owns your Subversion repository. There is no single correct way to set up your permissions, and each administrator will have different reasons for doing things a certain way. Just be aware that permission-related problems are perhaps the most common oversight when configuring a Subversion repository for use with Apache.

[edit]

1.4.3. Authentication Options ¶

At this point, if you configured httpd.conf to contain something like

DAV svn SVNParentPath /usr/local/svn

</Location>

...then your repository is “anonymously” accessible to the world. Until you configure some authentication and authorization policies, the Subversion repositories you make available via the Location directive will be generally accessible to everyone. In other words,

anyone can use their Subversion client to checkout a working copy of a repository URL (or any of its subdirectories),
anyone can interactively browse the repository's latest revision simply by pointing their web browser to the repository URL, and
anyone can commit to the repository.

[edit]

1.4.3.1. Basic HTTP Authentication ¶

The easiest way to authenticate a client is via the HTTP Basic authentication mechanism, which simply uses a username and password to verify that a user is who she says she is. Apache provides an htpasswd utility for managing the list of acceptable usernames and passwords, those to whom you wish to grant special access to your Subversion repository. Let's grant commit access to Sally and Harry. First, we need to add them to the password file.

$ ### First time: use -c to create the file $ ### Use -m to use MD5 encryption of the password, which is more secure $ htpasswd -cm /etc/svn-auth-file harry New password: ***** Re-type new password: ***** Adding password for user harry $ htpasswd /etc/svn-auth-file -m sally New password: ******* Re-type new password: ******* Adding password for user sally $

Next, you need to add some more httpd.conf directives inside your Location block to tell Apache what to do with your new password file. The AuthType directive specifies the type of authentication system to use. In this case, we want to specify the Basic authentication system. AuthName is an arbitrary name that you give for the authentication domain. Most browsers will display this name in the pop-up dialog box when the browser is querying the user for his name and password. Finally, use the AuthUserFile directive to specify the location of the password file you created using htpasswd.

After adding these three directives, your <Location> block should look something like this:

DAV svn SVNParentPath /usr/local/svn AuthType Basic AuthName "Subversion repository" AuthUserFile /etc/svn-auth-file

</Location>

This <Location> block is not yet complete, and will not do anything useful. It's merely telling Apache that whenever authorization is required, Apache should harvest a username and password from the Subversion client. What's missing here, however, are directives that tell Apache which sorts of client requests require authorization. Wherever authorization is required, Apache will demand authentication as well. The simplest thing to do is protect all requests. Adding Require valid-user tells Apache that all requests require an authenticated user:

DAV svn SVNParentPath /usr/local/svn AuthType Basic AuthName "Subversion repository" AuthUserFile /etc/svn-auth-file Require valid-user

</Location>

Be sure to read the next section (the section called “Authorization Options”) for more detail on the Require directive and other ways to set authorization policies.

One word of warning: HTTP Basic Auth passwords pass in very nearly plain-text over the network, and thus are extremely insecure. If you're worried about password snooping, it may be best to use some sort of SSL encryption, so that clients authenticate via https:// instead of http://; at a bare minimum, you can configure Apache to use a self-signed server certificate. 23 Consult Apache's documentation (and OpenSSL documentation) about how to do that.

[edit]

1.4.3.2. SSL Certificate Management ¶

Businesses that need to expose their repositories for access outside the company firewall should be conscious of the possibility that unauthorized parties could be “sniffing” their network traffic. SSL makes that kind of unwanted attention less likely to result in sensitive data leaks.

If a Subversion client is compiled to use OpenSSL, then it gains the ability to speak to an Apache server via https:// URLs. The Neon library used by the Subversion client is not only able to verify server certificates, but can also supply client certificates when challenged. When the client and server have exchanged SSL certificates and successfully authenticated one another, all further communication is encrypted via a session key.

It's beyond the scope of this book to describe how to generate client and server certificates, and how to configure Apache to use them. Many other books, including Apache's own documentation, describe this task. But what can be covered here is how to manage server and client certificates from an ordinary Subversion client.

When speaking to Apache via https://, a Subversion client can receive two different types of information:

a server certificate
a demand for a client certificate

If the client receives a server certificate, it needs to verify that it trusts the certificate: is the server really who it claims to be? The OpenSSL library does this by examining the signer of the server certificate, or certifying authority (CA). If OpenSSL is unable to automatically trust the CA, or if some other problem occurs (such as an expired certificate or hostname mismatch), the Subversion commandline client will ask you whether you want to trust the server certificate anyway:

$ svn list https://host.example.com/repos/project

Error validating server certificate for 'https://home.example.com:443':

- The certificate is not issued by a trusted authority. Use the

fingerprint to validate the certificate manually!

Certificate information:

- Hostname: host.example.com - Valid: from Jan 30 19:23:56 2004 GMT until Jan 30 19:23:56 2006 GMT - Issuer: CA, example.com, Sometown, California, US - Fingerprint: 7d:e1:a9:34:33:39:ba:6a:e9:a5:c4:22:98:7b:76:5c:92:a0:9c:7b

(R)eject, accept (t)emporarily or accept (p)ermanently?

This dialogue should look familiar; it's essentially the same question you've probably seen coming from your web browser (which is just another HTTP client like Subversion!). If you choose the (p)ermanent option, the server certificate will be cached in your private run-time auth/ area in just the same way your username and password are cached (see the section called “Client Credentials Caching”.) If cached, Subversion will automatically remember to trust this certificate in future negotiations.

Your run-time servers file also gives you the ability to make your Subversion client automatically trust specific CAs, either globally or on a per-host basis. Simply set the ssl-authority-files variable to a semicolon-separated list of PEM-encoded CA certificates:

global ssl-authority-files = /path/to/CAcert1.pem;/path/to/CAcert2.pem

Many OpenSSL installations also have a pre-defined set of “default” CAs that are nearly universally trusted. To make the Subversion client automatically trust these standard authorities, set the ssl-trust-default-ca variable to true.

When talking to Apache, a Subversion client might also receive a challenge for a client certificate. Apache is asking the client to identify itself: is the client really who it says it is? If all goes correctly, the Subversion client sends back a private certificate signed by a CA that Apache trusts. A client certificate is usually stored on disk in encrypted format, protected by a local password. When Subversion receives this challenge, it will ask you for both a path to the certificate and the password which protects it:

$ svn list https://host.example.com/repos/project

Authentication realm: https://host.example.com:443 Client certificate filename: /path/to/my/cert.p12 Passphrase for '/path/to/my/cert.p12': ******** …

Notice that the client certificate is a “p12” file. To use a client certificate with Subversion, it must be in PKCS#12 format, which is a portable standard. Most web browsers are already able to import and export certificates in that format. Another option is to use the OpenSSL commandline tools to convert existing certificates into PKCS#12.

Again, the runtime servers file allows you to automate this challenge on a per-host basis. Either or both pieces of information can be described in runtime variables:

groups examplehost = host.example.com

examplehost ssl-client-cert-file = /path/to/my/cert.p12 ssl-client-cert-password = somepassword

Once you've set the ssl-client-cert-file and ssl-client-cert-password variables, the Subversion client can automatically respond to a client certificate challenge without prompting you. 24

[edit]

1.4.4. Authorization Options ¶

At this point, you've configured authentication, but not authorization. Apache is able to challenge clients and confirm identities, but it has not been told how to allow or restrict access to the clients bearing those identities. This section describes two strategies for controlling access to your repositories.

[edit]

1.4.4.1. Blanket Access Control ¶

The simplest form of access control is to authorize certain users for either read-only access to a repository, or read/write access to a repository.

You can restrict access on all repository operations by adding the Require valid-user directive to your <Location> block. Using our previous example, this would mean that only clients that claimed to be either harry or sally, and provided the correct password for their respective username, would be allowed to do anything with the Subversion repository:

DAV svn SVNParentPath /usr/local/svn

# how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file
# only authenticated users may access the repository Require valid-user

</Location>

Sometimes you don't need to run such a tight ship. For example, Subversion's own source code repository at http://svn.collab.net/repos/svn allows anyone in the world to perform read-only repository tasks (like checking out working copies and browsing the repository with a web browser), but restricts all write operations to authenticated users. To do this type of selective restriction, you can use the Limit and LimitExcept configuration directives. Like the Location directive, these blocks have starting and ending tags, and you would nest them inside your <Location> block.

The parameters present on the Limit and LimitExcept directives are HTTP request types that are affected by that block. For example, if you wanted to disallow all access to your repository except the currently supported read-only operations, you would use the LimitExcept directive, passing the GET, PROPFIND, OPTIONS, and REPORT request type parameters. Then the previously mentioned Require valid-user directive would be placed inside the <LimitExcept> block instead of just inside the <Location> block.

DAV svn SVNParentPath /usr/local/svn

# how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file

# For any operations other than these, require an authenticated user. <LimitExcept GET PROPFIND OPTIONS REPORT>

Require valid-user

</LimitExcept>

</Location>

These are only a few simple examples. For more in-depth information about Apache access control and the Require directive, take a look at the Security section of the Apache documentation's tutorials collection at http://httpd.apache.org/docs-2.0/misc/tutorials.html.

[edit]

1.4.4.2. Per-Directory Access Control ¶

It's possible to set up finer-grained permissions using a second Apache httpd module, mod_authz_svn. This module grabs the various opaque URLs passing from client to server, asks mod_dav_svn to decode them, and then possibly vetoes requests based on access policies defined in a configuration file.

If you've built Subversion from source code, mod_authz_svn is automatically built and installed alongside mod_dav_svn. Many binary distributions install it automatically as well. To verify that it's installed correctly, make sure it comes right after mod_dav_svn's LoadModule directive in httpd.conf:

LoadModule dav_module modules/mod_dav.so LoadModule dav_svn_module modules/mod_dav_svn.so LoadModule authz_svn_module modules/mod_authz_svn.so

To activate this module, you need to configure your Location block to use the AuthzSVNAccessFile directive, which specifies a file containing the permissions policy for paths within your repositories. (In a moment, we'll discuss the format of that file.)

Apache is flexible, so you have the option to configure your block in one of three general patterns. To begin, choose one of these basic configuration patterns. (The examples below are very simple; look at Apache's own documentation for much more detail on Apache authentication and authorization options.)

The simplest block is to allow open access to everyone. In this scenario, Apache never sends authentication challenges, so all users are treated as “anonymous”.

Example 6.1. A sample configuration for anonymous access.

DAV svn SVNParentPath /usr/local/svn

# our access control policy AuthzSVNAccessFile /path/to/access/file

</Location>

On the opposite end of the paranoia scale, you can configure your block to demand authentication from everyone. All clients must supply credentials to identify themselves. Your block unconditionally requires authentication via the Require valid-user directive, and defines a means to authenticate.

Example 6.2. A sample configuration for authenticated access.

DAV svn SVNParentPath /usr/local/svn

# our access control policy AuthzSVNAccessFile /path/to/access/file

# only authenticated users may access the repository Require valid-user

# how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file

</Location>

A third very popular pattern is to allow a combination of authenticated and anonymous access. For example, many administrators want to allow anonymous users to read certain repository directories, but want only authenticated users to read (or write) more sensitive areas. In this setup, all users start out accessing the repository anonymously. If your access control policy demands a real username at any point, Apache will demand authentication from the client. To do this, you use both the Satisfy Any and Require valid-user directives together.

Example 6.3. A sample configuration for mixed authenticated/anonymous access.

DAV svn SVNParentPath /usr/local/svn

# our access control policy AuthzSVNAccessFile /path/to/access/file

# try anonymous access first, resort to real # authentication if necessary. Satisfy Any Require valid-user

# how to authenticate a user AuthType Basic AuthName "Subversion repository" AuthUserFile /path/to/users/file

</Location>

Once your basic Location block is configured, you can create an access file and define some authorization rules in it.

The syntax of the access file is the same familiar one used by svnserve.conf and the runtime configuration files. Lines that start with a hash (#) are ignored. In its simplest form, each section names a repository and path within it, and the authenticated usernames are the option names within each section. The value of each option describes the user's level of access to the repository path: either r (read-only) or rw (read-write). If the user is not mentioned at all, no access is allowed.

To be more specific: the value of the section-names are either of the form repos-name:path or the form path. If you're using the SVNParentPath directive, then it's important to specify the repository names in your sections. If you omit them, then a section like /some/dir will match the path /some/dir in every repository. If you're using the SVNPath directive, however, then it's fine to only define paths in your sections—after all, there's only one repository.

calc:/branches/calc/bug-142 harry = rw sally = r

In this first example, the user harry has full read and write access on the /branches/calc/bug-142 directory in the calc repository, but the user sally has read-only access. Any other users are blocked from accessing this directory.

Of course, permissions are inherited from parent to child directory. That means that we can specify a subdirectory with a different access policy for Sally:

calc:/branches/calc/bug-142 harry = rw sally = r

# give sally write access only to the 'testing' subdir calc:/branches/calc/bug-142/testing sally = rw

Now Sally can write to the testing subdirectory of the branch, but can still only read other parts. Harry, meanwhile, continues to have complete read-write access to the whole branch.

It's also possible to explicitly deny permission to someone via inheritance rules, by setting the username variable to nothing:

calc:/branches/calc/bug-142 harry = rw sally = r

calc:/branches/calc/bug-142/secret harry =

In this example, Harry has read-write access to the entire bug-142 tree, but has absolutely no access at all to the secret subdirectory within it.

The thing to remember is that the most specific path always matches first. The mod_authz_svn module tries to match the path itself, and then the parent of the path, then the parent of that, and so on. The net effect is that mentioning a specific path in the accessfile will always override any permissions inherited from parent directories.

By default, nobody has any access to the repository at all. That means that if you're starting with an empty file, you'll probably want to give at least read permission to all users at the root of the repository. You can do this by using the asterisk variable (*), which means “all users”:

/ * = r

This is a common setup; notice that there's no repository name mentioned in the section name. This makes all repositories world readable to all users, whether you're using SVNPath or SVNParentPath. Once all users have read-access to the repositories, you can give explicit rw permission to certain users on specific subdirectories within specific repositories.

The asterisk variable (*) is also worth special mention here: it's the only pattern which matches an anonymous user. If you've configured your Location block to allow a mixture of anonymous and authenticated access, all users start out accessing Apache anonymously. mod_authz_svn looks for a * value defined for the path being accessed; if it can't find one, then Apache demands real authentication from the client.

The access file also allows you to define whole groups of users, much like the Unix /etc/group file:

groups calc-developers = harry, sally, joe paint-developers = frank, sally, jane everyone = harry, sally, joe, frank, sally, jane

Groups can be granted access control just like users. Distinguish them with an “at” (@) prefix:

calc:/projects/calc @calc-developers = rw

paint:/projects/paint @paint-developers = rw jane = r

...and that's pretty much all there is to it.

[edit]

1.4.5. Extra Goodies ¶

We've covered most of the authentication and authorization options for Apache and mod_dav_svn. But there are a few other nice features that Apache provides.

[edit]

1.4.5.1. Repository Browsing ¶

One of the most useful benefits of an Apache/WebDAV configuration for your Subversion repository is that the youngest revisions of your versioned files and directories are immediately available for viewing via a regular web browser. Since Subversion uses URLs to identify versioned resources, those URLs used for HTTP-based repository access can be typed directly into a Web browser. Your browser will issue a GET request for that URL, and based on whether that URL represents a versioned directory or file, mod_dav_svn will respond with a directory listing or with file contents.

Since the URLs do not contain any information about which version of the resource you wish to see, mod_dav_svn will always answer with the youngest version. This functionality has the wonderful side-effect that you can pass around Subversion URLs to your peers as references to documents, and those URLs will always point at the latest manifestation of that document. Of course, you can even use the URLs as hyperlinks from other web sites, too.

You generally will get more use out of URLs to versioned files—after all, that's where the interesting content tends to lie. But you might have occasion to browse a Subversion directory listing, where you'll quickly note that the generated HTML used to display that listing is very basic, and certainly not intended to be aesthetically pleasing (or even interesting). To enable customization of these directory displays, Subversion provides an XML index feature. A single SVNIndexXSLT directive in your repository's Location block of httpd.conf will instruct mod_dav_svn to generate XML output when displaying a directory listing, and to reference the XSLT stylesheet of your choice:

DAV svn SVNParentPath /usr/local/svn SVNIndexXSLT "/svnindex.xsl" …

</Location>

Using the SVNIndexXSLT directive and a creative XSLT stylesheet, you can make your directory listings match the color schemes and imagery used in other parts of your website. Or, if you'd prefer, you can use the sample stylesheets provided in the Subversion source distribution's tools/xslt/ directory. Keep in mind that the path provided to the SVNIndexXSLT directory is actually a URL path—browsers need to be able to read your stylesheets in order to make use of them!

Can I view older revisions?

With an ordinary web browser? In one word: nope. At least, not with mod_dav_svn as your only tool.

Your web browser only speaks ordinary HTTP. That means it only knows how to GET public URLs, which represent the latest versions of files and directories. According to the WebDAV/DeltaV spec, each server defines a private URL syntax for older versions of resources, and that syntax is opaque to clients. To find an older version of a file, a client must follow a specific procedure to “discover” the proper URL; the procedure involves issuing a series of WebDAV PROPFIND requests and understanding DeltaV concepts. This is something your web browser simply can't do.

So to answer the question, one obvious way to see older revisions of files and directories is by passing the --revision argument to the svn list and svn cat commands. To browse old revisions with your web browser, however, you can use third-party software. A good example of this is ViewCVS (http://viewcvs.sourceforge.net/). ViewCVS was originally written to display CVS repositories through the web, and the latest bleeding-edge versions (at the time of writing) are able to understand Subversion repositories as well.

[edit]

1.4.5.2. Other Features ¶

Several of the features already provided by Apache in its role as a robust Web server can be leveraged for increased functionality or security in Subversion as well. Subversion communicates with Apache using Neon, which is a generic HTTP/WebDAV library with support for such mechanisms as SSL (the Secure Socket Layer, discussed earlier) and Deflate compression (the same algorithm used by the gzip and PKZIP programs to “shrink” files into smaller chunks of data). You need only to compile support for the features you desire into Subversion and Apache, and properly configure the programs to use those features.

Deflate compression places a small burden on the client and server to compress and decompress network transmissions as a way to minimize the size of the actual transmission. In cases where network bandwidth is in short supply, this kind of compression can greatly increase the speed at which communications between server and client can be sent. In extreme cases, this minimized network transmission could be the difference between an operation timing out or completing successfully.

Less interesting, but equally useful, are other features of the Apache and Subversion relationship, such as the ability to specify a custom port (instead of the default HTTP port 80) or a virtual domain name by which the Subversion repository should be accessed, or the ability to access the repository through a proxy. These things are all supported by Neon, so Subversion gets that support for free.

Finally, because mod_dav_svn is speaking a semi-complete dialect of WebDAV/DeltaV, it's possible to access the repository via third-party DAV clients. Most modern operating systems (Win32, OS X, and Linux) have the built-in ability to mount a DAV server as a standard network “share”. This is a complicated topic; for details, read Appendix C, WebDAV and Autoversioning.

[edit]

1.5. Supporting Multiple Repository Access Methods ¶

You've seen how a repository can be accessed in many different ways. But is it possible—or safe—for your repository to be accessed by multiple methods simultaneously? The answer is yes, provided you use a bit of foresight.

At any given time, these processes may require read and write access to your repository:

regular system users using a Subversion client (as themselves) to access the repository directly via file:/// URLs;
regular system users connecting to SSH-spawned private svnserve processes (running as themselves) which access the repository;
an svnserve process—either a daemon or one launched by inetd—running as a particular fixed user;
an Apache httpd process, running as a particular fixed user.

The most common problem administrators run into is repository ownership and permissions. Does every process (or user) in the previous list have the rights to read and write the Berkeley DB files? Assuming you have a Unix-like operating system, a straightforward approach might be to place every potential repository user into a new svn group, and make the repository wholly owned by that group. But even that's not enough, because a process may write to the database files using an unfriendly umask—one that prevents access by other users.

So the next step beyond setting up a common group for repository users is to force every repository-accessing process to use a sane umask. For users accessing the repository directly, you can make the svn program into a wrapper script that first sets umask 002 and then runs the real svn client program. You can write a similar wrapper script for the svnserve program, and add a umask 002 command to Apache's own startup script, apachectl. For example:

$ cat /usr/local/bin/svn

#!/bin/sh

umask 002 /usr/local/subversion/bin/svn "$@"

Another common problem is often encountered on Unix-like systems. As a repository is used, BerkeleyDB occasionally creates new logfiles to journal its actions. Even if the repository is wholly owned by the svn group, these newly created files won't necessarily be owned by that same group, which then creates more permissions problems for your users. A good workaround is to set the group SUID bit on the repository's db directory. This causes all newly-created logfiles to have the same group owner as the parent directory.

Once you've jumped through these hoops, your repository should be accessible by all the necessary processes. It may seem a bit messy and complicated, but the problems of having multiple users sharing write-access to common files are classic ones that are not often elegantly solved.

Fortunately, most repository administrators will never need to have such a complex configuration. Users who wish to access repositories that live on the same machine are not limited to using file:// access URLs—they can typically contact the Apache HTTP server or svnserve using localhost for the server name in their http:// or svn:// URLs. And to maintain multiple server processes for your Subversion repositories is likely to be more of a headache than necessary. We recommend you choose the server that best meets your needs and stick with it!

The svn+ssh:// server checklist

It can be quite tricky to get a bunch of users with existing SSH accounts to share a repository without permissions problems. If you're confused about all the things that you (as an administrator) need to do on a Unix-like system, here's a quick checklist that resummarizes some of things discussed in this section:

All of your SSH users need to be able to read and write to the repository. Put all the SSH users into a single group. Make the repository wholly owned by that group, and set the group permissions to read/write.
Your users need to use a sane umask when accessing the repository. Make sure that svnserve (/usr/local/bin/svnserve, or wherever it lives in $PATH) is actually a wrapper script which sets umask 002 and executes the real svnserve binary. Take similar measures when using svnlook and svnadmin. Either run them with a sane umask, or wrap them as described above.
When BerkeleyDB creates new logfiles, they need to be owned by the group as well, so make sure you run chmod g+s on the repository's db directory.

19 This problem is actually a FAQ, resulting from a misconfigured server setup.

20 Again, a common mistake is to misconfigure a server so that it never issues an authentication challenge. When users pass --username and --password options to the client, they're surprised to see that they're never used, i.e. new revisions still appear to have been committed anonymously!

21 See RFC 2195.

22 They really hate doing that.

23 While self-signed server certificates are still vulnerable to a “man in the middle” attack, such an attack is still much more difficult for a casual observer to pull off, compared to sniffing unprotected passwords.

24 More security-conscious folk might not want to store the client certificate password in the runtime servers file.