Deployment with Capistrano

by Omar Meeky

Issue: Vol 2, Issue 1 - All Stuff, No Fluff

published in June 2010

Omar is a software developer living in Cairo, Egypt. His interests are everything related to technology, sports or science. He is currently a partner in Mash Ltd. located in Egypt and enjoys writing about Rails from time to time. Omar can be reached at cousine.tumblr.com and via twitter @cousine.

Introduction

Deployment

Since the dawn of software development, developers have treated “Deployment” as a phase of the software life-cycle. It is not explicitly defined, but it is implied by the last two activities of the life-cycle: Verification and Maintenance.

Deployment has been (and most probably for many developers still is) a manual, tedious and error-prone task. Personally, I always feared the day I had to deploy an application for a client. One had to actually edit the configurations appropriate for the production server, upload the application to the server, run the tests and set up the web server configuration. You can imagine the frustration if one of those steps failed over a remote connection.

This is where Rails and Capistrano come in. Rails is very organized and has a lovely environment setup (Development, Test and Production), and Capistrano makes use of source-control and versioning tools like Git and SVN to automate the deployment process.

What is Capistrano?

As stated on the project's website, "Capistrano is a tool for automating tasks on one or more remote servers. It executes commands in parallel on all targeted machines, and provides a mechanism for rolling back changes across multiple machines."

In short, Capistrano lets you write, in pure Ruby, a series of tasks to be performed at deployment. This makes it easy to perform tests, run tasks, migrate databases and configure your web server with just one command, fully automating the process on many remote servers without the need for manual SSH-ing or scripting.

Why should you use Capistrano for deployment?

Personally, the previous features were just enough for me to start digging into Capistrano and convincing my company to use it, but if you still need reasons to start using Capistrano then get ready because the next reason is a gift for all developers.

Imagine having an application running on several remote servers, and your team discovers a bug. It's insane to try and re-deploy the application on each and every server, wasting the team's precious time.

For our team, this was the case with our off-the-shelf CMS. Whenever we were required to update the system with a patch or feature, we had to go through each installation we had done since the last update. It took several days of effort to sync each server with our development environment.

Capistrano lets you forget all of that and with just one command, empowers you to move from one version of your application to another and install all the required packages on multiple servers in parallel.

With these powerful features, you may think that you will have to learn a new DSL or scripting language to use Capistrano, but you couldn't have been more mistaken. Capistrano lets you write all the tasks and configurations in pure Ruby, just as you would write rake tasks.

Server requirements

Capistrano has a few requirements that must be met before you can use it to deploy your application:

SSH based access; neither Telnet nor FTP is supported by Capistrano
Your server has a POSIX-compatible shell installed, named "sh" and residing in the default system path. (If, like most, you are using a Unix-based server, you shouldn't need to worry about this requirement)
You use one password for all password-protected areas and tasks on your server (if you use passwords at all). Preferably, you use public/private key based authentication instead, with a good passphrase set on your key
Some familiarity with the command line; Capistrano has no GUI, so all tasks are run from the command line
Familiarity with the Ruby language (obviously)
Rubygems v1.3.x for Ruby

Those are the official requirements mentioned for Capistrano. For this article, I will be using the following setup:

Ubuntu Hardy Heron (8.04 LTS) server set up with Git and public/private key authentication
Git for source control and version management
Ruby 1.8.6
Rails 2.3.5
Windows development environment (yes it's true :D)

Installing Capistrano

As I mentioned earlier, I am a Windows user (though not a happy one), and even though all the steps I mention in this article are platform agnostic, I will not go into the details of setting up a Windows development environment. I am using Cygwin, which is pretty much a port of the Unix shell to Windows, so if you are a happy OSX or *nix user, you will be able to follow along easily.

Capistrano comes packaged as a gem, so to install it we simply type the command:

gem install capistrano (use sudo for *nix systems and OSX)

This will install Capistrano and its dependencies on your system, and now you can capify your application. Capifying your application simply means configuring it for deployment with Capistrano. To do that, run the following command in your application's root directory:

capify .

This will create a “Capfile” under your application's root directory and a deploy.rb file inside your config directory. The Capfile is simply a Ruby script that tells Capistrano which servers you wish to connect to and which tasks you wish to perform. For the sake of DRY and organization, however, the capify command sets up your Capfile as an entry point for Capistrano, which loads other files according to the namespaces you provide when you deploy, just as you would with rake tasks. The file with all these configuration settings is deploy.rb, and we will be covering that in the next section.

Capifying your application

Now that you have installed Capistrano and created the Capfile for your application, it's time to start writing tasks for deployment. So fire up your favorite IDE or text editor and open config/deploy.rb, and you will find that Capistrano has filled out the file with some basic settings:

set :application, "set your application name here"
set :repository, "set your repository location here"

set :scm, :subversion
# Or: 'accurev', 'bzr', 'cvs', 'darcs', 'git',
# 'mercurial', 'perforce', 'subversion' or 'none'

# Your HTTP server, Apache/etc
role :web, "your web-server here"
# This may be the same as your `Web` server
role :app, "your app-server here"
# This is where Rails migrations will run
role :db, "your primary db-server here", :primary => true
role :db, "your slave db-server here"

# If you are using Passenger mod_rails uncomment
# this. If you're still using the
# script/reaper helper you will need these
# http://github.com/rails/irs_process_scripts

# namespace :deploy do
#   task :start do ; end
#   task :stop do ; end
#   task :restart, :roles => :app, :except => { :no_release => true } do
#     run "#{try_sudo} touch #{File.join(current_path,'tmp','restart.txt')}"
#   end
# end

The file is divided into two sections: the first is the configuration section, where you give Capistrano all the information it needs, and the second is the tasks section, where you define tasks to perform remotely on your server(s).

General options

To configure your deployment script you will need to set up some options, beginning with the application option, where you specify your application name.

set :application, "My App"

You will also need to set the domain attribute; this is the URL where your application will be hosted.

set :domain, "www.your-app.com"

Capistrano additionally needs to know where your application will be deployed. You can tell it by setting the deploy_to attribute to the deploy path on your server(s).

set :deploy_to, "/path/to/your/deployed/app"

By default, Capistrano runs some commands on the remote servers with sudo; if you wish to override this behavior, simply set the use_sudo attribute to false.

set :use_sudo, false

Servers often use non-standard ports for critical protocols. If you are not using the default SSH port (22), you will need to set the port attribute so Capistrano can connect to your servers.

set :port, 999 # replace with your port number

The repository attribute configures your repository location, which is where your application resides. It can be a simple “.” to denote the current directory, or your repository URL if you are using source control, in which case it should look something like this (if you are using Git):

set :repository, "USERNAME@YOUR-SERVER:YOUR-APPLICATION.git"

Now let's have a look at some more specific configuration options.

Source control

To use source control, you need to tell Capistrano which source control manager you are using, which can be done via the :scm attribute:

set :scm, :git

Capistrano also supports AccuRev, Bazaar, Darcs, CVS, Subversion, Mercurial, and Perforce, and will use your repository trunk. If you wish to use a specific branch, you can set the branch attribute:

set :branch, "BRANCH_NAME"

If you are not using public/private keys to access your repository, you will need to set up the :scm_passphrase or :scm_password attribute, or else you will be prompted while deploying your application:

set :scm_passphrase, "YOUR_PASSWORD"

set :scm_password, "YOUR_PASSWORD"

If both attributes have the same value, just use the one you prefer. On the other hand, if you are using public/private keys, Capistrano will use the default key found in your ssh configuration directory, unless you define something different in the ssh_options[:keys] hash.

ssh_options[:keys] = %w(path/to/your/key)

Additionally, Capistrano will use the username you are logged in as locally to deploy. This can of course be a problem if your server (like mine) has a special user set up for deployment (for security reasons), or if you are working as part of a team (where every member has his/her own username). To solve this, we can use the user attribute.

set :user, "USERNAME"

Though not recommended, you can also just have Capistrano copy your files over to the server, and to do so you can set :scm to :none, and use the copy deployment strategy.

set :repository, "."

set :scm, :none

set :deploy_via, :copy

Don't worry if you don't understand deployment strategies just yet; we are going to explain those in the next section.

Deployment strategies

Deployment strategies define how Capistrano uploads your code to your servers. Capistrano offers four deployment strategies: checkout, copy, export, and remote cache. Each has its pros and cons, and the right choice really depends on your project and network setup.

Checkout and export mirror their SVN counterparts, so if you use SVN, expect the same behavior. Checkout performs a checkout command on your repository, which makes it easy to update your code in subsequent deploys.

Export on the other hand, performs an export, which extracts a copy of the HEAD, minus the source control meta data (.git, .svn, …, etc), but the exported version cannot be updated from the repository afterwards.

The copy deployment strategy, as covered above, performs a simple copy operation to upload your application. This strategy was mainly added for those who struggle with firewalls or network problems. It is not limited to those not using source control: if you are using an SCM, Capistrano by default performs a checkout of your repository, compresses the code and copies it over scp to your servers, where it is extracted again.

If you prefer export to checkout, you can set the copy_strategy to export.

set :copy_strategy, :export

For faster deployments, you could also set the copy_cache attribute to true. This will checkout (or export) your code once to a new directory on your local machine and just re-sync that directory in subsequent deployments. Additionally, you could exclude files by specifying them in the copy_exclude attribute; notice that the copy_exclude attribute takes a file glob (or an array of globs).

set :copy_cache, true

set :copy_exclude, ".git/*"
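If you are unsure which files a glob will pick up, you can preview it locally with plain Ruby before putting it into copy_exclude. The snippet below is only a rough sketch: the preview_exclusions helper and the sample file names are made up for illustration, and it relies on Ruby's own Dir.glob rather than on anything Capistrano provides.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: list the paths (relative to root) that the
# given glob or array of globs selects, using Ruby's Dir.glob.
def preview_exclusions(root, globs)
  Dir.chdir(root) do
    Array(globs).flat_map { |glob| Dir.glob(glob) }.sort
  end
end

# Build a throwaway project tree so the example is self-contained.
matches = Dir.mktmpdir do |root|
  %w[.git/config .git/HEAD app/models/user.rb log/production.log].each do |path|
    full_path = File.join(root, path)
    FileUtils.mkdir_p(File.dirname(full_path))
    File.write(full_path, "")
  end

  preview_exclusions(root, [".git/*", "log/*.log"])
end

puts matches
```

Only the entries matched by one of the globs are listed; app/models/user.rb is left alone, so it would still be copied.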

Note that setting the copy_cache attribute to true will ignore the copy_strategy set. Also if you would like your cache to be placed somewhere specific, you can specify the path instead of true in the copy_cache attribute.

set :copy_cache, "path/to/your/cache"

Another useful customization is the copy_compression option, which specifies the type of compression to use: gzip, zip or bz2. gzip is used by default.

set :copy_compression, :gzip

The last strategy that Capistrano offers is remote cache. Remote cache uses a working copy of your code stored in a 'cache' on the target server(s) to speed up deployment.

This uses the repository_cache attribute to identify where the cache will be stored; by default this is a 'cached-copy' directory under :shared_path (which itself defaults to the 'shared' directory under :deploy_to).
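To see where the cache ends up with those defaults, you can work the path out by hand. This is just a small illustration in plain Ruby, using the same placeholder deploy path as the rest of this article; the local variable names mirror the Capistrano attributes but are ordinary Ruby variables here.

```ruby
# Placeholder deploy path, as used elsewhere in this article.
deploy_to = "/path/to/www"

# Defaults described above: shared/ lives under deploy_to,
# and the cache lives in cached-copy/ under shared/.
shared_path      = File.join(deploy_to, "shared")
repository_cache = File.join(shared_path, "cached-copy")

puts repository_cache
```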

Remote cache works by targeting the cache directory and making sure it matches your repository by updating it to the latest version via git pull or svn update depending on your scm. It then copies the cache to your :deploy_to location. You could also use the copy_exclude attribute to exclude files from the copy process.

set :deploy_via, :remote_cache

set :deploy_to, "/path/to/www"

Now our configuration part is complete. Next we will have a look at roles and how we can define multiple servers.

Roles

Roles are named sets of servers that you can execute your tasks against. Three roles are defined by default, namely app, web and db.

role :app, "your app-server here"

role :web, "your web-server here"

role :db, "your db-server here", :primary => true

If you are using a single server for your application, those three roles are identical:

role :app, "www.your-app.com"
role :web, "www.your-app.com"
role :db, "www.your-app.com", :primary => true

The app role is where your application is run, i.e. this is where your ruby/rails daemon runs. The web role defines where your incoming requests are handled, and is usually the frontend URL where your web server is running.

The db role as stated in Capistrano's documentation, is just used for specifying which server should be used to run Rails migrations, and by setting the primary attribute to true, Capistrano will only run the migrations on that box.

Note that the db role was not meant to specify a database server that is not running Rails application code.

You can also create custom roles like so:

role :multiserver_role, "www.your-app.com", "www.another-url.com"

role :single_server, "www.a-server.com"
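If it helps to picture what a role is, think of it as a name pointing at a list of hosts. The following is a toy model in plain Ruby, not Capistrano's actual implementation; the ROLES hash and the servers_for helper are invented for this illustration, and the host names are the placeholders used above.

```ruby
# Toy model: each role name maps to the servers that belong to it.
ROLES = {
  :app              => ["www.your-app.com"],
  :web              => ["www.your-app.com"],
  :db               => ["www.your-app.com"],
  :multiserver_role => ["www.your-app.com", "www.another-url.com"]
}

# Hypothetical helper: collect the distinct servers behind the given roles.
def servers_for(roles, *names)
  names.flat_map { |name| roles.fetch(name) }.uniq
end

puts servers_for(ROLES, :app, :web, :db).inspect
puts servers_for(ROLES, :multiserver_role).inspect
```

On a single-server setup the three default roles collapse to one host, while the custom role fans out to two.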

Now you can use those roles in your tasks. We will have a look at tasks next, to see how they are defined.

Tasks

Capistrano has its own way of naming deployment scripts: a recipe is a collection of tasks, and a task is just like a rake task. In other words, you create deployment recipes.

To create a task, you just add it to the end of your deploy script:

desc "task description"
task :do_something_interesting do
  # your interesting code here
end

You can then execute that task by running a simple rake-like command in your application root directory:

cap do_something_interesting

By default this will execute the task on all defined roles. To override this behavior, you can specify which role you want your task to run against:

desc "task description"
task :do_something_interesting, :roles => :app do
  # your interesting code here
end

Now when you run your cap command, the do_something_interesting task will run only against the app role.

You may also want your tasks to run automatically alongside the default tasks; for example, you may want to run touch_restart_txt after each deploy. To do so, you can simply use the before and after callbacks.

before "deploy:update_code", :do_something_interesting
after "deploy:update_code", :touch_restart_txt
after "deploy:update_code", :do_another_interesting_thing

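When deploy:update_code is performed, the before callback runs first, then deploy:update_code itself, then the after callbacks in the order they were registered, so the custom tasks execute in the following order: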

:do_something_interesting
:touch_restart_txt
:do_another_interesting_thing
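The ordering above can be mimicked with a few lines of plain Ruby. To be clear, this is a toy model written for this article and not how Capistrano is implemented; the ToyRunner class and its methods are hypothetical, and it assumes that after callbacks fire in registration order.

```ruby
# Toy model of before/after callbacks (NOT Capistrano's implementation).
class ToyRunner
  attr_reader :log

  def initialize
    @tasks  = {}
    @before = Hash.new { |hash, key| hash[key] = [] }
    @after  = Hash.new { |hash, key| hash[key] = [] }
    @log    = []
  end

  # Register a named task with its body.
  def task(name, &block)
    @tasks[name.to_s] = block
  end

  # Register a task to run before/after another task.
  def before(target, hook)
    @before[target.to_s] << hook.to_s
  end

  def after(target, hook)
    @after[target.to_s] << hook.to_s
  end

  # Run the before hooks, then the task itself, then the after hooks.
  def run(name)
    name = name.to_s
    @before[name].each { |hook| run(hook) }
    @log << name
    @tasks.fetch(name).call
    @after[name].each { |hook| run(hook) }
  end
end

runner = ToyRunner.new
%w[deploy:update_code do_something_interesting
   touch_restart_txt do_another_interesting_thing].each do |name|
  runner.task(name) { } # empty bodies; we only care about the order
end

runner.before "deploy:update_code", :do_something_interesting
runner.after  "deploy:update_code", :touch_restart_txt
runner.after  "deploy:update_code", :do_another_interesting_thing

runner.run "deploy:update_code"
puts runner.log
```

The log lists do_something_interesting first, then deploy:update_code, followed by the two after callbacks in the order they were added.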

One more important thing to cover next is dependencies. Let's take a look.

Dependencies

Most probably your application depends on gems, directories or commands. Capistrano lets you define those dependencies, whether local or remote, using the depend method:

depend :remote, :gem, "cucumber", ">=0.3.5"

depend :local, :command, "git"

depend :remote, :directory, "/path/to/dependency/directory"

Capistrano can now use that information to check your dependencies on different machines when you deploy, which is useful when you want to check whether the server is ready for your application. You can do so by running the deploy:check task:

cap deploy:check

This will check directory permissions, necessary utilities, etc, along with your custom defined dependencies.
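To get a feel for what a check like depend :local, :command, "git" has to answer, here is a rough plain-Ruby sketch of looking a command up on the search path. It is an illustration only and makes no claim about how Capistrano performs its checks; the command_available? helper is hypothetical.

```ruby
# Hypothetical helper: is there an executable file called `name`
# in one of the directories of the given search path?
def command_available?(name, search_path = ENV["PATH"].to_s)
  search_path.split(File::PATH_SEPARATOR).any? do |dir|
    candidate = File.join(dir, name)
    File.file?(candidate) && File.executable?(candidate)
  end
end

if command_available?("git")
  puts "git found on PATH"
else
  puts "git is missing, install it before deploying"
end
```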

Our deploy file is now ready, but before we deploy just yet, we have to attend to some limitations in Capistrano.

Setting up the database

Currently Capistrano doesn't fully automate the process of setting up your database, so to prepare your database you will need to log in to your server and create the databases you're going to use.

Assuming you are using MySQL, here is a short example:

$ ssh <user>@yourserver.com
yourserver.com$ mysql -p
Enter password:
………
mysql> CREATE DATABASE <db-name>;
Query OK, 1 row affected (0.00 sec)
mysql> exit;

Starting your application

After deploying, Capistrano will try to run your application. For that to work, you either need to create a “spin” script in the script folder of your application, or override the deploy:start task. The “spin” approach is handier and cleaner.

Create a file named spin in the script folder, and let's make it call the “spawner” script, which resides in the script/process folder. The spawner script is no longer included in core Rails as of Rails 2.3, so to get it you need to install the irs_process_scripts plugin.

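#!/bin/sh
# Replace YOUR_PORT with the port your web server proxies to;
# a shell script does not expand Ruby-style #{port} interpolation.
/var/www/your_app/current/script/process/spawner \
  mongrel \
  --environment=production \
  --instances=1 \
  --address=127.0.0.1 \
  --port=YOUR_PORT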

Next we need to mark the file as executable (if running on *nix or OSX):

$ chmod +x script/spin

Then add the file to your source control repository.

Deploying the application

Now that our deploy recipe is complete, we need to deploy our application. But first, let's create our directory structure on the server(s):

cap deploy:setup

This will create the directory structure for the deployment process as follows:

(deploy_to)/
(deploy_to)/releases
(deploy_to)/shared
(deploy_to)/shared/system
(deploy_to)/shared/pids
(deploy_to)/shared/log

The releases folder holds every version that you deploy, which is quite useful when you want to revert to a previous version of your application. The shared folder, by contrast, is static and shares all its contents with all the releases. It is a good place for things like images, which do not change often between releases.
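The reason keeping every release makes reverting easy can be sketched in a few lines of Ruby. The example assumes timestamp-style release directory names (YYYYMMDDHHMMSS), like the ones Capistrano creates; the three release names below are invented for illustration, and the code only computes paths, it does not touch a server.

```ruby
# Placeholder deploy path, as used elsewhere in this article.
deploy_to     = "/path/to/www"
releases_path = File.join(deploy_to, "releases")

# Made-up release directories, deliberately out of order.
release_names = %w[20100601120000 20100603093000 20100602180000]

# Timestamp names sort chronologically as plain strings.
sorted = release_names.sort

current_release  = File.join(releases_path, sorted.last)
previous_release = File.join(releases_path, sorted[-2])

puts "current:  #{current_release}"
puts "previous: #{previous_release}"
```

Reverting then amounts to pointing the application back at the previous release directory, which is still sitting on the server.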

Before we deploy, we need to check if the server is ready for deployment:

cap deploy:check

If any problem occurs, we should fix it and then re-run the check task. Once it passes, we can push the code to the server by running the following command in your application root folder:

cap deploy:update

This will copy your code to the server(s) and create a symlink called “current” in your deploy_to path that points to the release. It will not start your application just yet, which makes this step useful for detecting problems.

The next step is loading your schema. To do so, log into your server, change into the current release directory (i.e. deploy_to/current) and run the following command:

$ rake RAILS_ENV=production db:schema:load

If that succeeds, we can test if the application starts up normally by running the console:

$ script/console production

Once the console has started normally, we can test an HTTP request by using the app helper in the console:

>> app.get("/")

If the returned status is 2xx, 3xx, or even 4xx, then all is OK. If it is in the 500s, you should track down the problem in your production.log file.
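The rule of thumb above fits in a one-line Ruby helper, which you could paste into the console if you like. The response_ok? name is made up for this example; it simply treats any status from 200 to 499 as a sign that the application itself is up.

```ruby
# Hypothetical helper: 2xx, 3xx and 4xx mean the app answered;
# anything else (notably 5xx) deserves a look at production.log.
def response_ok?(status)
  (200..499).cover?(status)
end

[200, 302, 404, 500].each do |status|
  verdict = response_ok?(status) ? "ok" : "check production.log"
  puts "#{status}: #{verdict}"
end
```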

We can now safely start our application by running the command:

cap deploy:start

Once the command is done, you should (theoretically) be able to access your application through the browser. If you get an error such as a “proxy” error, it means the web server is trying to proxy to the wrong port, or that your dispatcher is not running.

Now that our application is running fine, we can test restarting the application, and we can do that by running the following command:

cap deploy:restart

If you can still access your application through the browser then the restart is successful, otherwise you should troubleshoot the problems by examining your production log file.

Finally, we can perform a full deploy using the following command, and nothing should go wrong:

cap deploy

As usual, track down any problems that occur and fix them. Once that is done, you have deployed your application successfully. Congratulations!

Conclusion

In this article, we have seen how Capistrano is a handy tool for deployment, and how to set up a basic deployment recipe. We have only scratched the surface, and Capistrano is very rich in useful features to manage the production stage of your applications. So I encourage you to check the Rdocs and the FAQ on the project's website.

Resources

Capistrano website
http://www.capify.org/index.php/Frequently_Asked_Questions#How_do_I_prepare_the_database.3F
http://www.capify.org/index.php/Frequently_Asked_Questions
irs_process_scripts plugin
Glob