A Shining Ruby in Production Environments

Even the most beautiful Rails application can lose its elegance if not deployed correctly. Like other Ruby frameworks and DSLs, such as Sinatra, Rails is built on the Rack interface. This article provides a basic introduction to Rack hosting and Rack-based application deployments.

When Rails first was released in 2005, developers exulted. Finally, a comprehensive open-source framework for Web applications was available, packed with a set of tools making Web development fast, productive and fun. Rails has the reputation of being a "heaven for developers", but despite the many facilities it provides for avoiding typical and repetitive tasks, there is still a weak spot: deployment. Deploying a Rails application is not a smooth matter. Everyone knows that Rails applications will be published on-line one day, but not precisely how.

Platform as a Service (PaaS)

Developers often choose to purchase hosting space as Platform as a Service (for example, Heroku, OpenShift or EngineYard). PaaS is marvelous as it provides a ready-to-use environment containing a full stack of software dependencies. Publishing on a PaaS platform is, as a rule, easy and fast, and everything tends to work (almost) immediately. But there are at least two cases when PaaS won't fit your needs: when applications must be kept in the customer's private infrastructure, or when applications have special hardware or software requirements, for instance, when you need a specific software service not supported by your PaaS provider.

In such situations, you must implement custom virtual server configurations and custom deployment procedures. You can deploy Rails applications on physical servers or on virtual machines. Complete cloud services such as Amazon Web Services (AWS), which let you create complex infrastructures made of several Web servers, database servers and front-end load balancers, are rapidly growing in popularity. This approach is very flexible, although you must access, install and manage the operating system and the distribution packages, configure the network, activate the services and so on. In this article, I describe the software requirements for Rack-based hosting and some basic example configurations to implement automated Ruby hosting on a GNU/Linux server.

RVM

First, if you want to host Ruby software, you must install the Ruby platform. You can install Ruby and gems with apt-get or yum. It's easy, but when your application requires specific gem versions or specific interpreter versions, you will face a common problem. How can you satisfy these requests if your GNU/Linux distribution doesn't package those specific versions? Furthermore, how can you maintain multiple Ruby versions in a clean and repeatable manner?

You may think you can just download the Ruby platform and compile it manually. Compiling by hand does guarantee that you can install the interpreter and gem versions you need. Unfortunately, it is inconvenient: software managed this way is hard to update.

There are several solutions for overcoming these common issues. The one I find most reliable for server environments is the Ruby Version Manager (RVM), formerly expanded as Ruby enVironment Manager. RVM comes packed with a set of scripts that help you install and update the Ruby ecosystem.

Download RVM by issuing the following command as root:


# \curl -L https://get.rvm.io | bash -s stable

Although it's recommended that you work with RVM through security facilities such as sudo, the rvm executable must be available in root's $PATH, so install it as root. For a multiuser RVM installation, typical for servers, the software is kept by default in the /usr/local/rvm directory, so you can remove the whole distribution safely with an rm -fr /usr/local/rvm command.

Before proceeding with the Ruby installation, make sure your system is ready to compile Ruby. Check that the rvm command is available in your PATH (if not, log out and log in again, or start a new login shell with bash -l), and execute the following command:


$ sudo rvm requirements

RVM will install, through yum or apt-get, the required packages to compile the Ruby distribution. In this article, I use the stable official Ruby distribution called MRI, Matz Ruby Interpreter (derived from the name of Ruby's creator, Yukihiro Matsumoto).

Now, you'll likely need to add to your future Rubies some basic libraries typically needed by some complex gems or software. Setting up such libraries immediately will guarantee that the Ruby software will never complain that the system libraries are old or incompatible, generating annoying errors. Previously, you would have installed these extra packages via the rvm pkg install <pkg> command, but now RVM deprecates this. Instead, simply enable autolibs to delegate to RVM the responsibility to build coherent and not-buggy distributions:


$ sudo rvm autolibs enable

You finally are ready to provide your environment a full Ruby distribution. For example, let's install the latest stable version of the official MRI interpreter, the 2.0.0 version:


$ sudo rvm install 2.0.0

If everything goes well, the distribution is available to root and to the system users. If not, the cause is commonly a $PATH problem, so adjust it in a script under /etc/profile.d. To avoid deployment pitfalls, also verify that the $GEM_HOME variable is exported and points to the correct gem path. In practice, if something is not working properly, set the variables like this:


if [ -d "/usr/local/rvm/bin" ] ; then
    # Put the RVM gem binaries and the rvm executable first in PATH.
    PATH="/usr/local/rvm/gems/ruby-2.0.0-p353@global/bin:/usr/local/rvm/bin:$PATH"
    GEM_HOME="/usr/local/rvm/gems/ruby-2.0.0-p353@global"
    export PATH GEM_HOME
fi

You can list the available Ruby versions with this command:


$ rvm list known

On a system running multiple Rubies, users and system processes may load other environment versions with a command like this:


$ rvm use jruby-1.7.1

And set the default system distribution in this way:


$ rvm --default use 2.0
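
When a deployment misbehaves, it can help to ask Ruby itself which interpreter and gem directory a process has actually loaded. The following is a minimal sketch in plain Ruby; the method name ruby_environment_report is made up for this example, and the values it prints depend entirely on your installation:

```ruby
require "rbconfig"
require "rubygems"

# Collects the facts that usually matter when debugging a $PATH or
# $GEM_HOME problem (the method name is hypothetical).
def ruby_environment_report
  {
    "ruby_version" => RUBY_VERSION,
    "ruby_binary"  => RbConfig.ruby,   # full path of the running interpreter
    "gem_dir"      => Gem.dir,         # directory where gems are installed
    "gem_home_env" => ENV["GEM_HOME"]  # nil when the variable is not exported
  }
end

ruby_environment_report.each do |key, value|
  puts "#{key}: #{value.inspect}"
end
```

Running the script as the deploy user and again through sudo is a quick way to spot two accounts that resolve to different Rubies.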

The Web Server

Ruby on Rails, like Sinatra and many other popular Ruby frameworks or Domain Specific Languages, is based on an interface named Rack. Rack provides the minimal abstraction possible between Web servers supporting Ruby and Ruby frameworks. Rack is responsible for invoking the main instance of your application as specified in the startup file, config.ru.

So, a Web server hosting Ruby Web applications will have to understand how Rack talks. With a stable and clean Ruby environment, you're ready to build your Web server that is capable of speaking Rack.
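
To make the Rack contract concrete, here is a minimal sketch of a Rack-style application in plain Ruby. The class name HelloKolobok is hypothetical. The only thing Rack asks for is an object that responds to call, receives the request environment as a hash and returns an array of status, headers and body. The example calls the object directly with a hand-built environment, so it runs without any Web server:

```ruby
# A minimal Rack-style application (HelloKolobok is a hypothetical name).
# Rack only requires an object that responds to #call(env) and returns
# a three-element array: [status, headers, body].
class HelloKolobok
  def call(env)
    path = env.fetch("PATH_INFO", "/")
    body = "Hello from #{path}\n"
    headers = {
      "content-type"   => "text/plain",
      "content-length" => body.bytesize.to_s
    }
    # The body must respond to #each, so it is wrapped in an array.
    [200, headers, [body]]
  end
end

# In config.ru you would finish with:
#   run HelloKolobok.new
# Here the app is called directly with a hand-built environment hash.
if __FILE__ == $PROGRAM_NAME
  status, _headers, body = HelloKolobok.new.call("PATH_INFO" => "/status")
  puts status
  body.each { |chunk| print chunk }
end
```

A Rails application exposes the same kind of object; its config.ru simply passes the application to run.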

With Ruby, you can choose between many Web servers. You may have heard of Mongrel, Unicorn, Thin, Reel or Goliath. For typical Rails deployments, Passenger is one of the most popular choices. It integrates well with Apache and Nginx, so in this example, let's set up an Apache + Passenger configuration.

Passenger Installation

Passenger, developed by Phusion, also formerly known as mod_rails or mod_rack, is a module that allows you to publish Ruby applications in the popular Web server containers Apache or Nginx. Passenger is available as a "community" free edition and as an enterprise release, which includes commercial support and advanced features.

If you chose to install Ruby through packages, Passenger is conveniently available through RPM or DEB repositories, and yum or apt-get will install all the required software.

On an RVM-customized system, to install the free version of Passenger, you need to add the gem through Ruby gems:


$ sudo gem install passenger

Now you can install the server module (the latest version at the time of this writing is 4.0.33) by executing a script provided by the gem:


# passenger-install-apache2-module

Let's select Ruby only, and let's skip Python, Node.js and Meteor support. If your system lacks any required software, the script will suggest the exact yum or apt-get command line to install the missing dependencies.

After some compile time, the script introduces you to the Passenger configuration with useful and self-explanatory output. Specifically, copy the directives that load Passenger into your main Apache configuration file (apache2.conf or httpd.conf):


LoadModule passenger_module /usr/local/rvm/gems/ruby-2.0.0-p353/gems/passenger-4.0.33/buildout/apache2/mod_passenger.so
PassengerRoot /usr/local/rvm/gems/ruby-2.0.0-p353/gems/passenger-4.0.33
PassengerDefaultRuby /usr/local/rvm/wrappers/ruby-2.0.0-p353/ruby

Finally, restart Apache. Et voilà, now you can host Ruby Web applications.

Virtual Hosts

If your goal is to host one or more Ruby applications on the same server, you should activate each instance as a virtual host. The most significant directive with Ruby hosting is the DocumentRoot. It's mandatory that it points to the public/ directory in the application's root project directory. The public/ directory is the default public path of a Rails application. So let's say you have a Kolobok application made in Rails, and you have to deploy it to the DNS zone kolobok.example.com on the kolobok.example.com server. Here is an example VirtualHost:


<VirtualHost *:80>
      ServerName kolobok.example.com
      DocumentRoot /srv/www/kolobok/public
      <Directory /srv/www/kolobok/public>
         # This relaxes Apache security settings.
         AllowOverride all
         # MultiViews must be turned off.
         Options -MultiViews
      </Directory>
</VirtualHost>


Now, if you have put your application in /srv/www/kolobok, and it's well configured (configured and bound to the database and so on), enable the virtual host, reload Apache, and your application is published.

Automating Software Deployments

Ages ago, it was common to deploy Web applications by doing a bulk copy of files via FTP, from the developer's desktop to the server hosting space, or by downloading through Subversion or Git. Although this approach still works for simpler PHP applications, it won't fit more complex projects made using more complex frameworks, such as Rails.

In fact, a Rails application is not made only of source code files. To make a Rails application ready, you have to download and compile its dependencies as gems (by running bundle), safely manage database access and other configurations, migrate the database (create the database and the schema by executing migration files, which describe schema changes in Ruby rather than in raw SQL), adjust paths for shared content (like images, videos and so on), precompile the assets (that is, optimize static content, such as JavaScript and CSS), and perform many other steps in a large and complex workflow. You can execute these steps by writing your own scripts, maybe in Ruby or bash, but this task is tedious and wastes your time. You should instead invest your time in writing good tests.
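
For a sense of what such a hand-written script involves, here is a minimal sketch in Ruby. The step list and the run_steps helper are assumptions made for illustration, not part of any tool; the commands are the usual Bundler and Rake invocations, and the default runner only prints them (a dry run), so nothing is executed by accident:

```ruby
# Hand-rolled deployment steps; the list mirrors the work described above
# and is only an assumption about what a typical project needs.
DEPLOY_STEPS = [
  "git pull origin master",
  "bundle install --deployment --without development test",
  "bundle exec rake db:migrate RAILS_ENV=production",
  "bundle exec rake assets:precompile RAILS_ENV=production"
].freeze

# Runs each command in order and stops at the first failure.
# runner is any callable that returns true on success; the default
# runner only prints the command instead of executing it.
def run_steps(steps, runner = ->(cmd) { puts "would run: #{cmd}"; true })
  steps.each do |cmd|
    return cmd unless runner.call(cmd) # report the step that failed
  end
  nil                                  # nil means every step succeeded
end

failed = run_steps(DEPLOY_STEPS)
puts failed ? "stopped at: #{failed}" : "all steps succeeded"
```

To execute the commands for real, you would pass a runner such as ->(cmd) { system(cmd) }. Even then, the script knows nothing about releases, rollbacks or shared files, which is exactly the part a deployment tool takes over.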

The Ruby community provides several ways to accomplish the whole deploy task, and one very popular method uses Capistrano. Capistrano lets you write a set of "recipes" that will "cook" your application in the production environment. Common tasks executed by Capistrano are: 1) pulling the source code from a git or svn repository; 2) putting it in the right location; 3) checking if a bundle is needed and, if yes, bundling your gems; 4) checking if migrations are required and, if yes, running them; 5) checking if asset precompilation is required and, if yes, precompiling; and 6) checking other Rake tasks you have defined and running them in order. If the recipe fails, Capistrano keeps the current software release in production; otherwise, it replaces the current release with the one you've just deployed. Capistrano is a widely tested and very reliable tool. You definitely can trust it.

Configuring Capistrano

To use Capistrano, you just need to install it through Ruby gems on the system where the deploy will be done (not on the server):


$ gem install capistrano

When Capistrano is available, you'll have two new binaries in your PATH: capify and cap. With capify, you build your deploy skeleton. So, cd to the project directory and type:


$ capify .

This command creates a file named Capfile and a config/deploy.rb file. Capfile tells Capistrano where the right deploy.rb configuration file is. This is the file that includes your recipes, and typically it's kept in the project's config/ directory.

Next, verify that Capistrano is installed correctly, and see the many useful tasks it comes with:


$ cap -T
cap deploy                # Deploys your project.
cap deploy:check          # Tests deployment dependencies.
cap deploy:cleanup        # Cleans up old releases.
cap deploy:cold           # Deploys and starts a 'cold' application.
cap deploy:create_symlink # Updates the symlink to the most recently
                          # deployed...
cap deploy:migrations     # Deploys and runs pending migrations.
cap deploy:pending        # Displays the commits since your last 
                          # deploy.
cap deploy:pending:diff   # Displays the 'diff' since your last 
                          # deploy.
cap deploy:rollback       # Rolls back to a previous version and 
                          # restarts.
cap deploy:rollback:code  # Rolls back to the previously deployed 
                          # version.
cap deploy:setup          # Prepares one or more servers for 
                          # deployment.
cap deploy:symlink        # Deprecated API.
cap deploy:update         # Copies your project and updates the 
                          # symlink.
cap deploy:update_code    # Copies your project to the remote 
                          # servers.
cap deploy:upload         # Copies files to the currently deployed 
                          # version.
cap invoke                # Invokes a single command on the remote 
                          # servers.
cap link_shared           # Link cake, configuration, themes, upload, 
                          # tool
cap shell                 # Begins an interactive Capistrano session.

The user that deploys the application needs valid SSH access to the server (in order to run remote commands with Capistrano) and write permissions to the directory where the project will be deployed. The directory structure that Capistrano creates there lets you maintain software releases. In the deployment directory, Capistrano keeps two directories: one that contains the released software (releases/; the deploy:cleanup task keeps the five most recent releases by default) and another that contains shared or static data (shared/). Moreover, Capistrano manages a symbolic link named current that always points to the most recent successfully deployed release.

In practice, each time Capistrano is invoked to deploy an application, it connects via SSH, creates a new release directory named with the current timestamp (for example, releases/20140115120050), and runs the process (pull, bundle, migrate and so on). If it finishes with no errors, as a final step, Capistrano points the "current" symlink to releases/20140115120050. Otherwise, it leaves "current" pointing to the directory of the last successful deploy.
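
The release-and-symlink mechanism can be tried out locally. The following Ruby sketch is a simulation under stated assumptions, not Capistrano's own code: simulate_deploy is a hypothetical helper, the block stands in for the real work (checkout, bundle, migrate), and everything happens in a temporary directory:

```ruby
require "fileutils"
require "tmpdir"

# Simulates the release layout locally; this is not Capistrano's code.
# deploy_to - root directory of the deployment
# timestamp - name of the new release directory
# The block stands in for the real work (checkout, bundle, migrate...).
def simulate_deploy(deploy_to, timestamp)
  release = File.join(deploy_to, "releases", timestamp)
  current = File.join(deploy_to, "current")
  FileUtils.mkdir_p(release)
  FileUtils.mkdir_p(File.join(deploy_to, "shared"))

  begin
    yield release
  rescue StandardError
    # A failed step removes the half-built release; "current" is untouched.
    FileUtils.rm_rf(release)
    return false
  end

  # Build the new link under a temporary name, then rename it over
  # "current" so the switch happens in a single step.
  tmp_link = "#{current}.tmp"
  File.unlink(tmp_link) if File.symlink?(tmp_link)
  File.symlink(release, tmp_link)
  File.rename(tmp_link, current)
  true
end

Dir.mktmpdir do |root|
  simulate_deploy(root, "20140115120050") do |dir|
    File.write(File.join(dir, "REVISION"), "abc")
  end
  simulate_deploy(root, "20140116090000") { |_dir| raise "migration failed" }
  # The second deploy failed, so "current" still names the first release.
  puts File.basename(File.readlink(File.join(root, "current")))
end
```

Because the Web server only ever follows the current link, a failed deploy never becomes visible to users.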

So with Capistrano, the system administrator will set the virtual server DocumentRoot directive to the current directory of the released application version:


DocumentRoot /srv/www/kolobok/current/public

The Anatomy of a deploy.rb File

A deploy.rb file is virtually made of two parts: one that defines the standard configurations, like the repository server or the path to deploy files physically, and another that includes custom tasks defined by the developer responsible for deploying the application.

Let's deploy the Kolobok application. Open the kolobok/config/deploy.rb file with your favourite editor, delete the example configuration and begin to code it from scratch. A deploy.rb file is programmed in Ruby, so you can use Ruby constructs in your tasks, beyond the Capistrano "keywords".

Let's start by requiring a library:


require "bundler/capistrano"

This statement tells Capistrano to bundle the gems whenever it's necessary. A well-organized Gemfile separates the required gems into groups in this way:


group :test do
  gem 'rspec-rails'
  gem 'capybara'
  gem 'factory_girl_rails'
end


group :production do    
  gem 'execjs'
  gem 'therubyracer'  
  gem 'coffee-rails', '~> 3.1.1'
end

Only the gems common to all environments and included in the :production group are bundled. Gems belonging to :development and :test environments are not. And the first time you deploy your application, a bundle install is executed to bundle all the requirements as specified. The next time you deploy the software, gems are downloaded, compiled or removed only if the Gemfile and the Gemfile.lock have changed. The complete bundle is installed in shared/ and soft-linked into the current instance. By following this approach, less disk space is required.

Then, from Rails 3.1, it's common to release applications with the assets pipeline. The pipeline is active if in config/environments/production.rb the following variable is set to true:


config.assets.compile = true

If your application uses the pipeline, you need to precompile the assets. The Rake task that does this is bundle exec rake assets:precompile. To insert this task into your workflow, and to keep the generated assets in shared/ and linked into the current release, load the standard assets functionality:


load "deploy/assets"

After loading the main requirements, specify the application name, the path on the server where it will be deployed, and the user allowed to SSH:


set :application, "kolobok"
set :deploy_to, "/srv/www/kolobok" 
set :user, "myuser"

With Rails 3 and later, it's recommended to invoke Rake (used for the database migrations and for precompiling the assets) through Bundler, so that the Rake version locked in the bundle is used. So, specify the exact rake command:


set :rake, 'bundle exec rake'

Now it's time to configure the repository from which to pull the project source code:


set :scm, :git
set :branch, "master"
set :repository, "git://github.com/myusername/kolobok.git"

Finally, set the server names:


role :web, "kolobok.example.com"
role :app, "kolobok.example.com"
role :db,  "mydb.example.com", :primary => true 

web is the address of the responding Web server, and app is where the application will be deployed. These roles are the same if the application runs on only one host rather than on a cluster. db is the address of the database, and :primary => true means that migrations will be run there.

Now you have a well-defined deploy.rb and the right server configurations. Begin by creating the directory structure (releases/ and shared/) on the server, from the desktop host:


$ cap deploy:setup

Releasing Software

After having set up the project directory on the server, run the first deploy:


$ cap deploy:cold

The actions performed by Capistrano follow the standard pattern: git checkout, bundle, execute migrations, assets precompile. If everything is fine, your application is finally published as a reliable versioned release, with a current symlink.

Normal deploys (skipping the first Rails app configuration, such as creating the database) will be done in the future by invoking:


$ cap deploy

If you notice that some errors occurred with the current application in production, you immediately can roll back to the previous release by calling Capistrano like this:


$ cap deploy:rollback

Easy, reliable and smart, isn't it?

Custom Tasks

When you deploy a more complex application, you'll normally be handling more complex recipes than the standard Capistrano procedure. For example, if you want to publish an application on GitHub and release it open source, you won't put configurations there (like credentials to access databases or session secret tokens). Rather, it's preferable to copy them in shared/ on the server and link them on the fly before modifying the database or performing your tasks.

In Capistrano, you can define hooks that force the tool to execute required actions before or after other actions. It might be useful, for instance, to link a directory where users of kolobok have uploaded files. When the current link moves to a new release path, you might discover that those files are no longer available to users. So, you can define a final task that, after the code has been deployed, links shared/uploads into the public/uploads directory of the new release. Notice how easily this can be managed with the shared_path and release_path variables:


desc "Link uploaded directory"
task :link_uploads do
  run "ln -nfs #{shared_path}/uploads #{release_path}/public/uploads"
end

Finally, another common task is to restart the application instance in the server container. In the case of Passenger, it's enough to touch the tmp/restart.txt file. So, you can write:


desc "Restart Passenger" 
task :restart do
  run "cd #{current_path} && touch tmp/restart.txt" 
end

You execute these two tasks automatically by hooking them at the end of the deploy flow. So add this extra line just before the task definitions:


after "deploy:update_code", :link_uploads, :restart
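
If the way hooks chain tasks together is unclear, the following toy task runner may help. It is only an illustration written for this article: MiniRecipe and its methods are hypothetical and much simpler than Capistrano's real implementation, but the order in which the tasks run follows the same idea:

```ruby
# A toy task runner, only meant to illustrate how "after" hooks chain
# tasks together. MiniRecipe and its methods are hypothetical.
class MiniRecipe
  attr_reader :log

  def initialize
    @tasks = {}
    @after = Hash.new { |hash, key| hash[key] = [] }
    @log = []
  end

  # Registers a named task.
  def task(name, &block)
    @tasks[name.to_s] = block
  end

  # Registers tasks to run, in order, once the named task has finished.
  def after(name, *hooks)
    @after[name.to_s].concat(hooks.map(&:to_s))
  end

  # Runs a task, records it in the log, then runs its hooks.
  def invoke(name)
    name = name.to_s
    @tasks.fetch(name).call
    @log << name
    @after[name].each { |hook| invoke(hook) }
  end
end

recipe = MiniRecipe.new
recipe.task("deploy:update_code") { puts "copying code" }
recipe.task(:link_uploads)        { puts "linking uploads" }
recipe.task(:restart)             { puts "touching tmp/restart.txt" }
recipe.after("deploy:update_code", :link_uploads, :restart)
recipe.invoke("deploy:update_code")
```

The tasks run in the order in which they were passed to after, which is why link_uploads is listed before restart.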

Performance Issues?

People often complain of Rails' performance in production environments. This is a tricky topic. Tuning servers and application responsiveness are rather hard tasks that cannot be discussed briefly, so I don't cover them here. To make your application faster, you need to combine several technologies and engineering patterns, like setting up intermediate caching services, serving static and dynamic content with different server containers, and monitoring the application with tools like New Relic to find bottlenecks. After having set up the right environment to host the application, optimizing is the next challenge. Happy deploys!

Resources

Rack: https://rack.github.com

RVM: https://rvm.io

Phusion Passenger: https://www.phusionpassenger.com

Capistrano: https://github.com/capistrano/capistrano
