At the Forge - Fixtures and Factories

by Reuven M. Lerner

One of the points of pride in the Ruby community is the degree to which developers are focused on testing. As I wrote last month, tests in a dynamic language have the potential to catch more errors, and to keep your code trimmer and more functional, than even the best compilers. Rails developers are used to working with three different types of tests: unit (for database models), functional (for controller classes) and integration (for testing things from a user's perspective). Combined with coverage and analysis tools, such as the metric_fu gem I described last month, these tests can help ensure that your code is as solid as possible before it is seen by the general public.

Testing your code requires that you provide it with inputs and that you then compare the resulting outputs with the ones you expect. When it comes to a Web application, those inputs most likely will come from either a relational database or from a user's form submission. Testing form submissions is not particularly difficult, especially in a framework such as Rails, which has extensive testing support built in. Testing data that comes from a database, however, can be a bit more challenging, because it means that you must somehow store the data in the database so that the tests can access it.

One possible solution, of course, is to pre-populate the database tables with test data directly. But, as simple and obvious as that solution might appear at first glance, it assumes that you have a source from which you can pre-populate the database. You could do it by hand, but then you'll find that any modifications your program makes to the database—creating, updating and deleting rows—either will stay in effect for the next test or will need to be reloaded from scratch from another source.

In other words, you need a way to put the test database into a known state before you begin your tests. If you know this beginning state, you can write tests that check subsequent states.

The question is, how do you create that initial state? From the time that Rails was first released, the answer was fixtures—text files containing YAML-formatted hand-crafted data. Fixtures are nice, but as a number of Rails developers have written over the years, they can be hard to write, hard to keep track of and generally brittle.

This month, I take a look at the current state of loading data into a test database. I start by examining fixtures, exploring some ways you still might be able to make them useful inside your tests. Then, I cover a newer approach to test data, known as factories, looking at the Factory Girl gem and then taking a quick peek at the Machinist gem, both of which are in widespread use among Rails developers and might be a better fit than plain-old fixtures for your project.

Creating Your Application

Fixtures, as I mentioned above, are YAML files containing data that can be loaded into a database. Rails actually allows you to put your fixture data in formats other than YAML, such as CSV. However, my guess is that CSV is mostly unused, and that YAML is the format used by almost everyone working with fixtures.

I created a simple Rails application (using SQLite) on my computer with:

rails --database=sqlite3 appointments

Then, I generated a RESTful resource for people:

./script/generate scaffold person \
      first_name:string last_name:string email:string

This not only created a model for working with people, but also a controller for handling the basic RESTful functions, views for all of those controller actions, a database migration that uses Ruby to describe my model and even some rudimentary tests. I can import the database migrations with:

rake db:migrate

And, voilà! I now have a working application that allows me to add, delete, modify and list a bunch of people. You might have noticed that I named my Rails application appointments. My plan is to create a very simple appointment calendar, so that I can keep track of with whom I'll be meeting. So, I create another resource, named meetings:

./script/generate scaffold meeting \
      starting_at:timestamp ending_at:timestamp location:text

(It should go without saying that if I were creating this for real, I would not store the location as a text field, but rather as an ID pointing to another table of locations. Keeping data in such normalized form, so that the text appears in a single place and is referred to from elsewhere in the database using foreign keys, makes the application more robust, as well as more efficient.)

Finally, I create a third table, meeting_person, which allows one or more people to have a meeting. If I were willing to restrict appointments to a single participant (or two participants, if I include the person using this software), I simply could have a person_id field in the meeting table. To get this, I create a new model:

./script/generate model meeting_person \
      person_id:integer meeting_id:integer

Now that the three models are in place, I can add associations—those declarations in the model classes that link them to one another. While I'm editing the model, I also will add some validations, which ensure that the data fits my standards. The final version of the models is shown in Listing 1. Perhaps the only particularly interesting part of the models is the custom validation that I placed in the Meeting model:

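def validate
  # Skip the comparison when either time is missing; validates_presence_of
  # reports that case, and nil has no > method.
  if starting_at && ending_at && starting_at > ending_at
    errors.add_to_base("Starting time is later than ending time!")
  end
end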

Listing 1. Model Files, with Associations and Validations


class Person < ActiveRecord::Base
  has_many :meeting_people
  has_many :meetings, :through => :meeting_people

  validates_presence_of :first_name, :last_name, :email
  validates_uniqueness_of :email

  def fullname
    "#{first_name} #{last_name}"
  end

end


class Meeting < ActiveRecord::Base
  has_many :meeting_people
  has_many :people, :through => :meeting_people

  validates_presence_of :starting_at, :ending_at, :location

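  def validate
    # Skip the comparison when either time is missing; validates_presence_of
    # reports that case, and nil has no > method.
    if starting_at && ending_at && starting_at > ending_at
      errors.add_to_base("Starting time is later than ending time!")
    end

    if self.people.empty?
      errors.add_to_base("You must meet with at least one person!")
    end
  end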

  def people_as_sentence
    return self.people.map { |p| p.fullname}.to_sentence
  end

end

class MeetingPerson < ActiveRecord::Base
  belongs_to :person
  belongs_to :meeting

end

Listing 2. views/meetings/new.html.erb, Modified from the Default Scaffold to Allow the User to Enter One or More People


<h1>New meeting</h1>

<% form_for(@meeting) do |f| %>
 <%= f.error_messages %>

 <p>
  <%= f.label :starting_at %><br />
  <%= f.datetime_select :starting_at %>
 </p>
 <p>
  <%= f.label :ending_at %><br />
  <%= f.datetime_select :ending_at %>
 </p>
 <p>
  <%= f.label :location %><br />
  <%= f.text_area :location %>
 </p>

 <p>With:
   <%= select("person",
              "person_id",
              Person.all.collect { |p| [p.fullname, p.id] },
              {},
              {:multiple => true}) %>
 </p>
 <p>
  <%= f.submit 'Create' %>
 </p>

<% end %>

<%= link_to 'Back', meetings_path %>

I also created a convenience method that returns the names of the people with whom the appointment will be, joined into a single sentence-style string:

def people_as_sentence
  return self.people.map {|p| p.fullname}.to_sentence
end

The custom validation in the Meeting model, which is run whenever I try to save an instance of Meeting, checks to make sure that the starting time is not later than the ending time. If the check fails, the validation fails, and the data is not stored. (The fact that I can treat times as full-fledged objects, with access to the > and < operators, is one of my favorite parts of both Ruby and SQL.)
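The comparison at the heart of that validation can be tried outside Rails. What follows is a minimal sketch in plain Ruby, not the way Rails itself runs validations. The MeetingTimes struct and its error_messages method are hypothetical stand-ins for the ActiveRecord model, introduced only to show Time objects being compared directly:

```ruby
require 'time'

# Hypothetical stand-in for the Meeting model; not an ActiveRecord class.
MeetingTimes = Struct.new(:starting_at, :ending_at) do
  # Returns a list of error messages, empty when the times are consistent.
  def error_messages
    errors = []
    # Time objects are comparable, so > works just as it does for numbers.
    if starting_at && ending_at && starting_at > ending_at
      errors << "Starting time is later than ending time!"
    end
    errors
  end
end

good = MeetingTimes.new(Time.parse("2009-05-10 00:48:12"),
                        Time.parse("2009-05-10 01:48:12"))
# Swap the two times to get a meeting that ends before it starts.
bad  = MeetingTimes.new(good.ending_at, good.starting_at)

puts good.error_messages.inspect
puts bad.error_messages.inspect
```

The first meeting produces no messages, and the second produces the same message the model would add to its errors.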

Finally, I'm going to enhance this application by modifying the existing scaffolded controller actions to be more useful. First, I modify the new and create actions, such that they will allow someone to create an appointment, simultaneously indicating the person or people with whom the appointment will take place. Then, I modify the index action, so that the user will get a list of all upcoming appointments.

Fixtures

Now that I've created a simple application, the time has come to test it. As I wrote above, testing the application requires that I have some sample data with which to test it. By default, the generators for Rails models create basic fixtures, which have long been the standard way to import data into Rails tests. By basic, I mean that they contain some very, very basic data—too basic, actually, for any real testing I might want to do. For example, here is the automatically generated fixture for people:

one:
  first_name: MyString
  last_name: MyString
  email: MyString

two:
  first_name: MyString
  last_name: MyString
  email: MyString

Even if you are new to reading YAML, let alone fixture files, the format should be easy enough to understand. YAML consists of name-value pairs within a hierarchy, and indentation indicates where in the hierarchy a particular name-value pair exists. (In Rails fixtures, you also can associate a list of values with a key, by separating the values with commas.) Thus, there are two people defined in the fixture, one and two, and each has three name-value pairs.
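If you want to see that hierarchy for yourself, Ruby's standard yaml library will parse the fixture into nested hashes. This is only a small illustration; the fixture is inlined as a string (the fixture_text variable is my own name for it) rather than read from test/fixtures/people.yml:

```ruby
require 'yaml'

# The generated people fixture, inlined here as a string for illustration.
fixture_text = <<~FIXTURE
  one:
    first_name: MyString
    last_name: MyString
    email: MyString

  two:
    first_name: MyString
    last_name: MyString
    email: MyString
FIXTURE

# Each top-level label becomes a key; its indented pairs become a nested hash.
people = YAML.load(fixture_text)

puts people.keys.inspect
puts people["one"]["email"]
```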

However, these name-value pairs are close to useless. They might contain valid data, or they might contain data that fails to adhere to the standards laid out in my model validations. Rails inserts fixtures directly into the database without running those validations, so invalid data can slip into the test database unnoticed. Here, for example, both people share the e-mail address MyString, which validates_uniqueness_of :email never would allow. And if the database itself enforced such a rule, say with a unique index on the email column, the tests would fail right away, before they even ran: the database would reject the fixtures as being invalid, and I'd be left scratching my head.
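One way to notice such problems early is to check the fixture data against the same rules the model enforces. The following is a rough sketch in plain Ruby, under the assumption that the fixture has already been parsed into a hash. The fixture_problems helper is hypothetical; it mimics the presence and uniqueness rules of the Person model rather than calling Rails:

```ruby
# Fixture rows as they would look after parsing people.yml (inlined here).
people = {
  "one" => { "first_name" => "MyString", "last_name" => "MyString", "email" => "MyString" },
  "two" => { "first_name" => "MyString", "last_name" => "MyString", "email" => "MyString" }
}

# Hypothetical helper: reports rows that would break the Person model's
# presence and uniqueness rules. It mimics the rules; it does not call Rails.
def fixture_problems(rows)
  problems = []

  # Presence: every required field must have a non-blank value.
  rows.each do |label, attrs|
    %w[first_name last_name email].each do |field|
      problems << "#{label}: #{field} is blank" if attrs[field].to_s.strip.empty?
    end
  end

  # Uniqueness: group the rows by e-mail address and report any duplicates.
  rows.group_by { |_label, attrs| attrs["email"] }.each do |email, group|
    next if group.size < 2
    labels = group.map { |label, _attrs| label }.join(", ")
    problems << "email #{email} is shared by #{labels}"
  end

  problems
end

puts fixture_problems(people)
```

Run against the generated fixture, the helper reports that one and two share an e-mail address.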

Things get even hairier when you start to make fixtures that depend on associations. I obviously want my meeting_people fixtures to point to valid people and meetings, but using the numeric IDs can get confusing very quickly. Fortunately, recent versions of Rails allow me to name the fixture to which an object is associated, rather than its numeric ID. Thus, although the default fixture file for meeting_people is this:

one:
  person_id: 1
  meeting_id: 1

two:
  person_id: 1
  meeting_id: 1

instead, I can say this:

one:
  person: one
  meeting: one

two:
  person: two
  meeting: two

Obviously, you would want to choose more descriptive names for your fixtures. But, I now have indicated that meeting #1 is with person #1, and meeting #2 is with person #2. This is obviously more descriptive than the simple numbers would be.
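For label references to work, each label has to turn into the same numeric ID wherever it appears. The sketch below illustrates that idea in plain Ruby. The label_to_id helper, and its use of Zlib.crc32, are assumptions chosen for illustration; the scheme Rails actually uses depends on the version:

```ruby
require 'zlib'

# Hypothetical helper: derive a stable numeric ID from a fixture label.
# crc32 is used only because it gives the same number on every run.
def label_to_id(label)
  Zlib.crc32(label.to_s)
end

# A meeting_people fixture that refers to other fixtures by label.
meeting_people = {
  "one" => { "person" => "one", "meeting" => "one" },
  "two" => { "person" => "two", "meeting" => "two" }
}

# Expand each label reference into the foreign-key column the table needs.
rows = meeting_people.map do |label, attrs|
  {
    "id"         => label_to_id(label),
    "person_id"  => label_to_id(attrs["person"]),
    "meeting_id" => label_to_id(attrs["meeting"])
  }
end

rows.each { |row| puts row.inspect }
```

Because the ID is computed from the label alone, the people fixture and the meeting_people fixture agree on which row is meant without either file spelling out a number.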

You can even do one better than this, because fixtures understand the has_many :through associations that I defined in the models. Just as in the Ruby code, I can add a person to a meeting with:


meeting.people << a_person

I can put the same sorts of information in the fixture file. For example:

one:
  starting_at: 2009-05-10 00:48:12
  ending_at: 2009-05-10 01:48:12
  location: MyText
  people: one, two
two:
  starting_at: 2009-05-10 00:48:12
  ending_at: 2009-05-10 01:48:12
  location: MyText
  people: two

If you do things this way, you don't want to define things in both the meeting_people fixture and in the meetings fixture. Otherwise, you might be in for some very strange errors. Note that fixture files are ERb (embedded Ruby) files, so you can have dynamically generated entries, such as:


one:
  starting_at: <%= 5.minutes.ago %>
  ending_at: <%= Time.now %>
  location: MyText
  people: one, two
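The two steps involved, running the embedded Ruby and then parsing the resulting YAML, can be sketched with the standard erb and yaml libraries. This is a simplified illustration, not the loader Rails uses. Plain Time arithmetic stands in for 5.minutes.ago, which needs ActiveSupport, and the now variable and the quoting of the timestamps are my own choices for the sake of a repeatable example:

```ruby
require 'erb'
require 'yaml'
require 'time'

# A fixture template with embedded Ruby. The timestamps are quoted so that
# YAML hands them back as plain strings.
template = <<~FIXTURE
  one:
    starting_at: "<%= (now - 5 * 60).strftime('%Y-%m-%d %H:%M:%S UTC') %>"
    ending_at: "<%= now.strftime('%Y-%m-%d %H:%M:%S UTC') %>"
    location: MyText
FIXTURE

# Capture the current time once, so that both entries are computed from it.
now = Time.now.utc

# Step 1: run the embedded Ruby. Step 2: parse the resulting YAML.
rendered = ERB.new(template).result(binding)
meetings = YAML.load(rendered)

starting_at = Time.parse(meetings["one"]["starting_at"])
ending_at   = Time.parse(meetings["one"]["ending_at"])

puts rendered
puts ending_at - starting_at   # the gap between the two times, in seconds
```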

Now, how do you use these fixtures in your tests? It's actually pretty straightforward. You need to load the fixtures you want with the fixtures method:

fixtures :meetings

By default, all fixtures are imported, thanks to:

fixtures :all

in test/test_helper.rb, which is imported automatically into all tests. Then, in your test, you can say something like this:

get :edit, :id => people(:one).id

This example (of a functional test) will load the person object identified as one in people.yml, invoking the edit method and passing it the ID of the appropriate fixture.

Factory Girl

For a small site, or when you can keep everything in your head, fixtures are just fine. I've certainly used them over the years, and I've found them to be an invaluable part of my testing strategy. But, factories are an alternative to fixtures that has become increasingly popular, both because they're written in Ruby code and because they allow you to do all sorts of things that are difficult or impossible with YAML fixtures.

Factory Girl is one of the best-known factory libraries, written and distributed by the Thoughtbot company, and it is available as a Ruby gem. After installing Factory Girl on your system and bringing it into your application's environment with:

config.gem "thoughtbot-factory_girl",
             :lib    => "factory_girl",
             :source => "https://gems.github.com"

in config/environment.rb, you will be able to use it. Basically, Factory Girl allows you to create objects in Ruby, rather than load them from fixture files. No defaults are created for you by the generator, but that's not a big deal, given how easy it is to use Factory Girl to create test objects.

Above, I showed how in a test environment using fixtures, you can grab the person object with a name of one by using the people method, and then passing a symbol:

get :edit, :id => people(:one).id

people(:one) is a full-fledged ActiveRecord object, with everything you might expect from such an object. Factory Girl works in a different way. First, you need to create a test/factories.rb file, in which your factories are defined. (You also may create a test/factories/ directory, the contents of which will be Ruby files defining factories.)

To create a factory for people (that is, in place of people.yml), insert people.rb inside test/factories:

Factory.define :person do |p|
  p.first_name 'Reuven'
  p.last_name  'Lerner'
  p.email 'reuven@lerner.co.il'
end

Now, inside the tests, you can say:

get :edit, :id => Factory.create(:person).id

or:

person = Factory.create(:person)
get :edit, :id => person.id

(Factory.create saves the object to the database, so it has an ID. Factory.build returns an unsaved instance, which is useful when the test doesn't need a database row.)

At first glance, this doesn't seem too exciting. After all, you could have done roughly the same thing with your fixture, right? But factories allow you to override the defaults:

person = Factory.create(:person, :first_name => 'Foobar')
get :edit, :id => person.id
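The idea behind overriding defaults is simple enough to sketch in a few lines of plain Ruby. This is not how Factory Girl is implemented; PersonRecord, PERSON_DEFAULTS and build_person are hypothetical names that I made up to show default attributes being merged with whatever the caller passes in:

```ruby
# Hypothetical stand-in for the Person model; a Struct keeps it self-contained.
PersonRecord = Struct.new(:first_name, :last_name, :email)

# Default attributes, playing the role of the Factory.define block.
PERSON_DEFAULTS = {
  :first_name => 'Reuven',
  :last_name  => 'Lerner',
  :email      => 'reuven@lerner.co.il'
}.freeze

# Hypothetical helper: build a person from the defaults, letting the
# caller override any attribute.
def build_person(overrides = {})
  attrs = PERSON_DEFAULTS.merge(overrides)
  PersonRecord.new(attrs[:first_name], attrs[:last_name], attrs[:email])
end

default_person = build_person
foobar = build_person(:first_name => 'Foobar')

puts default_person.first_name
puts foobar.first_name
puts foobar.email
```

Only the attribute named in the call changes; everything else comes from the defaults, which is what keeps each test focused on the one value it cares about.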

But wait, there's more. You can set associations as follows:

Factory.define :person do |p|
  p.first_name 'Reuven'
  p.last_name  'Lerner'
  p.email 'reuven@lerner.co.il'
  # meetings is a has_many association, so it expects a collection
  p.meetings {|meetings| [meetings.association(:meeting)]}
end

In other words, if you have created a meeting factory, you can incorporate it into your person factory, taking advantage of the association, using a fairly natural syntax.

An even more interesting idea is that of sequences. If your application needs to create a large number of test people, you might want each of those people to have a unique e-mail address. (Never mind that the e-mail never will be sent.) You can do this with a sequence:

Factory.define :person do |p|
  p.first_name 'Reuven'
  p.last_name  'Lerner'
  p.sequence(:email) {|n| "person#{n}@example.com" }
end

The first person created with this factory will have an e-mail address of person1@example.com; the second will be person2@example.com and so forth.
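Under the hood, a sequence is little more than a counter paired with a block. Here is a minimal sketch of that idea in plain Ruby; make_sequence is a hypothetical helper of my own, not part of Factory Girl:

```ruby
# Hypothetical helper: returns a lambda that passes 1, 2, 3 and so on
# to the given block, one value per call.
def make_sequence(&block)
  counter = 0
  lambda do
    counter += 1
    block.call(counter)
  end
end

next_email = make_sequence { |n| "person#{n}@example.com" }

puts next_email.call
puts next_email.call
```

Each call advances the counter, so no two generated addresses collide, which is exactly what a uniqueness validation on email needs.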

As you can see, Factory Girl is as easy to use as YAML fixtures, but it offers a great many capabilities that come in handy when testing Rails applications.

Machinist

Factory Girl is a terrific library for factories, and it has become quite popular since it was first released. But, not everyone liked its basic syntax, and one of those people was Pete Yandell, who decided that although the basic idea behind factories was sound, he wanted to use a different (and more compact) syntax for his factories. Thus was born Machinist, which uses a Sham object to describe fields in an object, which are then assembled into blueprints for specific objects. For example:

require 'faker'

# Define the fields that we will need
Sham.first_name  { Faker::Name.first_name }
Sham.last_name  { Faker::Name.last_name }
Sham.email { Faker::Internet.email }

# Now use these field definitions to create a blueprint
Person.blueprint do
  first_name
  last_name
  email
end

Now you can use these blueprints to create test objects. For example:

person = Person.make()

As with Factory Girl, you also can override the defaults:

person = Person.make(:email => 'foo@example.com')

Conclusion

Fixtures have been a part of Rails testing practices since the beginning, and they still can be quite useful. But, if you're finding yourself frustrated by YAML files, or if you want to experiment with something that offers more flexibility and features, you might well want to try looking into factories. This month, I looked at two different libraries for creating Rails factories, both of which are in popular use and might be a good fit for your project.

Resources

The home page for Ruby on Rails is www.rubyonrails.com. Information about testing, including the use of fixtures, is in one of the excellent, community-written Rails guides at guides.rubyonrails.org/testing.html.

If you are interested in learning more about factories, a good starting point (as is often the case) is the Railscast site, with weekly screencasts by Ryan Bates. The Railscast that compares factories with fixtures is at railscasts.com/episodes/158-factories-not-fixtures.

Finally, the home page for Factory Girl is at dev.thoughtbot.com/factory_girl, and the home page for Machinist is at github.com/notahat/machinist/tree/master.

Reuven M. Lerner, a longtime Web/database developer and consultant, is a PhD candidate in learning sciences at Northwestern University, studying on-line learning communities. He recently returned (with his wife and three children) to their home in Modi'in, Israel, after four years in the Chicago area.
