2010/12/26

Simple database migrations for chicken

Since i'm back to chicken, i had the opportunity to have some time for hacking on it.
I started to port, and hopefully finish up, the website for lisp-unleashed.de. Of course i couldn't stand it and decided to implement the stuff i need with chicken.

There's work to do

In the very early stage i realized that chicken was missing a feature that i learned to love, when i worked with other webstacks like Ruby on Rails. They all provide some way to control your database-schema. Although you can easily just store plain SQL files and attach versions to them, it quickly gets messy. You'd have to store different SQL files for the up and down migration and the rollback or rollout to a specific version is cumbersome.

Me and a sunday afternoon

I didn't plan to implement this nitty gritty kind of abstraction over migrations as the rails guys did. I took a way simpler, but in my opinion also cleaner, route. There are some core-observations that i made, that drove the overall architecture.


  1. It's not a good idea to hide the fact that we're working with SQL
  2. A migration may consist of more than just a bunch of SQL-statements
  3. A migration is a path from a starting version to a requested version, where each version between is one step closer to the desired state

Don't hide SQL

Ok, so SQL is not a bad thing, why should we pretend that we aren't using it? While i do see the general benefit of layering abstractions over SQL to achieve a certain degree of "database independence", i do also think that it must not be overused. And migrations are a place where i don't see a good reason for abstracting away the SQL. So my migrations hold SQL-Fragments. A typical migration file would look like this:



It's as simple as this. As you see there are two sections inside a migration file. One section for the rollout and one section for the rollback. The SQL-Fragments of the appropriate section get executed during up and down migration.

Beyond SQL

Sometimes a migration may involve more than firing a bunch of simple SQL-statements. Maybe some calculations need to be done upfront or you need to gather some information about your environment before you're able to actually create the next schema version.
In either case i wanted the developer to have as much liberty as she can get.
So lambda the ultimate to the rescue. In fact the migration files are read and evaluated inside a quasiquote. If the library recognizes that the body of a migration holds a procedure, it invokes it and passes the database-adapter as its sole argument.



Which route do we go?


When i played around with the library i did what i always try to do, when i'm implenting something. I'm trying to develop a metaphor and try to find matching wordings within this metaphor. For migrations i realized that a migration can be seen as a route or a track if you like. The starting point is a specific version, as well as the target point. Each version between are what i call checkpoints. In order to get from one version to another you have to traverse each checkpoint until you reach the final checkpoint. Each checkpoint brings you closer to the desired state.

I don't really know if it is the best way to see it. But it helped me a lot to reason about what i was going to implement.


The nomads in us(e)


Now that i had the overall architecture, i was able to implement what i needed. This was a lot of fun actually.
Peter Bex from #chicken came up with the cool name nomads for the library.
And having a name hacking became even more fun. I implemented some more stuff which i found useful. Some of the features i'm going to outline here, for the entire functionality check out the nomads documentation


The versioning scheme

I guess most of you have worked in teams on the same project, where you quickly realized that a simple versioning scheme like incrementing a numeric prefix, isn't the best choice, as it may lead to concurrency issues when two developers generate migrations at the same time.
Though the revision control system will catch this, it might be better to use a different versioning scheme. The rails guys have chosen to implement a time stamp based scheme which works better in those cases. Nomads is designed with this problem in mind an allows the developer to implement whatever versioning scheme suites best. You can easily parameterize the appropriate configuration to implement the algorithm. See the nomads documentation, where i showed how to implement a time stamp based algorithm


Irreversible migrations


Another common feature are migrations, that aren't reversible. Of course the library allows to express this situation. The migration file may simply hold #f to be considered irreversible. The developer can also add an optional message for irreversible migrations.



The migration tool


Now we can put it all together and create a simple program that can be used to manage the migrations. Here is a simple version that works like RoRs db:migrate rake task. It uses a custom callback that is invoked on each check point, which enables us to customize the way we report migrations. We're using Alex Shin's very cool fmt-egg to do so.



All you need to do is adjust the path to your database and compile the program like this:
$ csc migrate.scm

Now you can use it like this:
$ VERSION=3 migrate

This was it! Check out chicken and nomads if you like. Also have a look at all the other great eggs. Or better yet, provide some more.

No comments:

Post a Comment