Wednesday, January 27, 2010

What to do when ActiveRecord thinks an Oracle key is a decimal

I recently created a model for an existing database table using the legacy_data gem and was confused when my primary key showed up in scientific notation. It turned out the issue was due to sloppiness in the table definition and could be easily fixed once I understood what ActiveRecord was doing.

I created a Person model connected to the people table


class Person < ActiveRecord::Base
end


but when I went into script/console the primary key showed up as a BigDecimal when I expected an integer.


$ script/console
Loading development environment (Rails 2.3.4)
>> Person.first.id
=> #


This wasn't what I wanted and would cause problems in my app when it tried to build a url with that id like http://localhost:3000/people/10024844425.0. The rails routing engine would see the .0, treat it as a format (like .xml or .json) and get confused. Let's look at why this is happening.



>> Person.columns_hash['id']
=> #
>> Person.columns_hash['id'].type
=> :decimal
>> Person.columns_hash['id'].sql_type
=> "NUMBER"


We see that ActiveRecord is treating this column as a :decimal because it's sql_type is NUMBER. It turns out this is correct because an Oracle number is a decimal unless you specify it to have 0 digits after the decimal point (scale of 0). Here's the documentation from Oracle (the last sentence is my bold)


NUMBER Datatype

The NUMBER datatype stores fixed and floating-point numbers. Numbers of virtually any magnitude can be stored and are guaranteed portable among different systems operating Oracle, up to 38 digits of precision.

The following numbers can be stored in a NUMBER column:

Positive numbers in the range 1 x 10-130 to 9.99...9 x 10125 with up to 38 significant digits

Negative numbers from -1 x 10-130 to 9.99...99 x 10125 with up to 38 significant digits

Zero

Positive and negative infinity (generated only by importing from an Oracle Version 5 database)

For numeric columns, you can specify the column as:

column_name NUMBER

Optionally, you can also specify a precision (total number of digits) and scale (number of digits to the right of the decimal point):

column_name NUMBER (precision, scale)

If a precision is not specified, the column stores values as given. If no scale is specified, the scale is zero.

Oracle guarantees portability of numbers with a precision equal to or less than 38 digits. You can specify a scale and no precision:

column_name NUMBER (*, scale)

In this case, the precision is 38, and the specified scale is maintained.

When you specify numeric fields, it is a good idea to specify the precision and scale. This provides extra integrity checking on input.


Let's look in my database and sure enough the ID is a number


$ sqlplus myusername/mypassword@localhost:1521/mydatabase.world

SQL*Plus: Release 10.2.0.4.0 - Production on Wed Jan 27 09:15:09 2010

Copyright (c) 1982, 2007, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Release 10.2.0.4.0 - Production

SQL> desc people;
Name Null? Type
----------------------------------------- -------- ----------------------------
ID NUMBER
NAME VARCHAR2(10)



If you are allowed to change your database you can create a migration like


$ script/generate migration change_person_id_to_integer
STUBBING MckinseyLDAP
exists db/migrate
create db/migrate/20100127145747_change_person_id_to_integer.rb


now edit the migration


class ChangePersonIdToInteger < ActiveRecord::Migration
def self.up
change_column(:people, :id, :integer)
end
def self.down
change_column(:people, :id, :decimal)
end
end


In my case there were other applications using this table and I was not allowed to change it so I implemented a fix in Ruby to tell my model to treat this column as an integer even though it was defined as a decimal in the database.


#config/initializers/legacy_data_type_coercion.rb
module LegacyDataTypeCoercion
def set_integer_columns *col_names
col_names.each do |col_name|
columns_hash[col_name.to_s].instance_eval do
@type = :integer
end
end
end
end
ActiveRecord::Base.extend(LegacyDataTypeCoercion)


#app/models/person.rb
class Person < ActiveRecord::Base
set_integer_columns :id
end


We defined a method set_integer_columns that will force ActiveRecord to treat the columns we specify as integers. In our Person model we declare :id is an integer column. Let's test it out!


$ script/console
Loading development environment (Rails 2.3.4)
>> Person.first.id
=> 10024844425
>> Person.columns_hash['id'].type
=> :integer


Just as expected id is now an integer and we can go ahead building the rest of our application.

This is not an issue with all Oracle tables as if the column was defined as NUMBER(10) (with a precision and implicit scale of 0) then ActiveRecord will interpret it as an integer automatically based on the parentheses in the data type - i.e NUMBER(10) ActiveRecord or Oracle Enhanced).

Wednesday, January 13, 2010

Freezing a gem that has native extensions

I like to freeze all the gems I use as we run in a shared hosting environment and need to our apps isolated from each other. Deployments are also handled by an operational team that does not intimately understand our applications so keeping our deployments to a single capistrano command cap deploy:migrations has been a big win for us. Freezing most gems is pretty straightforward and has been built in since Rails 2.1. When dealing with a gem that requires native extensions to be built there's only one additional step to add to your Capfil.

Let's say we want to localize hpricot which does include native C extensions.

First tell Rails about your gem by adding a config.gem line to your environment.rb

Rails::Initializer.run do |config|
...
config.gem 'hpricot'
...
end


Now we can ask rails about its configured gems


$ rake gems
(in /Users/alexrothenberg/ruby/my_project)
- [I] hpricot

I = Installed
F = Frozen
R = Framework (loaded before rails starts)


The 'I' means its hpricot is installed on my system but not frozen in the application. If you see '[]' instead you need to run sudo gem install hpricot (add '--source http://gemcutter.org' if necessary). At this point you could write some code to use hpricot and your application will work. But if hpricot (or the version you're expecting) is not installed on your production server you'll be in trouble.

To freeze the gem into your vendor directory run rake gems:unpack (optionally you can add 'GEM=hpricot' if you just want to unpack one gem).


$ rake gems:unpack
(in /Users/alexrothenberg/ruby/my_project)
Unpacked gem: '/Users/alexrothenberg/ruby/my_project/vendor/gems/hpricot-0.8.2'


We can ask rails again...


$ rake gems
(in /Users/alexrothenberg/ruby/my_project)
The following gems have native components that need to be built
hpricot

You're running:
ruby 1.8.6.287 at /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby
rubygems 1.3.2 at /Users/alexrothenberg/.gem/ruby/1.8, /Library/Ruby/Gems/1.8, /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/gems/1.8

Run `rake gems:build` to build the unbuilt gems.


Oops our vendored gem is missing hasn't built the native extensions. Not to worry the message tells us what to do and we run rake gems:build


$ rake gems:build
(in /Users/alexrothenberg/ruby/my_project)
Built gem: '/Users/alexrothenberg/ruby/mars-admin/vendor/gems/hpricot-0.8.2'
alex-rothenbergs:mars-admin alexrothenberg$ rake gems
(in /Users/alexrothenberg/ruby/my_project)
- [F] hpricot

I = Installed
F = Frozen
R = Framework (loaded before rails starts)


We can ask rails again to see that the gem is now frozen and also look in our vendor folder


$ rake gems
(in /Users/alexrothenberg/ruby/my_project)
- [F] hpricot

I = Installed
F = Frozen
R = Framework (loaded before rails starts)

$ ls vendor/gems/hpricot-0.8.2/
total 72
-rw-r--r-- 1 alexrothenberg staff 4672 Jan 13 12:33 CHANGELOG
-rw-r--r-- 1 alexrothenberg staff 1048 Jan 13 12:33 COPYING
-rw-r--r-- 1 alexrothenberg staff 9216 Jan 13 12:33 README
-rw-r--r-- 1 alexrothenberg staff 8242 Jan 13 12:33 Rakefile
drwxr-xr-x 4 alexrothenberg staff 136 Jan 13 12:33 ext/
drwxr-xr-x 3 alexrothenberg staff 102 Jan 13 12:33 extras/
drwxr-xr-x 6 alexrothenberg staff 204 Jan 13 12:39 lib/
drwxr-xr-x 11 alexrothenberg staff 374 Jan 13 12:33 test/


Everything looks good and you can check this into git and now have a frozen version of the hpricot gem stored with your application.
But if we stop here, when we deploy to our production server we'd be using the native extensions we built on your laptop which may not work on the server if you have one is 32bit and the other 64bit or you have different OS libraries installed or any number of other reasons.

To be safe, we need to rebuild the native extensions on the server when we deploy. This is not as hard as it sounds as rails gave us the rake task rake gems:build. We can ask capistrano to run that command on the server by adding the following to your Capfile.


after "deploy:finalize_update" do
# build the native extensions for hpricot gem
run "cd #{release_path} && #{rake} RAILS_ENV=#{rails_env} gems:build GEM=hpricot"
end


Now when capistrano deploys in with all the other messages you'll see something like


...
* executing "cd /opt/apps/my_project/releases/20100108185109 && rake RAILS_ENV=production gems:build"
servers: ["your.server.com"]
[your.server.com] executing command
** [out :: your.server.com] (in /opt/apps/my_project/releases/20100108185109)
** [out :: your.server.com] Built gem: '/opt/apps/my_project/releases/20100108185109/vendor/gems/hpricot-0.8.2'
command finished
...


So rails give us a few simple patterns to follow to freeze our gems in the vendor folder and with a few lines in you Capfile you can use this pattern to vendor a gem with native extensions.

Thursday, November 5, 2009

Generate Models from Tables - Legacy Data Gem

Today I'd like to announce the release of a gem I've been working on Legacy Data

Getting started on a Rails project with a large existing database can be daunting. How to you extract all the information that's
encoded in the database? Do you have to understand the entire data model before you get started? The models_from_tables generator
in the legacy_data gem can help! This generator looks into your existing database and generates ActiveRecord models based on the
information encoded in it.

How to use it



  • To generate an ActiveRecord model for each table in the database just type
    script/generate models_from_tables

  • If you don't want all tables in the database tell it which table to model
    script/generate models_from_tables --table-name comments
    This uses any foreign_key constraints in the database to spider the database and model the comments table and all associated tables.

  • If you really only want the comments table tell it not to follow any foreign_keys
    script/generate models_from_tables --table-name comments --skip-associated

  • If you use factory girl (and everyone should) it will generate a simple factory for each model it generates
    script/generate models_from_tables --table-name comments --with-factories


(You do need to install the plugin gem install legacy_data as long as http://gemcutter.org is one of your gem sources)

Examples


Several examples come with the gem source in the examples folder on github. These include

  • A simple blog database tested with MySQL and Sqlite3

  • The Drupal 6.14 database tested with MySQL

  • The J2EE Petstore example tested with MySQL, Sqlite3 and Oracle


What kind of information can it extract from the database?


Associations


If the database contains foreign_key constraints it uses them to build has_many or belongs_to associations
in your ActiveRecord models

Validation constraints


It will generate the following types of validation constraints in your models

  • validates_uniqueness_of - For columns where the database has an index that enforces uniqueness

  • validates_presence_of - When the database column is non-nullable

  • validates_inclusion_of - For non-nullable boolean columns and custom constraints with a SQL rule flag IN ('Y', 'N')

  • validates_numericality_of - For integer columns (nullable and non-nullable)

  • custom validation - For custom SQL validation rules in the database it puts a placeholder in your model with the original SQL for you to translate into Ruby


Non-Rails naming conventions


Since the database is existing it's likely that it doesn't follow Rails naming conventions. Not to worry as the generator will
put the non-standard name into the generated models if it needs to.


What kinds of non-standard names can it generate?

Let's look at a sample output


class Post < ActiveRecord::Base

set_table_name :tbpost
set_primary_key :postid

# Relationships
has_many :comments, :foreign_key => :postid

# Constraints
validates_presence_of :title, :body

end



  • Class Names - It named the model Post instead of the Rails convention Tbpost. The generator could not do this itself but knowing the conventions will often not apply to legacy databases it pauses after spidering the database giving you a chance to override the table to class name mapping. It generates a yaml file app/models/table_mappings.yml where you can verify or change any class name before proceeding to generate the models.

  • Table Names - It overrode the table name since the actual name tbpost does not match the Rails naming convention posts

  • Primary Keys - It overrode the primary key since the actual column postid does not match the Rails naming convention id

  • Foreign Keys - It overrode the foreign key on the comment table to be postid instead of the Rails naming convention id

Friday, October 2, 2009

PL/SQL debugging information now a part of Oracle Enhanced Adapter v1.2.2

Raimonds Simanovskis has just published a version 1.2.2 of the Oracle Enhanced Adapter that includes the ability to capture dbms_output debug statements from your pl/sql code in the Rails log file. This is a bit of code that I wrote and blogged about a few months ago so not only do I think its useful but am very excited to have contributed to something many others use.

Wednesday, September 30, 2009

Using the Whenever gem to manage scheduled cron jobs without installing it on the server

I've been using Javan's Whenever gem to manage scheduled jobs in my project and its fantastic!! There are many existing resources where you can learn more (readme, railscast google group) but I'd like to describe the specific way I'm using it

  • When the gem is not installed on my server

  • How administrators can use Capistrano to both schedule and unschedule your jobs

  • How to use a library such as the Oracle client that requires certain environment variables



At the end of the day we want to have 2 capistrano tasks we can run to have cron call a rake task of ours on the schedule we want.


cap schedule_jobs
cap unschedule_jobs


If we want to get fancy and pass a custom configuration argument that will be passed into the rake task.

cap schedule_jobs SOME_CONFIGURATION=false


When the gem is not installed on my server


Let's start with the Capfile

#Capfile
desc "Schedule the jobs"
task :schedule_job, :roles => :app, :only => { :primary => true } do
some_configuration = ENV['SOME_CONFIGURATION'] || true #default to true
arguments = ["RAILS_ENV=#{rails_env}",
"APP_PATH=#{current_path}",
"SOME_CONFIGURATION=#{some_configuration}"].join(' ')
run "cd #{current_path} && #{rake} whenever:update_crontab #{arguments}"
end

desc "Unschedule the jobs"
task :unschedule_job, :roles => :app, :only => { :primary => true } do
run "cd #{current_path} && #{rake} whenever:update_crontab UNSCHEDULE=true"
end


How does this differ from the example on the whenever site? Since the gem is not installed on the server we cannot call whenever from the command line so invoke the whenever:update_crontab rake task instead and to allow administrators to easily disable the scheduled jobs we define the unschedule_jobs capistrano task. Let's take a look at the whenever:update_crontab Rake task that gets this all done.


#lib/tasks/whenever.rake
namespace :whenever do
desc "updates crontab with our scheduled jobs"
task :update_crontab => :load_whenever_gem do
Whenever::CommandLine.execute({:update=>true, :identifier=>'YOUR_APP_NAME'})
end

task :load_whenever_gem do
begin
gem_dir_root = "#{RAILS_ROOT}/vendor/gems/"
chronic_gem_dir = Dir["#{RAILS_ROOT}/vendor/gems/*"].detect do |subdir|
subdir.gsub(gem_dir_root,"") =~ /^(\w+-)?chronic-(\d+)/ && File.exist?("#{subdir}/lib/chronic.rb")
end
require "#{chronic_gem_dir}/lib/chronic"

whenever_gem_dir = Dir["#{RAILS_ROOT}/vendor/gems/*"].detect do |subdir|
subdir.gsub(gem_dir_root,"") =~ /^(\w+-)?whenever-(\d+)/ && File.exist?("#{subdir}/lib/whenever.rb")
end
require "#{whenever_gem_dir}/lib/whenever"

rescue MissingSourceFile => e
raise "Cannot find Whenever or Chronic : #{e}"
end
end
end


The actual whenever:update_crontab task just does the same the command line does but unless you add config.gem 'whenever' to your environment (which I don't since whenever is not needed by my app at runtime) we also have the other task that loads whenever and chronic from the vendor/gems directory.

At this point we've gotten Capistrano calling a rake task to invoke whenever even though the gem is localized in my application but not installed on the server.

How administrators can use capistrano to both schedule and unschedule your jobs



The whenever gem does not have any support for unscheduling but it will schedule whatever is included in your schedule.rb file so if that file tells it to schedule nothing that's the same as unscheduling. Its easy to do that by wrapping the entire file with unless ENV['UNSCHEDULE'] (remember the UNSCHEDULE parameter we passed in the Capfile?)


#config/schedule.rb
unless ENV['UNSCHEDULE']

module MyApp
module Job
class CronRakeTask < Whenever::Job::Default
def output
path_required
"cd #{@path} && /usr/bin/env /usr/local/bin/cron_rake #{task} SOME_CONFIGURATION=#{ENV['SOME_CONFIGURATION']} RAILS_ENV=#{@environment}"
end
end
end
end

set :path, ENV['APP_PATH'] || RAILS_ROOT
set :environment, RAILS_ENV || 'production'
every 1.day, :at => '10:00pm' do
command 'do_something', :class => MyApp::Job::CronRakeTask,
:environment => @environment,
:path => @path
end
end


What's going on with the weird CronRakeTask? This brings us to the final point

How to use a library such as the Oracle client that requires certain environment variables



Cron loads its environment settings differently than an interactive shell and typically does not have all the environment variables you may have in your profile file. You could solve this by adding them to the top of your crontab file but I prefer to leave that file as simple as possible and create a wrapper script to call instead of rake. Basically the CronRakeTask does the same as Whenever's built in RakeTask except it calls /usr/local/bin/cron_rake instead of rake.

The cron_rake file just sets the environment variables I need then calls rake.


#!/bin/bash

export PATH=/usr/local/lib/ruby-enterprise/bin:$PATH
export ORACLE_HOME=/opt/oracle
export LD_LIBRARY_PATH=/opt/oracle:$LD_LIBRARY_PATH

rake $@


As I said in the beginning since I've been using the Whenever gem I no longer need to manually edit my crontab files ever and I can enable my jobs as part of my normal deployment process. Its wonderful and I think everyone should use it!

Wednesday, August 19, 2009

How to capture Oracle's dbms_output in your Rails log file

I have been writing a rails application on top of a large existing Oracle database where each table has 5+ triggers that each call several stored procedures and each of those PL/SQL stored procedures is hundreds of lines long. Often a simple update statement fails with an ORA-xxxx exception coming from deep in the PL/SQL code and it can be tough to figure out what's gone wrong.

The usual way Oracle database folks figure out what's going on is to put print statements in their code. In oracle this looks like
  dbms_output.put_line('hi i hit this line of pl/sql');
When you're using an Oracle editor like TOAD or SQLDeveloper you have to turn output on and then will see anything that's printed.
  set serveroutput on;
This is great if you divide the application between Rails
and Database developers and assume each group can work independently to write perfect code but what about the real world!
Today I want to show you how I monkey patched the Oracle Enhanced Adapter to stick the dbms_output into
the rails log file.
Let's start with an simple example of a simple PL/SQL function that tells you if a string is more than 5 characters long (with some simple debugging print statements).
  CREATE OR REPLACE FUNCTION
MORE_THAN_FIVE_CHARACTERS_LONG (some_text VARCHAR2) RETURN INTEGER
AS
longer_than_five INTEGER;
BEGIN
dbms_output.put_line('before the if -' || some_text || '-');
IF length(some_text) > 5 THEN
dbms_output.put_line('it is longer than 5');
longer_than_five := 1;
ELSE
dbms_output.put_line('it is 5 or shorter');
longer_than_five := 0;
END IF;
dbms_output.put_line('about to return: ' || longer_than_five);
RETURN longer_than_five;
END;
Now we can run the following in TOAD
  set serveroutput on;
select MORE_THAN_FIVE_CHARACTERS_LONG('a long string') from dual;
select MORE_THAN_FIVE_CHARACTERS_LONG('short') from dual;
And we get this output
  MORE_THAN_FIVE_CHARACTERS_LONG('ALONGSTRING') 
---------------------------------------------
1

1 rows selected

before the if -a long string-
it is longer than 5
about to return: 1

MORE_THAN_FIVE_CHARACTERS_LONG('SHORT')
---------------------------------------
0

1 rows selected

before the if -short-
it is 5 or shorter
about to return: 0
To get the same in our Rails app we just need to monkey patch the OracleEnhancedAdapter by copying what's below into your project as config/initializers/oracle_enhanced_adapter.rb

  module ActiveRecord
module ConnectionAdapters
class OracleEnhancedAdapter < AbstractAdapter
DBMS_OUTPUT_BUFFER_SIZE = 10000 #can be 1-1000000
DBMS_LINE_MAX_SIZE = 1000
def enable_dbms_output
@enable_dbms_output = true
execute "BEGIN dbms_output.enable(#{DBMS_OUTPUT_BUFFER_SIZE}); END;"
end
def disable_dbms_output
@enable_dbms_output = false
execute "BEGIN dbms_output.disable(); END;"
end
def dbms_output_enabled?
@enable_dbms_output
end

protected
def log(sql, name)
super(sql, name)
ensure
log_all_dbms_output if dbms_output_enabled?
end

private
def log_next_line_of_dbms_output
dbms_output_text, status = @connection.exec "BEGIN dbms_output.get_line(:return, :status); END;", ' '*DBMS_LINE_MAX_SIZE, 1
got_text = (status == 0)
@logger.debug "DBMS_OUTPUT: #{dbms_output_text}" if got_text
got_text
end

def log_all_dbms_output
while log_next_line_of_dbms_output do
end
end
end
end
end

To use it let's show a quick script/console session
  >> ActiveRecord::Base.connection.enable_dbms_output
=> []
>> ActiveRecord::Base.connection.select_all("select MORE_THAN_FIVE_CHARACTERS_LONG('a long string') from dual")
=> [{"more_than_five_characters_long('alongstring')"=>1}]
And what's in log/development.log

SQL (27.0ms) BEGIN dbms_output.enable(10000); END;
SQL (25.9ms) select MORE_THAN_FIVE_CHARACTERS_LONG('a long string') from dual
DBMS_OUTPUT: before the if -a long string-
DBMS_OUTPUT: it is longer than 5
DBMS_OUTPUT: about to return: 1

This is admittedly a very simple example but I have used this in a real application where I am updating several related ActiveRecord objects and seeing the DBMS_OUTPUT inline with the various SQL update statements has been extremely useful in tracking down a real bug in the PL/SQL procedure that has been in our system for over a year!

I've submitted this as a patch to the Oracle Enhanced Adapter so perhaps it will be included at some point so you wont have to do the monkey patching yourself. I was somewhat surprised not to find anything similar out there so if you know of something please leave a comment.

Wednesday, July 22, 2009

One year of blogging

I just noticed that its been a year since my first post and thought I'd take a moment to reflect on the experience....

I set out with a goal of writing 2-3 posts a month and have managed to write 28 articles over 12 months so I feel pretty good about that. Most of the time I've gotten excited about something I accomplished or learned during the week. The process of writing it up turned out to be as much of a learning experience for myself as the original discovery. As they say "Teaching is the best way to learn"!

I've been most surprised how easy it is to get to the top of a Google Search results list (for very specific searches of course). With my Google Analytics tracking I can see more than 1/2 of my hits coming from search engines and not all of them are me searching for my own name :) Its been great when I've gotten comments from people I don't know (especially when the comments are positive).

Writing this blog has also helped turn my focus outward so I'm not only working on internal projects at my company but feel (in a small way) a part of the broader community.

I hope to continue in the coming year as I'm having a lot of fun so far ...