Investigating how Symbol to_proc works

February 07, 2009
One of the things I love about Ruby is how expressive it is and how with open classes it can be optimized to become even more expressive. Since I started using Ruby I don't think I've written a single for or while loop - something I couldn't have imagined saying with any other language! Of course I do this by using iterators and writing code like
user_names = User.all.collect {|user| user.name }

I recently started discovered I could write the same thing even more concisely (as long as I'm using Rails or Ruby 1.9)
user_names = User.all.collect(&:name)

I decided to investigate how this works.

First I found some good posts by Prag Dave and Ryan Bates and at InfoQ. This helped but I still didn't understand it all so decided to dig further.

First I took a look at how Rails extends Symbol

unless :to_proc.respond_to?(:to_proc)
  class Symbol
    # Turns the symbol into a simple proc, which is especially useful for enumerations. Examples:
    #
    #   # The same as people.collect { |p| p.name }
    #   people.collect(&:name)
    #
    #   # The same as people.select { |p| p.manager? }.collect { |p| p.salary }
    #   people.select(&:manager?).collect(&:salary)
    def to_proc
      Proc.new { |*args| args.shift.__send__(self, *args) }
    end
  end
end


So they defined the to_proc method on symbol and that means the new code will be called when we write &:name because it magically gets transformed into :name.to_proc. I learned something but still needed to learn more to understand how it all works.

Why does the & cause Ruby to call to_proc? I knew that & in the last parameter declaration will pass a provided block as a parameter but this seems to be doing the reverse. Calling a method as an argument but having it interpreted as a block. I tried a couple of experiments in irb

def was_block_given?
  block_given?
end

# As expected
was_block_given? {}
=>  true 
# Passing a proc is not the same as having a block
was_block_given? Proc.new{}
ArgumentError: wrong number of arguments (1 for 0)
  from (irb):195:in 'was_block_given?'
  from (irb):195
  
# Prefixing the proc with an & makes it like a block
was_block_given? &Proc.new{}
=> true

It was not all as I expected but some reading through the PickAxe book led me to a better understanding. I found this paragraph in the Calling A Method section (page 115 in my copy)

If the last argument to a method is preceded by an ampersand, Ruby assumes that it is a Proc object.
It removes it from the parameter list, converts the Proc object into a block, and associates it with the method.


Ok so now I know why when Ruby sees User.all.collect(&:name) it invokes the collect method with name.to_proc as a block. Next, it was time to figure out why the code Rails put in the to_proc method worked. I took a look at the Rubinius implementation Enumerable

def collect
  ary = []
  if block_given?
    each { |o| ary << yield(o) }
  else
    each { |o| ary << o }
  end
  ary
end

Again I decided to experiment with irb to see what each part of the to_proc implementation was doing. First I redefined the Symbol to_proc again with a puts so I could confirm what was going on.

class Symbol
  # Turns the symbol into a simple proc, which is especially useful for enumerations. Examples:
  #
  #   # The same as people.collect { |p| p.name }
  #   people.collect(&:name)
  #
  #   # The same as people.select { |p| p.manager? }.collect { |p| p.salary }
  #   people.select(&:manager?).collect(&:salary)
  def to_proc
    Proc.new do |*args| 
      puts "to_proc args: #{args.inspect}"
      args_shift = args.shift
      puts "to_proc: #{args_shift.inspect}.__send__(#{self.inspect}, *#{args.inspect})"
      result = args_shift.__send__(self, *args) 
      puts "to_proc result: #{result.inspect}"
      result
    end
  end
end

# Make the call and see what happens
[1].collect(&:to_s)
# to_proc args: [1]
# to_proc: 1.__send__(:to_s, *[])
# to_proc result: "1"
# => ["1"]

Its brute force but tells us everything we need to know. As expected collect yields to our proc/block with the element in a variable length argument [1], it extracts the 1 and sends it the to_s method with no arguments returning the string "1". At this point I think I understand how it all works and decide to confirm by running a few more (more complicated) tests in irb


It all works as expected and I decide I know as much as I need to about this and call it a day.

So why did I bother figuring all this out and then writing it up? Mostly because I didn't know how it worked and thought there was some 'magic' going on. I could have continued using this feature without understanding how it worked but now that I understand how it works if some need ever arises for me to do some similar magic I know how to go about it. As for writing it up I hope someone else may read this and find it useful but by I increased my own understanding through the act of writing.