Lambda, procs and ActiveRecord scopes - Part 2

Tags: ActiveRecord, ruby, rails, lambda
Publish Date: 2016-06-19

As mentioned in my previous post, there are many "Rails Programmers" who simply follow the examples without truly understanding the Rails API. So, as a follow-up to my previous post, here is a look at what the scope method does within ActiveRecord.

 

scope method within ActiveRecord(::Scoping::Named::ClassMethods)

Very often, when we use the scope method within ActiveRecord, we only provide two arguments: the name of the scope and a lambda.

Let's dig into the Rails API and see what the scope method does with the arguments you pass into it.

# File activerecord/lib/active_record/scoping/named.rb, line 141
def scope(name, body, &block)
  unless body.respond_to?(:call)
    raise ArgumentError, 'The scope body needs to be callable.'
  end

  if dangerous_class_method?(name)
    raise ArgumentError, "You tried to define a scope named \"#{name}\" " "on the model \"#{self.name}\", but Active Record already defined " "a class method with the same name."
  end

  extension = Module.new(&block) if block

  singleton_class.send(:define_method, name) do |*args|
    scope = all.scoping { body.call(*args) }
    scope = scope.extending(extension) if extension

    scope || all
  end
end

First, it performs some checks (line numbers count the "# File" comment as line 1): in lines 3-5 above, it checks whether the second argument responds to call. So technically, we could also use Proc.new for our second argument... but it's better to stick with lambda (explained in the next post). Then, in lines 7-9, it checks whether the name of your scope clashes with a class method that is already defined by ActiveRecord.
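To make the first check more concrete, here is a small plain-Ruby sketch (no Rails required) of which kinds of objects respond to call. The UppercaseTitle class is made up for this illustration.

```ruby
# Hypothetical class: any object with a call method counts as "callable".
class UppercaseTitle
  def call(str)
    str.upcase
  end
end

callables = {
  lambda: lambda { |str| str.upcase },
  proc:   Proc.new { |str| str.upcase },
  method: "ruby".method(:upcase), # Method objects respond to call as well
  object: UppercaseTitle.new
}

callables.each do |kind, body|
  puts "#{kind}: callable? #{body.respond_to?(:call)}"
end

# A Symbol can be converted to a proc, but it does not respond to call itself.
puts "symbol: callable? #{:upcase.respond_to?(:call)}"
```

Every entry in the hash would pass the respond_to?(:call) check, while a bare Symbol would not.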

Lines 13-15 are where the scope definition happens: it defines a class method named after the first argument. The method accepts any arguments (*args) and forwards them to the lambda, so in effect its parameters are the lambda's block parameters and the lambda's body becomes the method body. The returned value (captured by the local variable scope) is an instance of ActiveRecord::Relation.

If you provide a block to the scope method, the block's body is used to define an anonymous module (line 11), which then extends the ActiveRecord::Relation object in line 15.
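The mechanics described above can be imitated in plain Ruby, without ActiveRecord. The following is only a sketch under assumptions: ToyScoping, ToyPost and the in-memory RECORDS array are made up for illustration, plain Arrays stand in for ActiveRecord::Relation, and the real implementation additionally checks for name clashes and wraps the call in all.scoping.

```ruby
# Hypothetical, simplified imitation of the scope macro (not the Rails code).
module ToyScoping
  def scope(name, body, &block)
    raise ArgumentError, "The scope body needs to be callable." unless body.respond_to?(:call)

    # The optional block becomes the body of an anonymous module.
    extension = Module.new(&block) if block

    # Define a class-level method named after the scope.
    define_singleton_method(name) do |*args|
      result = instance_exec(*args, &body)      # run the lambda with self set to the class
      result = result.extend(extension) if extension
      result || all
    end
  end
end

class ToyPost
  extend ToyScoping

  RECORDS = [
    { title: "Ruby lambdas", published: true },
    { title: "Draft notes",  published: false },
    { title: "Ruby scopes",  published: true }
  ].freeze

  # Stand-in for ActiveRecord's all: returns a fresh Array of records.
  def self.all
    RECORDS.dup
  end

  scope :title_matches, lambda { |str| all.select { |r| r[:title].include?(str) } }

  scope :published, lambda { all.select { |r| r[:published] } } do
    def titles
      map { |r| r[:title] }
    end
  end
end

p ToyPost.title_matches("scopes").size                # => 1
p ToyPost.published.titles                            # => ["Ruby lambdas", "Ruby scopes"]
p ToyPost.title_matches("Ruby").respond_to?(:titles)  # => false
```

Only the result of published carries the extra titles method, because only that scope was given a block.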

In case the above was confusing, let's take the classic BlogPost class as an example.

 

BlogPost < ActiveRecord::Base example

We could define some scopes to filter out posts, so we might have the following:

class BlogPost < ActiveRecord::Base
  scope :title_matches, lambda{ |str| where("title LIKE ?", "%#{str}%") }
  scope :published, lambda{ where(published: true) }
end

Behind the scenes, the scope invocations above do roughly the following:

class BlogPost < ActiveRecord::Base
  def self.title_matches(str)
    where("title LIKE ?", "%#{str}%")
  end
  
  def self.published
    where(published: true)
  end
end
 
Extending scope by adding a block

For the BlogPosts returned from :published, we might want to apply further scoping, which wouldn't make sense to the BlogPosts returned from other scopes (e.g. :title_matches). In that case, we could provide a block to the scope method as a third argument.

Here is a scenario where an additional block to the scope method comes in handy: let's say we've applied tagging to our BlogPost class, and for setting/getting its tags we use the ActsAsTaggableOn gem. As a result, each instance of BlogPost responds to the tag_list method, which returns the tags of a BlogPost as an array of strings.

Now, among the published posts, we would like to count how many times each tag has been used. For that, we define a tags_count method within the scope of :published. See below:

class BlogPost < ActiveRecord::Base
  acts_as_taggable # Specific for the ActsAsTaggableOn gem

  scope :published, lambda{ where(published: true) } do
    def tags_count
      map(&:tag_list).inject({}) do |hash, arr|
        arr.each do |elem|  
          hash[elem] ? hash[elem] += 1 : hash[elem] = 1
        end
        hash
      end
    end
  end
end

BlogPost.published.tags_count
# => {"ruby"=>1, "javascript"=>2}

The code above is roughly equivalent to the following:

class BlogPost < ActiveRecord::Base
  # Define an anonymous Module:
  extensions = Module.new do
    def tags_count
      map(&:tag_list).inject({}) do |hash, arr|
        arr.each do |elem|  
          hash[elem] ? hash[elem] += 1 : hash[elem] = 1
        end
        hash
      end      
    end
  end

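  # Then define a class method and extend the ActiveRecord::Relation object with the anonymous Module from above.
  # define_singleton_method takes a block, so the local variable extensions is still visible inside it
  # (a plain "def self.published" starts a new scope and would not see it).
  define_singleton_method(:published) do
    scope = where(published: true)
    scope.extend extensions
    scope
  end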
end

So in a nutshell, by extending the ActiveRecord::Relation object with the newly created anonymous module, all methods defined within this module become available to the ActiveRecord::Relation object that was returned by the original scope.

In case the example with inject({}) was confusing, I recommend checking my earlier post on composing a hash with inject.
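The key point is that extend only affects the one object it is called on. Here is a minimal plain-Ruby sketch of that behaviour; the TagCounting module and the two arrays of tag lists are invented for this illustration and stand in for the anonymous module and the relation.

```ruby
# Hypothetical module playing the role of the anonymous module.
module TagCounting
  def tags_count
    # Hash.new(0) makes every missing key start at 0.
    each_with_object(Hash.new(0)) do |tags, counts|
      tags.each { |tag| counts[tag] += 1 }
    end
  end
end

published = [["ruby"], ["javascript"], ["javascript"]]
drafts    = [["rails"]]

# Only this one Array instance gains the method.
published.extend(TagCounting)

p published.tags_count == { "ruby" => 1, "javascript" => 2 }  # => true
p drafts.respond_to?(:tags_count)                             # => false
p published.singleton_class.include?(TagCounting)             # => true
```

Other arrays are untouched, just as relations returned by other scopes do not gain tags_count.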

As a continuation of this topic, in the next post, I will explain what the differences are between a Proc and a Lambda and why we should invoke the scope method with a lambda rather than a Proc.

Lambda, procs and ActiveRecord scopes - Part 1

Tags: ruby, lambda
Publish Date: 2016-06-19
Contents:
  1. The lambda method vs "stabby lambda" constructor ->(){}
  2. scope method within ActiveRecord(::Scoping::Named::ClassMethods)
  3. Differences between lambda and Proc

 

During a pair-programming session some time ago, while writing a rather long ActiveRecord scope, my coding-partner asked me whether "it would work if we replace the curly braces {} with do / end".

scope :scope_name, -> (block_param1, block_param2) { some_invocation(block_param1).some_other_invocation(block_param2) }

I was a bit surprised when he asked this. As an experienced programmer with a strong academic background in CS, it should be clear to him that the curly braces were wrapping a code-block, right?

However, I can also see why my coding-partner was asking this:

  1. The stabby lambda's lack of explicitness can create confusion.
  2. Many "Rails developers" simply follow code examples without looking seriously into the API. As a result, they don't know what they are doing.
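As for the do / end question itself, the answer depends on the notation, because do / end binds to the outermost method call while curly braces bind to the nearest one. The sketch below shows this in plain Ruby. Both fake_scope and fake_lambda are hypothetical stand-ins written for this example; the real lambda method would typically raise an ArgumentError when it receives no block.

```ruby
# Hypothetical stand-in for scope: it only records what it received.
def fake_scope(name, body = nil, &block)
  { name: name, body: body, block: block }
end

# Hypothetical stand-in for the lambda method that tolerates a missing block.
def fake_lambda(&block)
  block
end

# Stabby lambda: do/end belongs to the literal, so the lambda arrives as the body.
with_stabby = fake_scope :doubled, ->(x) do
  x * 2
end

# Method-style call: do/end binds to fake_scope, so fake_lambda gets no block.
with_method = fake_scope :doubled, fake_lambda do |x|
  x * 2
end

# Curly braces bind to the nearest call, so fake_lambda receives the block.
with_braces = fake_scope :doubled, fake_lambda { |x| x * 2 }

p with_stabby[:body].call(21)   # => 42
p with_stabby[:block].nil?      # => true
p with_method[:body].nil?       # => true
p with_method[:block].call(21)  # => 42
p with_braces[:body].call(21)   # => 42
```

So with the stabby lambda, swapping {} for do / end works; with the lambda method, the block would silently end up as the third argument of scope instead.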

 

The lambda method and "stabby lambda" constructor -> (){}

Both the lambda method and the "stabby lambda" produce the same result, so why are there two notations?
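A quick way to convince yourself of this is to compare the two objects in irb. The variable names below are chosen only for this sketch.

```ruby
with_method = lambda { |a, b| a * b }
with_stabby = ->(a, b) { a * b }

p with_method.class    # => Proc
p with_stabby.class    # => Proc
p with_method.lambda?  # => true
p with_stabby.lambda?  # => true
p with_method.call(3, 4) == with_stabby.call(3, 4)  # => true

# Both are strict about the number of arguments, as lambdas are.
begin
  with_stabby.call(3)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```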

The reason is that, in Ruby versions older than 1.9, the interpreter had problems parsing lambdas that have block parameters with default values. The old interpreters could not figure out whether the second pipe (|) was a delimiter for block parameters or a bitwise OR operator:

# Ruby versions older than 1.9 could not parse lambdas whose block parameters have default values
lambda { |a, b=1| puts a * b }

# No problems when block parameters have no default values
lambda { |a, b| puts a * b }
# => #<Proc:0x00007fd48490a7f8@(irb):5>

As a result of this problem, the stabby lambda constructor has been added:

-> (a, b=0) { puts a * b }

However, this parsing problem was solved in Ruby 1.9, so the stabby lambda is no longer strictly necessary. But in the meantime, it has gained a cult following among Rails developers.

If you look at the Rails Guide for version 3.2 and before, you can see that they used to demonstrate scope with the lambda method. Then from version 4.0, for whatever reason, the Rails team decided to demonstrate the scope method with the stabby lambda constructor.
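On a current Ruby, both notations accept default values for their parameters. The short sketch below returns the product instead of printing it, purely to make the results easy to compare.

```ruby
with_method = lambda { |a, b = 1| a * b }
with_stabby = ->(a, b = 1) { a * b }

p with_method.call(5)     # => 5
p with_stabby.call(5)     # => 5
p with_method.call(5, 3)  # => 15
p with_stabby.call(5, 3)  # => 15

# Both report one required and one optional parameter.
p with_method.parameters  # => [[:req, :a], [:opt, :b]]
p with_stabby.parameters  # => [[:req, :a], [:opt, :b]]
```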

 

Pros and Cons of both notations

One reason why some prefer the stabby lambda, is that it is shorter than typing the word lambda.

However, in contrast to the lambda method, the "stabby lambda" constructor does not explicitly tell you that it is creating a lambda. As a result, when people follow the examples in the Rails Guide (or elsewhere) with a stabby constructor, there is a chance that they are not aware that they are actually creating a lambda.

In my opinion, even though the lambda method takes more keystrokes, it still has the advantage of being clearer and more explicit. Just imagine being a new Ruby programmer who has to perform a Google search for "->(){} ruby" instead of "lambda ruby"!

In the next post, we will explore how the scope method within ActiveRecord uses lambda and blocks to define class methods and extends ActiveRecord::Relation objects.

 

Compose a hash using inject method

Tags: ruby, hash, inject, enumerable
Publish Date: 2016-04-26

inject is a method defined in the Enumerable module, that most people would use to sum up values.
Here is a classic way people would use this method, whereby the argument 0 is used as a starting value:

[1,2,3,4,5,6,7,8,9,10].inject(0){ |memo, elem| memo + elem }
# => 55

Since inject is defined in Enumerable, it is also available to instances of Hash:

Hash.included_modules
# => [Enumerable, Kernel]
{}.respond_to? :inject
# => true

Let's say we have 3 employees, their names and ages are represented this way:

employees = [{name: "Alan", age: 30}, 
             {name: "Tom", age: 45}, 
             {name: "Steve", age: 22} ]

Obviously we could sum up their ages using inject, but this is not necessarily useful:

total_age = employees.inject(0) do |memo, elem|
  memo + elem[:age]
end
# => 97

However, inject is a lot more versatile than that. My former colleague Paul (archan937) likes to use inject as a way to compose a hash, a trick which I have adopted.

In our example, we could use the previous employees array to provide much more meaningful information, such as the number of years before each employee reaches retirement age.

Assuming the retirement age is 65:

years_to_retirement = employees.inject({}) do |memo, elem|
  memo[elem[:name]] = 65 - elem[:age]
  memo
end
# => {"Alan"=>35, "Tom"=>20, "Steve"=>43} 

In the previous example, an empty hash ({}) was provided to inject as the initial value.
Then during each iteration, we added a key-value pair to this hash, using the employee's name as key and years-to-retirement as value.
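The bare memo on the last line of the block matters: whatever the block returns becomes memo in the next iteration. The sketch below shows what happens when it is left out, and one possible alternative, each_with_object, which hands the same hash to every iteration so nothing needs to be returned. This is just an illustration, not a recommendation from the original example.

```ruby
employees = [{ name: "Alan", age: 30 },
             { name: "Tom", age: 45 },
             { name: "Steve", age: 22 }]

# Without the trailing memo, the block returns the assigned Integer (35),
# which becomes memo in the second iteration and has no []= method.
begin
  employees.inject({}) do |memo, elem|
    memo[elem[:name]] = 65 - elem[:age]
  end
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end

# each_with_object passes the same hash to every iteration and returns it at the end.
# Note that the block parameters are swapped compared to inject.
years_to_retirement = employees.each_with_object({}) do |elem, memo|
  memo[elem[:name]] = 65 - elem[:age]
end

p years_to_retirement == { "Alan" => 35, "Tom" => 20, "Steve" => 43 }  # => true
```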

 

Pay attention to the second block-parameter

One thing to bear in mind when iterating through a collection with inject (as with other iterators), is that iterating through an array works differently from iterating through a hash.
In the previous examples, we saw that the second block-parameter represents individual elements in the given array. (As expected)

To clarify:

employees.class
# => Array 
employees.inject(0) do |memo, elem|
  puts elem.inspect
end
# It prints:
# {:name=>"Alan", :age=>30}
# {:name=>"Tom", :age=>45}
# {:name=>"Steve", :age=>22}

But when iterating through key-value pairs within a hash, the second block-parameter would act differently.
Let's say my wallet is represented by a hash in which the banknotes' denominations (50, 20, 10 & 5) act as keys and the number of banknotes of each denomination acts as the value:

my_wallet = {50 => 1, 20 => 2, 10 => 3, 5 => 2 }

# Now look what happens when we inspect the second block-parameter (called 'elem' in this example):

my_wallet.inject(0) do |memo, elem|
  puts elem.inspect
end
# It prints:
# [50, 1]
# [20, 2]
# [10, 3]
# [5, 2]

So we saw that inject turns each key-value pair into an array of two elements, then passes it on as the second block-parameter.
This behaviour is not unique to inject; it also applies to other iterators such as map and each.
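A short check with map and each, using the same wallet and a single block-parameter, illustrates this:

```ruby
my_wallet = { 50 => 1, 20 => 2, 10 => 3, 5 => 2 }

# With a single block-parameter, map receives each pair as a two-element array.
pairs_from_map = my_wallet.map { |elem| elem }

# The same holds for each.
pairs_from_each = []
my_wallet.each { |elem| pairs_from_each << elem }

p pairs_from_map                     # => [[50, 1], [20, 2], [10, 3], [5, 2]]
p pairs_from_each == pairs_from_map  # => true
p my_wallet.to_a == pairs_from_map   # => true
```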

While we could sum up the amount of money in my_wallet by summing products of elem[0] * elem[1], a much clearer way would be to modify our second block-parameter this way:

my_wallet.inject(0) do |memo, (key,value)|
  memo + key * value
end
# => 130
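The parentheses are needed because inject passes two arguments to the block (the memo and the pair), so the pair has to be destructured explicitly. When the block receives only the pair, Ruby spreads it over two block-parameters by itself. As a side note, and assuming Ruby 2.4 or newer, Enumerable#sum with a block can express the same calculation:

```ruby
my_wallet = { 50 => 1, 20 => 2, 10 => 3, 5 => 2 }

# inject yields (memo, pair), so the pair is destructured with parentheses.
total_with_inject = my_wallet.inject(0) { |memo, (key, value)| memo + key * value }

# sum yields only the pair, which is spread over key and value automatically.
total_with_sum = my_wallet.sum { |key, value| key * value }

p total_with_inject                    # => 130
p total_with_sum                       # => 130
p total_with_inject == total_with_sum  # => true
```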