How can I optimize this Array#each in Ruby?

Problem

I’m trying to optimize this bit of code and I thought I’d throw it out here to see if anyone has any ideas. I’ve been measuring its performance by wrapping the call to the function in Benchmark.measure { ... }.

Yes, each item has a date attribute that is a Ruby Date.

  require 'benchmark'
  puts Benchmark.measure {
    grouped_items
  }

  def grouped_items
    unless @grouped
      @grouped = {}
      @items.each do |item|
        key = item.date
        if @grouped.has_key?(key)
          @grouped[key] << item
        else
          @grouped[key] = [item]
        end
      end
    end
    @grouped
  end

Thanks for any insight you care to share.

Edit #1: My first optimization. I was aiming for a slightly more succinct function, and it also cut about 0.2 seconds off the time to process 100_000 items.

  def grouped_items
    unless @grouped
      @grouped = Hash.new { |h, k| h[k] = [] }
      @items.each { |item| @grouped[item.date] << item }
    end
    @grouped
  end
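
A small sketch of why Edit #1 passes a block to Hash.new rather than an array. The keys and values here are made up for illustration. With the block form every missing key gets its own array and the assignment stores it in the hash; with Hash.new([]) all missing keys share one default array and nothing is ever stored.

```ruby
# Block form: a fresh array is created and stored for each missing key.
per_key = Hash.new { |hash, key| hash[key] = [] }
per_key[:a] << 1
per_key[:b] << 2

# Object form: every missing key returns the same default array,
# and << mutates that shared array without adding any key to the hash.
shared = Hash.new([])
shared[:a] << 1
shared[:b] << 2

p per_key.keys  # [:a, :b]
p shared.keys   # []
p shared[:c]    # [1, 2]
```

Only the block form is suitable for grouping, because the groups have to end up as real entries in the hash.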

Edit #2: My second iteration with more or less the same performance profile but with much less code.

def grouped_items
  @grouped ||= @items.group_by { |item| item.date }
end

Edit #3: An alternative way to write the same thing, passing the method name as a block with &:date.

def grouped_items
  @grouped ||= @items.group_by(&:date)
end

Solution

Are you aware of Enumerable#group_by?

def grouped_items
  @items.group_by { |item| item.date }
end

What about this?

@items.group_by { |item| item.effective_date }
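
To make the shape of the result concrete, here is a minimal sketch of group_by on a few dated items. The Item struct and the sample data are hypothetical stand-ins, since the question does not show the real item class.

```ruby
require 'date'

# Hypothetical stand-in for the question's item class.
Item = Struct.new(:name, :date)

items = [
  Item.new('a', Date.new(2024, 1, 1)),
  Item.new('b', Date.new(2024, 1, 2)),
  Item.new('c', Date.new(2024, 1, 1)),
]

# group_by returns a Hash whose keys are the block's return values
# and whose values are arrays of the items that produced each key.
grouped = items.group_by(&:date)

grouped.each do |date, group|
  puts "#{date}: #{group.map(&:name).join(', ')}"
end
```

This prints "2024-01-01: a, c" followed by "2024-01-02: b". Keys appear in the order they are first seen, and items keep their original order within each group.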

Benchmarks for the curious:

              user     system      total        real
Ryan v1 : 1.080000   0.000000   1.080000 (  1.077599)
Ryan v2 : 0.620000   0.000000   0.620000 (  0.622756)
group_by: 0.580000   0.000000   0.580000 (  0.576531)
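
For anyone who wants to rerun a comparison like this, here is a rough sketch using Benchmark.bm. It is not the script that produced the numbers above. The BenchItem struct, the method names, and the spread of 100_000 items over 365 dates are all assumptions, and timings will vary with the machine and Ruby version.

```ruby
require 'benchmark'
require 'date'

# Hypothetical item type; the real class is not shown in the question.
BenchItem = Struct.new(:date)

start = Date.new(2024, 1, 1)
# 100_000 items spread over 365 distinct dates (an assumed distribution).
items = Array.new(100_000) { |i| BenchItem.new(start + (i % 365)) }

# The original loop from the question, without the memoization.
def group_with_has_key(items)
  grouped = {}
  items.each do |item|
    key = item.date
    if grouped.has_key?(key)
      grouped[key] << item
    else
      grouped[key] = [item]
    end
  end
  grouped
end

# Edit #1: a Hash whose default block creates the array on first access.
def group_with_default_block(items)
  grouped = Hash.new { |h, k| h[k] = [] }
  items.each { |item| grouped[item.date] << item }
  grouped
end

# Edits #2 and #3: Enumerable#group_by.
def group_with_group_by(items)
  items.group_by(&:date)
end

Benchmark.bm(14) do |x|
  x.report('has_key?:')      { group_with_has_key(items) }
  x.report('default block:') { group_with_default_block(items) }
  x.report('group_by:')      { group_with_group_by(items) }
end
```

The memoization is left out on purpose so that each report measures the grouping work itself rather than a cached result.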
