Using the Benchmark Module

Kevin Huang
5 min read · Oct 19, 2020


I recently started my journey to become a software engineer at Flatiron School, and it has been great! The main language we’ve been using at Flatiron is Ruby. I had a bit of coding experience from college, where I learned about Object-Oriented Programming in Java. However, learning OO was as far as I got.

During the pre-work we were introduced to enumerables. Enumerable is a module that provides methods that sort and filter (and much more!) the data in a collection. One thing that stood out to me was the pair #map and #collect. I know they both do the same thing, but I was curious to see how they performed separately. While I was googling the difference between the two methods, I came across a post that introduced the Benchmark module. The module provides methods that measure the time it takes the computer to execute the given code.

Using the benchmark module

To start using Benchmark we’ll need to install the benchmark gem. To do so, run the following command (note that Benchmark also ships with Ruby’s standard library, so this step may not be necessary on your version of Ruby):

$ gem install benchmark

Next, all we need to do is add require 'benchmark' at the top of our file, similar to requiring Pry.

There are multiple methods the Benchmark module provides. To learn more about the other methods, check out the Ruby documentation. I will be using the #bm method in the examples below, which generates sequential reports and allows us to label each report.
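Here is a minimal sketch of how #bm is used (the labels and the blocks being timed are placeholders of my own, not from the original post):

```ruby
require "benchmark"

# Benchmark.bm yields a reporter; each #report call runs its block
# and prints one labeled row of user/system/total/real times.
Benchmark.bm do |x|
  x.report("sum:")  { (1..100_000).sum }
  x.report("sort:") { (1..100_000).to_a.shuffle.sort }
end
```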

Comparing both #map and #collect

Let’s consider the following code:

Using #map

I created an array with 4 items and used the #map method, passing in a block to add 10 to every element in the array. I then took it over to Benchmark to see how long it takes to execute, and the following is the result:
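The code from the screenshot isn’t preserved here, so this is a sketch of what it likely looked like based on the description above (the variable name is my own):

```ruby
require "benchmark"

array = [1, 2, 3, 4]

Benchmark.bm do |x|
  # time how long #map takes to add 10 to every element
  x.report("map:") { array.map { |n| n + 10 } }
end
```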

using #map

The numbers shown above are represented in seconds. The report shows the user CPU time, the system CPU time, and the total of the two. The number we’re looking for is the one in parentheses (0.000004), which shows the elapsed real time. Note: execution time will vary based on your CPU. I am currently running macOS Catalina on a 3.1 GHz dual-core i5.

Now let’s see how it compares with the #collect method.

using #collect
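The #collect version differs only in the method name; a sketch under the same assumptions as above:

```ruby
require "benchmark"

array = [1, 2, 3, 4]

Benchmark.bm do |x|
  # identical block, but using #collect instead of #map
  x.report("collect:") { array.collect { |n| n + 10 } }
end
```

In fact, #collect is an alias for #map in Ruby, which is why the timings come out virtually identical.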

There is almost no difference in execution time with so little data, so let’s take this test to a bigger scale.

Doing the test at a bigger scale

I created an array with 20,000 elements consisting of consecutive numbers from 1–20,000. I then used #map to add 300 to each element, running that operation 10,000 times. I decided to run each method three times.
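A sketch of what that test might look like (the exact code isn’t shown in the screenshots; I’ve also scaled the iteration count down to 100 so the sketch runs quickly, where the original test used 10,000):

```ruby
require "benchmark"

array = (1..20_000).to_a
iterations = 100 # the original test used 10_000

Benchmark.bm do |x|
  # run each method three times, with labeled reports
  3.times do |i|
    x.report("map_#{i + 1}:")     { iterations.times { array.map     { |n| n + 300 } } }
  end
  3.times do |i|
    x.report("collect_#{i + 1}:") { iterations.times { array.collect { |n| n + 300 } } }
  end
end
```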

Almost no difference in executing time!

So it seems there is no difference between using #map and #collect. What about other pairs of methods that are very similar? I wanted to find out how those would compare.

So then, I decided to try other methods that I use frequently. The first that came to mind were #push and the shovel method (<<). Both of these methods append to an existing array, updating the array in place.

For this test, I used the same statement in the block, adding 200 to every element. Instead of iterating 10,000 times, I did 50,000.

Using the #bm method I can run multiple reports. I ran #push and << three times each for more concrete results, labeling each report respectively (.push_1, .push_2, .push_3, <<_1, <<_2, <<_3), and the following is what I got back:
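The screenshots don’t preserve the code, but it likely resembled this sketch (pushing 50,000 computed values into a fresh array per report; the exact setup is my assumption):

```ruby
require "benchmark"

Benchmark.bm do |x|
  # three labeled runs of #push
  %w[.push_1 .push_2 .push_3].each do |label|
    x.report(label) do
      result = []
      50_000.times { |n| result.push(n + 200) }
    end
  end
  # three labeled runs of the shovel operator
  %w[<<_1 <<_2 <<_3].each do |label|
    x.report(label) do
      result = []
      50_000.times { |n| result << (n + 200) }
    end
  end
end
```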

What I found was that the shovel method is almost 10 seconds faster! Why does it execute so much faster? There has to be a reason for such a big difference in execution time.

What I found out is that while #push and the shovel method are similar, they aren’t exactly the same. While they both append to an array, the #push method can take more than one argument, whereas the shovel method only takes one. This is why I believe shovel executes faster: the #push method might be doing extra work to handle any additional arguments.

It seems more efficient to use the shovel method when you need to append only 1 value to an array. If you need to append more than one value, then you will need to use the #push method.
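For example:

```ruby
array = [1, 2]

# push accepts several values in a single call
array.push(3, 4, 5)
# array is now [1, 2, 3, 4, 5]

# the shovel operator takes one value at a time, but it returns
# the array, so calls can be chained
array << 6 << 7
# array is now [1, 2, 3, 4, 5, 6, 7]
```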

Conclusion

I think the Benchmark tool is amazing and I’m excited to use it in the future. At the moment our programs are generally no longer than 100 lines and the amount of data we’re dealing with is not very big, so the differences Benchmark reports are almost negligible. Still, I think it’s important that we keep the efficiency of our code in mind, so that as we progress through the program and our careers we know which methods to call on for better performance, be it speed or memory allocation. If you’d like to see more on performance, I came across another Medium article by Dr. Derek Austin who goes into detail on the fastest way to find the min and max numbers in an array in JavaScript.
