Memory Profiling

This page is a quick guide on tracking down and patching memory leaks and other memory usage issues in Ruby & Rails.)

Example: Profiling Harvesters

Our main interest in Harvesters for the total active memory (MemoryProfiler::Results#total_retained_memsize) to stay near constant, even while the total memory used throughout (MemoryProfiler::Results#total_allocated_memsize) grows linearly.  In the simplest case:

Memory Profiling Harvesters
require 'memory_profiler'

recs =  MyHarvester.new.records
report = MemoryProfiler.report { 100.times { recs.next } }
 
report.total_retained_memsize # => 1560673

Ideally the output of #total_retained_memsize is very similar to the same result for MemoryProfiler.report { recs.next }.total_retained_memsize. In practice we observe growth up into the tens of records.  The end result may also vary widely based on the size of the record in question.  A good approach is to try gradually increasing the size of the "harvest", comparing memory costs as number of records touched grows.

Varying Harvest Size
reports = {}
reports[:report_1]   = MemoryProfiler.report { recs.next }
reports[:report_5]   = MemoryProfiler.report { 5.times { recs.next } }
reports[:report_10]  = MemoryProfiler.report { 10.times { recs.next } }
reports[:report_50]  = MemoryProfiler.report { 50.times { recs.next } }
reports[:report_100] = MemoryProfiler.report { 100.times { recs.next } }
reports[:report_150] = MemoryProfiler.report { 150.times { recs.next } }
reports[:report_250] = MemoryProfiler.report { 250.times { recs.next } }
reports[:report_500] = MemoryProfiler.report { 500.times { recs.next } }
 
reports.map { |key, report| [key, report.total_retained_memsize] }
# => [[:report_1, 36673], 
#     [:report_5, 34652], 
#     [:report_10, 1377225], 
#     [:report_50, 1306615], 
#     [:report_100, 1303581], 
#     [:report_150, 1622342], 
#     [:report_250, 1582591], 
#     [:report_500, 1793210]] 

Generally, I've found that pushing beyond 500 records causes problems for MemoryProfiler.  Since it keeps track of objects allocated by gem throughout the entire run, the total number of objects grows very large.  Still, at 500, we might be a little suspicious of the growth we see in the higher numbers, here.  We can get a slightly closer look through #objects_retained_by_gem.  

I inspected the reports as a whole with reports.map { |key, report| [key, report.retained_objects_by_gem] }, the full output isn't worth reproducing, here. The takeaway is that a total of 9 gems have retained objects for each reported "harvest" (Spring occasionally sneaks in as a 10th).  Among these, we see stable memory usage between harvests for all but two.  The JSON library's retained objects grow from 1, to 6642 between reports 5 and 10, then remains stable (5000-6700) after this point.  This seems to account for the baseline memsize growth in all the following reports.

Additionally, the RDF gem has an initially very low object count, but it seems to grow linearly between reports:

Objects Retained by Gem
reports.map { |key, report| [key, report.retained_objects_by_gem] } # I won't repeat the full output, here.
 
reports[:report_1].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } #   => [{:data=>"rdf-1.1.17.1", :count=>17}]
reports[:report_5].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } #   => [{:data=>"rdf-1.1.17.1", :count=>65}]
reports[:report_50].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } #  => [{:data=>"rdf-1.1.17.1", :count=>605}]
reports[:report_100].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>1205}]
reports[:report_250].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>3005}]
reports[:report_500].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>6005}]
 

While the effect is negligible at first, it becomes visible around the 500 mark.  At this point, the total objects in RDF are catching up to JSON.  What's going on here?

Gems (ones we've used/mentioned in examples)

  1. MemoryProfiler

Web Resources

  1. What I Learned About Hunting Memory Leaks in Ruby 2.1
  2. Debugging memory leaks in Ruby