Memory Profiling
This page is a quick guide on tracking down and patching memory leaks and other memory usage issues in Ruby & Rails.)
Example: Profiling Harvesters
Our main interest in Harvesters for the total active memory (MemoryProfiler::Results#total_retained_memsize
) to stay near constant, even while the total memory used throughout (MemoryProfiler::Results#total_allocated_memsize
) grows linearly. In the simplest case:
require 'memory_profiler' recs = MyHarvester.new.records report = MemoryProfiler.report { 100.times { recs.next } } report.total_retained_memsize # => 1560673
Ideally the output of #total_retained_memsize
is very similar to the same result for MemoryProfiler.report { recs.next }.total_retained_memsize
. In practice we observe growth up into the tens of records. The end result may also vary widely based on the size of the record in question. A good approach is to try gradually increasing the size of the "harvest", comparing memory costs as number of records touched grows.
reports = {} reports[:report_1] = MemoryProfiler.report { recs.next } reports[:report_5] = MemoryProfiler.report { 5.times { recs.next } } reports[:report_10] = MemoryProfiler.report { 10.times { recs.next } } reports[:report_50] = MemoryProfiler.report { 50.times { recs.next } } reports[:report_100] = MemoryProfiler.report { 100.times { recs.next } } reports[:report_150] = MemoryProfiler.report { 150.times { recs.next } } reports[:report_250] = MemoryProfiler.report { 250.times { recs.next } } reports[:report_500] = MemoryProfiler.report { 500.times { recs.next } } reports.map { |key, report| [key, report.total_retained_memsize] } # => [[:report_1, 36673], # [:report_5, 34652], # [:report_10, 1377225], # [:report_50, 1306615], # [:report_100, 1303581], # [:report_150, 1622342], # [:report_250, 1582591], # [:report_500, 1793210]]
Generally, I've found that pushing beyond 500 records causes problems for MemoryProfiler
. Since it keeps track of objects allocated by gem throughout the entire run, the total number of objects grows very large. Still, at 500, we might be a little suspicious of the growth we see in the higher numbers, here. We can get a slightly closer look through #objects_retained_by_gem
.
I inspected the reports as a whole with reports.map { |key, report| [key, report.retained_objects_by_gem]
}, the full output isn't worth reproducing, here. The takeaway is that a total of 9 gems have retained objects for each reported "harvest" (Spring occasionally sneaks in as a 10th). Among these, we see stable memory usage between harvests for all but two. The JSON
library's retained objects grow from 1, to 6642 between reports 5 and 10, then remains stable (5000-6700) after this point. This seems to account for the baseline memsize growth in all the following reports.
Additionally, the RDF
gem has an initially very low object count, but it seems to grow linearly between reports:
reports.map { |key, report| [key, report.retained_objects_by_gem] } # I won't repeat the full output, here. reports[:report_1].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>17}] reports[:report_5].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>65}] reports[:report_50].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>605}] reports[:report_100].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>1205}] reports[:report_250].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>3005}] reports[:report_500].retained_objects_by_gem.select { |gem| gem[:data].include?('rdf') } # => [{:data=>"rdf-1.1.17.1", :count=>6005}]
While the effect is negligible at first, it becomes visible around the 500 mark. At this point, the total objects in RDF
are catching up to JSON
. What's going on here?