  • Krikri::Harvesters::OAIHarvester

  • Krikri::Harvesters::CouchdbHarvester


Running an Ad-Hoc Harvest

In non-production environments, it's often useful to run a portion of the full harvest. You can do this with a harvester as follows:

Run a partial harvest (Ruby console):
 > harvester = Krikri::Harvesters::OAIHarvester.new(opts)
 > test_harvest_uri = RDF::URI('http://example.org/my_test_harvest')
 > harvester.records.take(1000).each { |rec| harvester.process_record(rec, test_harvest_uri) }

This does effectively what the harvester's `#run` method does: each record is processed the same way as in a queued harvest, as though it had been run by the activity "http://example.org/my_test_harvest". Note that although you can query the records by that URI (for example, through the provenance query client), this does not create a `Krikri::Activity` in the database.
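To illustrate why `take` limits the harvest to a portion of the records, here is a minimal, self-contained sketch. `StubHarvester` is a hypothetical stand-in and not a Krikri class; it mimics only the `#records` and `#process_record` calls used above, assumes that `#records` yields records on demand, and uses a plain string where the console session uses an `RDF::URI`.

```ruby
# Hypothetical stand-in, NOT the Krikri harvester: it only mimics the two
# methods used in the console session (#records and #process_record).
class StubHarvester
  attr_reader :fetched, :processed

  def initialize(total)
    @total = total
    @fetched = 0     # how many records were actually pulled from the "source"
    @processed = []  # [record, activity_uri] pairs handed to #process_record
  end

  # Yields record identifiers one at a time, counting each fetch.
  def records
    Enumerator.new do |yielder|
      1.upto(@total) do |i|
        @fetched += 1
        yielder << "record-#{i}"
      end
    end
  end

  # Remembers each record with the activity URI it was attributed to.
  def process_record(record, activity_uri)
    @processed << [record, activity_uri]
  end
end

harvester = StubHarvester.new(1_000_000)
test_harvest_uri = 'http://example.org/my_test_harvest'

# Only the first three records are pulled; enumeration stops after that.
harvester.records.take(3).each do |rec|
  harvester.process_record(rec, test_harvest_uri)
end
```

In this sketch the enumerator stops as soon as `take` has collected the requested number of records, so the remaining records are never fetched. Whether a real harvester avoids extra requests in the same way depends on how its `#records` is implemented.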

 

Harvest Behaviors and Record Class

By default, running a harvester saves each record as a `Krikri::OriginalRecord`. You can customize this by passing a class that implements the `HarvestBehavior` interface to the `:harvest_behavior` option, a different record class (one responding to `#build`) to the `:record_class` option, or both. The OAI harvester, for instance, implements a specialized `SkipDeletedBehavior`, which silently skips OAI records marked with the status "deleted".
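The following is a rough sketch of what a skip-deleted behavior could look like. Everything in it is an assumption made for illustration: `StubRecord`, `SaveEverythingBehavior`, and `SkipDeletedSketch` are hypothetical names, and the shape of the interface (a behavior built from a record and an activity URI, with a `process_record` method) is a guess rather than the documented `HarvestBehavior` contract. Check the Krikri source before relying on it.

```ruby
# Hypothetical stand-ins; none of these are Krikri classes.
StubRecord = Struct.new(:id, :status)

# Assumed shape of a harvest behavior: constructed from a record and an
# activity URI, with #process_record doing the work.
class SaveEverythingBehavior
  SAVED = [] # in-memory stand-in for persisting a record

  attr_reader :record, :activity_uri

  def initialize(record, activity_uri)
    @record = record
    @activity_uri = activity_uri
  end

  # "Saves" every record it is given.
  def process_record
    SAVED << [record.id, activity_uri]
  end

  # Class-level convenience wrapper (assumed, for illustration).
  def self.process_record(record, activity_uri)
    new(record, activity_uri).process_record
  end
end

# Passes silently over records whose status is "deleted".
class SkipDeletedSketch < SaveEverythingBehavior
  def process_record
    return if record.status == 'deleted'
    super
  end
end

records = [
  StubRecord.new('a', nil),
  StubRecord.new('b', 'deleted'),
  StubRecord.new('c', nil)
]
uri = 'http://example.org/my_test_harvest'
records.each { |rec| SkipDeletedSketch.process_record(rec, uri) }

saved_ids = SaveEverythingBehavior::SAVED.map(&:first)
```

Here the deleted record `b` is skipped and the other two are stored. Subclassing the saving behavior keeps the skip rule separate from the save logic, which is one plausible way to structure such a class.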