...
Krikri::Harvesters::OAIHarvester
Krikri::Harvesters::CouchdbHarvester
Running an Ad-Hoc Harvest
In non-production environments, it's often useful to run a portion of the full harvest. You can do this with a harvester as follows:
> harvester = Krikri::Harvesters::OAIHarvester.new(opts)
> test_harvest_uri = RDF::URI('http://example.org/my_test_harvest')
> harvester.records.take(1000).each { |rec| harvester.process_record(rec, test_harvest_uri) }
This does effectively what the harvester's `#run` method does: it processes the records in the same way as a queued harvest, as though it had been run by an activity with the URI `http://example.org/my_test_harvest`. Note that while you can query the records by that URI (through the provenance query client, for example), this does not create a `Krikri::Activity` in the database.
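The partial-harvest pattern above can be tried without a running Krikri instance. The following is a minimal sketch using a hypothetical `FakeHarvester` stand-in (not part of Krikri) whose `#records` method returns an endless enumerator. It illustrates that `take(n)` pulls only the first `n` records from the source and that each one is attributed to the test URI.

```ruby
# Hypothetical stand-in for a harvester; this class is NOT part of Krikri.
class FakeHarvester
  attr_reader :fetched, :processed

  def initialize
    @fetched = 0     # how many records were pulled from the "source"
    @processed = []  # [record, activity_uri] pairs seen by process_record
  end

  # Returns an endless enumerator, counting each record as it is fetched.
  def records
    Enumerator.new do |yielder|
      loop do
        @fetched += 1
        yielder << "record-#{@fetched}"
      end
    end
  end

  # Remembers which activity URI each record was attributed to.
  def process_record(rec, activity_uri)
    @processed << [rec, activity_uri]
  end
end

harvester = FakeHarvester.new
test_harvest_uri = 'http://example.org/my_test_harvest'

# Only the first three records are pulled, even though the source never ends.
harvester.records.take(3).each do |rec|
  harvester.process_record(rec, test_harvest_uri)
end

puts harvester.processed.length
puts harvester.fetched
```

Whether a real harvester fetches lazily in the same way depends on how its `#records` method is implemented, so treat the fetch count here as a property of the sketch only.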
Harvest Behaviors and Record Class
By default, running a harvester saves each record as a `Krikri::OriginalRecord`. This behavior is customizable by passing a class implementing the `HarvestBehavior` interface to `:harvest_behavior`, and/or a different record class (responding to `#build`) to the `:record_class` option. The OAI harvester, for instance, implements a specialized `SkipDeletedBehavior`, which passes silently over OAI records marked with the status "deleted".
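As a rough illustration of the pluggable-behavior idea, here is a self-contained sketch. Every class in it (`SketchRecord`, `SketchSaveBehavior`, `SketchSkipDeletedBehavior`, `SketchHarvester`) is a hypothetical stand-in, and the `process_record(record, activity_uri)` class method is an assumed shape for the behavior interface, not Krikri's documented signature.

```ruby
# Hypothetical stand-ins only; none of these classes are Krikri's own.
SketchRecord = Struct.new(:id, :status) do
  # Pretend to persist the record, attributing it to the given activity URI.
  def save(activity_uri)
    "saved #{id} for #{activity_uri}"
  end
end

# Default-style behavior: save every record.
class SketchSaveBehavior
  def self.process_record(record, activity_uri)
    record.save(activity_uri)
  end
end

# Specialized behavior: pass silently over records marked "deleted".
class SketchSkipDeletedBehavior
  def self.process_record(record, activity_uri)
    return nil if record.status == 'deleted'
    record.save(activity_uri)
  end
end

# A harvester that delegates record handling to the configured behavior.
class SketchHarvester
  def initialize(opts = {})
    @behavior = opts.fetch(:harvest_behavior, SketchSaveBehavior)
  end

  def process_record(record, activity_uri)
    @behavior.process_record(record, activity_uri)
  end
end

uri = 'http://example.org/my_test_harvest'
records = [SketchRecord.new('a', nil), SketchRecord.new('b', 'deleted')]

skipping = SketchHarvester.new(harvest_behavior: SketchSkipDeletedBehavior)
results = records.map { |rec| skipping.process_record(rec, uri) }

default = SketchHarvester.new
default_results = records.map { |rec| default.process_record(rec, uri) }
```

Keeping the save-or-skip decision in a separate class means the harvester itself does not change when a provider needs different handling; only the class passed in the options does.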