    Krikri::Indexer.enqueue({
      index_class: 'Krikri::QASearchIndex',
      generator_uri: Krikri::Activity.find(activity_id).rdf_subject
    })
A successful index job commits to Solr upon completion.
To clear a provider from the QA index manually, you can do:

    qa = Krikri::QASearchIndex.new
    qa.delete_by_query 'provider_id:http\://dp.la/api/contributor/washington'
    qa.commit
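The backslash in the query above escapes the colon in the provider URI, since an unescaped colon separates field name from value in Solr query syntax. A minimal sketch of building that query string for any provider follows; `provider_delete_query` is a hypothetical helper, not part of Krikri, and it escapes only colons, mirroring the example above. Other special characters in a URI would need their own handling.

```ruby
# Hypothetical helper (not part of Krikri): builds the Solr delete query
# for a provider URI, escaping colons as in the manual example.
def provider_delete_query(provider_uri)
  # Block form of gsub inserts the replacement literally (backslash + colon).
  escaped = provider_uri.gsub(':') { '\:' }
  "provider_id:#{escaped}"
end

query = provider_delete_query('http://dp.la/api/contributor/washington')
puts query
```

The resulting string could then be passed to `qa.delete_by_query` in place of the hand-written literal.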
The staging host is given by the Ansible configuration variable `es_cluster_loadbal`, defined at https://github.com/dpla/aws/blob/master/ansible/group_vars/staging#L25.
    # Verify that this is up to date; the job will fail after the query
    # (within 5 minutes) if it is incorrect.
    stg = 'internal-search-lbal-stg-1352112635.us-east-1.elb.amazonaws.com'
    Krikri::Indexer.enqueue({
      index_class: 'Krikri::ProdSearchIndex',
      generator_uri: Krikri::Activity.find(activity_id).rdf_subject,
      host: stg,
      index_name: 'dpla_alias'
    })
If you need to index data to the temporary frontend QA portal (http://ec2-54-172-127-200.compute-1.amazonaws.com/), use the staging host for the search load balancer (from above), but change the index name from 'dpla_alias' to 'fqa_172_30_0_143':
    # Verify that this is up to date; the job will fail after the query
    # (within 5 minutes) if it is incorrect.
    stgHost = 'internal-search-lbal-stg-1352112635.us-east-1.elb.amazonaws.com'
    Krikri::Indexer.enqueue({
      index_class: 'Krikri::ProdSearchIndex',
      generator_uri: Krikri::Activity.find(activity_id).rdf_subject,
      host: stgHost,
      index_name: 'fqa_172_30_0_143'
    })
    Krikri::Indexer.enqueue(index_class: 'Krikri::ProdSearchIndex',
                            generator_uri: Krikri::Activity.find(activity_id).rdf_subject)
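The `Krikri::ProdSearchIndex` calls above differ only in whether `host` and `index_name` are supplied. As a rough sketch, the options could be assembled by a small helper; `prod_index_options`, the `INDEX_NAMES` table, and the target names are hypothetical and not part of Krikri, and the sketch assumes that omitting `host` and `index_name` leaves the indexer on its configured defaults.

```ruby
# Hypothetical helper (not part of Krikri): builds the options hash passed
# to Krikri::Indexer.enqueue for each target described above.
INDEX_NAMES = {
  staging: 'dpla_alias',
  frontend_qa: 'fqa_172_30_0_143'
}.freeze

def prod_index_options(generator_uri, target: :default, staging_host: nil)
  opts = { index_class: 'Krikri::ProdSearchIndex', generator_uri: generator_uri }
  # With no host or index name, the indexer uses its configured defaults.
  return opts if target == :default

  index_name = INDEX_NAMES.fetch(target) do
    raise ArgumentError, "unknown target: #{target}"
  end
  raise ArgumentError, 'staging_host is required' if staging_host.nil?

  opts.merge(host: staging_host, index_name: index_name)
end

# Usage (inside a Krikri console, after verifying the staging host):
#   uri = Krikri::Activity.find(activity_id).rdf_subject
#   Krikri::Indexer.enqueue(prod_index_options(uri, target: :staging, staging_host: stg))
```

Requiring the host as an argument, rather than hard-coding it, keeps the "verify that this is up to date" step explicit.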
When indexing an existing provider from Heidrun for the first time, we need to clear the old indexed items. These will appear as duplicates with the same API ID, due to a change in how we handle the index's internal `
    idx_prod = Krikri::ProdSearchIndex.new
    provider_name = "scdl" # for example
    query = {:query=>{:filtered=>{
      :query=>{:match_all=>{}},
      :filter=>{:bool=>{
        :must_not=>{:term=>{:ingestionSequence=>"999999"}},
        :must =>{:term=>{:"provider.@id"=>"http://dp.la/api/contributor/#{provider_name}"}}
      }}}}}
    response = idx_prod.elasticsearch.search(index: 'dpla_alias', body: query)
    response['hits']['total'] # check that hit total matches expected; probably a good idea to check actual matches, too.
    # delete the items; look for "successful"=>5
    # if you get failures, checking the logs in `/var/log/elasticsearch` on the production boxes is a good starting place for diagnostics
    idx_prod.elasticsearch.delete_by_query(index: 'dpla_alias', body: query[:query])
    # => {"ok"=>true, "_indices"=>{"dpla-20150410-144958"=>{"_shards"=>{"total"=>5, "successful"=>5, "failed"=>0}}}}
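Rather than reading the shard counts by eye, the delete response can be checked programmatically. The sketch below assumes the response has the shape of the sample output above (`"_indices"` mapping each index name to a `"_shards"` summary); `delete_succeeded?` is a hypothetical helper, and other Elasticsearch versions may return a different structure.

```ruby
# Hypothetical helper: true when every index in a delete_by_query response
# reports no failed shards and all shards successful.
def delete_succeeded?(response)
  shard_reports = response.fetch('_indices', {}).values.map do |info|
    info.fetch('_shards', {})
  end
  # Treat a response with no per-index report as a failure.
  return false if shard_reports.empty?

  shard_reports.all? do |shards|
    shards['failed'] == 0 && shards['successful'] == shards['total']
  end
end

# Sample response copied from the output shown above.
sample = {
  'ok' => true,
  '_indices' => {
    'dpla-20150410-144958' => {
      '_shards' => { 'total' => 5, 'successful' => 5, 'failed' => 0 }
    }
  }
}
puts delete_succeeded?(sample)
```

If the check returns false, the logs in `/var/log/elasticsearch` on the production boxes remain the first place to look.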