Legacy Environment Post-Installation
This page covers the post-installation tasks that have to be performed with the platform and ingestion applications to get a system set up for ingests.
First:
- Create the dpla, dashboard, dpla_api_auth, and bulk_download databases in the CouchDB Futon control panel (e.g. http://local.dp.la:5984/_utils/index.html). This is easier than trying to do it with the rake tasks in platform, and there are no rake tasks for the dashboard and bulk_download databases. The username and password for the control panel are in the contentqa Ansible group_vars file.
This is the only step that needs to be run when building CQAi3 boxes.
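If clicking through Futon is inconvenient, the same databases can in principle be created over CouchDB's HTTP interface, where a PUT to /<name> creates a database and a 412 response means it already exists. The following is only a sketch under assumptions: the base URL is the example host from above, the credentials are placeholders you would take from the group_vars file, and the helper names are made up for illustration.

```python
"""Sketch: create the four CouchDB databases over HTTP instead of Futon."""
import base64

try:  # Python 3
    from urllib.request import Request, urlopen
    from urllib.error import HTTPError
except ImportError:  # Python 2.7
    from urllib2 import Request, urlopen, HTTPError

# Database names taken from the step above.
DATABASES = ["dpla", "dashboard", "dpla_api_auth", "bulk_download"]


def build_create_request(base_url, db_name, username, password):
    """Return a PUT request for <base_url>/<db_name> with basic auth."""
    request = Request(base_url.rstrip("/") + "/" + db_name)
    request.get_method = lambda: "PUT"  # override the default GET
    token = base64.b64encode((username + ":" + password).encode("utf-8"))
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


def create_databases(base_url, username, password):
    """Create each database; treat HTTP 412 (already exists) as success."""
    for db_name in DATABASES:
        try:
            urlopen(build_create_request(base_url, db_name, username, password))
            print("created " + db_name)
        except HTTPError as error:
            if error.code != 412:
                raise
            print("already exists: " + db_name)
```

Nothing runs on import; you would call create_databases("http://local.dp.la:5984", "<user>", "<password>") yourself with the real credentials.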
In the platform application:
Run rake tasks as api (sudo -u api -i) in /srv/www/api. (You may need to run 'rbenv shell 1.9.3-p547' first.)
    $ bundle exec rake v1:create_and_deploy_index
    $ bundle exec rake v1:recreate_repo_api_key_database  # Even though you created dpla_api_auth above; for adding a view.
    $ bundle exec rails generate delayed_job
    $ bundle exec rails generate delayed_job:active_record
    $ bundle exec rake db:migrate
Ensure that delayed_job is running if you are using the contentqa engine. (Skip this paragraph if you don't know what contentqa is or don't need it yet.) Our configuration manager (automation) installs an init script as /etc/init.d/delayed_job_api. Unfortunately, that script has no "status" command, so you can use ps aux | grep [d]elayed_job to find out if it's running. You should be able to use sudo service delayed_job_api start to start it, if necessary.
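Because the init script has no "status" command, a small script can stand in for the ps check. This is a minimal sketch, not part of the platform application: the function names are hypothetical, and it simply looks for the string delayed_job in the output of ps aux.

```python
import subprocess


def delayed_job_lines(ps_output):
    """Return the lines of `ps aux` output that mention delayed_job."""
    return [line for line in ps_output.splitlines() if "delayed_job" in line]


def delayed_job_is_running():
    """Run `ps aux` and report whether any delayed_job process shows up."""
    output = subprocess.check_output(["ps", "aux"]).decode("utf-8", "replace")
    return bool(delayed_job_lines(output))
```

The [d] bracket trick is not needed here because the check does not spawn a grep process that would match itself. It would still report a false positive if the script file itself had delayed_job in its name.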
Install pyenv on the system where you will run ingestion. If you're using our VMs, this should be on your local system, not one of the VMs.
    # run as 'ingestion' user
    git clone https://github.com/pyenv/pyenv.git ~/.pyenv
    echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
    echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
    echo 'eval "$(pyenv init -)"' >> ~/.bashrc
    exec $SHELL
Install Python 2.7.6 by running pyenv install 2.7.6. If you're on a server where the legacy ingestion system is the only Python application, make that the global default by typing pyenv global 2.7.6.
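To confirm that the shell is actually picking up the pyenv-installed interpreter, a quick check along these lines may help. It is only a sketch: the file and function names are hypothetical, and it does nothing beyond comparing the running interpreter's version with 2.7.6.

```python
import sys

EXPECTED = (2, 7, 6)  # version installed with pyenv in the step above


def matches_expected(version_info, expected=EXPECTED):
    """True if the first three components of version_info equal expected."""
    return tuple(version_info[:3]) == tuple(expected)


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    if matches_expected(sys.version_info):
        print("OK: running Python " + found)
    else:
        print("Unexpected Python " + found + "; check `pyenv version`")
```

Run it with the same python command you will use for ingestion, so that it reports the interpreter pyenv has selected.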
Install virtualenv by typing pip install virtualenv.
On a server that is dedicated to ingestion, we tend to use /v1/ingestion as the virtualenv and put the application in /v1/ingestion/ingestion via git clone. You'll need to create this /v1 directory as root, which means you should run
    sudo mkdir /v1
    sudo chmod a+rwx /v1
Create a virtualenv environment where you will install the ingestion application. The example below shows where we put it on our dedicated ingestion server, but the location is really up to you if you're doing this locally. To configure and set up the virtualenv:
    $ virtualenv /v1/ingestion
    $ source /v1/ingestion/bin/activate
Then, cd into /v1/ingestion and clone the ingestion application with
git clone https://github.com/dpla/ingestion.git
In the ingestion application:
- Install the necessary Python packages by running pip install -r requirements.txt in /v1/ingestion/ingestion.
- In ingestion, edit your akara.ini file, as suggested at https://github.com/dpla/ingestion. If you need to run the contentqa engine, set SyncQAViews=True.
- Please ensure that your [Twofishes] configuration is correct and uses the IP address (not the host name!) of the geo-prod box. This is a temporary patch until a long-term solution using either a hosts file or DNS is put in place for these "stand-alone" boxes, which are not truly standalone anymore since they depend on an external Twofishes server.
- Run python setup.py install
- Create the /v1/ingestion/ingestion/logs directory: mkdir logs
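The [Twofishes] requirement from the list above (IP address, not host name) is easy to get wrong, so a rough check of akara.ini could look like the sketch below. The option name Url and the port in the usage note are assumptions made for illustration; the sketch does not rely on any particular option name and simply inspects every URL-valued option in the section. Values without a scheme such as http:// are not checked.

```python
import socket

try:  # Python 3
    from configparser import RawConfigParser
    from urllib.parse import urlparse
    from io import StringIO
except ImportError:  # Python 2.7
    from ConfigParser import RawConfigParser
    from urlparse import urlparse
    from StringIO import StringIO


def is_ipv4_literal(host):
    """True if host is a dotted-quad IPv4 address rather than a name."""
    try:
        socket.inet_pton(socket.AF_INET, host)
        return True
    except (socket.error, TypeError, ValueError):
        return False


def hostname_options(config_text, section="Twofishes"):
    """Return {option: host} for URL-valued options whose host is not an IP."""
    parser = RawConfigParser()
    parser.optionxform = str  # keep option names exactly as written
    reader = getattr(parser, "read_file", None) or parser.readfp
    reader(StringIO(config_text))
    problems = {}
    for option, value in parser.items(section):
        host = urlparse(value).hostname
        if host and not is_ipv4_literal(host):
            problems[option] = host
    return problems
```

For example, passing the text "[Twofishes]\nUrl = http://geo-prod:8081/\n" would flag the hypothetical Url option, while the same option pointing at an IP address would not be reported. A file without a [Twofishes] section raises NoSectionError.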
Run sync_couch_views.py:
    $ python scripts/sync_couch_views.py dpla
    $ python scripts/sync_couch_views.py dashboard
    $ python scripts/sync_couch_views.py bulk_download
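The three sync commands can also be driven from one small wrapper, which is handy when a box is rebuilt often. This is a sketch only: the function names are hypothetical, and the default working directory assumes the /v1/ingestion/ingestion layout of a dedicated ingestion server.

```python
import subprocess

# Database names from the commands above.
VIEW_DATABASES = ["dpla", "dashboard", "bulk_download"]


def sync_commands(databases=VIEW_DATABASES, python="python"):
    """Build one sync_couch_views.py command line per database."""
    return [[python, "scripts/sync_couch_views.py", name] for name in databases]


def sync_all(cwd="/v1/ingestion/ingestion"):
    """Run each command in the ingestion checkout, stopping on first failure."""
    for command in sync_commands():
        subprocess.check_call(command, cwd=cwd)
```

Call sync_all() with the virtualenv activated so that python resolves to the interpreter that has the ingestion packages installed.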