Automation Project

Overview

This document describes the collection of Ansible playbooks and supporting files in our GitHub repository, https://github.com/dpla/automation.

Our Infrastructure Migration and Automation Initiative page, which dates from 2015, states the goals of the DPLA Automation project. This document elaborates on the current state of affairs, especially with regard to participation by external parties.

Project Utility Outside of the DPLA

The VirtualBox virtual machines that automation creates and provisions when you follow the instructions in the README provide an application stack that represents the DPLA's legacy technology. Recent advances in our Ingestion 3 project and in some of our supporting infrastructure are not represented. The automation VMs also don't contain our Ingestion 1 system, which is needed to get data into the search index so that the Frontend and API applications can function. The README mentions this, but it bears repeating here.

As of June 2017, we don't have anything that would help you explore our Ingestion 3 system or see where we're headed with the revisions that we're currently discussing internally for our frontend sites in general. The frontend webapps installed on the automation VMs are slated for replacement.

Requirements for Implementation

The automation project requires some familiarity and comfort with Unix-like operating systems in general. The nature of the applications that it attempts to provision and configure also demands a willingness to get your hands dirty, tinker, and troubleshoot. The legacy DPLA applications (especially frontend, platform (API), and ingestion) predate automation, and they are very old applications that are likely to break because of dependency deprecation issues. For both reasons, we can't provide a completely smooth installation process for the whole suite of applications. You should expect to spend some time troubleshooting and figuring out how to get these applications installed, and how to set up our legacy ingestion app to ingest data for the frontend and API to work with.

The WordPress, Exhibitions, and Primary Source Sets sites that are installed on the VMs are only minimally functional. Because of time and allocation constraints, we can't provide sample data for these content-management sites, and the WordPress site comes from a private Git repository because of asset copyright restrictions. You'll end up with very sparse installations of these applications that won't look like the sites you see on https://dp.la/. To see them working on your VMs, you'll need to fill in your own data, learn about Omeka if necessary, and tinker with Primary Source Sets.