Notes, for reference by the DPLA tech team.

The following dependencies work together and enable reading from and writing to S3:

...

Use either of these methods to add dependencies to an EC2 cluster:

  • Add as jar files to /home/ec2-user/spark/jars/
  • List them, comma-separated, after the --packages flag when running spark-submit

Dependencies for HarvestEntry

...

Dependencies for JsonlEntry

  • com.databricks:spark-avro_2.11:4.0.0
  • org.apache.hadoop:hadoop-aws:2.7.6
  • com.amazonaws:aws-java-sdk:1.7.4
  • org.rogach:scallop_2.11:3.0.3
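
As a rough illustration of the --packages method, the JsonlEntry coordinates above can be joined into a single comma-separated value and passed to spark-submit. The sketch below only builds and prints the command. The main class name (example.JsonlEntry) and the application jar path are hypothetical placeholders, not values from these notes, so substitute the real ones before running anything.

```python
import shlex

# Maven coordinates listed above for JsonlEntry.
JSONL_ENTRY_PACKAGES = [
    "com.databricks:spark-avro_2.11:4.0.0",
    "org.apache.hadoop:hadoop-aws:2.7.6",
    "com.amazonaws:aws-java-sdk:1.7.4",
    "org.rogach:scallop_2.11:3.0.3",
]


def build_spark_submit(packages, main_class, app_jar, app_args=()):
    """Return a spark-submit argument list.

    spark-submit expects --packages as one comma-separated string of
    Maven coordinates, so the list is joined with commas.
    """
    return [
        "spark-submit",
        "--packages", ",".join(packages),
        "--class", main_class,
        app_jar,
        *app_args,
    ]


# Hypothetical class name and jar path, for illustration only.
command = build_spark_submit(
    JSONL_ENTRY_PACKAGES,
    main_class="example.JsonlEntry",
    app_jar="/path/to/application.jar",
)

# Print a shell-quoted command line that can be reviewed before use.
print(shlex.join(command))
```

Keeping the coordinates in a list makes it easier to maintain one list per entry point and to compare versions between them.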

Dependencies for IngestRemap

...