Comment: Rearranged sections, noted new file format

...

Files are formatted as JSON, and have the following structure:

[
    {
        ...
        "_source": { ... record ... }
        ...
    },
    ... more ...
]

This is a straight dump of an Elasticsearch index, so each element has some fields outside of "_source" that you can ignore. The record itself is the value of "_source".
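As a minimal sketch of reading this structure in Python, assuming the file has already been downloaded and decompressed into a string, something like the following should work. The function name `extract_records` and the sample fields are illustrative assumptions, not part of our tooling.

```python
import json


def extract_records(text):
    """Return the "_source" object of every element in the dump."""
    elements = json.loads(text)
    # Everything outside "_source" is index metadata and is ignored here.
    return [element["_source"] for element in elements]


# Hypothetical sample; real files hold full records and more metadata fields.
sample = '[{"_id": "a1", "_source": {"title": "Example"}}]'
records = extract_records(sample)
```

For the largest files, loading the whole array into memory may not be practical, in which case a streaming JSON parser would be a better fit than `json.loads`.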

Per the note below, we would like to switch back to a lighter-weight structure with fewer unnecessary fields some time in 2016, after we can perform a software upgrade that will make this possible.

Former file formats

If you wrote software to process our files before December 15th, 2015, it was designed to work with one of the following structures, and will need to be updated.

The first format resulted from our old method of exporting the data from CouchDB views, where each element of "rows" had a "doc" property, as follows.

{
    "total_rows": <number>,
    "rows": [
        {
            "doc": {
                ... record ...
            }
        },
        ... more rows ...
    ]
}

 

...

Prior to May 28th, 2014, every row element also included various other CouchDB-related properties alongside "doc".
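For software that still has to read archived files in this first format, a minimal Python sketch might look like the following. The function name `extract_docs` and the sample row are illustrative assumptions. Because the code reads only "doc" from each row, it is indifferent to whatever other properties a row carries.

```python
import json


def extract_docs(text):
    """Return the "doc" object from each row of a CouchDB-view-style export."""
    export = json.loads(text)
    # Other properties in a row, if any, are simply skipped.
    return [row["doc"] for row in export["rows"]]


# Hypothetical sample with one extra row property next to "doc".
sample = '{"total_rows": 1, "rows": [{"id": "a1", "doc": {"title": "Example"}}]}'
docs = extract_docs(sample)
```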

New file format

We changed the structure of our export files' JSON on July 1st, 2014. The earlier format was a legacy of the way we used to export the direct output of CouchDB views, where each element of "rows" had a "doc" property. The new format was simpler and resulted in smaller files, especially for the larger files. It was as follows:

[
    {  ... record ... },
    ... more records ...
]

We intend to change back to this format some time in 2016, pending a related software upgrade.
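Since the structure has changed more than once and is expected to change again, client software may want to tolerate all three structures described on this page. The following Python sketch shows one way to do that. The function name `iter_records` is hypothetical, and the approach assumes that a record never has a top-level "_source" field of its own.

```python
import json


def iter_records(text):
    """Yield bare records from any of the three structures on this page."""
    data = json.loads(text)
    if isinstance(data, dict):
        # CouchDB-view format: {"total_rows": ..., "rows": [{"doc": {...}}, ...]}
        for row in data["rows"]:
            yield row["doc"]
        return
    for element in data:
        if isinstance(element, dict) and "_source" in element:
            # Elasticsearch dump: the record sits under "_source".
            yield element["_source"]
        else:
            # Plain array format: the element is the record itself.
            yield element
```

A caller can then write `for record in iter_records(text): ...` without knowing when the file was exported.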

Please let us know if you have any comments or questions about the new format, using our contact form.

...