Retrieving A Full Extract
Full extracts of the Acuris Risk Intelligence content set are generated once per month, on the 1st of the month, and are always available from 00:00 BST. Below is an example curl request for the individuals extract:
The response will include a link as well as a timestamp:
- The link returned is a URL that is valid for 15 minutes and will allow you to access the extract. When using the link, you don’t have to provide your API key in the header.
- The timestamp indicates when the full extract was generated. Store it persistently, as it is critical for subsequent delta processing.
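Handling these two values might look like the following Python sketch. The response field names `link` and `timestamp`, the state-file layout, and both function names are assumptions made for illustration; check them against the actual response body before relying on them.

```python
import json
import shutil
import urllib.request
from pathlib import Path


def save_extract_timestamp(response: dict, state_file: Path) -> str:
    """Persist the extract's generation timestamp for later delta processing."""
    timestamp = response["timestamp"]  # assumed field name
    state_file.write_text(
        json.dumps({"full_extract_timestamp": timestamp}), encoding="utf-8"
    )
    return timestamp


def download_extract(link: str, destination: Path) -> None:
    """Stream the extract to disk using the returned link.

    No API key header is sent, since the link itself grants access.
    Call this promptly: the link is only valid for 15 minutes.
    """
    with urllib.request.urlopen(link) as resp, destination.open("wb") as out:
        shutil.copyfileobj(resp, out)  # copies in chunks, not all at once
```

Saving the timestamp before starting the download means it is not lost if the transfer fails and has to be retried with a fresh link.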
The downloaded file is GZIP compressed and must be decompressed before use. Its contents are in JSON Lines format, which means each line of the file is a fully parsable JSON document; each line therefore contains a full ARI profile. The main benefit of this format is that you can parse and load records one at a time, which conserves memory. Note that the file is UTF-8 encoded. The schema for the documents is the same as the profile schema defined in the delta endpoint.
Important: If you download the full extract using a browser, the browser will automatically unzip the file but will not remove the “.gz” filename suffix. This is a known issue we see in most browsers. We suggest you manually rename the file to remove the “.gz” extension.
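If you process the file programmatically, one way to guard against a misleading ".gz" suffix is to check the file's first two bytes rather than trusting its name. The sketch below assumes a local file path; `is_gzip` and `open_extract` are illustrative helper names, not part of the API.

```python
import gzip
from pathlib import Path
from typing import IO

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def is_gzip(path: Path) -> bool:
    """Return True if the file content is gzip data, whatever its name."""
    with path.open("rb") as f:
        return f.read(2) == GZIP_MAGIC


def open_extract(path: Path) -> IO[str]:
    """Open the extract as UTF-8 text, compressed or already decompressed."""
    if is_gzip(path):
        return gzip.open(path, "rt", encoding="utf-8")
    return path.open("r", encoding="utf-8")
```

A JSON Lines file begins with `{`, so it cannot be mistaken for gzip data by this check.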
Here is a sample pseudo-code snippet that outlines how you should parse the JSON Lines file once it has been downloaded:
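As a minimal Python sketch of this line-by-line approach, assuming the extract has been saved locally and is still GZIP compressed (the function name `iter_profiles` is illustrative, not part of the API):

```python
import gzip
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def iter_profiles(path: Path) -> Iterator[Dict[str, Any]]:
    """Yield one parsed profile per line without loading the whole file."""
    # "rt" decompresses on the fly and decodes the bytes as UTF-8 text.
    with gzip.open(path, "rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # tolerate a trailing blank line
                yield json.loads(line)
```

Because the function is a generator, a caller can loop over it and write each profile to a database as it arrives, so only one record is held in memory at a time.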
Note: You should retrieve the full extract every month and reconcile it against your database to ensure the two stay in sync.