This quickstart guide will help you make your first API call to justextract.it in under 5 minutes. If you are looking for detailed API specifications, please refer to our reference document instead.

Step 1: Get an API Key

Book a meeting with us first. Afterwards, you can get a paid API key from our console.

Important: You will not be able to proceed without completing this step.

Step 2: Set your environment variable

local.env
JUSTEXTRACT_API_KEY=YOUR_API_KEY
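
The key only helps if your code can read it. Below is a minimal sketch, assuming the `local.env` file from this step sits in your working directory and holds simple `KEY=VALUE` lines. `load_env_file` and `get_api_key` are hypothetical helpers written for this guide, not part of any justextract.it SDK:

```python
import os
from pathlib import Path


def load_env_file(path="local.env"):
    """Parse simple KEY=VALUE lines into a dict (hypothetical helper)."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Tolerate spaces around "=" and optional surrounding quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path="local.env"):
    """Prefer a variable already exported in the shell, then fall back to the file."""
    key = os.environ.get("JUSTEXTRACT_API_KEY")
    if key:
        return key
    if Path(path).exists():
        return load_env_file(path).get("JUSTEXTRACT_API_KEY")
    return None
```

Calling `get_api_key()` returns `None` when the key is found in neither place, so you can fail early with a clear message before sending any request.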

Step 3: Prepare a PDF file for test extraction

To speed up experimentation, we will use a publicly hosted PDF report by McKinsey to demonstrate the extraction capabilities. Alternatively, feel free to use your own publicly hosted PDF document.

Important: Google Drive links and other proprietary cloud drive URLs pointing to a PDF will not work. For privacy reasons, our service can only access raw PDFs, e.g. those hosted on S3.
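
If you bring your own document, a quick local check can catch an unsuitable link before you call the API. The sketch below is only a heuristic under stated assumptions: the host list is illustrative, the `.pdf` suffix test is a guess at what a "raw" link looks like, and neither reflects the checks the service itself performs. Both function names are hypothetical.

```python
from urllib.parse import urlparse

# Hosts that typically serve a viewer page rather than the PDF bytes.
# Illustrative assumption only, not the service's actual rule set.
VIEWER_HOSTS = {
    "drive.google.com",
    "docs.google.com",
    "www.dropbox.com",
    "onedrive.live.com",
}


def is_probably_raw_pdf_url(url):
    """Cheap pre-flight check on the URL alone (no network access)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    if parsed.hostname in VIEWER_HOSTS:
        return False
    # Assumes raw links end in ".pdf"; valid links without the suffix are rejected.
    return parsed.path.lower().endswith(".pdf")


def starts_with_pdf_magic(first_bytes):
    """Raw PDF files normally begin with the bytes '%PDF-'."""
    return first_bytes.startswith(b"%PDF-")
```

The second helper is meant for the first few bytes of a file you have already downloaded yourself: an HTML viewer page fails that check even when its URL looks plausible.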

Step 4: Copy the code below into api.js (JavaScript) or api.py (Python) to make the request

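```javascript api.js
// Requires Node.js 18+ (built-in fetch).
// Uses JUSTEXTRACT_API_KEY from the environment if it is set;
// otherwise replace the placeholder with your key.
const API_KEY = process.env.JUSTEXTRACT_API_KEY ?? "YOUR_API_KEY";

async function main() {
  const response = await fetch("https://api.justextract.it/v1/api/extract", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${API_KEY}`,
    },
    body: JSON.stringify({
      // We will use a sample PDF report from McKinsey here
      url: "https://www.mckinsey.com/~/media/mckinsey/business%20functions/quantumblack/our%20insights/the%20state%20of%20ai/2024/the-state-of-ai-in-early-2024-final.pdf",
      filters: [
        // These filter options can be left empty for now
        // Read "https://docs.justextract.it/development" for more info
        { pages: [] },
        { query: "" },
        { keywords: [] },
        { content_types: [] },
        { orientation: "portrait" },
      ],
    }),
  });

  // Surface HTTP errors instead of trying to parse an error page as JSON
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();
  // Print the parsed response so it shows up in your console
  console.log(JSON.stringify(data, null, 2));
}

main().catch(console.error);
```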

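```python api.py
import os

import requests

# Uses JUSTEXTRACT_API_KEY from the environment if it is set;
# otherwise replace the placeholder with your key.
api_key = os.environ.get('JUSTEXTRACT_API_KEY', 'YOUR_API_KEY')

response = requests.post(
    'https://api.justextract.it/v1/api/extract',
    headers={
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {api_key}',
    },
    json={
        # We will use a sample PDF report from McKinsey here
        'url': 'https://www.mckinsey.com/~/media/mckinsey/business%20functions/quantumblack/our%20insights/the%20state%20of%20ai/2024/the-state-of-ai-in-early-2024-final.pdf',
        'filters': [
            # These filter options can be left empty for now
            {'pages': []},
            {'query': ''},
            {'keywords': []},
            {'content_types': []},
            {'orientation': 'portrait'},
        ],
    },
)

# Raise on HTTP errors instead of parsing an error page as JSON
response.raise_for_status()

data = response.json()
print(data)
```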
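
The request body above is a URL plus a list of single-key filter objects. If you plan to send several requests, a small builder keeps that shape in one place. This is a sketch, not an official client: `build_extract_payload` is a hypothetical name, the key names are copied from the example request, and the values each filter accepts are not covered in this guide (see the development docs linked in the code comments).

```python
def build_extract_payload(url, pages=None, query="", keywords=None,
                          content_types=None, orientation="portrait"):
    """Assemble a request body in the shape shown above (hypothetical helper)."""
    return {
        "url": url,
        "filters": [
            # One single-key object per filter, in the same order as the example.
            {"pages": list(pages or [])},
            {"query": query},
            {"keywords": list(keywords or [])},
            {"content_types": list(content_types or [])},
            {"orientation": orientation},
        ],
    }


# With defaults only, this reproduces the empty filters from the example request.
payload = build_extract_payload("https://example.com/report.pdf")
```

The resulting dictionary can be passed straight to `requests.post(..., json=payload)`.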

Step 5: Run the code in your development environment

JavaScript
node api.js

Python
python api.py

A response should soon appear in your console.

Success Response Object

To make development as easy as possible for you, all response objects are standardized across our APIs:

example-response.json
{
  "language_detected": ["english", "vietnamese"],
  "handwritten_detected": "true",
  "extracted_data": [
    {
      "id": "1",
      "page": "1",
      "data": {
        "header": "string",
        "body": {
          "table": [
            {
              "row_id": "1",
              "content": "string"
            },
            {
              "row_id": "2",
              "content": "string"
            }
          ],
          "caption": "string"
        },
        "footer": "string"
      }
    }
  ]
}

For a full list of values contained in a response, please refer to our API reference.
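
As a rough illustration of reading this structure, the sketch below walks the example response and pulls out the table rows. It assumes real responses follow the example exactly, including numbers and booleans encoded as strings; `iter_table_rows` and `is_handwritten` are hypothetical helper names.

```python
def iter_table_rows(response_data):
    """Yield (page, row_id, content) for every table row in a response."""
    for item in response_data.get("extracted_data", []):
        page = int(item["page"])  # the example encodes numbers as strings
        table = item.get("data", {}).get("body", {}).get("table", [])
        for row in table:
            yield page, int(row["row_id"]), row["content"]


def is_handwritten(response_data):
    """The example encodes this flag as the string "true", so compare text."""
    return str(response_data.get("handwritten_detected", "")).lower() == "true"


# The example response from above, written as a Python dict.
sample = {
    "language_detected": ["english", "vietnamese"],
    "handwritten_detected": "true",
    "extracted_data": [
        {
            "id": "1",
            "page": "1",
            "data": {
                "header": "string",
                "body": {
                    "table": [
                        {"row_id": "1", "content": "string"},
                        {"row_id": "2", "content": "string"},
                    ],
                    "caption": "string",
                },
                "footer": "string",
            },
        }
    ],
}

for page, row_id, content in iter_table_rows(sample):
    print(page, row_id, content)
```

For the sample above, the loop prints `1 1 string` and `1 2 string`. Comparing the flag as text matters because any non-empty string, including `"false"`, is truthy in Python.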

To increase your extraction volume, you may request a custom pricing plan from us.