If you haven’t already, please refer to our Quickstart Guide first to understand the basics of making API calls with justextract.it.

Prerequisites

  • A valid API key (obtained from the quickstart process)
  • Environment variable JUSTEXTRACT_API_KEY properly configured
  • Basic understanding of making API requests

Advanced Filtering

The justextract.it API supports sophisticated filtering capabilities through an array of filter objects. Each filter object can be one of the following types:

Content Type Filtering

Filter content based on specific types:

{
  "include": false,
  "content_types": ["table", "image", "text", "hyperlink"]
}

Keyword Filtering

Search for specific keywords within the document:

{
  "include": false,
  "keywords": ["specific", "terms", "to", "find"]
}

Custom Query Filtering

Use natural language queries to find relevant content:

{
  "include": false,
  "query": "Find sections discussing financial performance"
}

Page Number Filtering

Extract specific pages:

{
  "include": false,
  "pages": [1, 3, 5]
}

Page Orientation Filtering

Filter based on page orientation:

{
  "include": false,
  "orientation": "portrait" // or "landscape"
}
  • The include parameter defaults to false if not specified - Filters can be combined in any order within the filters array - Multiple instances of the same filter type can be used - Filter processing is optimized: non-AI filters (page number, orientation) are processed first - Sequential filtering: Results from one filter are passed to the next filter in the array

Example Request with Multiple Filters

{
  "url": "https://example.com/document.pdf",
  "filters": [
    {
      "pages": [1, 2, 3],
      "include": true
    },
    {
      "keywords": ["revenue", "growth"],
      "include": true
    },
    {
      "query": "Find sections discussing market analysis",
      "include": true
    }
  ]
}

Using multiple filters, especially custom queries, may impact processing time and costs. Consider optimizing your filter strategy based on your specific needs.