Smarter Extraction: Filtered, Focused, and Context-Aware

We’ve added three capabilities to the Extraction API that give you more control over what gets extracted, and why.

🧠 What’s New?

You can now:

- Filter extraction results using natural language
  Specify the kind of information you want (e.g., “only Air Fare transactions”) and let the model do the work.
- Filter using JsonPath expressions
  For more programmatic control, use JsonPath to surgically include or exclude content.
- Contextualize extraction with objectives and reference data
  Provide the “why” behind the extraction. This helps the model reason more accurately using guidance like known passengers or category mappings.

✨ Example 1: Natural Language Filtering

If your goal is to extract only Air Fare transactions, you can specify this intent directly:

"filters": [
  {
    "type": "llm",
    "instructions": "Only keep Air Fare transactions."
  }
]

The model will return only transactions like:

{
  "description": "Flight Montreal to Moscow",
  "category": "Air Fare"
}

It filters out entries such as hotel stays and reports the reason for each:

{
  "jsonPath": "$.transactions[2]",
  "reasoning": "Filtered out because the category is 'Hotel', not 'Air Fare'."
}
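
If you assemble requests in code, a small helper keeps the filter entries consistent. The sketch below is only an illustration: the "type" and "instructions" fields come from the example above, while the helper name llm_filter and the variable request_fragment are made up for this post, and how the fragment fits into a full request body is not shown here.

```python
import json


def llm_filter(instructions: str) -> dict:
    """Build a natural-language filter entry shaped like the example above."""
    if not instructions.strip():
        raise ValueError("instructions must not be empty")
    return {"type": "llm", "instructions": instructions}


# Hypothetical request fragment: only the "filters" key is taken from this post.
request_fragment = {"filters": [llm_filter("Only keep Air Fare transactions.")]}

print(json.dumps(request_fragment, indent=2))
```

Rejecting empty instructions up front is a design choice of this sketch, not a documented API rule.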

🔍 Example 2: JsonPath Filtering

Prefer something more deterministic? Use a JsonPath filter like this:

"filters": [
  {
    "type": "jsonPath",
    "instructions": "Only keep Air Fare",
    "jsonPath": "$.transactions[?(@.category=='Air Fare')]"
  }
]

You’ll get the same filtered result, but the expression states exactly which entries are kept.
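
To see what that expression selects, here is a local stand-in written in plain Python. It is a sketch, not the API's implementation: the function select_by_category and the sample document are invented for illustration, and the list comprehension simply mirrors the predicate `@.category=='Air Fare'` applied to each item under `transactions`.

```python
import json

# Sample data shaped like the examples in this post (values are illustrative).
document = {
    "transactions": [
        {"description": "Flight Montreal to Moscow", "category": "Air Fare"},
        {"description": "Deluxe Suite Hotel Booking", "category": "Hotel"},
        {"description": "Carbon Offset", "category": "Other"},
    ]
}


def select_by_category(doc: dict, category: str) -> list:
    """Local equivalent of $.transactions[?(@.category=='<category>')]."""
    return [
        item
        for item in doc.get("transactions", [])
        if item.get("category") == category
    ]


kept = select_by_category(document, "Air Fare")
print(json.dumps(kept, indent=2))
```

Only the flight entry survives, because the comparison is an exact string match on the category field.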

🧭 Example 3: Contextual Extraction

Sometimes you’re extracting for a specific purpose, such as a refund request. You can now provide an objective and supporting reference data to guide the result:

"context": {
  "objective": "Support a refund request by extracting details only for known affected passengers...",
  "referenceData": {
    "knownPassengers": ["Mathieu Isabel", "Arthur Isabel"],
    "categoryMapping": {
      "Flight": "Air Fare",
      "Deluxe Suite": "Hotel",
      "Carbon Offset": "Other"
    }
  }
}

With this context, the Extraction API returns only entries that align with the intent. For example:

{
  "passenger": "Mathieu Isabel",
  "description": "Flight from Montreal to Moscow",
  "category": "Air Fare",
  "amount": "985.31"
}

And it will filter out unrelated data like:

{
  "passenger": "Marie Isabel",
  "description": "Deluxe Suite Hotel Booking",
  "reasoning": "Filtered out because Marie Isabel is not a known affected passenger."
}

By combining your business intent with structured knowledge, the API adapts the extraction to what actually matters for your workflow.
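
To make the idea concrete, here is a deterministic stand-in for the decision the reference data supports. The API relies on model reasoning, so treat this only as a sketch of the intent: the functions categorize and triage, the keyword matching rule, and the sample entries are assumptions made for this illustration.

```python
# Reference data copied from the context example above.
reference_data = {
    "knownPassengers": ["Mathieu Isabel", "Arthur Isabel"],
    "categoryMapping": {
        "Flight": "Air Fare",
        "Deluxe Suite": "Hotel",
        "Carbon Offset": "Other",
    },
}

# Illustrative entries modeled on the outputs shown in this post.
entries = [
    {"passenger": "Mathieu Isabel",
     "description": "Flight from Montreal to Moscow", "amount": "985.31"},
    {"passenger": "Marie Isabel",
     "description": "Deluxe Suite Hotel Booking"},
]


def categorize(description, mapping):
    """Return the category of the first mapping keyword found in the description."""
    lowered = description.lower()
    for keyword, category in mapping.items():
        if keyword.lower() in lowered:
            return category
    return None


def triage(items, ref):
    """Split items into kept and filtered lists based on known passengers."""
    known = set(ref["knownPassengers"])
    kept, filtered = [], []
    for item in items:
        if item["passenger"] in known:
            category = categorize(item["description"], ref["categoryMapping"])
            kept.append({**item, "category": category})
        else:
            reason = (f"Filtered out because {item['passenger']} "
                      "is not a known affected passenger.")
            filtered.append({**item, "reasoning": reason})
    return kept, filtered


kept, filtered = triage(entries, reference_data)
print(kept)
print(filtered)
```

A model can handle fuzzier cases than this keyword match, such as misspelled names or descriptions that never mention the mapped term, which is the reason to pass the context to the API rather than hard-code rules.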

🧪 Try It Out

These capabilities make the Extraction API more precise and more aligned with your goals. Whether you’re building claim workflows, reconciliation pipelines, or document understanding tools, you can now decide what is kept and see why the rest was filtered out.

Go check out the Content API on RapidAPI and give the extraction capability a try!

Dretza