mathieu.isabel  

Smarter Extraction: Filtered, Focused, and Context-Aware

We’ve added three capabilities to the Extraction API that give you more control over what gets extracted, and why.

🧠 What’s New?

You can now:

  • Filter extraction results using natural language
    Specify the kind of information you want (e.g., “only Air Fare transactions”) and let the model do the work.
  • Filter using JsonPath expressions
    For more programmatic control, use JsonPath to surgically include or exclude content.
  • Contextualize extraction with objectives and reference data
    Provide the “why” behind the extraction. This helps the model reason more accurately using guidance like known passengers or category mappings.

✨ Example 1: Natural Language Filtering

If your goal is to extract only Air Fare transactions, you can specify this intent directly:

"filters": [
  {
    "type": "llm",
    "instructions": "Only keep Air Fare transactions."
  }
]

The model will return only transactions like:

{
  "description": "Flight Montreal to Moscow",
  "category": "Air Fare"
}

…and filter out entries like hotel stays, with reasoning like:

{
  "jsonPath": "$.transactions[2]",
  "reasoning": "Filtered out because the category is 'Hotel', not 'Air Fare'."
}
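The `filters` entry from this example can also be assembled in code rather than written by hand. The following is a minimal Python sketch, not an official client: `build_llm_filter` is a hypothetical helper name, and because the article shows only the `filters` fragment, the rest of the request body (document input, endpoint, authentication) is deliberately left out.

```python
import json


def build_llm_filter(instructions: str) -> dict:
    """Return a filter entry shaped like the natural-language example above.

    Hypothetical helper: only the "type" and "instructions" keys come from
    the article; nothing else about the request is assumed.
    """
    return {"type": "llm", "instructions": instructions}


# Only the "filters" fragment is built here; the remaining request fields
# are not shown in the article, so they are omitted.
request_fragment = {
    "filters": [build_llm_filter("Only keep Air Fare transactions.")]
}

print(json.dumps(request_fragment, indent=2))
```

Keeping the instruction text in one place like this makes it easier to reuse the same intent across several extraction calls.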

🔍 Example 2: JsonPath Filtering

Prefer something more deterministic? Use a JsonPath filter like this:

"filters": [
  {
    "type": "jsonPath",
    "instructions": "Only keep Air Fare",
    "jsonPath": "$.transactions[?(@.category=='Air Fare')]"
  }
]

You’ll get the same filtered result, but the selection is made by an explicit expression rather than by the model’s judgment.
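To see what the expression `$.transactions[?(@.category=='Air Fare')]` selects, here is a plain-Python equivalent that needs no JsonPath library. It is only a local illustration of the predicate, not how the API evaluates it; the sample transactions and the `keep_by_category` helper are made up for this sketch.

```python
# Illustrative sample data (not real API output).
transactions_doc = {
    "transactions": [
        {"description": "Flight Montreal to Moscow", "category": "Air Fare"},
        {"description": "Carbon Offset", "category": "Other"},
        {"description": "Deluxe Suite", "category": "Hotel"},
    ]
}


def keep_by_category(doc: dict, category: str) -> list:
    """Mimic $.transactions[?(@.category=='<category>')] with a list comprehension."""
    return [t for t in doc["transactions"] if t.get("category") == category]


kept = keep_by_category(transactions_doc, "Air Fare")

# Indices of the entries the predicate rejects, written as JsonPath locations.
rejected_paths = [
    f"$.transactions[{i}]"
    for i, t in enumerate(transactions_doc["transactions"])
    if t.get("category") != "Air Fare"
]

print(kept)
print(rejected_paths)
```

Because the predicate is a strict equality test, an entry categorized as `Airfare` or `air fare` would not match. That is the trade-off against the natural-language filter: predictable behavior, but no tolerance for variation in the data.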

🧭 Example 3: Contextual Extraction

Sometimes you’re extracting for a specific purpose, such as a refund request. You can now provide an objective and supporting reference data to influence the result:

"context": {
  "objective": "Support a refund request by extracting details only for known affected passengers...",
  "referenceData": {
    "knownPassengers": ["Mathieu Isabel", "Arthur Isabel"],
    "categoryMapping": {
      "Flight": "Air Fare",
      "Deluxe Suite": "Hotel",
      "Carbon Offset": "Other"
    }
  }
}

With this context, the Extraction API returns only entries that align with the intent. For example:

{
  "passenger": "Mathieu Isabel",
  "description": "Flight from Montreal to Moscow",
  "category": "Air Fare",
  "amount": "985.31"
}

And it will filter out unrelated data like:

{
  "passenger": "Marie Isabel",
  "description": "Deluxe Suite Hotel Booking",
  "reasoning": "Filtered out because Marie Isabel is not a known affected passenger."
}

By combining your business intent with structured knowledge, the API adapts the extraction to what actually matters for your workflow.
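As a rough mental model of what the reference data contributes, the sketch below applies the same `knownPassengers` list and `categoryMapping` deterministically in Python. This is an assumption-laden approximation for illustration only: the API relies on a model to reason over the objective, whereas the `categorize` and `split_by_known_passengers` helpers here are hypothetical and use simple string matching.

```python
from typing import Optional

# Reference data copied from the context example above.
reference_data = {
    "knownPassengers": ["Mathieu Isabel", "Arthur Isabel"],
    "categoryMapping": {
        "Flight": "Air Fare",
        "Deluxe Suite": "Hotel",
        "Carbon Offset": "Other",
    },
}

# Illustrative extracted entries, modeled on the article's examples.
entries = [
    {"passenger": "Mathieu Isabel",
     "description": "Flight from Montreal to Moscow",
     "amount": "985.31"},
    {"passenger": "Marie Isabel",
     "description": "Deluxe Suite Hotel Booking"},
]


def categorize(description: str, mapping: dict) -> Optional[str]:
    """Return the category of the first mapping keyword found in the description."""
    for keyword, category in mapping.items():
        if keyword.lower() in description.lower():
            return category
    return None


def split_by_known_passengers(items: list, reference: dict) -> tuple:
    """Keep entries for known passengers; record a reason for every other entry."""
    known = set(reference["knownPassengers"])
    kept, filtered_out = [], []
    for item in items:
        if item["passenger"] in known:
            category = categorize(item["description"], reference["categoryMapping"])
            kept.append({**item, "category": category})
        else:
            filtered_out.append({
                "passenger": item["passenger"],
                "description": item["description"],
                "reasoning": (
                    f"Filtered out because {item['passenger']} "
                    "is not a known affected passenger."
                ),
            })
    return kept, filtered_out


kept, filtered_out = split_by_known_passengers(entries, reference_data)
print(kept)
print(filtered_out)
```

A hard-coded rule like this breaks as soon as a name is spelled differently or a description uses an unmapped term, which is the kind of case where supplying an objective alongside the reference data is meant to help.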

🧪 Try It Out

These capabilities make the Extraction API more precise and better aligned with your goals. Whether you’re building claim workflows, reconciliation pipelines, or document understanding tools, you can now control both what gets extracted and why.

Go check out the Content API on RapidAPI and give the extraction capability a try!
