CLOUDCRUISE UNITED

LLM_VISION only

ExtractDatamodel — PDF / Canvas Viewer

Claims data is rendered on a <canvas> element, simulating a PDF viewer or canvas-based spreadsheet. There are no DOM text nodes — STATIC XPath finds nothing and LLM_DOM sees only a blank canvas tag. Only LLM_VISION (screenshot) can read the rendered content.

Ground truth (hidden from DOM — only in canvas pixels)
Claim ID  | Patient        | Amount    | Status   | Date
CLM-90210 | Martinez, Rosa | $1,245.00 | Approved | 04/15/2026
CLM-90211 | Kim, David     | $892.50   | Pending  | 04/16/2026
CLM-90212 | Brown, Alice   | $3,100.00 | Denied   | 04/17/2026
CLM-90213 | Singh, Raj     | $567.25   | Approved | 04/18/2026
Why only LLM_VISION works
  • Canvas renders pixels — the DOM contains only <canvas> with zero text nodes
  • STATIC XPath: no text to extract, returns null for every field
  • LLM_DOM: serialized HTML just shows <canvas data-testid="claims-canvas"></canvas>
  • LLM_VISION: screenshot shows the full rendered table — LLM reads it like a human would
  • Real-world equivalent: PDF.js viewer, canvas-based EHR grids, chart widgets with labels
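
The DOM-side failure described in the bullets above can be reproduced with a few lines of Python. This is only a sketch using the standard-library html.parser, not the extractor CloudCruise actually runs; the names TextNodeCollector, extract_text_nodes, and table_dom are hypothetical and exist only for illustration. It collects every text node in a serialized HTML string and shows that the canvas markup yields none.

```python
from html.parser import HTMLParser


class TextNodeCollector(HTMLParser):
    """Collects every non-whitespace text node found in an HTML string."""

    def __init__(self):
        super().__init__()
        self.text_nodes = []

    def handle_data(self, data):
        # handle_data fires once per run of text between tags.
        stripped = data.strip()
        if stripped:
            self.text_nodes.append(stripped)


def extract_text_nodes(html):
    """Return the list of text nodes in the serialized HTML."""
    collector = TextNodeCollector()
    collector.feed(html)
    collector.close()
    return collector.text_nodes


# The serialized DOM quoted in the bullets above: a canvas tag with no children.
canvas_dom = '<canvas data-testid="claims-canvas"></canvas>'

# Hypothetical contrast case (not on this page): a claim rendered as real HTML.
table_dom = "<table><tr><td>CLM-90210</td><td>Approved</td></tr></table>"

print(extract_text_nodes(canvas_dom))  # []
print(extract_text_nodes(table_dom))   # ['CLM-90210', 'Approved']
```

Any approach that relies on text nodes, whether an XPath query or an LLM reading serialized HTML, starts from that same empty list, which is why the screenshot is the only usable input here.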
Workflow node config
{
  "action": "EXTRACT_DATAMODEL",
  "parameters": {
    "execution": "LLM_VISION",
    "extract_data_model": {
      "type": "object",
      "properties": {
        "claims": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "claim_id": { "type": "string" },
              "patient":  { "type": "string" },
              "amount":   { "type": "string" },
              "status":   { "type": "string" },
              "date":     { "type": "string" }
            }
          }
        }
      }
    }
  }
}
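
Because vision extraction reads pixels, it may be worth checking the result before using it. The sketch below assumes the node returns a dict shaped like the data model above (a "claims" array of objects with five string fields); that output shape, and the helper names check_shape and normalize, are assumptions for illustration rather than documented CloudCruise behavior. Note that it treats all five fields as required, which is stricter than the schema as written.

```python
from datetime import datetime
from decimal import Decimal

REQUIRED_FIELDS = ("claim_id", "patient", "amount", "status", "date")

# Ground-truth rows copied from the table on this page.
GROUND_TRUTH = [
    {"claim_id": "CLM-90210", "patient": "Martinez, Rosa",
     "amount": "$1,245.00", "status": "Approved", "date": "04/15/2026"},
    {"claim_id": "CLM-90211", "patient": "Kim, David",
     "amount": "$892.50", "status": "Pending", "date": "04/16/2026"},
    {"claim_id": "CLM-90212", "patient": "Brown, Alice",
     "amount": "$3,100.00", "status": "Denied", "date": "04/17/2026"},
    {"claim_id": "CLM-90213", "patient": "Singh, Raj",
     "amount": "$567.25", "status": "Approved", "date": "04/18/2026"},
]


def check_shape(result):
    """Return a list of problems; an empty list means the shape looks right."""
    claims = result.get("claims")
    if not isinstance(claims, list):
        return ["'claims' must be an array"]
    problems = []
    for i, claim in enumerate(claims):
        if not isinstance(claim, dict):
            problems.append(f"claims[{i}] must be an object")
            continue
        for field in REQUIRED_FIELDS:
            # Assumption: every field is required, although the schema
            # above does not declare a "required" list.
            if not isinstance(claim.get(field), str):
                problems.append(f"claims[{i}].{field} must be a string")
    return problems


def normalize(claim):
    """Convert the string amount and MM/DD/YYYY date into typed values."""
    return {
        **claim,
        "amount": Decimal(claim["amount"].replace("$", "").replace(",", "")),
        "date": datetime.strptime(claim["date"], "%m/%d/%Y").date(),
    }


# Stand-in for a real extraction result, built from the ground truth.
result = {"claims": GROUND_TRUTH}

print(check_shape(result))  # []
print(sum(normalize(c)["amount"] for c in result["claims"]))  # 5804.75
```

Keeping amount and date as strings in the data model means the values come back exactly as rendered; converting them afterwards, as normalize does, makes a misread digit or separator surface as a parse error or a mismatch against the expected rows.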