Overview
The Reader is the foundational gateway for automation. It goes beyond traditional OCR: rather than only reading the text, it converts documents (PDFs, TIFFs, scanned images, etc.) into a rich, machine-readable markup language. The output captures text, graphics, and tables along with the spatial relationships between them, giving Agents and downstream nodes a faithful digital twin of the document to work from.
Capabilities
-
Holistic Analysis
Uses proprietary models to recognize all key components, including paragraphs, tables, graphics, form fields, checkboxes, and signatures.
-
Verifiable Output
Generates a visual overlay of model confidence, guiding reviewers to areas that need the most attention.
-
Audit-Ready Output
Allows AI agents and downstream models to pinpoint the source of each piece of information, ensuring auditability.
-
Coach Integration
Learns from corrections to handle your specific proprietary layouts.
-
Universal Document Support
Processes any document format, including native PDFs, Word documents, scanned images, and documents with printed or handwritten text.
-
Model Agnostic
Plug in a proprietary OCR engine or a large Visual Language Model (VLM).
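To make the capabilities above more concrete, here is a minimal Python sketch of how a downstream node might consume Reader-style output. The Element record, its field names, and the 0.8 review threshold are illustrative assumptions, not the Reader's actual schema or API. It shows two ideas from the list: routing low-confidence elements to a reviewer, and pointing back to where an element sits on the page.

```python
from dataclasses import dataclass

# Hypothetical record for one recognized element. The field names are
# illustrative assumptions, not the Reader's actual output schema.
@dataclass(frozen=True)
class Element:
    kind: str          # e.g. "paragraph", "table", "checkbox", "signature"
    text: str
    page: int          # 1-based page number
    bbox: tuple[float, float, float, float]  # (x0, y0, x1, y1) on the page
    confidence: float  # model confidence in [0, 1]


def needs_review(elements: list[Element], threshold: float = 0.8) -> list[Element]:
    """Return elements below the confidence threshold, least confident first."""
    flagged = [e for e in elements if e.confidence < threshold]
    return sorted(flagged, key=lambda e: e.confidence)


def cite(element: Element) -> str:
    """Format a human-readable pointer to where the element sits on the page."""
    x0, y0, x1, y1 = element.bbox
    return f"page {element.page}, box ({x0:g}, {y0:g}, {x1:g}, {y1:g})"


# Made-up sample data for illustration only.
elements = [
    Element("paragraph", "Sample paragraph text", 1, (72, 90, 540, 130), 0.97),
    Element("signature", "", 3, (300, 650, 520, 700), 0.55),
    Element("checkbox", "checked", 2, (80, 400, 92, 412), 0.74),
]

review_queue = needs_review(elements)
for e in review_queue:
    print(f"{e.kind}: confidence {e.confidence:.2f} at {cite(e)}")
```

Sorting the review queue by confidence puts the least certain elements first, which mirrors the idea of guiding reviewers to the areas that need the most attention.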
Benefits
-
Unlock true automation
Move beyond simple text extraction to enable complex automation based on a document's true contextual meaning.
-
Eliminate data entry errors
Drastically reduce mistakes by understanding the relationship between fields, like which name belongs to which borrower.
-
Ensure full auditability
Enable AI agents to prove their findings by linking every piece of extracted data to its source location.
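As a rough illustration of the last two benefits, the Python sketch below keeps related fields grouped together and returns each value with its source location. The Field record, the group labels, and the page and line coordinates are hypothetical stand-ins chosen for this example; the Reader's real output format may differ.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical extracted field. "group" stands in for whatever document
# structure ties the field to an entity (here, a borrower block on a form).
@dataclass(frozen=True)
class Field:
    group: str   # e.g. "borrower_1"
    name: str    # e.g. "name", "income"
    value: str
    page: int    # 1-based page where the value was found
    line: int    # hypothetical line index on that page


def by_group(fields: list[Field]) -> dict[str, dict[str, Field]]:
    """Index fields by group so values for the same entity stay together."""
    grouped: dict[str, dict[str, Field]] = defaultdict(dict)
    for f in fields:
        grouped[f.group][f.name] = f
    return dict(grouped)


def answer_with_source(grouped: dict[str, dict[str, Field]], group: str, name: str) -> str:
    """Return a value together with the location it was read from."""
    f = grouped[group][name]
    return f"{f.value} (page {f.page}, line {f.line})"


# Made-up sample data for illustration only.
fields = [
    Field("borrower_1", "name", "Alex Example", 1, 4),
    Field("borrower_2", "name", "Sam Sample", 1, 12),
    Field("borrower_1", "income", "52000", 2, 7),
    Field("borrower_2", "income", "61000", 2, 15),
]

grouped = by_group(fields)
print(answer_with_source(grouped, "borrower_2", "income"))
```

Because every value carries its group and its location, a lookup cannot silently pair one borrower's name with the other borrower's income, and each answer can be traced back to the page it came from.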
FAQ
Standard OCR extracts text but discards its context, whereas the Reader understands that context. It knows whether a number is part of an address, a line item in a table, or the final total on an invoice. This semantic understanding is crucial for reliable automation.
Yes, the Reader allows you to bring your own OCR APIs or use Papyri's proprietary models.