lift

Multimodal · by Datalab

Datalab's multimodal model for extracting structured JSON data from PDFs and document images.

Model Card

lift is a structured extraction model from Datalab that pulls structured JSON out of PDFs and images. Pass any JSON schema and lift returns a JSON object matching it, using schema-constrained decoding to guarantee valid, well-typed output.

Try lift in the free playground, or use the hosted API for higher accuracy, per-field verification, and citations.

Features

  • Extract structured data from documents
  • Pass any JSON schema
  • Handles multi-page documents in a single pass, including values that span pages
  • Two inference modes: local (HuggingFace) and remote (vLLM server)
  • CLI for single files, inline schemas, or whole directories
  • Schema Studio: a Streamlit app to build, save, and test schemas against your documents

Quickstart

pip install lift-pdf

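# With vLLM (recommended, lightweight install)
# Start the vLLM server first, then run the extraction against it
lift_vllm
lift_extract input.pdf ./output --schema schema.json

# With HuggingFace (requires torch)
# Quote the extra so shells such as zsh do not expand the brackets
pip install "lift-pdf[hf]"
lift_extract input.pdf ./output --schema schema.json --method hf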

A schema is standard JSON Schema. Keep it simple — string, number, integer, boolean, arrays of those, arrays of objects, and nested objects are all supported. Write a description for any field whose name isn't self-explanatory, and mark a field required only when it must appear; fields genuinely absent from a document come back null.

{
  "type": "object",
  "properties": {
    "invoice_number": {"type": "string", "description": "Invoice identifier"},
    "total": {"type": "number", "description": "Total amount due"},
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "amount": {"type": "number"}
        }
      }
    }
  },
  "required": ["invoice_number", "total"]
}
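
Because absent fields come back null, it can help to check an extraction for required fields that ended up empty before using it downstream. The sketch below is not part of lift: `null_required_fields` is a hypothetical helper, and the sample result is made up for illustration. Whether lift can actually return null for a required field is not stated here, so treat the check as a defensive assumption.

```python
# The invoice schema from above, written as a Python dict.
schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string", "description": "Invoice identifier"},
        "total": {"type": "number", "description": "Total amount due"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
            },
        },
    },
    "required": ["invoice_number", "total"],
}


def null_required_fields(result, schema):
    """Hypothetical helper: list required top-level fields that are missing or null."""
    return [name for name in schema.get("required", []) if result.get(name) is None]


# Made-up extraction result, for illustration only.
sample_result = {"invoice_number": "INV-001", "total": None, "line_items": []}
flagged = null_required_fields(sample_result, schema)
print(flagged)  # ['total']
```

The helper only inspects top-level fields; nested objects and array items would need a recursive walk.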

Usage

With vLLM (recommended)

from lift import extract
from lift.model import InferenceManager

# Start the vLLM server first with: lift_vllm
model = InferenceManager(method="vllm")
result = extract("document.pdf", "schema.json", model=model)
print(result.extraction)

With HuggingFace Transformers

from lift import extract
from lift.model import InferenceManager

# Loads datalab-to/lift in-process (requires: pip install lift-pdf[hf])
model = InferenceManager(method="hf")
result = extract("document.pdf", "schema.json", model=model)
print(result.extraction)

extract accepts the schema as a dict, a path to a .json file, an inline JSON string, or the name of a saved schema. Pass page_range="0-5" to limit PDF pages, and set VLLM_API_BASE to target a remote server.
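
To make the four accepted schema forms concrete, here is a minimal sketch of how such an argument could be resolved to a dict. It is not lift's implementation: `resolve_schema`, the `saved_dir` parameter, and the order of the checks are all assumptions made for illustration, and the card does not say where saved schemas live.

```python
import json
from pathlib import Path


def resolve_schema(schema, saved_dir="schemas"):
    """Hypothetical resolver: return a schema dict from any of the four input forms."""
    # 1. Already a dict: use it as-is.
    if isinstance(schema, dict):
        return schema

    text = str(schema).strip()

    # 2. Inline JSON string: a JSON object starts with "{".
    if text.startswith("{"):
        return json.loads(text)

    # 3. Path to a .json file on disk.
    path = Path(text)
    if path.suffix == ".json" and path.is_file():
        return json.loads(path.read_text(encoding="utf-8"))

    # 4. Name of a saved schema (the storage directory is an assumption).
    saved = Path(saved_dir) / f"{text}.json"
    if saved.is_file():
        return json.loads(saved.read_text(encoding="utf-8"))

    raise ValueError(f"Could not resolve schema: {schema!r}")


inline = resolve_schema('{"type": "object", "properties": {}}')
print(inline["type"])  # object
```

Checking for an inline JSON string before touching the filesystem keeps a long schema string from being interpreted as a path.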

Benchmarks

Evaluated on a 225-document extraction benchmark (6–64 pages per document, ~11,000 scored fields) with adversarial cases planted throughout: cross-page values, exhaustive lists, fields that must be left null, near-miss distractors, multi-source aggregation. Scoring is deterministic exact-match against ground truth (numeric tolerance, normalized strings).

All models receive the same rendered page images, and extract each document in a single pass.

| Model | Size | Field accuracy | Full-document accuracy | Median latency* | Features |
|---|---|---|---|---|---|
| Datalab API | - | 95.9% | 44.4% | 30.8s | Citations + Verification |
| Gemini Flash 3.5 | - | 91.3% | 40.0% | 28.1s | |
| lift | 9B | 90.2% | 20.9% | 9.5s | |
| Azure Content Understanding | - | 83.4% | 22.2% | 73.7s | |
| NuExtract3 | 4B | 81.5% | 8.4% | 8.3s | |
| Qwen3.5-9B | 9B | 76.3% | 24.0% | 16.8s | |

* Per document, 8 concurrent requests. Local models (lift, Qwen3.5-9B, NuExtract3) served with vLLM on a single GPU; Gemini, Datalab, and Azure via API. Latency varies with hardware and load — treat as relative, not absolute.

  • Field accuracy — fraction of individual schema fields extracted correctly.
  • Full-document accuracy — fraction of documents where every field is correct.
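
The two metrics can be sketched as follows. This is an illustration under stated assumptions, not the benchmark's scoring code: the normalization (lowercasing and whitespace collapsing), the tolerance value, and the flat field layout are all assumed, and the two sample documents are made up.

```python
import math


def normalize(value):
    """Lowercase and collapse whitespace in strings; leave other values unchanged."""
    if isinstance(value, str):
        return " ".join(value.lower().split())
    return value


def field_matches(predicted, expected, rel_tol=1e-6):
    """Exact match after normalization, with a small tolerance for numbers."""
    if isinstance(expected, bool) or isinstance(predicted, bool):
        return predicted is expected
    if isinstance(expected, (int, float)) and isinstance(predicted, (int, float)):
        return math.isclose(predicted, expected, rel_tol=rel_tol)
    return normalize(predicted) == normalize(expected)


def score(documents):
    """documents: list of (predicted, expected) pairs of flat field dicts."""
    correct_fields = total_fields = perfect_docs = 0
    for predicted, expected in documents:
        matches = [field_matches(predicted.get(k), v) for k, v in expected.items()]
        correct_fields += sum(matches)
        total_fields += len(matches)
        perfect_docs += all(matches)  # a document counts only if every field is right
    return correct_fields / total_fields, perfect_docs / len(documents)


# Made-up predictions and ground truth, for illustration only.
docs = [
    ({"invoice_number": "INV-001 ", "total": 120.0}, {"invoice_number": "inv-001", "total": 120}),
    ({"invoice_number": "INV-002", "total": 80.0}, {"invoice_number": "INV-002", "total": 85.0}),
]
field_acc, doc_acc = score(docs)
print(field_acc, doc_acc)  # 0.75 0.5
```

A single wrong field fails the whole document, which is why full-document accuracy sits far below field accuracy: with about 11,000 scored fields over 225 documents, each document carries roughly 49 fields on average.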

Hosted models with verification, citations, and confidence scores are available via the Datalab API — test in the playground.

Commercial Usage

The code is licensed under Apache 2.0. The model weights use a modified OpenRAIL-M license: they are free for research, personal use, and startups with under $5M in funding or revenue, and they cannot be used to compete with our API. For broader commercial licensing, see pricing.

Credits

Author: Datalab
Organization: datalab-to

Details

Downloads: 516
Likes: 95
Access: Open Source
Task: image-text-to-text
Parameters: 9.7B
Trending: 94
License: openrail
Library: transformers
Created: Jun 19, 2026
Updated: Jun 19, 2026