olmOCR-2-7B
Fine-tuned from Qwen2.5-VL-7B with GRPO RL training for superior math equation, table, and document OCR performance.
Provider: Ai2 (Allen Institute for AI)
Context Window: 16K tokens
Type: Generation, OCR
Released: Oct 2025
RL-Enhanced Document OCR
olmOCR-2-7B is a release of the olmOCR model, fine-tuned from Qwen2.5-VL-7B-Instruct on the olmOCR-mix-1025 dataset and then further trained with GRPO reinforcement learning to improve its handling of math equations, tables, and other difficult OCR cases. The model outputs plain text in natural reading order, with LaTeX for equations and HTML for tables.
Built for document intelligence
Natural Document Reading
Returns plain text as if reading the document naturally, which suits search, summarization, and content extraction pipelines.
Math & Table Extraction
Fine-tuned with GRPO RL training to excel at math equations, complex tables, and other tricky OCR edge cases.
Figure & Chart Detection
Automatically labels figures and charts with descriptive alt text and bounding coordinates for downstream processing.
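Because tables come back as HTML embedded in the page text, downstream code usually needs to pull them out again. The following is a minimal sketch using only Python's standard `html.parser`; the `TableExtractor` class, the `extract_tables` helper, and the sample string are illustrative assumptions, not part of olmOCR, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in a string as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(page_text):
    """Return all tables found in the model's text output."""
    parser = TableExtractor()
    parser.feed(page_text)
    parser.close()
    return parser.tables


# Made-up sample standing in for real model output.
sample_output = (
    "Quarterly results\n\n"
    "<table><tr><th>Quarter</th><th>Revenue</th></tr>"
    "<tr><td>Q1</td><td>10</td></tr></table>\n"
)
tables = extract_tables(sample_output)
print(tables)
```

Each table is returned as a list of rows, so it can be handed to a CSV writer or a dataframe constructor without further HTML handling.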
Flexible Pricing Tiers
Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.10 | $0.10 |
| Async | $0.15 | $0.15 |
Context window natively supported up to 16k tokens.
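To make the per-1M-token pricing concrete, here is a small cost estimate. The prices are copied from the table above; the `estimate_cost` helper and the per-page token counts are hypothetical and only illustrate the arithmetic.

```python
# Per-1M-token prices in USD, copied from the pricing table.
PRICES = {
    "standard": {"input": 0.10, "output": 0.10},
    "async": {"input": 0.15, "output": 0.15},
}


def estimate_cost(tier, input_tokens, output_tokens):
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES[tier]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 1,000 pages at 1,500 input and 800 output tokens each.
pages = 1000
cost = estimate_cost("standard", 1500 * pages, 800 * pages)
print(f"${cost:.2f}")
```

Actual token counts per page depend on image size and how much text the page contains, so treat the figures as placeholders.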
Getting the Best Results
Default Prompt
This model expects a prompt alongside the image. The default prompt converts equations to LaTeX, tables to HTML, and labels figures with markdown syntax.
prompt = """Attached is one page of a document that you must process.
Just return the plain text representation of this document as if
you were reading it naturally. Convert equations to LateX and
tables to HTML.
If there are any figures or charts, label them with the following
markdown syntax 
Return your output as markdown, with a front matter section on top
specifying values for the primary_language, is_rotation_valid,
rotation_correction, is_table, and is_diagram parameters"""
# image_base64 is assumed to hold the base64-encoded PNG bytes of the page image.
messages = [{"role": "user", "content": [
    {"type": "text", "text": prompt},
    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_base64}"}}
]}]

Image Processing
This model expects a single document image as input, rendered such that the longest dimension is 1288 pixels. Maintain aspect ratio for best results.
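The sizing rule above can be sketched as a small helper that computes the target dimensions before rendering or resizing. The function name `target_size` is an assumption for illustration, and the example page size is simply a US Letter page at 300 DPI.

```python
def target_size(width, height, longest=1288):
    """Scale (width, height) so the longer side equals `longest`,
    keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = longest / max(width, height)
    # Round to whole pixels; never let a side collapse to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


# US Letter page rendered at 300 DPI: 2550 x 3300 pixels.
print(target_size(2550, 3300))
```

The resulting tuple can be passed to whatever image library performs the resize, for example Pillow's `Image.resize`. Note that this sketch also scales smaller images up, since the guidance asks for a longest side of exactly 1288 pixels.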
Start Building in Minutes
olmOCR-2-7B is accessible via OpenAI-compatible endpoints.
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1"
)

# Step 1: Upload a batch input file
with open("batch_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch"
    )
print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)
print(f"Batch ID: {batch.id}")

# Step 3: Check batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")
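The upload step above assumes that `batch_requests.jsonl` already exists. Below is a minimal sketch of producing it, assuming the endpoint accepts the OpenAI Batch input format (one JSON object per line with `custom_id`, `method`, `url`, and `body`). The model identifier, the helper names, and the placeholder image bytes are hypothetical; substitute the model name your account exposes and real PNG data.

```python
import base64
import json

# Assumption: replace with the model identifier your provider expects.
MODEL_ID = "olmOCR-2-7B"  # hypothetical


def build_request(custom_id, png_bytes, prompt, model=MODEL_ID):
    """Build one batch request line for a single page image."""
    image_base64 = base64.b64encode(png_bytes).decode("ascii")
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_base64}"}},
            ]}],
        },
    }


def write_batch_file(path, requests):
    """Write requests as JSON Lines: one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for request in requests:
            f.write(json.dumps(request) + "\n")


# Placeholder bytes stand in for real rendered PNG pages.
pages = {"page-1": b"fake-png-1", "page-2": b"fake-png-2"}
requests = [
    build_request(custom_id, data, "PROMPT GOES HERE")
    for custom_id, data in pages.items()
]
write_batch_file("batch_requests.jsonl", requests)
```

Using a distinct `custom_id` per page makes it possible to match each result back to its source page once the batch completes.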