DeepSeek-OCR-2
Advanced OCR with a novel causal vision encoder that captures reading order for enhanced structured text extraction.
Provider: DeepSeek
Context Window: 16K tokens
Type: Generation, OCR
Released: Jan 2026
Next-Generation Document OCR
DeepSeek-OCR-2 is DeepSeek's latest OCR model. It builds on DeepSeek-OCR with a novel causal vision encoder that captures reading order, which improves structured text extraction. Whether you need plain text extraction or structured markdown output with preserved headings, tables, and lists, DeepSeek-OCR-2 is built to deliver accurate results at scale.
Built for document intelligence
Plain Text Extraction
Extract raw text content from images, scans, and photos without preserving layout or formatting. Ideal for search indexing and data pipelines.
Structured Document Parsing
Convert documents to well-structured markdown preserving headings, paragraphs, lists, and tables for downstream processing.
Batch Document Processing
Process thousands of scanned documents, invoices, receipts, and forms at scale using Doubleword's batch API for maximum cost efficiency.
Flexible Pricing Tiers
Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.05 | $0.05 |
| Async | $0.08 | $0.08 |
The context window is natively supported up to 16K tokens.
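To make the per-token pricing concrete, here is a minimal sketch of a cost estimate. The prices are copied from the table above; the function name, the tier keys, and the per-page token counts in the usage example are hypothetical and only illustrate the arithmetic. Actual token usage depends on your documents.

```python
# Per-1M-token prices in USD, copied from the pricing table above.
PRICES_PER_MILLION = {
    "standard": {"input": 0.05, "output": 0.05},
    "async": {"input": 0.08, "output": 0.08},
}


def estimate_cost(input_tokens, output_tokens, tier="standard"):
    """Return the estimated cost in USD for the given token counts and tier."""
    prices = PRICES_PER_MILLION[tier.lower()]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 pages at roughly 1,500 input and
    # 800 output tokens per page (illustrative numbers, not measurements).
    pages = 10_000
    for tier in PRICES_PER_MILLION:
        cost = estimate_cost(pages * 1_500, pages * 800, tier)
        print(f"{tier}: ${cost:.2f}")
```

Because input and output are billed at the same rate within each tier, the estimate reduces to total tokens times the tier price divided by one million.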
Getting the Best Results
Plain Text Extraction
Use `Free OCR.` as the prompt when you want only the text content from the image, without preserving layout or structure.
messages = [{"role": "user", "content": [
    {"type": "text", "text": "Free OCR."},
    {"type": "image_url", "image_url": {"url": image_url}}
]}]

Structured Markdown Extraction
Use `<|grounding|>Convert the document to markdown.` as the prompt when you want to preserve headings, paragraphs, lists, and tables.
messages = [{"role": "user", "content": [
    {"type": "text", "text": "<|grounding|>Convert the document to markdown."},
    {"type": "image_url", "image_url": {"url": image_url}}
]}]

Start Building in Minutes
DeepSeek-OCR-2 is accessible via OpenAI-compatible endpoints.
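The batch workflow in this section uploads a file named batch_requests.jsonl. The following is a minimal sketch of how such a file might be assembled, assuming the endpoint follows the OpenAI Batch API line format (one JSON object per line with `custom_id`, `method`, `url`, and `body`). The model identifier `deepseek-ocr-2`, the helper names, and the `doc-N` ID scheme are hypothetical; check your provider's model list for the real ID. Whether the endpoint accepts base64 data URLs for local images is also an assumption.

```python
import base64
import json
import mimetypes

# Hypothetical model identifier; replace with the ID your provider lists.
MODEL_ID = "deepseek-ocr-2"
# Structured-markdown prompt taken from this page.
MARKDOWN_PROMPT = "<|grounding|>Convert the document to markdown."


def image_to_data_url(path):
    """Encode a local image as a base64 data URL.

    Assumes the endpoint accepts data URLs in the image_url field.
    """
    mime, _ = mimetypes.guess_type(path)
    with open(path, "rb") as fh:
        encoded = base64.b64encode(fh.read()).decode("ascii")
    return f"data:{mime or 'application/octet-stream'};base64,{encoded}"


def build_request(custom_id, image_url, prompt=MARKDOWN_PROMPT, model=MODEL_ID):
    """Return one request in the OpenAI Batch API line format."""
    return {
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ]}],
        },
    }


def write_batch_file(image_urls, path="batch_requests.jsonl"):
    """Write one JSON request per line; return the number of requests written."""
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for i, url in enumerate(image_urls):
            fh.write(json.dumps(build_request(f"doc-{i}", url)) + "\n")
            count += 1
    return count
```

A call such as `write_batch_file([image_to_data_url("scan.png")])` would produce a one-line file; the `custom_id` values are what let you match results back to source documents later.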
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1"
)

# Step 1: Upload a batch input file
with open("batch_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch"
    )
print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h"
)
print(f"Batch ID: {batch.id}")

# Step 3: Check batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")
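Step 3 checks the status once. In practice you would likely poll until the job finishes and then fetch the output. The sketch below shows one way to do that, assuming the endpoint mirrors the OpenAI Batch API: the terminal status names and the `output_file_id` field come from that API and are not confirmed by this page. The helper names `wait_for_batch` and `download_results` are hypothetical, and `client` is an `OpenAI` client configured as in the example above.

```python
import time

# Terminal states as defined by the OpenAI Batch API; assumed (not confirmed
# by this page) to apply to this endpoint as well.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(client, batch_id, poll_seconds=30.0, timeout_seconds=24 * 3600):
    """Poll client.batches.retrieve until the batch reaches a terminal status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        batch = client.batches.retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Batch {batch_id} still {batch.status!r} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)


def download_results(client, batch, out_path="batch_results.jsonl"):
    """Save the batch output file locally.

    Returns the output path, or None when the batch has no output file
    (for example, when it failed or was cancelled).
    """
    output_file_id = getattr(batch, "output_file_id", None)
    if not output_file_id:
        return None
    content = client.files.content(output_file_id)
    with open(out_path, "wb") as fh:
        fh.write(content.read())
    return out_path
```

With the client from the example above, `download_results(client, wait_for_batch(client, batch.id))` would block until the job ends and then write the results file. A fixed polling interval keeps the sketch simple; a longer interval or backoff may suit jobs with a 24h completion window better.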