Doubleword
    Vision-Language
    MoE Architecture
    Open Weights

    Qwen3-VL-235B-A22B

    Frontier-level multimodal intelligence for complex visual reasoning, GUI automation, and sophisticated coding.

    Total Parameters

    235B

    22B Activated

    Context Window

    262K

    Tokens

    Modalities

    Text, Image

    & Video

    Max Output

    16,384

    Tokens

    About

    Frontier-Level Vision & Language Unification

    Qwen3-VL-235B-A22B Instruct is a massive open-weight multimodal model that unifies strong text generation with deep visual understanding. Delivering performance comparable to GPT-5 Chat and Claude 4 Opus Thinking, it excels at advanced reasoning, complex code generation, and 2D/3D spatial grounding. Whether you need multilingual document parsing, complex multi-image dialogue, or visual coding workflows, Qwen3-VL provides maximum intelligence for your most demanding applications where quality is paramount.


    Vision-Language MoE — 22B active / 235B total

    Use Cases

    Built for visual intelligence

    Advanced Visual Perception

    Excels at general vision-language tasks including VQA, chart/table extraction, document parsing, and robust recognition of real-world and synthetic categories.

    Agentic GUI Automation

    Operates GUI elements for automation tasks, aligns text to video timelines for precise temporal queries, and follows complex instructions over multi-turn dialogues.

    Visual Coding Workflows

    Transforms UI mockups and whiteboard sketches directly into functional code and actively assists with UI debugging and spatial reasoning.

    Frontier-Level Reasoning

    Delivers text-only performance comparable to flagship proprietary models, making it ideal for sophisticated mathematical analysis and complex project execution.

    Benchmarks

    Multimodal Intelligence

    Proven performance across reasoning, coding, and agentic workflows.

    20.8

    Overall Intelligence

    Better than 48% of models

    16.5

    Coding Capability

    Better than 50% of models

    19.1

    Agentic Capability

    Better than 49% of models

    Category  | Benchmark           | Score
    Reasoning | GPQA Diamond        | 71.2%
    Reasoning | τ²-Bench Telecom    | 35.1%
    Reasoning | IFBench             | 42.7%
    Reasoning | AA-LCR              | 31.7%
    Reasoning | GDPval-AA           | 8.1%
    Reasoning | HLE                 | 6.3%
    Reasoning | CritPt              | 0.0%
    Coding    | SciCode             | 35.9%
    Coding    | Terminal-Bench Hard | 6.8%
    Knowledge | AA-Omniscience      | 20.2%

    Metrics sourced from Artificial Analysis.

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    Tier     | Input / 1M tokens | Output / 1M tokens
    Standard | $0.10             | $0.40
    Async    | $0.15             | $0.55
    Realtime | $0.60             | $1.20

    The context window natively supports up to 262K tokens.
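To see how the per-1M-token rates translate into a bill, here is a minimal cost sketch using the Standard-tier prices from the table above; the token counts in the example are hypothetical.

```python
# Standard-tier rates from the pricing table, in USD per 1M tokens.
STANDARD_INPUT_PER_M = 0.10
STANDARD_OUTPUT_PER_M = 0.40

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at Standard-tier rates."""
    return (input_tokens / 1_000_000) * STANDARD_INPUT_PER_M \
         + (output_tokens / 1_000_000) * STANDARD_OUTPUT_PER_M

# e.g. a 250k-token document plus a 10k-token response:
print(round(estimate_cost(250_000, 10_000), 4))  # → 0.029
```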

    Quickstart

    Start Building in Minutes

    Qwen3-VL-235B-A22B is accessible via OpenAI-compatible endpoints. The example below uses the standard OpenAI Python SDK, pointed at Doubleword, to submit an asynchronous batch job.

    Developer Tip: Recommended Sampling Parameters

    To get the best results and avoid runaway repetition, the Qwen team recommends: temperature=0.7, top_p=0.8, top_k=20, min_p=0, and presence_penalty=1.5 (increase toward 2.0 if repetition persists).
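Applied to an OpenAI-compatible endpoint, those settings might look like the sketch below. Note that top_k and min_p are not part of the standard OpenAI request schema, so they are passed via extra_body, on the assumption that the serving backend accepts them (as vLLM-style servers do); the model name and image URL are placeholders.

```python
# Sketch: request arguments carrying the recommended sampling parameters.
# top_k/min_p are non-standard OpenAI fields, so they go in extra_body
# (assumes the backend accepts them); model name and URL are placeholders.
request_args = {
    "model": "qwen3-vl-235b-a22b-instruct",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
                {"type": "text", "text": "Summarise this chart."},
            ],
        }
    ],
    "temperature": 0.7,
    "top_p": 0.8,
    "presence_penalty": 1.5,
    "max_tokens": 16_384,
    "extra_body": {"top_k": 20, "min_p": 0},
}

# With a client pointed at the Doubleword base URL (see below):
# response = client.chat.completions.create(**request_args)
```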

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file
    with open("batch_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status (poll until it reaches "completed")
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")
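The quickstart above uploads a batch_requests.jsonl file without showing its contents. Assuming Doubleword follows the OpenAI Batch API input format, each line is one self-contained JSON request. A minimal sketch that writes a single vision request (the custom_id, model name, and image URL are placeholders):

```python
import json

# One request per line, in the OpenAI Batch API shape (assumed here):
# a custom_id for matching results, the HTTP method, the target
# endpoint, and the full request body.
requests = [
    {
        "custom_id": "doc-001",  # placeholder ID
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "qwen3-vl-235b-a22b-instruct",  # placeholder name
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {"type": "image_url",
                         "image_url": {"url": "https://example.com/invoice.png"}},
                        {"type": "text",
                         "text": "Extract the line items as JSON."},
                    ],
                }
            ],
            "max_tokens": 1024,
        },
    }
]

with open("batch_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```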

    Ready to deploy Qwen3-VL-235B-A22B?