Qwen3-VL-30B-A3B
The highly capable mid-size multimodal model for production workloads, reasoning, and visual coding.
| Spec | Value |
|---|---|
| Total Parameters | 30B (3B activated) |
| Context Window | 128K tokens |
| Modalities | Text, Image & Video |
| Performance Class | GPT-4.1-mini / Sonnet 4 |
High-Performance Multimodal Efficiency
Meet Qwen3-VL-30B, the highly capable mid-size model of the Qwen3-VL family. It unifies strong text generation with visual understanding for images and videos, delivering performance similar to GPT-4.1-mini and Claude Sonnet 4. Built on an efficient 48-layer Mixture-of-Experts architecture, it is suited for production workloads that are cost-constrained or require high token volumes. From advanced 2D/3D spatial grounding and document AI to converting UI sketches into debugged code, Qwen3-VL-30B offers robust reasoning capabilities without frontier model costs.
Vision-Language MoE — 3B active / 30B total
Built for production multimodal workloads
Production Multimodal AI
Excels at general vision-language tasks including VQA, robust OCR, document AI, and long-form visual comprehension across real-world and synthetic categories.
Agentic GUI Automation
Handles multi-image, multi-turn instructions, aligns text to video timelines for precise temporal queries, and natively handles GUI navigation and automation tasks.
Visual Coding & STEM
Transforms whiteboard sketches and mockups directly into functional code. Actively assists with UI debugging, scientific computing, and complex reasoning.
Enterprise Customization
Trained on 36 trillion tokens across 119 languages, the model supports expert-specific adaptation through its MoE architecture, making it a strong foundation for supervised fine-tuning and domain-specific AI.
Efficient Multimodal Intelligence
Proven performance across reasoning, coding, and agentic workflows for the 30B weight class.
Overall Intelligence
Better than 34% of models
Coding Capability
Better than 43% of models
Agentic Capability
Better than 29% of models
| Category | Benchmark | Score |
|---|---|---|
| Reasoning | GPQA Diamond | 69.5% |
| Reasoning | τ²-Bench Telecom | 19.0% |
| Reasoning | IFBench | 33.1% |
| Reasoning | AA-LCR | 23.7% |
| Reasoning | GDPval-AA | 1.3% |
| Reasoning | HLE | 6.4% |
| Reasoning | CritPt | 0.0% |
| Coding | SciCode | 30.8% |
| Coding | Terminal-Bench Hard | 6.1% |
| Knowledge | AA-Omniscience | 15.5% |
Metrics sourced from Artificial Analysis.
Flexible Pricing Tiers
Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.05 | $0.20 |
| Async | $0.07 | $0.30 |
| Realtime | $0.16 | $0.80 |
The context window natively supports up to 128K tokens.
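To compare tiers for a given workload, the per-token prices from the table above can be turned into a quick cost estimate. A minimal sketch (the token volumes in the example are illustrative, not from the source):

```python
# Prices in USD per 1M tokens, taken from the pricing table above.
TIERS = {
    "Standard": (0.05, 0.20),
    "Async": (0.07, 0.30),
    "Realtime": (0.16, 0.80),
}

def estimate_cost(input_tokens: int, output_tokens: int, tier: str = "Standard") -> float:
    """Return the total USD cost for a workload on the given tier."""
    in_price, out_price = TIERS[tier]
    return (input_tokens / 1e6) * in_price + (output_tokens / 1e6) * out_price

# Hypothetical workload: 10M input tokens, 2M output tokens.
print(f"${estimate_cost(10_000_000, 2_000_000):.2f}")          # -> $0.90
print(f"${estimate_cost(10_000_000, 2_000_000, 'Realtime'):.2f}")  # -> $3.20
```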
Start Building in Minutes
Qwen3-VL-30B-A3B is accessible via OpenAI-compatible endpoints. Here is how to submit an asynchronous batch job using the standard Python SDK against Doubleword.ai.
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1",
)

# Step 1: Upload a batch input file
with open("batch_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch",
    )
print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(f"Batch ID: {batch.id}")

# Step 3: Check batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")
```
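The `batch_requests.jsonl` file uploaded in Step 1 must contain one chat-completion request per line, following the OpenAI batch input format. A minimal sketch of how to build it (the `custom_id` and prompt are illustrative):

```python
import json

# Each line is a self-contained request targeting the batch endpoint.
# The custom_id lets you match results back to requests.
requests = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "Qwen3-VL-30B-A3B",
            "messages": [
                {"role": "user", "content": "Describe this diagram."},
            ],
        },
    },
]

with open("batch_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```

Once the batch status reaches `completed`, the results file referenced by `batch_status.output_file_id` can be downloaded with `client.files.content(...)`.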