Doubleword
    MoE Architecture
    Native Multimodal
    Open Weights

    Qwen3.5 397B A17B

    The hyper-efficient multimodal giant for reasoning, coding, and autonomous agents.

    Total Parameters

    397B

    17B Activated

    Context Window

    262K

    Tokens

    Modalities

    Text, Image

    & Video

    Speed

    Up to 19x

    Faster vs Qwen3-Max

    About

    Next-Generation Efficiency Meets Native Multimodal Reasoning

    Qwen3.5 397B A17B is a state-of-the-art vision-language foundation model. Its sparse Mixture-of-Experts (MoE) architecture activates only 17 billion of its 397 billion parameters per token, letting it match the quality of much larger dense models while decoding at far higher speeds. Trained with early fusion across text, images, and video, Qwen3.5 397B A17B is purpose-built for real-world adaptability, complex coding tasks, and global deployment.
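    The sparse-activation idea above can be sketched as a toy top-k router. This is an illustrative simplification in pure Python, not the model's actual implementation; real MoE layers route hidden states inside transformer blocks.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of router scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_scores, k=2):
    """Pick the top-k experts for one token and renormalize their weights."""
    probs = softmax(token_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    chosen_mass = sum(probs[i] for i in top)
    return [(i, probs[i] / chosen_mass) for i in top]

# 8 experts, but only k=2 run for this token -> most parameters stay idle,
# which is why a 397B-parameter model can compute like a 17B one.
scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
chosen = route(scores, k=2)
print(chosen)  # experts 1 and 4 carry this token
```

    The same principle scales up: regardless of total parameter count, per-token compute is bounded by the few experts the router selects.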


    Mixture of Experts — 17B active / 397B total

    Use Cases

    Built for the hardest problems

    Native Multimodal Workflows

    Process text, high-resolution images, and videos simultaneously with early-fusion architecture. Perfect for UI element detection and visual document understanding.
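    Mixed text-and-image requests follow the OpenAI-compatible chat format. A minimal sketch of such a request body; the model identifier and image URL are assumptions for illustration:

```python
import json

# Hypothetical request body for an OpenAI-compatible /v1/chat/completions
# call mixing text and an image. "qwen3.5-397b-a17b" is an assumed model
# name; check the provider's model list for the exact identifier.
payload = {
    "model": "qwen3.5-397b-a17b",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which UI element submits this form?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
}
print(json.dumps(payload, indent=2))
```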

    Autonomous Coding Agents

    Top-tier performance in agentic coding environments. Natively supports tool calling, executing complex CLI workflows, and recovering from execution failures.
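    Tool calling uses the standard OpenAI function-calling schema. A sketch of a tool definition an agent loop might expose; the tool name, schema, and model identifier are illustrative assumptions:

```python
import json

# Illustrative OpenAI-style tool definition for an agentic coding session.
# The "run_shell" tool and its schema are hypothetical examples.
tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Execute a shell command and return stdout/stderr.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

payload = {
    "model": "qwen3.5-397b-a17b",  # assumed model identifier
    "messages": [{"role": "user",
                  "content": "List the repo's Python files."}],
    "tools": tools,
}
print(json.dumps(payload)[:80])
```

    When the model decides to call the tool, the response carries a `tool_calls` entry whose arguments the agent executes before returning the result in a follow-up message.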

    Deep Reasoning (Thinking Mode)

    Built-in "Thinking Mode" generates step-by-step internal logic before answering, dramatically increasing accuracy on complex math, science, and logic problems.
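    Thinking-mode output is typically delimited from the final answer in the response text. A minimal sketch of separating the two; the `<think>...</think>` tag format is an assumption, so check the provider's documentation for the actual delimiter:

```python
import re

def split_thinking(text):
    """Separate a hypothetical <think>...</think> block from the answer."""
    m = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if not m:
        return "", text.strip()
    thinking = m.group(1).strip()
    answer = (text[:m.start()] + text[m.end():]).strip()
    return thinking, answer

raw = "<think>17 * 23 = 391</think>The product is 391."
thinking, answer = split_thinking(raw)
print(answer)  # -> The product is 391.
```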

    Global Multilingual Deployment

    Trained for nuanced cultural understanding across 201 languages and dialects, making it the ideal foundation model for global enterprise applications.

    Benchmarks

    Industry-Leading Intelligence

    Proven performance across reasoning, coding, and agentic workflows.

    45

    Overall Intelligence

    Better than 94% of models

    41.3

    Coding Capability

    Better than 94% of models

    55.8

    Agentic Capability

    Better than 95% of models

    Category    Benchmark              Score
    Reasoning   GPQA Diamond           89.3%
    Reasoning   τ²-Bench Telecom       95.6%
    Reasoning   IFBench                78.8%
    Reasoning   AA-LCR                 65.7%
    Reasoning   GDPval-AA              35.7%
    Reasoning   HLE                    27.3%
    Reasoning   CritPt                 1.7%
    Coding      SciCode                42.0%
    Coding      Terminal-Bench Hard    40.9%
    Knowledge   AA-Omniscience         31.4%

    Metrics sourced from Artificial Analysis.

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    Tier        Input / 1M tokens    Output / 1M tokens
    Standard    $0.15                $1.20
    Async       $0.30                $1.80
    Realtime    $0.60                $3.60

    Context windows of up to 262K tokens are supported.
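    With per-1M-token prices, the cost of a call is simple arithmetic. A quick estimator using the Standard-tier figures from the table above:

```python
def cost_usd(input_tokens, output_tokens,
             input_per_m=0.15, output_per_m=1.20):
    """Estimate request cost in USD from per-1M-token prices (Standard tier)."""
    return (input_tokens / 1_000_000) * input_per_m \
         + (output_tokens / 1_000_000) * output_per_m

# e.g. a 200K-token context with a 2K-token answer
print(round(cost_usd(200_000, 2_000), 4))  # -> 0.0324
```

    Swap in the Async or Realtime rates from the table to compare tiers for the same workload.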

    Quickstart

    Start Building in Minutes

    Qwen3.5 397B A17B is accessible via OpenAI-compatible endpoints. Here is how to submit a batch job with the standard OpenAI Python SDK pointed at Doubleword.ai.

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file
    with open("batch_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")
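    Once the batch completes, its output downloads as JSONL with one response object per request. A minimal sketch of parsing that file; the field names follow the OpenAI Batch API output format:

```python
import json

def parse_batch_results(jsonl_text):
    """Map each custom_id to its assistant reply from a batch output file."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        body = row["response"]["body"]
        results[row["custom_id"]] = body["choices"][0]["message"]["content"]
    return results

# A one-line example of the batch output format
sample = ('{"custom_id": "req-1", "response": {"body": '
          '{"choices": [{"message": {"content": "42"}}]}}}')
print(parse_batch_results(sample))  # -> {'req-1': '42'}
```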

    Ready to deploy Qwen3.5 397B A17B?