Doubleword
    MoE Architecture
    Native Multimodal
    Open Weights

    Qwen3.5 397B A17B

    The hyper-efficient multimodal giant for reasoning, coding, and autonomous agents.

    Total Parameters

    397B

    17B Activated

    Context Window

    262K

    Tokens

    Modalities

    Text, Image

    & Video

    Speed

    Up to 19x

    Faster vs Qwen3-Max

    About

    Next-Generation Efficiency Meets Native Multimodal Reasoning

    Qwen3.5 397B A17B is a state-of-the-art vision-language foundation model. Its sparse Mixture-of-Experts (MoE) architecture activates only 17 billion of its 397 billion parameters per token, letting it match the quality of much larger dense models while decoding at far higher speeds. Trained with early fusion across text, images, and video, Qwen3.5 397B A17B is purpose-built for real-world adaptability, complex coding tasks, and global deployment.
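    The sparse-activation idea above can be sketched as a toy top-k router. This is an illustrative simplification in pure Python, not the model's actual implementation; real MoE layers route hidden states inside transformer blocks.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of router scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(token_scores, k=2):
    """Pick the top-k experts for one token and renormalize their weights."""
    probs = softmax(token_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    chosen_mass = sum(probs[i] for i in top)
    return [(i, probs[i] / chosen_mass) for i in top]

# 8 experts, but only k=2 run for this token -> most parameters stay idle,
# which is why a 397B-parameter model can compute like a 17B one.
scores = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
chosen = route(scores, k=2)
print(chosen)  # experts 1 and 4 carry this token
```

    The same principle scales up: regardless of total parameter count, per-token compute is bounded by the few experts the router selects.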


    Mixture of Experts — 17B active / 397B total

    Use Cases

    Built for the hardest problems

    Native Multimodal Workflows

    Process text, high-resolution images, and videos simultaneously with early-fusion architecture. Perfect for UI element detection and visual document understanding.
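    Mixed text-and-image requests follow the OpenAI-compatible chat format. A minimal sketch of such a request body; the model identifier and image URL are assumptions for illustration:

```python
import json

# Hypothetical request body for an OpenAI-compatible /v1/chat/completions
# call mixing text and an image. "qwen3.5-397b-a17b" is an assumed model
# name; check the provider's model list for the exact identifier.
payload = {
    "model": "qwen3.5-397b-a17b",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Which UI element submits this form?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
}
print(json.dumps(payload, indent=2))
```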

    Autonomous Coding Agents

    Top-tier performance in agentic coding environments. Natively supports tool calling, executing complex CLI workflows, and recovering from execution failures.
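    Tool calling uses the standard OpenAI function-calling schema. A sketch of a tool definition an agent loop might expose; the tool name, schema, and model identifier are illustrative assumptions:

```python
import json

# Illustrative OpenAI-style tool definition for an agentic coding session.
# The "run_shell" tool and its schema are hypothetical examples.
tools = [{
    "type": "function",
    "function": {
        "name": "run_shell",
        "description": "Execute a shell command and return stdout/stderr.",
        "parameters": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
    },
}]

payload = {
    "model": "qwen3.5-397b-a17b",  # assumed model identifier
    "messages": [{"role": "user",
                  "content": "List the repo's Python files."}],
    "tools": tools,
}
print(json.dumps(payload)[:80])
```

    When the model decides to call the tool, the response carries a `tool_calls` entry whose arguments the agent executes before returning the result in a follow-up message.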

    Deep Reasoning (Thinking Mode)

    Built-in "Thinking Mode" generates step-by-step internal logic before answering, dramatically increasing accuracy on complex math, science, and logic problems.
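    Thinking-mode output is typically delimited from the final answer in the response text. A minimal sketch of separating the two; the `<think>...</think>` tag format is an assumption, so check the provider's documentation for the actual delimiter:

```python
import re

def split_thinking(text):
    """Separate a hypothetical <think>...</think> block from the answer."""
    m = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    if not m:
        return "", text.strip()
    thinking = m.group(1).strip()
    answer = (text[:m.start()] + text[m.end():]).strip()
    return thinking, answer

raw = "<think>17 * 23 = 391</think>The product is 391."
thinking, answer = split_thinking(raw)
print(answer)  # -> The product is 391.
```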

    Global Multilingual Deployment

    Trained for nuanced cultural understanding across 201 languages and dialects, making it the ideal foundation model for global enterprise applications.

    Benchmarks

    Industry-Leading Intelligence

    Proven performance across reasoning, coding, and agentic workflows.

    45

    Overall Intelligence

    Better than 94% of models

    41.3

    Coding Capability

    Better than 94% of models

    55.8

    Agentic Capability

    Better than 95% of models

    Category    Benchmark              Score
    Reasoning   GPQA Diamond           89.3%
    Reasoning   τ²-Bench Telecom       95.6%
    Reasoning   IFBench                78.8%
    Reasoning   AA-LCR                 65.7%
    Reasoning   GDPval-AA              35.7%
    Reasoning   HLE                    27.3%
    Reasoning   CritPt                 1.7%
    Coding      SciCode                42.0%
    Coding      Terminal-Bench Hard    40.9%
    Knowledge   AA-Omniscience         31.4%

    Metrics sourced from Artificial Analysis.

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    Tier        Input / 1M tokens    Output / 1M tokens
    Standard    $0.15                $1.20
    Async       $0.30                $1.80
    Realtime    $0.60                $3.60

    Context windows of up to 262K tokens are supported.
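    With per-1M-token prices, the cost of a call is simple arithmetic. A quick estimator using the Standard-tier figures from the table above:

```python
def cost_usd(input_tokens, output_tokens,
             input_per_m=0.15, output_per_m=1.20):
    """Estimate request cost in USD from per-1M-token prices (Standard tier)."""
    return (input_tokens / 1_000_000) * input_per_m \
         + (output_tokens / 1_000_000) * output_per_m

# e.g. a 200K-token context with a 2K-token answer
print(round(cost_usd(200_000, 2_000), 4))  # -> 0.0324
```

    Swap in the Async or Realtime rates from the table to compare tiers for the same workload.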

    Quickstart

    Start Building in Minutes

    Qwen3.5 397B A17B is accessible via OpenAI-compatible endpoints. Here is how to submit a batch job with the standard OpenAI Python SDK pointed at Doubleword.ai.

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file
    with open("batch_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")
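    Once the batch completes, its output downloads as JSONL with one response object per request. A minimal sketch of parsing that file; the field names follow the OpenAI Batch API output format:

```python
import json

def parse_batch_results(jsonl_text):
    """Map each custom_id to its assistant reply from a batch output file."""
    results = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        row = json.loads(line)
        body = row["response"]["body"]
        results[row["custom_id"]] = body["choices"][0]["message"]["content"]
    return results

# A one-line example of the batch output format
sample = ('{"custom_id": "req-1", "response": {"body": '
          '{"choices": [{"message": {"content": "42"}}]}}}')
print(parse_batch_results(sample))  # -> {'req-1': '42'}
```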

    Ready to deploy Qwen3.5 397B A17B?