Kimi K2.6

Tags: Multimodal, Agentic, Thinking Mode, MoE, Moonshot, Open Weights

Kimi K2.6 is Moonshot's open-source, native multimodal agentic model. It is built on the same MoE multimodal architecture as K2.5, has a 256K-token context window, and combines strong reasoning, visual understanding, and agentic tool use across instant and thinking modes.


Architecture: MoE, Multimodal
Context Window: 256K tokens
Intelligence: AA Index v4.0
License: Open (Open Weights)

About

Native Multimodal MoE for Agentic Workflows

Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration. Built on the same MoE multimodal architecture as K2.5 with a 256K context window, K2.6 unifies reasoning, visual understanding, and agentic tool use across instant and thinking modes.

Thinking mode is enabled by default. To disable it, pass {"chat_template_kwargs": {"enable_thinking": false}} in the request body. K2.6 does not support graduated thinking levels, so reasoning_effort is not supported.
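The on/off switch described above can be wrapped in a small helper. This is a minimal sketch, assuming the request goes through the OpenAI Python client and that `chat_template_kwargs` is forwarded through its `extra_body` parameter; `build_chat_kwargs` is a hypothetical name, not part of any SDK.

```python
from typing import Any


def build_chat_kwargs(prompt: str, thinking: bool = True) -> dict[str, Any]:
    """Build keyword arguments for client.chat.completions.create().

    Hypothetical helper: thinking is on by default, so extra_body is
    only added when the caller wants to turn it off.
    """
    kwargs: dict[str, Any] = {
        "model": "moonshotai/Kimi-K2.6",
        "messages": [{"role": "user", "content": prompt}],
    }
    if not thinking:
        # K2.6 has no graduated levels, so this is a plain on/off switch.
        kwargs["extra_body"] = {
            "chat_template_kwargs": {"enable_thinking": False}
        }
    return kwargs
```

A latency-sensitive call could then look like `client.chat.completions.create(**build_chat_kwargs("Summarize this diff.", thinking=False))`.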

Multimodal Agentic Flagship: Code, Vision, Term, Tools, Plan, Swarm

Use Cases

Built for autonomous, multimodal agents

Long-Horizon Coding

End-to-end coding performance across Rust, Go, Python, front-end, DevOps, and performance optimization workflows.

Coding-Driven Design

Turns prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows with structured layouts and visual polish.

Elevated Agent Swarm

Decomposes complex tasks into parallel, domain-specialized subtasks — scaling to large coordinated agent runs for end-to-end outputs.

Proactive Orchestration

Built for autonomous execution: persistent background agents manage schedules, execute code, and coordinate cross-platform operations with minimal oversight.

Benchmarks

Frontier Coding & Agentic Performance

Artificial Analysis Intelligence Index v4.0 scores. In Moonshot's published benchmarks, K2.6 outperformed GPT-5-mini, GPT-OSS-120B, and Claude Sonnet 4.5.

Intelligence Index: better than 94% of models
GPQA Diamond: better than 96% of models
τ²-Bench Telecom: better than 95% of models

Category	Benchmark	Score	Description
Reasoning	GPQA Diamond	91%	Graduate-level scientific reasoning
Reasoning	Humanity's Last Exam	36%	Expert-level questions across a broad range of academic subjects
Reasoning	τ²-Bench Telecom	96%	AI agents in dual-control scenarios
Reasoning	AA-LCR	70%	Long context reasoning evaluation
Reasoning	IFBench	76%	Instruction-following accuracy
Reasoning	GDPval-AA	49%	Agentic performance on real-world work tasks
Coding	SciCode	53%	Python for scientific computing
Coding	Terminal-Bench Hard	44%	Agentic coding & terminal use
Knowledge	AA-Omniscience Accuracy	33%	Proportion of correctly answered questions
Knowledge	AA-Omniscience Non-Hallucination	61%	Proportion of confidently answered questions that are correct

Metrics sourced from Artificial Analysis and Moonshot's published evaluations. Reasoning (thinking) mode enabled.

Pricing

Flexible Pricing Tiers

Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

Tier	Input / 1M tokens	Output / 1M tokens
Overnight (24H)	$0.45	$2.00
Async	$0.70	$3.00
Realtime	$0.95	$4.00

Context window natively supported up to 256K tokens.
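To compare tiers for a given workload, the per-1M-token prices in the table above can be turned into a quick estimate. This is a rough sketch: the `PRICES` dictionary and `estimate_cost` function are illustrative names, and the calculation only multiplies token counts by the listed rates. It does not model any other billing rule.

```python
# Prices in USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "overnight": (0.45, 2.00),
    "async": (0.70, 3.00),
    "realtime": (0.95, 4.00),
}


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on the given tier."""
    in_price, out_price = PRICES[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Example workload: 200K input tokens and 50K output tokens.
    for tier in PRICES:
        print(f"{tier}: ${estimate_cost(tier, 200_000, 50_000):.2f}")
```

For that example workload, the estimate comes to about $0.19 on Overnight, $0.29 on Async, and $0.39 on Realtime.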

Quickstart

Start Building in Minutes

Kimi K2.6 is accessible via OpenAI-compatible endpoints.

Python

from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1"
)

# Long-horizon multimodal agentic task (thinking enabled by default)
response = client.chat.completions.create(
    model="moonshotai/Kimi-K2.6",
    messages=[
        {"role": "user", "content": "Plan and execute a 3-step refactor of this codebase."}
    ],
    # To disable step-by-step reasoning:
    # extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)

print(response.choices[0].message.content)
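The request above is text-only. Since K2.6 is described as natively multimodal, an image can presumably be sent in the same call. The sketch below assumes the endpoint accepts OpenAI-style `image_url` content parts with a base64 data URL, which this page does not confirm; `image_message` is a hypothetical helper.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build one user message that pairs text with an inline image.

    Assumption: the endpoint accepts OpenAI-style content parts with a
    base64 data URL; check the provider documentation before relying on it.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }
```

The returned dictionary can be passed as an entry in `messages` in the Quickstart call, for example `messages=[image_message("Describe this mockup.", open("mockup.png", "rb").read())]`, where `mockup.png` is a placeholder file name.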

💡 Pro Tip

K2.6 shines on long-horizon agentic work — let it sustain reasoning across planning, tool use, and iterative debugging. K2.6 does not support graduated thinking levels (reasoning_effort has no effect). For latency-sensitive endpoints, disable thinking with "chat_template_kwargs": {"enable_thinking": false}.
