Kimi K2.6
Moonshot's open-source native multimodal agentic model. Built on the same MoE multimodal architecture as K2.5 with a 256K context window — combining strong reasoning, visual understanding, and agentic tool use across instant and thinking modes.
Architecture: MoE (Multimodal)
Context Window: 256K tokens
Intelligence: 54 (AA Index v4.0)
License: Open (open weights)
Native Multimodal MoE for Agentic Workflows
Kimi K2.6 is an open-source, native multimodal agentic model that advances practical capabilities in long-horizon coding, coding-driven design, proactive autonomous execution, and swarm-based task orchestration. Built on the same MoE multimodal architecture as K2.5 with a 256K context window, K2.6 unifies reasoning, visual understanding, and agentic tool use across instant and thinking modes.
Thinking mode is enabled by default; to disable it, pass {"chat_template_kwargs": {"enable_thinking": false}} in the request body. K2.6 has no graduated thinking levels, so the reasoning_effort parameter is not supported.
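The toggle described above can be wrapped in a small request builder. This is a minimal sketch: `build_chat_kwargs` and `MODEL_ID` are hypothetical names introduced for illustration, not part of any SDK, and the sketch assumes the request is sent through the OpenAI Python client, whose `extra_body` argument forwards extra fields in the request body.

```python
from typing import Any

MODEL_ID = "moonshotai/Kimi-K2.6"  # model name as used in this page's quick-start


def build_chat_kwargs(prompt: str, thinking: bool = True, **options: Any) -> dict[str, Any]:
    """Build keyword arguments for an OpenAI-compatible chat completion call.

    Hypothetical helper for illustration only.
    """
    # K2.6 has no graduated thinking levels, so fail fast instead of
    # sending a parameter the model does not support.
    if "reasoning_effort" in options:
        raise ValueError("reasoning_effort is not supported; use thinking=True/False")

    # Preserve any caller-supplied extra_body and merge the toggle into it.
    extra_body = dict(options.pop("extra_body", None) or {})
    if not thinking:
        # Thinking is on by default, so the override is only sent when disabling it.
        extra_body["chat_template_kwargs"] = {"enable_thinking": False}

    kwargs: dict[str, Any] = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        **options,
    }
    if extra_body:
        kwargs["extra_body"] = extra_body
    return kwargs
```

With an OpenAI client in hand, the result can be passed straight through, for example `client.chat.completions.create(**build_chat_kwargs("Summarize this diff.", thinking=False))`.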
Multimodal Agentic Flagship
Built for autonomous, multimodal agents
Long-Horizon Coding
End-to-end coding performance across Rust, Go, Python, front-end, DevOps, and performance optimization workflows.
Coding-Driven Design
Turns prompts and visual inputs into production-ready interfaces and lightweight full-stack workflows with structured layouts and visual polish.
Elevated Agent Swarm
Decomposes complex tasks into parallel, domain-specialized subtasks — scaling to large coordinated agent runs for end-to-end outputs.
Proactive Orchestration
Built for autonomous execution: persistent background agents manage schedules, execute code, and coordinate cross-platform operations with minimal oversight.
Frontier Coding & Agentic Performance
Scores are from the Artificial Analysis Intelligence Index v4.0. In Moonshot's published benchmarks, K2.6 outperformed GPT-5-mini, GPT-OSS-120B, and Claude Sonnet 4.5.
Intelligence Index: better than 94% of models
GPQA Diamond: better than 96% of models
τ²-Bench Telecom: better than 95% of models
| Category | Benchmark | Score |
|---|---|---|
| Reasoning | GPQA Diamond | 91% |
| Reasoning | Humanity's Last Exam | 36% |
| Reasoning | τ²-Bench Telecom | 96% |
| Reasoning | AA-LCR | 70% |
| Reasoning | IFBench | 76% |
| Reasoning | GDPval-AA | 49% |
| Coding | SciCode | 53% |
| Coding | Terminal-Bench Hard | 44% |
| Knowledge | AA-Omniscience Accuracy | 33% |
| Knowledge | AA-Omniscience Non-Hallucination | 61% |
Metrics sourced from Artificial Analysis and Moonshot's published evaluations. Reasoning (thinking) mode enabled.
Flexible Pricing Tiers
Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Overnight (24H) | $0.45 | $2.00 |
| Async | $0.70 | $3.00 |
| Realtime | $0.95 | $4.00 |
Context window natively supported up to 256K tokens.
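To compare tiers for a given workload, the per-1M-token prices can be turned into a quick estimate. This is a rough sketch: `estimate_cost` and `PRICES` are hypothetical names, the prices are copied from the table above, and the sketch assumes you already know the billed input and output token counts.

```python
from decimal import Decimal

# Per-1M-token prices (input, output), copied from the pricing table above.
PRICES = {
    "Overnight (24H)": (Decimal("0.45"), Decimal("2.00")),
    "Async": (Decimal("0.70"), Decimal("3.00")),
    "Realtime": (Decimal("0.95"), Decimal("4.00")),
}
TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one workload on the given tier (hypothetical helper)."""
    input_price, output_price = PRICES[tier]
    total = Decimal(input_tokens) * input_price + Decimal(output_tokens) * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Compare all tiers for 200K input tokens and 10K output tokens.
    for tier in PRICES:
        print(tier, estimate_cost(tier, 200_000, 10_000))
```

For 200K input tokens and 10K output tokens, the estimate works out to $0.11 on Overnight, $0.17 on Async, and $0.23 on Realtime. Decimal arithmetic is used here so that small per-token amounts do not pick up floating-point rounding noise.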
Start Building in Minutes
Kimi K2.6 is accessible via OpenAI-compatible endpoints.
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1",
)

# Long-horizon multimodal agentic task (thinking enabled by default)
response = client.chat.completions.create(
    model="moonshotai/Kimi-K2.6",
    messages=[
        {"role": "user", "content": "Plan and execute a 3-step refactor of this codebase."}
    ],
    # To disable step-by-step reasoning:
    # extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)

print(response.choices[0].message.content)
```
💡 Pro Tip
K2.6 shines on long-horizon agentic work: let it sustain reasoning across planning, tool use, and iterative debugging. K2.6 does not support graduated thinking levels (reasoning_effort has no effect). For latency-sensitive endpoints, disable thinking with {"chat_template_kwargs": {"enable_thinking": false}}.
