Doubleword
    Dense Architecture
    Text-Only
    Thinking Mode

    Qwen3-14B-FP8

    A highly efficient 14.8B parameter dense language model optimized for high-volume text tasks and dual-mode reasoning.

    Total Parameters

    14.8B

    Context Window

    131K

    Tokens

    Modalities

    Text Only

    Max Output

    16,384

    Tokens

    About

    Efficient Text Generation with Dual-Mode Reasoning

    Meet Qwen3-14B, a dense 14.8B parameter causal language model from the Qwen3 release. Designed for both complex reasoning and efficient dialogue, it supports seamless switching between a "thinking" mode for rigorous logic, math, and programming tasks, and a "non-thinking" mode for general-purpose conversation. Trained on 36 trillion multilingual tokens across 100+ languages, it serves as an excellent foundation for high-volume workloads like classification, extraction, and summarization where maximum frontier performance is not strictly required.


    Dense 14.8B — Dual-Mode Reasoning

    Use Cases

    Built for efficient text intelligence

    High-Volume Processing

    Perfectly suited for tasks that do not require massive frontier intelligence, such as document classification, data extraction, and large-scale summarization.

    Dual-Mode Execution

    Toggle between specialized "thinking" mode for robust logical inference and creative coding, or "non-thinking" mode for rapid, general-purpose conversation.

    Business Customization

    A strong baseline for medium-scale enterprise custom model training, allowing for efficient adaptation to domain-specific professional services.

    Academic & Research

    Provides an optimal, budget-friendly foundation for natural language processing methodologies, educational AI, and fine-tuning experiments.

    Benchmarks

    Capable & Cost-Effective

    Proven baseline performance across reasoning, coding, and agentic workflows for the 14B weight class.

    16.2

    Overall Intelligence

    Better than 34% of models

    13.1

    Coding Capability

    Better than 35% of models

    14.4

    Agentic Capability

    Better than 39% of models

    Category    Benchmark             Score
    Reasoning   GPQA Diamond          60.4%
    Reasoning   τ²-Bench Telecom      34.5%
    Reasoning   IFBench               40.5%
    Reasoning   AA-LCR                 0.0%
    Reasoning   GDPval-AA              0.4%
    Reasoning   HLE                    4.3%
    Reasoning   CritPt                 0.0%
    Coding      SciCode               31.6%
    Coding      Terminal-Bench Hard    3.8%
    Knowledge   AA-Omniscience        14.9%

    Metrics sourced from Artificial Analysis. Hallucination Rate: 24.5%

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    Tier        Input / 1M tokens    Output / 1M tokens
    Standard    $0.02                $0.20
    Async       $0.03                $0.30
    Realtime    $0.05                $0.60

    The 131K-token context window is reached via YaRN-based scaling; the model's native context is 32K tokens.
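For reference, this is how YaRN scaling is typically enabled when self-hosting the model with vLLM; the hosted endpoint applies equivalent scaling server-side, so this is illustrative only:

```shell
# Sketch: serve Qwen3-14B-FP8 with the context window extended from its
# native 32,768 tokens to 131,072 via YaRN rope scaling (factor 4.0).
vllm serve Qwen/Qwen3-14B-FP8 \
  --rope-scaling '{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}' \
  --max-model-len 131072
```

Note that static YaRN scaling applies the factor to all requests, which can slightly degrade quality on short inputs, so it is usually enabled only when long contexts are actually needed.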

    Quickstart

    Start Building in Minutes

    Qwen3-14B-FP8 is accessible via OpenAI-compatible endpoints. Here is how to integrate it using the standard OpenAI Python SDK pointed at the Doubleword API.

    Developer Tip: Recommended Sampling Parameters

    For optimal performance and to curb runaway repetition, the Qwen team recommends Temperature=0.7, TopP=0.8, TopK=20, MinP=0, and presence_penalty=1.5 (raise toward 2.0 if language mixing or repetition persists).

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file
    with open("batch_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")
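Once the status reaches "completed", results can be downloaded. A sketch of the final step, assuming the endpoint mirrors the OpenAI Batch API's `output_file_id` / `files.content` flow and reusing the `client` from above:

```python
import time

def wait_and_download(client, batch_id: str, path: str = "batch_results.jsonl") -> str:
    """Poll until the batch reaches a terminal state, then save its output file."""
    while True:
        status = client.batches.retrieve(batch_id)
        if status.status in ("completed", "failed", "expired", "cancelled"):
            break
        time.sleep(30)  # batches can take minutes to hours

    if status.status == "completed":
        # Assumption: results are exposed as a downloadable file, as in
        # the OpenAI Batch API.
        content = client.files.content(status.output_file_id)
        with open(path, "wb") as f:
            f.write(content.read())
    return status.status
```

Each line of the downloaded JSONL contains the response for one request, keyed by the `custom_id` you set in the input file.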