Doubleword
    Text Embedding
    MRL Support
    100+ Languages

    Qwen3-Embedding-8B

    The state-of-the-art multilingual embedding model with flexible vector dimensions and advanced long-text understanding.

    Total Parameters

    8B

    Context Window

    32K

    Tokens

    Output Dimensions

    32–4096

    Configurable

    Supported Languages

    100+

    Languages

    About

    Exceptional Versatility & Multilingual Retrieval

    The Qwen3 Embedding series is the latest generation of embedding models in the Qwen family, designed specifically for advanced text embedding and ranking tasks. Built on the dense foundation models of the Qwen3 series, the 8B variant inherits their exceptional multilingual capabilities, long-text understanding, and reasoning skills. It supports flexible embedding dimensions (from 32 up to 4096) and user-defined instructions, making it highly adaptable to specific tasks, languages, and retrieval scenarios.
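Because the model accepts user-defined instructions, a query can carry its task description inline. A minimal sketch of the query template recommended in the published Qwen3-Embedding model card (the task string here is illustrative; documents are typically embedded without an instruction):

```python
def format_query(task: str, query: str) -> str:
    """Prepend a task instruction so the query is embedded in context.

    Template from the Qwen3-Embedding model card:
    "Instruct: {task}\nQuery: {query}"
    """
    return f"Instruct: {task}\nQuery: {query}"

# Illustrative task description for a retrieval scenario.
print(format_query(
    "Given a web search query, retrieve relevant passages that answer the query",
    "how does matryoshka representation learning work",
))
```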

    Vector Embeddings — 32 to 4096 dims

    Use Cases

    Built for semantic search & retrieval

    Advanced Text & Code Retrieval

    Achieve highly accurate search functionality across vast document repositories and codebases with robust multilingual and cross-lingual support.

    Text Classification & Clustering

    Efficiently categorize massive datasets and perform unsupervised grouping of text data to discover underlying patterns and topics.

    Bitext Mining

    Identify and align parallel sentences across more than 100 different languages to support translation and localization workflows.

    Flexible Dimensionality (MRL)

    Utilize Matryoshka Representation Learning (MRL) to truncate embedding dimensions on the fly (down to 32 dimensions) while preserving high retrieval accuracy.
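The truncation step can be sketched in plain Python: keep the leading dimensions of an MRL-trained embedding, then re-normalize so cosine similarity still behaves as expected. The 4-dim vector below stands in for a real 4096-dim embedding:

```python
import math

def truncate_embedding(vec: list[float], dims: int) -> list[float]:
    """Keep the first `dims` components of an MRL embedding,
    re-normalized to unit length."""
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

full = [0.5, 0.5, 0.5, 0.5]           # stand-in for a 4096-dim embedding
short = truncate_embedding(full, 2)   # 2 dims, unit length again
```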

    Benchmarks

    State-of-the-Art Ranking

    Industry-leading performance on the MTEB multilingual leaderboard and versatile application support.

    MTEB Multilingual

    No. 1

    Score: 70.58

    Layers

    36

    Transformer layers

    Instruction Aware

    Yes

    Task-specific prompting

    Category         Task                   Status
    Retrieval        Text Retrieval         Native Support
    Retrieval        Code Retrieval         Native Support
    Classification   Text Classification    Native Support
    Clustering       Text Clustering        Native Support
    Mining           Bitext Mining          Native Support

    Performance data sourced from official Qwen3 Embedding evaluations.

    Pricing

    Flexible Pricing Tiers

    Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.

    Tier        Input / 1M tokens    Output / 1M tokens
    Standard    $0.02                $0.00
    Async       $0.03                $0.00
    Realtime    $0.04                $0.00

    The model natively supports a context window of up to 32K tokens. Output tokens are not billed for embedding operations.
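For a concrete sense of the arithmetic, a quick cost estimate using the tier prices above (the 50M-token workload is hypothetical):

```python
# Prices per 1M input tokens, from the tier table above.
PRICE_PER_1M = {"standard": 0.02, "async": 0.03, "realtime": 0.04}

def embedding_cost(input_tokens: int, tier: str) -> float:
    """Embedding jobs are billed on input tokens only; output is free."""
    return input_tokens / 1_000_000 * PRICE_PER_1M[tier]

# Hypothetical workload: 50M tokens on the Standard tier.
print(f"${embedding_cost(50_000_000, 'standard'):.2f}")  # $1.00
```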

    Quickstart

    Start Building in Minutes

    Qwen3-Embedding-8B is accessible through standard OpenAI-compatible endpoints. Here is how to submit a batch embedding job to Doubleword.ai using the OpenAI Python SDK.

    Developer Tip: Embedding Endpoints

    Ensure you point your batch jobs to the /v1/embeddings endpoint rather than the chat completions endpoint to generate vector arrays.

    Python
    from openai import OpenAI
    
    client = OpenAI(
        api_key="your-api-key-here",
        base_url="https://api.doubleword.ai/v1"
    )
    
    # Step 1: Upload a batch input file formatted for embeddings
    with open("batch_embeddings_requests.jsonl", "rb") as file:
        batch_file = client.files.create(
            file=file,
            purpose="batch"
        )
    
    print(f"File ID: {batch_file.id}")
    
    # Step 2: Create a batch job (Note the /v1/embeddings endpoint)
    batch = client.batches.create(
        input_file_id=batch_file.id,
        endpoint="/v1/embeddings",
        completion_window="24h"
    )
    
    print(f"Batch ID: {batch.id}")
    
    # Step 3: Check batch status
    batch_status = client.batches.retrieve(batch.id)
    print(f"Status: {batch_status.status}")
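The batch input file uploaded in Step 1 is a JSONL file with one request per line. A sketch of generating it, assuming the standard OpenAI batch request shape (the model name and custom_id values are illustrative; check Doubleword's docs for the exact model identifier):

```python
import json

# Each line is one embedding request in OpenAI batch format (assumed here).
requests = [
    {
        "custom_id": f"doc-{i}",
        "method": "POST",
        "url": "/v1/embeddings",
        "body": {"model": "Qwen3-Embedding-8B", "input": text},
    }
    for i, text in enumerate(["First document.", "Second document."])
]

with open("batch_embeddings_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```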

    Ready to deploy Qwen3-Embedding-8B?