Qwen3-Embedding-8B
The state-of-the-art multilingual embedding model with flexible vector dimensions and advanced long-text understanding.
Total Parameters
8B
Context Window
32K
Tokens
Output Dimensions
32–4096
Configurable
Supported Languages
100+
Languages
Exceptional Versatility & Multilingual Retrieval
The Qwen3 Embedding series is the latest generation of embedding models in the Qwen family, designed specifically for text embedding and ranking tasks. Built on the dense foundation models of the Qwen3 series, the 8B variant inherits their strong multilingual capabilities, long-text understanding, and reasoning skills. It offers configurable output dimensions (from 32 up to 4096) and supports user-defined instructions, making it highly adaptable to specific tasks, languages, and retrieval scenarios.
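Because the model is instruction aware, queries can carry a short task description while documents are embedded as-is. A minimal sketch of the commonly documented formatting convention — the helper name and exact task wording here are illustrative, not part of any official API:

```python
def format_query(task_description: str, query: str) -> str:
    """Prepend a task instruction to a query; documents need no instruction."""
    return f"Instruct: {task_description}\nQuery: {query}"

prompt = format_query(
    "Given a web search query, retrieve relevant passages that answer the query",
    "What is Matryoshka Representation Learning?",
)
print(prompt)
```

Embedding the instructed query and the plain documents with the same model then biases similarity scores toward the stated task.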
Vector Embeddings — 32 to 4096 dims
Built for semantic search & retrieval
Advanced Text & Code Retrieval
Achieve highly accurate search functionality across vast document repositories and codebases with robust multilingual and cross-lingual support.
Text Classification & Clustering
Efficiently categorize massive datasets and perform unsupervised grouping of text data to discover underlying patterns and topics.
Bitext Mining
Identify and align parallel sentences across more than 100 different languages to support translation and localization workflows.
Flexible Dimensionality (MRL)
Utilize Matryoshka Representation Learning (MRL) to truncate embedding dimensions on the fly (down to 32 dimensions) while preserving high retrieval accuracy.
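MRL truncation can be done client-side after retrieval of a full-size vector. A minimal sketch, assuming the usual convention of keeping the leading components and re-normalizing to unit length (NumPy is used for brevity):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components and re-normalize to unit L2 norm."""
    truncated = vec[:dims]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

full = np.random.default_rng(0).normal(size=4096)  # stand-in for a real embedding
small = truncate_embedding(full, 32)               # 32-dim unit vector
```

Re-normalizing keeps cosine similarity meaningful after truncation, which is why MRL-trained models degrade gracefully at smaller dimensions.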
State-of-the-Art Ranking
Industry-leading performance on the MTEB multilingual leaderboard, with versatile support across application types.
MTEB Multilingual
No. 1
Score: 70.58
Layers
36
Transformer layers
Instruction Aware
Yes
Task-specific prompting
| Category | Task | Status |
|---|---|---|
| Retrieval | Text Retrieval | Native Support |
| Retrieval | Code Retrieval | Native Support |
| Classification | Text Classification | Native Support |
| Clustering | Text Clustering | Native Support |
| Mining | Bitext Mining | Native Support |
Performance data sourced from official Qwen3 Embedding evaluations.
Flexible Pricing Tiers
Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.02 | $0.00 |
| Async | $0.03 | $0.00 |
| Realtime | $0.04 | $0.00 |
The context window natively supports up to 32K tokens. Output tokens are not billed for embedding operations.
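Since only input tokens are billed, cost estimation is a single multiplication. A quick sketch using the per-tier prices from the table above:

```python
# Prices per 1M input tokens, taken from the pricing table above (USD).
PRICE_PER_M_INPUT = {"standard": 0.02, "async": 0.03, "realtime": 0.04}

def embedding_cost(tokens: int, tier: str) -> float:
    """Cost in USD; output tokens are free for embedding operations."""
    return tokens / 1_000_000 * PRICE_PER_M_INPUT[tier]

# Embedding a 50M-token corpus on the Standard tier costs $1.00:
cost = embedding_cost(50_000_000, "standard")  # → 1.0
```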
Start Building in Minutes
Qwen3-Embedding-8B is accessible via standard OpenAI-compatible endpoints. Here is how to integrate it using the Python SDK for batch embedding generation via Doubleword.ai.
Developer Tip: Embedding Endpoints
Ensure you point your batch jobs to the /v1/embeddings endpoint rather than the chat completions endpoint to generate vector arrays.
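The batch input file is JSONL, one request object per line. A minimal sketch of generating it, assuming the standard OpenAI batch request schema; the model identifier "Qwen3-Embedding-8B" and the `dimensions` value are assumptions and may differ on your deployment:

```python
import json

# Each line: custom_id for matching results, plus the embedding request body.
# Model name and dimensions are assumed values — check your deployment's docs.
requests = [
    {
        "custom_id": f"doc-{i}",
        "method": "POST",
        "url": "/v1/embeddings",
        "body": {"model": "Qwen3-Embedding-8B", "input": text, "dimensions": 1024},
    }
    for i, text in enumerate(["first document", "second document"])
]

with open("batch_embeddings_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```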
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1",
)

# Step 1: Upload a batch input file formatted for embeddings
with open("batch_embeddings_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch",
    )
print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job (note the /v1/embeddings endpoint)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)
print(f"Batch ID: {batch.id}")

# Step 3: Check the batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")
```
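Once the batch reports `completed`, results arrive as a JSONL output file with one response object per request. A hedged sketch of parsing a single output line locally — the shape below mirrors the standard OpenAI batch output schema and may differ slightly on your deployment:

```python
import json

# A stand-in for one line of the downloaded batch output file.
# Real embeddings have the full dimensionality; three values shown for brevity.
sample_output_line = json.dumps({
    "custom_id": "doc-0",
    "response": {"body": {"data": [{"embedding": [0.01, -0.02, 0.03], "index": 0}]}},
})

record = json.loads(sample_output_line)
vector = record["response"]["body"]["data"][0]["embedding"]
# `custom_id` lets you match each vector back to the original request.
```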