Qwen3-Embedding-8B
The state-of-the-art multilingual embedding model with flexible vector dimensions and advanced long-text understanding.
Total Parameters
8B
Context Window
32K
Tokens
Output Dimensions
32–4096
Configurable
Supported Languages
100+
Languages
Exceptional Versatility & Multilingual Retrieval
The Qwen3 Embedding series is the latest generation of embedding models in the Qwen family, designed specifically for text embedding and ranking tasks. Built on the dense foundation models of the Qwen3 series, the 8B variant inherits their strong multilingual capabilities, long-text understanding, and reasoning skills. It offers configurable output dimensions (from 32 up to 4096) and supports user-defined instructions, making it highly adaptable to specific tasks, languages, and retrieval scenarios.
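Because the model is instruction aware, queries can carry a short task description while documents are embedded as-is. A minimal sketch of the commonly documented formatting convention — the helper name and exact task wording here are illustrative, not part of any official API:

```python
def format_query(task_description: str, query: str) -> str:
    """Prepend a task instruction to a query; documents need no instruction."""
    return f"Instruct: {task_description}\nQuery: {query}"

prompt = format_query(
    "Given a web search query, retrieve relevant passages that answer the query",
    "What is Matryoshka Representation Learning?",
)
print(prompt)
```

Embedding the instructed query and the plain documents with the same model then biases similarity scores toward the stated task.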
Vector Embeddings — 32 to 4096 dims
Built for semantic search & retrieval
Advanced Text & Code Retrieval
Achieve highly accurate search functionality across vast document repositories and codebases with robust multilingual and cross-lingual support.
Text Classification & Clustering
Efficiently categorize massive datasets and perform unsupervised grouping of text data to discover underlying patterns and topics.
Bitext Mining
Identify and align parallel sentences across more than 100 different languages to support translation and localization workflows.
Flexible Dimensionality (MRL)
Utilize Matryoshka Representation Learning (MRL) to truncate embedding dimensions on the fly (down to 32 dimensions) while preserving high retrieval accuracy.
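MRL truncation can be done client-side after retrieval of a full-size vector. A minimal sketch, assuming the usual convention of keeping the leading components and re-normalizing to unit length (NumPy is used for brevity):

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components and re-normalize to unit L2 norm."""
    truncated = vec[:dims]
    norm = np.linalg.norm(truncated)
    return truncated / norm if norm > 0 else truncated

full = np.random.default_rng(0).normal(size=4096)  # stand-in for a real embedding
small = truncate_embedding(full, 32)               # 32-dim unit vector
```

Re-normalizing keeps cosine similarity meaningful after truncation, which is why MRL-trained models degrade gracefully at smaller dimensions.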
State-of-the-Art Ranking
Industry-leading performance on the MTEB multilingual leaderboard, with versatile support across application types.
MTEB Multilingual
No. 1
Score: 70.58
Layers
36
Transformer layers
Instruction Aware
Yes
Task-specific prompting
| Category | Task | Status |
|---|---|---|
| Retrieval | Text Retrieval | Native Support |
| Retrieval | Code Retrieval | Native Support |
| Classification | Text Classification | Native Support |
| Clustering | Text Clustering | Native Support |
| Mining | Bitext Mining | Native Support |
Performance data sourced from official Qwen3 Embedding evaluations.
Flexible Pricing Tiers
Choose the optimal balance of speed and cost for your workflow. Prices are per 1M tokens.
| Tier | Input / 1M tokens | Output / 1M tokens |
|---|---|---|
| Standard | $0.02 | $0.00 |
| Async | $0.03 | $0.00 |
| Realtime | $0.04 | $0.00 |
The context window natively supports up to 32K tokens. Output tokens are not billed for embedding operations.
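Since only input tokens are billed, cost estimation is a single multiplication. A quick sketch using the per-tier prices from the table above:

```python
# Prices per 1M input tokens, taken from the pricing table above (USD).
PRICE_PER_M_INPUT = {"standard": 0.02, "async": 0.03, "realtime": 0.04}

def embedding_cost(tokens: int, tier: str) -> float:
    """Cost in USD; output tokens are free for embedding operations."""
    return tokens / 1_000_000 * PRICE_PER_M_INPUT[tier]

# Embedding a 50M-token corpus on the Standard tier costs $1.00:
cost = embedding_cost(50_000_000, "standard")  # → 1.0
```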
Start Building in Minutes
Qwen3-Embedding-8B is accessible via standard OpenAI-compatible endpoints. Here is how to integrate it using the Python SDK for batch embedding generation via Doubleword.ai.
Developer Tip: Embedding Endpoints
Ensure you point your batch jobs to the /v1/embeddings endpoint rather than the chat completions endpoint to generate vector arrays.
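The batch input file is JSONL, one request object per line. A minimal sketch of generating it, assuming the standard OpenAI batch request schema; the model identifier "Qwen3-Embedding-8B" and the `dimensions` value are assumptions and may differ on your deployment:

```python
import json

# Each line: custom_id for matching results, plus the embedding request body.
# Model name and dimensions are assumed values — check your deployment's docs.
requests = [
    {
        "custom_id": f"doc-{i}",
        "method": "POST",
        "url": "/v1/embeddings",
        "body": {"model": "Qwen3-Embedding-8B", "input": text, "dimensions": 1024},
    }
    for i, text in enumerate(["first document", "second document"])
]

with open("batch_embeddings_requests.jsonl", "w") as f:
    for req in requests:
        f.write(json.dumps(req) + "\n")
```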
```python
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key-here",
    base_url="https://api.doubleword.ai/v1",
)

# Step 1: Upload a batch input file formatted for embeddings
with open("batch_embeddings_requests.jsonl", "rb") as file:
    batch_file = client.files.create(
        file=file,
        purpose="batch",
    )
print(f"File ID: {batch_file.id}")

# Step 2: Create a batch job (note the /v1/embeddings endpoint)
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/embeddings",
    completion_window="24h",
)
print(f"Batch ID: {batch.id}")

# Step 3: Check the batch status
batch_status = client.batches.retrieve(batch.id)
print(f"Status: {batch_status.status}")
```
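Once the batch reports `completed`, results arrive as a JSONL output file with one response object per request. A hedged sketch of parsing a single output line locally — the shape below mirrors the standard OpenAI batch output schema and may differ slightly on your deployment:

```python
import json

# A stand-in for one line of the downloaded batch output file.
# Real embeddings have the full dimensionality; three values shown for brevity.
sample_output_line = json.dumps({
    "custom_id": "doc-0",
    "response": {"body": {"data": [{"embedding": [0.01, -0.02, 0.03], "index": 0}]}},
})

record = json.loads(sample_output_line)
vector = record["response"]["body"]["data"][0]["embedding"]
# `custom_id` lets you match each vector back to the original request.
```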