Qwen3 Reranker 8B API & Playground

The Qwen3 Embedding series, which includes Qwen3 Reranker 8B, represents significant advancements in multiple text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

Qwen3 Reranker 8B API Features

Serverless: Qwen3 Reranker 8B is available via Fireworks' serverless API, where you pay per token. There are several ways to call the Fireworks API, including Fireworks' Python client, the REST API, or OpenAI's Python client.
On-demand Deployment: On-demand deployments let you run Qwen3 Reranker 8B on dedicated GPUs using Fireworks' high-performance serving stack, with high reliability and no rate limits.
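
As a rough illustration of the REST route, the sketch below builds a rerank request and orders the returned scores using only the Python standard library. The endpoint URL, the model identifier, the payload fields (`query`, `documents`, `top_n`), and the response shape (`results` entries with `index` and `relevance_score`) are assumptions modeled on common rerank API conventions, not details confirmed by this page; check the Fireworks documentation before relying on them.

```python
import json
import urllib.request

# Assumed values, not confirmed by this page: verify against the Fireworks docs.
RERANK_URL = "https://api.fireworks.ai/inference/v1/rerank"
MODEL_ID = "accounts/fireworks/models/qwen3-reranker-8b"


def build_rerank_payload(query, documents, top_n=None):
    """Build the JSON body for a rerank request (field names are assumed)."""
    payload = {"model": MODEL_ID, "query": query, "documents": list(documents)}
    if top_n is not None:
        payload["top_n"] = top_n
    return payload


def order_by_relevance(response, documents):
    """Pair each document with its score, highest score first.

    Assumed response shape:
    {"results": [{"index": int, "relevance_score": float}, ...]}
    """
    results = sorted(
        response["results"], key=lambda r: r["relevance_score"], reverse=True
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in results]


def rerank(api_key, query, documents, top_n=None):
    """Send the request over HTTPS and return documents ordered by relevance."""
    body = json.dumps(build_rerank_payload(query, documents, top_n)).encode("utf-8")
    request = urllib.request.Request(
        RERANK_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as resp:
        return order_by_relevance(json.load(resp), documents)
```

Keeping payload construction and response handling separate from the network call makes both easy to check offline; only `rerank` needs an API key and network access.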

Available Serverless

Run queries immediately, pay only for usage

$0.20 per 1M tokens
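
To make the per-token pricing concrete, here is a minimal cost estimate based on the listed rate. It assumes billing scales linearly with the token count and ignores any minimums, rounding, or discounts, none of which are stated on this page; the function name is illustrative.

```python
from decimal import Decimal

# Listed serverless rate: $0.20 per 1M tokens.
PRICE_PER_MILLION_TOKENS_USD = Decimal("0.20")


def estimate_cost_usd(total_tokens: int) -> Decimal:
    """Estimate serverless cost, assuming strictly linear per-token billing."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS_USD * Decimal(total_tokens) / Decimal(1_000_000)
```

Under these assumptions, `estimate_cost_usd(2_500_000)` comes to 0.50 USD. `Decimal` is used instead of floats so that currency arithmetic stays exact.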

Metadata

State: Ready
Created on: 10/7/2025
Kind: Embedding model
Provider: Fireworks AI

Specification

Calibrated

Mixture-of-Experts

Parameters: 8.18B

Supported Functionality

Fine-tuning: Not supported
Serverless: Supported
Context Length: 40.9k tokens
Function Calling: Not supported
Embeddings: Supported
Rerankers: Supported
Image input: Not supported
