POST /rerank
Rerank documents
curl --request POST \
  --url https://api.fireworks.ai/inference/v1/rerank \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "accounts/fireworks/models/qwen3-reranker-8b",
  "query": "What is machine learning?",
  "documents": [
    "Machine learning is a subset of AI.",
    "The weather is sunny today."
  ],
  "top_n": 2,
  "return_documents": true,
  "task": "Given a web search query, retrieve relevant passages that answer the query"
}'
{
  "object": "list",
  "model": "<string>",
  "data": [
    {
      "index": 123,
      "relevance_score": 0.5,
      "document": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "total_tokens": 123
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
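As a minimal sketch of the header format described above, the helper below builds the two headers used in the curl example. The function name `auth_headers` and the environment variable `FIREWORKS_API_KEY` are assumptions made for illustration; this page does not define them.

```python
import os


def auth_headers(token=None):
    """Build the request headers shown in the curl example.

    `auth_headers` and the FIREWORKS_API_KEY variable name are
    illustrative assumptions, not names defined by this page.
    """
    # Fall back to an environment variable so the token stays out of source code.
    token = token or os.environ.get("FIREWORKS_API_KEY", "")
    if not token:
        raise ValueError("no auth token provided")
    return {
        # Header of the form "Bearer <token>", as required above.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Calling `auth_headers("my-token")` returns a dictionary that can be passed to most HTTP clients as the request headers.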

Body

application/json
query
string
required

The search query to use for reranking documents.

Example:

"What is machine learning?"

documents
string[]
required

A list of documents to rerank. Each document is a string.

Minimum length: 1
Example:

[
  "Machine learning is a subset of AI.",
  "The weather is sunny today."
]

model
string | null

The name of the reranker model to use.

Example:

"accounts/fireworks/models/qwen3-reranker-8b"

top_n
integer | null

The number of most relevant documents to return. If not specified, all documents are returned.

Required range: x >= 1
return_documents
boolean
default: true

Whether to return the document text in the response. Defaults to true.

task
string | null

Optional task description to guide the reranking process.

Example:

"Given a web search query, retrieve relevant passages that answer the query"
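The body parameters above can be assembled and checked on the client before sending. The following is a sketch using only the Python standard library; the helper names `build_rerank_payload` and `rerank` are hypothetical. It assumes that optional fields typed `| null` may simply be omitted when unset, which this page does not state explicitly.

```python
import json
import urllib.request

# URL taken from the curl example on this page.
RERANK_URL = "https://api.fireworks.ai/inference/v1/rerank"


def build_rerank_payload(query, documents, model=None, top_n=None,
                         return_documents=True, task=None):
    """Build the JSON body, enforcing the documented constraints locally."""
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    if isinstance(documents, str):
        raise ValueError("documents must be a list of strings, not one string")
    documents = list(documents)
    # Documented constraint: minimum length 1, each document a string.
    if not documents or not all(isinstance(d, str) for d in documents):
        raise ValueError("documents must be a non-empty list of strings")
    # Documented constraint: top_n >= 1 when given.
    if top_n is not None and (type(top_n) is not int or top_n < 1):
        raise ValueError("top_n must be an integer >= 1")

    payload = {
        "query": query,
        "documents": documents,
        "return_documents": bool(return_documents),
    }
    # Assumption: unset optional fields can be left out of the body.
    optional = {"model": model, "top_n": top_n, "task": task}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def rerank(token, **params):
    """POST the payload to the rerank endpoint and return the parsed JSON."""
    body = json.dumps(build_rerank_payload(**params)).encode("utf-8")
    request = urllib.request.Request(
        RERANK_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, `rerank(token, query="What is machine learning?", documents=[...], top_n=2)` would send the same kind of request as the curl example. Validating locally catches an empty `documents` list or a `top_n` of 0 before any network call is made.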

Response

200 - application/json

OK

object
enum<string>
required

The object type, which is always "list".

Available options: "list"
model
string
required

The name of the model used for reranking.

data
object[]
required

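The list of reranked documents, ordered by relevance score (highest first). As in the sample response, each entry has an index, a relevance_score, and, when return_documents is true, the document text.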

usage
object
required

The token usage for the request, reported as prompt_tokens and total_tokens in the sample response.
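A short sketch of reading the response follows. It assumes, as is common for rerank APIs but not stated on this page, that `index` is the zero-based position of the document in the request's `documents` array. That lets the text be recovered when `return_documents` is false. The helper name `ranked_texts` is hypothetical.

```python
def ranked_texts(response, documents):
    """Return (relevance_score, text) pairs in the order the API returned them.

    Assumption: item["index"] is the zero-based position of the document
    in the original request's `documents` list.
    """
    results = []
    for item in response["data"]:
        # "document" is present only when return_documents was true.
        text = item.get("document")
        if text is None:
            text = documents[item["index"]]
        results.append((item["relevance_score"], text))
    return results


# Sample response in the documented shape; the values are made up for illustration.
sample_response = {
    "object": "list",
    "model": "accounts/fireworks/models/qwen3-reranker-8b",
    "data": [
        {"index": 0, "relevance_score": 0.9},
        {"index": 1, "relevance_score": 0.1},
    ],
    "usage": {"prompt_tokens": 30, "total_tokens": 30},
}
sample_documents = [
    "Machine learning is a subset of AI.",
    "The weather is sunny today.",
]
top_score, top_text = ranked_texts(sample_response, sample_documents)[0]
```

Because `data` is already ordered by relevance score, the first pair is the best match and no client-side sorting is needed.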