POST /v1/rerank — Rerank Documents for RAG Pipelines

Use this endpoint to rerank a set of candidate documents against a user query. Reranking is typically the last retrieval step in a RAG pipeline, just before generation: after an initial vector search retrieves a broad set of candidates, a reranker scores each document’s relevance to the query and returns the documents in ranked order, improving the quality of the context passed to the language model.

Endpoint: POST https://api.qhaigc.net/v1/rerank

Supported Models

Model ID	Description
`bge-reranker-v2-m3`	Lightweight cross-encoder reranker optimized for multilingual RAG pipelines.

Request Parameters

model

string

Required

The reranking model to use. Example: bge-reranker-v2-m3.

query

string

Required

The search query or user question to rank documents against.

documents

string[]

Required

An array of document strings to score and rank. Each element is the text content of one document or passage.

top_n

integer

Return only the top N highest-scoring results. If omitted, all documents are scored and returned.
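A minimal sketch of assembling the request body from the parameters above. The helper name `build_rerank_payload` and its client-side checks are assumptions made for illustration; how the server itself validates an empty documents array or a non-positive top_n is not stated here.

```python
from typing import Any, Optional


def build_rerank_payload(
    model: str,
    query: str,
    documents: list[str],
    top_n: Optional[int] = None,
) -> dict[str, Any]:
    """Build a JSON-serializable body for POST /v1/rerank (hypothetical helper)."""
    # These checks are an assumption; the API's own validation may differ.
    if not documents:
        raise ValueError("documents must contain at least one string")
    if top_n is not None and top_n < 1:
        raise ValueError("top_n must be a positive integer")

    payload: dict[str, Any] = {
        "model": model,
        "query": query,
        "documents": documents,
    }
    # Omit top_n entirely when unset so that all documents are scored and returned.
    if top_n is not None:
        payload["top_n"] = top_n
    return payload
```

The returned dictionary can be passed directly as the JSON body of the request.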

Response Fields

results

array

Array of scored document results, sorted by relevance_score in descending order (highest relevance first).

Result object fields

results[].index

integer

The original position of this document in the input documents array. Use this to map scores back to your source documents.

results[].relevance_score

number

A float between 0 and 1 indicating relevance to the query. Higher scores mean higher relevance.

results[].document

string

The document text (returned when the model includes document content in the response).

usage

object

Token usage for this request.

Usage fields

usage.prompt_tokens

integer

Number of tokens processed (query + all documents).

usage.total_tokens

integer

Total tokens consumed.
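Because results[].document is only present in some responses, client code may want to fall back to the input array via results[].index. The following is a sketch under that assumption; `RankedDocument` and `resolve_results` are hypothetical names, not part of the API.

```python
from dataclasses import dataclass


@dataclass
class RankedDocument:
    index: int              # position in the original documents array
    relevance_score: float  # score reported by the reranker
    text: str               # document text, from the response or the input


def resolve_results(response_body: dict, documents: list[str]) -> list[RankedDocument]:
    """Pair each result with its text, preferring the copy in the response."""
    ranked = []
    for item in response_body["results"]:
        idx = item["index"]
        # Fall back to the caller's own documents when the field is absent or empty.
        text = item.get("document") or documents[idx]
        ranked.append(RankedDocument(idx, item["relevance_score"], text))
    return ranked
```

The list keeps the order of the results array, so the first element is the highest-scoring document.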

Code Examples

import requests

url = "https://api.qhaigc.net/v1/rerank"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer sk-your-api-key-here"
}

payload = {
    "model": "bge-reranker-v2-m3",
    "query": "Organic skincare products for sensitive skin",
    "top_n": 3,
    "documents": [
        "Organic skincare for sensitive skin with aloe vera and chamomile. Clinically tested and hypoallergenic.",
        "New makeup trends focus on bold colors and innovative application techniques for a striking look.",
        "Bio-Hautpflege für empfindliche Haut mit Aloe Vera und Kamille. Klinisch getestet und hypoallergen."
    ]
}

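# Set a timeout and check the HTTP status so a failed request raises a clear error
# instead of a KeyError on the missing "results" field.
response = requests.post(url, headers=headers, json=payload, timeout=30)
response.raise_for_status()
results = response.json()["results"]

# Results arrive sorted by relevance_score, highest first.
for r in results:
    print(f"Index {r['index']}: score={r['relevance_score']:.4f}")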

Example Response

{
  "results": [
    {
      "index": 0,
      "relevance_score": 0.9854
    },
    {
      "index": 2,
      "relevance_score": 0.6773
    },
    {
      "index": 1,
      "relevance_score": 0.000016
    }
  ],
  "usage": {
    "prompt_tokens": 77,
    "total_tokens": 77
  }
}

The response shows that the documents at index 0 (English skincare text, 0.9854) and index 2 (German skincare text, 0.6773) are both scored as relevant to the query, while the document at index 1 (makeup trends) scores close to zero.
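Since results are sorted by score rather than by input position, it can be convenient to realign the scores with the original documents array. The sketch below does this for the example response; the function name `scores_in_input_order` is hypothetical, and documents dropped by top_n are represented as None.

```python
from typing import Optional


def scores_in_input_order(results: list[dict], num_documents: int) -> list[Optional[float]]:
    """Return one score per input document, or None if it was not returned."""
    scores: list[Optional[float]] = [None] * num_documents
    for item in results:
        # index points back into the documents array sent in the request.
        scores[item["index"]] = item["relevance_score"]
    return scores


# The results array from the example response above.
example_results = [
    {"index": 0, "relevance_score": 0.9854},
    {"index": 2, "relevance_score": 0.6773},
    {"index": 1, "relevance_score": 0.000016},
]

aligned = scores_in_input_order(example_results, num_documents=3)
for position, score in enumerate(aligned):
    print(position, score)
```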

RAG Pipeline Integration

Embed and index your documents

Use POST /v1/embeddings to convert your knowledge base into vectors and store them in a vector database.

Retrieve candidates with vector search

At query time, embed the user’s question and retrieve the top 20–50 most similar document chunks from your vector database.

Rerank the candidates

Send the retrieved chunks to POST /v1/rerank with the user’s question as the query. Set top_n to 3–5 to keep only the most relevant passages.

Generate the response

Pass the top reranked passages as context to POST /v1/chat/completions and ask the model to answer based on the provided content.

Reranking is most effective when your initial vector retrieval returns 20+ candidates. With fewer candidates, the reranker has less to work with and the quality improvement is smaller.
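The query-time steps above (retrieve, rerank, generate) can be sketched as a single function. This is an illustrative outline only: `answer_with_rerank` and the three callables it takes are hypothetical, standing in for your vector database search, a call to POST /v1/rerank, and a call to POST /v1/chat/completions. The defaults of 30 candidates and 4 passages are picked from the ranges suggested above.

```python
from typing import Callable


def answer_with_rerank(
    question: str,
    retrieve: Callable[[str, int], list[str]],
    rerank: Callable[[str, list[str], int], list[dict]],
    generate: Callable[[str, list[str]], str],
    candidates: int = 30,
    top_n: int = 4,
) -> str:
    """Run retrieve -> rerank -> generate with caller-supplied implementations."""
    # Retrieve a broad candidate set from the vector database.
    chunks = retrieve(question, candidates)
    if not chunks:
        # Nothing to rerank; let the generator answer without context.
        return generate(question, [])

    # Rerank the candidates and keep only the best few.
    results = rerank(question, chunks, min(top_n, len(chunks)))

    # Map each result back to its chunk via index, preserving ranked order.
    passages = [chunks[item["index"]] for item in results]

    # Generate the answer from the top passages.
    return generate(question, passages)
```

Passing the three stages in as callables keeps the sketch independent of any particular vector database or HTTP client, and makes each stage easy to stub in tests.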
