Rerankers
Define reranker abstractions and implementations for retrieval result refinement.
This module provides base classes and concrete reranker implementations for post-processing retrieval results in a terminology-mapping pipeline. It includes a simple BM25-based lexical reranker and a neural reranker based on Qwen3 causal language models.
The module is designed around pipeline-style composition, allowing rerankers to be used as callable processing stages that accept retrieval outputs and return the same results reordered by reranking scores.
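As a rough illustration of this pipeline-style composition, the sketch below chains plain callables so that each stage receives the previous stage's output. `run_pipeline` and the toy stages are hypothetical names used only for illustration; real stages would exchange RetrieverResults objects rather than lists of strings.

```python
from functools import reduce
from typing import Any, Callable, Iterable


def run_pipeline(stages: Iterable[Callable[[Any], Any]], data: Any) -> Any:
    """Feed `data` through each stage in order (hypothetical helper)."""
    return reduce(lambda value, stage: stage(value), stages, data)


# Toy stages standing in for a retriever and a reranker.
def toy_retriever(query: str) -> list[str]:
    return [f"{query} syndrome", f"{query}", f"acute {query}"]


def toy_reranker(candidates: list[str]) -> list[str]:
    # Reorder only: the same items come out that went in.
    return sorted(candidates, key=len)


ranked = run_pipeline([toy_retriever, toy_reranker], "asthma")
```

The point of the sketch is that a reranking stage changes the order of the candidates, not their membership.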
BaseReranker
Bases: PipelineBaseClass, ABC
Define the abstract interface for reranker pipeline components.
This base class establishes the contract for reranker implementations that
operate on retrieval results. Subclasses must implement the rerank()
method, while the base __call__() method provides runtime type checking
and pipeline-compatible invocation behavior.
Source code in aatm\rerankers.py
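The contract described above might look roughly like the following sketch. It is not the module's source: `RetrieverResults` is replaced by a hypothetical stand-in dataclass, `PipelineBaseClass` is omitted, and the class names `SketchBaseReranker` and `ReverseReranker` are invented for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class RetrieverResults:
    """Hypothetical stand-in: one candidate list per query."""
    queries: list[str]
    results: list[list[str]] = field(default_factory=list)


class SketchBaseReranker(ABC):
    """Illustrative base class; PipelineBaseClass is omitted here."""

    def __init__(self, *args, **kwargs) -> None:
        # Accepts anything so subclasses can define their own signatures.
        pass

    @abstractmethod
    def rerank(self, retriever_results: RetrieverResults) -> RetrieverResults:
        raise NotImplementedError

    def __call__(self, retriever_results: RetrieverResults) -> RetrieverResults:
        # Runtime type check, then delegate to the subclass implementation.
        assert isinstance(retriever_results, RetrieverResults), (
            f"expected RetrieverResults, got {type(retriever_results).__name__}"
        )
        return self.rerank(retriever_results)


class ReverseReranker(SketchBaseReranker):
    """Toy subclass that reverses each candidate list in place."""

    def rerank(self, retriever_results: RetrieverResults) -> RetrieverResults:
        for candidates in retriever_results.results:
            candidates.reverse()
        return retriever_results
```

Calling `ReverseReranker()(results)` goes through the type check first, so passing anything other than the stand-in `RetrieverResults` fails before `rerank()` runs.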
__init__(*args, **kwargs)
Initialize the reranker base class.
This constructor accepts arbitrary positional and keyword arguments to support a flexible subclass interface, but it does not perform any initialization itself.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*args` | `Any` | Positional arguments reserved for subclasses. | `()` |
| `**kwargs` | `Any` | Keyword arguments reserved for subclasses. | `{}` |
Returns:
| Type | Description |
|---|---|
| `None` | None. |
rerank(retriever_results)
abstractmethod
Rerank the retrieved results for one or more queries.
Subclasses must implement this method to assign reranking scores and reorder documents within each query result set according to their relevance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `retriever_results` | `RetrieverResults` | Retrieval output containing the original queries and their associated candidate results. | required |
Returns:
| Type | Description |
|---|---|
| `RetrieverResults` | The reranked retrieval results. |
Raises:
| Type | Description |
|---|---|
| `NotImplementedError` | If the subclass does not override this method. |
__call__(retriever_results)
Validate the input type and rerank the retrieval results.
This method makes reranker instances directly callable and compatible
with the pipeline interface. It checks that the input is a
RetrieverResults instance before delegating to rerank().
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `retriever_results` | `RetrieverResults` | Retrieval output to be reranked. | required |
Returns:
| Type | Description |
|---|---|
| `RetrieverResults` | The reranked retrieval results. |
Raises:
| Type | Description |
|---|---|
| `AssertionError` | If the input is not a RetrieverResults instance. |
BM25Reranker
Bases: BaseReranker
Rerank retrieved documents using BM25 lexical similarity.
This reranker scores each retrieved document against its corresponding query using BM25 over whitespace-tokenized text. Scores are stored in each result object and the result lists are sorted in descending score order.
rerank(retriever_results)
Rerank retrieval results using BM25 scores.
For each query, this method builds a BM25 index over the retrieved document expressions, computes lexical relevance scores for the query, stores the scores in the corresponding result objects, and sorts each result list by descending rerank score.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `retriever_results` | `RetrieverResults` | Retrieval output containing queries and their associated candidate documents. | required |
Returns:
| Type | Description |
|---|---|
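| `RetrieverResults` | The same RetrieverResults object, with a rerank score stored on each result and each result list sorted in descending score order. |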
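The scoring and sorting steps can be sketched in plain Python as follows. The page does not say which BM25 variant or library the module uses, so this sketch assumes Okapi BM25 with a smoothed IDF and the common defaults `k1=1.5` and `b=0.75`; the function names are hypothetical, and documents are plain strings rather than result objects.

```python
import math
from collections import Counter


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25 (assumed variant)."""
    tokenized = [doc.split() for doc in docs]  # whitespace tokenization
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


def rerank_by_bm25(query: str, docs: list[str]) -> list[tuple[str, float]]:
    """Pair each document with its score and sort by descending score."""
    scored = zip(docs, bm25_scores(query, docs))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked_docs = rerank_by_bm25(
    "acute kidney",
    ["chronic kidney disease", "acute kidney injury", "heart failure"],
)
```

Because the index is built only over the candidates retrieved for one query, the IDF statistics reflect that small candidate set rather than the whole corpus.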
Qwen3RerankerModels
Bases: Enum
Enumerate the supported Qwen3 reranker model identifiers.
This enumeration provides the Hugging Face model names corresponding to the available Qwen3 reranker variants supported by the package.
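A minimal sketch of such an enumeration is shown below. The member names and the `resolve_model` helper are hypothetical, and the three values are the Qwen3-Reranker checkpoints published on Hugging Face; which of them the package actually lists is not stated on this page.

```python
from enum import Enum


class SketchQwen3RerankerModels(Enum):
    # Member names are illustrative; values are Hugging Face model names.
    QWEN3_RERANKER_0_6B = "Qwen/Qwen3-Reranker-0.6B"
    QWEN3_RERANKER_4B = "Qwen/Qwen3-Reranker-4B"
    QWEN3_RERANKER_8B = "Qwen/Qwen3-Reranker-8B"


def resolve_model(model_id: str) -> SketchQwen3RerankerModels:
    """Map a user-supplied string to an enum member or raise ValueError."""
    for member in SketchQwen3RerankerModels:
        if model_id in (member.name, member.value):
            return member
    supported = ", ".join(m.value for m in SketchQwen3RerankerModels)
    raise ValueError(f"Unsupported model {model_id!r}; expected one of: {supported}")
```

Keeping the identifiers in one enumeration lets a constructor reject an unsupported name before any download is attempted.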
Qwen3Reranker
Bases: BaseReranker
Rerank retrieved documents with a Qwen3 language-model-based judge.
This reranker formats each query-document pair as an instruction-following relevance judgment task, runs the pairs through a Qwen3 causal language model, and interprets the model's probability of answering "yes" as the reranking score.
The class supports custom task instructions, prompt prefix and suffix templates, configurable sequence length, and automatic placement on CPU or CUDA when available.
__init__(model_id, max_length=8192, task=None, prefix=None, suffix=None, *args, **kwargs)
Initialize the Qwen3 reranker and load the model resources.
This constructor resolves the requested model identifier, loads the tokenizer and causal language model, configures the execution device, defines the task prompt and prompt wrappers, and precomputes tokenized prefix and suffix sequences used during input construction.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_id` | `str` | Name of the supported Qwen3 reranker model variant. | required |
| `max_length` | `int` | Maximum total token length for each formatted input, including prefix and suffix tokens. | `8192` |
| `task` | `str` | Optional task instruction describing the relevance judgment objective. A default instruction is used when not provided. | `None` |
| `prefix` | `str` | Optional prompt prefix inserted before each formatted query-document pair. A default system-and-user prompt is used when not provided. | `None` |
| `suffix` | `str` | Optional prompt suffix appended after each formatted query-document pair. A default assistant prompt is used when not provided. | `None` |
| `*args` | `Any` | Additional positional arguments reserved for compatibility. | `()` |
| `**kwargs` | `Any` | Additional keyword arguments reserved for compatibility. | `{}` |
Returns:
| Type | Description |
|---|---|
| `None` | None. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If `model_id` does not name a supported Qwen3 reranker model variant. |
| `OSError` | If the tokenizer or model weights cannot be loaded. |
format_instruction(instruction, query, doc)
Format a query-document pair as a reranking instruction string.
This helper builds the textual input passed to the language model by combining the task instruction, query, and candidate document into a structured prompt.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `instruction` | `str` | Task-level instruction that defines the relevance judgment objective. | required |
| `query` | `str` | Search query or source expression being matched. | required |
| `doc` | `str` | Candidate document or concept expression to evaluate. | required |
Returns:
| Type | Description |
|---|---|
| `str` | A formatted string containing the instruction, query, and document. |
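One plausible shape for the formatted string is sketched below. The tag layout follows the usage example published with the Qwen3-Reranker models; whether this module uses exactly that layout is an assumption, and the task text in the call is invented for illustration.

```python
def format_instruction(instruction: str, query: str, doc: str) -> str:
    """Combine instruction, query, and document into one prompt string."""
    # Layout borrowed from the published Qwen3-Reranker usage example (assumed here).
    return f"<Instruct>: {instruction}\n<Query>: {query}\n<Document>: {doc}"


prompt = format_instruction(
    "Judge whether the concept is a correct mapping for the source term",  # hypothetical task text
    "heart attack",
    "Myocardial infarction",
)
```

Each part sits on its own labeled line, so the model can tell the instruction, query, and document apart.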
process_inputs(pairs)
Tokenize and pad formatted query-document pairs for model inference.
This method tokenizes the provided text pairs, applies truncation while reserving space for the configured prefix and suffix tokens, appends those tokens to each sequence, pads the batch, and moves the resulting tensors to the model device.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `pairs` | `List[str]` | Sequence of formatted query-document prompt strings. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, Tensor]` | A dictionary of model input tensors suitable for forward inference. |
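The truncation budget and padding can be sketched without a tokenizer by working directly on lists of token ids. Everything here is illustrative: `build_batch` is a hypothetical helper, the ids are made up, nested lists stand in for tensors, and left padding is an assumption (the page does not state the padding side).

```python
def build_batch(
    pair_token_ids: list[list[int]],
    prefix_ids: list[int],
    suffix_ids: list[int],
    max_length: int,
    pad_id: int = 0,
) -> dict[str, list[list[int]]]:
    """Truncate, wrap, and left-pad token id sequences to a common length."""
    # Reserve room so prefix + body + suffix never exceeds max_length.
    budget = max_length - len(prefix_ids) - len(suffix_ids)
    if budget < 0:
        raise ValueError("max_length is too small for the prefix and suffix")

    wrapped = [prefix_ids + ids[:budget] + suffix_ids for ids in pair_token_ids]
    width = max((len(seq) for seq in wrapped), default=0)

    input_ids, attention_mask = [], []
    for seq in wrapped:
        pad = width - len(seq)
        # Left padding keeps the final position of every row on a real token.
        input_ids.append([pad_id] * pad + seq)
        attention_mask.append([0] * pad + [1] * len(seq))
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = build_batch([[1, 2, 3, 4, 5], [6]], prefix_ids=[101], suffix_ids=[102, 103], max_length=6)
```

Truncating the body before adding the wrappers guarantees that the suffix, which prompts the model for its answer, is never cut off.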
compute_logits(inputs, **kwargs)
Compute relevance scores from the model's final-token logits.
This method runs the model in inference mode, extracts the logits for the final token position, and keeps only the logits for the "yes" and "no" tokens. It applies a log-softmax over those two values and returns the resulting probability assigned to "yes" (the exponentiated log-probability) for each input example.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `inputs` | `dict[str, Tensor]` | Tokenized model inputs prepared for batch inference. | required |
| `**kwargs` | `Any` | Additional keyword arguments reserved for future extensions. | `{}` |
Returns:
| Type | Description |
|---|---|
| `list[float]` | A list of relevance scores, where each score is the model's probability that the corresponding document is relevant to the query. |
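The arithmetic behind the score can be shown on plain floats. This sketch assumes the two logits have already been extracted from the final token position; `yes_probability` is a hypothetical name and the logit values are made up.

```python
import math


def yes_probability(yes_logit: float, no_logit: float) -> float:
    """Return P("yes") from a two-way log-softmax over the (no, yes) logits."""
    # Subtract the max before exponentiating for numerical stability.
    peak = max(yes_logit, no_logit)
    log_norm = peak + math.log(math.exp(yes_logit - peak) + math.exp(no_logit - peak))
    log_p_yes = yes_logit - log_norm  # log-softmax entry for "yes"
    return math.exp(log_p_yes)


scores = [yes_probability(y, n) for y, n in [(2.0, 0.0), (0.0, 0.0), (-3.0, 1.0)]]
```

With only two classes, this is the same as applying a sigmoid to the difference between the "yes" and "no" logits, so the scores always fall between 0 and 1.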
rerank(retriever_results)
Rerank retrieval results using Qwen3 relevance judgments.
This method converts each query-document pair into an instruction-based prompt, performs batched model inference to obtain relevance scores, assigns those scores to the corresponding retrieved documents, and sorts each query result list in descending score order.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `retriever_results` | `RetrieverResults` | Retrieval output containing queries and candidate documents to rerank. | required |
Returns:
| Type | Description |
|---|---|
| `RetrieverResults` | The same retrieval results object with updated rerank score values and reordered candidate lists. |
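The overall flow (build pairs, score in batches, assign scores, sort) might be sketched as follows. The `Candidate` record, its `expression` and `rerank_score` fields, the batch size, and the toy scorer are all hypothetical; in the real class the scorer role is played by model inference, and whether its batches span query boundaries is not stated on this page.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    """Hypothetical result record; the real result class may differ."""
    expression: str
    rerank_score: Optional[float] = None


def rerank_in_batches(
    queries: list[str],
    results: list[list[Candidate]],
    score_batch: Callable[[list[tuple[str, str]]], list[float]],
    batch_size: int = 8,
) -> list[list[Candidate]]:
    """Score every (query, document) pair in batches, then sort per query."""
    # Flatten so that batches can span query boundaries (an assumption).
    flat = [(q, cand) for q, cands in zip(queries, results) for cand in cands]
    for start in range(0, len(flat), batch_size):
        chunk = flat[start:start + batch_size]
        scores = score_batch([(q, cand.expression) for q, cand in chunk])
        for (_, cand), score in zip(chunk, scores):
            cand.rerank_score = score
    for cands in results:
        cands.sort(key=lambda c: c.rerank_score, reverse=True)
    return results


def toy_scorer(pairs: list[tuple[str, str]]) -> list[float]:
    # Stand-in for model inference: fraction of query words found in the document.
    return [
        sum(word in doc.split() for word in query.split()) / len(query.split())
        for query, doc in pairs
    ]


candidates = [
    [Candidate("kidney stone"), Candidate("acute kidney injury")],
    [Candidate("heart failure")],
]
reranked = rerank_in_batches(["acute kidney", "heart failure"], candidates, toy_scorer, batch_size=2)
```

The lists are sorted in place, which matches the documented behavior of returning the same object with reordered candidates.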