Retrievers registry
Register and instantiate ChromaDB-backed retriever configurations.
This module defines the registry of supported embedding models used to build ChromaDB retrievers. Each registry entry stores the information required to construct a retriever, including the model identifier, embedding function class, collection name, and optional rate limit metadata.
The module also provides a factory function for loading a configured ChromaDBRetriever instance from a registered model name.
ChromaRetrieverModelSpec
dataclass
Store the configuration required for a ChromaDB-backed retriever.
Instances of this dataclass describe a retriever model specification, including the embedding model identifier, the embedding function class used to encode queries, the target ChromaDB collection name, and optional rate limit metadata.
Source code in aatm\registries\retrievers.py
name
instance-attribute
Unique registry name used to identify the retriever model.
model_id
instance-attribute
Identifier of the embedding model used by the retriever.
embedding_function_cls
instance-attribute
Embedding function class used to encode queries for retrieval.
collection_name = 'expressions'
class-attribute
instance-attribute
Name of the ChromaDB collection queried by the retriever.
rate_limit = None
class-attribute
instance-attribute
Optional maximum throughput in items per minute for embedding generation.
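Taken together, the attributes above could be declared roughly as follows. This is a minimal sketch reconstructed from the documented names and defaults, not the actual source: the type annotations (in particular `int` for `rate_limit`) and the `FakeEmbeddingFunction` stand-in are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChromaRetrieverModelSpec:
    # Sketch only: field names and defaults follow the documentation,
    # the annotations are assumed.
    name: str
    model_id: str
    embedding_function_cls: type
    collection_name: str = "expressions"
    rate_limit: Optional[int] = None  # items per minute; None means no limit is recorded


class FakeEmbeddingFunction:
    """Hypothetical stand-in for a real embedding function class."""


# Only the three attributes without defaults have to be supplied.
spec = ChromaRetrieverModelSpec(
    name="example-model",
    model_id="example/embedding-model",
    embedding_function_cls=FakeEmbeddingFunction,
)
```

Because `collection_name` and `rate_limit` carry defaults, they must be declared after the required fields, which matches the attribute order shown above.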
chromadb_path
property
Return the filesystem path for this retriever's ChromaDB database.
The path is derived from the retriever name and points to the local directory where the persistent ChromaDB collection is stored.
Returns:
| Type | Description |
|---|---|
| `str` | The string path to the retriever's ChromaDB persistence directory. |
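A property that derives a storage path from the retriever name might look like the sketch below. The base directory `CHROMA_ROOT` and the class name `SpecSketch` are hypothetical; the documentation only states that the path is derived from the name, not what the directory layout is.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical base directory; the real layout is not documented here.
CHROMA_ROOT = Path("data") / "chromadb"


@dataclass
class SpecSketch:
    name: str

    @property
    def chromadb_path(self) -> str:
        # One persistence directory per registry name, returned as a string.
        return str(CHROMA_ROOT / self.name)


spec = SpecSketch(name="example-model")
path = spec.chromadb_path
```

Deriving the path from `name` rather than storing it as a field keeps each registry entry short and gives every retriever its own directory.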
output_path
property
Return the default output directory associated with this retriever.
This path can be used by downstream workflows to store outputs related to the retriever configuration.
Returns:
| Type | Description |
|---|---|
| `str` | The string path to the default output directory for this retriever. |
to_dict()
Convert the retriever specification to a dictionary.
This helper provides a dictionary representation of the retriever configuration for compatibility with code paths that expect mapping-based access instead of attribute-based access.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | A dictionary containing the retriever model identifier, embedding function class, collection name, ChromaDB path, output path, and rate limit. |
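One way such a helper could be written is sketched below. The dictionary keys are assumed to mirror the attribute names, and the two path layouts are hypothetical; the documentation lists which values are included but not the exact keys.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SpecSketch:
    name: str
    model_id: str
    embedding_function_cls: type
    collection_name: str = "expressions"
    rate_limit: Optional[int] = None

    @property
    def chromadb_path(self) -> str:
        return f"chromadb/{self.name}"  # hypothetical layout

    @property
    def output_path(self) -> str:
        return f"outputs/{self.name}"  # hypothetical layout

    def to_dict(self) -> dict[str, Any]:
        # Built by hand so that the two derived properties are included.
        return {
            "model_id": self.model_id,
            "embedding_function_cls": self.embedding_function_cls,
            "collection_name": self.collection_name,
            "chromadb_path": self.chromadb_path,
            "output_path": self.output_path,
            "rate_limit": self.rate_limit,
        }


class FakeEmbeddingFunction:
    """Hypothetical stand-in for a real embedding function class."""


spec = SpecSketch("example-model", "example/embedding-model", FakeEmbeddingFunction)
config = spec.to_dict()
```

The dictionary is assembled manually because `dataclasses.asdict` only covers declared fields and would omit properties such as `chromadb_path` and `output_path`.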
load_retriever(model_name)
Instantiate a registered ChromaDB retriever by model name.
This function looks up a retriever specification in the model registry, creates a persistent ChromaDB client for the configured storage path, instantiates the corresponding embedding function, and returns a configured ChromaDBRetriever.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model_name` | `str` | Registry name of the retriever model to instantiate. | required |
Returns:
| Type | Description |
|---|---|
| `ChromaDBRetriever` | A configured `ChromaDBRetriever` instance. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If the provided model name is not present in the retriever registry. |
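The lookup, client creation, and error handling described for `load_retriever` can be illustrated with the self-contained sketch below. Everything in it is a stand-in: `REGISTRY`, `FakeClient`, `RetrieverSketch`, `SpecSketch`, and `load_retriever_sketch` are hypothetical names, and the real `ChromaDBRetriever` constructor arguments are not shown on this page. In real code the client would presumably come from `chromadb.PersistentClient(path=...)`.

```python
from dataclasses import dataclass
from typing import Any


class FakeEmbeddingFunction:
    """Hypothetical stand-in for an embedding function class."""


class FakeClient:
    """Hypothetical stand-in for a persistent ChromaDB client."""

    def __init__(self, path: str) -> None:
        self.path = path


@dataclass
class RetrieverSketch:
    """Hypothetical stand-in for ChromaDBRetriever."""
    client: Any
    embedding_function: Any
    collection_name: str


@dataclass
class SpecSketch:
    name: str
    embedding_function_cls: type
    collection_name: str = "expressions"

    @property
    def chromadb_path(self) -> str:
        return f"chromadb/{self.name}"  # hypothetical layout


# Hypothetical registry keyed by registry name.
REGISTRY = {"example-model": SpecSketch("example-model", FakeEmbeddingFunction)}


def load_retriever_sketch(model_name: str) -> RetrieverSketch:
    spec = REGISTRY.get(model_name)
    if spec is None:
        # Unknown names fail early, listing what is registered.
        known = ", ".join(sorted(REGISTRY))
        raise ValueError(f"Unknown retriever model {model_name!r}; known: {known}")
    client = FakeClient(path=spec.chromadb_path)
    embedding_function = spec.embedding_function_cls()
    return RetrieverSketch(client, embedding_function, spec.collection_name)


retriever = load_retriever_sketch("example-model")
```

Callers that accept user-supplied model names can wrap the call in `try`/`except ValueError` to report unsupported models without a traceback.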