gnes.indexer.doc.filesys module

class gnes.indexer.doc.filesys.DirectoryIndexer(data_path: str, keep_na_doc: bool = True, file_suffix: str = 'gif', *args, **kwargs)

Bases: gnes.indexer.base.BaseDocIndexer

add(keys: List[int], docs: List[gnes_pb2.Document], *args, **kwargs)

Write the GIFs of each document to disk in the folder structure /data_path/doc_id/0.gif, 1.gif…
Parameters:
keys – list of doc IDs
docs – list of docs
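
The folder layout described above can be illustrated with a minimal standalone sketch. It does not use GNES and is not the library's implementation: the helper name `write_doc_chunks` and the use of raw `bytes` in place of `gnes_pb2.Document` chunks are assumptions made for illustration only.

```python
import os
from typing import List


def write_doc_chunks(data_path: str, doc_id: int, chunks: List[bytes],
                     file_suffix: str = "gif") -> List[str]:
    """Hypothetical helper: write chunk i to <data_path>/<doc_id>/<i>.<file_suffix>."""
    doc_dir = os.path.join(data_path, str(doc_id))
    # One folder per document; reuse it if it already exists.
    os.makedirs(doc_dir, exist_ok=True)
    paths = []
    for i, raw in enumerate(chunks):
        path = os.path.join(doc_dir, f"{i}.{file_suffix}")
        with open(path, "wb") as f:
            f.write(raw)
        paths.append(path)
    return paths
```

Under these assumptions, calling `write_doc_chunks("/tmp/index", 7, [gif_a, gif_b])` would produce `/tmp/index/7/0.gif` and `/tmp/index/7/1.gif`.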

query(keys: List[int], *args, **kwargs) → List[gnes_pb2.Document]
Parameters: keys – list of doc IDs
Returns: list of documents whose chunks field contains all the GIFs of the document (one GIF per chunk)
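
The read path can be sketched in the same spirit. This is a standalone illustration, not GNES code: the helper name `read_doc_chunks`, the numeric ordering of files, and returning an empty list for a missing document folder are all assumptions of the sketch.

```python
import os
from typing import List


def read_doc_chunks(data_path: str, doc_id: int,
                    file_suffix: str = "gif") -> List[bytes]:
    """Hypothetical helper: load <data_path>/<doc_id>/<i>.<file_suffix> in index order."""
    doc_dir = os.path.join(data_path, str(doc_id))
    if not os.path.isdir(doc_dir):
        # Assumption of this sketch: an unknown doc ID yields no chunks.
        return []
    suffix = "." + file_suffix
    # Keep only files named <integer>.<file_suffix>.
    names = [n for n in os.listdir(doc_dir)
             if n.endswith(suffix) and n[:-len(suffix)].isdigit()]
    # Sort numerically so that 10.gif comes after 9.gif.
    names.sort(key=lambda n: int(n[:-len(suffix)]))
    chunks = []
    for name in names:
        with open(os.path.join(doc_dir, name), "rb") as f:
            chunks.append(f.read())
    return chunks
```

Each returned element corresponds to one file, mirroring the "one GIF per chunk" convention stated above.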
train(*args, **kwargs)

Train the model. This method needs to be overridden.