gnes.router.reduce module

class gnes.router.reduce.AvgEmbedRouter(*args, **kwargs)[source]

Bases: gnes.router.base.BaseEmbedReduceRouter

Gather all embeddings from multiple encoders and average them along a specific axis. By default, the average is taken along the first axis. chunk_idx and doc_idx denote the loop indices used in BaseEmbedReduceRouter
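The reduction itself is plain element-wise averaging. A minimal sketch of the idea, using hypothetical embeddings from three encoders (the variable names are illustrative, not part of the GNES API):

```python
import numpy as np

# Hypothetical embeddings for the same chunk, one per encoder.
embeds = [np.array([[1.0, 2.0]]),
          np.array([[3.0, 4.0]]),
          np.array([[5.0, 6.0]])]

# Stacking adds a per-encoder axis in front; averaging along that
# first axis collapses the encoders into a single embedding.
stacked = np.stack(embeds)       # shape: (3, 1, 2)
avg = np.mean(stacked, axis=0)   # shape: (1, 2) -> [[3.0, 4.0]]
```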

reduce_embedding(accum_msgs, msg_type, chunk_idx, doc_idx)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.Chunk2DocTopkReducer(reduce_op='sum', *args, **kwargs)[source]

Bases: gnes.router.base.BaseTopkReduceRouter

Gather all chunks by their doc_id, resulting in a top-k doc list. This is almost always useful, as the final result should be grouped by doc_id, not by chunk
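The grouping step can be sketched as follows, assuming hypothetical (doc_id, chunk_score) pairs and the documented reduce_op='sum' default; this is an illustration of the idea, not GNES internals:

```python
from collections import defaultdict

# Hypothetical chunk hits gathered from the chunk index.
chunk_hits = [(1, 0.5), (2, 0.25), (1, 0.25), (3, 0.375)]

# reduce_op='sum': accumulate chunk scores per doc_id.
doc_scores = defaultdict(float)
for doc_id, score in chunk_hits:
    doc_scores[doc_id] += score

# Keep the top-k docs by accumulated score.
topk = sorted(doc_scores.items(), key=lambda t: t[1], reverse=True)[:2]
# topk -> [(1, 0.75), (3, 0.375)]
```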

get_key(x)[source]
Return type:str
set_key(x, k)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.ChunkTopkReducer(reduce_op='sum', *args, **kwargs)[source]

Bases: gnes.router.base.BaseTopkReduceRouter

Gather all chunks from all shards by their chunk_id (i.e. doc_id-offset), resulting in a top-k chunk list
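Unlike Chunk2DocTopkReducer, the grouping key here identifies an individual chunk. A sketch under the assumption that keys are 'doc_id-offset' strings and duplicates across shards are merged with the reduce_op='sum' default (hypothetical data, not GNES internals):

```python
from collections import defaultdict

# Hypothetical chunk hits from two shards, keyed by 'doc_id-offset'.
shard_hits = [("7-0", 0.5), ("7-1", 0.25), ("7-0", 0.25), ("9-2", 0.125)]

# Duplicate keys (the same chunk seen by different shards) are summed.
chunk_scores = defaultdict(float)
for key, score in shard_hits:
    chunk_scores[key] += score

topk = sorted(chunk_scores.items(), key=lambda t: t[1], reverse=True)[:2]
# topk -> [("7-0", 0.75), ("7-1", 0.25)]
```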

get_key(x)[source]
Return type:str
set_key(x, k)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.ConcatEmbedRouter(*args, **kwargs)[source]

Bases: gnes.router.base.BaseEmbedReduceRouter

Gather all embeddings from multiple encoders and concatenate them along a specific axis. By default, the concatenation happens along the last axis. chunk_idx and doc_idx denote the loop indices used in BaseEmbedReduceRouter
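Concatenation joins the feature dimensions of the encoders instead of averaging them, so encoders need not produce embeddings of the same width. A minimal sketch with hypothetical data:

```python
import numpy as np

# Hypothetical per-encoder embeddings for one chunk; note the
# different feature widths (2 and 1).
embeds = [np.array([[1.0, 2.0]]), np.array([[3.0]])]

# Concatenating along the last axis yields one wider embedding.
concat = np.concatenate(embeds, axis=-1)  # shape: (1, 3)
```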

reduce_embedding(accum_msgs, msg_type, chunk_idx, doc_idx)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.DocFillReducer(*args, **kwargs)[source]

Bases: gnes.router.base.BaseReduceRouter

Gather all documents' raw content from multiple shards. This is only useful when you have:
  • multiple doc-indexers with docs spread over multiple shards;
  • a requirement for full-doc retrieval with the original content, not just a doc id.
Ideally, each doc belongs to exactly one shard.
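Since each shard only holds the raw content for the docs it indexes, the reducer's job amounts to filling each doc id with whichever shard's copy is non-empty. A simplified sketch with hypothetical dict-shaped shard responses (GNES itself works on protobuf messages):

```python
# Hypothetical shard responses: empty string means "not on this shard".
shard_a = {101: "full text of doc 101", 102: ""}
shard_b = {101: "", 102: "full text of doc 102"}

# Fill each doc with the first non-empty copy across shards.
filled = {doc_id: shard_a[doc_id] or shard_b[doc_id] for doc_id in shard_a}
# filled -> {101: "full text of doc 101", 102: "full text of doc 102"}
```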

apply(msg, accum_msgs, *args, **kwargs)[source]

Modify the current message based on accumulated messages

Parameters:
  • msg (gnes_pb2.Message) – the current message
  • accum_msgs (List[gnes_pb2.Message]) – accumulated messages
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.DocTopkReducer(reduce_op='sum', *args, **kwargs)[source]

Bases: gnes.router.base.BaseTopkReduceRouter

Gather all docs by their doc_id, resulting in a top-k doc list

get_key(x)[source]
Return type:str
set_key(x, k)[source]
train(*args, **kwargs)

Train the model; needs to be overridden