gnes.router.reduce module

class gnes.router.reduce.AvgEmbedRouter(*args, **kwargs)[source]

Bases: gnes.router.base.BaseEmbedReduceRouter

Gather all embeddings from multiple encoders and average them along a specific axis. By default, the average is taken along the first axis. chunk_idx and doc_idx denote the loop indices used in BaseEmbedReduceRouter
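The reduction itself is plain element-wise averaging. A minimal sketch of the idea, using hypothetical embeddings from three encoders (the variable names are illustrative, not part of the GNES API):

```python
import numpy as np

# Hypothetical embeddings for the same chunk, one per encoder.
embeds = [np.array([[1.0, 2.0]]),
          np.array([[3.0, 4.0]]),
          np.array([[5.0, 6.0]])]

# Stacking adds a per-encoder axis in front; averaging along that
# first axis collapses the encoders into a single embedding.
stacked = np.stack(embeds)       # shape: (3, 1, 2)
avg = np.mean(stacked, axis=0)   # shape: (1, 2) -> [[3.0, 4.0]]
```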

reduce_embedding(accum_msgs, msg_type, chunk_idx, doc_idx)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.Chunk2DocTopkReducer(reduce_op='sum', *args, **kwargs)[source]

Bases: gnes.router.base.BaseTopkReduceRouter

Gather all chunks by their doc_id, resulting in a top-k doc list. This is almost always useful, as the final result should be grouped by doc_id, not by chunk
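The grouping step can be sketched as follows, assuming hypothetical (doc_id, chunk_score) pairs and the documented reduce_op='sum' default; this is an illustration of the idea, not GNES internals:

```python
from collections import defaultdict

# Hypothetical chunk hits gathered from the chunk index.
chunk_hits = [(1, 0.5), (2, 0.25), (1, 0.25), (3, 0.375)]

# reduce_op='sum': accumulate chunk scores per doc_id.
doc_scores = defaultdict(float)
for doc_id, score in chunk_hits:
    doc_scores[doc_id] += score

# Keep the top-k docs by accumulated score.
topk = sorted(doc_scores.items(), key=lambda t: t[1], reverse=True)[:2]
# topk -> [(1, 0.75), (3, 0.375)]
```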

get_key(x)[source]
Return type:str
set_key(x, k)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.ChunkTopkReducer(reduce_op='sum', *args, **kwargs)[source]

Bases: gnes.router.base.BaseTopkReduceRouter

Gather all chunks from all shards by their chunk_id (i.e. doc_id-offset), resulting in a top-k chunk list
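Unlike Chunk2DocTopkReducer, the grouping key here identifies an individual chunk. A sketch under the assumption that keys are 'doc_id-offset' strings and duplicates across shards are merged with the reduce_op='sum' default (hypothetical data, not GNES internals):

```python
from collections import defaultdict

# Hypothetical chunk hits from two shards, keyed by 'doc_id-offset'.
shard_hits = [("7-0", 0.5), ("7-1", 0.25), ("7-0", 0.25), ("9-2", 0.125)]

# Duplicate keys (the same chunk seen by different shards) are summed.
chunk_scores = defaultdict(float)
for key, score in shard_hits:
    chunk_scores[key] += score

topk = sorted(chunk_scores.items(), key=lambda t: t[1], reverse=True)[:2]
# topk -> [("7-0", 0.75), ("7-1", 0.25)]
```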

get_key(x)[source]
Return type:str
set_key(x, k)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.ConcatEmbedRouter(*args, **kwargs)[source]

Bases: gnes.router.base.BaseEmbedReduceRouter

Gather all embeddings from multiple encoders and concatenate them along a specific axis. By default, the concatenation happens along the last axis. chunk_idx and doc_idx denote the loop indices used in BaseEmbedReduceRouter
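Concatenation joins the feature dimensions of the encoders instead of averaging them, so encoders need not produce embeddings of the same width. A minimal sketch with hypothetical data:

```python
import numpy as np

# Hypothetical per-encoder embeddings for one chunk; note the
# different feature widths (2 and 1).
embeds = [np.array([[1.0, 2.0]]), np.array([[3.0]])]

# Concatenating along the last axis yields one wider embedding.
concat = np.concatenate(embeds, axis=-1)  # shape: (1, 3)
```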

reduce_embedding(accum_msgs, msg_type, chunk_idx, doc_idx)[source]
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.DocFillReducer(*args, **kwargs)[source]

Bases: gnes.router.base.BaseReduceRouter

Gather all documents' raw content from multiple shards. This is only useful when you have:
  • multiple doc-indexers with docs spread over multiple shards;
  • a requirement for full-doc retrieval with the original content, not just a doc id.
Ideally, each doc belongs to exactly one shard.
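Since each shard only holds the raw content for the docs it indexes, the reducer's job amounts to filling each doc id with whichever shard's copy is non-empty. A simplified sketch with hypothetical dict-shaped shard responses (GNES itself works on protobuf messages):

```python
# Hypothetical shard responses: empty string means "not on this shard".
shard_a = {101: "full text of doc 101", 102: ""}
shard_b = {101: "", 102: "full text of doc 102"}

# Fill each doc with the first non-empty copy across shards.
filled = {doc_id: shard_a[doc_id] or shard_b[doc_id] for doc_id in shard_a}
# filled -> {101: "full text of doc 101", 102: "full text of doc 102"}
```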

apply(msg, accum_msgs, *args, **kwargs)[source]

Modify the current message based on accumulated messages

Parameters:
  • msg (gnes_pb2.Message) – the current message
  • accum_msgs (List[gnes_pb2.Message]) – accumulated messages
train(*args, **kwargs)

Train the model; needs to be overridden

class gnes.router.reduce.DocTopkReducer(reduce_op='sum', *args, **kwargs)[source]

Bases: gnes.router.base.BaseTopkReduceRouter

Gather all docs by their doc_id, resulting in a top-k doc list

get_key(x)[source]
Return type:str
set_key(x, k)[source]
train(*args, **kwargs)

Train the model; needs to be overridden