gnes.preprocessor.text.split module

class gnes.preprocessor.text.split.SentSplitPreprocessor(min_sent_len: int = 1, max_sent_len: int = 256, deliminator: str = '.!?。!?', is_json: bool = False, *args, **kwargs)[source]

Bases: gnes.preprocessor.base.BaseTextPreprocessor

apply(doc: gnes_pb2.Document) → None[source]
train(*args, **kwargs)

Train the model. This method needs to be overridden.
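
As a rough illustration of what the constructor parameters of SentSplitPreprocessor suggest, the sketch below splits a string on the characters in deliminator and applies the two length bounds. It is not the GNES implementation: split_sentences is a hypothetical standalone helper, and the choices to drop pieces shorter than min_sent_len and to truncate (rather than discard) pieces longer than max_sent_len are assumptions made for this example. It works on plain strings and does not touch gnes_pb2.Document.

```python
import re
from typing import List


def split_sentences(text: str,
                    min_sent_len: int = 1,
                    max_sent_len: int = 256,
                    deliminator: str = '.!?。!?') -> List[str]:
    """Hypothetical helper, not part of GNES: split text into sentences."""
    # Build a character class from the delimiter characters; re.escape keeps
    # characters such as '.' and '?' literal.
    pattern = '[' + re.escape(deliminator) + ']+'
    sentences = []
    for piece in re.split(pattern, text):
        piece = piece.strip()
        # Assumption: pieces shorter than min_sent_len are dropped.
        if len(piece) >= min_sent_len:
            # Assumption: over-long pieces are truncated, not discarded.
            sentences.append(piece[:max_sent_len])
    return sentences


if __name__ == '__main__':
    print(split_sentences('Hello world. How are you? Fine!'))
```

Mirroring the parameter names from the class signature keeps the mapping to the documented arguments easy to follow; the actual apply method operates on a Document and may differ in how it handles delimiters and lengths.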