Text segmentation & tokenization
Split your text into manageable pieces through text segmentation and tokenization.
4 articles in this collection
Written by Zafer Çavdar and Tomas Larsson
Splitting text into sentences
This operation splits text into sentences, following language-specific syntax rules.
Written by Zafer Çavdar
Updated over a week ago
Tokenization: splitting text into words and phrases
The Tokenize & Tag operation splits text into meaningful units in the form of words, phrases, and named entities, tagged by part-of-speech.
Written by Tomas Larsson
Updated over a week ago
Smart segmentation
The Segment Text operation splits text into segments based on semantic similarity as well as references between adjacent sentences.
Written by Tomas Larsson
Updated over a week ago
Pattern-based text splitting
The Split by Pattern operation splits text wherever a given pattern occurs, for example into sentences or paragraphs.
Written by Tomas Larsson
Updated over a week ago
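To make the idea of pattern-based splitting concrete, here is a minimal sketch in plain Python using regular expressions. It is not Dcipher's implementation: the function name split_by_pattern and the two patterns are hypothetical, chosen only to illustrate how one pattern can yield paragraphs and another can yield sentences.

```python
import re

def split_by_pattern(text: str, pattern: str) -> list[str]:
    """Split text wherever the regex pattern matches, dropping empty pieces."""
    return [piece.strip() for piece in re.split(pattern, text) if piece.strip()]

# Hypothetical patterns, for illustration only.
PARAGRAPH_BREAK = r"\n\s*\n"        # one or more blank lines
SENTENCE_END = r"(?<=[.!?])\s+"     # whitespace that follows ., ! or ?

text = "First sentence. Second one!\n\nNew paragraph here."

paragraphs = split_by_pattern(text, PARAGRAPH_BREAK)
sentences = split_by_pattern(text, SENTENCE_END)

print(paragraphs)
print(sentences)
```

A naive sentence pattern like this one will also break after abbreviations such as "Dr.", which is one reason a dedicated sentence splitter that follows language-specific rules can be the better choice for sentences.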