Text segmentation & tokenization
Split your text into manageable pieces through text segmentation and tokenization.
3 articles in this collection
Written by Aslı Duruoğlu and Tomas Larsson
Smart segmentation
The Segment Text operation splits text into segments based on semantic similarity as well as references between adjacent sentences.
Written by Aslı Duruoğlu
Updated over a week ago
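To make the idea of similarity-based segmentation concrete, here is a minimal sketch in Python. It is not the Segment Text operation itself: it assumes a simple bag-of-words cosine similarity as a stand-in for semantic similarity, it ignores references between sentences, and the function names and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Count lowercase word occurrences in a sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences, threshold=0.1):
    """Group adjacent sentences; start a new segment when similarity
    to the previous sentence drops below the (assumed) threshold."""
    segments = []
    for sentence in sentences:
        if segments:
            previous = segments[-1][-1]
            similar = cosine(bag_of_words(previous), bag_of_words(sentence)) >= threshold
        else:
            similar = False
        if similar:
            segments[-1].append(sentence)
        else:
            segments.append([sentence])
    return segments


sentences = [
    "The cat sat on the mat.",
    "The cat then slept on the mat.",
    "Stock prices fell sharply today.",
    "Investors sold stock as prices fell.",
]
segments = segment(sentences)
for group in segments:
    print(group)
```

In this toy input the two sentences about the cat share most of their words and stay together, while the shift to stock prices shares none and opens a new segment. A real system would likely use sentence embeddings rather than raw word counts.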
Pattern-based text splitting
The Split by Pattern operation splits text by pattern, for example into sentences or paragraphs.
Written by Aslı Duruoğlu
Updated over a week ago
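As a rough illustration of pattern-based splitting, the sketch below uses Python regular expressions. It is not the Split by Pattern operation itself; the helper name and the two example patterns are assumptions chosen for illustration.

```python
import re


def split_by_pattern(text, pattern):
    """Split text wherever the regex pattern matches, dropping empty pieces."""
    return [piece.strip() for piece in re.split(pattern, text) if piece.strip()]


# Hypothetical example patterns, not taken from the product.
PARAGRAPH = r"\n\s*\n"        # one or more blank lines
SENTENCE = r"(?<=[.!?])\s+"   # whitespace that follows ., ! or ?

text = "First sentence. Second one!\n\nNew paragraph here."

paragraphs = split_by_pattern(text, PARAGRAPH)
sentences = split_by_pattern(text, SENTENCE)

print(paragraphs)
print(sentences)
```

The sentence pattern is deliberately naive: it will also split after abbreviations such as "e.g.", so real text usually needs a more careful pattern.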
Tokenization: splitting text into words and phrases
The Tokenize & Tag operation splits text into meaningful units in the form of words, phrases, and named entities, tagged by part-of-speech.
Written by Tomas Larsson
Updated over a week ago