What functionality does Dcipher Analytics offer?

Whatever your text analytics needs are, Dcipher Analytics has you covered.

Written by Tomas Larsson
Updated over a week ago

Keywords: features

Dcipher Analytics offers the following features. Click the visual feature tree to view it at full size, or explore the list of features below.

Features

NLP/Text analytics

  • Preprocessing
    └── Case modifications
    └── URL cleaning
    └── XML/HTML tag cleaning
    └── Tab/new line cleaning
    └── Hashtag cleaning
    └── Punctuation cleaning
    └── Spell checking
    └── Stop-word cleaning
    └── Substring/repeat cleaning
    └── Emoji cleaning
    └── Changing field types
    └── Changing date formats
    └── Automated text subtype detection

  • Text segmentation
    └── Split text into words
    └── Lemmatization
    └── Part-of-speech tagging
    └── Named Entity Recognition
    └── Phrase detection
    └── Split text into sentences
    └── Split text into paragraphs
        └── Rule-based paragraph boundary detection
        └── Context-aware paragraph boundary detection
    └── Content extraction
        └── Extract content based on keywords
        └── Extract content based on pattern

  • Vectorizers
    └── GPT
    └── Cohere
    └── Sentence Transformers
    └── Word2Vec
    └── Doc2Vec
    └── FastText
    └── GloVe
    └── BERT
    └── ELMo
    └── OneHot Encoding

  • Enrichment
    └── Sentiment analysis
    └── Emojization & Emoji extraction
    └── Key phrase detection
    └── Language detection
    └── Concept detection
    └── Quotation detection
    └── Text statistics
    └── Date extraction
    └── Text summarization
    └── Topic detection

  • Semantic similarity modelling
    └── Document to document similarity calculations
    └── Document scoring based on words
    └── Word scoring based on words

  • Contextual analysis
    └── Cosine similarity on case-term frequency vectors
    └── CKC similarity on case-term frequency vectors
    └── Co-occurrence similarity
    └── Burst-based over-representation

  • Temporal analysis
    └── Momentum
    └── Burst detection
    └── Topic evolution analysis

  • Regular expression operations
    └── Split by pattern
    └── Extract properties by pattern
    └── Replace pattern

  • 3rd party integrations
    └── OpenAI
    └── Cohere
    └── Anthropic
    └── Google Vertex AI
    └── Google Translate
    └── Google NLP
    └── IBM Watson Natural Language Understanding
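
To illustrate the three regular expression operations listed above (split by pattern, extract properties by pattern, replace pattern), here is a minimal sketch using Python's standard `re` module. It is a generic illustration, not Dcipher Analytics' implementation; the sample text, the patterns, and the variable names are all assumptions made for the example.

```python
import re

# Hypothetical sample text, invented for this example.
text = "Order #123 shipped 2024-05-01; order #456 shipped 2024-05-03."

# Split by pattern: break the text at each semicolon and any whitespace after it.
parts = re.split(r";\s*", text)

# Extract properties by pattern: collect order numbers and ISO-style dates.
order_ids = re.findall(r"#(\d+)", text)
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Replace pattern: mask every order number.
masked = re.sub(r"#\d+", "#XXX", text)

print(parts)
print(order_ids, dates)
print(masked)
```

The same three operations map onto most regular expression engines, although pattern syntax can differ slightly between them.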

Machine learning

  • Supervised learning
    └── Classification
        └── Logistic Regression
        └── Decision Trees
        └── Random Forests
        └── Multilayer Neural Networks
        └── Support Vector Machines
        └── Naive Bayes
    └── Regression
        └── Linear Regression
        └── Decision Trees
        └── Random Forests
        └── Gradient Boosted Regressors
        └── Isotonic Regressors
        └── AFT Survival Regressors

  • Unsupervised learning
    └── Clustering
        └── K-means clustering
        └── Gaussian mixture clustering
        └── Power iteration clustering
        └── DBSCAN clustering
    └── Outlier scoring
        └── One-class SVM
        └── Robust Covariance
        └── Isolation Forest
        └── Local Outlier Factor
        └── Robust PCA
    └── Dimensionality reduction
        └── Principal component analysis
        └── Singular value decomposition
        └── T-distributed Stochastic Neighbor Embedding
        └── Uniform Manifold Approximation and Projection

  • Semi-supervised learning
    └── Classification
    └── Tagging
    └── Regression

Data visualization

  • Bar chart

  • Bubble cloud

  • Comparison cloud

  • Contour plot

  • Network

  • Heat map

  • Scatter plot

  • Schema

  • Table

  • Word cloud

  • Bump chart

  • Foam chart

  • Line chart

Numerical operations

  • Preprocessing
    └── Formula-based transformations
    └── Discretization
    └── Normalization
    └── Imputation
    └── Convert numbers to words

  • Grouping and aggregations
    └── SQL aggregation functions
    └── Standard deviation
    └── Variance
    └── Median
    └── Vector sum
    └── Vector mean
    └── Entropy
    └── Gini index
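
As an illustration of the last two aggregations listed above, entropy and the Gini index can be computed over the values in a group as sketched below. This is a generic sketch, not Dcipher Analytics' implementation. It assumes Shannon entropy in bits and reads "Gini index" as Gini impurity (one minus the sum of squared value shares); the article does not state which definitions the platform uses, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values) -> float:
    """Shannon entropy, in bits, of the value distribution in a group."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # Assumption: an empty group is treated as having zero entropy.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gini_impurity(values) -> float:
    """Gini impurity: the chance that two random draws from the group differ."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # Assumption: an empty group is treated as perfectly pure.
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical group of category labels.
group = ["news", "news", "blog", "blog"]
print(entropy(group), gini_impurity(group))
```

Both measures are zero when every value in the group is the same and grow as the values become more evenly spread across categories.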

Filtering

  • Numeric filters
    └── Greater than
    └── Greater than or equal to
    └── Smaller than
    └── Smaller than or equal to
    └── In between

  • Value filters
    └── Contains
    └── Does not contain
    └── Is one of
    └── Is not one of

  • Regexp filters
    └── Matches regexp
    └── Does not match regexp

  • Geometric filters
    └── Located inside
    └── Located outside

Import & export of data

  • Import data
    └── From file
        └── JSON/JSONL file
        └── Delimiter-separated file (CSV, TSV, etc.)
        └── PDF file(s)
        └── Word file(s)
        └── Excel file
        └── TXT file(s)
    └── From social media
        └── Facebook
        └── Twitter
        └── Instagram
        └── YouTube
        └── Forums
        └── Blogs
    └── From news media
    └── From Survey Monkey
    └── From Miro

  • Export results
    └── As JSON, XML
    └── As Table (CSV/TSV, Excel)
    └── As Document (PDF, DOCX)
    └── As Image (SVG)
