Concept detection

The Detect Concepts operation finds the underlying concepts in the input text.

Written by Zafer Çavdar
Updated over a week ago

Keywords: detect concepts

The Detect Concepts operation is used to reveal underlying concepts in a text, even when the concepts have not been explicitly mentioned.

👉 Example: To illustrate the notion of conceptualization of text, consider the following input text and the resulting concepts.

Input text: "The presence of titin antibody is associated with late onset of myasthenia gravis (MG) and a variable risk for thymoma. The presence of circulating titin antibodies in late-onset non-thymoma MG patients indicates more severe disease."

Detected concepts: autoimmune disease, muscle-specific protein, muscle disease, rare disease.

The operation is useful for mapping the content of a text at a higher level than the specific words and phrases it mentions. It uses lexical-semantic analysis to identify the most central concepts, given the words and entities in the text.
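As a rough illustration of the idea (not the operation's actual implementation), the toy sketch below maps terms found in a text to broader concepts and ranks the concepts by how many terms point to them. The mapping `TERM_TO_CONCEPTS` and the function `detect_concepts_toy` are hypothetical and hand-written for this example. A real system would rely on a much larger lexical resource and a more careful scoring scheme.

```python
from collections import Counter

# Hypothetical, hand-written mapping from surface terms to broader concepts.
# This is an assumption for illustration only, not data used by the product.
TERM_TO_CONCEPTS = {
    "titin": ["muscle-specific protein"],
    "myasthenia gravis": ["autoimmune disease", "muscle disease", "rare disease"],
    "thymoma": ["rare disease"],
    "antibody": ["autoimmune disease"],
}


def detect_concepts_toy(text, top_n=4):
    """Return (concept, score) pairs for concepts implied by terms in the text.

    Each known term found in the text casts one vote for each of its concepts.
    A score is the concept's share of all votes, so the scores sum to 1.
    """
    lowered = text.lower()
    votes = Counter()
    for term, concepts in TERM_TO_CONCEPTS.items():
        if term in lowered:  # simple substring match, enough for a toy example
            votes.update(concepts)
    total = sum(votes.values())
    if total == 0:
        return []
    return [(concept, count / total) for concept, count in votes.most_common(top_n)]


sample = (
    "The presence of titin antibody is associated with late onset of "
    "myasthenia gravis (MG) and a variable risk for thymoma."
)
for concept, score in detect_concepts_toy(sample):
    print(f"{concept}: {score:.2f}")
```

Note that none of the returned concepts appears verbatim in the sample text; they are inferred from the terms that do appear.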

Concept detection works for text in 45 languages. The output concepts are in English, allowing for analysis of text in multiple languages without prior translation. However, depending on the language, different pipelines are applied.

Step-by-step guide

1. Open the operation configuration window

In the Schema workbench, select the text field that you want to apply the operation to, and click the "Add operation" button at the top of the workspace.

Search for "Detect concept" or find the operation under "Text enrichment" and click it.

2. Name the output field

Under "Output collection name", type the name of the field that the concept collections should be inserted into.

3. Specify the language

Under "Language source", specify the language of the input text. If the dataset contains text in several languages, first apply language detection, then specify the resulting language field as the language source.

4. Apply the operation

Click "Apply" to run the operation. The output is a collection of the identified concepts and their scores, where each score indicates how strongly a concept relates to the text.
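Once the operation has run, the scores can be used to keep only the most relevant concepts per record. The sketch below assumes the output collection has been exported as a list of items with `concept` and `score` fields. Those field names, the score values, and the helper `top_concepts` are assumptions made for illustration, not the documented output format.

```python
# Hypothetical export of one record's output collection.
# Field names and score values are made up for this example.
example_output = [
    {"concept": "autoimmune disease", "score": 0.82},
    {"concept": "muscle-specific protein", "score": 0.64},
    {"concept": "muscle disease", "score": 0.57},
    {"concept": "rare disease", "score": 0.31},
]


def top_concepts(collection, min_score=0.5, limit=3):
    """Return up to `limit` concept names scoring at least `min_score`,
    ordered from the highest score to the lowest."""
    kept = [item for item in collection if item["score"] >= min_score]
    ranked = sorted(kept, key=lambda item: item["score"], reverse=True)
    return [item["concept"] for item in ranked[:limit]]


print(top_concepts(example_output))
```

The threshold and limit are arbitrary here; suitable values depend on the dataset and on how the scores are distributed.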
