The Classify text (zero-shot) operation calculates relevance scores for arbitrary labels that you define, based on the input text, without requiring you to train a text classifier beforehand. It is used primarily for content-based classification.
Step-by-step guide
1. Open the operation configuration window
Click the "Add operation" button at the top of the workspace.
Search for "Classify text (zero-shot)" or find the operation under "Text enrichment" and click it.
2. Specify the language
In the "Language" drop-down, select the language of your input text. The operation currently supports 15 languages.
3. Add labels
In multi-label mode, each label is scored independently of the other labels and receives a relevance score between 0 and 1.0. At least one label is required when multi-label mode is enabled. When the mode is disabled, the labels are treated as mutually exclusive categories and their scores are calculated together, so at least two labels are required.
To add labels, type a label and press Enter. You can enter multiple labels in this field.
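The difference between the two modes can be illustrated with a minimal Python sketch. It assumes the model produces one raw score per label and that independent scoring uses a sigmoid while joint scoring uses a softmax. That is a common approach in zero-shot classification, but this guide does not state how the operation computes its scores, so the function name `score_labels` and the raw values are purely hypothetical.

```python
import math

def score_labels(raw_scores, multi_label):
    """Turn hypothetical raw per-label scores into relevance scores.

    raw_scores: dict mapping label -> raw model score (an assumed input).
    multi_label: True scores each label on its own, False scores them jointly.
    """
    if multi_label:
        if len(raw_scores) < 1:
            raise ValueError("multi-label mode needs at least one label")
        # Assumption: each label is squashed to the 0..1 range independently.
        return {label: 1.0 / (1.0 + math.exp(-s)) for label, s in raw_scores.items()}

    if len(raw_scores) < 2:
        raise ValueError("mutually exclusive mode needs at least two labels")
    # Assumption: labels compete, so the scores are normalized together.
    highest = max(raw_scores.values())
    exps = {label: math.exp(s - highest) for label, s in raw_scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

raw = {"sports": 2.0, "politics": 0.0}
independent = score_labels(raw, multi_label=True)
exclusive = score_labels(raw, multi_label=False)
```

Under these assumptions, the exclusive scores always add up to 1, which is why a single label would be meaningless in that mode, while the independent scores can all be high or all be low at the same time.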
4. Auto-detect score threshold
The score threshold is the cutoff used to filter labels: any label whose relevance score falls below the threshold is removed. Turn on auto-detect score threshold to have the operation find the optimal, input-specific threshold when filtering labels.
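As a rough illustration of what the threshold does, the sketch below drops every label scoring below a given cutoff. How the auto-detected value is chosen is not described in this guide, so the sketch simply takes the threshold as an input. The function name `filter_labels` and the sample scores are hypothetical.

```python
def filter_labels(scored_labels, threshold):
    """Keep only the labels whose score reaches the threshold.

    scored_labels: list of dicts with "label" and "score" keys (assumed shape).
    threshold: cutoff between 0 and 1.0.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1.0")
    # Labels with less relevance than the threshold are removed.
    return [item for item in scored_labels if item["score"] >= threshold]

scored = [
    {"label": "sports", "score": 0.82},
    {"label": "politics", "score": 0.10},
]
kept = filter_labels(scored, threshold=0.5)
```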
5. Name the output field
Under "Output field name," type the name of the output field.
6. Apply the operation
Click "Apply" to run the operation. The text data is now classified according to the labels. In the schema, the output appears as "Scores." To review the output, create a Table View workbench and carry the dataset to the workbench. The Scores field has three subfields: id, label, and score. The label subfield shows the label assigned to the text, and the score subfield shows the relevance score for that label.
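If you work with the classified data outside the workbench, a row might be handled along the lines of the following Python sketch. It assumes the Scores field holds a list of entries with the id, label, and score subfields described above. The sample values, the helper `top_label`, and the treatment of id as an opaque identifier are assumptions for illustration only, and the actual export format may differ.

```python
def top_label(row, field="Scores"):
    """Return the highest-scoring label in a row, or None if there is none."""
    entries = row.get(field) or []
    if not entries:
        return None
    best = max(entries, key=lambda entry: entry["score"])
    return best["label"]

# Hypothetical row: one input text with two scored labels.
row = {
    "text": "The team won the championship last night.",
    "Scores": [
        {"id": 0, "label": "sports", "score": 0.91},
        {"id": 1, "label": "politics", "score": 0.07},
    ],
}
best = top_label(row)
```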