Keywords: continuous, discrete, numbers
The "Discretize" operation reduces the number of distinct values of a continuous numeric variable (such as age or money) by grouping the values into intervals, also called bins.
Step-by-step guide
1. Select the numeric field
Select the numeric field you want to discretize in the Schema View.
2. Open the operation configuration window
Click the "Add operations" button at the top of the workspace, then click the "Numerical operations" tab, then click "Discretize".
3. Set the parameters of the operation
The following parameters can be set:
Numeric path: Name of the input numeric source field.
Method:
Equal width: All bins will span the same width, no matter how many data points fall in each bin. For example, (1.0, 3.0), (3.0, 5.0), (5.0, 7.0).
Equal frequency: All bins will contain approximately the same number of data points, no matter how short or long their ranges are.
Number of buckets: Number of bins/intervals desired.
Output field name: Assigned intervals will be stored as a categorical string under this field.
4. Apply the operation
Click "Apply". A new categorical string field (named "bins" by default) is generated and added to the schema.
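To see how the two methods in step 3 differ, the binning can be sketched outside the tool in Python. This is only an illustration using the standard library, not the tool's actual implementation: the function names, the sample values, the quantile-based cut points, and the label format are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def equal_width_edges(values, n_buckets):
    """Bin edges that split [min, max] into n_buckets intervals of the same width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_buckets
    return [lo + i * width for i in range(n_buckets)] + [hi]


def equal_frequency_edges(values, n_buckets):
    """Bin edges placed at quantiles, so each bin holds roughly the same number of points."""
    # quantiles() returns the n_buckets - 1 interior cut points.
    cuts = quantiles(values, n=n_buckets, method="inclusive")
    return [min(values)] + cuts + [max(values)]


def discretize(values, edges):
    """Map each value to a string label for the interval it falls in (hypothetical label format)."""
    last_bin = len(edges) - 2
    labels = []
    for v in values:
        # A value equal to an interior edge goes to the bin on its right;
        # the maximum value is kept in the last bin.
        i = min(bisect_right(edges, v) - 1, last_bin)
        labels.append(f"({edges[i]}, {edges[i + 1]})")
    return labels


# Hypothetical sample data standing in for a numeric field.
ages = [1.0, 2.0, 2.5, 3.0, 4.0, 7.0]

width_edges = equal_width_edges(ages, 3)
freq_edges = equal_frequency_edges(ages, 3)

# The result plays the role of the new "bins" field.
width_bins = discretize(ages, width_edges)
freq_bins = discretize(ages, freq_edges)

print(width_edges)         # [1.0, 3.0, 5.0, 7.0]
print(Counter(width_bins))  # bin sizes differ
print(Counter(freq_bins))   # every bin holds two values
```

With this sample, equal width reproduces the intervals (1.0, 3.0), (3.0, 5.0), (5.0, 7.0) but fills them unevenly, while equal frequency produces intervals of different widths that each hold the same number of values. When the data contain many repeated values or the count does not divide evenly, equal-frequency bins can only be approximately balanced.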