Keywords: continuous, discrete, numbers

The "Discretize" operation reduces the number of distinct values of a continuous numeric variable (such as age or an amount of money) by grouping them into intervals, also called bins.

Step-by-step guide

1. Select the numeric field

Select the numeric field you want to discretize in the Schema View.

2. Open the operation configuration window

Click the "Add operations" button at the top of the workspace, then click the "Numerical operations" tab, then click "Discretize".

3. Set the parameters of the operation

The following parameters can be set:

  • Numeric path: Name of the input numeric source field.

  • Method:

    • Equal width: All bins span the same width, no matter how many data points fall in each bin. For example, (1.0, 3.0), (3.0, 5.0), (5.0, 7.0).

    • Equal frequency: All bins hold approximately the same number of data points, no matter how narrow or wide their ranges are.

  • Number of buckets: Number of bins/intervals desired.

  • Output field name: Assigned intervals will be stored as a categorical string under this field.
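To make the difference between the two methods concrete, here is a minimal Python sketch of how bin edges could be computed for each one. It is only an illustration and not the tool's actual implementation: the function names, the sample values, the label format, and the rule that a value on an edge goes into the upper bin (except the maximum) are all assumptions.

```python
def equal_width_edges(values, n_bins):
    """Split the range [min, max] into n_bins intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]


def equal_frequency_edges(values, n_bins):
    """Pick edges from the sorted data so each interval holds about the same count."""
    ordered = sorted(values)
    n = len(ordered)
    inner = [ordered[(i * n) // n_bins] for i in range(1, n_bins)]
    return [ordered[0]] + inner + [ordered[-1]]


def assign_bin(value, edges):
    """Return the interval label for a value (assumed convention: an edge value
    belongs to the upper bin, and the last bin also includes the maximum)."""
    for lo, hi in zip(edges[:-1], edges[1:]):
        if lo <= value < hi:
            return f"({lo}, {hi})"
    if value == edges[-1]:
        return f"({edges[-2]}, {edges[-1]})"
    raise ValueError(f"{value} is outside the binned range")


# Hypothetical sample data, binned into 3 buckets with each method.
values = [1.0, 2.0, 2.5, 3.0, 6.0, 7.0]
width_edges = equal_width_edges(values, 3)      # [1.0, 3.0, 5.0, 7.0]
freq_edges = equal_frequency_edges(values, 3)   # [1.0, 2.5, 6.0, 7.0]

for v in values:
    print(v, assign_bin(v, width_edges), assign_bin(v, freq_edges))
```

With this sample, the equal-width bins all span 2.0 but hold 3, 1, and 2 values respectively, while the equal-frequency bins hold 2 values each but have different widths.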

4. Apply the operation

Click "Apply". A new categorical string field (named "bins" by default) is generated and added to the schema.



