Feature encoding
Machine learning models can only work with numerical values, so the categorical values of the relevant features must be transformed into numerical ones. This process is called feature encoding.
Data frame analytics automatically performs feature encoding. The input data is pre-processed with the following encoding techniques:
- one-hot encoding: Assigns a vector to each category. Each element of the vector represents whether the corresponding category is present (1) or not (0).
- target-mean encoding: Replaces each categorical value with the mean value of the target variable for that category.
- frequency encoding: Takes into account how many times a given categorical value occurs in a feature.
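The three techniques above can be illustrated with a minimal Python sketch. It is not the Elastic implementation: the toy data, the function names, and the choice to express frequency as a fraction of rows are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: one categorical feature and a numeric target.
colors = ["red", "blue", "red", "green", "blue", "red"]
target = [10.0, 20.0, 14.0, 30.0, 22.0, 12.0]


def one_hot(values):
    """Map each category to a 0/1 vector with a single 1 at its own index."""
    categories = sorted(set(values))
    return {c: [1 if c == other else 0 for other in categories] for c in categories}


def target_mean(values, targets):
    """Map each category to the mean target value of the rows that carry it."""
    sums, counts = defaultdict(float), Counter()
    for value, t in zip(values, targets):
        sums[value] += t
        counts[value] += 1
    return {c: sums[c] / counts[c] for c in counts}


def frequency(values):
    """Map each category to the fraction of rows in which it appears (assumed definition)."""
    counts = Counter(values)
    return {c: n / len(values) for c, n in counts.items()}


one_hot_map = one_hot(colors)                   # "red" -> [0, 0, 1] (order: blue, green, red)
target_mean_map = target_mean(colors, target)   # "red" -> 12.0
frequency_map = frequency(colors)               # "red" -> 0.5
```

In this sketch, each function returns a lookup table from category to encoded value, which makes it easy to see that one-hot encoding yields a vector per category while the other two yield a single number.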
When the model makes predictions on new data, that data must be processed in the same way as the training data. Machine learning model inference in the Elastic Stack does this automatically, so the encodings that were applied during training are also applied in each call for inference.
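The idea of reusing training-time encodings at inference can be sketched as follows. This is only an illustration under stated assumptions: `FrequencyEncoder` is a hypothetical helper, not an Elastic Stack API, and the fallback value for unseen categories is a choice made for this example.

```python
from collections import Counter


class FrequencyEncoder:
    """Hypothetical helper: learns category frequencies once, then reuses them."""

    def fit(self, values):
        # Learn the mapping from the training data only.
        counts = Counter(values)
        total = len(values)
        self.mapping = {c: n / total for c, n in counts.items()}
        return self

    def transform(self, values, default=0.0):
        # Apply the stored mapping unchanged; categories not seen during
        # training fall back to `default` (an assumption of this sketch).
        return [self.mapping.get(v, default) for v in values]


training_values = ["a", "b", "a", "a"]
encoder = FrequencyEncoder().fit(training_values)

# New data is encoded with the mapping learned at training time,
# not with frequencies recomputed from the new data.
encoded_new = encoder.transform(["a", "b", "c"])
```

Keeping the learned mapping alongside the model is what guarantees that a category receives the same numerical value at inference as it did during training.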
Feature importance is calculated for the original categorical fields, not the automatically encoded features.