Create trained models API
Creates a trained model.
Models created in version 7.8.0 are not backwards compatible with older node versions. In a mixed-version cluster, all nodes must be at least version 7.8.0 to use a model stored by a 7.8.0 node.
Request
PUT _ml/trained_models/<model_id>
Prerequisites
Requires the manage_ml cluster privilege. This privilege is included in the machine_learning_admin built-in role.
Description
The create trained model API enables you to supply a trained model that is not created by data frame analytics.
Path parameters
- <model_id>
  (Required, string) The unique identifier of the trained model.
Query parameters
- defer_definition_decompression
  (Optional, boolean) If set to true and a compressed_definition is provided, the request defers definition decompression and skips relevant validations. This deferral is useful for systems or users that know a good byte size estimate for their model and know that their model is valid and likely won’t fail during inference.
Request body
- compressed_definition
  (Required, string) The compressed (GZipped and Base64 encoded) inference definition of the model. If compressed_definition is specified, then definition cannot be specified.
- definition
  (Required, object) The inference definition for the model. If definition is specified, then compressed_definition cannot be specified.
  Properties of definition
  - preprocessors
    (Optional, object) Collection of preprocessors. See Preprocessor examples.
    Properties of preprocessors
    - frequency_encoding
      (Required, object) Defines a frequency encoding for a field.
      Properties of frequency_encoding
      - feature_name
        (Required, string) The name of the resulting feature.
      - field
        (Required, string) The field name to encode.
      - frequency_map
        (Required, object map of string:double) Object that maps the field value to the frequency encoded value.
      - custom
        (Optional, Boolean) Boolean value indicating whether the analytics job created the preprocessor or a user provided it. This adjusts the feature importance calculation. When true, the feature importance calculation returns importance for the processed feature. When false, the total importance of the original field is returned. Default is false.
    - one_hot_encoding
      (Required, object) Defines a one hot encoding map for a field.
      Properties of one_hot_encoding
      - field
        (Required, string) The field name to encode.
      - hot_map
        (Required, object map of strings) String map of "field_value: one_hot_column_name".
      - custom
        (Optional, Boolean) Boolean value indicating whether the analytics job created the preprocessor or a user provided it. This adjusts the feature importance calculation. When true, the feature importance calculation returns importance for the processed feature. When false, the total importance of the original field is returned. Default is false.
    - target_mean_encoding
      (Required, object) Defines a target mean encoding for a field.
      Properties of target_mean_encoding
      - default_value
        (Required, double) The feature value if the field value is not in the target_map.
      - feature_name
        (Required, string) The name of the resulting feature.
      - field
        (Required, string) The field name to encode.
      - target_map
        (Required, object map of string:double) Object that maps the field value to the target mean value.
      - custom
        (Optional, Boolean) Boolean value indicating whether the analytics job created the preprocessor or a user provided it. This adjusts the feature importance calculation. When true, the feature importance calculation returns importance for the processed feature. When false, the total importance of the original field is returned. Default is false.
  - trained_model
    (Required, object) The definition of the trained model.
    Properties of trained_model
    - tree
      (Required, object) The definition for a binary decision tree.
      Properties of tree
      - classification_labels
        (Optional, string) An array of classification labels (used for classification).
      - feature_names
        (Required, string) Features expected by the tree, in their expected order.
      - target_type
        (Required, string) String indicating the model target type; regression or classification.
      - tree_structure
        (Required, object) An array of tree_node objects. The nodes must be in ordinal order by their tree_node.node_index value.
    - tree_node
      (Required, object) The definition of a node in a tree. There are two major types of nodes: leaf nodes and non-leaf nodes. Leaf nodes only need node_index and leaf_value defined. All other nodes need split_feature, left_child, right_child, threshold, decision_type, and default_left defined.
      Properties of tree_node
      - decision_type
        (Optional, string) The comparison that indicates when to choose the left node. Supported values are lt, lte, gt, and gte. Defaults to lte.
      - default_left
        (Optional, Boolean) Indicates whether to default to the left when the feature is missing. Defaults to true.
      - leaf_value
        (Optional, double) The leaf value of the node, if the node is a leaf (in other words, it has no children).
      - left_child
        (Optional, integer) The index of the left child.
      - node_index
        (Integer) The index of the current node.
      - right_child
        (Optional, integer) The index of the right child.
      - split_feature
        (Optional, integer) The index of the feature value in the feature array.
      - split_gain
        (Optional, double) The information gain from the split.
      - threshold
        (Optional, double) The decision threshold with which to compare the feature value.
    - ensemble
      (Optional, object) The definition for an ensemble model. See Model examples.
      Properties of ensemble
      - aggregate_output
        (Required, object) An aggregated output object that defines how to aggregate the outputs of the trained_models. Supported objects are weighted_mode, weighted_sum, logistic_regression, and exponent. See Aggregated output example.
        Properties of aggregate_output
        - logistic_regression
          (Optional, object) This aggregate_output type works with binary classification (classification for values [0, 1]). It multiplies the outputs (in the case of the ensemble model, the inference model values) by the supplied weights. The resulting vector is summed and passed to a sigmoid function. The result of the sigmoid function is considered the probability of class 1 (P_1); consequently, the probability of class 0 is 1 - P_1. The class with the highest probability (either 0 or 1) is then returned. For more information about logistic regression, see this wiki article.
          Properties of logistic_regression
          - weights
            (Required, double) The weights to multiply by the input values (the inference values of the trained models).
        - weighted_sum
          (Optional, object) This aggregate_output type works with regression. It returns the weighted sum of the input values.
          Properties of weighted_sum
          - weights
            (Required, double) The weights to multiply by the input values (the inference values of the trained models).
        - weighted_mode
          (Optional, object) This aggregate_output type works with regression or classification. It takes a weighted vote of the input values. The most common input value (taking the weights into account) is returned.
          Properties of weighted_mode
          - weights
            (Required, double) The weights to multiply by the input values (the inference values of the trained models).
        - exponent
          (Optional, object) This aggregate_output type works with regression. It takes a weighted sum of the input values and passes the result to an exponent function (e^x where x is the sum of the weighted values).
          Properties of exponent
          - weights
            (Required, double) The weights to multiply by the input values (the inference values of the trained models).
      - classification_labels
        (Optional, string) An array of classification labels.
      - feature_names
        (Optional, string) Features expected by the ensemble, in their expected order.
      - target_type
        (Required, string) String indicating the model target type; regression or classification.
      - trained_models
        (Required, object) An array of trained_model objects. Supported trained models are tree and ensemble.
- description
  (Optional, string) A human-readable description of the inference trained model.
- estimated_heap_memory_usage_bytes
  (Optional, integer) Deprecated in 7.16.0. Replaced by model_size_bytes.
- estimated_operations
  (Optional, integer) The estimated number of operations to use the trained model during inference. This property is supported only if defer_definition_decompression is true or the model definition is not supplied.
- inference_config
  (Required, object) The default configuration for inference. This can be either a regression or classification configuration. It must match the underlying definition.trained_model's target_type.
  Properties of inference_config
  - regression
    (Optional, object) Regression configuration for inference.
    Properties of regression inference
    - num_top_feature_importance_values
      (Optional, integer) Specifies the maximum number of feature importance values per document. By default, it is zero and no feature importance calculation occurs.
    - results_field
      (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
  - classification
    (Optional, object) Classification configuration for inference.
    Properties of classification inference
    - num_top_classes
      (Optional, integer) Specifies the number of top class predictions to return. Defaults to 0.
    - num_top_feature_importance_values
      (Optional, integer) Specifies the maximum number of feature importance values per document. By default, it is zero and no feature importance calculation occurs.
    - prediction_field_type
      (Optional, string) Specifies the type of the predicted field to write. Acceptable values are: string, number, boolean. When boolean is provided, 1.0 is transformed to true and 0.0 to false.
    - results_field
      (Optional, string) The field that is added to incoming documents to contain the inference prediction. Defaults to predicted_value.
    - top_classes_results_field
      (Optional, string) Specifies the field to which the top classes are written. Defaults to top_classes.
- input
  (Required, object) The input field names for the model definition.
  Properties of input
  - field_names
    (Required, string) An array of input field names for the model.
- metadata
  (Optional, object) An object map that contains metadata about the model.
- model_size_bytes
  (Optional, integer) The estimated memory usage in bytes to keep the trained model in memory. This property is supported only if defer_definition_decompression is true or the model definition is not supplied.
- tags
  (Optional, string) An array of tags to organize the model.
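The compressed_definition format described above (GZipped and Base64 encoded) could be produced along the following lines. This is a minimal sketch: the helper names compress_definition and decompress_definition and the single-leaf tree are made up for illustration, and it assumes the definition is serialized as UTF-8 JSON before compression.

```python
import base64
import gzip
import json


def compress_definition(definition: dict) -> str:
    """Gzip the JSON-serialized definition, then Base64-encode the bytes."""
    raw = json.dumps(definition).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decompress_definition(compressed: str) -> dict:
    """Inverse of compress_definition, useful for checking a round trip."""
    return json.loads(gzip.decompress(base64.b64decode(compressed)))


# Hypothetical minimal definition: a regression tree with a single leaf.
definition = {
    "trained_model": {
        "tree": {
            "feature_names": [],
            "tree_structure": [{"node_index": 0, "leaf_value": 1.0}],
            "target_type": "regression",
        }
    }
}

compressed = compress_definition(definition)
```

The resulting string would be sent as compressed_definition in place of definition, since the two cannot be specified together.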
Examples
Preprocessor examples
The example below shows a frequency_encoding preprocessor object:
{
  "frequency_encoding": {
    "field": "FlightDelayType",
    "feature_name": "FlightDelayType_frequency",
    "frequency_map": {
      "Carrier Delay": 0.6007414737092798,
      "NAS Delay": 0.6007414737092798,
      "Weather Delay": 0.024573576178086153,
      "Security Delay": 0.02476631010889467,
      "No Delay": 0.6007414737092798,
      "Late Aircraft Delay": 0.6007414737092798
    }
  }
}
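To make the effect of the frequency_encoding object above concrete, here is a rough Python sketch of how such a preprocessor might be applied to a document. The function name apply_frequency_encoding is hypothetical, and the fallback of 0.0 for values missing from frequency_map is an assumption that this page does not state.

```python
def apply_frequency_encoding(doc: dict, encoding: dict) -> dict:
    """Return a copy of doc with the frequency-encoded feature added."""
    out = dict(doc)
    value = doc.get(encoding["field"])
    if value is not None:
        # Assumption: values absent from frequency_map encode to 0.0.
        out[encoding["feature_name"]] = encoding["frequency_map"].get(str(value), 0.0)
    return out


encoding = {
    "field": "FlightDelayType",
    "feature_name": "FlightDelayType_frequency",
    "frequency_map": {
        "Weather Delay": 0.024573576178086153,
        "Security Delay": 0.02476631010889467,
    },
}

encoded = apply_frequency_encoding({"FlightDelayType": "Weather Delay"}, encoding)
```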
The next example shows a one_hot_encoding preprocessor object:
{
  "one_hot_encoding": {
    "field": "FlightDelayType",
    "hot_map": {
      "Carrier Delay": "FlightDelayType_Carrier Delay",
      "NAS Delay": "FlightDelayType_NAS Delay",
      "No Delay": "FlightDelayType_No Delay",
      "Late Aircraft Delay": "FlightDelayType_Late Aircraft Delay"
    }
  }
}
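A one_hot_encoding object like the one above can be read as a mapping from field values to indicator columns. The sketch below shows one plausible interpretation in Python; apply_one_hot_encoding is a hypothetical name, and writing 1 for the matching column and 0 for every other column in hot_map is an assumption, not something this page specifies.

```python
def apply_one_hot_encoding(doc: dict, encoding: dict) -> dict:
    """Return a copy of doc with one indicator column per hot_map entry."""
    out = dict(doc)
    value = doc.get(encoding["field"])
    if value is not None:
        for field_value, column_name in encoding["hot_map"].items():
            # Assumption: the matching column is 1, all other columns are 0.
            out[column_name] = 1 if str(value) == field_value else 0
    return out


encoding = {
    "field": "FlightDelayType",
    "hot_map": {
        "Carrier Delay": "FlightDelayType_Carrier Delay",
        "NAS Delay": "FlightDelayType_NAS Delay",
    },
}

encoded = apply_one_hot_encoding({"FlightDelayType": "NAS Delay"}, encoding)
```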
This example shows a target_mean_encoding preprocessor object:
{
  "target_mean_encoding": {
    "field": "FlightDelayType",
    "feature_name": "FlightDelayType_targetmean",
    "target_map": {
      "Carrier Delay": 39.97465788139886,
      "NAS Delay": 39.97465788139886,
      "Security Delay": 203.171206225681,
      "Weather Delay": 187.64705882352948,
      "No Delay": 39.97465788139886,
      "Late Aircraft Delay": 39.97465788139886
    },
    "default_value": 158.17995752420433
  }
}
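The role of default_value in the target_mean_encoding object above (the feature value used when the field value is not in target_map) can be sketched as follows. The function name apply_target_mean_encoding is hypothetical, and skipping documents that lack the field entirely is an assumption made for this illustration.

```python
def apply_target_mean_encoding(doc: dict, encoding: dict) -> dict:
    """Return a copy of doc with the target-mean-encoded feature added."""
    out = dict(doc)
    value = doc.get(encoding["field"])
    if value is not None:
        # Values not present in target_map fall back to default_value.
        out[encoding["feature_name"]] = encoding["target_map"].get(
            str(value), encoding["default_value"]
        )
    return out


encoding = {
    "field": "FlightDelayType",
    "feature_name": "FlightDelayType_targetmean",
    "target_map": {
        "Security Delay": 203.171206225681,
        "Weather Delay": 187.64705882352948,
    },
    "default_value": 158.17995752420433,
}

known = apply_target_mean_encoding({"FlightDelayType": "Security Delay"}, encoding)
unknown = apply_target_mean_encoding({"FlightDelayType": "Diverted"}, encoding)
```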
Model examples
The first example shows a trained_model object:
{
  "tree": {
    "feature_names": [
      "DistanceKilometers",
      "FlightTimeMin",
      "FlightDelayType_NAS Delay",
      "Origin_targetmean",
      "DestRegion_targetmean",
      "DestCityName_targetmean",
      "OriginAirportID_targetmean",
      "OriginCityName_frequency",
      "DistanceMiles",
      "FlightDelayType_Late Aircraft Delay"
    ],
    "tree_structure": [
      {
        "decision_type": "lt",
        "threshold": 9069.33437193022,
        "split_feature": 0,
        "split_gain": 4112.094574306927,
        "node_index": 0,
        "default_left": true,
        "left_child": 1,
        "right_child": 2
      },
      ...
      {
        "node_index": 9,
        "leaf_value": -27.68987349695448
      },
      ...
    ],
    "target_type": "regression"
  }
}
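The tree_node properties (decision_type, threshold, default_left, left_child, right_child, leaf_value) suggest a simple traversal. The following Python sketch walks a tree_structure array for one feature vector. It is an illustration under stated assumptions, not the Elasticsearch implementation: evaluate_tree is a hypothetical name, a missing feature is represented as None, and the three-node tree is invented for the example.

```python
import operator
from typing import Optional, Sequence

# decision_type names mapped to the comparison that selects the left child.
DECISIONS = {"lt": operator.lt, "lte": operator.le, "gt": operator.gt, "gte": operator.ge}


def evaluate_tree(tree_structure: Sequence[dict], features: Sequence[Optional[float]]) -> float:
    """Walk from node 0 to a leaf and return its leaf_value.

    Relies on the nodes being ordered by node_index, so that a child index
    can be used directly as a position in tree_structure.
    """
    node = tree_structure[0]
    while "left_child" in node:  # leaf nodes define no children
        value = features[node["split_feature"]]
        if value is None:
            # Missing feature: follow default_left (defaults to true).
            go_left = node.get("default_left", True)
        else:
            compare = DECISIONS[node.get("decision_type", "lte")]
            go_left = compare(value, node["threshold"])
        node = tree_structure[node["left_child"] if go_left else node["right_child"]]
    return node["leaf_value"]


# Hypothetical three-node tree: one split on feature 0, two leaves.
nodes = [
    {"node_index": 0, "split_feature": 0, "threshold": 10.0, "decision_type": "lt",
     "default_left": True, "left_child": 1, "right_child": 2},
    {"node_index": 1, "leaf_value": -1.5},
    {"node_index": 2, "leaf_value": 2.5},
]

prediction = evaluate_tree(nodes, [5.0])
```

Detecting leaves by the absence of left_child, rather than the presence of leaf_value, follows the rule that leaf nodes only need node_index and leaf_value defined.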
The following example shows an ensemble model object:
"ensemble": {
  "feature_names": [
    ...
  ],
  "trained_models": [
    {
      "tree": {
        "feature_names": [],
        "tree_structure": [
          {
            "decision_type": "lte",
            "node_index": 0,
            "leaf_value": 47.64069875778043,
            "default_left": false
          }
        ],
        "target_type": "regression"
      }
    },
    ...
  ],
  "aggregate_output": {
    "weighted_sum": {
      "weights": [
        ...
      ]
    }
  },
  "target_type": "regression"
}
Aggregated output example
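Before the JSON examples, the four aggregate_output types described in the request body can be sketched in Python to show how each might combine the outputs of the trained models. This is a rough illustration, not the Elasticsearch implementation: the function names mirror the object names for readability, and the tie-breaking behavior (class 0 when P_1 is exactly 0.5, first-seen value on an equal vote) is an assumption.

```python
import math
from collections import defaultdict
from typing import Hashable, Sequence, Tuple


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    """Sum of each model output multiplied by its weight."""
    return sum(v * w for v, w in zip(values, weights))


def exponent(values: Sequence[float], weights: Sequence[float]) -> float:
    """e^x, where x is the weighted sum of the model outputs."""
    return math.exp(weighted_sum(values, weights))


def logistic_regression(values: Sequence[float], weights: Sequence[float]) -> Tuple[int, float]:
    """Return (predicted class, P_1) using a sigmoid of the weighted sum."""
    p1 = 1.0 / (1.0 + math.exp(-weighted_sum(values, weights)))
    # Assumption: an exact tie (P_1 == 0.5) resolves to class 0.
    return (1 if p1 > 0.5 else 0), p1


def weighted_mode(values: Sequence[Hashable], weights: Sequence[float]) -> Hashable:
    """Return the value with the largest total weight (a weighted vote)."""
    votes = defaultdict(float)
    for v, w in zip(values, weights):
        votes[v] += w
    return max(votes, key=votes.get)


# Three hypothetical model outputs voting on a class label.
winner = weighted_mode([0, 1, 1], [3.0, 1.0, 1.0])
```

Each function takes one weight per model output, matching the weights arrays in the objects that follow.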
Example of a logistic_regression object:
"aggregate_output" : { "logistic_regression" : { "weights" : [2.0, 1.0, 0.5, -1.0, 5.0, 1.0, 1.0] } }
Example of a weighted_sum object:
"aggregate_output" : { "weighted_sum" : { "weights" : [1.0, -1.0, 0.5, 1.0, 5.0] } }
Example of a weighted_mode object:
"aggregate_output" : { "weighted_mode" : { "weights" : [1.0, 1.0, 1.0, 1.0, 1.0] } }
Example of an exponent object:
"aggregate_output" : { "exponent" : { "weights" : [1.0, 1.0, 1.0, 1.0, 1.0] } }
Trained models JSON schema
For the full JSON schema of trained models, click here.