Machine Learning
Trained models
Eland can serialize trained models from the scikit-learn, XGBoost, and LightGBM libraries and import them into Elasticsearch for use as inference models.
>>> from xgboost import XGBClassifier
>>> from eland.ml import MLModel

# Train and exercise an XGBoost ML model locally
>>> xgb_model = XGBClassifier(booster="gbtree")
>>> xgb_model.fit(training_data[0], training_data[1])

>>> xgb_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]

# Import the model into Elasticsearch
>>> es_model = MLModel.import_model(
    es_client="http://localhost:9200",
    model_id="xgb-classifier",
    model=xgb_model,
    feature_names=["f0", "f1", "f2", "f3", "f4"],
)

# Exercise the ML model in Elasticsearch with the training data
>>> es_model.predict(training_data[0])
[0 1 1 0 1 0 0 0 1 0]
Natural language processing (NLP) with PyTorch
You need to install the appropriate version of PyTorch to import an NLP model. Run python -m pip install 'eland[pytorch]' to install that version.
For NLP tasks, Eland enables you to import PyTorch models into Elasticsearch. Use the eland_import_hub_model script to download and install supported transformer models from the Hugging Face model hub. For example:
$ eland_import_hub_model <authentication> \
    --url http://localhost:9200/ \
    --hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english \
    --task-type ner \
    --start
- <authentication>: Use an authentication method to access your cluster. Refer to Authentication methods.
- --url: The cluster URL. Alternatively, use
- --hub-model-id: Specify the identifier for the model in the Hugging Face model hub.
- --task-type: Specify the type of NLP task. Supported values are
For more information about the available options, run eland_import_hub_model with the --help option.
$ eland_import_hub_model --help
Import model with Docker
To use the Docker container, you need to clone the Eland repository: https://github.com/elastic/eland
If you want to use Eland without installing it, you can use the Docker image.
You can use the container interactively:
$ docker run -it --rm --network host docker.elastic.co/eland/eland
Running installed scripts is also possible without an interactive shell, for example:
$ docker run -it --rm docker.elastic.co/eland/eland \
    eland_import_hub_model \
    --url $ELASTICSEARCH_URL \
    --hub-model-id elastic/distilbert-base-uncased-finetuned-conll03-english \
    --start
Replace $ELASTICSEARCH_URL with the URL for your Elasticsearch cluster. For authentication purposes, include an administrator username and password in the URL in the following format: https://username:password@host:port.
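A username or password that contains characters such as @, : or / makes a URL of this form ambiguous if it is embedded verbatim. The following is a minimal sketch, using only Python's standard urllib.parse module, of percent-encoding the credentials before assembling the URL. The helper name build_es_url is hypothetical and not part of Eland, and the sketch assumes that the client percent-decodes the credentials when it parses the URL.

```python
from urllib.parse import quote


def build_es_url(username: str, password: str, host: str, port: int) -> str:
    """Assemble https://username:password@host:port with encoded credentials.

    Hypothetical helper for illustration; not part of Eland.
    """
    # safe="" forces reserved characters such as '@', ':' and '/' to be encoded
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"https://{user}:{secret}@{host}:{port}"


if __name__ == "__main__":
    # Example values only; substitute your own cluster details
    print(build_es_url("elastic", "p@ss:word", "localhost", 9200))
```

The resulting string can then be exported as ELASTICSEARCH_URL before running the container.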
Install models in an air-gapped environment
You can install models in a restricted or closed network by pointing the eland_import_hub_model script to local files.
For an offline install of a Hugging Face model, the model first needs to be cloned locally. Git and Git Large File Storage must be installed on your system.
- Select a model you want to use from Hugging Face. Refer to the compatible third party model list for more information on the supported architectures.
- Clone the selected model from Hugging Face by using the model URL. For example:

  git clone https://huggingface.co/dslim/bert-base-NER

  This command results in a local copy of the model in the directory bert-base-NER.
- Use the eland_import_hub_model script with the --hub-model-id set to the directory of the cloned model to install it:

  eland_import_hub_model \
      --url 'XXXX' \
      --hub-model-id /PATH/TO/MODEL \
      --task-type ner \
      --es-username elastic --es-password XXX \
      --es-model-id bert-base-ner

  If you use the Docker image to run eland_import_hub_model, you must bind mount the model directory so the container can read the files:

  docker run --mount type=bind,source=/PATH/TO/MODEL,destination=/model,readonly -it --rm docker.elastic.co/eland/eland \
      eland_import_hub_model \
      --url 'XXXX' \
      --hub-model-id /model \
      --task-type ner \
      --es-username elastic --es-password XXX \
      --es-model-id bert-base-ner
Once it’s uploaded to Elasticsearch, the model will have the ID specified by --es-model-id. If it is not set, the model ID is derived from --hub-model-id; spaces and path delimiters are converted to double underscores (__).
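To predict which ID a model will receive when --es-model-id is omitted, the rule described above can be approximated in a few lines. This is a sketch of the documented rule only, not Eland's actual code: derive_model_id is a hypothetical name, treating both / and \ as path delimiters is an assumption, and Eland may apply further normalization that is not reproduced here.

```python
def derive_model_id(hub_model_id: str) -> str:
    """Approximate the default model ID derived from --hub-model-id.

    Hypothetical helper: replaces spaces and path delimiters with '__',
    as described in the text. Any other normalization is not covered.
    """
    # Assumption: both forward slashes and backslashes count as path delimiters
    for delimiter in ("/", "\\", " "):
        hub_model_id = hub_model_id.replace(delimiter, "__")
    return hub_model_id


if __name__ == "__main__":
    print(derive_model_id("dslim/bert-base-NER"))
```

Passing --es-model-id explicitly, as in the commands above, avoids relying on the derived name.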
Connect to Elasticsearch through a proxy
Behind the scenes, Eland uses the requests Python library, which allows configuring proxies through an environment variable. For example, to use an HTTP proxy to connect to an HTTPS Elasticsearch cluster, you need to set the HTTPS_PROXY environment variable when invoking Eland:
HTTPS_PROXY=http://proxy-host:proxy-port eland_import_hub_model ...
If you disabled security on your Elasticsearch cluster, you should use HTTP_PROXY instead.
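When the import is launched from a script, the choice between the two variables can be made from the scheme of the cluster URL. The sketch below uses only the Python standard library; proxy_env is a hypothetical helper, and the host names and port in the usage comment are placeholders.

```python
import os
from urllib.parse import urlsplit


def proxy_env(es_url: str, proxy_url: str) -> dict:
    """Return a copy of the environment with the matching proxy variable set.

    Hypothetical helper: HTTPS_PROXY is used for https:// cluster URLs,
    HTTP_PROXY for anything else.
    """
    scheme = urlsplit(es_url).scheme.lower()
    variable = "HTTPS_PROXY" if scheme == "https" else "HTTP_PROXY"
    env = dict(os.environ)  # copy, so the current process is left unchanged
    env[variable] = proxy_url
    return env


# Possible usage (placeholder values, not executed here):
#   import subprocess
#   env = proxy_env("https://es.example.com:9200", "http://proxy-host:3128")
#   subprocess.run(["eland_import_hub_model", "--url", "https://es.example.com:9200"], env=env)
```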
Authentication methods
The following authentication options are available when using the import script:
- Elasticsearch username and password authentication (specified with the -u and -p options):

  eland_import_hub_model -u <username> -p <password> --cloud-id <cloud-id> ...

  These -u and -p options also work when you use --url.
- Elasticsearch username and password authentication (embedded in the URL):

  eland_import_hub_model --url https://<user>:<password>@<hostname>:<port> ...
- Elasticsearch API key authentication:

  eland_import_hub_model --es-api-key <api-key> --url https://<hostname>:<port> ...
- HuggingFace Hub access token (for private models):

  eland_import_hub_model --hub-access-token <access-token> ...
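If you drive the import script from Python, the options listed above can be turned into an argument list. The following is a rough sketch that uses only the flags shown in this section. The function name auth_args is hypothetical, and preferring the API key when both kinds of credentials are supplied is an arbitrary choice made for this example, not documented behavior of the script.

```python
from typing import List, Optional


def auth_args(
    username: Optional[str] = None,
    password: Optional[str] = None,
    api_key: Optional[str] = None,
    hub_access_token: Optional[str] = None,
) -> List[str]:
    """Build command-line arguments for the authentication options above.

    Hypothetical helper; the precedence between options is an assumption.
    """
    args: List[str] = []
    if api_key is not None:
        # Elasticsearch API key authentication
        args += ["--es-api-key", api_key]
    elif username is not None and password is not None:
        # Elasticsearch username and password authentication
        args += ["-u", username, "-p", password]
    if hub_access_token is not None:
        # Hugging Face Hub token, only needed for private models
        args += ["--hub-access-token", hub_access_token]
    return args


if __name__ == "__main__":
    print(["eland_import_hub_model", *auth_args(username="elastic", password="changeme")])
```

Passing the arguments as a list avoids shell quoting problems when credentials contain special characters.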
TLS/SSL
The following TLS/SSL options for Elasticsearch are available when using the import script:
- Specify an alternate CA bundle to verify the cluster certificate:

  eland_import_hub_model --ca-certs CA_CERTS ...
- Disable TLS/SSL verification altogether (strongly discouraged):

  eland_import_hub_model --insecure ...