Using Terraform with Elastic Cloud


TL;DR: Today we will dive into the two Terraform providers available from Elastic. Terraform lets you define your infrastructure as code and keep it in a repository, so changes are easy to make and to track.

Getting Started

Infrastructure as code (IaC) means managing your IT infrastructure via files, so that changes can be applied automatically without manual intervention and are always properly documented. Terraform has become a de facto standard, with competitors such as CloudFormation and Pulumi. Today we will focus on Terraform in combination with Elastic Cloud, so you do not need to use the Cloud UI to spin up, change, or remove clusters.

Creating a Cloud API Key

In order to use Terraform, you need a Cloud API key. Make sure this is an API key for your Elastic Cloud account and not an API key for an Elastic Cloud instance. The Authentication section of the Elastic Cloud provider documentation describes the configuration in more detail.
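
As a minimal sketch of how the key could reach the provider: the ec provider accepts it through its apikey argument or, if that argument is left out, through the EC_API_KEY environment variable. The variable name ec_api_key is an assumption made for illustration and is not part of the configurations in this post; the block would take the place of the empty provider "ec" block used below.

```hcl
# Hypothetical input variable holding the Elastic Cloud account API key.
# Marking it sensitive keeps the value out of plan and apply output.
variable "ec_api_key" {
  type      = string
  sensitive = true
}

provider "ec" {
  # Without this argument the provider falls back to the
  # EC_API_KEY environment variable.
  apikey = var.ec_api_key
}
```

The value can then be supplied from outside the repository, for example via the TF_VAR_ec_api_key environment variable, so the key never ends up in version control.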

Initial Terraform configuration

Let's start with an almost minimal main.tf file:

terraform {
  required_version = ">= 1.0.0"

  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.4.0"
    }
  }
}

provider "ec" {
}

resource "ec_deployment" "custom-deployment-id" {
  name                   = "My deployment identifier"

  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"

  elasticsearch {}

  kibana {}
}

After storing this, you can run:

terraform init
terraform validate
terraform apply -auto-approve

The last call takes one to three minutes to bring up the Elasticsearch and Kibana nodes. With that, our deployment is up and running. To actually make use of it, however, it helps to define outputs in main.tf:

output "elasticsearch_endpoint" {
  value = ec_deployment.custom-deployment-id.elasticsearch[0].https_endpoint
}

output "elasticsearch_username" {
  value = ec_deployment.custom-deployment-id.elasticsearch_username
}

output "elasticsearch_password" {
  value = ec_deployment.custom-deployment-id.elasticsearch_password
  sensitive = true
}

output "kibana_endpoint" {
  value = ec_deployment.custom-deployment-id.kibana[0].https_endpoint
}

Now, after running terraform apply again, you can see the endpoints by running terraform output. Scripts can parse that output and then perform actions such as creating indices or templates.
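
If a script needs several of these values at once, one option is to bundle them into a single output. The following is only a sketch: the output name elasticsearch_connection is an assumption for illustration, while the referenced attributes are the same ones used in the outputs above.

```hcl
# Hypothetical combined output; the name is chosen for illustration only.
output "elasticsearch_connection" {
  # The whole object has to be marked sensitive because it
  # contains the password.
  sensitive = true

  value = {
    endpoint = ec_deployment.custom-deployment-id.elasticsearch[0].https_endpoint
    username = ec_deployment.custom-deployment-id.elasticsearch_username
    password = ec_deployment.custom-deployment-id.elasticsearch_password
  }
}
```

Running terraform output -json elasticsearch_connection would then print just this one object as JSON.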

You can use jq in order to extract parts out of the terraform output command (or even inspect the Terraform state file):

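# Read all outputs once as JSON, then pick single values with jq
output=$(terraform output -json)
endpoint=$(echo "$output" | jq -r ".elasticsearch_endpoint.value")
username=$(echo "$output" | jq -r ".elasticsearch_username.value")
password=$(echo "$output" | jq -r ".elasticsearch_password.value")
# Quote the variables so special characters in the password survive
curl "$endpoint" -u "$username:$password"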

This way, the location of the Terraform state is irrelevant as well. Another feature of the Elastic Cloud provider is setting up remote clusters. Let's take a look at how this works.

Setting up remote clusters for CCS/CCR

Both cross-cluster search (CCS) and cross-cluster replication (CCR) require so-called remote clusters to be set up, so that one cluster can access another. The following snippet omits the provider setup and the outputs and shows only the parts that are required.

resource "ec_deployment" "cluster_1" {
  name                   = "cluster_1"

  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"

  elasticsearch {}

  kibana {}
}

resource "ec_deployment" "cluster_2" {
  name                   = "cluster_2"

  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"

  elasticsearch {
    remote_cluster {
      deployment_id = ec_deployment.cluster_1.id
      alias         = ec_deployment.cluster_1.name
      ref_id        = ec_deployment.cluster_1.elasticsearch[0].ref_id
    }
  }

  kibana {}
}

In this setup cluster_2 will have a remote connection to cluster_1.
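
To keep track of which endpoint belongs to which cluster when running requests against them, outputs along these lines could be added. This is a sketch; the output names are assumptions for illustration, and the attributes are the ones already used for the single deployment earlier.

```hcl
# Hypothetical output names, one endpoint per deployment.
output "cluster_1_endpoint" {
  value = ec_deployment.cluster_1.elasticsearch[0].https_endpoint
}

output "cluster_2_endpoint" {
  value = ec_deployment.cluster_2.elasticsearch[0].https_endpoint
}

# The alias under which cluster_2 knows cluster_1. In the configuration
# above it is set to the deployment name, so it resolves to "cluster_1".
output "remote_cluster_alias" {
  value = ec_deployment.cluster_1.name
}
```

The alias is the name that requests on cluster_2 use to address the remote cluster.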

Time to set up cross-cluster replication. Note: the comment line above each request indicates which cluster it needs to be executed on.

# Cluster 1
PUT my-leader-index

# Cluster 2
PUT my-follower-index/_ccr/follow?wait_for_active_shards=1
{"remote_cluster":"cluster_1","leader_index":"my-leader-index"}

# Cluster 1
PUT /my-leader-index/_doc/my-doc?refresh=true
{"key":"value"}

# Cluster 2, repeat until hit count > 0, should take less than a second usually
GET /my-follower-index/_search

Now we have a follower index in cluster 2 based on the remote cluster setup in Terraform.

Cross-cluster search works in a similar way, using the same remote cluster connection:

# Cluster 1
PUT /cluster1-index/_doc/1?refresh=true
{"cluster":"cluster1","name":"my name"}

# Cluster 2
PUT /cluster2-index/_doc/1?refresh=true
{"cluster":"cluster2","name":"my name"}

# Cluster 2
GET /cluster2-index,cluster_1:cluster1-index/_search

Beyond managing the deployments themselves, you may also want to change the configuration or setup of a particular cluster. This is where the second Terraform provider comes in: the elasticstack Terraform provider lets you configure a cluster, regardless of whether it runs on Elastic Cloud or on-premises.

Using the Elasticstack Provider

The elasticstack provider lets you manage parts of the Elastic Stack, for example cluster settings, index and component templates, users and roles, or ingest pipelines and processors. In this example we will create an ingest pipeline with two processors:

terraform {
  required_version = ">= 1.0.0"

  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.4.0"
    }
    elasticstack = {
      source  = "elastic/elasticstack"
      version = "0.3.3"
    }
  }
}

provider "ec" {
}

resource "ec_deployment" "custom-deployment-id" {
  name                   = "custom-deployment-id"

  region                 = "gcp-europe-west3"
  version                = "8.1.3"
  deployment_template_id = "gcp-memory-optimized-v2"

  elasticsearch {}

  kibana {}
}

provider "elasticstack" {
  elasticsearch {
    username  = ec_deployment.custom-deployment-id.elasticsearch_username
    password  = ec_deployment.custom-deployment-id.elasticsearch_password
    endpoints = [ec_deployment.custom-deployment-id.elasticsearch[0].https_endpoint]
  }
}

data "elasticstack_elasticsearch_ingest_processor_set" "set_field_terraform" {
  field = "pipeline-source"
  value = "terraform"
}

data "elasticstack_elasticsearch_ingest_processor_grok" "grok_the_log" {
  field    = "message"
  patterns = ["%%{TIMESTAMP_ISO8601:@timestamp} %%{LOGLEVEL:level} %%{GREEDYDATA:message}"]
}

resource "elasticstack_elasticsearch_ingest_pipeline" "ingest" {
  name = "my-ingest-pipeline"

  processors = [
    data.elasticstack_elasticsearch_ingest_processor_set.set_field_terraform.json,
    data.elasticstack_elasticsearch_ingest_processor_grok.grok_the_log.json
  ]
}

This creates a pipeline named my-ingest-pipeline after bringing up the cluster. You could now go to Management/Ingest Pipeline in Kibana and see that the pipeline has been created, or just run the following simulate pipeline call:

POST _ingest/pipeline/my-ingest-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "2022-03-03T12:34:56.789Z INFO ze message"
      }
    }
  ]
}

This will return the following document as part of the response:

"_source" : {
  "@timestamp" : "2022-03-03T12:34:56.789Z",
  "level" : "INFO",
  "pipeline-source" : "terraform",
  "message" : "ze message"
}
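
Index templates, mentioned above as another thing the provider can manage, could build on this pipeline. The following is a sketch that has not been verified against a running cluster: the template name and the index pattern my-logs-* are assumptions for illustration, and the template makes the pipeline the default for matching indices.

```hcl
# Hypothetical index template; name and pattern are illustrative only.
resource "elasticstack_elasticsearch_index_template" "logs" {
  name           = "my-logs-template"
  index_patterns = ["my-logs-*"]

  template {
    # Documents indexed into matching indices run through the
    # pipeline created above unless a request names another one.
    settings = jsonencode({
      "index.default_pipeline" = elasticstack_elasticsearch_ingest_pipeline.ingest.name
    })
  }
}
```

Referencing the pipeline resource instead of repeating its name lets Terraform create the pipeline before the template.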

There is one final step, which is not yet as easy as it should be, but the following little trick does it at least for most of my PoCs.

Adding a Dashboard

The elasticstack Terraform provider currently only manages parts of the Elasticsearch setup, but the stack has other parts as well, such as Kibana.

One of my favorite ways of demoing the Elastic Stack is to provide the whole example in a GitHub repository and make it as easy as possible to get up and running.

Part of this is installing a dashboard that shows the data that was part of the demo. So how can this be done without any Kibana helpers in the Terraform elasticstack provider? By using a curl command as part of a null_resource. This adds some platform dependency and the requirement of curl wherever it is executed.

terraform {
  required_version = ">= 1.0.0"

  required_providers {
    ec = {
      source  = "elastic/ec"
      version = "0.4.0"
    }
  }
}

provider "ec" {
}

data "local_file" "dashboard" {
    filename = "${path.module}/dashboard.ndjson"
}

resource "ec_deployment" "custom-deployment-id" {
  name                   = "custom-deployment-id"

  region                 = "gcp-europe-west3"
  version                = "8.1.2"
  deployment_template_id = "gcp-memory-optimized-v2"

  elasticsearch {}

  kibana {}
}

resource "null_resource" "store_local_dashboard" {
  provisioner "local-exec" {
    command = "curl -X POST -u ${ec_deployment.custom-deployment-id.elasticsearch_username}:${ec_deployment.custom-deployment-id.elasticsearch_password} ${ec_deployment.custom-deployment-id.kibana[0].https_endpoint}/api/saved_objects/_import?overwrite=true -H \"kbn-xsrf: true\" --form file=@dashboard.ndjson"
  }
  depends_on = [ec_deployment.custom-deployment-id]
  triggers = {
    dashboard_sha1 = sha1(file("dashboard.ndjson"))
  }
}

You may also notice the triggers part in the null_resource: it takes changes to the dashboard file into account and executes the curl call again whenever the SHA-1 hash of that file changes.
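
A possible variation of the same resource, shown only as a sketch: it passes the credentials to curl through the environment block of the local-exec provisioner instead of interpolating them into the command line, resolves the dashboard file relative to the module, and adds curl's --fail flag so that an HTTP error makes the apply fail. The local value dashboard_file and the environment variable names are assumptions for illustration.

```hcl
locals {
  # Hypothetical local value; resolves the file relative to this module.
  dashboard_file = "${path.module}/dashboard.ndjson"
}

resource "null_resource" "store_local_dashboard" {
  triggers = {
    # Re-run the provisioner whenever the file content changes.
    dashboard_sha1 = filesha1(local.dashboard_file)
  }

  provisioner "local-exec" {
    # Variable names below are illustrative; they are expanded by the
    # shell that runs the command, not by Terraform.
    environment = {
      ES_USER    = ec_deployment.custom-deployment-id.elasticsearch_username
      ES_PASS    = ec_deployment.custom-deployment-id.elasticsearch_password
      KIBANA_URL = ec_deployment.custom-deployment-id.kibana[0].https_endpoint
      DASHBOARD  = local.dashboard_file
    }

    command = "curl --fail -X POST -u \"$ES_USER:$ES_PASS\" \"$KIBANA_URL/api/saved_objects/_import?overwrite=true\" -H 'kbn-xsrf: true' --form \"file=@$DASHBOARD\""
  }
}
```

Because the resource references attributes of the deployment, Terraform orders it after the deployment without an explicit depends_on.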

Summary

I hope you enjoyed the ride across the ec Terraform provider and the elasticstack Terraform provider. They are both in development and you can follow the corresponding GitHub repositories (ec, elasticstack).

Also, if you encounter any issues, please create an issue in the corresponding GitHub repository and provide feedback. Thank you and happy terraforming!
