Create automated snapshots

edit

To set up automated snapshots for Elasticsearch on Kubernetes you have to:

  1. Ensure you have the necessary Elasticsearch storage plugin installed.
  2. Add snapshot repository credentials to the Elasticsearch keystore.
  3. Register the snapshot repository with the Elasticsearch API.
  4. Set up a CronJob to take snapshots on a schedule.

The examples below use the Google Cloud Storage Repository Plugin.

For more information on Elasticsearch snapshots, see Snapshot and Restore.

Install the storage repository plugin

edit

To install the storage repository plugin, you can either use a custom image or add your own init container which installs the plugin when the Pod is created.

To use your own custom image with all necessary plugins pre-installed, use an Elasticsearch resource like the following one:

apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 7.2.0
  image: your/custom/image:tag
  nodes:
  - nodeCount: 1

Alternatively, install the plugin when the Pod is created by using an init container:

apiVersion: elasticsearch.k8s.elastic.co/v1alpha1
kind: Elasticsearch
metadata:
  name: elasticsearch-sample
spec:
  version: 7.2.0
  nodes:
  - podTemplate:
      spec:
        initContainers:
        - name: install-plugins
          command:
          - sh
          - -c
          - |
            bin/elasticsearch-plugin install --batch repository-gcs
    nodeCount: 1

Assuming you stored this in a file called elasticsearch.yaml you can in both cases create the Elasticsearch cluster with:

kubectl apply -f elasticsearch.yaml

Configure GCS credentials via the Elasticsearch keystore

edit

The Elasticsearch GCS repository plugin requires a JSON file that contains service account credentials. These need to be added as secure settings to the Elasticsearch keystore. For more details, see Google Cloud Storage Repository Plugin.

Using ECK, you can automatically inject secure settings into a cluster node by providing them through a secret in the Elasticsearch Spec.

  1. Create a file containing the GCS credentials. For this example, name it gcs.client.default.credentials_file. The file name is important as it is reflected in the secure setting.

    {
      "type": "service_account",
      "project_id": "your-project-id",
      "private_key_id": "...",
      "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
      "client_email": "service-account-for-your-repository@your-project-id.iam.gserviceaccount.com",
      "client_id": "...",
      "auth_uri": "https://accounts.google.com/o/oauth2/auth",
      "token_uri": "https://accounts.google.com/o/oauth2/token",
      "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
      "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/[email protected]"
    }
  2. Create a Kubernetes secret from that file:

    kubectl create secret generic gcs-credentials --from-file=gcs.client.default.credentials_file
  3. Edit the secureSettings section of the Elasticsearch resource:

    kind: Elasticsearch
    spec:
        # ...
        # Inject secure settings into Elasticsearch nodes from a k8s secret reference
        secureSettings:
          secretName: "gcs-credentials"
  4. Apply the modifications:

    kubectl apply -f elasticsearch.yml

GCS credentials are automatically propagated into each node’s keystore. It can take up to a few minutes, depending on the number of secrets in the keystore. You don’t have to restart the nodes.

Register the repository in Elasticsearch

edit
  1. Create the GCS snapshot repository in Elasticsearch following the procedure described in Snapshot and Restore:

    PUT /_snapshot/my_gcs_repository
    {
      "type": "gcs",
      "settings": {
        "bucket": "my_bucket",
        "client": "default"
      }
    }
  2. Take a snapshot with the following HTTP request:

    PUT /_snapshot/my_gcs_repository/test-snapshot

Periodic snapshots with a CronJob

edit

You can set up a simple CronJob to take a snapshot every day.

  1. Make an HTTP request against the appropriate endpoint, using a daily snapshot naming format. Elasticsearch credentials are mounted as a volume into the job’s Pod:

    # snapshotter.yml
    apiVersion: batch/v1beta1
    kind: CronJob
    metadata:
      name: elasticsearch-sample-snapshotter
    spec:
      schedule: "@daily"
      concurrencyPolicy: Forbid
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: snapshotter
                image: centos:7
                volumeMounts:
                  - name: es-basic-auth
                    mountPath: /mnt/elastic/es-basic-auth
                command:
                - /bin/bash
                args:
                - -c
                - 'curl -s -i -k -u "elastic:$(</mnt/elastic/es-basic-auth/elastic)" -XPUT "https://elasticsearch-sample-es-http:9200/_snapshot/my_gcs_repository/%3Csnapshot-%7Bnow%2Fd%7D%3E" | tee /dev/stderr | grep "200 OK"'
              restartPolicy: OnFailure
              volumes:
              - name: es-basic-auth
                secret:
                  secretName: elasticsearch-sample-elastic-user
  2. Apply it to the Kubernetes cluster:

    kubectl apply -f snapshotter.yml

For more details see Kubernetes CronJobs.