Add Kubernetes metadata

The add_kubernetes_metadata processor annotates each event with relevant metadata based on which Kubernetes pod the event originated from. At startup, it detects an in_cluster environment and caches the Kubernetes-related metadata. Events are annotated only if a valid Kubernetes configuration is detected; otherwise, they are left without Kubernetes-related metadata.

Each event is annotated with:

  • Pod Name
  • Pod UID
  • Namespace
  • Labels

The add_kubernetes_metadata processor has two basic building blocks:

  • Indexers
  • Matchers

Indexers use pod metadata to create unique identifiers for each pod. These identifiers help correlate the metadata of the observed pods with actual events. For example, the ip_port indexer can take a Kubernetes pod and create identifiers for it based on all its pod_ip:container_port combinations.

Matchers use information in events to construct lookup keys that match the identifiers created by the indexers. For example, when the fields matcher takes ["metricset.host"] as a lookup field, it constructs a lookup key from the value of the field metricset.host. When one of these lookup keys matches one of the identifiers, the event is enriched with the metadata of the identified pod.
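
The interplay between the two building blocks can be illustrated with a short Python sketch. It is a simplified model, not the Beats implementation: the pod records, the helper names (ip_port_identifiers, build_index, get_field, enrich), and the shape of the attached metadata are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical, simplified pod records; real pod objects carry far more data.
PODS = [
    {"name": "web-1", "uid": "uid-a", "namespace": "default",
     "labels": {"app": "web"}, "ip": "10.0.0.5", "ports": [80, 443]},
    {"name": "db-1", "uid": "uid-b", "namespace": "data",
     "labels": {"app": "db"}, "ip": "10.0.0.9", "ports": [5432]},
]


def ip_port_identifiers(pod: dict) -> List[str]:
    """Indexer side: the pod IP plus one ip:port key per exposed port."""
    return [pod["ip"]] + [f"{pod['ip']}:{port}" for port in pod["ports"]]


def build_index(pods: List[dict]) -> Dict[str, dict]:
    """Map every identifier to the metadata that would annotate an event."""
    index: Dict[str, dict] = {}
    for pod in pods:
        metadata = {
            "pod": {"name": pod["name"], "uid": pod["uid"]},
            "namespace": pod["namespace"],
            "labels": pod["labels"],
        }
        for key in ip_port_identifiers(pod):
            index[key] = metadata
    return index


def get_field(event: dict, dotted: str):
    """Resolve a dotted field name such as 'metricset.host' in a nested event."""
    value = event
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def enrich(event: dict, index: Dict[str, dict], lookup_field: str) -> dict:
    """Matcher side: build a lookup key from the event, annotate on a hit."""
    key = get_field(event, lookup_field)
    if key is not None and str(key) in index:
        return {**event, "kubernetes": index[str(key)]}
    return event  # no identifier matched: the event stays unannotated
```

In this sketch, an event whose metricset.host is "10.0.0.5:80" picks up the metadata of the pod web-1, while an event with an unknown host is returned unchanged.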

Each Beat can define its own default indexers and matchers which are enabled by default. For example, Filebeat enables the container indexer, which identifies pod metadata based on all container IDs, and a logs_path matcher, which takes the log.file.path field, extracts the container ID, and uses it to retrieve metadata.
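
As a rough illustration of the logs_path idea, the Python sketch below pulls a container ID out of a log file path. The directory layout (/var/lib/docker/containers/<container_id>/...) and the function name container_id_from_path are assumptions for this example; the actual matcher may handle other layouts and options.

```python
from typing import Optional

# Assumed base directory for container logs; adjust for your runtime.
LOGS_PATH = "/var/lib/docker/containers/"


def container_id_from_path(path: str, logs_path: str = LOGS_PATH) -> Optional[str]:
    """Return the first path segment after logs_path, treated as the container ID."""
    if not path.startswith(logs_path):
        return None  # the file lives outside the container log directory
    remainder = path[len(logs_path):]
    container_id = remainder.split("/", 1)[0]
    return container_id or None
```

The extracted ID would then serve as the lookup key against the identifiers produced by the container indexer.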

You can find more information about the available indexers and matchers, along with some examples, in Indexers and matchers.

The configuration below enables the processor when packetbeat is run as a pod in Kubernetes.

processors:
  - add_kubernetes_metadata:
      # Defining indexers and matchers manually is required for packetbeat, for instance:
      #indexers:
      #  - ip_port:
      #matchers:
      #  - fields:
      #      lookup_fields: ["metricset.host"]

The configuration below enables the processor on a Beat running as a process on the Kubernetes node.

processors:
  - add_kubernetes_metadata:
      host: <hostname>
      # If kube_config is not set, KUBECONFIG environment variable will be checked
      # and if not present it will fall back to InCluster
      kube_config: ${HOME}/.kube/config
      # Defining indexers and matchers manually is required for packetbeat, for instance:
      #indexers:
      #  - ip_port:
      #matchers:
      #  - fields:
      #      lookup_fields: ["metricset.host"]

The configuration below has the default indexers and matchers disabled and enables ones that the user is interested in.

processors:
  - add_kubernetes_metadata:
      host: <hostname>
      # If kube_config is not set, KUBECONFIG environment variable will be checked
      # and if not present it will fall back to InCluster
      kube_config: ~/.kube/config
      default_indexers.enabled: false
      default_matchers.enabled: false
      indexers:
        - ip_port:
      matchers:
        - fields:
            lookup_fields: ["metricset.host"]

The add_kubernetes_metadata processor has the following configuration settings:

host
(Optional) Specify the node to scope packetbeat to in case it cannot be accurately detected, as when running packetbeat in host network mode.
namespace
(Optional) Select the namespace from which to collect the metadata. If it is not set, the processor collects metadata from all namespaces. It is unset by default.
kube_config
(Optional) Use the given config file as the configuration for the Kubernetes client. It defaults to the KUBECONFIG environment variable, if present.
default_indexers.enabled
(Optional) Enable or disable the default pod indexers, for example when you want to specify your own.
default_matchers.enabled
(Optional) Enable or disable the default pod matchers, for example when you want to specify your own.

Indexers and matchers

Indexers

Indexers use pod metadata to create unique identifiers for each pod.

Available indexers are:

container
Identifies the pod metadata using the IDs of its containers.
ip_port
Identifies the pod metadata using combinations of its IP and its exposed ports. When using this indexer, metadata is identified using the IP of the pod and the ip:port combination for each of the ports exposed by its containers.
pod_name
Identifies the pod metadata using its namespace and its name as namespace/pod_name.
pod_uid
Identifies the pod metadata using the UID of the pod.
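
The identifiers each indexer produces can be sketched in Python as follows. The pod dictionary and its keys (container_ids, ip, ports, namespace, name, uid) are hypothetical stand-ins for the real pod object, and the functions only mirror the descriptions above.

```python
from typing import Callable, Dict, List


def container_keys(pod: dict) -> List[str]:
    """container: one identifier per container ID in the pod."""
    return list(pod["container_ids"])


def ip_port_keys(pod: dict) -> List[str]:
    """ip_port: the pod IP, plus ip:port for every exposed port."""
    return [pod["ip"]] + [f"{pod['ip']}:{port}" for port in pod["ports"]]


def pod_name_keys(pod: dict) -> List[str]:
    """pod_name: namespace/pod_name."""
    return [f"{pod['namespace']}/{pod['name']}"]


def pod_uid_keys(pod: dict) -> List[str]:
    """pod_uid: the UID of the pod."""
    return [pod["uid"]]


# Indexer name as used in the configuration, mapped to its key function.
INDEXERS: Dict[str, Callable[[dict], List[str]]] = {
    "container": container_keys,
    "ip_port": ip_port_keys,
    "pod_name": pod_name_keys,
    "pod_uid": pod_uid_keys,
}

# Hypothetical pod used to show what each indexer would emit.
example_pod = {
    "name": "web-1", "namespace": "default", "uid": "uid-a",
    "ip": "10.0.0.5", "ports": [80, 443], "container_ids": ["c0ffee", "decaf0"],
}

for indexer_name, key_func in INDEXERS.items():
    print(indexer_name, key_func(example_pod))
```

A matcher is only useful if the lookup keys it builds have the same shape as the identifiers of the enabled indexers.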

Matchers

Matchers are used to construct the lookup keys that match the identifiers created by indexers.

field_format

Looks up pod metadata using a key created with a string format that can include event fields.

This matcher has an option format to define the string format. This string format can contain placeholders for any field in the event.

For example, the following configuration uses the ip_port indexer to identify the pod metadata by combinations of the pod IP and its exposed ports, and uses the destination IP and port in events as match keys:

processors:
- add_kubernetes_metadata:
    ...
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - ip_port:
    matchers:
      - field_format:
          format: '%{[destination.ip]}:%{[destination.port]}'
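
To make the placeholder expansion concrete, here is a minimal Python sketch of how such a format string could be turned into a lookup key. The names format_key and get_field are hypothetical, and returning no key when a referenced field is missing is an assumption of this sketch rather than documented behavior.

```python
import re
from typing import Optional

# Matches placeholders of the form %{[field.name]}.
PLACEHOLDER = re.compile(r"%\{\[([^\]]+)\]\}")


def get_field(event: dict, dotted: str):
    """Resolve a dotted field name such as 'destination.ip' in a nested event."""
    value = event
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def format_key(fmt: str, event: dict) -> Optional[str]:
    """Expand every placeholder in fmt with the matching event field."""
    missing = False

    def substitute(match: "re.Match") -> str:
        nonlocal missing
        value = get_field(event, match.group(1))
        if value is None:
            missing = True  # assumption: an incomplete key is not used
            return ""
        return str(value)

    key = PLACEHOLDER.sub(substitute, fmt)
    return None if missing else key


event = {"destination": {"ip": "10.0.0.5", "port": 80}}
print(format_key("%{[destination.ip]}:%{[destination.port]}", event))
```

For the event above, the resulting key has the same ip:port shape as the identifiers created by the ip_port indexer, which is what allows the two to match.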

fields

Looks up pod metadata using the value of specific fields as the key. When multiple fields are defined, the first one included in the event is used.

This matcher has an option lookup_fields to define the fields whose value will be used for lookup.

For example, the following configuration uses the ip_port indexer to identify pods, and defines a matcher that uses the destination IP or the server IP for the lookup, whichever it finds first in the event:

processors:
- add_kubernetes_metadata:
    ...
    default_indexers.enabled: false
    default_matchers.enabled: false
    indexers:
      - ip_port:
    matchers:
      - fields:
          lookup_fields: ['destination.ip', 'server.ip']
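
The "first field found" rule can be sketched in Python as below. The function names lookup_key and get_field are hypothetical, and the sketch treats a field as present when it resolves to a non-null value in a nested event, which is an assumption.

```python
from typing import List, Optional


def get_field(event: dict, dotted: str):
    """Resolve a dotted field name such as 'server.ip' in a nested event."""
    value = event
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def lookup_key(event: dict, lookup_fields: List[str]) -> Optional[str]:
    """Return the value of the first configured field present in the event."""
    for field in lookup_fields:
        value = get_field(event, field)
        if value is not None:
            return str(value)
    return None  # none of the configured fields exist in this event


fields = ["destination.ip", "server.ip"]
print(lookup_key({"server": {"ip": "10.0.0.9"}}, fields))
print(lookup_key({"destination": {"ip": "10.0.0.5"}, "server": {"ip": "10.0.0.9"}}, fields))
```

Because the order of lookup_fields decides which value wins, list the most specific or most reliable field first.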