Connector clients

Connector clients are Elastic connectors that you self-manage on your own infrastructure. This means they run outside of your Elastic deployment.

You can run them from source or from a Docker container. Deploying your self-managed connector details the options.

For a concrete end-to-end example, see PostgreSQL connector client tutorial.

The following connectors are available as connector clients:

Availability and Elastic prerequisites


Elastic 8.4.0 or later is required.

Your Elastic deployment must include the following Elastic services:

  • Elasticsearch
  • Kibana
  • Enterprise Search

(A new Elastic Cloud deployment includes these services by default.)
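As a quick illustration of the version requirement, the following Python sketch compares a version string against the 8.4.0 minimum. The helper names `parse_version` and `meets_minimum_version` are hypothetical and not part of any Elastic client. The sketch assumes plain `major.minor.patch` strings, optionally followed by a suffix such as `-SNAPSHOT`.

```python
from __future__ import annotations

MINIMUM_VERSION = (8, 4, 0)  # taken from the prerequisite stated above


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring any '-suffix' such as '-SNAPSHOT'."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch)


def meets_minimum_version(version: str) -> bool:
    """Return True if the given version is 8.4.0 or later."""
    # Tuples compare element by element, so (8, 10, 1) > (8, 4, 0).
    return parse_version(version) >= MINIMUM_VERSION
```

Comparing integer tuples rather than raw strings avoids the common mistake of treating `8.10.0` as older than `8.4.0`.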

Support and licensing

Depending on how you use connector clients, support and licensing requirements will vary.

Refer to the following subscriptions pages for details. Find your connector of interest in the Elastic Search section under Client Integrations:

Note the following information regarding support for connector clients:

  • A converted but unmodified native connector is supported by Elastic.
  • A converted but customized native connector is not supported by Elastic.

You must have access to Kibana and have write indices privileges for the .elastic-connectors index.

You must additionally deploy the connector service. See Deploying your self-managed connector.

To set up, configure, and run a connector, you’ll move between your third-party service, the Kibana UI, and your terminal.

Initial setup in Kibana

Follow the Connector workflow in the Kibana UI to select the Connector ingestion method.

Next, complete these steps:

  1. Choose which third-party service you’d like to use by selecting a data source.
  2. Create and name a new Elasticsearch index.
  3. Generate a new API key and save it somewhere safe.
  4. Name your connector and provide an optional description.
  5. Convert the native connector to a connector client (only applicable if the connector is also available natively). This action is irreversible.
  6. Copy the configuration block from the example shown on the screen. You’ll use this in a later step:

    # ...
    api_key: '<YOUR-API-KEY>'
    connector_id: '<YOUR-CONNECTOR-ID>'
    service_type: 'postgresql' # The third party service you've selected

Deploying your self-managed connector

To use connector clients, you must deploy the connector service. The source code is hosted in the elastic/connectors-python repository.

You have two deployment options:

Option 1: Run from source

This is a good option if you’re comfortable working with Python and want to iterate quickly locally.

Clone the repository and edit config.yml

Once you’ve created an index and entered the access details for your data source, you’re ready to deploy the connector service.

First, you need to clone the elastic/connectors-python repository.

Follow these steps:

  • Clone or fork the connectors-python repository locally. To clone it, run: git clone https://github.com/elastic/connectors-python
  • Open the config.yml configuration file in your editor of choice.
  • Replace the values for host (your Elasticsearch endpoint), api_key, connector_id, and service_type.

    Example config.yml file:

    Replace the values for api_key, connector_id, and service_type with the values you copied earlier.

    elasticsearch:
      host: 'https://<my-elastic-deployment>.es.us-west2.gcp.elastic-cloud.com' # Your Elasticsearch endpoint
      api_key: '<YOUR-API-KEY>' # Your API key
      ssl: true
      bulk:
        queue_max_size: 1024
        queue_max_mem_size: 25
        display_every: 100
        chunk_size: 1000
        max_concurrency: 5
        chunk_max_mem_size: 5
        concurrent_downloads: 10
      request_timeout: 120
      max_wait_duration: 120
      initial_backoff_duration: 1
      backoff_multiplier: 2
      log_level: info
    
    service:
      idling: 30
      heartbeat: 300
      max_errors: 20
      max_errors_span: 600
      max_concurrent_syncs: 1
      job_cleanup_interval: 300
      log_level: INFO
    
    
    # Connector client settings
    connector_id: '<YOUR-CONNECTOR-ID>' # Your connector ID
    service_type: 'postgresql'  # The service type for your connector
    
    sources:
      # mongodb: connectors.sources.mongo:MongoDataSource
      # s3: connectors.sources.s3:S3DataSource
      # dir: connectors.sources.directory:DirectoryDataSource
      # mysql: connectors.sources.mysql:MySqlDataSource
      # network_drive: connectors.sources.network_drive:NASDataSource
      # google_cloud_storage: connectors.sources.google_cloud_storage:GoogleCloudStorageDataSource
      # azure_blob_storage: connectors.sources.azure_blob_storage:AzureBlobStorageDataSource
      postgresql: connectors.sources.postgresql:PostgreSQLDataSource
      # oracle: connectors.sources.oracle:OracleDataSource
      # sharepoint: connectors.sources.sharepoint:SharepointDataSource
      # mssql: connectors.sources.mssql:MSSQLDataSource
      # jira: connectors.sources.jira:JiraDataSource
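Before starting the service, it can help to confirm that none of the four values listed above are still placeholders. The following is a minimal Python sketch, not part of the connectors-python code base. It assumes the YAML file has already been parsed into a dictionary (for example with a YAML library), and the name `find_unset_fields` is hypothetical.

```python
from __future__ import annotations

# Fields this page asks you to replace in config.yml.
REQUIRED_FIELDS = (
    ("elasticsearch", "host"),
    ("elasticsearch", "api_key"),
    ("connector_id",),
    ("service_type",),
)


def find_unset_fields(config: dict) -> list[str]:
    """Return dotted paths of required fields that are missing or still placeholders."""
    unset = []
    for path in REQUIRED_FIELDS:
        value = config
        for key in path:
            # Walk down the nested dictionaries; stop at None if a level is missing.
            value = value.get(key) if isinstance(value, dict) else None
        text = "" if value is None else str(value).strip()
        # Treat empty values and '<...>' placeholders as not yet filled in.
        if not text or (text.startswith("<") and text.endswith(">")):
            unset.append(".".join(path))
    return unset


example = {
    "elasticsearch": {"host": "https://localhost:9200", "api_key": "<YOUR-API-KEY>"},
    "connector_id": "<YOUR-CONNECTOR-ID>",
    "service_type": "postgresql",
}
print(find_unset_fields(example))  # ['elasticsearch.api_key', 'connector_id']
```

An empty list means all four values have been filled in. The check says nothing about whether the values are actually valid.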

Run the connector service

Once you’ve configured the connector code, you can run the connector service.

In your terminal or IDE:

  1. cd into the root of your connectors-python clone/fork.
  2. Run the following commands to install the dependencies and run the connector service:

    make install
    make run

The connector service should now be running. The Kibana UI will let you know that the connector has successfully connected to your Elasticsearch instance.

Here we’re working locally. In a production setup, you’ll deploy the connector service to your own infrastructure.
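If you prefer to script the run-from-source steps, the commands above can be collected as follows. This is only a sketch: the helper names `source_run_steps` and `run_steps` are hypothetical. It assumes `git` and `make` are on your PATH and that you edit config.yml between cloning and running.

```python
from __future__ import annotations

import subprocess
from pathlib import Path

REPO_URL = "https://github.com/elastic/connectors-python"


def source_run_steps(parent: Path) -> list[tuple[list[str], Path]]:
    """Return (command, working directory) pairs for the run-from-source steps."""
    # 'git clone' creates a directory named after the repository by default.
    checkout = parent / "connectors-python"
    return [
        (["git", "clone", REPO_URL], parent),
        (["make", "install"], checkout),
        (["make", "run"], checkout),
    ]


def run_steps(steps: list[tuple[list[str], Path]]) -> None:
    """Run each command in its working directory, stopping at the first failure."""
    for argv, cwd in steps:
        subprocess.run(argv, cwd=cwd, check=True)
```

In practice you would run the clone step first, edit config.yml, and only then run the two `make` steps, for example with `run_steps(source_run_steps(Path.home())[1:])`.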

Option 2: Run with Docker

You can also deploy connector clients using Docker. This is a good option if you don’t want to worry about your local Python environment.

To do this you will:

  • Create a Docker network.
  • Create a directory to be mounted into the Docker container.
  • Download the sample configuration file (config.yml) into that directory.
  • Update the config file with your specific values:

    • host (your Elasticsearch endpoint)
    • api_key
    • connector_id
    • service_type
  • Run the Docker image with the connector service.

Refer to DOCKER.md in the repo for a more detailed guide.
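The Docker steps can be sketched as the following pair of commands, built here as argument lists in Python without executing anything. This is an illustration only: the network name `elastic`, the in-container mount point `/config`, and the image name passed in are all assumptions, and the helper `docker_commands` is hypothetical. DOCKER.md gives the exact image tag and any arguments the container needs.

```python
from __future__ import annotations

from pathlib import Path


def docker_commands(config_dir: Path, image: str, network: str = "elastic") -> list[list[str]]:
    """Build argv lists for the network and run steps; nothing is executed here."""
    return [
        # Step 1: create a Docker network.
        ["docker", "network", "create", network],
        # Final step: run the image with the config directory mounted.
        # '/config' is an assumed mount point, not confirmed by this page.
        [
            "docker", "run", "--rm",
            "--network", network,
            "-v", f"{config_dir}:/config",
            image,
        ],
    ]
```

Keeping the commands as lists makes them easy to review or print before passing each one to `subprocess.run`.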

Enter data source details in Kibana

Once the connector service is running, it’s time to head back to the Kibana UI to finalize the connector configuration. In this step, you need to add the specific connection details about your data source instance, like URL, authorization credentials, etc. As mentioned above, these details will vary based on the third-party data source you’re connecting to.

For example, the PostgreSQL connector requires the following details:

  • Host
  • Port
  • Username
  • Password
  • Database
  • Comma-separated list of tables
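To make the shape of these details concrete, here is a small Python sketch that models the six PostgreSQL fields and parses the comma-separated table list. The class `PostgresSourceDetails` and the function `parse_tables` are hypothetical names used for illustration. They do not reflect how Kibana or the connector service stores this configuration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PostgresSourceDetails:
    """The connection details the PostgreSQL connector asks for in Kibana."""
    host: str
    port: int
    username: str
    password: str
    database: str
    tables: tuple[str, ...]


def parse_tables(raw: str) -> tuple[str, ...]:
    """Split a comma-separated table list, trimming whitespace and dropping empties."""
    return tuple(name.strip() for name in raw.split(",") if name.strip())


details = PostgresSourceDetails(
    host="localhost",
    port=5432,  # PostgreSQL's default port
    username="<YOUR-USERNAME>",
    password="<YOUR-PASSWORD>",
    database="<YOUR-DATABASE>",
    tables=parse_tables("customers, orders ,"),
)
print(details.tables)  # ('customers', 'orders')
```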

You’re now ready to run a sync. Select the Full sync button in the Kibana UI to start ingesting documents into Elasticsearch.

Tutorial

Our PostgreSQL connector client tutorial is a concrete example of running a self-managed connector service and a connector client. This example uses the PostgreSQL connector but the basic process is the same for all connector clients.

Connector testing

The connector framework enables you to run end-to-end (E2E) tests on your connector clients, against a real data source.

To avoid tampering with a real Elasticsearch instance, E2E tests run an isolated Elasticsearch instance in Docker. Configuration values are set in your docker-compose.yml file. Docker Compose manages the setup of the development environment, including both the mock Elastic instance and mock data source.

E2E tests use default configuration values for the connector. Find instructions about testing in each connector’s documentation.

Connector framework

The Elastic Python connector framework enables you to:

  • Customize existing connector clients.
  • Build your own connector clients.

Refer to Elastic connector framework for more information.