Configuring Apache Kafka & Confluent REST Proxy

Introduction

What is Apache Kafka

Apache Kafka is a high-performance distributed message queueing and event stream processing platform. Producers send data to Kafka, which persists incoming events to disk for a configurable retention period. Kafka logically groups events into topics, to which consumers subscribe.

What is Confluent REST Proxy

Confluent is a Kafka distribution targeted at enterprises. It adds monitoring, security, and management components to open source Kafka.

While vanilla Kafka only accepts incoming data via its native binary protocol, Confluent comes with a REST proxy that is similar in nature to Splunk HEC or the Elasticsearch REST API. REST Proxy is part of the free community edition of Confluent.

When to Use Kafka with uberAgent

By routing uberAgent’s data through Kafka, enterprises can make end-user computing metrics available to individual teams within the organization in a publish/subscribe model. Once access has been granted to a topic, a team can start consuming that topic’s metrics. While some teams might be interested in uberAgent’s asset and inventory information, others might need access to application performance or browser web app usage data.

Kafka makes it possible to distribute uberAgent’s metrics in a highly scalable manner, supporting hundreds of thousands of endpoints (data producers) and thousands of consumers. uberAgent natively supports Kafka via the Confluent REST proxy.

Prerequisites

In order for uberAgent to be able to send data to a Kafka backend, the following components are required:

  • Confluent REST Proxy with Confluent Schema Registry
  • Apache Kafka

Please see the system requirements for details.

Implementation Details

Topics

Kafka Topics and uberAgent Sourcetypes

uberAgent uses Splunk’s concept of sourcetypes to logically group events. uberAgent sourcetype names always start with uberAgent, followed by two additional sections separated by colons. Example:

uberAgent:Application:Errors
<!--NeedCopy-->

uberAgent sourcetypes are mapped to Kafka topics in a 1:1 relationship. Since colons are not allowed in topic names, they are replaced by underscores. The sourcetype from the above example maps to the following Kafka topic:

uberAgent_Application_Errors
<!--NeedCopy-->
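
The mapping is a simple string substitution. As a minimal sketch in Python (the function name is illustrative, not part of uberAgent):

def sourcetype_to_topic(sourcetype: str) -> str:
    """Map an uberAgent sourcetype to its Kafka topic name."""
    return sourcetype.replace(":", "_")

print(sourcetype_to_topic("uberAgent:Application:Errors"))
# -> uberAgent_Application_Errors
<!--NeedCopy-->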

Topic Creation

uberAgent does not create Kafka topics. For uberAgent to be able to send data to Kafka, either topic auto-creation must be enabled (broker setting auto.create.topics.enable), or the topics needed by uberAgent must be created manually.

uberAgent’s sourcetypes are documented here. A listing of all sourcetypes is available in the file uA_sourcetypes_metrics.csv which is part of the uberAgent dashboard app (included in the uberAgent download).
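
If auto-creation is disabled, topic creation can be scripted. The following is a minimal sketch using the confluent-kafka Python package; the bootstrap server, sourcetype list, partition count, and replication factor are examples that must be adapted to the actual cluster:

from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# One topic per uberAgent sourcetype, with colons replaced by underscores
sourcetypes = ["uberAgent:Application:Errors", "uberAgent:Process:ProcessDetail"]
topics = [NewTopic(st.replace(":", "_"), num_partitions=8, replication_factor=3)
          for st in sourcetypes]

# create_topics() is asynchronous and returns one future per topic
for topic, future in admin.create_topics(topics).items():
    future.result()  # raises an exception if creation failed
    print(f"created {topic}")
<!--NeedCopy-->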

Partitions

Kafka stores topic data in partitions that can be distributed across the nodes of a Kafka cluster. Within a consumer group, each partition is consumed by at most one member, so a topic needs at least as many partitions as its largest consumer group has consumers.

Kafka producers like uberAgent can either select a specific partition for each event or have events distributed randomly across partitions. uberAgent uses keyed partitioning: it specifies the hostname as the key for each event, which ensures that all events from a given host are always stored in the same partition.
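
Conceptually, keyed partition selection works like the following Python sketch. Note that this is an illustration only; Kafka’s default partitioner actually applies a murmur2 hash to the serialized key bytes:

def select_partition(hostname: str, num_partitions: int) -> int:
    # Conceptual stand-in: Kafka hashes the serialized key with murmur2,
    # not with Python's hash(). The property that matters is the same:
    # identical keys always map to the same partition.
    return hash(hostname) % num_partitions

# All events from host PC001 land in one partition (within a process run)
assert select_partition("PC001", 8) == select_partition("PC001", 8)
<!--NeedCopy-->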

Schema and Data Format

uberAgent uses Avro as the data format when sending to Kafka. Avro requires a schema definition for every event, which guarantees high data quality and makes processing by downstream consumers easier.

In order to minimize network data volume, uberAgent sends the full schema definition only for the first event of a topic. All further events of the same topic reference the topic’s schema definition by the ID returned from the schema registry in response to the first event.

Avro requires REST Proxy to be configured with a schema registry URL (configuration setting schema.registry.url).

Event Metadata

Each uberAgent event has the same default metadata fields in Kafka as in Splunk:

  • time
  • host
  • index
  • source
  • sourcetype

The host is used as the key for Kafka partition selection (see above). The sourcetype determines the Kafka topic (see above) and is additionally added to each event as a dedicated field.

REST Proxy API Details

uberAgent sends data to the Confluent REST Proxy API v2 via HTTP POST to /topics/TOPIC, where TOPIC is the Kafka topic mapped from the sourcetype as described above. Since the events are Avro-encoded, the HTTP content type is application/vnd.kafka.avro.v2+json.
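
Put together, the produce requests look roughly like the following Python sketch (using the requests package). The proxy URL is an example, and the Avro schema shown is a toy schema for illustration, not uberAgent’s actual schema:

import json
import requests

PROXY = "https://confluent-rest-proxy.company.com:8084"  # example address
TOPIC = "uberAgent_Application_Errors"
HEADERS = {"Content-Type": "application/vnd.kafka.avro.v2+json"}

# Toy value schema for illustration, not uberAgent's actual schema
value_schema = json.dumps({
    "type": "record", "name": "Event",
    "fields": [{"name": "host", "type": "string"},
               {"name": "message", "type": "string"}],
})

# First event of a topic: send the full schemas; the key is the hostname
first = requests.post(
    f"{PROXY}/topics/{TOPIC}", headers=HEADERS,
    json={"key_schema": '"string"',
          "value_schema": value_schema,
          "records": [{"key": "PC001",
                       "value": {"host": "PC001", "message": "example"}}]})
ids = first.json()  # contains key_schema_id and value_schema_id

# Subsequent events: reference the registered schemas by ID only
requests.post(
    f"{PROXY}/topics/{TOPIC}", headers=HEADERS,
    json={"key_schema_id": ids["key_schema_id"],
          "value_schema_id": ids["value_schema_id"],
          "records": [{"key": "PC001",
                       "value": {"host": "PC001", "message": "example"}}]})
<!--NeedCopy-->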

Configuration

uberAgent

To configure uberAgent to send data to Kafka via Confluent REST Proxy, a configuration section similar to the following is required:

[Receiver]
Name = Kafka
Type = Kafka
Protocol = HTTP
Servers = https://confluent-rest-proxy.company.com:8084/
TLSClientCertificate = LocalMachine\MY\abcd123456789123456789123456789123456789
<!--NeedCopy-->

The receiver Name can be any string. The Protocol must be HTTP, which stands for a REST endpoint accessed via http or https. The Servers parameter accepts a comma-separated list of REST proxies that are contacted in a round-robin fashion for fault tolerance and load sharing. TLSClientCertificate optionally specifies a client certificate to use for authentication (see below).
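
For example, a fault-tolerant setup with two REST proxies (hostnames are illustrative) could specify:

Servers = https://confluent-rest-proxy-1.company.com:8084/, https://confluent-rest-proxy-2.company.com:8084/
<!--NeedCopy-->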

Kafka

Please see the implementation details section above for information on how to configure Confluent and Kafka for receiving data from uberAgent.

Authentication

Confluent REST Proxy supports client authentication via HTTPS client certificates. Make sure to specify the TLSClientCertificate option in the uberAgent configuration.

It is not necessary to issue individual certificates to clients. A single shared certificate with CN=uberAgent (or similar) is typically the simplest option.

Confluent REST Proxy propagates the client certificate to the Kafka broker in one of two ways:

  • TLS
  • SASL

The following sections show the configuration steps for each option.

Client Certificate Propagation via TLS

In order for certificate propagation from REST Proxy to the Kafka brokers to work, the following requirements must be met:

  • REST Proxy and the Kafka brokers must be configured for TLS support
  • A root CA that issues all required certificates and is trusted by all components is highly recommended
  • REST Proxy’s client certificate store must contain the certificate (including the private key) uberAgent authenticates with

REST Proxy Configuration

This example shows a simple configuration with all components on one server. In a real-world deployment, the components would typically be distributed across multiple machines; adjust the URLs accordingly.

Add the following to the file /etc/kafka-rest/kafka-rest.properties:

###################################################
#
# Schema registry
#
###################################################

schema.registry.url=http://localhost:8081

###################################################
#
# REST proxy to broker connection
#
###################################################

bootstrap.servers=localhost:9093
client.security.protocol=SSL

# The client keystore contains the client certificate (including the private key) that is used by uberAgent to authenticate. REST proxy passes it on to the broker.
client.ssl.keystore.location=/opt/vastlimits/certificates/kafka.clientkeystore.jks
client.ssl.keystore.password=PASSWORD3
client.ssl.key.password=PASSWORD3

# The truststore contains the collection of CA certificates trusted by this application
client.ssl.truststore.location=/opt/vastlimits/certificates/kafka.truststore.jks
client.ssl.truststore.password=PASSWORD1

###################################################
#
# Miscellaneous
#
###################################################

# Listeners (port 8083 cannot be used for REST Proxy because it is used by Confluent Connect)
listeners=https://0.0.0.0:8084

###################################################
#
# Security plugin
#
###################################################

# License for commercial edition (required for the security plugin)
confluent.license=LICENSE_KEY

# Enable the REST proxy security plugin
kafka.rest.resource.extension.class=io.confluent.kafkarest.security.KafkaRestSecurityResourceExtension

# Whether or not to require the HTTPS client to authenticate via the server's trust store (required if the security plugin is enabled)
ssl.client.auth=true

###################################################
#
# HTTPS from clients to REST proxy
#
###################################################

# Web server certificate for REST proxy
ssl.keystore.location=/opt/vastlimits/certificates/kafka.keystore.jks
ssl.keystore.password=PASSWORD2
ssl.key.password=PASSWORD2

# The truststore lists certificates that are allowed to connect. Specify either a single CA certificate or individual client certificates.
ssl.truststore.location=/opt/vastlimits/certificates/kafka.truststore.jks
ssl.truststore.password=PASSWORD1
<!--NeedCopy-->
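
With REST Proxy configured, a quick end-to-end test can be run from any machine that holds the client certificate. The following Python sketch uses the requests package; the paths are examples, and since requests expects PEM files, a JKS/PFX client certificate must be converted first:

import json
import requests

schema = json.dumps({"type": "record", "name": "Test",
                     "fields": [{"name": "host", "type": "string"}]})

r = requests.post(
    "https://confluent-rest-proxy.company.com:8084/topics/uberAgent_Application_Errors",
    headers={"Content-Type": "application/vnd.kafka.avro.v2+json"},
    json={"value_schema": schema,
          "records": [{"value": {"host": "PC001"}}]},
    # Client certificate and key in PEM format (converted from JKS/PFX)
    cert=("/opt/vastlimits/certificates/client.pem",
          "/opt/vastlimits/certificates/client.key"),
    # CA certificate that issued the REST Proxy server certificate
    verify="/opt/vastlimits/certificates/ca.pem")
print(r.status_code, r.json())
<!--NeedCopy-->
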
Kafka Broker Configuration

Add the following to the file /etc/kafka/server.properties:

# The truststore lists certificates that are allowed to connect. Specify either a single CA certificate or individual client certificates.
ssl.truststore.location=/opt/vastlimits/certificates/kafka.truststore.jks
ssl.truststore.password=PASSWORD1

# Server certificate for the broker
ssl.keystore.location=/opt/vastlimits/certificates/kafka.keystore.jks
ssl.keystore.password=PASSWORD2
ssl.key.password=PASSWORD2

# Both plaintext and SSL listeners on different ports. Plaintext can be used from the schema registry, for example.
listeners=PLAINTEXT://:9092,SSL://:9093

# Enforce authentication for clients connecting via SSL
ssl.client.auth=required
<!--NeedCopy-->
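
To verify the broker side, a test consumer can connect to the SSL listener directly. The following sketch uses the confluent-kafka Python package; file paths and the group ID are examples, and since the event values are Avro-encoded this only checks connectivity and partition assignment:

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9093",  # the broker's SSL listener
    "security.protocol": "SSL",
    "ssl.ca.location": "/opt/vastlimits/certificates/ca.pem",
    "ssl.certificate.location": "/opt/vastlimits/certificates/client.pem",
    "ssl.key.location": "/opt/vastlimits/certificates/client.key",
    "group.id": "uberagent-test",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["uberAgent_Application_Errors"])

msg = consumer.poll(timeout=10.0)
if msg is not None and msg.error() is None:
    # Values are Avro-encoded; print the raw bytes as a connectivity check
    print(f"partition {msg.partition()}: {msg.value()[:80]}")
consumer.close()
<!--NeedCopy-->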

Client Certificate Propagation via SASL

We are waiting for information from Confluent on some specifics of this configuration. We will update this section once that is available.
