# Monitoring Kafka streaming ETL deployments

This demo shows how to deploy a Kafka streaming ETL that uses KSQL for stream processing and Confluent Control Center for monitoring. All components of the Confluent Platform have security enabled end-to-end. Run the demo by following the playbook and video tutorials.

## Table of Contents

- [Overview](#overview)
- [Documentation](#documentation)

## Overview

The use case is a streaming ETL deployment built on live edits to real Wikipedia pages. The Wikimedia Foundation has IRC channels that publish edits to real wiki pages (e.g. #en.wikipedia, #en.wiktionary) in real time. Using Kafka Connect, a Kafka source connector `kafka-connect-irc` streams raw messages from these IRC channels, and a custom Kafka Connect transform `kafka-connect-transform-wikiedit` transforms these messages, which are then written to a Kafka cluster. This demo uses KSQL for data enrichment, or you can optionally develop and run your own Kafka Streams application. A Kafka sink connector `kafka-connect-elasticsearch` then streams the data out of Kafka, applying another custom Kafka Connect transform called `NullFilter`. The data is materialized into Elasticsearch for analysis by Kibana.
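For a rough sense of what the enrichment step looks like, KSQL typically registers the connector's output topic as a stream and derives filtered streams from it. The statements below are a minimal sketch, not taken from the demo itself: the topic name `wikipedia.parsed`, the `isbot` field, and the stream names are illustrative assumptions, and the demo's actual statements are in the documentation linked below.

```sql
-- Sketch only: register the source connector's output topic as a KSQL stream.
-- With Avro values, the schema is read from Schema Registry, so no columns are declared here.
CREATE STREAM wikipedia WITH (KAFKA_TOPIC='wikipedia.parsed', VALUE_FORMAT='AVRO');

-- Derive an enriched stream that keeps only edits made by human (non-bot) users;
-- a sink connector could then stream this derived topic into Elasticsearch.
CREATE STREAM wikipedianobot AS
  SELECT * FROM wikipedia WHERE isbot <> true;
```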


## Documentation

You can find the documentation for running this demo, along with the playbook and video tutorials, at https://docs.confluent.io/current/tutorials/cp-demo/docs/index.html.