feature(kafka-localstack): introducing docker-compose base kafka setup #6946

fruch · 2023-12-14T12:36:40Z

Since we want to be able to run Scylla Kafka connectors with Scylla clusters created by SCT, we are introducing here the first Kafka backend, which will be used for local development (with the SCT docker backend)

* include a way to configure the connector as needed (also multiple connectors)
* install it from the hub or by URL
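
As a rough illustration of the "hub or URL" choice, here is a minimal sketch of how a connector entry could be classified by its source. The `ConnectorConfig` class, its fields other than `source`, the `install_method` helper, and the example coordinates are all hypothetical and are not taken from this PR's code.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectorConfig:
    """Hypothetical connector description; not the PR's actual config class."""
    name: str
    source: str  # either a hub coordinate or a download URL (assumption)
    config: dict = field(default_factory=dict)


def install_method(connector: ConnectorConfig) -> str:
    """Return "url" for http(s) sources, otherwise treat the source as a hub coordinate."""
    if connector.source.startswith(("http://", "https://")):
        return "url"
    return "hub"


# Several connectors can be described side by side (placeholder values).
connectors = [
    ConnectorConfig(name="from-hub", source="example-owner/example-connector:1.0.0"),
    ConnectorConfig(name="from-url", source="https://example.com/example-connector.jar"),
]
methods = {c.name: install_method(c) for c in connectors}
```

Keeping the decision in one small function means each configured connector can be handled the same way regardless of how many are listed.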

Testing

sdcm/kafka/kafaka_cluster.py

sdcm/sct_config.py

sdcm/kafka/kafka_cluster.py

sdcm/tester.py

test-cases/kafka/longevity-kafka-cdc-docker.yaml

soyacz · 2024-02-12T07:36:43Z

are there kafka metrics worth adding to monitoring? If yes, it can be done in a follow-up task.

fruch · 2024-02-12T08:03:29Z

> are there kafka metrics worth adding to monitoring? If yes, it can be done in a follow-up task.

this one is a local setup of kafka, I don't think monitoring is needed, at least not yet. (we have monitoring data of the sct-runner)

soyacz · 2024-02-12T08:08:48Z

> > are there kafka metrics worth adding to monitoring? If yes, it can be done in a follow-up task.
>
> this one is a local setup of kafka, I don't think monitoring is needed, at least not yet. (we have monitoring data of the sct-runner)

they look nice: https://grafana.com/docs/grafana-cloud/monitor-infrastructure/integrations/integration-reference/integration-kafka/
It could help us when there are issues with kafka-connector

fruch · 2024-02-12T08:27:24Z

> > > are there kafka metrics worth adding to monitoring? If yes, it can be done in a follow-up task.
> >
> > this one is a local setup of kafka, I don't think monitoring is needed, at least not yet. (we have monitoring data of the sct-runner)
>
> they look nice: https://grafana.com/docs/grafana-cloud/monitor-infrastructure/integrations/integration-reference/integration-kafka/
> It could help us when there are issues with kafka-connector

JMX never looks nice...

it's too early for this, once we have VMs and a full cluster, we might consider installing those.

for now I care more about the functional side of things, and how this setup integrates with a longevity test.
and the real missing part is reading/writing to kafka for the actual test/verification

jenkins-pipelines/oss/kafka_connectors/longevity-kafka-cdc-aws.jenkinsfile

sdcm/kafka/kafka_consumer.py

unit_tests/test_kafka.py

sdcm/kafka/kafka_config.py

fruch · 2024-05-08T15:52:34Z

So the longevity code we have basically works

But it hangs because we don't have code to stop the Kafka reading thread. We might use the idea of a teardown validator to validate and stop the reading thread

soyacz · 2024-05-14T14:10:09Z

> So the longevity code we have basically works
>
> But it hangs because we don't have code to stop the Kafka reading thread. We might use the idea of a teardown validator to validate and stop the reading thread

I don't understand why we cannot add this verification to teardown itself. Why is a teardown validator required?

fruch · 2024-05-14T17:10:46Z

> > So the longevity code we have basically works
> >
> > But it hangs because we don't have code to stop the Kafka reading thread. We might use the idea of a teardown validator to validate and stop the reading thread
>
> I don't understand why we cannot add this verification to teardown itself. Why is a teardown validator required?

It was an idea, validators seemed like a natural place for it

I'm now trying a different approach of adding this logic to the reader thread itself.
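
To make the "logic in the reader thread itself" idea concrete, here is a minimal sketch of a reader that stops on its own, either when it has seen the expected number of rows or when nothing arrives for a while. It is not the code from this PR: `RowCountingReader`, `expected_rows`, and `idle_timeout` are hypothetical names, and a plain `queue.Queue` stands in for a real Kafka consumer.

```python
import queue
import threading


class RowCountingReader(threading.Thread):
    """Illustrative reader thread that counts messages and ends by itself."""

    def __init__(self, messages, expected_rows, idle_timeout=1.0):
        super().__init__(daemon=True)
        self.messages = messages            # stand-in for a Kafka consumer (assumption)
        self.expected_rows = expected_rows  # validation target: number of rows
        self.idle_timeout = idle_timeout    # seconds to wait for the next message
        self.rows_read = 0
        self.stop_event = threading.Event()

    def run(self):
        while not self.stop_event.is_set():
            try:
                self.messages.get(timeout=self.idle_timeout)
            except queue.Empty:
                break  # nothing arrived within idle_timeout, stop reading
            self.rows_read += 1
            if self.rows_read >= self.expected_rows:
                break  # expected number of rows reached, stop reading

    def stop(self):
        """Ask the thread to exit; takes effect after the current wait ends."""
        self.stop_event.set()


messages = queue.Queue()
for i in range(5):
    messages.put({"row": i})

reader = RowCountingReader(messages, expected_rows=5, idle_timeout=0.2)
reader.start()
reader.join(timeout=5)
```

Because the loop has its own exit conditions, teardown only needs to `join()` the thread and compare `rows_read` with the expected count, rather than relying on a separate component to stop it.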

sdcm/kafka/kafka_cluster.py

sdcm/tester.py

test-cases/kafka/longevity-kafka-cdc.yaml

soyacz · 2024-05-23T07:43:50Z

sdcm/kafka/kafka_cluster.py

+
+
+class LocalKafkaCluster(cluster.BaseCluster):
+    def __init__(self, remoter=LOCALRUNNER):


will sct_runner survive high load on kafka?

It might, we can scale the runner as needed.

The idea is to have a setup that can work completely locally with the docker backend for development

The next stage is building a cluster of kafka instead of the docker compose setup, and then some Kafka SaaS.

So for this initial step we care about building functionality, not yet about scale. Scaling would be tested on VMs or on SaaS.

worth adding a note that it runs on the sct runner and that the runner's size should be increased.
Generally, a small docstring would be great.

test-cases/kafka/longevity-kafka-cdc.yaml

sdcm/kafka/kafka_consumer.py

fruch · 2024-05-27T18:50:49Z

The two jobs introduced are passing now

One small pre-commit issue, and it's good to go

Since we want to be able to run Scylla Kafka connectors with Scylla clusters created by SCT, we are introducing here the first Kafka backend, which will be used for local development (with the SCT docker backend)

* include a way to configure the connector as needed (also multiple connectors)
* install it from the hub or by URL

**Note**: this doesn't yet include any code that can read out of kafka

with this thread we'll be able to read the data written by the connector, and validate we are getting the information we expect (number of rows as the first validation)

first pipelines, based on docker and aws backends

fruch · 2024-05-27T22:47:30Z

@Bouncheck

I would recommend you try it again, to get familiar with it.

name of a property was changed from `version` to `source` and was missed in one of the configuration files introduced in scylladb#6946, so it started failing on test case linting right after the merge

name of a property was changed from `version` to `source` and was missed in one of the configuration files introduced in #6946, so it started failing on test case linting right after the merge

fruch requested review from vponomaryov, soyacz and juliayakovlev December 14, 2023 12:36

github-actions bot assigned fruch Dec 14, 2023

fruch requested a review from Bouncheck December 14, 2023 12:37

fruch added the test-integration Enable running the integration tests suite label Dec 14, 2023