
[WIP] Add and distribute IQv2 information in KIP-1071 #18278

Open

bbejeck wants to merge 4 commits into base: kip1071

Conversation

@bbejeck bbejeck commented Dec 19, 2024

This PR adds IQv2 information to the StreamsGroupHeartbeat response.

For testing, all IQv2 related integration tests have been updated (and are passing) to use the new streams protocol to exercise the newly added code. A specific integration test and unit tests are coming.

Among the tests updated:

  1. ConsistencyVectorIntegrationTest
  2. EosIntegrationTest
  3. IQv2IntegrationTest
  4. IQv2StoreIntegrationTest
  5. IQv2VersionedStoreIntegrationTest
  6. LagFetchIntegrationTest
  7. OptimizedKTableIntegrationTest
  8. PositionRestartIntegrationTest
  9. QueryableStateIntegrationTest
  10. StoreQueryIntegrationTest

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)
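
As a rough illustration of the idea, the heartbeat response ends up carrying a mapping from each Streams instance's endpoint to the topic partitions it hosts, which is what IQv2 query routing needs. The sketch below uses hypothetical, simplified types (the real fields live in StreamsGroupHeartbeatResponseData), not the actual Kafka classes:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EndpointToPartitionsSketch {
    // Simplified stand-in for an endpoint (host:port of a Streams instance).
    record Endpoint(String host, int port) {}

    // Simplified stand-in for a topic partition.
    record TopicPartition(String topic, int partition) {}

    public static void main(String[] args) {
        // Each group member learns which endpoint hosts which partitions,
        // so an IQv2 query for a given key can be routed to the right instance.
        Map<Endpoint, List<TopicPartition>> endpointToPartitions = new LinkedHashMap<>();
        endpointToPartitions.put(new Endpoint("localhost", 8080),
                List.of(new TopicPartition("input", 0), new TopicPartition("input", 1)));
        endpointToPartitions.forEach((endpoint, partitions) ->
                System.out.println(endpoint.host() + ":" + endpoint.port()
                        + " -> " + partitions.size() + " partitions"));
        // prints: localhost:8080 -> 2 partitions
    }
}
```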

@@ -110,6 +110,7 @@
import org.apache.kafka.image.MetadataImage;
import org.apache.kafka.image.MetadataProvenance;
import org.apache.kafka.server.common.MetadataVersion;

Member Author:

checkstyle fix

@@ -18,6 +18,7 @@
package org.apache.kafka.coordinator.group.taskassignor;

import org.apache.kafka.coordinator.group.GroupCoordinatorConfig;

Member Author:

checkstyle fix

@@ -173,7 +179,7 @@ public void afterTest() {

@AfterAll
public static void after() {
CLUSTER.stop();
cluster.stop();
Member Author:

Checkstyle fix; this applies to all CLUSTER -> cluster updates.

for (Map.Entry<String, Set<Integer>> taskEntry : taskEntrySet) {
String subtopologyId = taskEntry.getKey();
List<Integer> partitions = new ArrayList<>(taskEntry.getValue());
ConfiguredSubtopology configuredSubtopology = group.configuredTopology().subtopologies().get(subtopologyId);
Member Author:

Went with the ConfiguredSubtopology since it contains any resolved regex topics.

Member:

Yes, that is correct.

@@ -2267,7 +2269,26 @@ private CoordinatorResult<StreamsGroupHeartbeatResult, CoordinatorRecord> stream
);
}

// 3. Determine the partition metadata and any internal topics if needed.
// Build the endpoint to topic partition information
Member Author:

Until I flesh out how to send the information only to the group members that don't have it, this seems like the correct spot to build the endpointsToPartitions.

Member:

Yes, this is the right place. I would maybe move the code to the end of this function (since everything related to "building the right response" is towards the end of the function). Also, consider extracting the logic into a separate method, since this one is already very long.

@lucasbru (Member) left a comment:

Hey Bill! This is coming along well. I left some comments; I think there are some important details that we need to take care of in the code.

@@ -2038,7 +2043,7 @@ private static Properties streamsConfiguration(final boolean cache, final boolea
config.put(StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG, StreamsConfig.OPTIMIZE);
config.put(StreamsConfig.APPLICATION_ID_CONFIG, "app-" + safeTestName);
config.put(StreamsConfig.APPLICATION_SERVER_CONFIG, "localhost:" + (++port));
config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, cluster.bootstrapServers());
Member:

These integration tests don't seem to use the new protocol yet. You need to set GROUP_PROTOCOL.

Member Author:

Thanks for the pointer, I missed that one; I'll parameterize that as well.


@BeforeAll
public static void startCluster() throws IOException {
CLUSTER.start();
final Properties props = new Properties();
props.setProperty(GroupCoordinatorConfig.GROUP_COORDINATOR_REBALANCE_PROTOCOLS_CONFIG, "classic,consumer,streams");
Member:

How do you imagine this working on trunk? I guess we should parametrize the tests to test both the old and the new protocol?

Member Author:

Yes, I'll parameterize the tests. I took a shortcut to make sure everything worked.
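
Such a parameterization could look roughly like the plain-Java sketch below. This is only an illustration of the idea (the real tests would use JUnit's parameterized-test support, and the literal config key for GROUP_COORDINATOR_REBALANCE_PROTOCOLS_CONFIG is an assumption here):

```java
import java.util.List;
import java.util.Properties;

public class ProtocolParameterizationSketch {
    // Builds broker-side properties enabling the given rebalance protocols.
    // The config key string is assumed for illustration.
    static Properties coordinatorConfig(String protocols) {
        Properties props = new Properties();
        props.setProperty("group.coordinator.rebalance.protocols", protocols);
        return props;
    }

    public static void main(String[] args) {
        // Run the same test body once per protocol under test, so both the
        // classic and the new streams protocol paths are exercised.
        for (String protocol : List.of("classic", "streams")) {
            Properties props = coordinatorConfig("classic,consumer," + protocol);
            System.out.println("testing with protocols: "
                    + props.getProperty("group.coordinator.rebalance.protocols"));
        }
    }
}
```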

@@ -2267,7 +2269,26 @@ private CoordinatorResult<StreamsGroupHeartbeatResult, CoordinatorRecord> stream
);
}

// 3. Determine the partition metadata and any internal topics if needed.
// Build the endpoint to topic partition information
final Map<String, org.apache.kafka.coordinator.group.streams.Assignment> assignmentMap = group.targetAssignment();
Member:

I wonder if it would be better to use the "current assignment" instead of the target assignment here. The current assignment will more closely model what is currently available on the node. You can find the current assignment in StreamsGroupMember.assignedActiveTasks, etc.

I also considered deriving the topic partitions from the reported taskOffsets instead of the assignment, as those are precisely the tasks that are available on the node. However, in the original StreamsPartitionAssignor (see populatePartitionsByHostMaps), we seem to just take the union of active and standby tasks. I discussed this briefly with Bruno, and it seems this optimization wouldn't really be worth it; just leaving this here in case you have a different opinion.
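
The union approach mentioned above can be sketched as follows. The map shapes (subtopology id to partition set) are assumptions for illustration, not the actual Kafka types:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class TaskUnionSketch {
    // Per-subtopology union of active and standby tasks, mirroring the idea
    // behind populatePartitionsByHostMaps in the original assignor.
    static Map<String, Set<Integer>> unionTasks(Map<String, Set<Integer>> activeTasks,
                                                Map<String, Set<Integer>> standbyTasks) {
        Map<String, Set<Integer>> union = new TreeMap<>();
        activeTasks.forEach((subtopology, partitions) ->
                union.computeIfAbsent(subtopology, k -> new TreeSet<>()).addAll(partitions));
        standbyTasks.forEach((subtopology, partitions) ->
                union.computeIfAbsent(subtopology, k -> new TreeSet<>()).addAll(partitions));
        return union;
    }

    public static void main(String[] args) {
        Map<String, Set<Integer>> active = Map.of("sub-0", Set.of(0, 1));
        Map<String, Set<Integer>> standby = Map.of("sub-0", Set.of(2), "sub-1", Set.of(0));
        System.out.println(unionTasks(active, standby));
        // prints: {sub-0=[0, 1, 2], sub-1=[0]}
    }
}
```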

List<Integer> partitions = new ArrayList<>(taskEntry.getValue());
ConfiguredSubtopology configuredSubtopology = group.configuredTopology().subtopologies().get(subtopologyId);
if (configuredSubtopology != null) {
List<StreamsGroupHeartbeatResponseData.TopicPartition> topicPartitions = Stream.concat(
Member:

I think there is a corner case that needs to be handled. If you look at PartitionGrouper in the original StreamsPartitionAssignor code, there may be copartitioning groups with different numbers of partitions within the same subtopology. So while the number of tasks is always equal to the maximal number of partitions across all source topics, there may be a subset of the source topics that has a lower number of partitions. The example would be a subtopology that merges topic A with 1 partition and topic B with 2 partitions. We will get tasks 0 and 1, but we need to make sure that the topic partitions for task 0 are {A_0,B_0} and the topic partitions for task 1 are {B_1} (because topic A does not have a second partition).

To handle this corner case, it's probably enough to remove all elements of partitions that exceed the number of partitions in the topic.

I'd suggest we add a little class for this logic, so that we can test these corner cases independently of GroupMetadataManager.
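
A minimal sketch of that filtering, using hypothetical names (the real implementation would take its partition counts from the configured topology):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CopartitionFilterSketch {
    // For a given task index, keep only the source topics that actually have
    // that partition. Example from the review: topic A has 1 partition and
    // topic B has 2, so task 0 maps to {A_0, B_0} but task 1 maps only to {B_1}.
    static List<String> topicPartitionsForTask(int task, Map<String, Integer> partitionCounts) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : new TreeMap<>(partitionCounts).entrySet()) {
            if (task < entry.getValue()) {
                result.add(entry.getKey() + "_" + task);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = Map.of("A", 1, "B", 2);
        System.out.println(topicPartitionsForTask(0, counts)); // [A_0, B_0]
        System.out.println(topicPartitionsForTask(1, counts)); // [B_1]
    }
}
```

Keeping this in a small standalone class, as suggested, makes the copartitioning corner cases easy to unit-test without spinning up the group coordinator.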
