[WIP] Add and distribute IQv2 information in KIP-1071 #18278
base: kip1071
Conversation
@@ -110,6 +110,7 @@
import org.apache.kafka.image.MetadataImage;
import org.apache.kafka.image.MetadataProvenance;
import org.apache.kafka.server.common.MetadataVersion;
checkstyle fix
@@ -18,6 +18,7 @@
package org.apache.kafka.coordinator.group.taskassignor;

import org.apache.kafka.coordinator.group.GroupCoordinatorConfig;
checkstyle fix
@@ -173,7 +179,7 @@ public void afterTest() {

    @AfterAll
    public static void after() {
-        CLUSTER.stop();
+        cluster.stop();
Checkstyle fix; this applies to all CLUSTER -> cluster updates.
for (Map.Entry<String, Set<Integer>> taskEntry : taskEntrySet) {
    String subtopologyId = taskEntry.getKey();
    List<Integer> partitions = new ArrayList<>(taskEntry.getValue());
    ConfiguredSubtopology configuredSubtopology = group.configuredTopology().subtopologies().get(subtopologyId);
Went with the ConfiguredSubtopology since it contains any resolved regex topics.
Yes, that is correct.
@@ -2267,7 +2269,26 @@ private CoordinatorResult<StreamsGroupHeartbeatResult, CoordinatorRecord> stream
    );
}

// 3. Determine the partition metadata and any internal topics if needed.
// Build the endpoint to topic partition information
Until I flesh out how to only send the information to the group members that don't have it, this seems like the correct spot to build the endpointsToPartitions.
Yes, this is the right place. I would maybe move the code to the end of this function (since everything related to "building the right response" is towards the end of the function). Also, consider extracting the logic into a separate method, since this one is already very long.
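For illustration, a minimal sketch of what such an extracted helper could look like, using plain Java stand-ins rather than the coordinator's real types (the EndpointToPartitions record, the map-shaped inputs, and the method name are all assumptions, not this PR's actual API):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class EndpointToPartitionsSketch {

    // Stand-in for the response entry: one user endpoint plus its topic partitions.
    record EndpointToPartitions(String userEndpoint, Map<String, List<Integer>> partitionsByTopic) { }

    // Extracted helper: derive endpoint -> topic partitions from each member's assigned
    // tasks, so the heartbeat handler only calls this right before building the response.
    static List<EndpointToPartitions> buildEndpointToPartitions(
            Map<String, String> endpointByMemberId,
            Map<String, Map<String, Set<Integer>>> tasksByMemberId,
            Map<String, List<String>> sourceTopicsBySubtopology) {
        List<EndpointToPartitions> result = new ArrayList<>();
        tasksByMemberId.forEach((memberId, tasksBySubtopology) -> {
            Map<String, List<Integer>> partitionsByTopic = new HashMap<>();
            tasksBySubtopology.forEach((subtopologyId, taskIds) ->
                sourceTopicsBySubtopology.getOrDefault(subtopologyId, List.of()).forEach(topic ->
                    partitionsByTopic.computeIfAbsent(topic, t -> new ArrayList<>()).addAll(taskIds)));
            result.add(new EndpointToPartitions(endpointByMemberId.get(memberId), partitionsByTopic));
        });
        return result;
    }
}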
Hey Bill! This is coming along well. I left some comments; I think there are some important details that we need to take care of in the code.
@@ -2038,7 +2043,7 @@ private static Properties streamsConfiguration(final boolean cache, final boolea
    config.put(StreamsConfig.TOPOLOGY_OPTIMIZATION_CONFIG, StreamsConfig.OPTIMIZE);
    config.put(StreamsConfig.APPLICATION_ID_CONFIG, "app-" + safeTestName);
    config.put(StreamsConfig.APPLICATION_SERVER_CONFIG, "localhost:" + (++port));
-    config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, CLUSTER.bootstrapServers());
+    config.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, cluster.bootstrapServers());
All these integration tests don't seem to use the new protocol yet. You need to set GROUP_PROTOCOL.
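For reference, a sketch of the kind of line that would go into the streamsConfiguration(...) helper shown above; whether the key is exposed as a StreamsConfig constant on the kip1071 branch is an assumption to verify:

    // Assumed "group.protocol" key on the kip1071 branch; opts the test into the
    // new streams rebalance protocol ("classic" would keep the old behavior).
    config.put("group.protocol", "streams");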
Thanks for the pointer, I missed that one; I'll parameterize that as well.
    @BeforeAll
    public static void startCluster() throws IOException {
        CLUSTER.start();
        final Properties props = new Properties();
        props.setProperty(GroupCoordinatorConfig.GROUP_COORDINATOR_REBALANCE_PROTOCOLS_CONFIG, "classic,consumer,streams");
How do you imagine this working on trunk? I guess we should parametrize the tests to test both the old and the new protocol?
Yes, I'll parameterize the tests. I took a shortcut to make sure everything worked.
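A rough sketch of how that parameterization could look with JUnit 5; the test class name, the "group.protocol" key, and the way the value is threaded into the streams config are assumptions about this branch:

import java.util.Properties;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

public class IQv2ProtocolParameterizationSketch {

    // Runs the same IQv2 assertions once per rebalance protocol.
    @ParameterizedTest
    @ValueSource(strings = {"classic", "streams"})
    public void shouldServeIQv2Queries(final String groupProtocol) {
        final Properties config = new Properties();
        // Hypothetical key; the branch may expose a StreamsConfig constant for it instead.
        config.put("group.protocol", groupProtocol);
        // ... start the topology with `config` and reuse the existing IQv2 assertions ...
    }
}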
@@ -2267,7 +2269,26 @@ private CoordinatorResult<StreamsGroupHeartbeatResult, CoordinatorRecord> stream
    );
}

// 3. Determine the partition metadata and any internal topics if needed.
// Build the endpoint to topic partition information
final Map<String, org.apache.kafka.coordinator.group.streams.Assignment> assignmentMap = group.targetAssignment();
I wonder if it would be better to use the "current assignment" instead of the target assignment here. The current assignment will more closely model what is currently available on the node. You can find the current assignment in StreamsGroupMember.assignedActiveTasks etc.
I also considered deriving the topic partitions from the reported taskOffsets instead of the assignment, as those are precisely the tasks that are available on the node. However, I think in the original StreamsPartitionAssignor (see populatePartitionsByHostMaps), we seem to just take the union of active and standby tasks. I discussed this briefly with Bruno, and it seems this optimization wouldn't really be worth it; just leaving this here in case you have a different opinion.
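A small sketch of that union, mirroring populatePartitionsByHostMaps; the map-of-sets shape (subtopology id -> task ids) for assignedActiveTasks / assignedStandbyTasks is an assumption about the StreamsGroupMember API:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class TaskUnionSketch {

    // Union of active and standby tasks per subtopology, which is effectively what
    // the classic assignor exposes per host for interactive queries.
    static Map<String, Set<Integer>> activeAndStandbyUnion(
            Map<String, Set<Integer>> assignedActiveTasks,
            Map<String, Set<Integer>> assignedStandbyTasks) {
        Map<String, Set<Integer>> union = new HashMap<>();
        assignedActiveTasks.forEach((subtopology, tasks) ->
            union.computeIfAbsent(subtopology, s -> new HashSet<>()).addAll(tasks));
        assignedStandbyTasks.forEach((subtopology, tasks) ->
            union.computeIfAbsent(subtopology, s -> new HashSet<>()).addAll(tasks));
        return union;
    }
}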
    List<Integer> partitions = new ArrayList<>(taskEntry.getValue());
    ConfiguredSubtopology configuredSubtopology = group.configuredTopology().subtopologies().get(subtopologyId);
    if (configuredSubtopology != null) {
        List<StreamsGroupHeartbeatResponseData.TopicPartition> topicPartitions = Stream.concat(
I think there is a corner case that needs to be handled. If you look at PartitionGrouper in the original StreamsPartitionAssignor code, there may be copartitioning groups with different numbers of partitions within the same subtopology. So while the number of tasks is always equal to the maximal number of partitions across all source topics, there may be a subset of the source topics that have a lower number of partitions. An example would be a subtopology that merges topic A with 1 partition and topic B with 2 partitions. We will get tasks 0 and 1, but we need to make sure that the topic partitions for task 0 are {A_0, B_0} and the topic partitions for task 1 are {B_1} (because topic A does not have a second partition).
To handle this corner case, it's probably enough to remove all elements of partitions that exceed the number of partitions in topic.
I'd suggest we add a little class for this logic, so that we can test these corner cases independently of GroupMetadataManager.
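To make the corner case concrete, here is a minimal sketch of such a class with plain Java stand-ins (the TopicPartitions record and the partition-count map are illustrative; in the real code the counts would come from the metadata image):

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class TaskTopicPartitionResolver {

    // Stand-in for the response's per-topic partition list.
    record TopicPartitions(String topic, List<Integer> partitions) { }

    // For each source topic of a subtopology, keep only the task partitions that
    // actually exist on that topic.
    static List<TopicPartitions> resolve(
            final List<String> sourceTopics,
            final List<Integer> taskPartitions,
            final Map<String, Integer> partitionCountByTopic) {
        final List<TopicPartitions> result = new ArrayList<>();
        for (final String topic : sourceTopics) {
            final int partitionCount = partitionCountByTopic.getOrDefault(topic, 0);
            final List<Integer> existing = new ArrayList<>();
            for (final int partition : taskPartitions) {
                // Drop partitions beyond this topic's partition count (the copartition
                // corner case: a task id can exceed a smaller topic's partition range).
                if (partition < partitionCount) {
                    existing.add(partition);
                }
            }
            if (!existing.isEmpty()) {
                result.add(new TopicPartitions(topic, existing));
            }
        }
        return result;
    }
}

With sourceTopics = [A, B], taskPartitions = [0, 1], and partition counts {A: 1, B: 2}, the result is A: [0] and B: [0, 1]; partition A_1 is dropped because topic A only has one partition, which is exactly the task-1 case described above.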
This PR adds IQv2 information to the StreamsGroupHeartbeat response. For testing, all IQv2-related integration tests have been updated (and are passing) to use the new streams protocol to exercise the newly added code. A specific integration test and unit tests are coming.
Among the tests updated:
ConsistencyVectorIntegrationTest
EosIntegrationTest
IQv2IntegrationTest
IQv2StoreIntegrationTest
IQv2VersionedStoreIntegrationTest
LagFetchIntegrationTest
OptimizedKTableIntegrationTest
PositionRestartIntegrationTest
QueryableStateIntegrationTest
StoreQueryIntegrationTest
Committer Checklist (excluded from commit message)