-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
deploy: support for read affinity options per cluster #4165
deploy: support for read affinity options per cluster #4165
Conversation
internal/rbd/nodeserver.go
Outdated
log.FatalLogMsg(err.Error()) | ||
} | ||
|
||
crushLocationMap, err := util.GetCrushLocationMap(crushLocationLabels, ns.Driver.NodeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function call k8s api, which is an expensive operation, to get node labels.
Performance will severely be impacted if we perform such a call on each nodeStage request.
Please fetch the node labels when the driver starts and store it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fetch the node labels when the driver starts and store it.
- Is this a problem for a running Ceph-CSI setup when nodes are added later on?
- What happens if the Crush location map is modified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fetch the node labels when the driver starts and store it.
- Is this a problem for a running Ceph-CSI setup when nodes are added later on?
No, We just use this in daemonset nodeplugin pod.
So, when a new node is added, a daemonset pod starts on it and we'll pick up the labels that time.
There is a requirement currently that when node labels are updated, they'll need to restart nodepugin pods andthe mounted volumes for the changed topology to take affect.
- What happens if the Crush location map is modified?
If you meant Crush Location map modified by means of node topology label update, it is the same as above.
If you meant changes were done to the configmap to update labels, they'll need to restart the nodeplugin pod too.
I expect these changes to be very rare and possibly untouched after initial creation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea how often locations/labels are added to a crush map 🤷♂️ If it is clearly documented that changing labels of a node requires restarting, then thats fine.
3cc1578
to
377ff13
Compare
/test ci/centos/mini-e2e/k8s-1.27 |
aa9e7d3
to
8b601e7
Compare
@Rakshith-R @Madhu-1 |
internal/rbd/nodeserver.go
Outdated
@@ -45,8 +45,9 @@ type NodeServer struct { | |||
// A map storing all volumes with ongoing operations so that additional operations | |||
// for that same volume (as defined by VolumeID) return an Aborted error | |||
VolumeLocks *util.VolumeLocks | |||
// readAffinityMapOptions contains map options to enable read affinity. | |||
readAffinityMapOptions string | |||
NodeLabels map[string]string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please document this new map with a comment
internal/rbd/nodeserver.go
Outdated
@@ -280,14 +283,14 @@ func (ns *NodeServer) populateRbdVol( | |||
|
|||
// appendReadAffinityMapOptions appends readAffinityMapOptions to mapOptions | |||
// if mounter is rbdDefaultMounter and readAffinityMapOptions is not empty. | |||
func (ns NodeServer) appendReadAffinityMapOptions(rv *rbdVolume) { | |||
func appendReadAffinityMapOptions(rv *rbdVolume, readAffinityMapOptions string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can now be a function of the rbdVolume
struct:
func (rv *rbdVolume) appendReadAffinityMapOptions(readAffinityMapOptions string)
which would be cleaner
internal/util/crushlocation.go
Outdated
@@ -26,16 +26,11 @@ import ( | |||
// the crush location labels and their values from the CO system. | |||
// Expects crushLocationLabels in arg to be in the format "[prefix/]<name>,[prefix/]<name>,...",. | |||
// Returns map of crush location types with its array of associated values. | |||
func GetCrushLocationMap(crushLocationLabels, nodeName string) (map[string]string, error) { | |||
func GetCrushLocationMap(crushLocationLabels string, nodeLabels map[string]string) (map[string]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment is not correct anymore. The nodeLabels are not retrieved from the CO system within this function.
internal/util/topology.go
Outdated
@@ -34,7 +34,7 @@ const ( | |||
labelSeparator string = "," | |||
) | |||
|
|||
func k8sGetNodeLabels(nodeName string) (map[string]string, error) { | |||
func K8sGetNodeLabels(nodeName string) (map[string]string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you can move this function to a file under internal/util/k8s/
?
internal/rbd/nodeserver.go
Outdated
@@ -1396,9 +1399,46 @@ func getDeviceSize(ctx context.Context, devicePath string) (uint64, error) { | |||
return size, nil | |||
} | |||
|
|||
func (ns *NodeServer) SetReadAffinityMapOptions(crushLocationMap map[string]string) { | |||
func (ns *NodeServer) SetcmdReadAffinityMapOptions(crushLocationMap map[string]string) { | |||
ns.cmdReadAffinityMapOptions = constructReadAffinityMapOption(crushLocationMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a good reason to keep constructReadAffinityMapOption()
as a separate function? Why not make it
func (ns *NodeServer) constructReadAffinityMapOption(crushLocationMap map[string]string)
or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly, we can make cmdReadAffinityMapOptions
a public field. This would allow us to directly use it as follows:
ns.cmdReadAffinityMapOptions = ns.constructReadAffinityMapOption(crushLocationMap)
This change would eliminate the need for the SetcmdReadAffinityMapOptions()
method.
3e7df1b
to
52b3723
Compare
/test ci/centos/mini-e2e/k8s-1.27 |
internal/rbd/driver/driver.go
Outdated
nodeLabels, err = k8s.GetNodeLabels(conf.NodeID) | ||
if err != nil { | ||
log.FatalLogMsg(err.Error()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code will cause problems for non-kubernetes CO. do this check only if the labels are passed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will move this under
if conf.EnableReadAffinity {
...
}
and will make --enable-ready-affinity
command line flag to be mandatory for enabling read affinity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was thinking do we really need to have this flag, if the labels are set assume that you will enable read-affinity like we do currently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code will cause problems for non-kubernetes CO. do this check only if the labels are passed?
@Madhu-1 we can have a new --container-orchestrator
flag (default - kubernetes). So we will be fetching node labels only when the flag is set to kubernetes?
Or this way
and will make --enable-ready-affinity command line flag to be mandatory for enabling read affinity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would vote for new --container-orchestrator
which might be useful to decide when to make kube API calls. @nixpanic @Rakshith-R thought?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i was thinking do we really need to have this flag, if the labels are set assume that you will enable read-affinity like we do currently?
labels maybe set in the configmap instead.
Let's have the read-affinity flag set in cmdline as a pre-requisite.
This setting can be overridden by individual cluster id in configmap.
i would vote for new
--container-orchestrator
which might be useful to decide when to make kube API calls. @nixpanic @Rakshith-R thought?
I think this entails changes at a lot of places and is out of scope for this pr.
and will make --enable-ready-affinity command line flag to be mandatory for enabling read affinity.
This option seems sufficient for me.
internal/rbd/nodeserver.go
Outdated
func (ns *NodeServer) getReadAffinityMapOptionFromConfigMap(clusterID string) string { | ||
crushLocationLabels, err := util.GetReadAffinityOptions(util.CsiConfigFile, clusterID) | ||
if err != nil { | ||
log.FatalLogMsg(err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it should not be Fatal you need to return the error message from the function for all cases
9c2ed9c
to
416e1a0
Compare
@@ -87,8 +87,7 @@ spec: | |||
{{- if .Values.nodeplugin.profiling.enabled }} | |||
- "--enableprofiling={{ .Values.nodeplugin.profiling.enabled }}" | |||
{{- end }} | |||
- "--enable-read-affinity={{ and .Values.readAffinity .Values.readAffinity.enabled }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any specific reason to remove enable-read-affinity
flag?
charts/ceph-csi-rbd/values.yaml
Outdated
@@ -282,7 +287,6 @@ topology: | |||
# readAffinity: | |||
# Enable read affinity for RBD volumes. Recommended to | |||
# set to true if running kernel 5.8 or newer. | |||
# enabled: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets not remove it, it can cause breaking change for existing clusters
cmd/cephcsi.go
Outdated
flag.StringVar( | ||
&conf.CrushLocationLabels, | ||
"crush-location-labels", | ||
"", | ||
"list of Kubernetes node labels, that determines the"+ | ||
" CRUSH location the node belongs to, separated by ','") | ||
flag.StringVar(&conf.ContainerOrchestrator, "container-orchestrator", "kubernetes", "container orchestrator") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my suggestion was to add this extra flag but not to remove the EnableReadAffinity
existing flag :)
docs/deploy-rbd.md
Outdated
@@ -224,6 +223,12 @@ If enabled, this option will be added to all RBD volumes mapped by Ceph CSI. | |||
Well known labels can be found | |||
[here](https://kubernetes.io/docs/reference/labels-annotations-taints/). | |||
|
|||
Read affinity can be configured for individual clusters within the | |||
`ceph-csi-config` ConfigMap. This allows configuring the crush location labels | |||
for each cluster separately. The crush location labels specified in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for each cluster separately
to for each ceph cluster separately
internal/rbd/driver/driver.go
Outdated
@@ -125,8 +126,13 @@ func (r *Driver) Run(conf *util.Config) { | |||
}) | |||
} | |||
|
|||
if conf.EnableReadAffinity { | |||
crushLocationMap, err = util.GetCrushLocationMap(conf.CrushLocationLabels, conf.NodeID) | |||
nodeLabels, err = k8s.GetNodeLabels(conf.NodeID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call this function only if ContainerOrchestrator
is kubernetes
internal/rbd/nodeserver.go
Outdated
// ConstructReadAffinityMapOption constructs a read affinity map option based on the provided crushLocationMap. | ||
// It appends crush location labels in the format | ||
// "read_from_replica=localize,crush_location=label1:value1|label2:value2|...". | ||
func ConstructReadAffinityMapOption(crushLocationMap map[string]string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this function can be reused move this to util as we need this for cephfs as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I thought to move this to util during CephFS work.
internal/rbd/nodeserver.go
Outdated
// getReadAffinityMapOptions retrieves the readAffinityMapOptions from the CSI config file if it exists. | ||
// If not, it falls back to returning the `CLIReadAffinityMapOptions` from the command line. | ||
// If neither of these options is available, it returns an empty string. | ||
func (ns *NodeServer) getReadAffinityMapOptions(clusterID string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to until as this can be reused for cephfs asl well?
internal/util/k8s/node.go
Outdated
@@ -0,0 +1,39 @@ | |||
/* | |||
Copyright 2020 The CephCSI Authors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2023?
c9b845d
to
ad6465d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible add a test case where crush rules are set in configmap as well.
docs/deploy-rbd.md
Outdated
@@ -215,7 +214,7 @@ for more details. | |||
|
|||
This can be enabled by adding labels to Kubernetes nodes like | |||
`"topology.io/region=east"` and `"topology.io/zone=east-zone1"` and | |||
passing command line arguments `"--enable-read-affinity=true"` and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to revert this back?
internal/util/k8s/client.go
Outdated
@@ -46,5 +52,9 @@ func NewK8sClient() (*kubernetes.Clientset, error) { | |||
return nil, fmt.Errorf("failed to create client: %w", err) | |||
} | |||
|
|||
if kubeclient == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is not required here?
internal/util/read_affinity.go
Outdated
|
||
crushLocationMap, err := GetCrushLocationMap(configCrushLocationLabels, nodeLabels) | ||
if err != nil { | ||
log.FatalLogMsg(err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log as error and a return err is missing here?
04ff893
to
dd9d0eb
Compare
/test ci/centos/mini-e2e-helm/k8s-1.27 |
/test ci/centos/mini-e2e-helm/k8s-1.28 |
/test ci/centos/mini-e2e/k8s-1.27 |
/test ci/centos/mini-e2e/k8s-1.28 |
@Mergifyio rebase |
Signed-off-by: Praveen M <[email protected]>
This commit adds util functions related to read affinity and unit testcases for the same. Signed-off-by: Praveen M <[email protected]>
Signed-off-by: Praveen M <[email protected]>
Implemented the capability to include read affinity options for individual clusters within the ceph-csi-config ConfigMap. This allows users to configure the crush location for each cluster separately. The read affinity options specified in the ConfigMap will supersede those provided via command line arguments. Signed-off-by: Praveen M <[email protected]>
✅ Branch has been successfully rebased |
6c38ced
to
32bf571
Compare
@Mergifyio queue |
✅ The pull request has been merged automaticallyThe pull request has been merged automatically at c4e373c |
/test ci/centos/k8s-e2e-external-storage/1.27 |
/test ci/centos/upgrade-tests-cephfs |
/test ci/centos/mini-e2e-helm/k8s-1.27 |
/test ci/centos/upgrade-tests-rbd |
/test ci/centos/mini-e2e/k8s-1.27 |
/test ci/centos/k8s-e2e-external-storage/1.26 |
/test ci/centos/k8s-e2e-external-storage/1.28 |
/test ci/centos/mini-e2e-helm/k8s-1.26 |
/test ci/centos/mini-e2e-helm/k8s-1.28 |
/test ci/centos/mini-e2e/k8s-1.26 |
/test ci/centos/mini-e2e/k8s-1.28 |
Describe what this PR does
deploy: support for read affinity options per cluster
Implemented the capability to include read affinity options
for individual clusters within the ceph-csi-config ConfigMap.
This allows users to configure the crush location for each
cluster separately. The read affinity options specified in
the ConfigMap will supersede those provided via command
line arguments.
util: moved GetNodeLabels() under internal/util/k8s
util: added read affinity related functions and unit testcases
util: added RunsOnKubernetes() function
Fixes: #4161
Show available bot commands
These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:
/retest ci/centos/<job-name>
: retest the<job-name>
after unrelatedfailure (please report the failure too!)