Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: add checkpoint and max slots config policy enforcements in PATCH experiment #10125

Merged
merged 4 commits into from
Oct 25, 2024

Conversation

amandavialva01
Copy link
Contributor

@amandavialva01 amandavialva01 commented Oct 24, 2024

Ticket

CM-583

Description

add checkpoint and max slots config policy enforcements in PATCH experiment

Test Plan

CI passes.

Checklist

  • Changes have been manually QA'd
  • New features have been approved by the corresponding PM
  • User-facing API changes have the "User-facing API Change" label
  • Release notes have been added as a separate file under docs/release-notes/
    See Release Note for details.
  • Licenses have been included for new code which was copied and/or modified from any external code

@amandavialva01 amandavialva01 requested a review from a team as a code owner October 24, 2024 21:54
@cla-bot cla-bot bot added the cla-signed label Oct 24, 2024
Copy link

codecov bot commented Oct 24, 2024

Codecov Report

Attention: Patch coverage is 68.88889% with 28 lines in your changes missing coverage. Please review.

Project coverage is 54.56%. Comparing base (962810a) to head (74f65d1).
Report is 11 commits behind head on main.

Files with missing lines Patch % Lines
master/internal/experiment.go 40.00% 12 Missing ⚠️
master/internal/api_experiment.go 42.10% 11 Missing ⚠️
...ternal/configpolicy/postgres_task_config_policy.go 90.32% 3 Missing ⚠️
master/internal/configpolicy/utils.go 90.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #10125      +/-   ##
==========================================
- Coverage   58.87%   54.56%   -4.32%     
==========================================
  Files         757     1267     +510     
  Lines      106045   159610   +53565     
  Branches     3637     3636       -1     
==========================================
+ Hits        62435    87084   +24649     
- Misses      43477    72393   +28916     
  Partials      133      133              
Flag Coverage Δ
backend 45.94% <68.88%> (+2.13%) ⬆️
harness 72.56% <ø> (ø)
web 54.02% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
master/internal/configpolicy/utils.go 74.05% <90.00%> (ø)
...ternal/configpolicy/postgres_task_config_policy.go 91.13% <90.32%> (ø)
master/internal/api_experiment.go 56.91% <42.10%> (ø)
master/internal/experiment.go 33.78% <40.00%> (ø)

... and 506 files with indirect coverage changes

Copy link

netlify bot commented Oct 24, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 74f65d1
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/671bfc2fe8efe30008aff3dd

Copy link
Contributor

@kkunapuli kkunapuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions - PR looks great in general.

}
ctx := context.TODO()
return workspace.WorkspaceByName(ctx, wkspName)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice refactor

require.NoError(t, err)
require.Equal(t, *wkspName, w.Name)
})
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️ love that you unit tested getWorkspaceByConfig!

@@ -113,3 +114,55 @@ func DeleteConfigPolicies(ctx context.Context,
}
return nil
}

// GetEnforcedConfig gets the fields of the global invariant config or constraint if specified, and
// the workspace invariant config or constraint otherwise. If neither is specified, returns nil.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice function description. I recommend describing what the function does, and then describing the priority order. So something like

// GetEnforcedConfig fetches the field from an invariant_config or constraints policyType, 
in order of precedence. Global scope has highest precedence, then workspace. 
Returns nil if none is found.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the field format to be unintuitive. e.g. "'resources' -> 'max_slots'"

Could you add an example to the function description? Or at least as a comment within the function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the func description change! Changed this and added an example for the field arg

@@ -113,3 +114,55 @@ func DeleteConfigPolicies(ctx context.Context,
}
return nil
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetEnforcedConfig is fine as a name. I think something like GetConfigPolicyField is more descriptive. When I first read the function name, I thought it was only for invariant configs. It also wasn't clear that it was fetching a single field, rather than a whole or partial config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point! Changed this

if policyType != "invariant_config" && policyType != "constraints" {
return nil, fmt.Errorf("invalid policy type :%s", policyType)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also validate workloadType; I think all our other postgres functions do. There's no need to add a test case for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like we actually don't validate workloadType in any of the postgres functions! At first this seemed odd, but then I remembered we made workload_type an enum!
I can still perform the validation if you'd like, but this function would be unique in that regard

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, don't add it. I was mistaken!

*msg.MaxSlots > *maxSlotsLimit {
log.Warnf("unable to set max slots")
return
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this code should be in a separate function, preferably in configpolicies package. The main reason is that I prefer to keep "config policy logic" in configpolicies. In other words, an experiment API has no need to know how config policies work, that there's invariant_configs and constraints, precedence, etc. Ideally we would be able to change how config policies work by only changing code in configpolicies files.

The other reason is that the logic can be simplified if it's in its own function. The function could return at line 431 and then have no need to check enforcedMaxSlots == nil on line 442.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this idea, and makes sense! moved this work into its own func, great idea!

// enforced max slots for the workspace if that's set as an invariant config, and returns the
// requested max slots otherwise. Returns an error when max slots is not set as an invariant config
// and the requested max slots violates the constriant.
func CanSetMaxSlots(slotsReq *int, wkspID int) (bool, *int, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function should return bool, int, error or *int, error. In the first return group, the bool lets the caller know if int was set or not. In the second group, a valid int is inferred from whether or not the pointer is nil.

I would simplify it further to just int, error. If there's an error, max_slots cannot be updated. If error is nil, then set max_slots to the returned int value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point, changed return type to *int, error!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh wait ok i see your point about int, error!
hmm, yes i see! ok ill change to this

Copy link
Contributor Author

@amandavialva01 amandavialva01 Oct 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gah ok actually, I think it's easier to keep this as is since the func takes in an optional *int (so it can return that same optional *int).
So when the caller gets the func output, it can just replace its input w the func output.
Are you cool w leaving it *int, error instead of int, error?

@amandavialva01 amandavialva01 enabled auto-merge (squash) October 25, 2024 20:30
@amandavialva01 amandavialva01 merged commit 233e095 into main Oct 25, 2024
80 of 94 checks passed
@amandavialva01 amandavialva01 deleted the amanda/CM583 branch October 25, 2024 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants