Improve client tools host resolution #50175

rosstimothy · 2024-12-12T22:08:49Z

When labels, fuzzy search, or predicate expressions are supplied to commands that establish a connection to a single host, tsh has historically resolved the target host client side. That works in most cases, but it can prevent hosts from being resolved correctly in some situations, e.g. when there are ambiguous hosts and tsh is unaware that the cluster routing strategy is set to ROUTE_TO_MOST_RECENT.

To improve the experience, a new ResolveSSHTarget RPC was added to Auth so that host resolution can be performed server side. The resolution works in a similar manner to, and was inspired by, GetSSHTargets. If the new RPC is not implemented because the client is newer than Auth, tsh falls back to pulling the cluster networking config and resolving any host ambiguity itself, if the routing strategy allows it.

As a result, tsh scp and tsh proxy ssh should be much more tolerant of ambiguous hosts, and should still permit access when ambiguous hosts are present for some amount of time. Previously, the only way to connect in these situations was to find the UUID of the correct target instance and try again after seeing an ambiguous host error.
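The resolution flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: the `cluster` and `host` types, the `LastSeen` field, the sentinel errors, and every function name are hypothetical stand-ins, and "most recent" is assumed to mean the host with the largest `LastSeen` value.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used only by this sketch.
var (
	errNotImplemented = errors.New("not implemented")
	errAmbiguous      = errors.New("ambiguous host could match multiple nodes")
	errNoMatch        = errors.New("no matching SSH hosts found")
)

// host is a simplified stand-in for an SSH server resource.
type host struct {
	UUID     string
	Hostname string
	LastSeen int64 // assumption: larger means seen more recently
}

// cluster is a hypothetical model of what the client can ask Auth for.
type cluster struct {
	serverResolves    bool // whether Auth implements server-side resolution
	routeToMostRecent bool // whether the routing strategy allows picking one host
	hosts             []host
}

// match returns every host whose hostname equals the requested one.
func (c cluster) match(hostname string) []host {
	var out []host
	for _, h := range c.hosts {
		if h.Hostname == hostname {
			out = append(out, h)
		}
	}
	return out
}

// pick applies the routing strategy to a set of matching hosts.
func pick(matches []host, routeToMostRecent bool) (host, error) {
	switch {
	case len(matches) == 0:
		return host{}, errNoMatch
	case len(matches) == 1:
		return matches[0], nil
	case !routeToMostRecent:
		return host{}, errAmbiguous
	}
	best := matches[0]
	for _, h := range matches[1:] {
		if h.LastSeen > best.LastSeen {
			best = h
		}
	}
	return best, nil
}

// resolveOnServer models the server-side RPC, which an older Auth lacks.
func (c cluster) resolveOnServer(hostname string) (host, error) {
	if !c.serverResolves {
		return host{}, errNotImplemented
	}
	return pick(c.match(hostname), c.routeToMostRecent)
}

// resolve prefers server-side resolution and falls back to resolving
// client side only when the server does not implement the RPC.
func resolve(c cluster, hostname string) (host, error) {
	h, err := c.resolveOnServer(hostname)
	if errors.Is(err, errNotImplemented) {
		return pick(c.match(hostname), c.routeToMostRecent)
	}
	return h, err
}

func main() {
	c := cluster{
		routeToMostRecent: true,
		hosts: []host{
			{UUID: "uuid-old", Hostname: "web", LastSeen: 100},
			{UUID: "uuid-new", Hostname: "web", LastSeen: 200},
		},
	}
	h, err := resolve(c, "web")
	fmt.Println(h.UUID, err) // prints "uuid-new <nil>"
}
```

With `routeToMostRecent` set to false, the same call returns the ambiguous host error instead, which corresponds to the situation where the user previously had to look up the UUID by hand.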

Changelog: Honor the cluster routing strategy when client-initiated host resolution via proxy templates or label matching is ambiguous.

lib/auth/auth_with_roles.go

fspmarshall · 2024-12-18T20:45:28Z

constants.go

+// ErrNodeIsAmbiguous serves as an identifying error string indicating that
+// the proxy subsystem found multiple nodes matching the specified hostname.
+var ErrNodeIsAmbiguous = &trace.NotFoundError{Message: "ambiguous host could match multiple nodes"}
+
 const (
 	// NodeIsAmbiguous serves as an identifying error string indicating that
 	// the proxy subsystem found multiple nodes matching the specified hostname.
+	// TODO(tross) DELETE IN v20.0.0
+	// Deprecated: Prefer using ErrNodeIsAmbiguous
 	NodeIsAmbiguous = "err-node-is-ambiguous"


nit: I might be missing something super obvious, but I'm not seeing where/how we're preserving the meaningfulness of ErrNodeIsAmbiguous across network boundaries/ssh. Is there some magic error handling I'm missing that is making sure that errors.Is(...) is working correctly when ErrNodeIsAmbiguous is generated during routing?

ErrNodeIsAmbiguous should be unpacked automagically after traversing network boundaries by our interceptors, which make use of trail.FromGRPC. The only thing that might not work is the wrapping I added to include NodeIsAmbiguous for backwards compatibility. I might have to actually include that in the error message.

Ah okay, I follow now. gravitational/trace has an implementation of the Is method that takes message string equality into account. Depending on entire message string equality feels a lot more brittle than depending on the presence of a single unique identifier within the message. It basically locks us into this exact error message forever. Also, string equality doesn't seem to usually be something that errors.Is cares about, and I think most people touching an error that is intended to be used with errors.Is wouldn't think of string equality as a potential compatibility issue. My preference would be to leave this the way it was.
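The concern raised in this thread can be illustrated with a small stdlib-only sketch. It does not use gravitational/trace: `notFoundError` is a hypothetical type whose `Is` method compares message strings, mimicking the behavior described above, and `overWire` is a made-up helper that simulates an error being rebuilt from its message after crossing a network boundary.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// notFoundError is a hypothetical error type whose Is method treats two
// errors as equal when their messages are identical.
type notFoundError struct{ Message string }

func (e *notFoundError) Error() string { return e.Message }

func (e *notFoundError) Is(target error) bool {
	t, ok := target.(*notFoundError)
	return ok && e.Message == t.Message
}

// Sentinel and identifier mirroring the names in the diff above.
var errNodeIsAmbiguous = &notFoundError{Message: "ambiguous host could match multiple nodes"}

const nodeIsAmbiguous = "err-node-is-ambiguous"

// overWire simulates a network hop: only the message survives and a new
// error value is constructed on the receiving side.
func overWire(err error) error {
	return &notFoundError{Message: err.Error()}
}

// matchesByMessage relies on errors.Is, and therefore on exact message equality.
func matchesByMessage(err error) bool {
	return errors.Is(err, errNodeIsAmbiguous)
}

// matchesByIdentifier only needs the identifier to appear somewhere in the message.
func matchesByIdentifier(err error) bool {
	return strings.Contains(err.Error(), nodeIsAmbiguous)
}

func main() {
	// Same message on both sides: errors.Is still matches after the hop.
	fmt.Println(matchesByMessage(overWire(errNodeIsAmbiguous)))

	// A peer that words the message differently but keeps the identifier.
	reworded := overWire(&notFoundError{Message: "host is ambiguous [" + nodeIsAmbiguous + "]"})
	fmt.Println(matchesByMessage(reworded), matchesByIdentifier(reworded))
}
```

In this sketch the message-based check stops matching as soon as either side changes the wording, while the identifier-based check keeps working, which is the compatibility trade-off discussed above.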

api/proto/teleport/legacy/client/proto/authservice.proto

lib/auth/auth_with_roles.go

espadolini · 2024-12-19T16:18:28Z

lib/client/api.go

+	})
+	switch {
+	//TODO(tross): DELETE IN v20.0.0
+	case trace.IsNotImplemented(err):


Move this case to a separate method?

I was trying to avoid this "legacy" behavior being in its own method because I don't think it should ever be called directly. Even with warnings and disclaimers added to the separate method, it could still be abused. I'd rather force all callers to use a single unified way of retrieving hosts than have each implement something slightly different that doesn't respect proxy templates or the routing strategy.

espadolini · 2024-12-19T16:19:37Z

lib/client/api.go

+		}
+	case err == nil:
+		if resp.GetServer() == nil {
+			return nil, trace.NotFound("no matching SSH hosts found")


This is a version mismatch or a bug, not a NotFound IMO.

I wasn't quite sure what would be the best error type to return here. I was trying to avoid anything that would trigger our automatic reauthentication as that is likely not going to help in this scenario.

lib/client/api.go

lib/tbot/service_ssh_multiplexer.go

espadolini · 2024-12-19T16:40:00Z

lib/tbot/ssh_proxy.go

+	switch {
+	//TODO(tross): DELETE IN v20.0.0
+	case trace.IsNotImplemented(err):
+		resources, err := client.GetAllUnifiedResources(ctx, clt, &proto.ListUnifiedResourcesRequest{


If you move this section back into its own method it'll just be a renaming of the old method.

public-teleport-github-review-bot · 2024-12-20T16:18:08Z

@rosstimothy See the table below for backport results.

Branch	Result
branch/v16	Failed
branch/v17	Failed

rosstimothy added backport/branch/v16 backport/branch/v17 labels Dec 12, 2024

rosstimothy force-pushed the tross/host_resolution branch 7 times, most recently from 8cdb983 to 84d6013 Compare December 16, 2024 18:30

rosstimothy requested review from fspmarshall and espadolini December 16, 2024 19:40

rosstimothy marked this pull request as ready for review December 16, 2024 19:42

github-actions bot added size/md tsh labels Dec 16, 2024

github-actions bot requested review from creack and hugoShaka December 16, 2024 19:43

rosstimothy changed the title ~~Improve tsh host resolution~~ Improve client tools host resolution Dec 16, 2024

rosstimothy force-pushed the tross/host_resolution branch from b85e58f to f964532 Compare December 18, 2024 13:50

fspmarshall approved these changes Dec 18, 2024

espadolini reviewed Dec 19, 2024

rosstimothy force-pushed the tross/host_resolution branch from f964532 to 05b337b Compare December 19, 2024 22:39

espadolini approved these changes Dec 20, 2024

public-teleport-github-review-bot bot removed request for creack and hugoShaka December 20, 2024 13:14

rosstimothy force-pushed the tross/host_resolution branch from 6cf0529 to 67136e4 Compare December 20, 2024 15:08

rosstimothy added this pull request to the merge queue Dec 20, 2024

Merged via the queue into master with commit c61461b Dec 20, 2024
43 checks passed

rosstimothy deleted the tross/host_resolution branch December 20, 2024 16:16
