Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update error handling for Apollo 4 #2483

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jeffm2001
Copy link
Collaborator

@jeffm2001 jeffm2001 commented Oct 8, 2024

Description

This PR updates our GraphQL error handling for Apollo Server 4, based on the documentation here: https://www.apollographql.com/docs/apollo-server/data/errors

There are several key advantages:

  • Better error messages: In reviewing the code, there are many places where we are throwing a GraphQLError with a message that gives important context to the user, however since version 9.1 (#16a3c3f) those messages are no longer shown, instead showing "Internal server error". I believe I've struck a middle ground that maintains security while still giving context to users. (See security note below).

  • Avoid stuck contacts: This was my main motivation for the PR. There is currently a scenario where if a contact exists in multiple active campaigns and they auto-optout from one of them, when a texter reaches that contact in the other campaign they become stuck in a loop where when they hit Send, it errors ("Something went wrong") and reloads the same contact.

  • Restore some logging: In the update to node v20, logging of GraphQL errors was removed (see src/server/index.js here. I was able to partially restore the logging. I believe there is a better solution, but I will save that for a separate logging focused PR.

  • Remove dead code: Found some ancient code that wasn't being used. For example, the check for a 402 status in AssignmentTexterContact appears to be tied to an "account balance feature that was removed sometime before v1.0

Note about security

As mentioned in the first bullet point, as of version 9.1 we began masking most errors as "Internal server error" rather than showing the error message. There were valid security reasons for this — basically that you don't want to reveal too much about internal state and risk giving attackers insight into way to exploit. In this PR, I was hoping to find a middle ground by returning the text of the message while stripping additional details like paths and stack trace.

It would be good to get a second opinion on this with an eye towards security. Do the error messages still potentially reveal too much? A good way to review this is by searching the code base for new GraphQLError to see where these errors are being thrown.

Checklist:

  • I have manually tested my changes on desktop and mobile
  • The test suite passes locally with my changes
  • If my change is a UI change, I have attached a screenshot to the description section of this pull request
  • My change is 300 lines of code or less, or has a documented reason in the description why it’s longer
  • I have made any necessary changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • My PR is labeled [WIP] if it is in progress

@engelhartrueben
Copy link
Collaborator

Seen. As you mentioned, there may be concerns about security. Getting some additional eyes on this. Thanks

sduveen-pivot

This comment was marked as off-topic.

Copy link
Collaborator

@schuyler1d schuyler1d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Support the change/upgrade in general, but I think we still need to be a bit more cautious about what errors/messages we leak.

(apologies -- reviewed from wrong account)

);
// Strip out stacktrace and other potentially sensative details.
return {
message: formattedError.message,
Copy link
Collaborator

@schuyler1d schuyler1d Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my main concern is what formattedError.message is/can be if an unexpected (e.g. internal library) error happens. E.g. Can this leak a database connection error/failure? I think some courses would make us more confident of this:

  1. Don't include the error in this (non SHOW_SERVER_ERROR mode)
  2. Create a test (and maybe necessarily some test endpoint) that throws a pure JS new Error("sensitive error") and formattedError.message does not include "sensitive error" -- unfortunately, I don't expect this behavior to be born out.

I feel like the code line might still be possible to keep as long as this isn't a new (I'm not up on new JS stuff) thing that goes beyond Apollo frameworks, though I'm a bit nervous of that -- best to whitelist a specific set of codes or only allow it if error typeof 'GraphQLError' -- i.e. something we've consciously wrapped (and you have explicit sets of codes above).

The thing we must block is the ability for a hacker to probe the app for ways to trigger a specific kind of failure or learn more info in the contexts of auth of which stage it fails in (e.g. authentication vs authorization).

[deleted and re-created comment from before]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely want to find the right balance. There's just a lot of places where giving users errors like "That invitation is no longer valid" or "Not allowed to add contacts after the campaign starts" is a lot more helpful than "Internal server error".

Another possible option, which I thought about but didn't test, would be to create our own error class that extends GraphQLError and throw that instead in the places where we want the message to be shown. Then we could check for that type in formatError.

Not 100% sure if that's possible, but I think it might be.

@schuyler1d
Copy link
Collaborator

schuyler1d commented Oct 9, 2024

Another possible option, which I thought about but didn't test, would be to create our own error class that extends GraphQLError and throw that instead in the places where we want the message to be shown. Then we could check for that type in formatError.

@jeffm2001 well what about that option of just exposing code when it IS type GraphQLError -- then the places you throw explicitly in the code w/ an error are implicitly white-listed, but other random Errors are hidden (along with the message which is uncontrollable).

Then if you want to share something on the front-end, you inspect the code. This still makes me slightly nervous because if we accidentally get code which does some test w/ an error before the authorization code (e.g. for efficiency or structure of graphql object paths) it can leak info, but if we keep them small and consider those options (I haven't yet for the errors you have above, but mutations, I think are a little safer....:eyes:)

"Cannot fake a reply to a contact that has no existing thread yet"
};
throw new GraphQLError(errorStatusAndMessage);
throw new GraphQLError("Cannot fake a reply to a contact that has no existing thread yet");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just add a code here like in the others.

src/server/api/schema.js Show resolved Hide resolved
@jeffm2001
Copy link
Collaborator Author

what about that option of just exposing code when it IS type GraphQLError

I think (though not positive) that everything passed to formatError is a GraphQLError. And if not, error.extensions.code is specific to Apollo.

Then if you want to share something on the front-end, you inspect the code.

In general, this is probably a sound idea. That's basically whats happening with the SENDERR_OPTEDOUT error that I was mainly trying to deal with. I do think it creates a challenge for developers who may be good with backend dev but not have experience with React.

@schuyler1d
Copy link
Collaborator

what about that option of just exposing code when it IS type GraphQLError

I think (though not positive) that everything passed to formatError is a GraphQLError. And if not, error.extensions.code is specific to Apollo.

Right, but instead test * error * (not formattedError) for that type?

Then if you want to share something on the front-end, you inspect the code.

In general, this is probably a sound idea. That's basically whats happening with the SENDERR_OPTEDOUT error that I was mainly trying to deal with. I do think it creates a challenge for developers who may be good with backend dev but not have experience with React.

Yeah, I also like the locality of being able to send a nice (self-documenting) message from the backend. It's just the danger of what can get leaked, but that's at least mitigated by it being within try...catch blocks of our own code-base. -- I think the type-check of error helps here -- we can send our messages but not arbitrary ones, maybe. But curious what you think of the risks after the above example popped up :-/

@jeffm2001
Copy link
Collaborator Author

But curious what you think of the risks after the above example popped up :-/

More concerned. Definitely see what you're getting at. I'll play around with some things and see if I can find a good approach.

@jeffm2001 jeffm2001 requested a review from schuyler1d October 10, 2024 17:53
Copy link
Collaborator

@schuyler1d schuyler1d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeffm2001 thanks for making SpokeError -- this all looks good now!
🎉

src/server/api/schema.js Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants