Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query by primary key support #4

Merged
merged 5 commits into from
Nov 22, 2023
Merged

Query by primary key support #4

merged 5 commits into from
Nov 22, 2023

Conversation

daniel-chambers
Copy link
Collaborator

@daniel-chambers daniel-chambers commented Nov 22, 2023

Description

This PR adds the ability to query Dynamo DB and return rows querying by their primary keys. It adds the query implementation of the <table name>_by_keys function that the schema declares (added in #2).

This means you can now perform the following GraphQL queries from v3-engine:

query {
  forumByKeys(
    keys: [
      {ForumName: "Amazon DynamoDB"},
      {ForumName: "Amazon Athena"}
    ],
    consistent_read: false
  ) {
    ForumName
    Category
    Messages
    Threads
    Views
  }
}

This PR uses some hacks (marked with TODOs) to work around issues in the TypeScript SDK. These issues have since been resolved, but I'd rather perform the SDK upgrade in a subsequent PR.

Solution Design

Enhanced schema information

The query execution pipeline needed a bit more information about the function definition, so the ByKeysFunction type in schema-ndc.ts was enhanced with extra information about the primary key schema and the table row type. This information is already computed during schema generation, and capturing it on the function definition ensures we don't need to recompute it at query time.

Query

The new query code lives in query.ts. The most important function is performByKeysFunction. This function first builds a projection expression from the query (queryFieldsToProjectionExpression), ensuring we only pull back attributes we are interested in, and then creates a BatchGetItemCommand to query for the rows by key. This is complicated by the need to alias attribute names (see AttributeNameAliaser) to avoid projection expression syntax errors caused by weird attribute names that conflict with the expression syntax.

We may need to issue multiple BatchGetItemCommand queries because Dynamo only accepts 100 keys at a time, and it may not return all the rows for keys you requested (UnprocessedKeys), requiring you to try again. We maintain a queue of keys (unprocessedKeys) we haven't got results for and then we repeatedly query dynamo, draining the queue as we receive rows and putting any unprocessedKeysFromThisResponse back onto the queue for retry.

As we receive rows mkNdcResponseRowFromDynamoResponseRow maps the Dynamo response row format into our desired NDC row response format.

Other Changes

  • noUncheckedIndexedAccess has been turned on in the TypeScript compiler to tighten up map lookups from objects. Various places that needed explicit error handling if the expected key wasn't in the object map have been updated.
  • A bug in the schema generation where index names would get mangled has been fixed.

@daniel-chambers daniel-chambers changed the title By primary key query support Query by primary key support Nov 22, 2023
@daniel-chambers daniel-chambers self-assigned this Nov 22, 2023
Copy link

@dmoverton dmoverton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@daniel-chambers daniel-chambers merged commit e3534b7 into main Nov 22, 2023
1 check passed
@daniel-chambers daniel-chambers deleted the daniel/by-pk-query branch November 22, 2023 06:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants