diff --git a/docs/advanced-features/tabulator.md b/docs/advanced-features/tabulator.md
index 83b5efcc6ce..6a57bf365ba 100644
--- a/docs/advanced-features/tabulator.md
+++ b/docs/advanced-features/tabulator.md
@@ -78,13 +78,15 @@ In addition to the columns defined in the schema, Tabulator will add:
 
 ### Using Athena to Access Tabulator
 
-Due to the way permissions are configured, Tabulator cannot be accessed from
-the AWS Console or Athena views by default
-(unless [unrestricted access](#unrestricted-access) is enabled).
-You must access Tabulator via the Quilt stack in order to query those tables.
-This can be done by users via the per-bucket
-"Queries" tab in the Quilt Catalog, or programmatically via `quilt3`. See
-"Usage" below for more details.
+The primary way of accessing Tabulator is using the Quilt stack to query those
+tables. This can be done by users via the per-bucket "Queries" tab in the Quilt
+Catalog, or programmatically via `quilt3`. See "Usage" below for more details.
+
+As of Quilt Platform version 1.57, admins can enable [open query](#open-query)
+(below) to allow external users to access Tabulator tables directly from the AWS
+Console, Athena views, or JDBC connectors. This is especially useful for
+customers who want to access Tabulator from external services, such as Tableau
+and Spotfire.
 
 ### Caveats
 
@@ -113,7 +115,7 @@ This can be done by users via the per-bucket
 Once the configuration is set, users can query the tables using the Athena tab
 from the Quilt Catalog. Note that because Tabulator runs with elevated
 permissions, it cannot be accessed from the AWS Console by default
-(unless [unrestricted access](#unrestricted-access) is enabled).
+(unless [open query](#open-query) is enabled).
 
 For example, to query the `ccle_tsv` table from the appropriate workgroup in the
 `quilt-tf-stable` stack, where the database (bucket name) is `udp-spec`:
@@ -151,7 +153,7 @@ page from which you must paste in the appropriate access token.
 Use `get_boto3_session()` to get a session with the same permissions as your
 Quilt Catalog user, then use the `boto3` Athena client to run queries.
 
-> If [unrestricted access](#unrestricted-access) is enabled, you can use any
+> If [open query](#open-query) is enabled, you can use any
 > AWS credentials providing access to Athena resources associated with Tabulator.
 
 Here is a complete example:
@@ -195,127 +197,37 @@ else:
     print(f'Query did not succeed. Final state: {state}')
 ```
 
-## Unrestricted Access
+## Open Query
 
 > Available since Quilt Platform version 1.57
 
-By default, Tabulator is only accessible via a session provided by the Quilt Catalog,
-and the access is scoped to the permissions of the Catalog user associated with
-that session. However, an admin can enable **unrestricted access** to Tabulator,
-deferring all access control to AWS. The underlying data in S3 is accessed using
-the Tabulator's dedicated "unrestricted" role, which has read-only access to all
-the S3 buckets attached to the given stack. This allows querying the data directly
-from the AWS Console or Athena views, given the caller has the necessary permissions
-to access Athena resources associated with Tabulator.
-
-![Tabulator Settings](../imgs/admin-tabulator-settings.png)
+By default, Tabulator is only accessible via a session provided by the Quilt
+Catalog, and the access is scoped to the permissions of the Catalog user
+associated with that session. However, admins can choose to enable **open
+query** to Tabulator tables, deferring all access control to AWS, thus enabling
+access from external services. This allows querying Tabulator from the AWS
+Console, Athena views or JDBC connectors -- as long as the caller has been
+granted the necessary permissions to access Athena resources associated with
+Tabulator.
 
-### Permissions & Configuration
+### 1. Enable Open Query
 
-In order to access Tabulator in unrestricted mode, the caller must:
+To enable open query, an admin must set the `open_query` field to `true` in
+Tabulator configuration. This can be done via the Admin UI or the
+`quilt3.admin.tabulator` API.
 
-1. Provide a workgroup with output location set and compatible with that of the
-   Tabulator (`s3://${UserAthenaResultsBucket}/athena-results/non-managed-roles/`).
+![Tabulator Settings](../imgs/admin-tabulator-settings.png)
 
-2. Have the following permissions:
+### 2. Configure Permissions
 
-   - Athena query execution on the designated workgroup
-   - Access to the Tabulator data catalog
-   - Invoking the Tabulator Lambda function
-   - Read access to the Tabulator bucket for spill files
-   - Read/write access to the Athena results bucket
+In order to access Tabulator in open query mode, the caller must use a special
+workgroup, and have permissions to use that workgroup and access tabulator
+resources. Both of these are created by the Quilt stack, and are available in
+the "Resources" tab.
 
-Here is an example CloudFormation template that creates the necessary resources:
+1. Find the ARN for the Tabulator Open Query Policy, then copy that into the
+   relevant IAM role.
+2. Find the name of the Tabulator Open Query Workgroup, and configure your
+   Athena client or connector to use that workgroup.
 
-```yaml
-AWSTemplateFormatVersion: 2010-09-09
-Description: "Resources for accessing Tabulator in unrestricted mode"
-
-Parameters:
-  UserAthenaResultsBucket:
-    Type: String
-    Description: "UserAthenaResultsBucket from the Quilt stack hosting the Tabulator"
-  TabulatorBucket:
-    Type: String
-    Description: "TabulatorBucket from the Quilt stack hosting the Tabulator"
-  TabulatorDataCatalogArn:
-    Type: String
-    Description: |
-      ARN of the TabulatorDataCatalog from the Quilt stack hosting the Tabulator
-  TabulatorLambdaArn:
-    Type: String
-    Description: "ARN of the TabulatorLambda from the Quilt stack hosting the Tabulator"
-
-Resources:
-  AthenaWorkGroup:
-    Type: AWS::Athena::WorkGroup
-    Properties:
-      Name: "TabulatorUnrestrictedAccessDogfood"
-      Description: "Workgroup for testing Tabulator with unrestricted access"
-      WorkGroupConfiguration:
-        EnforceWorkGroupConfiguration: true
-        ResultConfiguration:
-          OutputLocation: !Sub "s3://${UserAthenaResultsBucket}/athena-results/non-managed-roles/"
-  TabulatorAccessRole:
-    Type: AWS::IAM::Role
-    Properties:
-      AssumeRolePolicyDocument:
-        Version: 2012-10-17
-        Statement:
-          - Effect: Allow
-            Principal:
-              AWS: "*"
-            Action: sts:AssumeRole
-      Policies:
-        - PolicyName: TabulatorAccess
-          PolicyDocument:
-            Version: 2012-10-17
-            Statement:
-              - Effect: Allow
-                Action:
-                  - athena:BatchGetNamedQuery
-                  - athena:BatchGetQueryExecution
-                  - athena:GetNamedQuery
-                  - athena:GetQueryExecution
-                  - athena:GetQueryResults
-                  - athena:GetWorkGroup
-                  - athena:StartQueryExecution
-                  - athena:StopQueryExecution
-                  - athena:ListNamedQueries
-                  - athena:ListQueryExecutions
-                Resource: !Sub "arn:${AWS::Partition}:athena:${AWS::Region}:${AWS::AccountId}:workgroup/${AthenaWorkGroup}"
-              - Effect: Allow
-                Action:
-                  - athena:ListWorkGroups
-                  - athena:ListDataCatalogs
-                  - athena:ListDatabases
-                Resource: "*"
-              - Effect: Allow
-                Action: athena:GetDataCatalog
-                Resource: !Ref TabulatorDataCatalogArn
-              - Effect: Allow
-                Action: lambda:InvokeFunction
-                Resource: !Ref TabulatorLambdaArn
-              - Effect: Allow
-                Action:
-                  - s3:GetBucketLocation
-                  - s3:GetObject
-                  - s3:PutObject
-                  - s3:AbortMultipartUpload
-                  - s3:ListMultipartUploadParts
-                Resource:
-                  - !Sub "arn:aws:s3:::${UserAthenaResultsBucket}"
-                  - !Sub "arn:aws:s3:::${UserAthenaResultsBucket}/athena-results/non-managed-roles/*"
-              - Effect: Allow
-                Action:
-                  - s3:GetObject
-                  - s3:ListBucket
-                Resource:
-                  - !Sub "arn:aws:s3:::${TabulatorBucket}"
-                  - !Sub "arn:aws:s3:::${TabulatorBucket}/spill/unrestricted/*"
-
-Outputs:
-  RoleArn:
-    Description: "ARN of the created IAM role"
-    Value: !GetAtt TabulatorAccessRole.Arn
-```
+![Tabulator Resources](../imgs/admin-tabulator-resources.png)
diff --git a/docs/imgs/admin-tabulator-resources.png b/docs/imgs/admin-tabulator-resources.png
new file mode 100644
index 00000000000..14bac31d4ae
Binary files /dev/null and b/docs/imgs/admin-tabulator-resources.png differ
diff --git a/docs/imgs/admin-tabulator-settings.png b/docs/imgs/admin-tabulator-settings.png
index 01b571b57e0..262002a6ab1 100644
Binary files a/docs/imgs/admin-tabulator-settings.png and b/docs/imgs/admin-tabulator-settings.png differ