Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HJ-87 and HJ-88: model changes #5600

Open
wants to merge 36 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
36 commits
Select commit Hold shift + click to select a range
bc0a5a5
migration and model methods only
thingscouldbeworse Dec 12, 2024
8099de7
black tests
thingscouldbeworse Dec 12, 2024
0a60fcf
data_use added to taxonomy
thingscouldbeworse Dec 12, 2024
204d957
more taxonomy
thingscouldbeworse Dec 12, 2024
3e78a4f
remove unneeded .fides files
thingscouldbeworse Dec 13, 2024
899a20a
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 13, 2024
703ace0
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 13, 2024
833874e
patch on system hidden and test
thingscouldbeworse Dec 16, 2024
0832e24
remove hidden column from stagedresource
thingscouldbeworse Dec 16, 2024
c1eafb8
stagedresource no hidden column, use diff_status == 'muted'
thingscouldbeworse Dec 17, 2024
084c2a1
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 17, 2024
c5e24e1
merge main
thingscouldbeworse Dec 18, 2024
e200e7b
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 18, 2024
df8e884
change downrev on migration after main merge
thingscouldbeworse Dec 18, 2024
e167770
add a ConnectionConfigResponse that contains the monitor configs
thingscouldbeworse Dec 18, 2024
5da33c8
optional
thingscouldbeworse Dec 18, 2024
7181a02
Revert "add a ConnectionConfigResponse that contains the monitor conf…
thingscouldbeworse Dec 18, 2024
61f38a4
PR feedback and reset system endpoint to exactly what it was before
thingscouldbeworse Dec 19, 2024
88314a8
pylint
thingscouldbeworse Dec 19, 2024
1962341
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 19, 2024
b152e93
downrev
thingscouldbeworse Dec 19, 2024
5f88545
fix tests
thingscouldbeworse Dec 19, 2024
3d636cb
erge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 19, 2024
36b9915
yet another downrev
thingscouldbeworse Dec 19, 2024
aa4689c
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 19, 2024
63564b7
downrev
thingscouldbeworse Dec 19, 2024
b6dbdd6
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 20, 2024
ef91b70
move test
thingscouldbeworse Dec 20, 2024
6e52636
try not doing a DB call and make test run first
thingscouldbeworse Dec 20, 2024
4437235
sync endpoint
thingscouldbeworse Dec 20, 2024
820c58d
static checks
thingscouldbeworse Dec 20, 2024
b00e706
rename back test file
thingscouldbeworse Dec 20, 2024
0932586
Merge branch 'main' into HJ-88-87_model-changes
thingscouldbeworse Dec 20, 2024
ff7fd8c
add back in tests
thingscouldbeworse Dec 20, 2024
408beb5
reset
thingscouldbeworse Dec 20, 2024
5e0afc1
add logic back
thingscouldbeworse Dec 20, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions .fides/db_dataset.yml
Original file line number Diff line number Diff line change
Expand Up @@ -2227,6 +2227,10 @@ dataset:
data_categories: [system]
- name: user_assigned_data_categories
data_categories: [system]
- name: hidden
data_categories: [system]
- name: data_use
data_categories: [system]
- name: fides_user_invite
fields:
- name: created_at
Expand Down
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ The types of changes are:
- New page in the Cookie House sample app to demonstrate the use of embedding the FidesJS SDK on the page [#5564](https://github.com/ethyca/fides/pull/5564)
- Added event based communication example to the Cookie House sample app [#5597](https://github.com/ethyca/fides/pull/5597)
- Added new erasure tests for BigQuery Enterprise [#5554](https://github.com/ethyca/fides/pull/5554)
- Migration to add the `hidden` and `data_use` columns to `stagedresource` table, prereqs for Data Catalog work in Fidesplus [#5600](https://github.com/ethyca/fides/pull/5600/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
- Migration to add the `hidden` and `data_use` columns to `stagedresource` table, prereqs for Data Catalog work in Fidesplus [#5600](https://github.com/ethyca/fides/pull/5600/)
- Migration to add the and `data_use` column to `stagedresource` table, prereq for Data Catalog work in Fidesplus [#5600](https://github.com/ethyca/fides/pull/5600/)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


### Fixed
- Fixing quickstart.py script [#5585](https://github.com/ethyca/fides/pull/5585)
Expand Down
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i'd probably update the filename here just to be consistent with the new functionality - it can be hard to track down migrations sometimes and relying on the filename can help with looking it up, so this could add a bit of confusion moving forward!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed in 61f38a4#diff-6ea847dd5dab7785c69f11369a07c894290e837501325a3b8eb5f003059c33ea and made sure column isn't pluralized in the comment

Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
"""add `hidden` and `data_use` columns to stagedresource

Revision ID: d9237a0c0d5a
Revises: c90d46f6d3f2
Create Date: 2024-11-21 13:18:24.085858

"""

import sqlalchemy as sa
from alembic import op

# revision identifiers, used by Alembic.
revision = "d9237a0c0d5a"
down_revision = "c90d46f6d3f2"
branch_labels = None
depends_on = None


def upgrade():
op.add_column("stagedresource", sa.Column("hidden", sa.Boolean(), nullable=False))
op.add_column("stagedresource", sa.Column("data_use", sa.String(), nullable=True))
adamsachs marked this conversation as resolved.
Show resolved Hide resolved
# ### end Alembic commands ###


def downgrade():
op.drop_column("stagedresource", "hidden")
op.drop_column("stagedresource", "data_use")
# ### end Alembic commands ###
53 changes: 4 additions & 49 deletions src/fides/api/api/v1/endpoints/system.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@
import datetime
from typing import Annotated, Dict, List, Optional, Union

from fastapi import Depends, HTTPException, Query, Response, Security
Expand All @@ -10,7 +9,6 @@
from fideslang.validation import FidesKey
from loguru import logger
from pydantic import Field
from sqlalchemy import or_
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.future import select
from sqlalchemy.orm import Session
Expand Down Expand Up @@ -391,63 +389,21 @@ async def ls( # pylint: disable=invalid-name
data_uses: Optional[List[FidesKey]] = Query(None),
data_categories: Optional[List[FidesKey]] = Query(None),
data_subjects: Optional[List[FidesKey]] = Query(None),
dnd_relevant: Optional[bool] = Query(None),
show_hidden: Optional[bool] = Query(False),
show_deleted: Optional[bool] = Query(False),
) -> List:
"""Get a list of all of the Systems.
If any parameters or filters are provided the response will be paginated and/or filtered.
Otherwise all Systems will be returned (this may be a slow operation if there are many systems,
so using the pagination parameters is recommended).
"""
if not (
size
or page
or search
or data_uses
or data_categories
or data_subjects
or dnd_relevant
or show_hidden
):
# if no advanced parameters are passed, we return a very basic list of all System resources
# to maintain backward compatibility of the original API, which backs some important client usages, e.g. the fides CLI

if not (size or page or search or data_uses or data_categories or data_subjects):
return await list_resource(System, db)

query = select(System)

pagination_params = Params(page=page or 1, size=size or 50)
# Need to join with PrivacyDeclaration in order to be able to filter
# by data use, data category, and data subject
if any([data_uses, data_categories, data_subjects]):
query = query.outerjoin(
PrivacyDeclaration, System.id == PrivacyDeclaration.system_id
)

# Fetch any system that is relevant for Detection and Discovery, ie any of the following:
# - has connection configurations (has some integration for DnD or SaaS)
# - has dataset references
if dnd_relevant:
query = query.filter(
(System.connection_configs != None) # pylint: disable=singleton-comparison
| (System.dataset_references.any())
)

# Filter out any hidden systems, unless explicilty asked for
if not show_hidden:
query = query.filter(
System.hidden == False # pylint: disable=singleton-comparison
)

# Filter out any vendor deleted systems, unless explicitly asked for
if not show_deleted:
query = query.filter(
or_(
System.vendor_deleted_date.is_(None),
System.vendor_deleted_date >= datetime.datetime.now(),
)
)
query = select(System).outerjoin(
adamsachs marked this conversation as resolved.
Show resolved Hide resolved
PrivacyDeclaration, System.id == PrivacyDeclaration.system_id
)

filter_params = FilterParams(
search=search,
Expand All @@ -464,7 +420,6 @@ async def ls( # pylint: disable=invalid-name

# Add a distinct so we only get one row per system
duplicates_removed = filtered_query.distinct(System.id)

return await async_paginate(db, duplicates_removed, pagination_params)


Expand Down
46 changes: 46 additions & 0 deletions src/fides/api/models/detection_discovery.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,12 +4,14 @@
from enum import Enum
from typing import Any, Dict, Iterable, List, Optional, Type

from loguru import logger
from sqlalchemy import ARRAY, Boolean, Column, DateTime, ForeignKey, String
from sqlalchemy.dialects.postgresql import JSONB
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.ext.mutable import MutableDict
from sqlalchemy.future import select
from sqlalchemy.orm import Session, relationship
from sqlalchemy.orm.query import Query

from fides.api.db.base_class import Base, FidesBase
from fides.api.models.connectionconfig import ConnectionConfig
Expand Down Expand Up @@ -278,6 +280,9 @@ class StagedResource(Base):
default=dict,
)

# hidden flag, used by some parts of the dataset lifecycle experience
hidden = Column(Boolean, default=False, nullable=False)
adamsachs marked this conversation as resolved.
Show resolved Hide resolved

@classmethod
def get_urn(cls, db: Session, urn: str) -> Optional[StagedResource]:
"""Utility to retrieve the staged resource with the given URN"""
Expand Down Expand Up @@ -337,3 +342,44 @@ def mark_as_addition(
)
if parent_resource:
parent_resource.add_child_diff_status(DiffStatus.ADDITION)


def fetch_staged_resources_by_type_query(
resource_type: str,
monitor_config_ids: Optional[List[str]] = None,
show_hidden: bool = False,
) -> Query[StagedResource]:
"""
Fetches staged resources by type and monitor config ID. Optionally filters out hidden resources.
"""
logger.info(
f"Fetching staged resources of type {resource_type}, show_hidden={show_hidden}, monitor_config_ids={monitor_config_ids}"
)
query = select(StagedResource).where(StagedResource.resource_type == resource_type)

if monitor_config_ids:
query = query.filter(StagedResource.monitor_config_id.in_(monitor_config_ids))
if not show_hidden:
query = query.where(
StagedResource.hidden == False # pylint: disable=singleton-comparison
)

return query


async def mark_resources_hidden(
db: AsyncSession,
urns: List[str],
hidden: bool,
) -> None:
"""
Marks the resources with the given URNs as hidden or not hidden
"""
logger.info(f"Marking {len(urns)} resources as hidden={hidden}")
resources = await StagedResource.get_urn_list_async(db, urns)
if not resources:
logger.warning("No resources found with the given URNs")
return
for resource in resources:
resource.hidden = hidden
await db.commit()
132 changes: 1 addition & 131 deletions tests/ctl/core/test_api.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
"""Integration tests for the API module."""
import json
import typing
from datetime import datetime, timedelta, timezone
from datetime import datetime, timezone
from json import loads
from typing import Dict, List, Tuple
from uuid import uuid4
Expand Down Expand Up @@ -1536,102 +1536,6 @@ def test_list_with_pagination_and_multiple_filters(

assert result_json["items"][0]["fides_key"] == tcf_system.fides_key

def test_list_with_dnd_filter(
self,
test_config,
system_with_cleanup, # one that has a connection config
system_third_party_sharing, # one that doesn't have a connection config
):
result = _api.ls(
url=test_config.cli.server_url,
headers=test_config.user.auth_header,
resource_type="system",
query_params={
"page": 1,
"size": 5,
"dnd_relevant": "true",
},
)

assert result.status_code == 200
result_json = result.json()
assert result_json["total"] == 1
assert len(result_json["items"]) == 1

# only "system_with_cleanup" has a connection config attached to it in fixtures
assert result_json["items"][0]["fides_key"] == system_with_cleanup.fides_key

def test_list_with_show_hidden(
self,
test_config,
system_hidden,
system_with_cleanup,
):

result = _api.ls(
url=test_config.cli.server_url,
headers=test_config.user.auth_header,
resource_type="system",
query_params={
"page": 1,
"size": 5,
"show_hidden": "true",
},
)

assert result.status_code == 200
result_json = result.json()
assert result_json["total"] == 2
assert len(result_json["items"]) == 2

actual_keys = [item["fides_key"] for item in result_json["items"]]
assert system_hidden.fides_key in actual_keys
assert system_with_cleanup.fides_key in actual_keys

result = _api.ls(
url=test_config.cli.server_url,
headers=test_config.user.auth_header,
resource_type="system",
query_params={
"page": 1,
"size": 5,
"show_hidden": "false",
},
)

assert result.status_code == 200
result_json = result.json()
assert result_json["total"] == 1
assert len(result_json["items"]) == 1

assert result_json["items"][0]["fides_key"] == system_with_cleanup.fides_key

def test_list_with_show_hidden_and_dnd_relevant(
self,
test_config,
system_hidden,
system_with_cleanup,
):

result = _api.ls(
url=test_config.cli.server_url,
headers=test_config.user.auth_header,
resource_type="system",
query_params={
"page": 1,
"size": 5,
"show_hidden": "true",
"dnd_relevant": "true",
},
)

assert result.status_code == 200
result_json = result.json()
assert result_json["total"] == 1
assert len(result_json["items"]) == 1

assert result_json["items"][0]["fides_key"] == system_with_cleanup.fides_key

@pytest.mark.skip("Until we re-visit filter implementation")
def test_list_with_pagination_and_multiple_filters_2(
self,
Expand Down Expand Up @@ -1665,40 +1569,6 @@ def test_list_with_pagination_and_multiple_filters_2(

assert result_json["items"][0]["fides_key"] == tcf_system.fides_key

@pytest.mark.parametrize(
"vendor_deleted_date, expected_systems_count, show_deleted",
[
(datetime.now() - timedelta(days=1), 1, True),
(datetime.now() - timedelta(days=1), 0, False),
(datetime.now() + timedelta(days=1), 1, False),
(None, 1, False),
],
)
def test_vendor_deleted_systems(
self,
db,
test_config,
system_with_cleanup,
vendor_deleted_date,
expected_systems_count,
show_deleted,
):

system_with_cleanup.vendor_deleted_date = vendor_deleted_date
db.commit()

result = _api.ls(
url=test_config.cli.server_url,
headers=test_config.user.auth_header,
resource_type="system",
query_params={"show_deleted": show_deleted, "size": 50},
)

assert result.status_code == 200
result_json = result.json()

assert len(result_json["items"]) == expected_systems_count


@pytest.mark.unit
class TestSystemUpdate:
Expand Down
Loading