Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial commit for audit logs #160

Closed
wants to merge 5 commits into from

Conversation

sinisaos
Copy link
Member

Related to #159

@codecov-commenter
Copy link

codecov-commenter commented Jun 26, 2022

Codecov Report

Merging #160 (96173db) into master (96a8a26) will increase coverage by 0.43%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #160      +/-   ##
==========================================
+ Coverage   88.38%   88.82%   +0.43%     
==========================================
  Files          29       31       +2     
  Lines        1576     1637      +61     
==========================================
+ Hits         1393     1454      +61     
  Misses        183      183              
Impacted Files Coverage Δ
piccolo_api/audit_logs/commands.py 100.00% <100.00%> (ø)
piccolo_api/audit_logs/tables.py 100.00% <100.00%> (ø)
piccolo_api/crud/endpoints.py 95.43% <100.00%> (+0.25%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 96a8a26...96173db. Read the comment docs.

action_time = Timestamp()
action_type = Varchar(choices=ActionType)
action_user = Varchar()
change_message = Text()
Copy link
Member

@dantownsend dantownsend Jun 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one really nice thing would be recording which changes were made. There could be a JSON field which would have something like {'name': 'bob'} if the name field was changed to 'bob'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea. I will try to implement it.

Comment on lines 1071 to 1073
await AuditLog.post_patch_action(
cls, row_id=row_id, user_id=request.user.user_id
)
Copy link
Member

@dantownsend dantownsend Jun 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should all be optional. So PiccoloCRUD should have a new argument which is audit_log_table: t.Optional[t.Type[AuditLog]] = None. And then this line would be:

if self.audit_log_table:
    await self.audit_log_table.post_patch_action(
        cls, row_id=row_id, user_id=request.user.user_id
     )

Otherwise when we release this it will be a breaking change - anyone without the AuditLog table installed will have their code break. Also, it allows people to disable this functionality if they don't want to track changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I will change that.

Comment on lines 38 to 39
change_message=f"User {cls.get_user_username(user_id)} "
f"create new row in {table._meta.tablename.title()} table",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if this said which row was created.

The ``primary key`` of authenticated user.
"""
result = cls(
action_time=datetime.now(),
Copy link
Member

@dantownsend dantownsend Jun 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be omitted - it should default to now if you don't pass in a value.

await result.save().run()

@classmethod
async def post_patch_action(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider naming these methods slightly differently. I know what you mean by them, and they are correct. But having post_patch could be confusing - someone might misread it as being about posts and not patches.

Instead of post_, how about just record_? So it would be record_patch_action?

We will never have a pre_ version, so no need to distinguish with post_.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. I will change that.

@dantownsend
Copy link
Member

@sinisaos I think this is really cool - thanks. Left some comments.

@@ -218,6 +219,7 @@ def __init__(
}
else:
self._hook_map = None # type: ignore
self.audit_log_table = audit_log_table or AuditLog
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to the or here - the user should be allowed to set it to None.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But without that we cannot add records to the AuditLog table (or I doing something wrong). If the user does not want to use the audit_logs app, just does not register it in APP_REGISTRY and can use PiccoloCRUD without audit_logs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if people don't register the audit_logs app, PiccoloCRUD will still be trying to make database queries to the AuditLog table, so we need to be able to set it to None to prevent this. I think when people use PiccoloCRUD standalone within their apps they probably won't need the audit logs - it's mostly for Piccolo Admin.

In create_admin we could add an enable_auditing argument - any tables listed will have auditing switched on in PiccoloCRUD.

def create_admin(..., audit_log_table: AuditLog = AuditLog, enable_auditing: t.List[t.Type[Table]] = [])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Later today I will try to do that in Piccolo Admin.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right about this. We need to add a few lines of code to Piccolo Admin and everything works great and we can choose which table we want audit logs for in future enable_auditing argument in create_admin.

result = cls(
action_type=cls.ActionType.DELETING,
action_user=cls.get_user_username(user_id),
change_message=f"User {cls.get_user_username(user_id)} delete row "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to get rid of the change message all together.

One option is just storing everything in JSON.

For delete, the json can be:

// Maps to a list, so we can also use it for a bulk delete endpoint.
{
    "deleted_ids": [1, 2]
}

Likewise, for the save, it could be:

// Maps to a list, so we can also use it for a bulk creation endpoint.
{
    "created_ids": [1, 2]
}

Or we have a field called effected_rows which is an integer or array of integers.

It's just about making it easy to query - imagine you wanted to know who deleted row 42. You would have to something like:

await AuditLog.select().where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.change_message.like('%42%')
)

It's quite imprecise, ad would be better to just do:

await AuditLog.select(AuditLog.action_user).where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.effected_rows.any(42)
)

Copy link
Member

@dantownsend dantownsend Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, table name should probably be a column too, to make it easier to filter out the changes on a certain table. I'd be tempted to add table_name and table_class - again, just to make it easy to filter. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to get rid of the change message all together.

One option is just storing everything in JSON.

For delete, the json can be:

// Maps to a list, so we can also use it for a bulk delete endpoint.
{
    "deleted_ids": [1, 2]
}

Likewise, for the save, it could be:

// Maps to a list, so we can also use it for a bulk creation endpoint.
{
    "created_ids": [1, 2]
}

But all actions are for endpoints that work with a single record. We don't have the bulk_delete option yet (PR here) and we don't have bulk_creation at all. Audit logs would be primarily used in Piccolo Admin and there we delete multiple records by looping through the delete_single endpoint.

Or we have a field called effected_rows which is an integer or array of integers.

That would be possible if we have only primary_key type Serial. But we also have primary_key type string and UUID, and we can't use Integer column for that (which will be great for precise filtering). We need to use Varchar and query would be very similar for effected_row and change_message using like.

It's just about making it easy to query - imagine you wanted to know who deleted row 42. You would have to something like:

await AuditLog.select().where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.change_message.like('%42%')
)

It's quite imprecise, ad would be better to just do:

await AuditLog.select(AuditLog.action_user).where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.effected_rows.any(42)
)

Here is screenshot with new columns.

changed_columns

I don't think we should remove the change_message column because it nicely describes the action the user has taken. Filtering is maybe imprecise, but we can't use the Integer column for effected_row.

Similarly, table name should probably be a column too, to make it easier to filter out the changes on a certain table. I'd be tempted to add table_name and table_class - again, just to make it easy to filter. What do you think?

I added table_name, but I don't understand what the table_class column would be for and what it should go into that column?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points.

But all actions are for endpoints that work with a single record. We don't have the bulk_delete option yet (#145) and we don't have bulk_creation at all. Audit logs would be primarily used in Piccolo Admin and there we delete multiple records by looping through the delete_single endpoint.

You're right, there's no bulk action endpoints at the moment. But I was just thinking about the future. Imagine someone deletes 1000 rows - we would then have to insert 1000 rows into the audit table.

That would be possible if we have only primary_key type Serial. But we also have primary_key type string and UUID, and we can't use Integer column for that (which will be great for precise filtering). We need to use Varchar and query would be very similar for effected_row and change_message using like.

Good point - the primary key might not be an integer. Storing them all as strings is OK.

I don't think we should remove the change_message column because it nicely describes the action the user has taken. Filtering is maybe imprecise, but we can't use the Integer column for effected_row.

Maybe change_message could be achieved using get_readable instead.

I added table_name, but I don't understand what the table_class column would be for and what it should go into that column?

I was just thinking that if there's a table class called MyTable, and it refers to the database table my_table, maybe we would want to store them both. But it's not that important. Just need to decide which one is best to store - my_table or MyTable.

Copy link
Member Author

@sinisaos sinisaos Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, there's no bulk action endpoints at the moment. But I was just thinking about the future. Imagine someone deletes 1000 rows - we would then have to insert 1000 rows into the audit table.

I agree, but we don’t currently have the classic bulk delete in Piccolo Admin (we loop through delete_single endpoint) and audit logs would primarily be used in Piccolo Admin (page size limit is 100 per page, which is not so bad).
Once we have the proper bulk_delete, we can change the delete action for the audit logs to something like this

# in piccolo_api.crud.endpoints.py
@apply_validators
async def delete_bulk(
self, request: Request, params: t.Optional[t.Dict[str, t.Any]] = None
) -> Response:
    """
    Bulk deletes rows - query parameters are used for filtering.
    """
    params = self._clean_data(params) if params else {}
    split_params = self._split_params(params)
    split_params_ids = split_params.ids.split(",")
  
    try:
        query: t.Union[
            Select, Count, Objects, Delete
        ] = self.table.delete()
        try:
            # Serial or UUID primary keys enabled in query params
            value_type = self.table._meta.primary_key.value_type
            ids = [value_type(item) for item in split_params_ids]
            query_ids = query.where(
                self.table._meta.primary_key.is_in(ids)
            )
            query = self._apply_filters(query_ids, split_params)
            # this is not tested but it should be work
            # in ids list can be one or more items 
            deleted_rows_map = {"deleted_ids": ids}
            try:
                if self.audit_log_table:
                    await self.audit_log_table.record_delete_action(
                        self.table,
                        deleted_rows_pk=deleted_rows_map,
                        user_id=request.user.user_id,
                    )
            except Exception as exception:
                logger.log(msg=f"{exception}", level=logging.WARNING)
        except ValueError:
            query = self._apply_filters(query, split_params)
    except MalformedQuery as exception:
         return Response(str(exception), status_code=400)

    await query.run()
    return Response(status_code=204)

# in piccolo_api.audit_logs.tables.py

class AuditLog(Table):
    ...
    deleted_rows = JSON()
    ...
    @classmethod
    async def record_delete_action(
        cls,
        table: t.Type[Table],
        deleted_rows_pk: t.Dict[str, t.Any],
        user_id: int,
    ):
        result = cls(
            action_type=cls.ActionType.deleting,
            action_user=cls.get_user_username(user_id),
            deleted_rows=deleted_rows_pk,
        )
        await result.save().run()

But for now we don't have that option because we're looping through the delete_single endpoint and we can't store multiple IDs in a row like this

// Maps to a list, so we can also use it for a bulk delete endpoint.
{
    "deleted_ids": [1, 2]
}

We should stick to what we have at the moment, and later we can always change (when we have bulk_delete we will have to make changes also in Piccolo Admin).

I was just thinking that if there's a table class called MyTable, and it refers to the database table my_table, maybe we would want to store them both. But it's not that important. Just need to decide which one is best to store - my_table or MyTable.

I think table_name is a better option because it's Varchar and we can use the filter sidebar to search and filter records.

Comment on lines 14 to 16
CREATING = "CREATING"
UPDATING = "UPDATING"
DELETING = "DELETING"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to make these lower case.

I looked at the Python docs, and all of the examples are uppercase, which I found surprising:

https://docs.python.org/3/library/enum.html

However, in Piccolo they're always been lowercase, and I think it's a bit more convenient to type if lowercase.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will change that.

@sinisaos sinisaos marked this pull request as draft March 8, 2023 06:40
@sinisaos
Copy link
Member Author

sinisaos commented Mar 8, 2023

Remainder: when #145 is merged we can finish this.

@sinisaos sinisaos closed this Sep 1, 2023
@sinisaos sinisaos deleted the audit_logs branch September 1, 2023 12:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants