Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial commit for audit logs #160

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 16 additions & 0 deletions docs/source/audit_logs/commands.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
Commands
========

If you've registered the ``audit_logs`` app in your ``piccolo_conf.py`` file
(see the :ref:`migrations docs <AuditLogMigrations>`), it gives you access to a
custom command.

clean
-----

If you run the following on the command line, it will delete all logs
from the database.

.. code-block:: bash

piccolo audit_logs clean
63 changes: 63 additions & 0 deletions docs/source/audit_logs/tables.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
Tables
======

``audit_logs`` is a ``Piccolo`` app that records changes made by users to database tables.
We store the audit logs in :class:`AuditLog <piccolo_api.audit_logs.tables.AuditLog>`.

-------------------------------------------------------------------------------

.. _AuditLogMigrations:

Migrations
----------

We recommend creating the ``audit_logs`` tables using migrations.

You can add ``piccolo_api.audit_logs.piccolo_app`` to the ``apps`` arguments
of the :class:`AppRegistry <piccolo.conf.apps.AppRegistry>` in ``piccolo_conf.py``.

.. code-block:: bash

APP_REGISTRY = AppRegistry(
apps=[
...
"piccolo_api.audit_logs.piccolo_app",
...
]
)

To learn more about Piccolo apps, see the `Piccolo docs <https://piccolo-orm.readthedocs.io/en/latest/piccolo/projects_and_apps/index.html>`_.

To run the migrations and create the table, run:

.. code-block:: bash

piccolo migrations forwards audit_logs

-------------------------------------------------------------------------------

Creating them manually
----------------------

If you prefer not to use migrations, and want to create them manually, you can
do this instead:

.. code-block:: python

from piccolo_api.audit_logs.tables import AuditLog
from piccolo.tables import create_db_tables_sync

create_db_tables_sync(AuditLog, if_not_exists=True)

-------------------------------------------------------------------------------

Source
------

AuditLog
~~~~~~~~

.. currentmodule:: piccolo_api.audit_logs.tables

.. autoclass:: AuditLog
:members:
7 changes: 7 additions & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,13 @@ ASGI app, covering authentication, security, and more.
./change_password/index
./advanced_auth/index

.. toctree::
:caption: Audit logs
:maxdepth: 1

./audit_logs/tables.rst
./audit_logs/commands.rst

.. toctree::
:caption: Contributing
:maxdepth: 1
Expand Down
Empty file.
10 changes: 10 additions & 0 deletions piccolo_api/audit_logs/commands.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
from .tables import AuditLog


async def clean():
"""
Removes all audit logs.
"""
print("Removing audit logs ...")
await AuditLog.delete(force=True).run()
print("Successfully removed audit logs")
24 changes: 24 additions & 0 deletions piccolo_api/audit_logs/piccolo_app.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
"""
Import all of the Tables subclasses in your app here, and register them with
the APP_CONFIG.
"""

import os

from piccolo.conf.apps import AppConfig

from .commands import clean
from .tables import AuditLog

CURRENT_DIRECTORY = os.path.dirname(os.path.abspath(__file__))


APP_CONFIG = AppConfig(
app_name="audit_logs",
migrations_folder_path=os.path.join(
CURRENT_DIRECTORY, "piccolo_migrations"
),
table_classes=[AuditLog],
migration_dependencies=[],
commands=[clean],
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,129 @@
from enum import Enum

from piccolo.apps.migrations.auto.migration_manager import MigrationManager
from piccolo.columns.column_types import JSON, Text, Timestamp, Varchar
from piccolo.columns.defaults.timestamp import TimestampNow
from piccolo.columns.indexes import IndexMethod

ID = "2022-06-26T22:48:09:433352"
VERSION = "0.80.0"
DESCRIPTION = ""


async def forwards():
manager = MigrationManager(
migration_id=ID, app_name="audit_logs", description=DESCRIPTION
)

manager.add_table("AuditLog", tablename="audit_log")

manager.add_column(
table_class_name="AuditLog",
tablename="audit_log",
column_name="action_time",
db_column_name="action_time",
column_class_name="Timestamp",
column_class=Timestamp,
params={
"default": TimestampNow(),
"null": False,
"primary_key": False,
"unique": False,
"index": False,
"index_method": IndexMethod.btree,
"choices": None,
"db_column_name": None,
"secret": False,
},
)

manager.add_column(
table_class_name="AuditLog",
tablename="audit_log",
column_name="action_type",
db_column_name="action_type",
column_class_name="Varchar",
column_class=Varchar,
params={
"length": 255,
"default": "",
"null": False,
"primary_key": False,
"unique": False,
"index": False,
"index_method": IndexMethod.btree,
"choices": Enum(
"ActionType",
{
"CREATING": "CREATING",
"UPDATING": "UPDATING",
"DELETING": "DELETING",
},
),
"db_column_name": None,
"secret": False,
},
)

manager.add_column(
table_class_name="AuditLog",
tablename="audit_log",
column_name="action_user",
db_column_name="action_user",
column_class_name="Varchar",
column_class=Varchar,
params={
"length": 255,
"default": "",
"null": False,
"primary_key": False,
"unique": False,
"index": False,
"index_method": IndexMethod.btree,
"choices": None,
"db_column_name": None,
"secret": False,
},
)

manager.add_column(
table_class_name="AuditLog",
tablename="audit_log",
column_name="change_message",
db_column_name="change_message",
column_class_name="Text",
column_class=Text,
params={
"default": "",
"null": False,
"primary_key": False,
"unique": False,
"index": False,
"index_method": IndexMethod.btree,
"choices": None,
"db_column_name": None,
"secret": False,
},
)

manager.add_column(
table_class_name="AuditLog",
tablename="audit_log",
column_name="changes_in_row",
db_column_name="changes_in_row",
column_class_name="JSON",
column_class=JSON,
params={
"default": "{}",
"null": False,
"primary_key": False,
"unique": False,
"index": False,
"index_method": IndexMethod.btree,
"choices": None,
"db_column_name": None,
"secret": False,
},
)

return manager
118 changes: 118 additions & 0 deletions piccolo_api/audit_logs/tables.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,118 @@
import typing as t
import uuid
from enum import Enum

from piccolo.apps.user.tables import BaseUser
from piccolo.columns import JSON, Text, Timestamp, Varchar
from piccolo.table import Table


class AuditLog(Table):
class ActionType(str, Enum):
"""An enumeration of AuditLog table actions type."""

CREATING = "CREATING"
UPDATING = "UPDATING"
DELETING = "DELETING"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to make these lower case.

I looked at the Python docs, and all of the examples are uppercase, which I found surprising:

https://docs.python.org/3/library/enum.html

However, in Piccolo they're always been lowercase, and I think it's a bit more convenient to type if lowercase.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I will change that.


action_time = Timestamp()
action_type = Varchar(choices=ActionType)
action_user = Varchar()
change_message = Text()
Copy link
Member

@dantownsend dantownsend Jun 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think one really nice thing would be recording which changes were made. There could be a JSON field which would have something like {'name': 'bob'} if the name field was changed to 'bob'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea. I will try to implement it.

changes_in_row = JSON()

@classmethod
async def record_save_action(
cls,
table: t.Type[Table],
user_id: int,
new_row_id=t.Union[str, uuid.UUID, int],
):
"""
A method for tracking creating record actions.

:param table:
A table for which we monitor activities.
:param user_id:
The ``primary key`` of authenticated user.
"""
result = cls(
action_type=cls.ActionType.CREATING,
action_user=cls.get_user_username(user_id),
change_message=f"User {cls.get_user_username(user_id)} "
f"create row {new_row_id} in {table._meta.tablename.title()} "
f"table",
)
await result.save().run()

@classmethod
async def record_patch_action(
cls,
table: t.Type[Table],
row_id: t.Union[str, uuid.UUID, int],
user_id: int,
changes_in_row: t.Dict[str, t.Any],
):
"""
A method for tracking updating record actions.

:param table:
A table for which we monitor activities.
:param row_id:
The ``primary key`` of the table for which we
monitor activities.
:param user_id:
The ``primary key`` of authenticated user.
:param changes_in_row:
JSON with all changed columns in the existing row.
"""
result = cls(
action_type=cls.ActionType.UPDATING,
action_user=cls.get_user_username(user_id),
change_message=f"User {cls.get_user_username(user_id)} update row "
f"{row_id} in {table._meta.tablename.title()} table",
changes_in_row=changes_in_row,
)
await result.save().run()

@classmethod
async def record_delete_action(
cls,
table: t.Type[Table],
row_id: t.Union[str, uuid.UUID, int],
user_id: int,
):
"""
A method for tracking deletion record actions.

:param table:
A table for which we monitor activities.
:param row_id:
The ``primary key`` of the table for which we
monitor activities.
:param user_id:
The ``primary key`` of authenticated user.
"""
result = cls(
action_type=cls.ActionType.DELETING,
action_user=cls.get_user_username(user_id),
change_message=f"User {cls.get_user_username(user_id)} delete row "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to get rid of the change message all together.

One option is just storing everything in JSON.

For delete, the json can be:

// Maps to a list, so we can also use it for a bulk delete endpoint.
{
    "deleted_ids": [1, 2]
}

Likewise, for the save, it could be:

// Maps to a list, so we can also use it for a bulk creation endpoint.
{
    "created_ids": [1, 2]
}

Or we have a field called effected_rows which is an integer or array of integers.

It's just about making it easy to query - imagine you wanted to know who deleted row 42. You would have to something like:

await AuditLog.select().where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.change_message.like('%42%')
)

It's quite imprecise, ad would be better to just do:

await AuditLog.select(AuditLog.action_user).where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.effected_rows.any(42)
)

Copy link
Member

@dantownsend dantownsend Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, table name should probably be a column too, to make it easier to filter out the changes on a certain table. I'd be tempted to add table_name and table_class - again, just to make it easy to filter. What do you think?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be tempted to get rid of the change message all together.

One option is just storing everything in JSON.

For delete, the json can be:

// Maps to a list, so we can also use it for a bulk delete endpoint.
{
    "deleted_ids": [1, 2]
}

Likewise, for the save, it could be:

// Maps to a list, so we can also use it for a bulk creation endpoint.
{
    "created_ids": [1, 2]
}

But all actions are for endpoints that work with a single record. We don't have the bulk_delete option yet (PR here) and we don't have bulk_creation at all. Audit logs would be primarily used in Piccolo Admin and there we delete multiple records by looping through the delete_single endpoint.

Or we have a field called effected_rows which is an integer or array of integers.

That would be possible if we have only primary_key type Serial. But we also have primary_key type string and UUID, and we can't use Integer column for that (which will be great for precise filtering). We need to use Varchar and query would be very similar for effected_row and change_message using like.

It's just about making it easy to query - imagine you wanted to know who deleted row 42. You would have to something like:

await AuditLog.select().where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.change_message.like('%42%')
)

It's quite imprecise, ad would be better to just do:

await AuditLog.select(AuditLog.action_user).where(
    AuditLog.action_type == AuditLog.ActionType.DELETING,
    AuditLog.effected_rows.any(42)
)

Here is screenshot with new columns.

changed_columns

I don't think we should remove the change_message column because it nicely describes the action the user has taken. Filtering is maybe imprecise, but we can't use the Integer column for effected_row.

Similarly, table name should probably be a column too, to make it easier to filter out the changes on a certain table. I'd be tempted to add table_name and table_class - again, just to make it easy to filter. What do you think?

I added table_name, but I don't understand what the table_class column would be for and what it should go into that column?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points.

But all actions are for endpoints that work with a single record. We don't have the bulk_delete option yet (#145) and we don't have bulk_creation at all. Audit logs would be primarily used in Piccolo Admin and there we delete multiple records by looping through the delete_single endpoint.

You're right, there's no bulk action endpoints at the moment. But I was just thinking about the future. Imagine someone deletes 1000 rows - we would then have to insert 1000 rows into the audit table.

That would be possible if we have only primary_key type Serial. But we also have primary_key type string and UUID, and we can't use Integer column for that (which will be great for precise filtering). We need to use Varchar and query would be very similar for effected_row and change_message using like.

Good point - the primary key might not be an integer. Storing them all as strings is OK.

I don't think we should remove the change_message column because it nicely describes the action the user has taken. Filtering is maybe imprecise, but we can't use the Integer column for effected_row.

Maybe change_message could be achieved using get_readable instead.

I added table_name, but I don't understand what the table_class column would be for and what it should go into that column?

I was just thinking that if there's a table class called MyTable, and it refers to the database table my_table, maybe we would want to store them both. But it's not that important. Just need to decide which one is best to store - my_table or MyTable.

Copy link
Member Author

@sinisaos sinisaos Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, there's no bulk action endpoints at the moment. But I was just thinking about the future. Imagine someone deletes 1000 rows - we would then have to insert 1000 rows into the audit table.

I agree, but we don’t currently have the classic bulk delete in Piccolo Admin (we loop through delete_single endpoint) and audit logs would primarily be used in Piccolo Admin (page size limit is 100 per page, which is not so bad).
Once we have the proper bulk_delete, we can change the delete action for the audit logs to something like this

# in piccolo_api.crud.endpoints.py
@apply_validators
async def delete_bulk(
self, request: Request, params: t.Optional[t.Dict[str, t.Any]] = None
) -> Response:
    """
    Bulk deletes rows - query parameters are used for filtering.
    """
    params = self._clean_data(params) if params else {}
    split_params = self._split_params(params)
    split_params_ids = split_params.ids.split(",")
  
    try:
        query: t.Union[
            Select, Count, Objects, Delete
        ] = self.table.delete()
        try:
            # Serial or UUID primary keys enabled in query params
            value_type = self.table._meta.primary_key.value_type
            ids = [value_type(item) for item in split_params_ids]
            query_ids = query.where(
                self.table._meta.primary_key.is_in(ids)
            )
            query = self._apply_filters(query_ids, split_params)
            # this is not tested but it should be work
            # in ids list can be one or more items 
            deleted_rows_map = {"deleted_ids": ids}
            try:
                if self.audit_log_table:
                    await self.audit_log_table.record_delete_action(
                        self.table,
                        deleted_rows_pk=deleted_rows_map,
                        user_id=request.user.user_id,
                    )
            except Exception as exception:
                logger.log(msg=f"{exception}", level=logging.WARNING)
        except ValueError:
            query = self._apply_filters(query, split_params)
    except MalformedQuery as exception:
         return Response(str(exception), status_code=400)

    await query.run()
    return Response(status_code=204)

# in piccolo_api.audit_logs.tables.py

class AuditLog(Table):
    ...
    deleted_rows = JSON()
    ...
    @classmethod
    async def record_delete_action(
        cls,
        table: t.Type[Table],
        deleted_rows_pk: t.Dict[str, t.Any],
        user_id: int,
    ):
        result = cls(
            action_type=cls.ActionType.deleting,
            action_user=cls.get_user_username(user_id),
            deleted_rows=deleted_rows_pk,
        )
        await result.save().run()

But for now we don't have that option because we're looping through the delete_single endpoint and we can't store multiple IDs in a row like this

// Maps to a list, so we can also use it for a bulk delete endpoint.
{
    "deleted_ids": [1, 2]
}

We should stick to what we have at the moment, and later we can always change (when we have bulk_delete we will have to make changes also in Piccolo Admin).

I was just thinking that if there's a table class called MyTable, and it refers to the database table my_table, maybe we would want to store them both. But it's not that important. Just need to decide which one is best to store - my_table or MyTable.

I think table_name is a better option because it's Varchar and we can use the filter sidebar to search and filter records.

f"{row_id} in {table._meta.tablename.title()} table",
)
await result.save().run()

@classmethod
def get_user_username(cls, user_id: int) -> str:
"""
Returns the username of authenticated user.

:param user_id:
The ``primary key`` of authenticated user.
"""
user = (
BaseUser.select(BaseUser.username)
.where(BaseUser._meta.primary_key == user_id)
.first()
.run_sync()
)
return user["username"]
Loading