Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce typeguard in tests #2073

Merged
merged 9 commits into from
Dec 11, 2023
Merged

Conversation

Kakadus
Copy link
Collaborator

@Kakadus Kakadus commented Nov 20, 2023

close #2066

Comment on lines 76 to 83
class Meta(type):
def __instancecheck__(cls, instance: object) -> TypeGuard["ValidCourseData"]:
if not isinstance(instance, CourseData):
return False
return all_fields_valid(instance)


class ValidCourseData(CourseData, metaclass=Meta):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, didn't know about __instancecheck__. On one hand this is so cool, but on the other hand it allows programmers to do so many things that are just wrong :D

Two notes:

  1. should probably be named ValidCourseDataMeta instead of just Meta?
  2. does this allow us to just use isinstance in the importer when we currently check all_fields_valid? If yes, is that desirable? Feels consistent to me, but I'm unsure.

Copy link
Collaborator Author

@Kakadus Kakadus Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if 2., should all_fields_valid be merged with this instancecheck? What do you think @niklasmohrin?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, we are overwriting isinstance(x, ValidCourseData) here. Do you know if this has any implications internally? Afaik, all Python objects have their own __class__ field to look up methods etc. So if we had a method on ValidCourseData, we could have the situation that an object passes isinstance, but still uses a method from a parent, right? This is odd to think about, can't we just have def as_valid(self) -> Optional[ValidCourseData] method where we construct the object of the child class?

(If the answer is no, I would be fine with merging the two methods)

Copy link
Collaborator Author

@Kakadus Kakadus Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I understood you correctly.
As far as I understood it, isinstance(x, cls) will use the implementation of __instancecheck__ of the metaclass of cls, for all and every x. The metaclass is also "inherited" when subclassing ValidCourseData in this case. So, the answer should be no I guess? (I may have misunderstood the question)

Context: typeguard uses isinstance to determine if the annotated return type matches the type of the returned object. typeguard does not support TypeGuards though (originally intended for static typechecking only) and complains that valid CourseData objects aren't actually ValidCourseData objects at runtime. This is just a limitation of typeguard.

Since metaclasses always cause some headache (see this thread), maybe it is still better to use as_valid all along? Then I feel like we may be loosing some conveniences of TypeGuard though.

We can also just don't do 2., i.e. revert 123af29 and leave the __instancecheck__ meta thingy only for typeguard and continue to use all_fields_valid

Copy link
Collaborator Author

@Kakadus Kakadus Nov 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After reading your question again: You mean when we override a function of CourseData in ValidCourseData? Then indeed, the implementation in CourseData should be used, because the type of the object did not change. Since we are talking about TypeGuards here, this is not really relevant because you won't override a function in ValidCourseData, if I understood TypeGuard correctly (a "feature" is that the runtime type remains the same). You would only change the function parameter types / return types in ValidCourseData

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not really convinced to be honest, I feel that we should generally avoid playing with isinstance, because it is a function that programmers have strong assumptions about and we are breaking them here. In particular, this typing-related change has implications in non-typing situations, because we actually changed the code.

In the case here, the change is probably harmless and the easiest we will get. If you two prefer the current approach here, I am fine with that, still, I would vote against __instancecheck__ though :D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be ok with it here since ValidCourseData is a dataclass with no methods, but I see your point, so I'm also open to other workarounds / solutions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, reverted 123af29 first of all

Copy link
Member

@niklasmohrin niklasmohrin Dec 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After taking another look at the usage of ValidCourseData, I am also fine with moving forward with the current version. Constructing a true instance of ValidCourseData would mean that we need to inject this new instance back into the process, adding another Mapper. Even worse, we would have to add ValidEnrollmentParsedRow etc. which have ValidCourseData instead of CouseData.

Maybe, as a fun exercise, we could come up with a class decorator that adds a "validated" version of a class into the class namespace, so we would have

@partial_invalid
@dataclass
class CourseData:
    """Holds information about a course, retrieved from an import file."""

    name_de: str
    name_en: str
    degrees: MaybeInvalid[set[Degree]]
    course_type: MaybeInvalid[CourseType]
    is_graded: MaybeInvalid[bool]
    responsible_email: str

def foo(course_data: CourseData):
    bar: CourseData.Validated = course_data.validate()
    # ...

... and it recurses into member variables that are annotated with partial_invalid. It is clearly overkill and I am unsure how well typecheckers will be able to see through this, but it sounds like a fun project :^)

manage.py Outdated Show resolved Hide resolved
@Kakadus Kakadus force-pushed the typeguard branch 2 times, most recently from 0daed1b to 4451d51 Compare November 27, 2023 19:10
@Kakadus
Copy link
Collaborator Author

Kakadus commented Nov 27, 2023

Github does not update to the last changes somehow... Maybe my changes will come up soon...

@richardebeling richardebeling marked this pull request as ready for review November 27, 2023 20:05
@Kakadus Kakadus force-pushed the typeguard branch 3 times, most recently from 6e7598d to 123af29 Compare November 27, 2023 20:17
Copy link
Member

@niklasmohrin niklasmohrin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we verified that this catches the kinds of errors we expect it to / are there any examples yet?

Some final remarks:

evap/settings.py Outdated Show resolved Hide resolved
evap/evaluation/tools.py Show resolved Hide resolved
evap/evaluation/tools.py Outdated Show resolved Hide resolved
@niklasmohrin niklasmohrin merged commit 162500b into e-valuation:main Dec 11, 2023
10 of 12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Check correctness of type annotations at test runtime
3 participants