-
-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement rolling groupby #460
Conversation
if self.groupby_kwargs is not None: | ||
return type(parent)( | ||
type(self)(self.frame[columns], *self.operands[1:]), | ||
*parent.operands[1:], | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm honestly not even sure what this is doing and it does feel like we're duplicating work but this is what I found for the groupby implementations and this seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This drops all columns except the parent columns and the by columns from the df before the groupby and then drops the by column after the groupby if not in parent.columns
Yes this is needed for groupby to avoid dropping too many columns
dask_expr/tests/test_groupby.py
Outdated
|
||
actual = ddf.groupby("group1").rolling("1D").sum()["column1"] | ||
expected = df.groupby("group1").rolling("1D").sum()["column1"] | ||
actual.optimize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, we would test this against
df[["column1", "group1"]].groupby("group1").rolling("1D").sum()["column1"].optimize()._name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
69c0651
to
ed090dc
Compare
ed090dc
to
7b8a654
Compare
thx |
So, I'm just expanding the existing
Rolling
because the groupby rolling is actually doing the same thing as the ordinary rolling just with a local groupby. The implementation feels very unclean. I'll review the class hierarchy again if there is a more elegant way to modify the logic here. Any pointers appreciated