WIP PR for getting CI working as desired #10

Open
wants to merge 4 commits into
base: master

Conversation

cshaley
Collaborator

@cshaley cshaley commented Nov 16, 2023

No description provided.

@cshaley
Collaborator Author

cshaley commented Nov 16, 2023

Looks like the env file changes need to be merged before they get pulled into the build.

Note that I skipped test_logical because it is failing. The cause appears to be that logical.avro is not a valid Avro file.

@cshaley
Collaborator Author

cshaley commented Nov 17, 2023

@martindurant any idea where to look to resolve this error? It will take me a good while of digging to understand the right way to fix it.

@martindurant
Member

The values seem to be of completely the wrong order, so it could be an int32 -> int64 kind of thing. In pandas v2, more time types became allowed.
I don't have a huge amount of time right now, but I can move this repo to the intake org and make you a collaborator if you like.
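
For illustration only, here is a minimal NumPy sketch of how an integer width or time unit mismatch can produce values that are wildly off. The raw value is hypothetical and is not taken from the failing test; the point is just that the same 64-bit integer reads very differently depending on the unit attached to it, and that narrowing to 32 bits changes it entirely.

```python
import numpy as np

# Hypothetical raw value: microseconds since the Unix epoch, stored as int64.
raw = np.array([1_700_000_000_000_000], dtype="int64")

# Reinterpret the same 8 bytes with two different time units.
as_us = raw.view("datetime64[us]")  # the unit the value was written with
as_ns = raw.view("datetime64[ns]")  # wrong unit: off by a factor of 1000

print("as microseconds:", as_us[0])
print("as nanoseconds: ", as_ns[0])

# Narrowing to int32 cannot hold the value, so the result no longer matches.
truncated = raw.astype("int32")
print("int64:", raw[0], "-> int32:", truncated[0])
```

Comparing the printed timestamps against the expected ones in the failing test is one way to tell a unit mismatch from a width mismatch.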

@cshaley
Collaborator Author

cshaley commented Nov 17, 2023

Sure, I'm fine with that.

This looks like a bug in fastparquet.

dask/fastparquet@f2f53b7

t should always be a dtype. I think it should just be t = str(t) + "[ns]", so that the following line (d = np.empty(0, dtype=t)) works as expected. I will raise a PR against fastparquet.
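
A minimal sketch of the proposed change, run in isolation with NumPy. It assumes that t arrives as a unit-less datetime64 dtype; the surrounding fastparquet code is not reproduced here, and the variable names other than t and d are made up for the example.

```python
import numpy as np

# Assumption: t is a NumPy dtype with no time unit attached.
generic = np.dtype("datetime64")
print(np.datetime_data(generic))  # reports the unit NumPy sees for this dtype

# Proposed fix: build an explicit nanosecond dtype string from it.
t = str(generic) + "[ns]"

# The following line from the comment above then gets a concrete unit.
d = np.empty(0, dtype=t)

unit, count = np.datetime_data(d.dtype)
print(t, d.dtype, unit, count)
```

Note that this only works when t really is a dtype: calling str() on the np.datetime64 class itself would not yield a usable dtype string.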

@martindurant
Member

Before going further, you might be interested in https://awkward-array.org/doc/main/reference/generated/ak.from_avro_file.html in awkward-array, which I was only just made aware of. It respects the truly nested nature of Avro and might be what you want, but you can also convert to arrow or pandas if the dataset is really tabular.

@cshaley
Collaborator Author

cshaley commented Nov 20, 2023

> Before going further, you might be interested by https://awkward-array.org/doc/main/reference/generated/ak.from_avro_file.html in awkward-array

Hmm, the reason I was updating uavro is to get intake-avro working. I wasn't familiar with awkward-array, but I bet I could drop it in to replace uavro in intake-avro. Thoughts?

2 participants