-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add linl33 v2 #678
Add linl33 v2 #678
Conversation
893730b
to
1ed0358
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's faster for the standard key set, but takes unduly lon for 10K keys (see create_measurements_3.sh). Could you take a look, would be sad to see it with DNF on the 10K leaderboard (I am evaluating the top 25 entries against that key seet).
if [ -f ${{ format('src/main/java-22/dev/morling/onebrc/CalculateAverage_{0}.java', github.event.pull_request.user.login || '') }} ]; then | ||
sdk install java 22.ea.32-open || true | ||
sdk use java 22.ea.32-open | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed, the evaluate.sh
script takes care of installing all required JDKs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test_ci.sh
installs the JDKs but it runs after maven so without that line, my class under java-22/
wouldn't be compiled.
Here's the original run without the change
https://github.com/gunnarmorling/1brc/actions/runs/7724135717/job/21055662835
The 10k dataset takes ~40s on my laptop (6 cores, no HT) now. Still not fast but I don't think there's anything quick I can do to fix it now. |
Hi @gunnarmorling ! One suggestion for when you run the 10k:
Because the 10k is what this challenge is really asking. The top 10 for the 10k might be really different. =) |
It's pretty much doing that (see create_measurements_3.sh which creates that file).
Yes, agreed. I had not anticipated this, and in hindsight I should have not exposed the specific key set. But by the time folks started to optimize for the 413 key set and I realized it, I felt it was too late for changing the evaluation key set, as it would have created lots of frustration. In the end, people did what they were asked to do, so that's totally fair :)
Yes, which is why there is that separate leaderboard for the 10K keyset which I update occassionally by running the top 25 or so entries from the regular leaderboard against that keyset. Now there could be entries which score lower on the standard leaderboard but would be in the top 25 for 10K, but I felt that's acceptable, given it's just a side eval. Learning a lot myself on how to run these kinds of things :) The reason I am asking submitters of top entries like this one to also make sure they do reasonably well with 10K is to make sure that they pass this test, so it's another probe on correctness. |
Thanks a lot for the update, @linl33. This looks good now. A bit faster on the standard key set and also comes in at ~19 sec for 10K with the correct results. Thx for participating in 1BRC and creating one of the top entries!
|
@gunnarmorling Thank you for organizing this event! |
Hey @linl33! Congrats again on being in the Top 20 of the One Billion Row Challenge! To celebrate this amazing achievement, I would like to send you a 1BRC t-shirt and coffee mug. To claim your prize, fill out this form by Feb 18. After submitting the form, please provide a comment with the random value you've specified in the form, so that I know it is you who submitted it. All data entered will solely be used in relation to processing this shipment. Shipments can be sent to any country listed here or here (I'll use whichever one is cheaper for me to ship to your location). A big thank you to Decodable for sponsoring these prizes! Thanks a lot for participating in 1BRC, --Gunnar |
Hi @gunnarmorling, My 2 random numbers are
|
Check List:
./mvnw verify
and the project builds successfully./test.sh <username>
shows no differences between expected and actual outputs)calculate_average_<username>.sh
(make sure to match casing of your GH user name) and is executablecalculate_average_baseline.sh