Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update seijikun implementation #60

Merged
merged 7 commits into from
Jan 5, 2024
Merged

Conversation

seijikun
Copy link
Contributor

@seijikun seijikun commented Jan 3, 2024

Updated with

  • SIMD for station name parsing
  • Fewer String allocations
  • Moved from double to integer maths

@gunnarmorling
Copy link
Owner

Hey, could you please rebase this one and also make sure all the tests pass (added those today to cover some corner cases):

./test.sh seijikun

Thx!

@seijikun seijikun force-pushed the main branch 2 times, most recently from 6f2bc8b to fc9773f Compare January 5, 2024 15:14
@seijikun
Copy link
Contributor Author

seijikun commented Jan 5, 2024

Hey, could you please rebase this one and also make sure all the tests pass (added those today to cover some corner cases):

./test.sh seijikun

Thx!

Okay, done! Tests seem to succeed

@gunnarmorling
Copy link
Owner

Hum, hum, I still see failures:

Validating calculate_average_seijikun.sh -- src/test/resources/samples/measurements-10.txt
WARNING: Using incubator modules: jdk.incubator.vector
java.lang.IllegalArgumentException: Negative size
	at java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1255)
	at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1186)
	at dev.morling.onebrc.CalculateAverage_seijikun$ChunkReader.run(CalculateAverage_seijikun.java:183)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
java.io.IOException: Channel not open for writing - cannot extend file to required size
	at java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1285)
	at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1186)
	at dev.morling.onebrc.CalculateAverage_seijikun$ChunkReader.run(CalculateAverage_seijikun.java:183)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)
java.io.IOException: Channel not open for writing - cannot extend file to required size
	at java.base/sun.nio.ch.FileChannelImpl.mapInternal(FileChannelImpl.java:1285)
	at java.base/sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:1186)
	at dev.morling.onebrc.CalculateAverage_seijikun$ChunkReader.run(CalculateAverage_seijikun.java:183)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

real	0m0.192s
user	0m0.363s
sys	0m0.059s
1c1
< {Adelaide=15.0/15.0/15.0, Cabo San Lucas=14.9/14.9/14.9, Dodoma=22.2/22.2/22.2, Halifax=12.9/12.9/12.9, Karachi=15.4/15.4/15.4, Pittsburgh=9.7/9.7/9.7, Ségou=25.7/25.7/25.7, Xi'an=24.2/24.2/24.2, Zagreb=12.2/12.2/12.2}
---
> {Adelaide=15.0/15.0/15.0, Cabo San Lucas=14.9/14.9/14.9, Dodoma=22.2/22.2/22.2, Halifax=12.9/12.9/12.9, Karachi=15.4/15.4/15.4, Pittsburgh=9.7/9.7/9.7, Ségou=25.7/25.7/25.7, Tauranga=38.2/38.2/38.2, Xi'an=24.2/24.2/24.2, Zagreb=12.2/12.2/12.2}

Originally, StationIdent was using byte[] to store names, so the extra
String allocation could be avoided. However, that produced incorrect
sorting.
Sorting is now moved to the result merging step. Here, names are
converted to Strings.
@seijikun
Copy link
Contributor Author

seijikun commented Jan 5, 2024

Whoops, sorry! You're right.
The bug depended on the amount of cores in the test machine. I rewrote the chunking logic to now be hopefully much more robust.

@gunnarmorling
Copy link
Owner

00:22.709 now, thx for participating!

@gunnarmorling gunnarmorling merged commit 36dac25 into gunnarmorling:main Jan 5, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants