CA_vaidhy: 1.9 seconds vs 3 minute 20 seconds. (4.7 seconds on 10K) #699

Closed
wants to merge 71 commits into from

Conversation

anitasv
Contributor

@anitasv anitasv commented Jan 31, 2024

Check List:

  • You have run ./mvnw verify and the project builds successfully
  • Tests pass (./test.sh <username> shows no differences between expected and actual outputs)
  • All formatting changes by the build are committed
  • Your launch script is named calculate_average_<username>.sh (make sure to match casing of your GH user name) and is executable
  • Output matches that of calculate_average_baseline.sh
  • For new entries, or after substantial changes: When implementing custom hash structures, please point to where you deal with hash collisions (line number)

Collision Handling:

Lines 73-86 are the first of many such checks in the code base. The idea is to treat a string as a list of longs (8 bytes each). Since the length is not always a multiple of 8, an extra long padded with zeroes is kept, called the "suffix". So we compare 8 bytes at a time up to the last multiple of 8, and then compare the suffix. This means we never have to check the last few bytes one by one. If the string is shorter than 8 bytes, the suffix check alone is enough: a length of 7, for example, means a null byte is part of the suffix, and a null can't be part of the UTF-8 string itself. And when length <= 8 the hash is already the suffix, because we use xor to create the hash, so there is no need to check the suffix separately either.
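A minimal sketch of the comparison described above, not the code from this PR. The class and method names (`PaddedKeySketch`, `packWord`, `encode`, `hash`, `sameKey`) are made up for illustration, and the sketch assumes keys never contain a null byte, so zero padding in the suffix is unambiguous. It packs a key into full 8-byte words plus one zero-padded suffix word, hashes by xor, and compares whole longs instead of individual bytes.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of "string as a list of longs plus a padded suffix".
final class PaddedKeySketch {

    // Packs up to 8 bytes starting at offset into one long (little-endian).
    // Missing bytes past the end of the array are left as zero padding.
    static long packWord(byte[] bytes, int offset) {
        long word = 0L;
        int n = Math.min(8, bytes.length - offset);
        for (int i = 0; i < n; i++) {
            word |= (bytes[offset + i] & 0xFFL) << (8 * i);
        }
        return word;
    }

    // Encodes a key as its full 8-byte words followed by one suffix word.
    // The suffix is 0 when the byte length is an exact multiple of 8.
    static long[] encode(String key) {
        byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
        int fullWords = bytes.length / 8;
        long[] words = new long[fullWords + 1];
        for (int i = 0; i < fullWords; i++) {
            words[i] = packWord(bytes, i * 8);
        }
        words[fullWords] = packWord(bytes, fullWords * 8);
        return words;
    }

    // Xor of all words; for keys shorter than 8 bytes this equals the suffix.
    static long hash(long[] words) {
        long h = 0L;
        for (long w : words) {
            h ^= w;
        }
        return h;
    }

    // Two keys match when every full word and the suffix word are equal.
    static boolean sameKey(long[] a, long[] b) {
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        long[] a = encode("Hamburg");
        long[] b = encode("Hamburh");
        System.out.println(sameKey(a, encode("Hamburg")));
        System.out.println(sameKey(a, b));
        System.out.println(hash(a) == a[0]);
    }
}
```

In a hash table, `sameKey` would run only when two entries land in the same slot, which is where a collision has to be told apart from a real match.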

10K keys dataset.

The output also matches on this dataset. I had to increase the hash size to twice what we had for better performance.

  • Execution time: 1.9 seconds
  • Execution time of reference implementation: 3 minutes 20 seconds
  • Execution time on 10K dataset: 4.7 seconds

@gunnarmorling
Owner

Could you rebase and squash this into a single commit off of current main? Thx!

@anitasv
Contributor Author

anitasv commented Jan 31, 2024

Opened another PR: #708

@anitasv anitasv closed this Jan 31, 2024