Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contextful validity errors, null-byte specs fix #9

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

diamondburned
Copy link
Contributor

@diamondburned diamondburned commented Mar 22, 2021

This pull request adds meaningful errors while decoding as well as proper null-byte encoding.

Benchmarks

New is the pull request and old is upstream/main. Test performed on a laptop with an Intel Core i5-8250U.

name          old time/op    new time/op    delta
Encode-8         269ns ±14%     269ns ±21%     ~     (p=1.000 n=8+8)
EncodeTo-8      1.78µs ±13%    1.70µs ± 9%     ~     (p=0.266 n=8+8)
Decode-8        2.74µs ±12%    3.40µs ± 6%  +24.26%  (p=0.000 n=8+8) <-- more is slower
DecodeFrom-8    3.38µs ±16%    4.69µs ± 7%  +38.76%  (p=0.000 n=8+8) <-- more is slower

name          old speed      new speed      delta
Encode-8      44.8MB/s ±15%  45.2MB/s ±19%     ~     (p=1.000 n=8+8)
EncodeTo-8     185MB/s ±14%   193MB/s ± 9%     ~     (p=0.279 n=8+8)
Decode-8       120MB/s ±11%    96MB/s ± 6%  -19.68%  (p=0.000 n=8+8) <-- less is slower
DecodeFrom-8  97.5MB/s ±14%  69.9MB/s ± 7%  -28.33%  (p=0.000 n=8+8) <-- less is slower

The performance loss in DecodeFrom is probably scanner.Text() having to allocate a new string for each chunk. The speed is still fairly decent, though.

@codecov-io
Copy link

codecov-io commented Mar 22, 2021

Codecov Report

Merging #9 (20a958c) into main (2b0e3cd) will decrease coverage by 6.88%.
The diff coverage is 50.94%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main       #9      +/-   ##
==========================================
- Coverage   66.01%   59.13%   -6.89%     
==========================================
  Files           1        1              
  Lines         103      115      +12     
==========================================
  Hits           68       68              
- Misses         27       36       +9     
- Partials        8       11       +3     
Impacted Files Coverage Δ
bottom/bottom.go 59.13% <50.94%> (-6.89%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b0e3cd...20a958c. Read the comment docs.

@diamondburned diamondburned force-pushed the io-sanitize branch 2 times, most recently from 20a958c to b5ec11a Compare March 22, 2021 01:38
@nihaals nihaals linked an issue Mar 22, 2021 that may be closed by this pull request
bottom/bottom.go Outdated Show resolved Hide resolved
@nihaals
Copy link
Member

nihaals commented Mar 26, 2021

Based on GitHub Actions benchmarks, BenchmarkDecodeFrom is the main benchmark affected, the rest looks like random variance.

It would be nice to have some tests for invalid inputs, maybe even checking the position (although that might be overkill).

@diamondburned
Copy link
Contributor Author

maybe even checking the position

type InvalidRuneError rune can be feasibly changed to something else to add this, for example

type InvalidRuneError struct {
    Rune     rune
    Position int
}

@diamondburned diamondburned requested a review from nihaals March 26, 2021 20:49
@nihaals
Copy link
Member

nihaals commented Mar 26, 2021

type InvalidRuneError rune can be feasibly changed to something else to add this, for example

type InvalidRuneError struct {
    Rune     rune
    Position int
}

I think this change would almost exclusively be used for tests, so I'm not too sure. I was originally thinking parsing the error string for the reported position and checking if it matches what is expected (hence me calling it overkill).

Copy link
Member

@nihaals nihaals left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should add the position to InvalidRuneError and add tests for invalid inputs, also checking the position to ensure it matches.

@diamondburned
Copy link
Contributor Author

add tests for invalid inputs

Do you have a set of reference invalid inputs?

@nihaals
Copy link
Member

nihaals commented Mar 27, 2021

add tests for invalid inputs

Do you have a set of reference invalid inputs?

A few examples:

  1. Decode: 💖✨✨✨,,,,👉👈test💖💖,👉👈❤️👉👈💖💖✨🥺👉👈💖💖✨🥺,👉👈 (non-bottom characters)
  2. Decode: 💖✨✨✨,,,,👉test👈💖💖,👉👈❤️👉👈💖💖✨🥺👉👈💖💖✨🥺,👉👈 (non-bottom characters)
  3. Decode: 💖✨✨✨,,,,👉💖👈💖💖,👉👈❤️👉👈💖💖✨🥺👉👈💖💖✨🥺,👉👈 (character in byte separator)
  4. Decode: 💖✨✨✨,,,,👉test👈💖💖,👉👈❤️👉👈💖💖✨🥺👉👈💖💖✨🥺,👉👈 (character in byte separator)
  5. Decode: 💖✨✨✨,,,,👉👈,💖💖👉👈❤️👉👈💖💖✨🥺👉👈💖💖✨🥺,👉👈 (characters not in order)
  6. Decode: 💖✨✨✨,,,,👉👈💖💖,👉👈❤️💖💖✨🥺👉👈💖💖✨🥺,👉👈 (missing byte separator after null byte)

@diamondburned diamondburned marked this pull request as draft March 27, 2021 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

0 byte
3 participants