Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

string.strip_prefix and string.strip_suffix implementation proposal #771

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

GiregL
Copy link

@GiregL GiregL commented Dec 17, 2024

Hi !

Here's a simple proposal about the implementation of string.strip_prefix and string.strip_suffix

This PR references issue #743 .

Feel free to comment.

@joshi-monster
Copy link
Contributor

joshi-monster commented Dec 17, 2024

I'd really like having these functions in the standard lib, thank you!!

Unfortunately, your implementation is not correct:

strip_prefix("👩‍👩‍👧‍👦", prefix: "👩")
// --> Ok("")

starts_with tests prefixes based on codepoints. If they match, you drop the same number of grapheme clusters, even if the grapheme clusters are not the same.

It is also slow - you loop over the strings multiple times, performing grapheme segmentation twice. Once you know that that a string has a prefix, you can immediately slice the source string based on its code unit length.

We don't usually bump the version in PRs; this is left to the maintainers. You should add your changes to the changelog though!

~ yoshi 💜

@GiregL
Copy link
Author

GiregL commented Dec 17, 2024

I changed the implementation to a JS side and Erlang side implementations instead of Gleam.
I based them on starts_with and ends_with implementations. It should improve performances as well as fixing the code points issues.

I didn't know where to put my changes in the changelog other than in a new version, I let you decide what to do with the version number.

Thanks for your feedback !

Copy link
Contributor

@joshi-monster joshi-monster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you ~ 💜

Looks good to me now; I left some comments inline! Also please keep in mind that the wonderful @lpil is the maintainer not me, I'm just a random person on the internet with ~opinions~ ^^

src/gleam/string.gleam Outdated Show resolved Hide resolved
test/gleam/string_test.gleam Outdated Show resolved Hide resolved
src/gleam_stdlib.erl Outdated Show resolved Hide resolved
src/gleam/string.gleam Show resolved Hide resolved
Copy link
Member

@lpil lpil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello! Thank you for this, it will be super useful.

How come the implementation is in Erlang and JavaScript? Given their complexity and the algorithm being the same across both it should be in Gleam unless there's some particular reason to do otherwise.

The new implementation of starts_with and ends_with is slower due to doing wasted work, please revert that change.

You'll need to rebase to remove the merge commit, they are not permitted in Gleam repos.

Thanks again!!

string.strip_suffix("👨‍👩‍👧‍👦", suffix: "👦")
|> should.equal(
Ok(
string.from_utf_codepoints([
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come this isn't a string literal?

}

return new Error(undefined)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Files have to have a newline at the end to be valid!

@@ -569,4 +577,4 @@ slice(String, Index, Length) ->
case string:slice(String, Index, Length) of
X when is_binary(X) -> X;
X when is_list(X) -> unicode:characters_to_binary(X)
end.
end.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Files have to have a newline at the end to be valid!

string_strip_prefix(String, <<>>) when is_binary(String) -> {ok, String};
string_strip_prefix(String, _) when is_binary(String), String == <<>> -> {error, nil};
string_strip_prefix(String, Prefix) when is_binary(String), is_binary(Prefix), byte_size(Prefix) > byte_size(String) -> {error, nil};
string_strip_prefix(String, Prefix) when is_binary(String), is_binary(Prefix) ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove all the unneeded guards please 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants