Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rename_all = "camelCase" option to automatically rename methods and functions #4215

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

RunDevelopment
Copy link
Contributor

@RunDevelopment RunDevelopment commented Oct 20, 2024

fixes #1818

This adds support for rename_all = "camelCase" to camelify names inside a block.

Under the hood, I added support for a more general renaming system mimicking serde's renaming system. If we want to camelify methods by default in the next breaking change, we just have to change the default values for those renaming rules.

Examples:

#[wasm_bindgen(rename_all = "camelCase")]
extern "C" {
    #[wasm_bindgen(js_name = foo_bar)] // explicit js name
    fn foo_bar();
    fn qux_corge(); // js_name = quxCorge
    #[wasm_bindgen(js_namespace = Baz)]
    fn yes_no(); // js_name = yesNo
}


#[wasm_bindgen(rename_all = "camelCase")]
pub struct RustStruct {
    pub foo: u32, // js_name = foo
    pub some_cool_field: u32, // js_name = someCoolField
    #[wasm_bindgen(js_name = "another_field_for_you")]
    pub another_field: u32,
}

#[wasm_bindgen] // renaming is opt in, so nothing changes for existing code
impl RustStruct {
    pub fn i_dont_get_renamed() {}
}

#[wasm_bindgen(rename_all = "camelCase")]
impl RustStruct {
    // js_name = incrementFoo
    pub fn increment_foo(&mut self, amount: Option<u32>) {
        self.foo += amount.unwrap_or(1);
    }

    // js_name = setFoo
    pub fn set_foo(&mut self, foo: u32) {
        self.foo = foo;
    }

    #[wasm_bindgen(js_name = get_another)] // explicit js name
    pub fn another(&self) -> u32 {
        self.another_field
    }
}

Open questions

I mostly don't like the name rename_all, because it's not clear that it:

  1. Only affects methods/functions and fields,
  2. Don't affect explicitly given js_names.

Copy link
Collaborator

@daxpedda daxpedda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mostly don't like the name rename_all, but it's not clear that it:

  1. Only affects methods/functions and fields,
  2. Don't affect explicitly given js_names.

I think your current implementation sounds good to me: affects methods, static methods and fields and explicit js_names aren't overwritten.

camelify_by_default sounds a bit too funny to me. How about default_rename_camel_case? Kinda terrible as well, let me know if you have any better ideas.

crates/macro-support/src/lib.rs Outdated Show resolved Hide resolved
crates/macro-support/src/lib.rs Outdated Show resolved Hide resolved
@daxpedda daxpedda added the waiting for author Waiting for author to respond label Oct 31, 2024
@RunDevelopment
Copy link
Contributor Author

camelify_by_default sounds a bit too funny to me. How about default_rename_camel_case? Kinda terrible as well, let me know if you have any better ideas.

This comment suggests auto_camel_case, which I kinda like.

Also, maybe we could use something like experimental_auto_camel_case? Right now, this feature feels really underspecified to me, so I think it would be good to get some actual user feedback on it before we commit to a specific behavior. Especially since changing the behavior would be a breaking change.

What do you think?


On a different note: static properties. Static properties and readonly variables sometimes use SCREAMING_SNAKE_CASE similar Rust const and static variables. We should probably detect that, right?

In general, the renaming assumes snake_case, but doesn't verify it. Should we only rename names that we can verify to be snake_case?

@daxpedda
Copy link
Collaborator

daxpedda commented Nov 7, 2024

On a different note: static properties. Static properties and readonly variables sometimes use SCREAMING_SNAKE_CASE similar Rust const and static variables. We should probably detect that, right?

In general, the renaming assumes snake_case, but doesn't verify it. Should we only rename names that we can verify to be snake_case?

Hm, good point. In which case wouldn't we be looking at something like auto_jsify doing it all in one go instead of having multiple ones? Which in the future would be enabled by default.

@RunDevelopment
Copy link
Contributor Author

RunDevelopment commented Nov 8, 2024

In which case wouldn't we be looking at something like auto_jsify doing it all in one go instead of having multiple ones?

Not really. My point was the JS devs sometimes use camelCase and sometimes SCREAMING_SNAKE_CASE for constants/static variables/getters, so we probably shouldn't make everything camelCase.

Thinking about this, JS devs generally don't use lower_snake_case for anything. So how about we make the behavior of auto_camel_case (bike shed pending) that it converts all lower_snake_case names to camelCase, regardless of whether it's a method, field, or something else? This means that there is only one simple rule that people have to learn and it applies equally everywhere. This makes the behavior predictable and consistent, which is pretty important for something that might be on by default in the future.

What do you think?

🐍💼➡️🐪💼

@daxpedda
Copy link
Collaborator

daxpedda commented Nov 9, 2024

In which case wouldn't we be looking at something like auto_jsify doing it all in one go instead of having multiple ones?

Not really. My point was the JS devs sometimes use camelCase and sometimes SCREAMING_SNAKE_CASE for constants/static variables/getters, so we probably shouldn't make everything camelCase.

Is there no standard? If the goal is to do this automatically in the next breaking change, what are we going to convert it to then?

Thinking about this, JS devs generally don't use lower_snake_case for anything. So how about we make the behavior of auto_camel_case (bike shed pending) that it converts all lower_snake_case names to camelCase, regardless of whether it's a method, field, or something else? This means that there is only one simple rule that people have to learn and it applies equally everywhere. This makes the behavior predictable and consistent, which is pretty important for something that might be on by default in the future.

What do you think?

Basically boils down to the question above. We should figure out what we want to do automatically in the future and implement this here. But yes, I agree we should "fix" all names, not just methods.

But I think we should do it regardless of what the user names their items, explicit names should be set with the js_name attribute.

@RunDevelopment
Copy link
Contributor Author

Is there no standard?

No. In JS, the best we have are common naming conventions. JS itself uses SCREAMING_SNAKE_CASE for constants (e.g. Math.PI and Number.MAX_SAFE_INTEGER), but camelCase for constants in common in the wider ecosystem as well. But aside from constants, almost everyone uses camelCase for function/methds/variables and PascalCase for classes/interfaces/types.

This means that names that are already in PascalCase, camelCase, or SCREAMING_SNAKE_CASE, are most likely correct. So we only really have to rename snake_case names, because snake_case is most likely "wrong".

But I think we should do it regardless of what the user names their items, explicit names should be set with the js_name attribute.

I don't think that what you're saying is wrong, but I think it might not be practical - for us and our users.

If you want to convert more than just snake_case names, then you have to be able to detect and convert different naming conversions. That's not easy, because there are many many edge cases (e.g. a constant called MATRIX_4x3). Once set, we can't really change the behavior of renaming anymore. Any change to renaming, no matter how minor, will a breaking change. And the more complex cases renaming has to handle, the higher the chance for bugs and edge cases to sneak in.

For users, it's also less practical. Renaming is essentially guessing the correct name for the user. If we guess wrong, the user has to fix it with js_name. I think that the way we guess should be simple and predictable for the user, so that they can rely on it. I don't want the renaming behavior to be so complex that users essentially just have to "try and find out" what WBG renames more difficult names to.

@daxpedda
Copy link
Collaborator

daxpedda commented Nov 11, 2024

No. In JS, the best we have are common naming conventions.

Hm, that's bad.

This means that names that are already in PascalCase, camelCase, or SCREAMING_SNAKE_CASE, are most likely correct. So we only really have to rename snake_case names, because snake_case is most likely "wrong".

But I think we should do it regardless of what the user names their items, explicit names should be set with the js_name attribute.

I don't think that what you're saying is wrong, but I think it might not be practical - for us and our users.

If you want to convert more than just snake_case names, then you have to be able to detect and convert different naming conversions. That's not easy, because there are many many edge cases (e.g. a constant called MATRIX_4x3). Once set, we can't really change the behavior of renaming anymore. Any change to renaming, no matter how minor, will a breaking change. And the more complex cases renaming has to handle, the higher the chance for bugs and edge cases to sneak in.

For users, it's also less practical. Renaming is essentially guessing the correct name for the user. If we guess wrong, the user has to fix it with js_name. I think that the way we guess should be simple and predictable for the user, so that they can rely on it. I don't want the renaming behavior to be so complex that users essentially just have to "try and find out" what WBG renames more difficult names to.

I don't think I'm following here. It seems much more straightforward to me that we simply rename everything regardless, instead of making any exceptions in specific cases that users have to understand in the first place. No guessing required. It seems more predictable to me.

I'm also not concerned about the practicality, as we should generally discourage users to name anything in Rust not following the Rust naming conventions. E.g. if users want their static properties to be PascalCase, they should still name them in Rust in SCREAMING_SNAKE_CASE, following the Rust naming conventions, but explicitly define how they are differently named in JS via the js_name attribute.

I'm also unsure about edge-cases, I feel like anything should be possible to correctly convert to camelCase, whats the problem with MATRIX_4x3?

Please let me know if and why you don't agree with my point of view here!

@RunDevelopment
Copy link
Contributor Author

No. In JS, the best we have are common naming conventions.

Hm, that's bad.

Yep. But if we take SCREAMING_SNAKE_CASE for static constants, naming isn't too different from Rust:

Rust JavaScript/TypeScript
variables/parameters snake_case camelCase
classes/enums/types PascalCase PascalCase
enum variants PascalCase n/a (or PascalCase if implemented like tsify)
instance/static methods snake_case camelCase
instance fields/properties snake_case camelCase
class static fields/properties n/a camelCase
(static) constants SREAMKING_SNAKE_CASE SREAMKING_SNAKE_CASE

I don't think I'm following here. It seems much more straightforward to me that we simply rename everything regardless, instead of making any exceptions in specific cases that users have to understand in the first place. No guessing required. It seems more predictable to me.

My point was that to convert everything to one naming convention, you first need to detect the casing of the identifier. This is why I said that MATRIX_4x3 is an edge case, but it's neither snake_case nor SCREAMING_SNAKE_CASE strictly speaking.

But that's not really a problem IF we assume Rust naming conventions. I just didn't want to make that assumption, because there are a lot of WBG users that use camelCase for methods in Rust right now due to the lack of automatic renaming. Even if we discourage not following Rust naming conventions, I think we should still support it to some degree for compatibility with existing camelCase identifiers.

That said, it feels like we're not 100% talking about the same thing here. I'm not quite sure what you mean by "rename everything regardless". As you can see in the above table, we only need to rename snake_case identifiers, so what's everything here?

@daxpedda
Copy link
Collaborator

No. In JS, the best we have are common naming conventions.

Hm, that's bad.

Yep. But if we take SCREAMING_SNAKE_CASE for static constants, naming isn't too different from Rust:

Sure, lets just go with that. However, we might want to consider not converting anything that matches naming conventions. E.g. Struct and enum types should be pascal case in both languages, so if the user doesn't use pascal case in Rust, we can take that as being set explicit. So no conversion necessary. The same would apply to static constants.

That might just resolve your concern of practicality here.

My point was that to convert everything to one naming convention, you first need to detect the casing of the identifier. This is why I said that MATRIX_4x3 is an edge case, but it's neither snake_case nor SCREAMING_SNAKE_CASE strictly speaking.

We only need to detect the casing and spacing, not necessarily which convention its following. I think its pretty clear how MATRIX_4x3 will be converted to lets say SCREAMING_SNAKE_CASE: MATRIX_4X3.

But that's not really a problem IF we assume Rust naming conventions. I just didn't want to make that assumption, because there are a lot of WBG users that use camelCase for methods in Rust right now due to the lack of automatic renaming. Even if we discourage not following Rust naming conventions, I think we should still support it to some degree for compatibility with existing camelCase identifiers.

Compatibility with current code is not a concern, right? Users still need to explicitly set the attribute.

That said, it feels like we're not 100% talking about the same thing here. I'm not quite sure what you mean by "rename everything regardless". As you can see in the above table, we only need to rename snake_case identifiers, so what's everything here?

I was explicitly responding to this:

This means that names that are already in PascalCase, camelCase, or SCREAMING_SNAKE_CASE, are most likely correct. So we only really have to rename snake_case names, because snake_case is most likely "wrong".

So regardless of what naming convention a user has used, we should convert every property and method to the naming convention according to the table you set above.

Though keep in mind my suggestion at the start of the post, where we could just not do any conversion for static constants, as they should be SCREAMING_SNAKE_CASE for both languages. In which case if a user is using PascalCase for their static constants, we won't convert it to SCREAMING_SNAKE_CASE. They should get a Rust warning though.

@RunDevelopment
Copy link
Contributor Author

we might want to consider not converting anything that matches naming conventions. E.g. Struct and enum types should be pascal case in both languages, so if the user doesn't use pascal case in Rust, we can take that as being set explicit. So no conversion necessary.

I agree. This is also what I wanted to do. I also agree that users not following Rust naming conventions is a strong signal that they are doing it on purpose.

So to summarize, the renaming strategy for identifiers will be:

  1. If the identifier conforms to the Rust naming convention and JS/TS uses a different naming convention (as define in the previous table), convert the Rust identifier to the JS/TS naming convention.
  2. Otherwise, use the Rust identifier as is in JS.

This is what you mean, right? If so, then I 100% agree with it.

On a related note: should we rename function parameters? They fulfill the criteria, but parameter names aren't really observable in JS (kinda, you can observe them with hacks, but you must not rely on them). Renaming them might also cause problems with user-defined JSDoc @param tags.

We may also want to change the attribute from auto_camel_case to something like auto_rename. It only "camelCases" right now, but I don't want to exclude the possibility of having different conversions in the future. E.g. we may want to generate getters for public associated constants on structs/enums in the future.

In which case if a user is using PascalCase for their static constants, we won't convert it to SCREAMING_SNAKE_CASE. They should get a Rust warning though.

Clippy already emits a warning for not following Rust naming conventions. So we don't have to do anything, I think.

@daxpedda
Copy link
Collaborator

daxpedda commented Nov 12, 2024

we might want to consider not converting anything that matches naming conventions. E.g. Struct and enum types should be pascal case in both languages, so if the user doesn't use pascal case in Rust, we can take that as being set explicit. So no conversion necessary.

I agree. This is also what I wanted to do. I also agree that users not following Rust naming conventions is a strong signal that they are doing it on purpose.

So to summarize, the renaming strategy for identifiers will be:

  1. If the identifier conforms to the Rust naming convention and JS/TS uses a different naming convention (as define in the previous table), convert the Rust identifier to the JS/TS naming convention.
  2. Otherwise, use the Rust identifier as is in JS.

This is what you mean, right? If so, then I 100% agree with it.

Yes, I see now that there was a misunderstanding on my part before. I was somehow assuming that we need to figure out which naming convention e.g. enum types are to check if they follow the Rust convention. But because there is nothing for us to do if they don't, we don't actually need to that.

On a related note: should we rename function parameters? They fulfill the criteria, but parameter names aren't really observable in JS (kinda, you can observe them with hacks, but you must not rely on them). Renaming them might also cause problems with user-defined JSDoc @param tags.

Yes. Lets move the user-defined JSDoc conversation to real-time because this needs a bit of a longer back and forth.

We may also want to change the attribute from auto_camel_case to something like auto_rename. It only "camelCases" right now, but I don't want to exclude the possibility of having different conversions in the future. E.g. we may want to generate getters for public associated constants on structs/enums in the future.

Yes ... finally a good name.

In which case if a user is using PascalCase for their static constants, we won't convert it to SCREAMING_SNAKE_CASE. They should get a Rust warning though.

Clippy already emits a warning for not following Rust naming conventions. So we don't have to do anything, I think.

Yeah, we just need to check via UI tests, but I know this is hard to get right, so its not a requirement for a PR.

@RunDevelopment
Copy link
Contributor Author

What's left to do:

  • Docs: Document the feature. I started by adding import/export docs for the auto_rename attribute, but I want another page explaining the whole thing.
  • UI: Right now, unused auto_rename attribute don't result in compiler errors.
  • Naming conventions UI: I don't think the UI tests we currently have capture clippy warnings, so checking that identifiers not following Rust naming conventions may be a bit difficult.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting for author Waiting for author to respond
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatically camelify names
2 participants