Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default values of to_numeric() #544

Closed
strengejacke opened this issue Oct 1, 2024 · 7 comments · Fixed by #547
Closed

Default values of to_numeric() #544

strengejacke opened this issue Oct 1, 2024 · 7 comments · Fixed by #547
Assignees

Comments

@strengejacke
Copy link
Member

Regarding easystats/easystats#434, I was thinking what would be a good default for the dummy_factors argument in to_numeric()? Currently, the default is TRUE, splitting one variable into multiple ones.

Personally, I think this - as default - is a bit odd and I would expect a "numeric" variant of the factor, what dummy_factors = FALSE returns by default.

WDYT?

@etiennebacher
Copy link
Member

I don't have a strong opinion on this (especially since I don't use this function), but since the next release is 0.13.0 and already contains a breaking change, it might be a good time to change this behavior too.

@etiennebacher
Copy link
Member

Btw I lost track a bit of whether we had a discussion on this but is there any reason for having coerce_to_numeric() and to_numeric() instead of to_numeric() only?

@strengejacke
Copy link
Member Author

Not sure about the current state, but we had:
#206 (comment)

I think the functions behave slightly different, and we needed both, at some point decided to make it (or keep it) exported. The major difference is that coerce_to_numeric() only returns numeric values when character values or factor levels were "numeric" before. Else, returns in the input as is.

@strengejacke
Copy link
Member Author

See

datawizard::coerce_to_numeric(c("1", "2", "3"))
#> [1] 1 2 3
datawizard::to_numeric(c("1", "2", "3"))
#> [1] 1 2 3

datawizard::coerce_to_numeric(c("1", "2", "A"))
#> [1] "1" "2" "A"
datawizard::to_numeric(factor(c("1", "2", "A")))
#>   1 2 A
#> 1 1 0 0
#> 2 0 1 0
#> 3 0 0 1
datawizard::to_numeric(factor(c("1", "2", "A")), dummy_factors = FALSE)
#> [1] 1 2 3

Created on 2024-10-02 with reprex v2.1.1

@strengejacke
Copy link
Member Author

(in the above example you may see why dummy_factors = TRUE is a strange default)

@strengejacke
Copy link
Member Author

I don't have a strong opinion on this (especially since I don't use this function), but since the next release is 0.13.0 and already contains a breaking change, it might be a good time to change this behavior too.

Let me know, I can make the changes and commit them to #546.

@strengejacke strengejacke self-assigned this Oct 2, 2024
strengejacke added a commit that referenced this issue Oct 2, 2024
@etiennebacher
Copy link
Member

etiennebacher commented Oct 2, 2024

Let's keep the distinction between to_numeric() and coerce_to_numeric() then, and move the default to dummy_factors = FALSE. Can you make this in a separate PR? I'll push other changes to #546 and I'd like to avoid conflicts there

Edit: my bad, didn't see #547

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants