-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lack of consistency + lack of awareness re. cultural processing of strings #311
Comments
👍 |
Hi @andrew-webb, I agree this is a significant problem - we need a clear position on this issue and we need that to be mapped through to fixes in the code. There is considerable discussion going on about quality guidelines for F# upstack components, we'd welcome your input, for example: http://fsharp.github.io/2014/09/19/fsharp-libraries.html. It is likely the profile of culture-specific processing needds to be raised more highly in that dicsussion. Please help us do that. Cheers |
Hi Don, I'll be happy to help. After spotting this problem in FSharpx, I took a good look at the source of FSharp.Data and was pleased to see that it was far less of a problem there. Things are not perfect in FSharp.Data in that cultural intent isn't always specified explicitly, but I couldn't spot any real causes for concern. Perhaps after new guidelines are published there could be a cleanup of FSharp.Data to bring it into total alignment, but at this time I don't feel moved to create an issue about it. All this brings back fond memories of 9 years ago when Dave Fetterman of the BCL Team wrote a paper entitled Strings in .NET 2.0. I got a kick out of finding that page again today, and seeing my enthusiastic comment from June 2005. Don't bother clicking on the link at the top to the (Word) document, because it's long gone. However, it lives on in this page:- Best Practices for Using Strings in the .NET Framework. Just about every word that Dave wrote in 2005 still applies today. Needless to say, I consider it essential reading for every .NET developer. You will see that Best Practices article referenced from the MSDN pages of method overloads that should be avoided, e.g. String.EndsWith Method (String). Scroll to Notes to Callers in the Remarks section. However, I don't blame developers for not noticing, because it isn't very obvious. What got me to notice was passing all code through FxCop / Code Analysis, and investigating what the cryptic message "use an overload that takes an IFormatProvider arg" was all about! Being relatively new to F#, I'm not sure if FxCop / Code Analysis applies to F# code. And if not, what should be used instead? But it's the next thing on my list to check out. I'll see if I can come up with some guidance that will help F# developers specifically. But the main part of my guidance will be: read, mark, learn and inwardly digest the guidance that Dave Fetterman came up with 9 years ago and which still applies to this day. Oh, and Michael Kaplan always had interesting things to say in Sorting it all Out. Here's just one little gem. Yours, |
Can you please create an issue there with the problems you found? We take care to always either explicitly pass InvariantCulture or use a parameter, if there's still any place using the .Net default of using the current culture it's a bug |
Happy to do it. Here's the issue:- fsprojects/FSharp.Data#742 |
Thanks @andrew-webb, this is a very useful contribution. We'd be grateful if you could submit a PR to https://github.com/fsharp/fsharp.github.io/blob/master/_posts/2014-09-19-fsharp-libraries.md to add this as a specific recommendation. We can discuss the exact recommendation on the PR thread. Note that FSharp.Core and F# always uses Ordinal comparison by default for "compare", "=", "<>", "<", ">", Seq.sort, Seq.sortBy, HashIdentity.Structural, ComparisonIdentity.Structural and so on. This was a decision made in F# 1.0-2.0. Upstack F#-specific libraries should follow this design pattern, and explicitly call it out in documentation where necessary. If you could include this information in the guidance PR that would be great. Cheers! |
Done. Andrew |
Looking at the code base as a whole, there seems to be a strong lack of consistency when it comes to dealing with cultural variation wrt the processing of strings. E.g. some people use s.ToUpper(), while others, perhaps knowing better, use s.ToUpperInvariant().
Some people use #if FX_NO_CULTURE_INFO_ARGS to determine whether to specify culture info and/or StringComparison args to standard .NET methods, while most others don't.
Also, there seems to be a complete lack of awareness of the part of many contributors about the whole issue of a) cultural vs non-cultural conversion of strings, and b) current culture vs invariant culture vs ordinal (with/without ignore case) comparison of strings.
As it stands, FSharpx is a library I would be fearful to use. Looking at the code, I know that I cannot predict the outcome of many operations when run on other-culture (e.g. non-English) machines.
It has been drummed into me over the last 10+ years the critical importance of always stating your cultural / ordinal intent when it comes to string conversion and comparison.
I would suggest a cleanup throughout the code, along these lines...
The text was updated successfully, but these errors were encountered: