Standard treatment of null values · dotnet/fsharp · Discussion #14005

gusty
Oct 1, 2022

I wanted to start the discussion on "How F# functions are expected to deal with null values?"

It turns out that in FSharp.Core the null handling for String functions is completely different and I would say completely opposite to what it does with say Arrays.

It's important to note that both types have null as a proper value, I mean

let x:'T = null compiles fine when 'T is a string or an array, whereas it fails when 'T is something like type MyType = MyType with an error FS0043: The type 'MyType' does not have 'null' as a proper value message.

With Arrays, nulls are not allowed at all, functions throw an exception if one of the arguments is null, even in cases where it could be saved, like:

> Array.contains 1 null ;;
System.ArgumentNullException: Value cannot be null. (Parameter 'array')
 at <StartupCode$FSI_0051>.$FSI_0051.main@() in C:\Users\gmpl2\AppData\Local\Temp\stdin:line 14
Stopped due to error

Which could have resulted in false, as it's clear that null doesn't contain 1.
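A minimal sketch of the two options for arrays. The fail-fast behavior is what the FSI session above shows; `containsOrFalse` is a hypothetical helper, not part of FSharp.Core, that shows what the lenient alternative could look like.

```fsharp
open System

let nullArray: int[] = null

// Current FSharp.Core behavior: Array.contains throws on a null array.
let failedFast =
    try
        Array.contains 1 nullArray |> ignore
        false
    with :? ArgumentNullException -> true

// Hypothetical lenient variant (an assumption for illustration only):
// a null array is treated as containing nothing.
let containsOrFalse value (array: _[]) =
    not (isNull array) && Array.contains value array

printfn "fail-fast threw: %b, lenient result: %b" failedFast (containsOrFalse 1 nullArray)
```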

With Strings, what FSharp.Core does instead of checking nulls is it converts them to empty strings, which IMHO is going too far, sitting on the other extreme, and let me tell you why I think so:
- If we have a set of string functions that convert nulls to empty string, the user might eventually rely on some string conversions, say toUpper as it would take care of nulls, but suddenly the business requirement to convert those strings to Uppercase is removed. Months later they find out some null pointer exceptions in some specific cases in production, which generates a lot of research.
- Some companies may have a business rule which gives a null string a meaning, for instance "this value was not supplied at all", which differs from an empty string. Whether this is a good idea or not is a subject of debate, but the fact is that using those functions would break their logic.
- Apart from those points, stuff like toUpper null = "" simply feels wrong to me.
- Finally, the function toObj suggests that an isomorphism may arise between options and nullable types. We can actually do

> let s: string = None |> Option.toObj ;;
val s: string = <null>

and then the question is, if we're fighting back nulls, by silently converting them to empty strings, why do we generate them here?
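To make the contrast concrete, here is a small sketch of what this looks like in practice. It only uses `String.length`, `String.map`, `Option.toObj` and `Option.ofObj` from FSharp.Core; the value names are made up for the example.

```fsharp
let nullString: string = null

// The String module treats a null input like an empty string.
let len = String.length nullString
let mapped = String.map (fun c -> System.Char.ToUpper c) nullString

// Option.toObj produces a null, and Option.ofObj turns it back into None.
let fromNone: string = Option.toObj None
let backToOption = Option.ofObj fromNone

printfn "length: %d, mapped is empty: %b, round trip: %A" len (mapped = "") backToOption
```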

Note: toUpper is not in FSharp.Core, but this discussion is broader than FSharp.Core; the idea is to define how such a function should behave in any F# library.

My feeling is that this was done at early stages without thinking too much. Probably having nullAsEmpty as a separate function would have been a better idea, so users can freely combine it with the rest of the functions.
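A rough sketch of that idea. Only the name `nullAsEmpty` comes from the post; its body, `toUpper` and `toUpperLenient` are assumptions written for illustration.

```fsharp
// Hypothetical helper: the null-to-empty conversion lives in one explicit place.
let nullAsEmpty (s: string) : string = if isNull s then "" else s

// Hypothetical library function with no hidden null handling.
let toUpper (s: string) = s.ToUpperInvariant()

// Callers who want the lenient behavior opt in by composing the two.
let toUpperLenient = nullAsEmpty >> toUpper

printfn "%s" (toUpperLenient "abc")
```

With this split the null handling is intentional and visible at the call site, and dropping `toUpper` from a pipeline does not silently drop the null handling.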

Both behaviors can be surprising to users, but especially the one with strings. We can't make breaking changes to FSharp.Core, but we can still decide what the ideal criteria are, apply them to most F# library functions and eventually to new functions in FSharp.Core, and add them to the Design Guidelines, which, by the way, currently only mention null checks when interoperating with other languages.

@dsyme what do you think should be the standard expectation when designing F# libraries?

Replies: 1 comment 5 replies

dsyme
Oct 27, 2022
Collaborator

@gusty Yes, it's an important question that was largely glossed over in F# 2.0. I believe what's there is under test in FSharp.Core.UnitTests, so will be hard to change either way.

Practically speaking, most people consider the string and _ array types to be non-nullable in F# and use them as if they are. When nullable-reference-type support finally lands, it is likely we'll make the FSharp.Core signature of any function that throws an exception on a null input "non-nullable" (that is, generate a warning if used with something nullable). That's not part of the nullable-reference-type RFC, but we should consider the issue.

I think the normative guidance would thus be to fail fast on null inputs, with the String functions being an exception for legacy reasons. That is we would not encourage attributing any semantic meaning to "null".
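As a sketch of what "fail fast on null inputs" could look like in a library function: `trim` below is a hypothetical function, not an FSharp.Core API. It uses the FSharp.Core `nullArg` helper, which raises `ArgumentNullException`.

```fsharp
open System

// Hypothetical library function following the fail-fast guidance:
// reject null at the boundary with a clear argument exception.
let trim (s: string) =
    if isNull s then nullArg "s"
    s.Trim()

let rejected =
    try
        trim null |> ignore
        false
    with :? ArgumentNullException as ex -> ex.ParamName = "s"

printfn "trimmed: '%s', null rejected: %b" (trim "  hi  ") rejected
```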

Do you have a specific proposal on a mod to the design-guidelines to capture this better?

5 replies

@gusty

gusty Oct 27, 2022
Author

In short my proposal would be to fail fast on both string and array.

Regarding the existing functions on String module, I would leave them as they are, to avoid unnecessary breaking changes, maybe add a comment in the xml docs clarifying that specific behavior.

But definitely for new string functions I would recommend to apply fail-fast, and that would be the recommendation for libraries as well.

I just had a look, and keeping the existing functions as they are is not a big problem, as IMHO they don't appear frequently in code bases, except for String.length and, in 2nd place, String.concat. The former doesn't return nulls; the latter does, and in practical terms it would be the only one we have to be aware of (the rest of the string functions: map, iter, mapi, etc. are rarely used). I would say another commonly used operation is indexing with .[0], which already fails with nulls, probably because it involves an instance.

I think there's potential to add more functions to the String module, and it's not too late to decide to write them with fail-fast from now on. This guideline recommendation will also apply to F# libraries, which contain more string functions and which do tend to appear more in code bases, like toUpper, .trim, etc.

On the other hand, I think recommending this null-to-empty auto-conversion behavior is bad, because it will encourage incidental null checks as opposed to intentional ones. An example is what I said before:

If we have a set of string functions that convert nulls to empty string, the user might eventually rely on some string conversions, say toUpper as it would take care of nulls, but suddenly the business requirement to convert those strings to Uppercase is removed. Months later they find out some null pointer exceptions in some specific cases in production, which generates a lot of research.

and this is because it violates the principle of single responsibility: the function does 2 unrelated things, and in this case especially the 2nd thing it does is something the average F# developer is not aware of. For instance, after more than 10 years of writing F# I wasn't aware of it, so it was really surprising to me at least.
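The scenario can be sketched like this. Everything here is hypothetical: `toUpper` stands for a library function written in the null-to-empty style, and `lengthV1` / `lengthV2` stand for the user code before and after the uppercase requirement is removed.

```fsharp
// Hypothetical null-tolerant toUpper, written in the null-to-empty style.
let toUpper (s: string) = if isNull s then "" else s.ToUpperInvariant()

// Version 1: null input happens to work, but only because toUpper is in the way.
let lengthV1 (name: string) = (toUpper name).Length

// Version 2: the uppercase requirement is dropped, and the incidental
// null handling disappears with it.
let lengthV2 (name: string) = name.Length

let v2Throws =
    try
        lengthV2 null |> ignore
        false
    with :? System.NullReferenceException -> true

printfn "v1 on null: %d, v2 throws on null: %b" (lengthV1 null) v2Throws
```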

@abelbraaksma

abelbraaksma Dec 15, 2022
Collaborator

But definitely for new string functions I would recommend to apply fail-fast

My stance would be: stick to the current approach for strings. If people want to attach semantic meaning on null: string, they can do so by using Option.ofObj, which allows them to treat the string in an F# idiomatic way.
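For example, a caller who needs to distinguish "not supplied" from "empty" could do something like the following. `describe` is a made-up function for illustration; `Option.ofObj` is the FSharp.Core function mentioned above.

```fsharp
// Hypothetical caller code: null gets its meaning explicitly via Option.ofObj,
// instead of relying on how individual string functions treat null.
let describe (s: string) =
    match Option.ofObj s with
    | None -> "not supplied"
    | Some "" -> "supplied but empty"
    | Some value -> sprintf "supplied: %s" value

printfn "%s / %s / %s" (describe null) (describe "") (describe "abc")
```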

Once we start appending new functions (there actually are some large PRs that I think have since been closed, and approved-in-principle extensions to the surface area of string) which have a different behavior from the existing ones, it would only add to the confusion.

Currently, null-behavior is clearly not consistent. But any agreement on a way forward should take the common denominator of the current approaches and try to unify them as best as possible.

My take would be:

- string actions remain the same: F# Core functions should not NRE on them, they are safe
- we don't attach semantic meaning for null in collections, these remain an error (even though a valid argument can be made to consider null an empty collection)

My personal preference would be to have a version of F# that never raises an NRE. This would mean considering null a proper value for arrays, lists, records (if forced), DUs, enumerations. Basically everything that F# Core supports out of the box.

It's simple enough, Seq.concat with null seems totally fine to me, it's just empty. And Seq.map would just return an empty sequence when null.
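Purely as an illustration of that preference (this is not how FSharp.Core behaves, and both helper names are hypothetical), a null-as-empty wrapper for sequences might look like:

```fsharp
// Hypothetical helper: treat a null sequence as an empty one.
let emptyIfNull (source: seq<'T>) : seq<'T> =
    if isNull source then Seq.empty else source

// Hypothetical null-tolerant map built on top of Seq.map.
let mapNullSafe mapping source =
    source |> emptyIfNull |> Seq.map mapping

let nullSeq: seq<int> = null
printfn "%A" (mapNullSafe (fun x -> x + 1) nullSeq |> Seq.toList)
```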

Anyway, that ship has long sailed and just like any other language out there, we suffer the unfortunate truth of having to deal with the Billion Dollar Bug from the 1960s. Thanks Tony Hoare and ALGOL W! Quote:

"My goal was to ensure that all use of references should be absolutely safe, with checking performed automatically by the compiler. But I couldn't resist the temptation to put in a null reference, simply because it was so easy to implement."
-- Tony Hoare

Almost 60 years later, this automatic checking is still hard to get.

@Xyncgas

Xyncgas Jan 21, 2023

My stance would be that null is a good thing, because we can point to null when we don't have the value but we want the variable. I understand F# doesn't like null and uses options instead. Yet in scenarios such as a global variable that will hold a value later by injection from another library or other code, I am relying on the users to assign the value to the variable before they use it, instead of having the language promise it won't be null because it can't be null. That way I don't have to write Some in the rest of my F# code, and I can assume the variable has a valid value, or that I am not the one to blame when it doesn't.

2 sides of the coin : if I care about the possibility of something being null or I don't
They are situational, therefore instead of making something extremely hard to use therefore making it harder for ourselves in those situations, we might reasonably deserve to use whatever without too much trouble so I say let's welcome null back and functions should stop throwing NRE that goes for C# too, if you receive null just return null it's that simple and you can throw NRE if you want but we shouldn't do that in the standard library. Algorithms should act like null is float.NaN, and handle the value instead of not handling the value just like how we can do floating point calculations with float.NaN without getting NRE. And finally if people don't want to use null they can not use it

@abelbraaksma

abelbraaksma Jan 21, 2023
Collaborator

welcome null back and functions should stop throwing NRE that goes for C# too, if you receive null just return null it's that simple and you can throw NRE if you want but we shouldn't do that in the standard library

It’d be nice if one could make F# null-safe, by not throwing NREs, but we have backwards compat to take care of, so that’s not possible. The issue at hand is what to do with new functions. If we introduce different semantics, it becomes harder to reason about code.

Another issue is that null will lead to NREs when accessing a member. That’s not F#, that’s the CLR. The simple rule of "just return null when it is given null" cannot be used all the time. What if the returned value is non-nullable, like a value type or a record, DU or list? Or any standard F# class for that matter.

Algorithms should act like null is float.NaN, and handle the value instead of not handling the value just like how we can do floating point calculations with float.NaN without getting NRE

While NaN has clear semantics (even though many don’t agree with it), these only apply to one domain: floating point. Decimals and integers don’t even have NaN, they just throw where floating point would not. With null, the domain is literally everything, except value types. It’s impossible to find a simple rule that fits all domains.
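A quick sketch of that difference between the numeric domains. The helper names are invented for the example; the behavior shown is standard .NET arithmetic.

```fsharp
open System

// Floating point: NaN and infinity propagate through calculations without exceptions.
let nanSum = nan + 1.0
let floatDiv = 1.0 / 0.0

// Integers and decimals have no NaN: division by zero throws instead.
let divideInt (a: int) (b: int) = a / b
let divideDecimal (a: decimal) (b: decimal) = a / b

let throwsDivideByZero (f: unit -> unit) =
    try
        f ()
        false
    with :? DivideByZeroException -> true

let intDivThrows = throwsDivideByZero (fun () -> divideInt 1 0 |> ignore)
let decimalDivThrows = throwsDivideByZero (fun () -> divideDecimal 1M 0M |> ignore)

printfn "NaN: %b, infinity: %b, int throws: %b, decimal throws: %b"
    (Double.IsNaN nanSum) (Double.IsPositiveInfinity floatDiv) intDivThrows decimalDivThrows
```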

I agree that individual algos can be written such that they treat null as a special value (for instance, as the String module does for string), and one could argue that a null collection is an empty one. But not all collections can be empty, and not all types are collections.

Anyway, while I like to entertain the idea, solving this issue that’s plagued software development for over half a century isn’t going to be so trivial to do, I’m afraid. If it were, it would already have been done (btw, the language Eiffel is null-free, but they have Nil and other notions in their OO paradigm, which come with their own challenges).

@Xyncgas

Xyncgas Jan 22, 2023

Hey I like talking with you about this and hearing from you it's so interesting.

I agree that NRE is thrown in the standard library for backward compatibility. Still, imagine a world where new functions all return nullable values, and operations performed on null act like the option computation expression in FsToolkit.ErrorHandling.

I am delighted with the fact that these situations exist and I can't pretend they don't, such as people wanting to use null values, and some things being historically represented with null. I believe, just like with everything, that with integrity and professionalism we will make it through, because life overcomes difficulties.
