Inconsistencies in definitions of bigint operations and Fraction operations #3374

Open
gwhitney opened this issue Jan 31, 2025 · 33 comments

@gwhitney
Collaborator

gwhitney commented Jan 31, 2025

Describe the bug
Many operations on integers, like division or square root or logarithm, do not necessarily produce integers. Similarly, many operations on rational numbers, like square root or logarithm, do not necessarily produce rational numbers. However, the approach that mathjs currently takes in determining whether to accept such arguments and what to return if it does varies widely both between the two different types and within the types. For some examples:

  • Division of a bigint by a bigint returns the floor of the exact answer when that is not an integer

  • Square root of a bigint returns the number/Complex (approximation, or exact) square root, even when the answer is an exact integer.

  • logarithm of a bigint follows the same scheme

  • Division of Fraction by Fraction is no issue, it's always a Fraction except 0/0 which throws (as it presumably will in any scheme)

  • Square root of a Fraction always throws

  • logarithm of a Fraction to a Fraction base returns a Fraction when the exact answer happens to be rational, and throws an error otherwise, meaning that you must use try/catch to use it unless you for some reason happen to know that the outcome will be rational (which seems like an unusual situation).

To Reproduce
There are many examples, such as:

math.log(math.fraction(3,2), math.fraction(9,4)) // returns 1/2
math.sqrt(math.fraction(9,4)) // throws error, even though there is a rational answer

math.sqrt(math.fraction(9,1)) // throws error
math.sqrt(9n) // returns **number** 3
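
For reference, the native JavaScript bigint behavior that the examples above build on can be checked without mathjs. This is only a sketch of the language's own operators, and the variable names are illustrative. Note that native division rounds toward zero, which coincides with the floor only when the exact quotient is nonnegative.

```javascript
// Native bigint division discards the fractional part (rounds toward zero).
const positiveQuotient = 7n / 2n   // 3n
const negativeQuotient = -7n / 2n  // -3n, whereas the floor of -3.5 is -4

// Mixing bigint and number in an arithmetic operator throws a TypeError.
let mixedError = null
try {
  const unused = 2n + 1
} catch (err) {
  mixedError = err
}

// Division by zero throws a RangeError: bigint has no NaN or Infinity.
let zeroError = null
try {
  const unused = 1n / 0n
} catch (err) {
  zeroError = err
}

console.log(positiveQuotient, negativeQuotient)
console.log(mixedError instanceof TypeError, zeroError instanceof RangeError)
```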

Discussion
With growing support of bigint, it seems important to adopt and make clear a consistent philosophy on how mathjs will handle the results of mathematical operations that go outside the domain that the inputs come from. Otherwise, it seems likely the problems illustrated above will become worse, leaving mathjs prone to producing inscrutable behavior.

Here are some possibilities:


Proposal (1): Take a page from the very early story of mathjs, sqrt, and number. There is no number that is the square root of a negative number such as -4. So mathjs has a config option predictable. When it is true, sqrt(-4) returns NaN -- an entity of type number that informs the user there was no appropriate number answer. When it is false (the default), mathjs is allowed to go to the best available type to represent the result, and so returns complex(0,2).

A slight hitch in the case of bigint and Fraction is that neither domain contains an analogue of NaN. So to truly remain within the type, when there is no suitable answer, our only alternative would be to throw. On the other hand, we might want to return null or NaN (choose one and use it always) even though it is not a bigint or Fraction, so that we are returning a sentinel value that can be checked for without try/catch, and which will presumably propagate into any further calculation, making the whole thing null or NaN to signal that somewhere within, something failed. So this option (1) splits into (1a) and (1b) depending on whether operations that cannot be satisfied throw an error, or whether they return null or NaN. To not have to repeat the below, I will just say the computation "barfs", and we would just need to pick one consistently (or allow it to be configured, perhaps by allowing additional values of the predictable option). (For the particular case of Fraction, it could be extended to allow the indeterminate ratio 0/0 as its own sort of NaN within its domain, as its particular kind of barfing.)

So proposal (1) would be:

  • When predictable is false (default), all operations strive to return the best result in the best available type whenever possible. When a result is irrational, that best type could potentially be number or bignumber, and perhaps the numberFallback configuration option should control the choice of which one. I will just say "floating point" below to be agnostic between number and bignumber. So for example, in this situation:

    • Dividing a bigint by another would produce a bigint when the quotient is an integer, and a Fraction otherwise

    • sqrt of a bigint would produce a bigint for perfect squares, a floating point for other positive numbers, and a Complex for negative bigints. (Note that no integer has a rational but non-integer square root, as a matter of mathematical fact.)

    • logarithm of a bigint would produce a bigint for perfect powers, a Fraction for rational powers, a floating point for other positive arguments, and a Complex otherwise.

    • Dividing a Fraction by a Fraction would always be a Fraction

    • sqrt of a rational square would produce a Fraction, a floating point for other positive numbers, and a Complex for negative Fractions.

    • logarithm of Fraction to a Fraction base would produce a Fraction when it happens to be a rational power; in all other cases or if either input is a floating point type, it would produce a floating point or Complex result if possible.

  • When predictable is true, all operations barf whenever the answer cannot be the same type (with perhaps the variant of barfing being configurable). In particular, in this situation

    • Dividing a bigint by another would produce a bigint when the quotient is an integer, and barf otherwise

    • sqrt of a bigint would produce a bigint for perfect squares, and barf otherwise

    • logarithm of a bigint would produce a bigint for perfect powers, and barf otherwise

    • sqrt of a rational square would produce a Fraction, and barf otherwise

    • logarithm of Fraction to a Fraction base would produce a Fraction when it is a rational power, and barf otherwise.
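
To make the two modes of Proposal (1) concrete, here is a minimal sketch for sqrt on bigint only. It is not mathjs code: isqrt and sqrtBigint are hypothetical helpers, "barfing" is arbitrarily implemented as throwing a RangeError, number stands in for "floating point", and a plain { re, im } object stands in for Complex.

```javascript
// Integer square root (floor) of a nonnegative bigint, by Newton's method.
function isqrt(n) {
  if (n < 0n) throw new RangeError('isqrt requires a nonnegative bigint')
  if (n < 2n) return n
  let x = n
  let y = (x + 1n) / 2n
  while (y < x) {
    x = y
    y = (x + n / x) / 2n
  }
  return x
}

// Hypothetical sqrt for bigint following Proposal (1).
function sqrtBigint(x, { predictable = false } = {}) {
  if (x >= 0n) {
    const r = isqrt(x)
    if (r * r === x) return r // perfect square: stay in bigint
  }
  if (predictable) {
    // "Barf": here, by throwing (variant 1a of the proposal).
    throw new RangeError(`sqrt(${x}n) is not a bigint`)
  }
  // Promote: number for positive inputs, a Complex stand-in for negative ones.
  if (x > 0n) return Math.sqrt(Number(x))
  return { re: 0, im: Math.sqrt(Number(-x)) }
}

console.log(sqrtBigint(9n))  // bigint result
console.log(sqrtBigint(2n))  // number result
console.log(sqrtBigint(-4n)) // Complex stand-in
```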


Proposal (2): Take a page from JavaScript's definition of bigint division, and have mathjs always strive to produce the "best" approximation to an answer within the arguments' domain:

  • Dividing a bigint by a bigint produces the bigint that is the floor of the actual quotient (the current behavior).

  • sqrt of a nonnegative bigint would produce the floor of the actual square root, otherwise a sentinel value like 0 or -1.

  • logarithm of a bigint would produce the floor of the actual logarithm when that is real, otherwise a sentinel value.

  • sqrt of a nonnegative fraction would produce the exact Fraction when there is one, otherwise an approximation with minimal denominator within some set precision. One could press the existing precision configuration option into use, and say we want an approximation within 10^(-precision), to match roughly the precision we would get out of BigNumber. Or we could add a new configuration 'rationalPrecision', perhaps as the log of the maximum denominator allowed. For a negative fraction, you would get a sentinel value like 0, -1, or if we add such a thing, 0/0.

  • logarithm of Fraction would produce the exact Fraction when there is one, otherwise a minimum-denominator rational approximation within a set precision when there is a real value, and if there is no real-number logarithm, a sentinel value.

Note for full consistency throughout mathjs in Proposal (2), predictable really ought to be abolished (it's only used in sqrt, pow, logarithms, and inverse trig and hyperbolic functions anyway) and sqrt on number should just return a sentinel value like NaN or 0 or -1 for negative numbers (etc.).
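
As a rough illustration of the "closest" idea in Proposal (2), the floor of a logarithm of a bigint can be computed entirely within bigint. floorLog is a hypothetical name, and throwing on invalid arguments is just one possible choice here; the proposal itself suggests a sentinel value instead.

```javascript
// Largest bigint k such that base ** k <= x, i.e. floor(log_base(x)).
function floorLog(x, base) {
  if (x <= 0n || base <= 1n) {
    throw new RangeError('floorLog requires x > 0 and base > 1')
  }
  let k = 0n
  let power = base
  while (power <= x) {
    power *= base
    k += 1n
  }
  return k
}

console.log(floorLog(8n, 2n)) // exact: 2 ** 3 === 8
console.log(floorLog(9n, 2n)) // approximation: log2(9) is not an integer
```

Both calls return the same bigint, so the caller cannot tell from the result alone whether it is exact.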


Proposal (3): Both proposals (1) and (2) have their virtues, so further extend/enhance/replace the predictable config with settings that produce any one of these classes of consistent behavior. E.g., an outOfDomainRule parameter that could be 'throw', 'null', 'NaN', 'sentinel' (to produce a specific sentinel chosen strictly within each domain) -- this first group of options are all roughly like predictable: true but just differ in detail; 'closest' -- proposal (2); or 'promote' -- the current behavior with predictable: false except extended to Fraction, which currently acts most closely like 'throw' but not exactly. (And in the case of 'promote', the type(s) to promote to, number or bignumber, might have to be configurable, perhaps by numberFallback.)
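
A sketch of what such a rule could look like for bigint division alone. Everything here is hypothetical: outOfDomainRule does not exist in mathjs, divideBigint is an illustrative name, the 'sentinel' variant is omitted, and 'promote' falls back to number only because no Fraction type is available in a self-contained example.

```javascript
// Hypothetical bigint division governed by an out-of-domain rule.
function divideBigint(a, b, outOfDomainRule = 'promote') {
  // Exact integer quotient: every rule agrees.
  if (b !== 0n && a % b === 0n) return a / b

  switch (outOfDomainRule) {
    case 'throw':
      throw new RangeError(`${a}n / ${b}n is not a bigint`)
    case 'null':
      return null
    case 'NaN':
      return NaN
    case 'closest':
      // Native division rounds toward zero (and throws for b === 0n).
      return a / b
    case 'promote':
      // Stand-in for promotion to a richer type.
      return Number(a) / Number(b)
    default:
      throw new TypeError(`Unknown rule: ${outOfDomainRule}`)
  }
}

for (const rule of ['null', 'NaN', 'closest', 'promote']) {
  console.log(rule, divideBigint(7n, 2n, rule))
}
```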


Proposal (4): Each of the settings in (3) has its virtues, but this whole configuring thing is overcomplex and bogs down implementations too much. Any one consistent class of behavior is feasible to work with, and you can always get other reasonable behavior(s) by explicit casts or by trying and then casting if need be. So just abolish "predictable" and any other config option like it, and in essence pick one outOfDomainRule, and implement it everywhere.



Frankly, any of these proposals is defensible and there are surely other reasonable options I haven't thought of. The key thing is that any consistent approach will be more understandable and scalable than the current type-dependent hodgepodge. And as I dive into the details of the bigint implementation, it would really be helpful to settle, sooner rather than later, on a general philosophical direction that mathjs will commit to in the long run, even if it doesn't move quickly toward strict compliance. It would really inform the refinement of the bigint implementation. Thanks so much for your thoughts!

@gwhitney
Collaborator Author

(I guess if I had to vote myself, at this moment I would say that in the spirit of the existing predictable config but in a new world where there are more types, Proposal (3) seems the most consistent with mathjs history. But the only options I can ever imagine actually using myself are 'promote' and 'closest', so I'd actually be fine only supporting those two options. And as I already said, if it had to be just one scheme, period, I'd choose 'closest' but it only very slightly edges out 'promote'.)

@gwhitney
Collaborator Author

Upon sleeping on it, while what I said about 'closest' (Proposal 2) providing new mathematical capabilities is true, in a scheme where it is the only one directly supported by mathjs, it makes it the hardest for someone using the library to implement a different strategy themselves. The obstruction is that under 'closest', it can be difficult to tell when you've gotten an exact answer and when you've only gotten an approximation (in which case maybe you want to try another datatype). So I will edit out my recommendation that if there is only one that should be it. I guess if there is only one it should be 'promote'. But maybe it is best for mathjs to be at least somewhat configurable in this, as each proposal has its virtues.

@josdejong
Owner

Thanks Glen, this is a good discussion point.

We have to propose your try/barf construct to the TC39 commission and make it a part of JavaScript :)

For context: the general idea in mathjs is that the output type is the same as the input type. When that is not possible and the config.predictable option is false, the output will be the type configured as config.number. If that is not possible config.numberFallback is used. And if that is not possible use what is best suited.

I think there are two main use cases that we need to serve, and the idea behind the option config.predictable was to cater for these two use cases:

  1. A user who "just" wants a correct answer formatted on screen. This user is fine with "any" data type, and does not want type conversion errors (i.e. no-barfing-mode).
  2. A user who uses the results programmatically and needs predictability in the returned data type.

Thoughts:

  1. I think removing the option config.predictable would be a serious step back in usability (hurting use case 1). It is really helpful when having mixed real numbers and complex numbers. We can indeed think through whether we can refine the configuration option(s).
  2. About outOfDomainRule: my initial feeling is that we're overengineering things if we make the behavior for returning throw, null, or NaN configurable. Before we go into that I would love to list all the relevant cases and see if we can come up with sensible behavior that is not configurable and is as consistent as possible. Thinking aloud here: maybe it is a good idea to go for throw, since, as soon as you get a NaN somewhere in a nested operation, you can't do any meaningful operations on it anyway. Are there any cases where NaN is a desirable outcome?
  3. About proposal (1): interesting idea to rethink more "special" cases where the functions can return a better answer by returning a different type.
  4. I know from practice that with bigint/bigint you can easily shoot yourself in the foot when the returned result is a bigint which is the floor of the actual float. In order to cater for use case (1), I think we should not return the bigint floor but the configured config.number (so you can configure that as Fraction if you want).
  5. Great to have an open brainstorm right now, but in the end, let's think through whether the planned changes are breaking changes and/or backward compatible.

@gwhitney
Collaborator Author

gwhitney commented Feb 5, 2025

Thanks for that feedback! With that we can start to converge to a workable plan.

First, your description of the current state of mathjs w/r/t config.predictable is not quite reflected in the code as it stands. Nowhere in the code when predictable is false does mathjs consult config.number or config.numberFallback. Here is a complete catalog of the current uses of predictable:

  1. When the units cancel in a Unit operation, leaving a unitless value, whether to return the numeric type (predictable = false) or the unitless Unit type (predictable = true).
  2. In operations on real numbers whose domain can be extended by providing Complex results on some inputs, for numbers/BigNumbers whether to return that Complex result (predictable = false) or number/BigNumber NaN (predictable = true), with sporadic/inconsistent outcomes for other types like Fraction.

That's all.

So in listening to your feedback in terms of the two use cases, I would suggest that we keep predictable, and have its two values mean:

T. When predictable is true and all inputs are a given type, any result returned must be of that same type.

F. When predictable is false, and the result of the Platonic mathematical operation cannot be represented by the input type(s), mathjs returns its judgment of the "best" type it knows of to return that Platonic exact result.

Those two options seem to correspond well to the two use cases you mention. But they do leave two open questions:

M. (For "mixed") When predictable is true and inputs are of mixed type, what type shall we return? Should we just presume that the conversion operations will promote this situation in some way or another to a case of all inputs of the same type, and then apply principle T? I think that is roughly what is happening in the status quo, but there could well be cases in which we could produce "better" answers by choosing one of the types of the operands based on knowing what all of the supplied values are (e.g. a bigint times a fraction could result in a bigint if the denominator cancels). Should we worry about finding any of those cases? Or is it central to the idea of principle T that operations should first be reduced to cases of uniform type?

B. (For "barf") All current cases of T applied in practice in mathjs code deal with number and BigNumber, which are convenient in that those number systems contain their respective NaN values, which can never be "wrong" as the outcome of an operation. But as we become systematic about predictable, consider say pow applied to two bigints. Often it can have a bigint value, but for many cases where one or both inputs are negative, there is no correct bigint value. But bigint has no element analogous to NaN -- it only contains the integers. So in such a case of pow applied to two bigints when predictable is true, do we:

  1. Return a sentinel bigint (say 0n) and let the client of the library worry about whether that was the actual answer or it was a case that had no actual answer and so mathjs is resorting to this sentinel?
  2. Actually make the type that mathjs uses be an "ExtendedBigint" which is (say) the union of bigint and the numbers NaN and ±Infinity so that those three (only!) are OK to return even when predictable is true (and those three are handled gracefully with other bigint inputs). (Alternatively, we could make our own Symbols to be -InfiniteBigint, InfiniteBigint, and NaB (Not a Bigint), but I think that would be more of a pain because nothing else in JavaScript could smoothly deal with those values.)
  3. Throw a RangeError (essentially forcing clients to either pre-check their inputs for validity or use try/catch).
  4. Do something else?

We need to make some at least initial decision on M and B. My recommendations:

M. At least for now, leave the status quo where we presume typed-function/conversions transmute everything we need to implement into the uniform-type case, and just focus on implementing that case.

B. Option 2 in which we explicitly say that mathjs will operate on this ExtendedBigint type (to the point of the TypeScript typings, etc.), and not precisely on the built-in bigint type, seems like the path of least resistance. (1) seems like a potential trap for clients ("How can I trust the result when I get back a 0?") and (3) seems like more of a pain ("I just want the answer, I don't want to have to wrap all my mathjs calls in try/catch"). But I could totally be convinced otherwise: if 0n is the only sentinel bigint uniformly used in this way, maybe it's not too hard to check if it's a real or sentinel answer; or maybe I shouldn't be so allergic to try/catch. So really I would be fine with any proposal here, as long as we do it uniformly across types that don't have NaNs and across all functions. (It was this ambivalence that led me to propose that the barfing style be configurable.) The answer could be different for bigint and Fraction, because Fraction is so easy/natural to extend to infinities and not a number, but bigint isn't.
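
A small sketch of what operating on such an "ExtendedBigint" might involve, assuming it is the union of bigint with the number values NaN and ±Infinity. The function names are hypothetical and this is not mathjs code. Since the native operators cannot mix bigint and number, the special values have to be handled explicitly.

```javascript
// True for a bigint or one of the three special number values.
function isExtendedBigint(v) {
  return typeof v === 'bigint' ||
    (typeof v === 'number' && !Number.isFinite(v))
}

// Addition on ExtendedBigint: NaN and infinities propagate as in IEEE 754.
function addExtended(a, b) {
  if (!isExtendedBigint(a) || !isExtendedBigint(b)) {
    throw new TypeError('Expected bigint, NaN, Infinity, or -Infinity')
  }
  if (typeof a === 'bigint' && typeof b === 'bigint') return a + b
  // At least one operand is special; a finite bigint cannot change the
  // outcome, so it is replaced by 0 and number arithmetic does the rest.
  const x = typeof a === 'bigint' ? 0 : a
  const y = typeof b === 'bigint' ? 0 : b
  return x + y
}

// Division of two bigints that never throws and never leaves the type.
function divideToExtended(a, b) {
  if (b === 0n) return a > 0n ? Infinity : a < 0n ? -Infinity : NaN
  return a % b === 0n ? a / b : NaN
}

console.log(addExtended(2n, 3n), addExtended(2n, NaN), addExtended(Infinity, 5n))
console.log(divideToExtended(6n, 3n), divideToExtended(3n, 5n), divideToExtended(-2n, 0n))
```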

And I think with decisions on M and B, we would be ready to systematize the (existing and future) mathjs functions. To examine the cases in my original post, under these recommendations:

  • We extend the Fraction type with -1/0, 0/0, and 1/0 for -FractionInfinity, NotAFraction (NaF), and FractionInfinity. (Sort of not sure why they aren't in the Fraction package already, they are definitely meaningful and useful.)

  • Division:

    • Fractions: always returns a Fraction, now truly possible with the extended type. So predictable irrelevant.
    • Bigints: when the quotient happens to be a bigint, we return it. Otherwise:
      • predictable: return the ExtendedBigint -Infinity, NaN, or Infinity as appropriate (e.g. 3n/5n -> NaN, -2n/0n -> -Infinity)
      • !predictable: return the appropriate (extended) Fraction.
  • sqrt:

    • Fractions: return the rational square root of a rational square. Otherwise:
      • predictable: return NaF
      • !predictable: return BigNumber of the square root for positive fractions (since Fraction is arbitrary precision, BigNumber is the best match and the client can always just convert to number to throw away precision); return Complex for negative fractions
    • Bigints: return the integer square root of perfect squares, otherwise proceed exactly as for Fraction
  • log(x, base)

    • Fractions: when x is a rational power of base, return that fraction. When x is 0, return -FractionInfinity. Otherwise:
      • predictable: return NaF
      • !predictable: return BigNumber approximation when x is positive, and Complex when x is negative
    • Bigints: when x is an integer power of base, return that bigint. When x is 0, return -Infinity. Otherwise:
      • predictable: return NaN
      • !predictable: proceed exactly as for Fractions.
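
To illustrate how the extended Fraction values in the list above could behave under division, here is a minimal sketch using plain { n, d } objects with bigint fields. It is an assumption-laden toy, not the Fraction.js API: frac, multiply, reciprocal, and divide are hypothetical helpers, and -1/0, 0/0, and 1/0 play the roles of -FractionInfinity, NaF, and FractionInfinity.

```javascript
function gcd(a, b) {
  a = a < 0n ? -a : a
  b = b < 0n ? -b : b
  while (b !== 0n) {
    [a, b] = [b, a % b]
  }
  return a
}

// Normalized fraction: d >= 0, lowest terms, and n in {-1, 0, 1} when d is 0.
function frac(n, d = 1n) {
  if (d === 0n) return { n: n > 0n ? 1n : n < 0n ? -1n : 0n, d: 0n }
  if (d < 0n) {
    n = -n
    d = -d
  }
  const g = gcd(n, d)
  return { n: n / g, d: d / g }
}

function multiply(x, y) {
  return frac(x.n * y.n, x.d * y.d)
}

function reciprocal(x) {
  return frac(x.d, x.n)
}

// Division never throws: out-of-domain cases land on 0/0 or ±1/0.
function divide(x, y) {
  return multiply(x, reciprocal(y))
}

const show = f => `${f.n}/${f.d}`
console.log(show(divide(frac(1n, 2n), frac(-3n, 4n)))) // ordinary quotient
console.log(show(divide(frac(1n, 2n), frac(0n))))      // FractionInfinity
console.log(show(divide(frac(0n), frac(0n))))          // NaF
```

Dividing through the normalized reciprocal, rather than cross-multiplying directly, keeps the sign of the divisor when the dividend is infinite.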

How does that all sound? What are your feelings on questions M and B?

@josdejong
Owner

First, your description of the current state of mathjs w/r/t config.predictable is not quite reflected in the code as it stands. Nowhere in the code when predictable is false does mathjs consult config.number or config.numberFallback.

Ah, you're right, sorry for the confusion. config.number is used only when there is no information on what the desired type of output is.

M. (For "mixed") that is an interesting point. I agree with you, I think it's fine to keep it like it's working now: there is a set way to resolve mixed data types, like mixing a number and BigNumber will always return a BigNumber. So it boils down to: when predictable is true, mixing types will always return a predictable type, but different numeric values will not affect the returned data type. On a side note: it would be good to document the actual mixed type resolutions.

B. (For "barf") I agree that option (1) a sentinel bigint would be tricky to use. I expect that option (2) introducing an ExtendedBigint will alienate mathjs from JavaScript and make it harder to interop between those two, this is not my preference. Another option (4) would be to introduce a new Symbol('NaN') and use that as NaN value for bigint calculations, or use the number NaN value.

When doing a calculation of which the result cannot be represented as a bigint, somehow the user needs to be informed of that and the error must be propagated. We can do that in two ways: return a special value (like a NaN value), or throw an exception. In both cases, the user has to check the outcome. To me, having to write either if (isNaN(result)) {...} or try {...} catch (err) {...} is just a different syntax. (side note: a third option would be returning an [err, success] pair like more and more programming languages use, but that doesn't align with our current API I think). My preference goes to a try/catch, since it stops all calculations at once with an exception and rules out the possibility of accidentally producing a false result, hiding that there was an error somewhere during the calculation. Also, with an exception, you can pass an explanatory message that can help debugging. And third, this just aligns with how bigint is implemented in JavaScript, so no inconsistency there. Would it be ok with you to go for try/catch?

What I am thinking about though is whether we should change the cases where mathjs currently returns NaN, like math.divide(0, 0). Should we remove support for NaN altogether and throw exceptions instead? Python throws too in case of 0/0 for example. I have the feeling that NaN may be a mistake, just like having both null and undefined in JavaScript. This would be a serious breaking change though, and feels a bit like going for purity at the cost of practicality.

I like the idea of improving Fraction support by always returning a Fraction (and possibly internally using number to actually do the calculation, but convert that into a Fraction in the end, like with trigonometric functions).

@gwhitney
Collaborator Author

gwhitney commented Feb 7, 2025

  • of course it would be fine to throw when predictable is true and a bigint operation leads to a non-integer value. If so, I think we should keep division to mean mathematical division and throw on 3n / 5n, and introduce a means for "truncated division" like python's //. How do you feel about that? I would just like each mathjs function to represent the same mathematical operation, regardless of the types it is operating.

  • I think this throwing behavior should be unique to bigint since it has no special values. I think that the IEEE is quite accomplished and wise, and that NaN is actually better than throwing, because its propagation gives you the option of not looking until you care -- you can check for NaN at the end of a computation, you don't have to check for it everywhere, and you don't have to handle throws in the middle of unfinished computations -- maybe some later condition doesn't even use that particular intermediate result, and so the NaN is completely harmless. Exceptions eliminate that possibility to wait and see if an intermediate even ends up being used. I think that JavaScript has not been developed with the same level of deliberative care that the IEEE used and it is a pity bigint has no infinities nor a "NotABigint" value.

  • so in particular that means I maintain my suggestion to add negative infinity, not a fraction, and positive infinity to Fraction (as -1/0, 0/0, and 1/0) because they are entirely natural elements of the representation that are currently artificially disallowed, they have clear semantics, and for all the reasons that IEEE kept infinities and NaN in floats. How do you feel about that? There is no issue of "breaking a standard" with Fraction, since it is not an official "Rational" type of the JavaScript language.

  • I strongly believe that in the approach we are developing, when predictable is true, then calls like sin(fraction(1, 3)) should return NotAFraction. One of the hallmarks of the design/philosophy of this rational arithmetic (stated well in the docs of the fraction class) is that when you get a result, it is a mathematically exact value. There is no Rational number equal to the sine of 1/3 or the square root of two, so these things should return NotAFraction to tell you so. (Or you can turn predictable off and get a bignum floating point approximation, and/or we could add interval arithmetic or ball arithmetic types that would give you two rationals the value lies between or a rational number and a bound on how far the true value could be, respectively.)
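
The split suggested in the first bullet, between exact division that throws and a separate integer-valued division, could look roughly like this. divideExact and floorDivide are hypothetical names, not mathjs functions. One detail worth noting: Python's // rounds toward negative infinity, whereas the native bigint / operator rounds toward zero, so the two differ for negative quotients.

```javascript
// Mathematical division: succeeds only when the quotient is an integer.
function divideExact(a, b) {
  if (b === 0n || a % b !== 0n) {
    throw new RangeError(`${a}n / ${b}n is not a bigint`)
  }
  return a / b
}

// Floor division, rounding toward negative infinity like Python's //.
function floorDivide(a, b) {
  if (b === 0n) throw new RangeError('Division by zero')
  const q = a / b // native division rounds toward zero
  // Step down by one when there is a remainder and the quotient is negative.
  const inexact = a % b !== 0n
  const negative = (a < 0n) !== (b < 0n)
  return inexact && negative ? q - 1n : q
}

console.log(divideExact(6n, 3n))
console.log(floorDivide(7n, 2n), floorDivide(-7n, 2n), -7n / 2n)
```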

How do these ideas sit with you? I think we are close to being able to start systematizing mathjs functions along these lines. Thanks for the productive conversation!

@josdejong
Owner

  1. Yes totally agree!

  2. I think this throwing behavior should be unique to bigint since it has no special values.

    Yes, this makes sense.

    You're right that it may be possible that part of a computation results in NaN, but that you can continue and things may be alright if this NaN value isn't actually used in the end. But it feels like quite an edge case to me. Have you ever utilized this for real? At least I haven't. I do have experience with trouble figuring out the cause of a NaN result; it can be hard to debug. Therefore, I think in general you would like to be alerted when a NaN occurs. My personal preference tends to exceptions rather than NaN, but I think what is more important is to align with how our data types work in JavaScript. I think we have two options that align quite well: (A) either align with the bigint implementation of JS that throws an exception (inconsistent with the behavior of number), or (B) introduce some special value like Symbol('NotABigint') and use that, so the behavior aligns better with how number works.

    Thinking through how to implement either of these two options, I'm not sure how we can implement something like NotABigInt. To make it work, it should be possible to use NotABigInt itself in operations, like 2 + NaN returns NaN. But how to do that for bigint? How to make 2n + NotABigInt work? Do you have a good idea there? If there is a neat solution for this, I think we can go for a NotABigInt solution (I think this is your preference too, right?). But if it turns out to be complicated or requires implementing a full abstraction layer and wrapper on top of bigint, I think option (A) would be the better choice.

  3. Yes I think that's fine with me, allowing Infinity in Fraction.

  4. Agree.

@josdejong
Owner

One more thought about (3): we have to think through how a programmer using mixed numbers and bigint has to catch all the possible error cases. Right now this is:

try {
  const result = math.evaluate(expr)

  // Number.isNaN does not throw on bigint input, unlike the global isNaN
  if (Number.isNaN(result)) {
    console.error('NaN result') // handle number NaN
  }

  // if we introduce a new kind of NotABigint, the programmer has to check for one more case:
  if (isNotABigint(result)) {
    console.error('NotABigint result') // handle bigint NotABigint
  }
} catch (err) {
  // handle exceptions
  // Needed in all cases: handling everything from syntax errors to errors like "Index must be an integer"
  console.error(err)
}

If we require programmers to check for an additional case of a NotABigint result, it makes the library harder to use. And it's easy to forget to check for these cases too. Only if we return the number NaN in case of 0n/0n do we not need a third check, but that is weird too, and subsequent operations like 2n + NaN will throw again, so that solves nothing.

@gwhitney
Collaborator Author

gwhitney commented Feb 12, 2025

  1. Yes totally agree!

Excellent, I am glad we are on the same page about making e.g. / for bigint consistent with all other operations, once we finalize what those conventions are.

  2. I think this throwing behavior should be unique to bigint since it has no special values.

Yes, this makes sense.
You're right that it may be possible that part of a computation results in NaN, but that you can continue and things may be alright if this NaN value isn't actually used in the end. But it feels like quite an edge case to me. Have you ever utilized this for real?

Not that I am specifically aware of. You should probably read https://langdev.stackexchange.com/questions/1051/pros-and-cons-of-throwing-an-error-versus-returning-a-nan-value before making a final call here. The biggest "pro" for NaN that it mentions is vector and matrix calculations, where you can do the whole calculation first and then filter out any NaNs that turn up, rather than have to deal with each exception as it occurs (possibly making it impossible to resume the vector operation).
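
The vector argument can be seen in a few lines of plain JavaScript. This is only an illustration with made-up data; sqrtOrThrow is a hypothetical stand-in for an operation that throws instead of returning NaN.

```javascript
const inputs = [4, -1, 9, 16]

// NaN style: compute everything first, filter afterwards.
const roots = inputs.map(x => Math.sqrt(x))
const valid = roots.filter(r => !Number.isNaN(r))

// Throwing style: a single bad element aborts the whole map unless
// every element gets its own try/catch.
function sqrtOrThrow(x) {
  if (x < 0) throw new RangeError(`sqrt(${x}) is not real`)
  return Math.sqrt(x)
}

const recovered = []
for (const x of inputs) {
  try {
    recovered.push(sqrtOrThrow(x))
  } catch (err) {
    // skip the element that has no real square root
  }
}

console.log(valid, recovered)
```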

At least I haven't. I do have experience with trouble figuring out the cause of a NaN result; it can be hard to debug. Therefore, I think in general you would like to be alerted when a NaN occurs. My personal preference tends to exceptions rather than NaN,

Well that is certainly an important consideration.

but I think what is more important is to align with how our data types works in JavaScript.

And that is, too, but as you point out below, JavaScript works different ways for different types, so if we decide we want to work the same way for all types, we could really pick either and just say we aligned with that particular JavaScript type that works the way we end up preferring.

I think we have two options that align quite well: (A) either align with the bigint implementation of JS that throws an exception (inconsistent with the behavior of number), or (B) introduce some special value like Symbol('NotABigint') and use that, so the behavior aligns better with how number works.
Thinking through how to implement either of these two options, I'm not sure how we can implement something like NotABigInt. To make it work, it should be possible to use NotABigInt itself in operations, like 2 + NaN returns NaN. But how to do that for bigint? How to make 2n + NotABigInt work? Do you have a good idea there? If there is a neat solution for this, I think we can go for a NotABigInt solution (I think this is your preference too, right?).

I am not aware of any way to alter/extend/enhance the behavior of + etc in JavaScript. E.g., 2n + NaN throws an error as JavaScript code, so it will always throw an error as JavaScript code. Similarly for 2n + Symbol('foo'). So the best we would be able to do is handle NotABigInt in math.add(...) and in mathjs expression language. If that seems inadequate, then exceptions it is, at least for bigint.
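
A quick check of that limitation, together with a sketch of what handling the sentinel inside a library function could look like. NotABigint and add are hypothetical here; add merely stands in for something like math.add and is not the mathjs implementation.

```javascript
// Hypothetical sentinel for "no bigint answer".
const NotABigint = Symbol('NotABigint')

// Reports whether the native + operator rejects the operands.
function nativeAddThrows(a, b) {
  try {
    void (a + b)
    return false
  } catch (err) {
    return err instanceof TypeError
  }
}

// A library-level add can propagate the sentinel; the operator cannot.
function add(a, b) {
  if (a === NotABigint || b === NotABigint) return NotABigint
  return a + b
}

console.log(nativeAddThrows(2n, NaN), nativeAddThrows(2n, NotABigint))
console.log(add(2n, 3n), add(2n, NotABigint) === NotABigint)
```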

But if it turns out to be complicated or requires implementing a full abstraction layer and wrapper on top of bigint, I think option (A) would be the better choice.

And does option (A) then mean exceptions across the board, including for number type (which would be a majorly breaking change)? I see the following logical possibilities for when 'predictable' is true and we encounter an operation that has no value in the type of the inputs:

i) Never throw, and always return some neutral sentinel, even to the point of creating a NotABigint special value of some kind that JavaScript won't be able to do much with but which math.add and the expression language can handle.

ii) Throw for types that have no NaN/infinities (bigint, Fraction if we don't extend it if you prefer throwing anyway), and return neutral sentinels for types that do (number, BigNumber, Complex, Fraction if we extend it)

iii) Always throw, including for say number(0)/number(0) which currently returns NaN

iv) When predictable is true, have an optional additional outOfDomain configuration that lets you select any of (i), (ii), or (iii) that we decide to support, defaulting to whichever supported option produces the fewest breaking changes from current behavior. (Sorry to raise the spectre of additional configuration again, but when we are having trouble deciding it is perhaps an indication that different clients would want different behavior here.)

I think at this point we have hashed this thoroughly, so my recommendation is that you read the language design page on this I linked above, and then make your final call among (i) -- (iv) above (or some other option I didn't think of) and we go with it. I will be fine with whichever decision, since I expect I will mostly stick with predictable = false personally anyway. I feel that is the "convenience" choice anyway.

3. Yes I think that's fine with me, allowing Infinity in `Fraction`.

We should revisit this once you have made the final call on (i) -- (iv). For example, if you land on (iii), there is no point/need to extend Fraction; if you land on (ii) then we can simply choose either way, extend Fraction or not (as you know, in that case I would personally lean to extending it, but as with the overall decision, in this case I would be fine either way).

4. Agree.

OK, we are on the same page: if the final decision on 3 is to extend Fraction, then we will have sin() etc accept Fractions and return NotAFraction. If the decision is not to extend it and to throw errors, then we can leave sin() as not even accepting the Fraction type.

  5. Sorry to add an aspect, but I just wanted to ask: When predictable is false, what about those cases where an operation on a more complicated/less precise type happens to produce a mathematically exact answer that can be represented in one of our simpler/more precise types? For example, one might have Fraction times a Fraction that happens to be an exact integer -- should it return a plain ol' bigint? Similarly, should the floor() of a BigNumber return a bigint? Should the expression sin(1/4 cycle) return the bigint 1n since that value is exact? Or sin(1/12 cycle) return fraction(1,2)?
    In other words, we have already said that when predictable is false, functions are free to go to more complicated/less precise types if necessary to produce the highest mathematical-fidelity answer possible, but what about going the other direction if possible when the result happens to be mathematically completely exact?
    I am not sure I have an inclination either way on this question; I just wanted it to be out there that this is a decision that we will at least implicitly be making one way or the other. I think it's pretty clear the path of least resistance in coding is only to go the one direction in type-shifting that we have already discussed, but I don't think that necessarily means it's the best plan.

@gwhitney
Collaborator Author

If we require programmers to check for an additional case of a NotABigint result, it makes the library harder to use. And it's easy to forget to check for these cases too.

Well, technically speaking at the moment, if you are evaluating a completely arbitrary mathjs expression and you want to check all possible "bad" returns, already you need to check for BigNumber NaN and Complex NaN as well (and maybe some infinities and other things I am overlooking). So I think the main import of your example is that if math.isNumeric() or something like that does not already test whether a value that comes out of mathjs is an "ok number to compute with", then we need to add a new function to the library. I see for example that there is no math.isFinite() -- if this behavior is not already covered, we could make that return false on every sort of NaN and/or infinity in the system, and true on all other scalar number entities.
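As a rough illustration of the kind of check meant here, the following sketch covers only the two native JavaScript scalar types. The name `isFiniteScalar` is hypothetical; a real library function would also need branches for BigNumber, Fraction, and Complex.

```javascript
// Hypothetical helper, for illustration only.
// Returns true for values that are "ok to compute with": not NaN, not infinite.
function isFiniteScalar (x) {
  if (typeof x === 'bigint') return true // a bigint is always a finite integer
  if (typeof x === 'number') return Number.isFinite(x) // false for NaN and ±Infinity
  return false // other types are not handled in this sketch
}
```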

@josdejong
Owner

I've indeed read up a bit on the pros and cons of NaN vs throwing - and, yeah, both have pros and cons. It is a trade-off.

So the best we would be able to do is handle NotABigInt in math.add(...) and in mathjs expression language. If that seems inadequate, then exceptions it is, at least for bigint.

I too think that a special NotABigInt would be the only way. But it feels like quite some overhead to get calculations involving NotABigInt work rather than having them throw an exception (which is what will happen by default). I'm afraid we would have to implement a signature for it on every function, since the exact behavior may differ (like with your example of matrix operations which may ignore NaN's), so there is no easy way to address this on the level of typed-function I think. Concluding, I think for bigint it is best to go for throwing rather than NotABigInt.

Thanks for your clear listing of the options (i) to (iv). I think we can best go for (ii) or (iii). This is a choice between aligning with the way the data types work (ii) vs consistency for all data types (iii). It is probably tricky to achieve (iii) in a reliable way, never returning NaN: in internal computations a NaN may occur, so we would have to somehow detect that after every calculation and throw an error instead. Concluding, I think it's best to go for the pragmatic way and choose (ii): use NaN for the data types that support it, throw otherwise. This also prevents introducing breaking changes.
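A minimal sketch of what option (ii) could look like for division with predictable set to true, restricted to the two native types. The function name `divide` here is a stand-in for illustration and is not the mathjs implementation.

```javascript
// Sketch of option (ii): each type follows its own native convention for
// results that fall outside the type.
function divide (a, b) {
  if (typeof a === 'number' && typeof b === 'number') {
    // number has NaN and Infinity, so 0/0 gives NaN and 1/0 gives Infinity
    return a / b
  }
  if (typeof a === 'bigint' && typeof b === 'bigint') {
    // bigint has no NaN or Infinity, so out-of-domain cases throw
    if (b === 0n) throw new RangeError('Division by zero')
    if (a % b !== 0n) throw new RangeError('Quotient is not an integer')
    return a / b
  }
  throw new TypeError('This sketch only handles number and bigint')
}
```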

About (3): in the spirit of the choice for (ii), I think we can leave Fraction as it is too, throwing an exception on 0/0 and infinity. So it will behave similarly to bigint. Would that make sense or am I overlooking something?

About (4): so we can leave the current implementations of functions like sin as is, not accepting a Fraction at all.

About (5): yes I think that is fine, but let's address such improvements in separate PRs. I had a quick look, and there are currently two cases, one in sign.js and one in gamma.js: when the imaginary part of a Complex number is zero, these functions return a number. I think for consistency we should only do that when predictable === false though.

About my comment about needing extra NaN checks: thinking again, my comment doesn't make sense, we can use the math.isNaN function which checks NaN for all data types. We can indeed think through whether isNumber should return false for NaN values, and we can implement a function isInfinite or isFinite.

@gwhitney
Collaborator Author

But it feels like quite some overhead to get calculations involving NotABigInt work rather than having them throw an exception (which is what will happen by default). I'm afraid we would have to implement a signature for it on every function, since the exact behavior may differ (like with your example of matrix operations which may ignore NaN's), so there is no easy way to address this on the level of typed-function I think. Concluding, I think for bigint it is best to go for throwing rather than NotABigInt.

I don't think it's so bad: simply the mathjs type would be an extended bigint, not the raw JavaScript bigint, so maybe we would rename it to EBigInt or something, and the typed-function type recognizer would accept a bigint or our special NotABigInt symbol (and maybe symbols for PosInfBigInt and NegInfBigInt), and then the basic functions would have to check for these symbols in their implementations. I don't see that it would mean extra signatures for most functions, just switching the bigint type to EBigInt in most cases.

But that said, I am comfortable with your decision to go with (ii) now and leave Fraction alone at least for now (even though I personally would rather have a rational class that allows -1/0, 0/0, and 1/0; the ban on them is kind of artificial). I therefore think we have enough to proceed. I would plan to start with a PR that makes the documentation more explicit and thorough in this regard, and removes some of the existing violations (although such a PR may have to go on a branch started for the next breaking change). I don't know that I can get all violations removed in a single pass. (Actually I will pre-start by checking that none of my extant PRs introduce any new violations.)

I just want to be clear on (5): You are advocating that when predictable = false, it will be encouraged to return a simpler type when (and only when) it exactly represents the result? In other words, ultimately fraction(4,3)*fraction(3,4) should return 1n rather than fraction(1,1)? I am personally fine with that, but it will produce breaking changes as we implement it. But I can get started without a final decision on this, because as you say, implementing these sorts of type simplifications should be a second step after getting the out-of-domain issues straight.

@gwhitney
Collaborator Author

OK, turns out I have just two outstanding PRs. For #3375, this issue is not strictly relevant because the result of math.bigint is certainly always a bigint if it's not throwing, but there is still a question of philosophical alignment that I have put in the discussion of that PR; and #3378 is completely orthogonal to this issue. So once you decide how/whether you would like #3375 to align with where we ended up on this issue, then I can embark on documenting the new mathjs policy on out-of-domain operation in the codebase itself and starting to align other cases with the new policy via some PRs.

@gwhitney
Collaborator Author

gwhitney commented Feb 17, 2025

And just to make it explicit, here is where we have ended up on the running examples that have gone throughout this issue, so you can confirm we are really settled on the same page. Breaking changes from current behavior are marked in bold as such. Note I have not considered changes in which a former throw would become a valid value; if that would be considered breaking, then essentially all conforming changes are breaking changes.

First, the default behavior with predictable = false should be:

  • Division
    • Fraction: remains as is, always returns a Fraction except when throwing for f/0.
    • Bigint (Breaking): b/c returns a bigint when b is a multiple of nonzero c, otherwise returns a Fraction, or throws for b/0
  • sqrt
    • Fraction: return the rational square root of a perfect rational square as a Fraction, otherwise return the BigNumber approximation to the square root or Complex approximation for negative Fractions
    • Bigint: return the bigint square root of a perfect square, otherwise return the BigNumber approximation to the square root for positive bigints (Maybe Breaking: currently returns a number, but since bigints are arbitrarily precise, BigNumber seems to be the correct "best type available" in mathjs as it stands; should it be considered a breaking change to upgrade from a number result to a BigNumber result here?) or the Complex approximation for negative bigints.
  • log(x, base)
    • Fraction: return Fraction value when x is rational power of base, otherwise return the BigNumber/Complex approximation as appropriate
    • Bigint: return the bigint value when x is an integer power of base, otherwise return the Fraction value when x is a rational power of base (Maybe Breaking if you consider the shift from number to Fraction here breaking), otherwise return the BigNumber/Complex approximation as appropriate (Maybe Breaking if you consider number -> BigNumber shift breaking).

And now the behavior with predictable = true:

  • Division
    • Fraction: remains as is
    • Bigint (Breaking): b/c returns a bigint when b is a multiple of nonzero c, otherwise throws
  • sqrt
    • Fraction: returns Fraction square root of perfect rational square, otherwise throws
    • Bigint: returns bigint square root of perfect square, otherwise throws
  • log(x, base)
    • Fraction: return Fraction value when x is rational power of base, otherwise throws
    • Bigint: return bigint value when x is integer power of base, otherwise throws
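As one concrete instance of the predictable = true rules listed above, here is a sketch of the bigint case of log(x, base). The helper name `exactLog` is hypothetical, and the sketch assumes x ≥ 1 and base ≥ 2 to keep the loop simple; it is not the mathjs implementation.

```javascript
// Hypothetical helper: returns the bigint k with base ** k === x,
// and throws when x is not an integer power of base.
function exactLog (x, base) {
  if (typeof x !== 'bigint' || typeof base !== 'bigint') {
    throw new TypeError('exactLog expects two bigints')
  }
  // Assumption of this sketch: restrict to x >= 1 and base >= 2.
  if (x < 1n || base < 2n) throw new RangeError('Requires x >= 1 and base >= 2')
  let k = 0n
  let power = 1n
  // Multiply up until the power reaches or passes x.
  while (power < x) {
    power *= base
    k += 1n
  }
  if (power !== x) throw new RangeError('Logarithm is not an integer')
  return k
}
```

Under predictable = false, the final throw would instead be replaced by a fall-through to a Fraction or an approximate result, as described in the first list.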

And note that to make the above changes workable for practical computations, we will need to add some operations:

  • quotient(a,b), returning the largest integer n such that n ≤ a/b, i.e., nb ≤ a for positive b (will have to see what the proper interpretation of this for complex numbers would be; I think there is a sensible meaning that always returns a complex number with exact integer re and im parts). The name for this function seems to be fairly standard (e.g., Mathematica) and I would suggest we support it in the expression language as operator //.
  • isqrt(b), returning the largest integer n such that n² ≤ |b|. I'm suggesting the Wikipedia name for this function, which matches the Python name, except note that I have extended its domain by taking the absolute value of b. We could instead throw on all negative or complex inputs, but it seemed more practical to throw less. There's definitely no need to support this function as an operator in the expression language unless/until we adopt √ for square root, as in the oldest extant PR, "add a √ character to the pool of operators" #767, which I think we mean to revive someday; in that case, we could consider √/ for this by analogy with //, but maybe it's a bit outlandish; at least, it would work syntactically, whereas √√ is not an option because √√16 should certainly evaluate to 2, i.e., taking two square roots. [On this latter point, as an aside that we could make into a separate issue if you think it's worth it, currently 3!! evaluates to (3!)! = 720 but the usual mathematical meaning of double factorial is the product of all integers up to the argument that have the same odd/even parity as the argument, so 3!! should be 3*1 or just 3, and if you actually want the 720 value you should have to write out (3!)!.]
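For the bigint case, the two proposed operations could be sketched as follows. This is an illustration under the definitions given in the list (floor division, and integer square root of the absolute value), not the eventual mathjs code; the complex-number interpretation is left out.

```javascript
// Floor division for bigints: the largest integer n with n <= a / b.
// Native bigint `/` truncates toward zero, so a correction is needed
// whenever the exact quotient is negative and not an integer.
function quotient (a, b) {
  if (b === 0n) throw new RangeError('Division by zero')
  const q = a / b
  const r = a % b // has the sign of a, or is zero
  // A nonzero remainder whose sign differs from b means truncation rounded up.
  return (r !== 0n && (r < 0n) !== (b < 0n)) ? q - 1n : q
}

// Integer square root: the largest integer n with n * n <= |b|,
// computed with Newton's iteration on bigints.
function isqrt (b) {
  const m = b < 0n ? -b : b
  if (m < 2n) return m
  let x = m
  let y = (x + 1n) / 2n
  while (y < x) {
    x = y
    y = (x + m / x) / 2n
  }
  return x
}
```

For example, `quotient(-7n, 2n)` gives `-4n` where the native `-7n / 2n` gives `-3n`, and `isqrt(17n)` and `isqrt(-17n)` both give `4n`.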

@josdejong
Owner

Thanks for the constructive discussion, evaluating and weighing all options.

Yes it would be technically doable to implement a NotABigInt and add support for NaN in Fraction, but I prefer the pragmatic (though less consistent) solution of aligning with bigint and Fraction itself and throw an exception for those types.

Thanks for your plan to work on improving the documentation. It's indeed in large part existing behavior. It will be good to clarify this in the docs.

About (5): yes indeed, we can look at cases where we can return a simpler type. I think we should be a bit careful there and weigh the pros/cons on a case-by-case basis. When a Fraction result is an integer, or a Complex result is purely real, we may want to return a numeric result with type config.number. We should weigh added value for the user (which is subjective), added code complexity, performance, and whether it will be a breaking change. I think the hardest will be to decide whether returning a simpler type actually adds value to the user or is unhelpful.

About the overview with changes:

  • In all of the cases under predictable: false: instead of using a BigNumber when needed, we can use config.number, does that make sense?
  • About the "Maybe breaking" cases: they are indeed subtle but let's just mark them as breaking ok?
  • Besides that: sounds like a solid plan 👌

Your suggestions for a function quotient (integer division or "floor division") along with an operator //, and a function isqrt, sound good. Python too has an operator //; we're lucky that we're not using // for comments 😄. Using n² ≤ |b| is fine with me. I think operators like √/ or √√ will be confusing, so maybe just limit to using isqrt as a function (we can always rethink this later if there turns out to be need for it). Let's discuss the behavior of !! separately to keep this topic focused ok 😅?

@gwhitney
Collaborator Author

About (5): yes indeed, we can look at cases where we can return a simpler type. I think we should be a bit careful there and weigh the pros/cons on a case-by-case basis. When a Fraction result is an integer, or a Complex result is purely real, we may want to return a numeric result with type config.number. We should weigh added value for the user (which is subjective), added code complexity, performance, and whether it will be a breaking change. I think the hardest will be to decide whether returning a simpler type actually adds value to the user or is unhelpful.

OK, let's leave (5) aside for now and take up the discussion in #3398 for when the time is ripe (presumably when the agreed-on principles for when to go to more "complicated" types are documented and largely implemented).

@gwhitney
Collaborator Author

gwhitney commented Feb 20, 2025

About the overview with changes:

  • In all of the cases under predictable: false: instead of using a BigNumber when needed, we can use config.number, does that make sense?

OK, we are fully on the same page now except for this point. I would say that it does not make sense to use config.number in this regard, as that determines e.g. how an unadorned string of decimal digits is interpreted, and we certainly want it to be valid to set config.number to bigint or Fraction, but neither of those will work to tell us what type to return when a square root is not rational.

So then you might say "oh, let's use config.numberFallback". And I would agree that choice would be more reasonable, but still I would say that this is a parsing configuration, and computation is different from parsing, and there is little cost to additional configuration options. And moreover, the default numberFallback is number, I believe; but I think the default for the irrational square root of a bigint or Fraction should be BigNumber, because if you are dealing with bigints or fractions you are likely concerned with precision and JavaScript number is not all that precise.

(In fact, since each BigNumber can record its own precision, it could definitely make sense to return a BigNumber with a precision that depended on its inputs rather than the config.precision -- it might be that the natural amount of precision would depend on the input. For example, if you ask for the square root of the bigint equal to 10^201, it seems a little weird to round it to just 64 digits -- I think you would expect at least 100 digits, to get to the decimal point, and you would very likely want at least some digits after the decimal point since you have asked for a non-integer value. EDIT: the config.precision should, I think, be the minimum precision in any case, though.)

So my recommendation here is either to have a new config option, called say config.irrationalResult or config.preferredReal or something, which would default to BigNumber, or to just let it always be BigNumber.

Looking forward to your thoughts and to getting going on this.

@josdejong
Owner

You're right. You can indeed set config.number to a Fraction or bigint. That will not work.

A new config option is an option indeed. I think though that the existing numberFallback option is quite a good fit, though we may want to rephrase its description in the docs? Maybe even rename it? I like the name irrationalResult, though it would be a breaking change if we rename the existing option.

Existing description in the docs:

numberFallback. When number is configured for example with value 'bigint', and a value cannot be represented as bigint like in math.evaluate('2.3'), the value will be parsed in the type configured with numberFallback. Available values: 'number' (default) or 'BigNumber'

Attempt to rephrase:

numberFallback. When a value cannot be represented in the provided numeric input type or the configured numeric type, the result will be converted into the type configured with config.numberFallback. For example, when you have configured config.number='bigint' and then parse math.evaluate('2.3'), the result cannot be represented as a bigint and will be parsed into the configured config.numberFallback type. Or when evaluating math.sqrt(math.fraction(2/3)), the result cannot be represented as a Fraction, and the result will be the configured config.numberFallback type.
Available values: 'number' (default) or 'BigNumber'

What do you think?

@gwhitney
Collaborator Author

I think there are going to be plenty of breaking changes, so I wouldn't worry about another one if you'd prefer a different name for a unified parameter.

But as to whether to unify: I wouldn't, but if you prefer it I am willing. I think I have already said my reasons, but I will quickly recap: one type of fallback is used exclusively in parsing and one exclusively in computing. As those are two different things, who knows if clients will always prefer the same type in both cases. If we have just one, clients are locked into using the same type. The parsing one really only has to do with bigints, because Fraction can represent 2.3, whereas the computing one comes up with both bigints and fractions. And since those are both high-precision types, it seems clear to me the default for the computing one should be bignumber, whereas you have been proposing a number default.

OK, having laid out my full case I will await your final decision and then start implementing, whichever way you decide.

@josdejong
Owner

The point of being able to configure parsing and conversion/calculation differently is a good one. You may be right, there may be need for this. I'm not sure though and only time will tell I guess. On the other hand, keeping the amount of options limited simplifies the use of the library. So, I understand your point, and maybe some day in the future you will say "I told you" ;), but for now, I prefer to use the existing config option numberFallback for both parsing and conversions during computation rather than introducing a new option irrationalResult. If the need arises, we can rethink splitting the option into two.

I agree that when you configure config.number to be bigint or Fraction, you probably want to set config.numberFallback to BigNumber. We can explain that in the docs I think. Still, I think the best defaults for both config options is number since that is the native numeric type in JavaScript.

So in your application, I think you will configure {number: 'bigint', numberFallback: 'BigNumber'}, right? Would that be sufficient, or would you run directly into limitations in that regard?

@gwhitney
Collaborator Author

OK we will try that -- the first PR will focus on documenting the dual semantics of numberFallback and making non-breaking changes. Will get to it as soon as I have time.

@josdejong
Owner

Thanks Glen!

@gwhitney
Collaborator Author

gwhitney commented Mar 2, 2025

So, I understand your point, and maybe some day in the future you will say "I told you" ;), but for now, I prefer to use the existing config option numberFallback for both parsing and conversions during computation

Oops, I was starting to code this up, and ran into a bind: for parsing, config {number: 'bigint', numberFallback: 'Fraction'} is perfectly legitimate, since Fraction is perfectly capable of representing e.g. 1.75, but it doesn't give any info about what to return if also predictable: false and it encounters sqrt(2), which cannot be represented as a Fraction. Should it produce the number or BigNumber approximation to sqrt(2) in such a case? Please advise.

@josdejong
Owner

Hm. Officially numberFallback should only support number and BigNumber right now, but it doesn't throw an error when you try another data type. Shouldn't we just not allow Fraction in numberFallback? It would be odd to configure { number: 'Fraction', numberFallback: 'Fraction' }, and then have a function like sqrt or sin still return a Fraction via the numberFallback.

@gwhitney
Collaborator Author

gwhitney commented Mar 3, 2025

No, we want to interpret both integers and decimals as exact specifications of real numbers in our formulas, so we plan to use {number: 'bigint', numberFallback: 'Fraction'} for parsing. It's actually a very natural setting. So that's why I was concerned with the case -- we need someplace to specify we want bignumbers when an operation takes us from the specified exact rationals into the irrational. Thanks for your thoughts.

@josdejong
Owner

Ok makes sense. So then we have to update the docs to tell that numberFallback can be Fraction too, and make the config function warn you when you try a non-supported numeric type like numberFallback: 'bigint' for example.

@gwhitney
Collaborator Author

gwhitney commented Mar 3, 2025

Agreed, but then should we add an irrationalResult config defaulting to numberFallback (which defaults to number) so that in our use case we can set {number: 'bigint', numberFallback: 'Fraction', irrationalResult: 'bignumber'}? Thanks for letting me know.
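The defaulting chain proposed here, together with the validation mentioned in the previous comment, could be sketched roughly like this. Everything in the block is hypothetical: `resolveConfig` is not a mathjs function, `irrationalResult` is only a proposed option name at this point in the thread, and the sketch throws on an unsupported value where the thread speaks of warning.

```javascript
// Values that numberFallback would accept once Fraction is allowed (assumption).
const NUMBER_FALLBACK_TYPES = ['number', 'BigNumber', 'Fraction']

// Hypothetical resolution of the three related options.
function resolveConfig (config) {
  const number = config.number ?? 'number'
  const numberFallback = config.numberFallback ?? 'number'
  if (!NUMBER_FALLBACK_TYPES.includes(numberFallback)) {
    // e.g. 'bigint' cannot represent 2.3, so it is rejected as a fallback
    throw new Error(`Unsupported numberFallback: ${numberFallback}`)
  }
  // Proposed rule: irrationalResult defaults to numberFallback,
  // which in turn defaults to 'number'.
  const irrationalResult = config.irrationalResult ?? numberFallback
  return { number, numberFallback, irrationalResult }
}
```

Called with `{ number: 'bigint', numberFallback: 'Fraction', irrationalResult: 'BigNumber' }`, the sketch returns the three values unchanged; called with `{}`, all three resolve to `'number'`.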

@josdejong
Owner

If I understand you correctly you already do have a use case where you need different configuration for parsing vs calculation (like we were discussing here: #3374 (comment))? If so can you explain a bit more about it?

@gwhitney
Collaborator Author

gwhitney commented Mar 3, 2025

If I understand you correctly you already do have a use case where you need different configuration for parsing vs calculation (like we were discussing here: #3374 (comment))? If so can you explain a bit more about it?

Just as we were describing: in an expression like 1.28^(5 % 3) we want the 5 and 3 to be interpreted as bigints and the 1.28 to be interpreted as the exact Fraction 32/25, because we want to do exact arithmetic whenever possible with no roundoff error (this is a number theory application, focused on integers and rational numbers, so people would just be using decimal notation as a shortcut for the corresponding exact fraction). On the other hand we sometimes need to compute irrational numbers to great accuracy: for example we want floor(sqrt(2)n) to be the exactly correct integer, even for integers n that may have 20 digits, say. So when we hit that sqrt(2), we want to approximate it with a bignumber. Similarly, in sin((n/m) cycle) we want the exact fraction in the unit value, and bignumber accuracy in the sin computation, and frankly we would like to recognize the exact rational values of sin, like sin(1/12 cycle) = 1/2 -- there are not so many of them -- and return Fractions for those; that would be a later refinement.
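To show why a decimal literal can always be parsed exactly, here is a small self-contained sketch that turns a decimal string into a reduced numerator/denominator pair of bigints. The function `parseExactDecimal` and the `{ n, d }` result shape are hypothetical stand-ins for illustration; mathjs itself would produce a Fraction.

```javascript
// Greatest common divisor of two non-negative bigints (Euclid's algorithm).
function gcd (a, b) {
  while (b !== 0n) {
    const t = a % b
    a = b
    b = t
  }
  return a
}

// Parse a plain decimal such as '1.28' into an exact reduced fraction n/d.
function parseExactDecimal (text) {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(text)
  if (match === null) throw new SyntaxError(`Not a plain decimal: ${text}`)
  const sign = match[1]
  const intPart = match[2]
  const fracPart = match[3] === undefined ? '' : match[3]
  // '1.28' becomes 128 / 10^2 before reduction.
  let n = BigInt(intPart + fracPart)
  let d = 10n ** BigInt(fracPart.length)
  const g = gcd(n, d)
  n /= g
  d /= g
  if (sign === '-') n = -n
  return { n, d }
}
```

With this sketch, `parseExactDecimal('1.28')` yields numerator `32n` and denominator `25n`, matching the 32/25 in the comment above, with no roundoff at any step.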

@gwhitney
Collaborator Author

gwhitney commented Mar 3, 2025

Oh, that brings up another question: would it be better:

A) to have floor always return bigint when the argument is a bignumber or a Fraction?

B) to have floor return the config number type when the argument is a bignumber or a Fraction?

(I think it's clear that floor of a JavaScript number should be a JavaScript number in all cases.)

@josdejong
Owner

Ah, yes, you've convinced me. When parsing a number from a string, it always is a rational number (it has a limited number of digits). Only when doing calculations we can get irrational numbers. So, yeah, it makes sense to introduce a separate configuration option for this 👍.

How to best name this option? Some ideas:

  • numberIrrational (in the spirit of giving all number config options a number* prefix)
  • numberReal
  • irrationalNumber
  • irrationalResult (the "result" part sounds a bit weird to me)
  • realNumber

About floor, how about:

  • floor(number) => number
  • floor(BigNumber) => BigNumber
  • floor(Fraction) => BigInt since Fraction holds a BigInt numerator and denominator, and when the denominator is just 1n it does not add value so we can return the numerator (similarly, when using complex numbers, and some calculation returns an imaginary part 0, I would be happy to just get the number back with the real value rather than x + 0i).

@gwhitney
Collaborator Author

gwhitney commented Mar 6, 2025

OK on reflection I am comfortable with your proposed convention for floor; as bignumbers are approximations, it makes sense to return an approximation of the integer part with the same level of granularity, just like we do with number.

So that suggests numberApproximation for the new config option for the computational fallback type. What do you think of that? Or numberApproximate ? -- it's the type to use when going from exact to approximate in calculations.

@josdejong
Owner

👍

The name numberApproximate sounds good to me 👍.
