Inconsistencies in definitions of bigint operations and Fraction operations #3374
Comments
(I guess if I had to vote myself, at this moment I would say that in the spirit of the existing |
Upon sleeping on it, while what I said about 'closest' (Proposal 2) providing new mathematical capabilities is true, in a scheme where it is the only one directly supported by mathjs, it makes it the hardest for someone using the library to implement a different strategy themselves. The obstruction is that under 'closest', it can be difficult to tell when you've gotten an exact answer and when you've only gotten an approximation (in which case maybe you want to try another datatype). So I will edit out my recommendation that if there is only one that should be it. I guess if there is only one it should be 'promote'. But maybe it is best for mathjs to be at least somewhat configurable in this, as each proposal has its virtues. |
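A small sketch of the obstruction described above, using only built-in JavaScript bigint arithmetic (no mathjs calls). The helper name `divideClosest` is hypothetical and introduced only for illustration; it shows that a 'closest'-style result needs a separate check before a caller can tell whether it is exact.

```javascript
// Hypothetical helper (not part of mathjs): a 'closest'-style bigint division
// that also reports whether the result is exact.
function divideClosest (x, y) {
  // Built-in bigint division truncates toward zero, which coincides with
  // the floor whenever the exact quotient is nonnegative.
  const quotient = x / y
  // The only way to know the quotient is exact is to check the remainder.
  const exact = x % y === 0n
  return { quotient, exact }
}

const a = divideClosest(6n, 3n) // { quotient: 2n, exact: true }
const b = divideClosest(7n, 2n) // { quotient: 3n, exact: false }
console.log(a.quotient, a.exact)
console.log(b.quotient, b.exact)
```

Without the `exact` flag, the two results above are indistinguishable in kind, which is why a caller cannot easily decide when to retry with another datatype.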
Thanks Glen, this is a good discussion point. We have to propose your try/barf construct to the TC39 committee and make it a part of JavaScript :) For context: the general idea in I think there are two main use cases that we need to serve, and the idea behind the option
Thoughts:
|
Thanks for that feedback! With that we can start to converge to a workable plan. First, your description of the current state of mathjs w/r/t config.predictable is not quite reflected in the code as it stands. Nowhere in the code when predictable is false does mathjs consult config.number or config.numberFallback. Here is a complete catalog of the current uses of predictable:
That's all. So in listening to your feedback in terms of the two use cases, I would suggest that we keep predictable, and have its two values mean:
T. When predictable is true and all inputs are a given type, any result returned must be of that same type.
F. When predictable is false, and the result of the Platonic mathematical operation cannot be represented by the input type(s), mathjs returns its judgment of the "best" type it knows of to return that Platonic exact result.
Those two options seem to correspond well to the two use cases you mention. But they do leave two open questions:
M. (For "mixed") When predictable is true and inputs are of mixed type, what type shall we return? Should we just presume that the conversion operations will promote this situation in some way or another to a case of all inputs of the same type, and then apply principle T? I think that is roughly what is happening in the status quo, but there could well be cases in which we could produce "better" answers by choosing one of the types of the operands based on knowing what all of the supplied values are (e.g. a bigint times a fraction could result in a bigint if the denominator cancels). Should we worry about finding any of those cases? Or is it central to the idea of principle T that operations should first be reduced to cases of uniform type?
B. (For "barf") All current cases of T applied in practice in mathjs code deal with number and BigNumber, which are convenient in that those number systems contain their respective NaN values, which can never be "wrong" as the outcome of an operation. But as we become systematic about predictable, consider say
We need to make some at least initial decision on M and B. My recommendations:
M. At least for now, leave the status quo where we presume typed-function/conversions transmute everything we need to implement into the uniform-type case, and just focus on implementing that case.
B. Option 2 in which we explicitly say that mathjs will operate on this ExtendedBigint type (to the point of the TypeScript typings, etc.), and not precisely on the built-in bigint type, seems like the path of least resistance. (1) seems like a potential trap for clients ("How can I trust the result when I get back a 0?") and (3) seems like more of a pain ("I just want the answer, I don't want to have to wrap all my mathjs calls in try/catch"). But I could totally be convinced otherwise: if 0n is the only sentinel bigint uniformly used in this way, maybe it's not too hard to check if it's a real or sentinel answer; or maybe I shouldn't be so allergic to try/catch. So really I would be fine with any proposal here, as long as we do it uniformly across types that don't have NaNs and across all functions. (It was this ambivalence that led me to propose that the barfing style be configurable.) The answer could be different for bigint and Fraction, because Fraction is so easy/natural to extend to infinities and not a number, but bigint isn't.
And I think with decisions on M and B, we would be ready to systematize the (existing and future) mathjs functions. To examine the cases in my original post, under these recommendations:
How does that all sound? What are your feelings on questions M and B? |
Ah, you're right, sorry for the confusion. M. (For "mixed") that is an interesting point. I agree with you, I think it's fine to keep it like it's working now: there is a set way to resolve mixed data types, like mixing a B. (For "barf") I agree that option (1) a sentinel bigint would be tricky to use. I expect that option (3) introducing an When doing a calculation of which the result cannot be represented as a What I am thinking about though is whether we should change the cases where mathjs currently returns I like the idea of improving |
How do these ideas sit with you? I think we are close to being able to start systematizing mathjs functions along these lines. Thanks for the productive conversation! |
|
One more thought about (3): we have to think through how a programmer using mixed numbers and bigint has to catch all the possible error cases. Right now this is:
try {
  const result = math.evaluate(expr)
  // Number.isNaN does not coerce, so it is safe to call on a bigint result
  if (Number.isNaN(result)) {
    console.error('NaN result') // handle number NaN
  }
  // if we introduce a new kind of NotABigint, the programmer has to check for one more case:
  if (isNotABigint(result)) {
    console.error('NotABigint result') // handle bigint NotABigint
  }
} catch (err) {
  // handle exceptions
  // Needed in all cases: handling everything from syntax errors to errors like "Index must be an integer"
  console.error(err)
}
If we require programmers to check for an additional case of a |
Excellent, I am glad we are on the same page about making e.g.
Not that I am specifically aware of. You should probably read https://langdev.stackexchange.com/questions/1051/pros-and-cons-of-throwing-an-error-versus-returning-a-nan-value before making a final call here. The biggest "pro" for NaN that it mentions is vector and matrix calculations, where you can do the whole calculation first and then filter out any NaNs that turn up, rather than have to deal with each exception as it occurs (possibly making it impossible to resume the vector operation).
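To make the vector argument concrete, here is a minimal sketch in plain JavaScript (no mathjs calls). The helpers `divideNaN` and `divideThrow` are hypothetical stand-ins for the two styles being compared.

```javascript
// Two hypothetical division helpers on plain JavaScript numbers.
const divideNaN = (x, y) => x / y // 0 / 0 yields NaN, never throws
const divideThrow = (x, y) => {
  const q = x / y
  if (Number.isNaN(q)) throw new RangeError(`undefined quotient ${x}/${y}`)
  return q
}

const numerators = [1, 0, 6]
const denominators = [2, 0, 3]

// NaN style: finish the whole elementwise calculation, then filter.
const all = numerators.map((x, i) => divideNaN(x, denominators[i]))
const valid = all.filter(q => !Number.isNaN(q))
console.log(all) // [ 0.5, NaN, 2 ]
console.log(valid) // [ 0.5, 2 ]

// Throwing style: the first bad element aborts the whole map.
let completed = true
try {
  numerators.map((x, i) => divideThrow(x, denominators[i]))
} catch (err) {
  completed = false
}
console.log(completed) // false
```

In the throwing style the third quotient is never computed unless the caller wraps each element individually.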
Well that is certainly an important consideration.
And that is, too, but as you point out below, JavaScript works in different ways for different types, so if we decide we want to work the same way for all types, we could really pick either and just say we aligned with that particular JavaScript type that works the way we end up preferring.
I am not aware of any way to alter/extend/enhance the behavior of
And does option (A) then mean exceptions across the board, including for
i) Never throw, and always return some neutral sentinel, even to the point of creating a NotABigint special value of some kind that JavaScript won't be able to do much with but which math.add and the expression language can handle.
ii) Throw for types that have no NaN/infinities (bigint, Fraction if we don't extend it if you prefer throwing anyway), and return neutral sentinels for types that do (number, BigNumber, Complex, Fraction if we extend it)
iii) Always throw, including for say number(0)/number(0) which currently returns NaN
iv) When predictable is true, have an optional additional
I think at this point we have hashed this thoroughly, so my recommendation is that you read the language design page on this I linked above, and then make your final call among (i) -- (iv) above (or some other option I didn't think of) and we go with it. I will be fine with whichever decision, since I expect I will mostly stick with predictable = false personally anyway. I feel that is the "convenience" choice anyway.
We should revisit this once you have made the final call on (i) -- (iv). For example, if you land on (iii), there is no point/need to extend Fraction; if you land on (ii) then we can simply choose either way, extend Fraction or not (as you know, in that case I would personally lean to extending it, but as with the overall decision, in this case I would be fine either way).
OK, we are on the same page: if the final decision on 3 is to extend Fraction, then we will have
|
Well, technically speaking at the moment, if you are evaluating a completely arbitrary mathjs expression and you want to check all possible "bad" returns, already you need to check for BigNumber NaN and Complex NaN as well (and maybe some infinities and other things I am overlooking). So I think the main import of your example is that if |
I've indeed read up a bit on the pros and cons of
I too think that a special
Thanks for your clear listing of the options (i) to (iv). I think we can best go for (ii) or (iii). This is a choice between aligning with the way the data types work (ii) vs consistency for all data types (iii). It is probably tricky to achieve (iii) in a reliable way, never returning
About (3): in the spirit of the choice for (ii), I think we can leave Fraction too as it is, throwing an exception on
About (4): so we can leave the current implementations of functions like
About (5): yes I think that is fine, but let's address such improvements in separate PRs. I had a quick look, and there are currently two cases, one in
About my comment about needing extra NaN checks: thinking again, my comment doesn't make sense, we can use the |
I don't think it's so bad: simply the mathjs type would be an extended bigint, not the raw JavaScript bigint, so maybe we would rename it to EBigInt or something, and the typed-function type recognizer would accept a bigint or our special NotABigInt symbol (and maybe symbols for PosInfBigInt and NegInfBigInt), and then the basic functions would have to check for these symbols in their implementations. I don't see that it would mean extra signatures for most functions, just switching the bigint type to EBigInt in most cases.
But that said, I am comfortable with your decision to go with (ii) now and leave Fraction alone at least for now (even though I personally would rather have a rational class that allows -1/0, 0/0, and 1/0; the ban on them is kind of artificial).
I therefore think we have enough to proceed. I would plan to start with a PR that makes the documentation more explicit and thorough in this regard, and removes some of the existing violations (although such a PR may have to go on a branch started for the next breaking change). I don't know that I can get all violations removed in a single pass. (Actually I will pre-start by checking that none of my extant PRs introduce any new violations.)
I just want to be clear on (5): You are advocating that when predictable = false, it will be encouraged to return a simpler type when (and only when) it exactly represents the result? In other words, ultimately |
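For reference, a rough sketch of the extended-bigint idea described in this comment, in plain JavaScript. Everything here is hypothetical: the `NotABigInt` symbol, the `isEBigInt` recognizer, and the `addE`/`divideE` helpers are illustrative names, not existing mathjs or typed-function APIs.

```javascript
// Hypothetical sentinel for "not a bigint"; nothing like this exists in
// JavaScript or mathjs today.
const NotABigInt = Symbol('NotABigInt')

// Type recognizer in the spirit of a typed-function type test:
// an EBigInt is either a real bigint or the sentinel.
const isEBigInt = x => typeof x === 'bigint' || x === NotABigInt

// Each basic operation checks for the sentinel first so that it propagates.
function addE (x, y) {
  if (x === NotABigInt || y === NotABigInt) return NotABigInt
  return x + y
}

function divideE (x, y) {
  if (x === NotABigInt || y === NotABigInt) return NotABigInt
  // Return the sentinel instead of throwing or truncating when the
  // quotient is not an integer (including division by zero).
  if (y === 0n || x % y !== 0n) return NotABigInt
  return x / y
}

const ok = addE(divideE(6n, 3n), 1n) // 3n
const failed = addE(divideE(7n, 2n), 1n) // NotABigInt
console.log(ok, failed === NotABigInt)
```

The sentinel propagates through `addE` much as NaN does for number, at the cost of a check in every basic operation.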
OK, turns out I have just two outstanding PRs. For #3375, this issue is not strictly relevant because the result of |
And just to make it explicit, here is where we have ended up on the running examples that have gone throughout this issue, so you can confirm we are really settled on the same page. Breaking changes from current behavior are marked in bold as such. Note I have not considered changes in which a former throw would become a valid value; if that would be considered breaking, then essentially all conforming changes are breaking changes. First, the default behavior with predictable = false should be:
And now the behavior with predictable = true:
And note that to make the above changes workable for practical computations, we will need to add some operations:
|
Thanks for the constructive discussion, evaluating and weighing all options. Yes it would be technically doable to implement a Thanks for your plan to work on improving the documentation. It's indeed in large part existing behavior. It will be good to clarify this in the docs. About (5): yes indeed, we can look at cases where we can return a simpler type. I think we should be a bit careful there and weigh the pros/cons on a case-by-case basis. When a Fraction result is an integer, or a Complex result is purely real, we may want to return a numeric result with type About the overview with changes:
Your suggestions for functions |
OK, let's leave (5) aside for now and take up the discussion in #3398 for when the time is ripe (presumably when the agreed-on principles for when to go to more "complicated" types are documented and largely implemented). |
OK, we are fully on the same page now except for this point. I would say that it does not make sense to use
So then you might say "oh, let's use config.numberFallback". And I would agree that choice would be more reasonable, but still I would say that this is a parsing configuration, and computation is different from parsing, and there is little cost to additional configuration options. And moreover, the default numberFallback is
(In fact, since each BigNumber can record its own precision, it could definitely make sense to return a BigNumber with a precision that depended on its inputs rather than the config.precision -- it might be that the natural amount of precision would depend on the input. For example, if you ask for the square root of the bigint equal to 10^201, it seems a little weird to round it to just 64 digits -- I think you would expect at least 100 digits, to get to the decimal point, and you would very likely want at least some digits after the decimal point since you have asked for a non-integer value. EDIT: the config.precision should, I think, be the minimum precision in any case, though.)
So my recommendation here is either to have a new config option, called say
Looking forward to your thoughts and to getting going on this. |
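The input-dependent precision idea in the parenthetical above can be sketched as follows. This is only an illustration: `sqrtPrecision` and its parameters are hypothetical names, and the default of 10 fractional digits is an arbitrary assumption, not anything mathjs does.

```javascript
// Hypothetical helper: number of significant digits one might want when
// taking the square root of a bigint. All parameter names are illustrative.
function sqrtPrecision (n, configPrecision = 64, fractionDigits = 10) {
  if (n < 0n) n = -n
  const digits = n.toString().length
  // The integer part of sqrt(n) has ceil(digits / 2) decimal digits.
  const integerDigits = Math.ceil(digits / 2)
  // config.precision acts as a lower bound, as suggested in the EDIT above.
  return Math.max(configPrecision, integerDigits + fractionDigits)
}

console.log(sqrtPrecision(10n ** 201n)) // 111: 101 integer digits + 10
console.log(sqrtPrecision(2n)) // 64: the configured minimum
```

For 10^201 the integer part of the root alone needs 101 digits, so a fixed 64-digit precision would stop short of the decimal point.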
You're right. You can indeed set A new config option is an option indeed. I think though that the existing Existing description in the docs:
Attempt to rephrase:
What do you think? |
I think there are going to be plenty of breaking changes, so I wouldn't worry about another one if you'd prefer a different name for a unified parameter. But as to whether to unify: I wouldn't, but if you prefer it I am willing. I think I have already said my reasons, but I will quickly recap: one type of fallback is used exclusively in parsing and one exclusively in computing. As those are two different things, who knows if clients will always prefer the same type in both cases. If we have just one, clients are locked into using the same type. The parsing one really only has to do with bigints, because Fraction can represent 2.3, whereas the computing one comes up with both bigints and fractions. And since those are both high-precision types, it seems clear to me the default for the computing one should be bignumber, whereas you have been proposing a number default. OK, having laid out my full case I will await your final decision and then start implementing, whichever way you decide. |
The point of being able to configure parsing and conversion/calculation differently is a good one. You may be right, there may be a need for this. I'm not sure though and only time will tell I guess. On the other hand, keeping the number of options limited simplifies the use of the library. So, I understand your point, and maybe some day in the future you will say "I told you" ;), but for now, I prefer to use the existing config option I agree that when you configure So in your application, I think you will configure |
OK we will try that -- the first PR will focus on documenting the dual semantics of numberFallback and making non-breaking changes. Will get to it as soon as I have time. |
Thanks Glen! |
Oops, I was starting to code this up, and ran into a bind: for parsing, config |
Hm. Officially |
No, we want to interpret both integers and decimals as exact specifications of real numbers in our formulas, so we plan to use |
Ok makes sense. So then we have to update the docs to tell that |
Agreed, but then should we add an |
If I understand you correctly you already do have a use case where you need different configuration for parsing vs calculation (like we were discussing here: #3374 (comment))? If so can you explain a bit more about it? |
Just as we were describing: in an expression like |
Oh, that brings up another question: would it be better: A) to have floor always return bigint when the argument is a bignumber or a Fraction? B) to have floor return the config number type when the argument is a bignumber or a Fraction? (I think it's clear that floor of a JavaScript number should be a JavaScript number in all cases.) |
Ah, yes, you've convinced me. When parsing a number from a string, it is always a rational number (it has a limited number of digits). Only when doing calculations can we get irrational numbers. So, yeah, it makes sense to introduce a separate configuration option for this 👍. How to best name this option? Some ideas:
About
|
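The observation above, that a decimal literal with finitely many digits is always an exact rational, can be illustrated with a small sketch. The `parseExact` function and its `{ n, d }` return shape are hypothetical; this is not how the mathjs parser or fraction.js represent values.

```javascript
// Hypothetical parser: turn a plain decimal literal such as "2.3" into an
// exact ratio of bigints. Handles only digits with an optional fraction part.
function gcd (a, b) {
  while (b !== 0n) {
    [a, b] = [b, a % b]
  }
  return a
}

function parseExact (text) {
  const match = /^(-?)(\d+)(?:\.(\d+))?$/.exec(text)
  if (!match) throw new SyntaxError(`not a plain decimal literal: ${text}`)
  const [, sign, whole, frac = ''] = match
  let n = BigInt(whole + frac)
  let d = 10n ** BigInt(frac.length)
  // Reduce to lowest terms.
  const g = gcd(n, d)
  n /= g
  d /= g
  return { n: sign === '-' ? -n : n, d }
}

console.log(parseExact('2.3')) // { n: 23n, d: 10n }
console.log(parseExact('0.25')) // { n: 1n, d: 4n }
console.log(parseExact('42')) // { n: 42n, d: 1n }
```

No fallback to a floating point type is needed at parse time; the need only arises once a computation such as a square root leaves the rationals.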
OK on reflection I am comfortable with your proposed convention for floor; as bignumbers are approximations, it makes sense to return an approximation of the integer part with the same level of granularity, just like we do with number. So that suggests |
👍 The name |
Describe the bug
Many operations on integers, like division or square root or logarithm, do not necessarily produce integers. Similarly, many operations on rational numbers, like square root or logarithm, do not necessarily produce rational numbers. However, the approach that mathjs currently takes in determining whether to accept such arguments and what to return if it does varies widely both between the two different types and within the types. For some examples:
Division of a bigint by a bigint returns the floor of the exact answer when that is not an integer
Square root of a bigint returns the number/Complex (approximation, or exact) square root, even when the answer is an exact integer.
logarithm of a bigint follows the same scheme
Division of Fraction by Fraction is no issue, it's always a Fraction except 0/0 which throws (as it presumably will in any scheme)
Square root of a Fraction always throws
logarithm of a Fraction to a Fraction base returns a Fraction when the exact answer happens to be rational, and throws an error otherwise, meaning that you must use try/catch to use it unless you for some reason happen to know that the outcome will be rational (which seems like an unusual situation).
To Reproduce
There are many examples, such as:
Discussion
With growing support of bigint, it seems important to adopt and make clear a consistent philosophy on how mathjs will handle the results of mathematical operations that go outside the domain that the inputs come from. Otherwise, it seems likely the problems illustrated above will become worse, leaving mathjs prone to producing inscrutable behavior.
Here are some possibilities:
Proposal (1): Take a page from the very early story of mathjs and sqrt and number. There is no number that is the sqrt of a negative number such as -4. So mathjs has a config option predictable. When it is true, sqrt(-4) therefore returns NaN -- an entity of type number, that informs the user there was otherwise no appropriate number answer. When it is false (the default), mathjs is allowed to go to the best available type to represent the result, and so returns complex(0,2).
A slight hitch in the case of bigint and Fraction is that neither domain contains an analogue of NaN. So to truly remain within the type, when there is no suitable answer, our only alternative would be to throw. On the other hand, we might want to return null or NaN (choose one and use it always) even though it is not a bigint or Fraction, so that we are returning a sentinel value that can be checked for without try/catch, and which will presumably propagate into any further calculation, making the whole thing null or NaN to signal that somewhere within, something failed. So this option (1) splits into (1a) and (1b) depending on whether operations that cannot be satisfied throw an error, or whether they return null or NaN. To not have to repeat the below, I will just say the computation "barfs", and we would just need to pick one consistently (or allow it to be configured, perhaps by allowing additional values of the predictable option). (For the particular case of Fraction, it could be extended to allow the indeterminate ratio 0/0 as its own sort of NaN within its domain, as its particular kind of barfing.)
So proposal (1) would be:
When predictable is false (default), all operations strive to return the best result in the best available type whenever possible. When a result is irrational, that best type could potentially be number or bignumber, and perhaps the numberFallback configuration option should control the choice of which one. I will just say "floating point" below to be agnostic between number and bignumber. So for example, in this situation:
Dividing a bigint by another would produce a bigint when the quotient is an integer, and a Fraction otherwise
sqrt of a bigint would produce a bigint for perfect squares, a floating point for other positive numbers, and a Complex for negative bigints. (Note no integer has a rational but non-integer square root, as a matter of mathematical fact. This is the current behavior.)
logarithm of a bigint would produce a bigint for perfect powers, a Fraction for rational powers, a floating point for other positive arguments, and a Complex otherwise.
Dividing a Fraction by a Fraction would always be a Fraction
sqrt of a rational square would produce a Fraction, a floating point for other positive numbers, and a Complex for negative Fractions.
logarithm of Fraction to a Fraction base would produce a Fraction when it happens to be a rational power; in all other cases or if either input is a floating point type, it would produce a floating point or Complex result if possible.
When predictable is true, all operations barf whenever the answer cannot be the same type (with perhaps the variant of barfing being configurable). In particular, in this situation
Dividing a bigint by another would produce a bigint when the quotient is an integer, and barf otherwise
sqrt of a bigint would produce a bigint for perfect squares, and barf otherwise
logarithm of a bigint would produce a bigint for perfect powers, and barf otherwise
sqrt of a rational square would produce a Fraction, and barf otherwise
logarithm of Fraction to a Fraction base would produce a Fraction when it is a rational power, and barf otherwise.
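As a rough illustration of Proposal (1) for bigint division only, here is a sketch in plain JavaScript. It is not the mathjs implementation: `divideBigint` and its `predictable` option are hypothetical, the `{ n, d }` object merely stands in for a Fraction, and throwing is just one of the "barf" variants discussed.

```javascript
// Hypothetical sketch of Proposal (1) for bigint division. The { n, d }
// object stands in for a Fraction; it is not the fraction.js API.
function gcd (a, b) {
  a = a < 0n ? -a : a
  b = b < 0n ? -b : b
  while (b !== 0n) {
    [a, b] = [b, a % b]
  }
  return a
}

function divideBigint (x, y, { predictable = false } = {}) {
  if (y === 0n) throw new RangeError('Division by zero')
  if (x % y === 0n) return x / y // exact: stay in bigint either way
  if (predictable) {
    // "Barf": here by throwing, one of the variants discussed above.
    throw new RangeError(`${x}/${y} is not an integer`)
  }
  // predictable = false: promote to the best available exact type.
  const g = gcd(x, y)
  const sign = y < 0n ? -1n : 1n
  return { n: sign * x / g, d: sign * y / g }
}

console.log(divideBigint(6n, 3n)) // 2n
console.log(divideBigint(7n, 2n)) // { n: 7n, d: 2n }
```

With `{ predictable: true }`, the second call would throw instead of returning a ratio, so the result type never leaves bigint.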
Proposal (2): Take a page from JavaScript's definition of bigint division, and have mathjs always strive to produce the "best" approximation to an answer within the arguments' domain:
Dividing a bigint by a bigint produces the bigint that is the floor of the actual quotient (the current behavior).
sqrt of a nonnegative bigint would produce the floor of the actual square root, otherwise a sentinel value like 0 or -1.
logarithm of a bigint would produce the floor of the actual logarithm when that is real, otherwise a sentinel value.
sqrt of a nonnegative fraction would produce the exact Fraction when there is one, otherwise an approximation with minimal denominator within some set precision. One could press the existing precision configuration option into use, and say we want an approximation within 10^(-precision), to match roughly the precision we would get out of BigNumber. Or we could add a new configuration 'rationalPrecision', perhaps as the log of the maximum denominator allowed. For a negative fraction, you would get a sentinel value like 0, -1, or if we add such a thing, 0/0.
logarithm of Fraction would produce the exact Fraction when there is one, otherwise a minimum-denominator rational approximation within a set precision when there is a real value, and if there is no real-number logarithm, a sentinel value.
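The bigint items of Proposal (2) might look roughly like the following sketch. The names `sqrtClosest`, `logClosest`, and `SENTINEL` are hypothetical, and -1n is just one of the sentinel choices mentioned in the list; the Fraction cases (minimal-denominator approximation) are not covered here.

```javascript
// Hypothetical 'closest'-style helpers for bigint, per Proposal (2).
const SENTINEL = -1n // one of the sentinel choices mentioned above

// Floor of the square root, by integer Newton iteration.
function sqrtClosest (n) {
  if (n < 0n) return SENTINEL
  if (n < 2n) return n
  let x = n
  let y = (x + 1n) / 2n
  while (y < x) {
    x = y
    y = (x + n / x) / 2n
  }
  return x
}

// Floor of log_base(n) for n >= 1 and base >= 2, by repeated multiplication.
function logClosest (n, base) {
  if (n < 1n || base < 2n) return SENTINEL
  let exponent = 0n
  let power = base
  while (power <= n) {
    power *= base
    exponent += 1n
  }
  return exponent
}

console.log(sqrtClosest(16n), sqrtClosest(17n), sqrtClosest(-4n)) // 4n 4n -1n
console.log(logClosest(1000n, 10n), logClosest(999n, 10n)) // 3n 2n
```

Note that 16n and 17n give the same answer, so the caller cannot tell an exact result from an approximation without a further check such as squaring the result.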
Note for full consistency throughout mathjs in Proposal (2), predictable really ought to be abolished (it's only used in sqrt, pow, logarithms, and inverse trig and hyperbolic functions anyway) and sqrt on number should just return a sentinel value like NaN or 0 or -1 for negative numbers (etc.).
Proposal (3): Both proposals (1) and (2) have their virtues, so further extend/enhance/replace the predictable config with settings that produce any one of these classes of consistent behavior. E.g., an outOfDomainRule parameter that could be 'throw', 'null', 'NaN', 'sentinel' (to produce a specific sentinel chosen strictly within each domain) -- this first group of options are all roughly like predictable: true but just differ in detail; 'closest' -- proposal (2); or 'promote' -- the current behavior with predictable: false except extended to Fraction, which currently acts most closely like 'throw' but not exactly. (And in the case of 'promote', the type(s) to promote to, number or bignumber, might have to be configurable, perhaps by numberFallback.)
Proposal (4): Each of the settings in (3) has its virtues, but this whole configuring thing is overcomplex and bogs down implementations too much. Any one consistent class of behavior is feasible to work with, and you can always get other reasonable behavior(s) by explicit casts or by trying and then casting if need be. So just abolish "predictable" and any other config option like it, and in essence pick one outOfDomainRule, and implement it everywhere.
Frankly, any of these proposals is defensible and there are surely other reasonable options I haven't thought of. The key thing is that any consistent approach will be more understandable and scalable than the current type-dependent hodgepodge. And as I dive into the details of the bigint implementation, it would really be helpful to settle, sooner rather than later, on a general philosophical direction that mathjs will commit to in the long run, even if it doesn't move quickly toward strict compliance. It would really inform the refinement of the bigint implementation. Thanks so much for your thoughts!
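A minimal sketch of how an outOfDomainRule setting in the sense of Proposal (3) could be dispatched, assuming a hypothetical `outOfDomain` helper and using inexact bigint division as the example. None of these names exist in mathjs, and promoting to a plain number (rather than bignumber or Fraction) is an arbitrary choice made for brevity.

```javascript
// Hypothetical dispatcher for the out-of-domain rules listed in Proposal (3).
function outOfDomain (rule, { message, sentinel, closest, promote }) {
  switch (rule) {
    case 'throw': throw new RangeError(message)
    case 'null': return null
    case 'NaN': return NaN
    case 'sentinel': return sentinel
    case 'closest': return closest()
    case 'promote': return promote()
    default: throw new TypeError(`Unknown outOfDomainRule: ${rule}`)
  }
}

// Example operation: bigint division whose exact result is not an integer.
function divide (x, y, rule) {
  if (y === 0n) throw new RangeError('Division by zero')
  if (x % y === 0n) return x / y // in-domain results ignore the rule
  return outOfDomain(rule, {
    message: `${x}/${y} is not an integer`,
    sentinel: 0n,
    closest: () => x / y, // truncated quotient
    promote: () => Number(x) / Number(y) // floating point approximation
  })
}

console.log(divide(7n, 2n, 'closest')) // 3n
console.log(divide(7n, 2n, 'promote')) // 3.5
console.log(divide(7n, 2n, 'null')) // null
```

Under Proposal (4), the `rule` argument would disappear and exactly one branch of the switch would be hard-wired everywhere.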