Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lang] Make floor, ceil and round accept a dtype optional argument #5307

Merged
merged 17 commits into from
Jul 9, 2022
Merged

[Lang] Make floor, ceil and round accept a dtype optional argument #5307

merged 17 commits into from
Jul 9, 2022

Conversation

neozhaoliang
Copy link
Contributor

Currently the integer returned by floor function has float type, this PR adds an option to the returned value type.

@netlify
Copy link

netlify bot commented Jul 1, 2022

Deploy Preview for docsite-preview ready!

Name Link
🔨 Latest commit 529427f
🔍 Latest deploy log https://app.netlify.com/sites/docsite-preview/deploys/62c8e96d85bff60009a89db0
😎 Deploy Preview https://deploy-preview-5307--docsite-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@neozhaoliang neozhaoliang changed the title [Lang] Make floor accept a dtype optional argument [Lang] Make floor accept a dtype optional argument Jul 1, 2022
Copy link
Contributor

@ailzhang ailzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OOC: what's the motivation for this change?

@neozhaoliang
Copy link
Contributor Author

OOC: what's the motivation for this change?

A common need in simulations and renderings is that, for a point p we want to find the cell index i, j that p falls into, and use that index to access field elements. Hence allowing return integer types would be more convenient here.

Copy link
Contributor

@ailzhang ailzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Would you mind add an example with dtype specified in the docstring, as well as some test cases like test_random_int?

@strongoier
Copy link
Contributor

I doubt whether this change is necessary. Does ti.floor(x, int) have a clear advantage over int(ti.floor(x))?

@sjwsl
Copy link
Contributor

sjwsl commented Jul 1, 2022

I doubt whether this change is necessary. Does ti.floor(x, int) have a clear advantage over int(ti.floor(x))?

Agree. What's worse is that the range of dtype may be not big enough, e.g. f32->i32, which is UB. It should be users' responsibility to use the result correctly and avoid overflow. floor in C++ (and all the statically typed languages I know) always returns float or double.

@yuanming-hu
Copy link
Member

I'd actually vote for a function like ti.floor(val, dtype) given its simplicity, but I'm open to discussions.

I doubt whether this change is necessary. Does ti.floor(x, int) have a clear advantage over int(ti.floor(x))?

Sometimes using int to hold the result of ti.floor may not be what the user wants and you may want to explicitly specify ti.i32/ ti.u64 etc as the result type.
I guess if implemented correctly, ti.floor(x, ti.i64) would be clearer than ti.cast(ti.floor(x, ti.i64)).
The other option is that we allow users to directly cast values/vectors using a syntax like ti.i32(x).

What's worse is that the range of dtype may be not big enough, e.g. f32->i32, which is UB.

IMHO this is actually not making things worse :-) You need such a cast anyways in many use cases.

@k-ye
Copy link
Member

k-ye commented Jul 1, 2022

What's worse is that the range of dtype may be not big enough, e.g. f32->i32, which is UB.

I think the point is that f32 has a much larger range than INT_MAX/2,147,483,647? That opens up the door for UB in the API...

@sjwsl
Copy link
Contributor

sjwsl commented Jul 2, 2022

I think the point is that f32 has a much larger range than INT_MAX/2,147,483,647? That opens up the door for UB in the API...

Yeah, but users can do some custom checks to be safe. For example:

template <std::integral T>
std::optional<T> ifloor(double num) {
    auto double_result = std::floor(num);
    auto max = std::static_cast<double>(<std::numeric_limits<T>::max); // may have precision issues, just for demonstration.
    auto min = std::static_cast<double>(<std::numeric_limits<T>::min);
    return double_result <= max && double_resuld >= min ? std::static_cast<T>(double_result) : std::nullopt;
}

Of course we can do something similar inside the API to avoid UB, but I don't think there is a generally accepted logic here. IMHO that's why std::floor in C++ just returns what it accepts.

IMHO this is actually not making things worse :-) You need such a cast anyways in many use cases.

I do agree it won't overflow and we just cast it immediately in most cases. My point is that there should be as few APIs as possible with potentially undefined behavior, especially in the standard library. It's probably ok if thinking of it as a syntactic sugar and it does save some typing in many cases. But if UB is expected, at least it shall be documented.

@KuribohG
Copy link
Contributor

KuribohG commented Jul 2, 2022

I would like to say the syntax ti.floor(x, int) is more self-documented than ti.i32(ti.floor(x)). A newbie user will have to refer to the document for casting in Taichi if we adopt the latter syntax, while only need to refer to the signature for the former one, which can be done in IDE or IPython.

UB isn't a matter here if we guarantee the float version of this function is well-defined. Like many C++ standard functions, unsafe versions are provided, like std::span didn't have a boundary check, and std::thread didn't provide a default destructor.

@yuanming-hu
Copy link
Member

What's worse is that the range of dtype may be not big enough, e.g. f32->i32, which is UB.

I think the point is that f32 has a much larger range than INT_MAX/2,147,483,647? That opens up the door for UB in the API...

That's true, but I believe the user will always get a chance to use ti.floor(x), without specifying the dtype argument, so that the type of x is returned. The current implementation has an issue, since if float = ti.f32 and x is ti.f64, he gets an f32 return value. This should be fixed.

@strongoier
Copy link
Contributor

I'd actually vote for a function like ti.floor(val, dtype) given its simplicity, but I'm open to discussions.

This change can indeed make user code slightly shorter, but IMHO a function should have single responsibility. If we decide to add a dtype parameter to ti.floor, are we going to repeat this (creating an optional casting step) for all functions?

The other option is that we allow users to directly cast values/vectors using a syntax like ti.i32(x).

The syntax ti.f32(1.3) is used to annotate the type of a literal (like 1.3f in C), which is NOT a type cast. IMHO mixing two different semantics in one syntax is confusing.

The current implementation has an issue, since if float = ti.f32 and x is ti.f64, he gets an f32 return value. This should be fixed.

Could you provide an example running into the bug? IIRC the current implementation returns a value of the same type as the input value.

@yuanming-hu
Copy link
Member

If we decide to add a dtype parameter to ti.floor, are we going to repeat this (creating an optional casting step) for all functions?

Probably we only do this for floor, since this is the only floating-point function that takes a float but users typically want an int output.

@strongoier
Copy link
Contributor

If we decide to add a dtype parameter to ti.floor, are we going to repeat this (creating an optional casting step) for all functions?

Probably we only do this for floor, since this is the only floating-point function that takes a float but users typically want an int output.

Maybe ceil and round as well?

@k-ye
Copy link
Member

k-ye commented Jul 5, 2022

If we decide to add a dtype parameter to ti.floor, are we going to repeat this (creating an optional casting step) for all functions?

I think this is a valid point. If floor has a dtype, at least I would expect that ceil has the same interface for symmetry.

That's true, but I believe the user will always get a chance to use ti.floor(x), without specifying the dtype argument, so that the type of x is returned.

I agree.


Now giving it more thoughts, I think one possibility that C floor doesn't have an type cast is probably because it can't (without generics or type-as-a-parameter mechanism), i.e. there's no way for it to get a dtype info. If we look at numpy.floor (or any of its ufunc), there is a casting policy:

So my current opinion is that we don't have to abide to C's API design, but we should do the same for floor and ceil.

@neozhaoliang neozhaoliang changed the title [Lang] Make floor accept a dtype optional argument [Lang] Make floor, ceil and round accept a dtype optional argument Jul 6, 2022
@neozhaoliang
Copy link
Contributor Author

Added dtype option to ceil and round functions.

Copy link
Member

@yuanming-hu yuanming-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! The implementation LGTM, but could you update the doc and add a few test cases?

@neozhaoliang
Copy link
Contributor Author

test script added.

Copy link
Member

@yuanming-hu yuanming-hu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neozhaoliang neozhaoliang requested a review from writinwaters July 8, 2022 09:45
@neozhaoliang neozhaoliang merged commit 332298b into taichi-dev:master Jul 9, 2022
@neozhaoliang neozhaoliang deleted the fix-floor branch July 9, 2022 03:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants