Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort transform #1609

Merged
merged 8 commits into from
May 10, 2017
Merged

Sort transform #1609

merged 8 commits into from
May 10, 2017

Conversation

etpinard
Copy link
Contributor

Introducing a sort transform which can be very useful for example to sort marker points in descending order:

peek 2017-04-19 10-51

@etpinard etpinard added this to the 1.27.0 milestone Apr 19, 2017
@etpinard
Copy link
Contributor Author

cc @alexcjohnson @n-riesco

},
order: {
valType: 'enumerated',
values: ['ascending', 'descending'],
Copy link
Contributor Author

@etpinard etpinard Apr 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose ascending and descending to be familiar with axis categoryorder.

return function(v) { return +v; };
}

function getIndices(opts, targetArray, d2c) {
Copy link
Contributor Author

@etpinard etpinard Apr 19, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Challenge to all plotly.js contributors: find a faster sort algorithm that this O(n^2) attempt. Note that binary search is nice, but doesn't work so great for arrays with duplicate items.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I quite understand. For the overall sort? Can you use something like stable? 1. assemble [{value: 2, index: 0}, {value: 5, index: 1}, ...]. 2. Apply stable sort. 3. collect sorted indices and reorder other data arrays by that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stable sorting may be useful, e.g. the user presorted based on some criteria, we'd keep that, even if this new facility will support sorting based on multiple fields. Stable sorting is simple with the built-in Array.prototype.sort as in this recently merged thing. The built-in sort is rather fast.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function d2c seems to always be getDataToCoordFunc which looks a bit expensive. Since it's a lot of calls to this function - 2 calls on each comparison - there's probably benefit to first map (e.g. in a loop) to get the d2c values for all items, and then sort based on that. One way is sorting value/index pairs as above, or more simply, just the indices in a numeric array. Another is to add the d2c value as a property to the objects to be sorted, and then use return a.d2cValue - b.d2cValue || a.originalIndex - b.originalIndex similar to the above link in which case the objects themselves are easily sorted (using the accessor is very fast).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @monfera , that .map(d2c) pass pre-.sort sounds like a good idea 🐎

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ put in a bit more code for copy/paste into console in isolation and fixed a typo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ another fix and using actual objects to verify stability of the sorting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@monfera how does ^^ handle duplicate items?

Copy link
Contributor Author

@etpinard etpinard Apr 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some interesting results in https://gist.github.com/etpinard/86949a01309d5505fab8ccca2b0cd875

Looks like my dumb attempt isn't that bad after all (barring any typos ...)

};

// TODO reuse for filter.js
function getTargetArray(trace, target) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about putting this in src/lib/. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

// TODO reuse for filter.js
function getDataToCoordFunc(gd, trace, target, targetArray) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about putting this in src/plots/cartesian/. Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, there's nothing explicitly cartesian about this but that's where we have all our axis logic, makes more sense there than hiding in a transform. I guess this says at some point we might want to make a src/components/axes/ or something... but that's a long discussion and even longer project.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want to make a src/components/axes/

I like to think of components as optional, i.e. things that could be left out of plotly.js/lib/core.js for users that want to make custom bundle w/o them to shave off some 🍔. As we can't (right?) make the axes logic optional, so I'd vote 👎 for making axes a component. But agreed, most the axes logic isn't strictly cartesian. Maybe it's time for making a src directory e.g. called src/base/ or src/common/ or src/plots/common/ for non-src/lib/ things that are used across multiple plot types?

- use it in filter and sort modules
- use it filter and sort modules.
src/lib/index.js Outdated
if(typeof target === 'string' && target) {
var array = lib.nestedProperty(trace, target).get();

return Array.isArray(array) ? array : [];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either this should be : false or the one below should be []

I'd probably vote for false and then instead of the if(!len) return; check, do if(!targetArray) return;

That said, this raises the question, in these applications, of what we should do when targetArray isn't the same length as the other associated arrays. Looks like for both, it will truncate (or extend) all of them to the length of targetArray. Is that what we want? Seems right for sort, but if we agree then lets 🔒 it with a test. For filter there's an argument to be made that eg [] and )(, and {} and }{, should be complementary - which they are within targetArray but not with respect to the items beyond it. Maybe that's still what we want, or maybe we don't want the extra complexity in so far as we think this would be incorrect usage with no legitimate use case. I don't have a strong opinion other than that it should be clear how we handle this case.

BTW it took me a minute to recall the motivation for the different cases in this function - can you give the function a docstring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in these applications, of what we should do when targetArray isn't the same length as the other associated arrays. Looks like for both, it will truncate (or extend) all of them to the length of targetArray. Is that what we want? Seems right for sort, but if we agree

Yep, that's how I designed back when filter was first added.

but if we agree then lets 🔒 it with a test.

Sure thing.

can you give the function a docstring?

good call 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done in 2dd1333 , 9c70ee9 and 4f91957


for(var i = 0; i < arrayAttrs.length; i++) {
var np = Lib.nestedProperty(trace, arrayAttrs[i]);
var arrayCopy = np.get().slice();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it actually necessary that this be a copy? arrayNew is already a different object...

Copy link
Contributor Author

@etpinard etpinard May 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good call. We're only reading the values from those arrays here.

expect(out[1].ids).toEqual(['n0', 'n1', 'n2', 'z', 'p1', 'p2', 'p3', undefined, undefined]);
expect(out[1].marker.color).toEqual([0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.4, undefined, undefined]);
expect(out[1].marker.size).toEqual([10, 20, 5, 1, 6, 0, 10, undefined, undefined]);
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice tests! 🎉

@alexcjohnson
Copy link
Collaborator

Looks great! 💃

@@ -241,4 +242,102 @@ describe('Test sort transform interactions:', function() {
.catch(fail)
.then(done);
});

it('does not preserve hover/click `pointNumber` value', function(done) {
Copy link
Contributor Author

@etpinard etpinard May 10, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@n-riesco @alexcjohnson IMPORTANT: event data pointNumber does not correspond to the array index in gd.data after a sort transform. The same is true for filter transform with preservegaps: false. This is an unfortunate consequence of the current transform pipeline. Using ids become crucial here. That is, as presented in this test case:

var pt = eventData.points[0]

pt.fullData.ids[pt.pointNumber] 
// => is preserved after a sort transform

We could eventually add a originalPointNumber field to the event data to solve this problem. Moreover, we should really add an id field to the hover & click event data and make the ids data array more ubiquitous (it is only available for scatter traces at the moment).

@etpinard etpinard merged commit 0204845 into master May 10, 2017
@etpinard etpinard deleted the sort-transform branch May 10, 2017 18:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature something new
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants