Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sankey feature branch #1591

Merged
merged 26 commits into from
May 9, 2017
Merged

Sankey feature branch #1591

merged 26 commits into from
May 9, 2017

Conversation

monfera
Copy link
Contributor

@monfera monfera commented Apr 12, 2017

An example: http://s.codepen.io/monfera/debug/aWbVev/DqrDdEVVJngr

Add-on features:
- use of force for drag&drop without manual overlap fiddling
- multiple force ticks per frame for faster response
- starting/stopping force per graph layer (column)
- use of force for handling apparently non-circular shapes
- vertical layout
- option for multi-edge handling d3/d3-sankey#19
- plotly tooltips and raising various plotly events
- plotly-style color contrast: text becomes white on darker backgrounds
- highlight all connecting links and nodes when hovering a node

Current limitations:
- positions aren't saved (layout lost on update)
- positions aren't specifiable yet (no attributes for this yet)
- cheesy horizontal/vertical switch
- screenshot fails for a couple of reasons
- no test cases included yet (except a couple semi-mocks)
- tooltip doesn't clear on drag once there was a Plotly.restyle (works after newPlot)
- color contrast on the bright red node isn't good

Notes:

  • As the automatic layouting isn't quite optimal, and it'd be a hard thing to do in compact code (and time) it was important to facilitate ergonomic rearrangement. The general solution - letting nodes slide and overlap - is tedious, due to overlaps and manuality. The force layout of D3 was used to assist in layouting.

@monfera monfera self-assigned this Apr 12, 2017
@alexcjohnson
Copy link
Collaborator

@monfera this is looking awesome! A few things popped out to me while playing with it:

  • Weird things happen when you have fat bands that need to make a sharp turn. I guess there really isn't a solution that maintains the width in this case, but I'd much prefer to sacrifice precise width and get rid of these sharp artifacts.
    screen shot 2017-04-12 at 7 01 32 pm
  • When dragging nodes, I feel like the main thing I want to do is reorder them, but with force layout still running it's hard to reorder because the other nodes keep getting pushed away. Can we do something like disable force layout during drag, then turn it back on at drop? Or make the positions of the other nodes stickier in some other way so they'll push a bit but then you can drag past them and they jump to the other side?
  • I definitely prefer sticky tooltips. Looks like their positioning is a little off right now, but I'd vote to have the node tooltips try to stick to the middle of the right edge of the node, then swap to the left only if necessary. And for links, I can see why you made them flip as the mouse moves across the centerline, but I find that a bit jarring... I'm not sure what the best location is for these, maybe where they are (which looks like it's trying to be the very center of the link) but maybe there would be a good way to move them to the edge of the link? I'm not sure.
  • Tooltip text: the size should probably go into the main tip. Maybe the node name(s) should go in the secondary text? Not sure how that would look for links though... Perhaps we can make the text a little more conversational? Like maybe from and to instead of source and target, and inflows and outflows instead of incoming / outgoing flow count?
  • The secondary text color is a bit funny - normally that gets the tooltip background color, but for nodes it looks like right now you have it as the border color.
    screen shot 2017-04-12 at 7 22 16 pm

].join(' ')
},

valueunit: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

valuesuffix would be best here, similar to ?axis.ticksuffix

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, switching to this!

},

// followmouse for UX testing
followmouse: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔪


textfont: fontAttrs,

nodes: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. I like this API 😄

description: 'The nodes of the Sankey plot.'
},

links: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done, very plotly-esque.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, tried to apply things learnt off the bat with this new chart; still room for improv!

usedNodeCount = 0,
indexMap = [];

var defaultPalette = d3.scale.category20();
Copy link
Contributor

@etpinard etpinard Apr 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess our 10-color default palette isn't ideal for Sankey. But I would still vote 👍 for using it for Sankey for consistency.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good in principle, I'll try and add some screenshots to get a comparison.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It actually looks OK! A slight loss is that repetition is more noticeable as it has fewer colors, but users may specify per-node color and use a d3 or colorbrewer categorical palette however they choose. I'll switch to the plotly palette for consistency.

With d3.scale.category20:
image

With the 10 default plotly.js colors:
image

return Lib.coerce(nodeIn, nodeOut, attributes.nodes, attr, dflt);
}

for(i = 0; i < nodesIn.length; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty heavy operation. It would be nice to move it (or some of it) to the calc step.



foundUse = false;
for(j = 0; j < traceOut.links.length && !foundUse; j++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. I meant this one.

Defaults should still have to loop over all nodes and links to coerce their attributes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my education, since as you say, defaults must loop through it anyway (tho not nested) would putting it in calc result in more efficient code, or are you referring to the convention where computational things are usually in calc? I put it in defaults without much thought simply b/c filtering of unrendered elements is typically done in defaults and didn't think of it as 'calculation'.

Plot.moduleType = 'trace';
Plot.name = 'sankey';
Plot.basePlotModule = require('./base_plot');
Plot.categories = ['gl'];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch, gl was a leftover, I'll still need to eyeball the diffs as the PR ripens. I did

Plot.categories = [];

with which everything worked fine. I had to specify some set (even if the empty set) otherwise registry.js throws. I also tried

Plot.categories = ['sankey'];

but it failed on some coercion.

description: [
'Sankey plots for network flow data analysis.',
'The nodes are specified in `nodes` and the links between sources and targets in `links`.',
'The colors are set in `line.color`.'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👓 you mean in nodes[i].color and links[i].color?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes... thanks for the catch!

}
}

var log = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔪

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, this and some other things such as index-sankey are only for current dev. Will remove as it nears merging; now just helpful w/ churning out codepen with small bundle size.

@monfera
Copy link
Contributor Author

monfera commented Apr 13, 2017

@alexcjohnson thanks for the comments!

  1. Weird band overlaps: it's inherent in the (simple, compact) Sankey code we use by Mike Bostock, and happens in tighter layout situations. The codepen demo just flips when going to vertical mode; I should update the example to also add height when doing so (can I change 2 styles or one style and one layout value with one button?). If you greenlight it I'm glad to turn the current, very thick stroke based link rendering of d3-sankey into a fillbased link, which may give us the room for artifact reduction. It'd be another PR to d3-sankey itself, if it works out fine.
  2. We discussed that we'd loop in some more pairs of eyes and think of ways to help 'feature discovery' for end users - I'll come back with findings. Also, we have a (not yet exposed) option for non-snappy node rearrangement including sideways movement that stays on drag release.
  3. Sticky tooltips: indeed it's more plotly-esque! The question of using mouse following, or tweened tooltip positions as e.g. Highcharts does goes beyond a single chart type, more of a UX/engagement question whether or not I feel it'd suit the Sankey (which has these long shapes in contrast to e.g. scatter that uses denser, smaller glyphs: point markers, and as you say the static version is a bit 'jarring'). The static version is positioned to the middle of the link out of convenience: the center of the bounding box is guaranteed to be over the link. I thought about putting it to a terminal, but it could be confused with the nodes themselves. They flip as you move the mouse to reduce likelyhood of occlusion, and to let users de-occlude the tooltip area without entirely losing the tooltip. Further thoughts absolutely welcome!
  4. Tooltip text vs value flip: indeed it's reversed. If we want to show a bunch of things in the colored box area (which I think users would sometimes want with Sankey, e.g. multiple props or statistics) then the colored rectangle will be wide. If we put the textual label into the non-colored box area, then both adjacent boxes will be wide. Putting the value there instead saves precious width and therefore reduces avoidable occlusion, already a concern with sizeable tooltips as it is. Also, the identity of a node is always salient - it shows up on, or next to the box all the time. So the single distinguished key thing that drives the Sankey is the value, as it shows flows (alluvial). Glad to receive explicit direction or further considerations on this; the above was just my initial thoughts.
  5. Glad to adhere - could you pls. clarify for me what should be what color? e.g. 'the value 18 should show as black, on an orange background, but here it's green` or some such.

Thanks again for the feedback and I'll keep my thoughts humming on these, also appreciating further thoughts or decisions.

@monfera
Copy link
Contributor Author

monfera commented Apr 19, 2017

@alexcjohnson regarding your comment on the colors, there's a new version, not yet in this squashed branch, that has an improved color handling: http://codepen.io/monfera/full/aWbVev/

The values (e.g. TWh values here) are in the node color, and if the node color is too light, then it's darkened for better contrast on the white tooltip background. It may still be at odds with what you want so my question above stands.

@alexcjohnson
Copy link
Collaborator

The values (e.g. TWh values here) are in the node color, and if the node color is too light, then it's darkened for better contrast on the white tooltip background. It may still be at odds with what you want so my question above stands.

It's still a tiny bit different from what we do elsewhere, though that sometimes makes a mistake:
screen shot 2017-04-19 at 2 09 01 pm
(there's supposed to be some text in the white box's extra section... but it's white! this is mock 10 BTW)

The rationale was that we picked a color from the trace that made a visible point against the plot background - the user wouldn't have made a point that wasn't visible! so it should be visible against a semitransparent version of the same. Obviously the premise there ("we picked a color from the trace that made a visible point against the plot background") isn't always correct. Better in that particular case would be to have chosen the red, rather than the white marker fill, but that's a separate issue, there could be cases where no color from the point/trace is really sufficiently contrasting.

So actually, if you make sure that the algorithm you're using works for arbitrary background color (white, black, some grey in the middle...) then I think I'd like to use it everywhere. I didn't look at your implementation at all, but in case you didn't see it go by there's a recently added/modified routine Color.contrast made explicitly for purposes like this.

@alexcjohnson
Copy link
Collaborator

BTW just looking at that codepen again - multiple links between the same nodes are cool, but do you have a strategy for ensuring they don't overlap? Perhaps related to the strategy for super-wide links, that may need to move to explicit paths instead of fat-stroked lines?

screen shot 2017-04-19 at 2 25 00 pm

@monfera
Copy link
Contributor Author

monfera commented Apr 19, 2017

@alexcjohnson

Overlap: yes the codepen is partly a demonstration for the overlap effect. We had no client requirement for multi-edges, though @jackparmer thought we should have this feature, so this PR was born which shows the improvement. IOW the original Sankey implementation had no support for multi-edges and with the PR we do have it, although not in a perfect way. It's OK if there are lots of items and bands are narrow, but noticeable if bands are wide and Y difference is significant. Solving it might be half day to a day, should I go ahead with it or let's consider it in a subsequent merge?

Color: Yes I'm already using Color.contrast - the way it's used in Sankey is that it darkens the lighter colors on the white background, i.e. the color will appear perceptually same or close enough but in fact will be darker. If the color logic feel satisfactory or robust with the Sankey, maybe we can make a separate subsequent PR that unifies behavior across other plots if this task is greenlighted.

@monfera
Copy link
Contributor Author

monfera commented Apr 19, 2017

@alexcjohnson I forgot to add that yes, both the overlap avoidance, and the avoidance of bulges and the sometimes appearing circular artifacts around the nodes would be best taken care of together, because both would involve a rewrite of the current stroke based link making logic with a fill based one where we'd have finer grained control.

@alexcjohnson
Copy link
Collaborator

both the overlap avoidance, and the avoidance of bulges and the sometimes appearing circular artifacts around the nodes would be best taken care of together, because both would involve a rewrite of the current stroke based link making logic with a fill based one where we'd have finer grained control.

Yes, we definitely need to solve these problems. Your call how to go about it, if you think fill-based drawing is the easiest then go for it, I'll just point out that there are possibilities without dropping stroke-based rendering: The artifacts on fat, short links could use a transform to compress the stroke only along one axis. And the multi-link overlaps could potentially be solved by moving the control points, though seems like that would leave gaps... alternatively use alpha masking - would be a bit annoying to construct, from one or two copies of the original path with different stroke widths and filled to one side... but I'm pretty sure you could get arbitrary lateral segments of the stroke that way.

That said, if we ever want to support outlines on the links, might be best to bite the bullet now and go for fill-based links. It's not used very often, but occasionally, like: sankey with link outlines

If the color logic feel satisfactory or robust with the Sankey, maybe we can make a separate subsequent PR that unifies behavior across other plots if this task is greenlighted.

Sounds right to me. Get this one perfect and then at some point we'll merge it back to the rest of them. The one thing I'll note right now is what should happen when the background is not white (change paper_bgcolor in your example - this also shows that the diagram is overflowing the allowed area...) - the secondary text shouldn't be on semitransparent white, it should be semitransparent of whatever the background color is, then pick a text color that's legible against that.

@monfera
Copy link
Contributor Author

monfera commented Apr 20, 2017

@alexcjohnson thanks for the useful comments! I take it from your answer that overflow is not okay i.e. all contents must fit in the domain rectangle. One option would be to measure the text lengths, but it'd be quite brittle - the effective width of the area covered by the nodes and links would change arbitrarily, depending on the length of the nodes in the rightmost layer. Another option is to truncate or break to multiple lines (not practical if a node is small). So I'd favor placing the texts to the left side of the nodes. The slight problem with that is that node texts of the two rightmost columns could then overlap. Label collision avoidance is doable with more coding, but then the label may be separated from the node, perhaps linked with a polyline as in the case of pie charts. It's somewhat complex so I'd only put time into it if it's requested; otherwise I'd just place the rightmost labels to the left. Please also see Mike Bostock's example, where apparently not just the rightmost, but several layers have labels folded to the left.

@alexcjohnson
Copy link
Collaborator

I take it from your answer that overflow is not okay

yeah, I think that's right - I'm not sure we've ever explicitly stated that, but for example if someone specifies the width and height of the div they plot into, we translate that into the plot width and height, and I expect they would not want the plot overflowing that. Also for iframe embeds, if the plot is autosized we will size it to the iframe and disable scrolling.

otherwise I'd just place the rightmost labels to the left.

Sounds fine, at least as a starting point.

@monfera
Copy link
Contributor Author

monfera commented Apr 25, 2017

Hi @alexcjohnson, hope you see your feedback acted on in the updated example such as

  • no circular bulges when moving nodes too close
  • no partial overlap or gap with multiedges
  • left-swept node labels in the rightmost layer
  • better text-shadow to make node labels stand apart, no matter the surrounding background color and or mixed link color (there's no need for text-shadow in the vertical orientation as text is inside node then)
  • toggle button for black background - it'll swap the default rgba(0,0,0,0.2) link color to rgba(255, 255, 255, 0.6) which isn't noticeable - yielding the same perceived color - but necessary for the right balance of translucency and contrast (you'd notice very much if there were no link color change when switching; an example of which is the manually specified red/blue that was taylored for white background and isn't as good on black background)
  • to me, these goals seem to be met:
    • the hover tooltip is readable with the defaults and various colors, both with dark and light background (toggle w/ the top button on the left)
    • the tooltip is translucent, which is useful as it can be large depending on the contents, and in my experience, not having some translucency is disturbing as it hides the presence of link flows behind it
    • the node / link colors are present in one form or another on the tooltips
    • looks pleasing (subjective)

As @etpinard indicated, the Follow mouse will give way to the Sticky tooltip option (see button) as it's more in line with the rest of the trace types.

I think Please let me know if you see any glaring unergonomic thing or incompliance, or if I haven't addressed some of your feedback.

@alexcjohnson
Copy link
Collaborator

alexcjohnson commented Apr 25, 2017 via email

min: 1,
dflt: 20,
role: 'style',
description: 'Sets the thickness of the `nodes`.'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in px?

@@ -41,26 +41,26 @@ module.exports = function supplyDefaults(traceIn, traceOut, defaultColor, layout
return Lib.coerce(traceIn, traceOut, attributes, attr, dflt);
}

coerce('node.label');
coerce('node.label', []);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a constant, can it be the attribute dflt?

@@ -83,11 +83,11 @@ module.exports = function supplyDefaults(traceIn, traceOut, defaultColor, layout
};

if(traceOut.node.label.some(missing)) {
Lib.log('Some of the nodes are neither sources nor targets, please remove them.');
Lib.warn('Some of the nodes are neither sources nor targets, please remove them.');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor point, the user may have different ways to solve the problem (eg providing the missing links, instead of removing disconnected nodes). I would have the warning simply describe what we did and why, like "Some of the nodes are neither sources nor targets, they will not be displayed." (if indeed that's what will happen) Your message on encountering cycles does a great job of this.

@alexcjohnson
Copy link
Collaborator

Looks great @monfera - Unless @etpinard has anything else to add, I'm ready to 💃 !

}
};

exports.toSVG = function() {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔪


var nodes = nodeList.map(function() {return [];});

for(var i = 0; i < Math.min(sources.length, targets.length); i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I already commented on this, but can we move the computations that scale as the length of the data arrays in calc.js?

}

function nodeHoveredStyle(sankeyNode, d, sankey) {

Copy link
Contributor

@etpinard etpinard May 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔪 those blank lines at the start of new blocks in accordance with standard style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it!

['Source:', d.link.source.label].join(' '),
['Target:', d.link.target.label].join(' ')
].filter(renderableValuePresent).join('<br>'),
color: Color.addOpacity(d.tinyColorHue, 1),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will have to be generalized for #1582 either in this PR or in #1582 whichever gets merged first.


});

});
Copy link
Contributor

@etpinard etpinard May 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should at the very least add some restyle tests checking that we can toggle visible: true || false correctly and that sankey subplots get clear correctly on deleteTraces.

Some hover labels tests and click/hover event data assertions would be much appreciated too.

Plot.moduleType = 'trace';
Plot.name = 'sankey';
Plot.basePlotModule = require('./base_plot');
Plot.categories = [];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is trace-wide opacity a thing for sankey traces? If not, categories should include the noOpacity flag like we did for parcoords in #1506

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No trace- wide opacity so I'll add it, thanks!

@etpinard
Copy link
Contributor

etpinard commented May 8, 2017

Including this in this week's 1.27.0 release. @monfera let me know if you need help closing out this PR. cc @jackparmer

@etpinard etpinard added this to the 1.27.0 milestone May 8, 2017
});
}

module.exports = function calc(gd, trace) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very nice first step. Thanks @monfera !

.then(function() {
expect(gd.data.length).toEqual(1);
expect(d3.selectAll('.sankey').size()).toEqual(1);
return Plotly.plot(gd, mockCopy2);
Copy link
Contributor

@etpinard etpinard May 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testing addTraces would be better. Sequential Plotly.plot calls is a deprecated pattern.

var gd = createGraphDiv();
var mockCopy = Lib.extendDeep({}, mock);

Plotly.plot(gd, mockCopy)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@etpinard
Copy link
Contributor

etpinard commented May 9, 2017

Nicely done @monfera good enough to 💃

I'll add a few more hover label tests while addressing #1591 (review) after this PR is in master.

@monfera
Copy link
Contributor Author

monfera commented May 9, 2017

Thank you @etpinard and @alexcjohnson for your reviews and @etpinard for your upcoming additions!

@monfera monfera merged commit 132c8ea into master May 9, 2017
@monfera monfera changed the title [WIP] squashed Sankey feature branch Sankey feature branch May 9, 2017
@chriddyp
Copy link
Member

🏆 👏 🎉 Awesome stuff @monfera !!

@monfera monfera deleted the sankey-squashed branch May 10, 2017 09:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature something new
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants