Sankey Diagrams

Andrea Ferretti edited this page Jan 12, 2016 · 6 revisions

Sankey diagrams are a specific type of flow diagram in which the width of each arrow is proportional to the flow quantity. They are classically used to visualize energy accounts or material flows at a regional or national level, but they can represent any kind of quantitative flow from a source (on the left) to a target (on the right). They are also closely related to alluvial diagrams. The Sankey graph can be used as follows:

var Sankey = require('paths/sankey');
var sankey = Sankey({
  data: {
    nodes:[
      [{id:"pippo"},{id:"pluto"},{id:"paperino"}],
      [{id:"qui"},{id:"quo"},{id:"qua"}],
      [{id:"nonna papera"},{id:"ciccio"}]
    ],
    links:[
      {start:"pippo", end:"quo", weight:10},
      {start:"pippo", end:"qua", weight:30},
      {start:"pluto", end:"nonna papera", weight:10},
      {start:"pluto", end:"qui", weight:10},
      {start:"pluto", end:"quo", weight:10},
      {start:"paperino", end:"ciccio", weight:100},
      {start:"qui", end:"ciccio", weight: 20},
      {start:"quo", end:"ciccio", weight: 10},
      {start:"qua", end:"nonna papera", weight: 30}
    ]
  },
  compute: {
    // somePalette is a placeholder for any array of colors defined elsewhere
    color: function(i) { return somePalette[i]; }
  },
  nodeaccessor: function (x) { return x.id; },
  width: 500,
  height: 400,
  gutter: 10,
  rectWidth: 10
});

Parameters:

  • width, height: the width and height of the diagram
  • data: contains an object with nodes and links. The precise form of the data is not important, because the actual value of the data will be extracted by the nodeaccessor and linkaccessor functions.
  • nodeaccessor (optional, default identity): a function that is applied to each datum inside each item in data.nodes to extract its id.
  • linkaccessor (optional, default identity): a function that is applied to each item in data.links to extract an object with start, end and weight properties.
  • gutter (optional, default 10): the space to leave between each bar
  • rectWidth (optional, default 10): the width of each bar
  • compute (optional): see the introduction. Each function here has three parameters: index, item and group, where the third one represents the outer index in the nodes array.
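
As an illustration of the three-parameter signature, here is a minimal sketch of a compute object. The palette and its colors are hypothetical, and the fallback to index when group is undefined is an assumption about how the function is called for links, which have no level of their own.

```javascript
// Hypothetical palette; any array of CSS colors would do.
var somePalette = ['#1f77b4', '#ff7f0e', '#2ca02c'];

var compute = {
  // Color a node by the level it belongs to (group). When group is
  // not provided (assumed to be the case for links), use the index.
  color: function (index, item, group) {
    var key = (group === undefined) ? index : group;
    return somePalette[key % somePalette.length];
  }
};
```

With this object passed as compute, every node in the same level would receive the same color.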

NOTE: rectWidth used to be rect_width before 0.4.

nodes is a list of lists of objects. Each list represents a level of the diagram; each element in a list is an object from which the nodeaccessor function extracts an id. Ids must be unique across all levels, otherwise links may be attached to the wrong node.

links is a list of objects. Applying the linkaccessor function to each of them should yield an object with start and end properties (ids of nodes) and a weight, which represents how much flow goes from start to end. The start node should be in a level to the left of the end node; respecting this rule also guarantees that the links contain no cycles. The start and end nodes do not need to be in consecutive levels.
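
The constraints on nodes and links can be checked before building the diagram. The helper below is a hypothetical sketch and not part of Paths.js: validateSankeyData and its error messages are names introduced here for illustration only.

```javascript
// Hypothetical helper (not part of Paths.js): returns a list of
// messages describing violations of the constraints on nodes and links.
function validateSankeyData(data, nodeaccessor, linkaccessor) {
  var identity = function (x) { return x; };
  nodeaccessor = nodeaccessor || identity;
  linkaccessor = linkaccessor || identity;
  var has = function (obj, key) {
    return Object.prototype.hasOwnProperty.call(obj, key);
  };
  var level = {};  // node id -> index of the level it belongs to
  var errors = [];

  data.nodes.forEach(function (group, g) {
    group.forEach(function (node) {
      var id = nodeaccessor(node);
      if (has(level, id)) { errors.push('duplicate node id: ' + id); }
      level[id] = g;
    });
  });

  data.links.forEach(function (raw) {
    var link = linkaccessor(raw);
    var label = link.start + ' -> ' + link.end;
    if (!has(level, link.start) || !has(level, link.end)) {
      errors.push('unknown node in link: ' + label);
    } else if (level[link.start] >= level[link.end]) {
      // start must sit strictly to the left of end
      errors.push('link does not go left to right: ' + label);
    }
  });

  return errors;
}
```

An empty result means the data respects the rules above; for the example data, validateSankeyData(data, function (x) { return x.id; }) is expected to return an empty array.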

The object returned by the Sankey function contains the curvedRectangles array, on which one can iterate to draw the flows, and the rectangles array, on which one can iterate to draw the rectangles.

Each member of one of these arrays has the properties curve, index, item, the latter containing the actual datum associated to the node/link. Elements of the rectangles array also have the group property, containing the index of the group to which the item belongs. You can add more properties by passing them within the compute object.
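
As a rough sketch of how the two arrays could be turned into a drawing, the function below builds an SVG string without any templating library. It assumes that each curve exposes a path object whose print() method returns the SVG path string, as described elsewhere in the Paths.js documentation; sankeyToSvg and the fallback colors are hypothetical names and values chosen for this example.

```javascript
// Sketch: `sankey` is the object returned by Sankey(...).
// Assumption: every `curve` has a `path.print()` method that
// returns the string for the `d` attribute of an SVG <path>.
function sankeyToSvg(sankey, width, height) {
  // Flows are drawn first so that the bars end up on top of them
  var flows = sankey.curvedRectangles.map(function (r) {
    return '<path d="' + r.curve.path.print() + '" fill="#acd1e9" opacity="0.6"/>';
  });
  var bars = sankey.rectangles.map(function (r) {
    // `color` is only present if it was supplied through `compute`
    var fill = r.color || '#555599';
    return '<path d="' + r.curve.path.print() + '" fill="' + fill + '"/>';
  });
  return '<svg xmlns="http://www.w3.org/2000/svg" width="' + width +
    '" height="' + height + '">' + flows.concat(bars).join('') + '</svg>';
}
```

In a templating or virtual DOM library the same iteration applies, with one path element per member of each array.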
