Playfair-Inspired Area Chart

  • Data visualisation
  • D3.js
Classic visualisation by Playfair

Context

For one of my projects at City of Oslo I wanted to visualize the change in the age distribution over time. I figured that there were only two realistic ways of doing so:

  • Animation
  • Displaying data for two years (today vs some date in the past)

I opted for option two because the chart should also work well on static presentations and print-outs.

Discovery

The population pyramid is a decent starting point. This format shows the age distribution for each gender, so it was worth looking into.

Population pyramid
A typical population pyramid

I discovered quickly, though, that one of the main issues with these types of charts is that comparing the left side with the right is very difficult. It’s more the overall shape of the pyramid that’s clear, so I had to start over.

Next I tried plotting the distribution as a line graph with age on the x-axis and the ratio on the y-axis.

I needed to use relative values here because the population has increased significantly over the years. Absolute values would mostly show that the number of people has grown, while I actually wanted to display how the distribution had changed.
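A minimal sketch of that normalisation step, assuming each year's data is an array with one entry per age. The `count` field, the `toRatios` helper and the sample numbers are made up for illustration; only the `ratio` field name matches the code later in this post.

```javascript
// Convert absolute head counts into each age's share of the total population.
// `count` is a hypothetical field name; the numbers below are invented sample data.
function toRatios(counts) {
  const total = counts.reduce((sum, d) => sum + d.count, 0)
  return counts.map((d) => ({
    age: d.age,
    // Guard against an empty dataset so we never divide by zero.
    ratio: total === 0 ? 0 : d.count / total,
  }))
}

const sample2001 = [
  { age: 0, count: 50 },
  { age: 1, count: 30 },
  { age: 2, count: 20 },
]
const ratios2001 = toRatios(sample2001)
```

Because every year is scaled by its own total, two years with very different population sizes end up on the same y-axis scale.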

After plotting both datasets on the graph, I was convinced that I was onto something. Differentiating the lines with colours did a decent job, but I wasn’t convinced it was clear enough …

I remembered William Playfair’s (famous?) chart of trade between England and Denmark–Norway, where he had filled in the positive and negative trade balances with different colours to emphasize a change in export vs import between the two nations over time.

Chart of trade between England and Denmark–Norway
Playfair's chart of trade between England and Denmark–Norway from 1700 to 1780

This was a similar concept to what I needed. The main difference was that the x-axis on my chart wasn’t time but age, with time as the third variable, indicated by the two lines. I wanted to try it out. There was only one issue: I had no idea how to actually implement this in d3.js.

Challenge

Having forgotten much of my algebra from school, this seemed pretty difficult at first. The problem is that the lines cross at several places and finding that exact location where two lines intersect seemed like a pretty daunting task.

Initially I was thinking the solution was somewhere along the lines of:

  • Find the angle and location of every line segment between two ages for both years
  • Somehow (well, using simple algebra, I think …) compare these segments to see if they cross
  • Find the exact location where those segments intersect
  • Decide which category (increase/decrease) this area belongs to
  • Draw an area between the lines, starting at one point of intersection and ending at the next
  • Repeat

Surely this was way too complex for something seemingly so simple, but I couldn’t really think of any other way.
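Purely for illustration, the "find where the lines cross" part of that list could look roughly like the sketch below. It assumes both series are sampled at the same integer ages, so each pair of segments shares its x-range and a crossing shows up as a sign change in the difference between the two lines. `findCrossings` is a hypothetical helper, not something the finished chart uses.

```javascript
// Find the points where two series, sampled at the same x positions
// (index = age), cross each other. Values are plain numbers per age.
function findCrossings(seriesA, seriesB) {
  const crossings = []
  for (let i = 0; i < seriesA.length - 1; i++) {
    const d0 = seriesA[i] - seriesB[i] // gap at the left end of the segment
    const d1 = seriesA[i + 1] - seriesB[i + 1] // gap at the right end
    // A strict sign change means the segments cross somewhere in between.
    // Points where the lines merely touch (gap of exactly 0) are ignored here.
    if (d0 * d1 < 0) {
      const t = d0 / (d0 - d1) // fraction of the way along the segment
      crossings.push({
        x: i + t,
        y: seriesA[i] + t * (seriesA[i + 1] - seriesA[i]),
      })
    }
  }
  return crossings
}

const crossings = findCrossings([1, 3, 1], [2, 2, 2])
```

Even with the intersections in hand, the remaining steps (classifying and drawing each area) would still need to be written.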

But after working on this problem for a couple of hours I suddenly realized that I might be able to solve it using SVG masks! The <mask> element is exactly what it sounds like. It’s an invisible object that limits the reach of any other element within the same SVG.

Creating the chart

The idea is quite simple really. What I needed to do was to draw two masks, one for the red polygon and one for the blue one, and then connect each polygon to its corresponding mask using the clip-path attribute.

Let’s start by converting the lines to areas.

So under each line, there’s a separate path element with fill-opacity="0.2". That makes two DOM elements for each series: one to show the line and one for the area below it.

The next step here is to remove all the “red stuff” below the blue line and all the “blue stuff” below the red line.

In order to do that we’ll add two more paths that will function as masks. In the illustration below, you can see (in black) where they should go.

In our d3.js script we’ll wrap these “mask paths” in <clipPath> elements with identifying ids, all placed inside a <defs> element.

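// Clip paths live inside <defs>, so they are never rendered themselves.
this.defs = this.svg.append('defs')

this.clips = this.defs
  .selectAll('clipPath')
  .data(data.map((d) => d.values))
  .join('clipPath')
  .attr('id', (d, i) => `clip_${i}`)
  .append('path')
  .attr('class', 'clip')
  .attr('d', (d) => {
    // Drop the leading "M" so the line can be continued from a new start point.
    let path = this.line(d).substring(1)
    // y = 0 is the top edge of the SVG, so the closed shape covers everything above the line.
    const yPos = 0

    path = `M0,${yPos} L${path} L${width},${yPos} Z`
    return path
  })
  .attr('fill', 'black')

// Clip each area with the other series' path: area 0 uses clip_1 and area 1 uses clip_0.
this.areas.attr('clip-path', (d, i) => `url(#clip_${i === 0 ? 1 : 0})`)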
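To make the string manipulation in the `d` callback easier to follow, here is a d3-free sketch of the same idea. It assumes the line is available as an array of `[x, y]` pixel coordinates; `clipAboveLine` and `otherIndex` are hypothetical helper names, and the coordinates are invented.

```javascript
// Build a closed SVG path covering everything ABOVE a line.
// SVG's y-axis grows downward, so y = 0 is the top edge of the drawing.
function clipAboveLine(points, width, yPos = 0) {
  // Same shape as a line generator's output with the leading "M" removed.
  const linePart = points.map(([x, y]) => `${x},${y}`).join('L')
  // Start in the top-left corner, follow the line, go up to the top-right corner, close.
  return `M0,${yPos} L${linePart} L${width},${yPos} Z`
}

// With two series, each area is clipped by the OTHER series' path.
const otherIndex = (i) => (i === 0 ? 1 : 0)

const clipD = clipAboveLine(
  [
    [0, 80],
    [50, 40],
    [100, 60],
  ],
  100
)
```

An area that sits below its own line and is clipped to the region above the other line is only visible where its own line is the higher of the two, which is what produces the separate red and blue patches.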

We then end up with something like this:

The blue areas now represent age ranges where the population ratio has increased since 2001, and the red areas show where the ratio has decreased.

However, this might not be obvious immediately to the reader, so we can help out by adding some annotations.

The strategy here is to identify at which point along the x-axis the difference is the biggest - both largest increase and largest decrease - as those areas are most likely to visually stand out:

// Change in ratio per age between the two years (the array index doubles as the age).
const diffs = data[1].values.map((d, i) => {
  return {
    age: i,
    diff: d.ratio - data[0].values[i].ratio,
  }
})

// d3.scan returns the index of the least element according to the comparator.
const largestDecrease = d3.scan(diffs, (a, b) => a.diff - b.diff)
const largestIncrease = d3.scan(diffs, (a, b) => b.diff - a.diff)

// Place each annotation vertically halfway between the two lines at that age.
const decreasePos = {
  x: this.x(largestDecrease),
  y: this.y(
    (data[1].values[largestDecrease].ratio - data[0].values[largestDecrease].ratio) / 2 +
      data[0].values[largestDecrease].ratio
  ),
}

const increasePos = {
  x: this.x(largestIncrease),
  y: this.y(
    (data[0].values[largestIncrease].ratio - data[1].values[largestIncrease].ratio) / 2 +
      data[1].values[largestIncrease].ratio
  ),
}
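The same lookup can be sketched without d3, which may help when checking the logic in isolation. The version below assumes two plain arrays of ratios indexed by age; `extremeDiffs` and the sample values are hypothetical. Note that both midpoint expressions in the snippet above reduce to the average of the two ratios at that age.

```javascript
// Find the ages with the largest decrease and largest increase in ratio,
// plus the midpoint between the two lines at each of those ages.
function extremeDiffs(before, after) {
  const diffs = after.map((ratio, i) => ratio - before[i])
  let largestDecrease = 0
  let largestIncrease = 0
  diffs.forEach((diff, i) => {
    // Strict comparisons: on ties the first age encountered wins.
    if (diff < diffs[largestDecrease]) largestDecrease = i
    if (diff > diffs[largestIncrease]) largestIncrease = i
  })
  // (a - b) / 2 + b is the same as (a + b) / 2.
  const midpoint = (i) => (before[i] + after[i]) / 2
  return {
    decrease: { age: largestDecrease, ratio: midpoint(largestDecrease) },
    increase: { age: largestIncrease, ratio: midpoint(largestIncrease) },
  }
}

// Invented sample ratios, not real population data.
const extremes = extremeDiffs([0.25, 0.5, 0.25], [0.5, 0.25, 0.25])
```

The returned `age` and `ratio` values would still need to go through the chart's x and y scales to become pixel positions for the annotations.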

Voilà!