Better clustering and colours for Excel heat maps

Excel is great for creating quick charts, but often the defaults are not ideal.

For example, Conditional Formatting > Color Scales can produce a heatmap very easily, and for some datasets, the default colour scheme can work well.

However, for datasets which are skewed, then the colours assigned to each value by Excel are too subtle to be useful for visually identifying their value. For example, for the dataset 1, 2, 4, 5, 12, 43, 52, 123, 234 and 1244, Excel would produce these heat maps:

Fortunately, there a good algorithms for generating more useful groupings - I recommend this article, from which this example dataset comes from, and which proposes an elegant algorithm for producing groups:

Using clustering to create a new D3.js color scale

The proposed algorithm (d3-scale-cluster, which is great JavaScript library written by David Schnurr) produces nice groupings, but applying these into Excel requires a bit of manual legwork.

I've created a little interface to the algorithm, where you can copy your data into the webpage, select the number of groups and start/end colours, and it will generate the groups:

The interface will tell you which groups each value should be in. So back in Excel, you can:

  • create a Conditional Formatting rule (Conditional Formatting > New Rule... > Format only cells that contain)
  •  enter the lower and upper value from each group
  •  set the Fill Format of the cell to be colour shown by the RGB value.

The interface generates colours between the First and Last colour automatically, using D3.JS's interpolateHCL function.

Of course, you can apply your own colours, or preferably, choose a scheme from ColorBrewer, whose colours schemes have been carefully designed to ensure that they work well for differentiating between clusters and are better for people with different types of colour blindness.

Here's the same dataset as before, but this time formatted using the colour scheme generated by my interface for five groups, and the same groupings formatted using the ColorBrewer Green-Blue sequential scheme:

Even though there is bit more effort involved, the end result is much better for people reading your spreadsheet. We can see more clearly the differences as we move to the 40s, 50s, etc, and not just a dark colour for 1244.

All the best, and don't accept the defaults!
Tom