
How-To: The Stacked Bar Alternative

· 18 min read
Daniel Marsh-Patrick
Program Manager & Developer

I've become smitten with this stacked bar chart alternative, as recently showcased on the Flerlage Twins blog. This article is a brilliant insight into a collaborative process between Sebastine Amede, Darragh Murray and Kevin, with a further spotlight on folks who have inspired their final design. It's a really brilliant way of keeping the part-to-whole idea of a stacked bar chart, while also making the individual segments much easier to compare.

As always, when I see cool things (and because I enjoy using the tooling as well as developing it), I wonder how easy they might be to replicate in Power BI using Deneb. I thought it would be a fun exercise to try and replicate the example from the blog post using Vega-Lite. Fortunately, Vega-Lite has multiple encoding channels along each direction for Cartesian plots, which makes it a fairly straightforward process.

Finished Recipe

This is how our final design will look:

[Image: finished recipe]

Although Deneb is a Power BI-specific tool, I'll produce the example using Vega-Lite natively, so we can visualize our progress on this post using embedded Vega-Lite visuals that you can open in the Vega Editor to further explore and/or remix. There may be some steps to make modifications if you have a preference for using Deneb-specific features like integrating your Power BI theme or Power BI-specific number formatting, but they should be fairly straightforward.

For those who just want to get started with the finished product, there are assets available at the bottom of this article.

Our Data

I'm going to start with the simple financial data that you can download from MS Learn. For this example, I'm adding the Country and Segment columns, and the $ Sales measure, so we have a lower-level granularity than we might normally have for a simple bar chart:

| Country | Segment | Sales |
| --- | --- | --- |
| Canada | Channel Partners | 19752468 |
| Canada | Enterprise | 144752736 |
| Canada | Government | 428829833 |
| Canada | Midmarket | 21616940 |
| Canada | Small Business | 361203792 |
| France | Channel Partners | 15178124 |
| France | Enterprise | 150182723 |
| France | Government | 432792397 |
| France | Midmarket | 25395919 |
| France | Small Business | 274034856 |
| Germany | Channel Partners | 9807620 |
| Germany | Enterprise | 146362549 |
| Germany | Government | 307773485 |
| Germany | Midmarket | 12452545 |
| Germany | Small Business | 281729079 |
| Mexico | Channel Partners | 8887825 |
| Mexico | Enterprise | 130960356 |
| Mexico | Government | 413892255 |
| Mexico | Midmarket | 16428775 |
| Mexico | Small Business | 308830851 |
| United States of America | Channel Partners | 12059458 |
| United States of America | Enterprise | 166894624 |
| United States of America | Government | 362146299 |
| United States of America | Midmarket | 21890524 |
| United States of America | Small Business | 463176030 |

This produces the dataset for our Deneb visual and I'll add this to the Values data role, as I would for any project.

Thinking About Our Design

While the finished recipe contains text marks to further annotate our chart, the solution itself can be approached at a high level as follows:

  1. Adding our highest-level grouping field (Country) to the x encoding channel and aggregating our measure (Sales) using sum.
  2. Adding our lower-level grouping field (Segment) to the xOffset encoding channel.

Everything else will work within these constraints.
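
To make those two points concrete before we build the layers, here's a minimal sketch of x plus xOffset on a single bar mark. It isn't part of the recipe itself: the inline rows are a small subset copied from the table above so that the spec can run on its own in the Vega Editor, and the color channel is only there to tell the segments apart.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: inline rows are a subset of the table above, used so the spec runs standalone.",
  "data": {
    "values": [
      { "Country": "Canada", "Segment": "Government", "Sales": 428829833 },
      { "Country": "Canada", "Segment": "Enterprise", "Sales": 144752736 },
      { "Country": "France", "Segment": "Government", "Sales": 432792397 },
      { "Country": "France", "Segment": "Enterprise", "Sales": 150182723 }
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": { "field": "Country", "type": "nominal" },
    "xOffset": { "field": "Segment", "type": "nominal" },
    "y": { "field": "Sales", "type": "quantitative" },
    "color": { "field": "Segment", "type": "nominal" }
  }
}
```

On its own this is just a grouped bar chart. The rest of the walkthrough layers it over a summarized bar that shares the same x channel.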

Step 1: Highest Level Grouping

Here's how we produce a summarized bar chart for our highest-level grouping (Country):

{
  "data": { "name": "dataset" },
  "layer": [
    {
      "encoding": { "y": { "field": "Sales", "aggregate": "sum" } },
      "mark": { "type": "bar", "color": "#d4d4d4" }
    }
  ],
  "encoding": {
    "x": {
      "field": "Country",
      "sort": { "field": "Sales", "op": "sum", "order": "descending" }
    },
    "y": { "type": "quantitative" }
  }
}
  • Note that I'm already planning for the approach above by using a layer, rather than a mark, at the top level.
  • As our y encoding channel is quantitative, I'm declaring that type at the top level, and adding a y encoding channel with an aggregate of Sales in the layer (due to granularity). The layer will inherit the top-level type, so we won't need to repeat it, due to how Vega-Lite will union the y-scale.
  • The x encoding channel will be shared by all layers, so this is at the top level. I'm also sorting this in descending order of Sales (again aggregated).
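
For comparison, here's a sketch of what I understand the single layer above resolves to once the top-level encoding is merged into it: a plain single-view spec with the field, aggregate and type for y all written in one place. The explicit "type" on x and the inline rows (a subset of the table above, so the totals are partial) are my additions for a standalone example.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: single-view equivalent of the Step 1 layer, with a subset of rows inlined.",
  "data": {
    "values": [
      { "Country": "Canada", "Segment": "Government", "Sales": 428829833 },
      { "Country": "Canada", "Segment": "Enterprise", "Sales": 144752736 },
      { "Country": "France", "Segment": "Government", "Sales": 432792397 },
      { "Country": "France", "Segment": "Enterprise", "Sales": 150182723 }
    ]
  },
  "mark": { "type": "bar", "color": "#d4d4d4" },
  "encoding": {
    "x": {
      "field": "Country",
      "type": "nominal",
      "sort": { "field": "Sales", "op": "sum", "order": "descending" }
    },
    "y": { "field": "Sales", "aggregate": "sum", "type": "quantitative" }
  }
}
```

The layered form is still the one to keep for this recipe, as the next step needs a second layer that shares x but uses an unaggregated y.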

Our design looks pretty ordinary right now, but that's what we'd expect at this stage:

Step 2: Lowest Level Grouping

With our primary channels (x and y) worked out, we can add our second layer, which will perform the offset within each bar, based on the Segment field:

{
  "data": { "name": "dataset" },
  "layer": [
    ... // Our first layer
    {
      "encoding": {
        "y": { "field": "Sales" },
        "xOffset": {
          "field": "Segment",
          "sort": { "field": "Sales", "order": "descending" },
          "scale": { "padding": 0.2 }
        },
        "color": {
          "field": "Segment",
          "sort": { "field": "Sales", "order": "descending" }
        }
      },
      "mark": { "type": "bar", "stroke": "white" }
    }
  ],
  "encoding": {
    ... // Our original encoding
  }
}
  • This layer has the y encoding channel set to the Sales field.
  • Because we're at our lowest level of granularity, we don't need to aggregate this.
  • The xOffset channel is set to the Segment field, and we're sorting this in descending order of Sales. This will produce individual bars within each bigger bar, sharing the top-level x encoding channel for their corresponding Country. There's a small amount of padding here, to help space them out a bit better.
  • The color channel is set to the Segment field, which will also give us our legend. This is sorted in descending order of Sales, to match the ordering of the xOffset channel, so that each legend item matches the order of the segments in the chart.
  • Each bar has a white stroke, just to help with contrast against the bar behind it.
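
One detail worth knowing: the xOffset and color sorts above don't name an aggregate op, whereas the x sort in Step 1 does. The Vega-Lite documentation gives the default op as "sum" for stacked plots and "min" otherwise. If you'd rather the segment order explicitly follow total Sales, one option is to spell the op out. The sketch below is the full two-layer spec with that change; the explicit "op" is my assumption rather than part of the original recipe, and the inline rows are a subset of the table above.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: the explicit sort op is an assumption; inline rows are a subset of the table above.",
  "data": {
    "values": [
      { "Country": "Canada", "Segment": "Government", "Sales": 428829833 },
      { "Country": "Canada", "Segment": "Small Business", "Sales": 361203792 },
      { "Country": "Canada", "Segment": "Enterprise", "Sales": 144752736 },
      { "Country": "France", "Segment": "Government", "Sales": 432792397 },
      { "Country": "France", "Segment": "Small Business", "Sales": 274034856 },
      { "Country": "France", "Segment": "Enterprise", "Sales": 150182723 }
    ]
  },
  "layer": [
    {
      "encoding": { "y": { "field": "Sales", "aggregate": "sum" } },
      "mark": { "type": "bar", "color": "#d4d4d4" }
    },
    {
      "encoding": {
        "y": { "field": "Sales" },
        "xOffset": {
          "field": "Segment",
          "sort": { "field": "Sales", "op": "sum", "order": "descending" },
          "scale": { "padding": 0.2 }
        },
        "color": {
          "field": "Segment",
          "sort": { "field": "Sales", "op": "sum", "order": "descending" }
        }
      },
      "mark": { "type": "bar", "stroke": "white" }
    }
  ],
  "encoding": {
    "x": {
      "field": "Country",
      "sort": { "field": "Sales", "op": "sum", "order": "descending" }
    },
    "y": { "type": "quantitative" }
  }
}
```

For the full dataset in this article, the segments rank the same way by minimum and by total Sales (Government, Small Business, Enterprise, Midmarket, Channel Partners), so the explicit op should only matter for data where those two rankings differ.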

At this point, we should be able to see that the main challenge has been solved and we have our "bars within bars" design:

Step 3: Cleaning Up the Plot Area

For a more clutter-free design, we will need to remove the redundant elements from our chart. We will add labels in step 4, but this will help us to understand how much space we have to work with.

{
  "title": { "text": "Country Revenue by Segment ($M)", "anchor": "start" },
  "view": { "stroke": "transparent" },
  "data": { "name": "dataset" },
  "layer": [
    {
      "encoding": { "y": { "field": "Sales", "aggregate": "sum" } },
      "mark": { "type": "bar", "color": "#d4d4d4" }
    },
    {
      "encoding": {
        "y": { "field": "Sales" },
        "xOffset": {
          "field": "Segment",
          "sort": { "field": "Sales", "order": "descending" },
          "scale": { "padding": 0.2 }
        },
        "color": {
          "field": "Segment",
          "sort": { "field": "Sales", "order": "descending" },
          "legend": { "orient": "top", "title": null }
        }
      },
      "mark": { "type": "bar", "stroke": "white" }
    }
  ],
  "encoding": {
    "x": {
      "field": "Country",
      "sort": { "field": "Sales", "op": "sum", "order": "descending" },
      "axis": null
    },
    "y": { "type": "quantitative", "axis": null }
  }
}

The changes from the previous step can be summarized as:

  • Adding a title and anchoring it to the left (start).
  • Setting the view to have a transparent stroke, so we don't have a border around the plot area.
  • Setting the legend for the Segment field to be at the top, and removing the title (as it is redundant).
  • Removing both axes (setting them to null), as they are not needed.
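
To see those four changes in isolation, here's a cut-down sketch that applies them to a simple single-mark chart. The chart is not the recipe itself: the inline rows are a subset of the table above, and the single bar mark is only there to give the title, view, legend and axis settings something to act on.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: isolates the Step 3 clean-up properties on a simple chart with a subset of rows.",
  "title": { "text": "Country Revenue by Segment ($M)", "anchor": "start" },
  "view": { "stroke": "transparent" },
  "data": {
    "values": [
      { "Country": "Canada", "Segment": "Government", "Sales": 428829833 },
      { "Country": "Canada", "Segment": "Enterprise", "Sales": 144752736 },
      { "Country": "France", "Segment": "Government", "Sales": 432792397 },
      { "Country": "France", "Segment": "Enterprise", "Sales": 150182723 }
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": { "field": "Country", "type": "nominal", "axis": null },
    "xOffset": { "field": "Segment", "type": "nominal" },
    "y": { "field": "Sales", "type": "quantitative", "axis": null },
    "color": {
      "field": "Segment",
      "type": "nominal",
      "legend": { "orient": "top", "title": null }
    }
  }
}
```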

Here is our refined design:

Step 4: Labels

For the last part of this walkthrough, I'm going to add labels for each bar using text marks. As our two-layer approach works well, we can use descendant layers so that these marks inherit what they need from the existing layout:

{
  "title": { "text": "Country Revenue by Segment ($M)", "anchor": "start" },
  "view": { "stroke": "transparent" },
  "data": { "name": "dataset" },
  "transform": [{ "calculate": "datum.Sales / 1000000", "as": "Sales_M" }],
  "layer": [
    {
      "encoding": { "y": { "field": "Sales", "aggregate": "sum" } },
      "layer": [
        { "mark": { "type": "bar", "color": "#d4d4d4" } },
        {
          "mark": { "type": "text", "baseline": "bottom", "dy": -15 },
          "encoding": { "text": { "field": "Country" } }
        },
        {
          "mark": { "type": "text", "baseline": "top", "dy": -15 },
          "encoding": {
            "text": { "field": "Sales_M", "aggregate": "sum", "format": ",d" }
          }
        }
      ]
    },
    {
      "encoding": {
        "y": { "field": "Sales" },
        "xOffset": {
          "field": "Segment",
          "sort": { "field": "Sales", "order": "descending" },
          "scale": { "padding": 0.2 }
        }
      },
      "layer": [
        {
          "mark": { "type": "bar", "stroke": "white" },
          "encoding": {
            "color": {
              "field": "Segment",
              "sort": { "field": "Sales", "order": "descending" },
              "legend": { "orient": "top", "title": null }
            }
          }
        },
        {
          "mark": { "type": "text", "dy": -7.5 },
          "encoding": { "text": { "field": "Sales_M", "format": ",d" } }
        }
      ]
    }
  ],
  "encoding": {
    "x": {
      "field": "Country",
      "sort": { "field": "Sales", "op": "sum", "order": "descending" },
      "axis": null
    },
    "y": { "type": "quantitative", "axis": null }
  }
}

An overview of what has changed since the previous step is as follows:

  • Added a transform to calculate the Sales measure in millions, and store it in a new field called Sales_M. This is used for the labels. We could do this using Power BI formatting against the Sales measure, but I'm trying to keep things replicable in Vega-Lite outside Power BI for this example.

  • Created a sub-layer in the first layer, for three marks:

    • Our original bar mark, which is the grey bar for the Country field.
    • A text mark for the Country field, offset by dy by -15 pixels (upwards).
    • A text mark for the (aggregated) Sales_M field, offset by dy by -15 pixels (upwards).

    Both text marks have opposing baseline properties, so while they share the same y position and offset, they will be positioned above and below their resolved y-position accordingly.

  • The second layer has a similar approach to the first, where we create a sub-layer for two marks:

    • The bar mark, which is the colored bar for the Segment field. We have also moved the color encoding for the layer into this sub-layer, so that the text marks do not inherit it.
    • A text mark for the Sales_M field, offset by dy by -7.5 pixels (upwards).
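
If the opposing baselines are hard to picture, here's a small standalone sketch of just that labelling idea: one grey bar per row, with two text marks that share the same y position and dy but use "bottom" and "top" baselines. The inline rows are two Government rows copied from the table above (so these are segment values rather than country totals), and the explicit "type" properties are my additions.

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "description": "Sketch only: two rows from the table above, used to show opposing text baselines and the Sales_M label.",
  "data": {
    "values": [
      { "Country": "Canada", "Segment": "Government", "Sales": 428829833 },
      { "Country": "France", "Segment": "Government", "Sales": 432792397 }
    ]
  },
  "transform": [{ "calculate": "datum.Sales / 1000000", "as": "Sales_M" }],
  "encoding": {
    "x": { "field": "Country", "type": "nominal", "axis": null },
    "y": { "field": "Sales", "type": "quantitative", "axis": null }
  },
  "layer": [
    { "mark": { "type": "bar", "color": "#d4d4d4" } },
    {
      "mark": { "type": "text", "baseline": "bottom", "dy": -15 },
      "encoding": { "text": { "field": "Country", "type": "nominal" } }
    },
    {
      "mark": { "type": "text", "baseline": "top", "dy": -15 },
      "encoding": {
        "text": { "field": "Sales_M", "type": "quantitative", "format": ",d" }
      }
    }
  ]
}
```

The ",d" format rounds each label to whole millions, so 428.829833 should display as 429.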

And now we have a visual that looks like what we intended:

Wrapping Up

I'm really grateful to Sebastine, Darragh and Kevin for taking the time to write up such a wonderful study of this approach and for sharing their techniques with the data visualization community. I hope that this implementation does it some justice and helps folks to implement this design in other tools. Hopefully, anyone reading has learned a little bit more about what Vega-Lite can do (and how Deneb can help you to achieve similar results in Power BI).

As promised, here are some links to the finished recipe:

Thanks as always for reading!

DM-P