This guide shows how to create a Sankey Diagram in Excel. Use the ChartEngine add-in to create stunning data visualizations in seconds.
These diagrams offer insights into energy, material, or information transfers, which are important for enhancing efficiency, understanding system dynamics, and making informed decisions.
Table of contents:
- What is a Sankey Diagram?
- How to create a Sankey Diagram
- How Sankey Diagrams Work
- How to use the Sankey Diagram?
- Example
- How to make a Sankey diagram in Excel manually
- Conclusion
What is a Sankey Diagram?
The Sankey Diagram converts information into accessible, visual insights, highlighting areas that might require attention or intervention. These visualizations are more engaging and easier to understand than raw numbers or text-heavy reports, and you can recognize patterns in a few seconds. The diagram helps us drill down into a complex data set and gives a detailed overview of how the data flows and changes between stages.
How to Create a Sankey Diagram in Excel
Here are the steps to create a Sankey Diagram in Excel:
- Open Microsoft Excel
- Click the Developer Tab, then click Excel Add-ins.
- Install the ChartEngine add-in for Excel.
- Select data, then click the Sankey icon on the ribbon.
- Choose your color theme.
- Click OK.
- The diagram is ready.
Excel does not have native support for Sankey diagrams.
Creating this type of visualization in Excel is a complex task without using third-party chart tools. The ChartEngine add-in provides a user-friendly interface for creating these diagrams with zero coding.
Here is a sample data set:
After clicking the icon, you can choose various color styles.
The result looks great.
Okay, the question is: is it possible to create the diagram without an add-in?
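In short, not with a built-in chart type. Even if you are an Excel ninja, Excel has some limitations. Developing complex visualizations in Excel requires high-level programming skills in VBA or .NET, or a manual workaround that combines several standard charts, as described later in this guide.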
How Sankey Diagrams Work
Take a closer look at how the visualization works. Imagine you’re trying to explain a bunch of complicated info, but you want to keep it chill and easy to understand.
- Choose Your Story: First, you decide what story you’re telling. It could be about how your marketing budget is spent or how people navigate a website. The “story” is all the movements you’re tracking.
- Identify the Nodes: Now, determine where the main activity starts, changes, or finishes.
- Map the Journey using flows: Next, you draw the lines or “flows” connecting your nodes. These show the path from one node to another.
- Add the Details (Width and Colors): You use different widths for the flows to show more or less energy, cost, or money moving. Then, add colors to sort things out or make points clear.
- Make Smart Decisions: Now, use your diagram to make decisions. Do you need to study more? Change some variables and test a new scenario. It is fast and easy.
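The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular tool draws the chart: the flow values, the node names, and the `max_width_px` parameter are made up for the example.

```python
from collections import defaultdict

# Hypothetical flows: (source node, target node, value).
flows = [
    ("Budget", "Marketing", 50),
    ("Budget", "Sales", 30),
    ("Budget", "IT", 20),
]

def node_totals(flows):
    """Sum the value leaving each source node and arriving at each target node."""
    outgoing = defaultdict(float)
    incoming = defaultdict(float)
    for source, target, value in flows:
        outgoing[source] += value
        incoming[target] += value
    return dict(outgoing), dict(incoming)

def flow_widths(flows, max_width_px=200):
    """Scale every flow so that its width is proportional to its value."""
    largest = max(value for _, _, value in flows)
    return [(s, t, value / largest * max_width_px) for s, t, value in flows]
```

The node totals give the size of each block, and the scaled widths show at a glance which flow carries the most.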
Why is the Sankey Diagram important?
- Get the Big Picture Quickly: Sankey diagrams display everything in one easy-to-understand visualization. It is like getting a bird’s-eye view of a busy city. You can see where things start, where they end up, and what routes they take to get there.
- Spot the important details: Clarify where most resources (energy, money, or web traffic) go. This helps you focus on what’s important and what you need to change.
- Understand complex activities: Some things are very complex and have many parts, but breaking them down makes them much easier to understand.
- Communicating Clearly: They clarify your point without much back-and-forth. They’re like a universal language for tough ideas.
How to use the Sankey Diagram?
The goal is simple: “Follow the flow.” Using the Sankey Diagram, you can get a quick overview of significant changes in your data across multiple stages. In real life, we work with dynamic data. Accordingly, tracking changes and visualizing the flow between two or more stages is essential. The main point of the analysis is to understand how and why the data changes.
The Sankey diagram can be used for various purposes. For example, it makes it easy to analyze costs, website traffic, or issues in a software development project. It is a Swiss Army knife.
Sankey Diagram Example
The first step is to set up your data.
To create a Sankey diagram, use the following structure:
The selected range should contain a minimum of three rows, and the last column should contain positive numbers. In the example, three columns contain categorical data (Total Spend, Department, and Category), and one column contains numerical data.
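To make the layout rules concrete, here is a minimal Python sketch that checks them and turns the rows into source-to-target links. It shows one plausible way to read such a table, not ChartEngine's actual logic; the function name `rows_to_links` and the sample rows are assumptions made for the example.

```python
def rows_to_links(rows):
    """Turn rows like (stage 1, stage 2, ..., value) into (source, target, value) links."""
    if len(rows) < 3:
        raise ValueError("the range needs at least three rows")
    totals = {}
    for *stages, value in rows:
        if not isinstance(value, (int, float)) or value <= 0:
            raise ValueError(f"last column must be a positive number, got {value!r}")
        # Each pair of neighbouring stages becomes one link.
        for source, target in zip(stages, stages[1:]):
            totals[(source, target)] = totals.get((source, target), 0) + value
    return [(source, target, value) for (source, target), value in totals.items()]

# Hypothetical rows: Total Spend, Department, Category, Amount.
rows = [
    ("Total Spend", "Marketing", "Salaries", 100),
    ("Total Spend", "Marketing", "Events", 20),
    ("Total Spend", "Sales", "Salaries", 50),
]
links = rows_to_links(rows)
```

Rows that share the same pair of neighbouring stages are added together, so the link from Total Spend to Marketing carries the sum of both Marketing rows.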
Insert the diagram
Select the data and click the Sankey icon to insert a new diagram. After clicking the icon, the add-in generates the flow diagram in real time and shows the connections between stages. The diagram transforms the cash-flow data into a stunning visualization.
The diagram speaks for itself, but it is worth spending a few words to highlight the essence.
Insights
- In Level 1, the total amount spent is $313,000.
- At Level 2 (Departments), the size of the nodes shows the amount spent by different departments. In the example, it is easy to check that the Marketing department had the highest expenditure while the Sales department had the lowest.
- Level 3 breaks down the data into multiple categories:
- We spent $155,000 on Salaries, which is 49.52% of the total cost; this is the highest amount.
- The Advertising category had the lowest expenditure of $6,000, which made up only 1.92% of the expenses.
The Sankey Diagram’s main advantage is its ability to uncover patterns. It also delivers insight faster than scanning huge data tables, which is where a high-level view pays off.
You can click on the main blocks, like the example below. In this case, you selected the Marketing Department and got a breakdown immediately.
Take a closer look at the highlighted stage! In Level 2, the selected department had the highest expenditure, $123,000, 39.30% of the total cost. Level 3 breaks down the cost flow of the Marketing activity into six categories. In the example, we spent various amounts on Advertising, Collateral, Events, Salaries, Translation Services, and Website design. Based on this highlighting feature, you can select any department to analyze the cost structure. It is very useful when you want to take a closer look at the given category.
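The percentages quoted in this example are each amount divided by the $313,000 total. A short Python check, using only the figures given in the text (the helper name `share_of_total` is ours):

```python
TOTAL = 313_000  # total spend at Level 1

amounts = {
    "Marketing department": 123_000,
    "Salaries": 155_000,
    "Advertising": 6_000,
}

def share_of_total(amount, total=TOTAL):
    """Return the amount as a percentage of the total."""
    return amount / total * 100

for name, amount in amounts.items():
    print(f"{name}: {share_of_total(amount):.2f}%")
# Marketing department: 39.30%
# Salaries: 49.52%
# Advertising: 1.92%
```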
So, creating an analysis based on the chart can take only minutes. To reset the layout, click on the diagram background. You can take a closer look at the example!
How to make a Sankey diagram in Excel manually
This section explains how to build a Sankey Diagram in Excel without add-ins. The trick is to build multiple charts and merge them into a final diagram: when the shaded connector lines are partially transparent and overlap, together they form a Sankey diagram. The connectors use 100% stacked area charts, and the start and end blocks are based on 100% stacked column charts. We will show you how to transform a sales table into a diagram.
You can download the practice file.
#1. Setting up the data
For the sake of simplicity, we will use a 4×4 matrix to create a Sankey Diagram. Each row represents a starting point of the diagram, and each column represents an endpoint. The numeric values show the flow rate.
We also use an additional named range called Gap.
The reason for using these values is to ensure spaces between categories.
#2. Create Calculation tables
We will use three tables and one named range to create the structure.
- Table names: Lines, StartBlock, EndBlock
- Named range: Divider
The Lines table uses our initial data set. We will create a table containing all row and column combinations. To separate the row categories, insert an additional row. The goal of creating additional rows is to insert blank spaces between the chart components.
The values in the Lines table are linked to the original table. Let us see the formulas!
#3. Calculate Data Points
Formula:
=IF(LEFT([@From],5)="Blank",Gap,INDEX(Data,MATCH([@From],Data[From / To],0),MATCH([@To],Data[#Headers],0)))
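For readers who find the logic easier to follow outside Excel, the Python sketch below mimics what the formula does: spacer rows whose From name starts with "Blank" get the Gap value, and every other row looks up its value in the source matrix. The matrix contents, the gap size, and the "Blank 1"-style names are assumptions made for the example.

```python
GAP = 5  # hypothetical spacing value; the workbook keeps it in the named range Gap

# Hypothetical extract of a From / To matrix.
data = {
    "North": {"Online": 40, "Retail": 10},
    "South": {"Online": 25, "Retail": 30},
}

def build_lines(data, gap):
    """List every From/To pair and put a 'Blank' spacer row between From groups."""
    lines = []
    for index, (source, targets) in enumerate(data.items(), start=1):
        if index > 1:
            # Mirrors the IF(LEFT([@From],5)="Blank", Gap, ...) branch.
            lines.append({"From": f"Blank {index - 1}", "To": "", "Value": gap})
        for target, value in targets.items():
            # Mirrors the INDEX/MATCH lookup into the Data table.
            lines.append({"From": source, "To": target, "Value": value})
    return lines

lines = build_lines(data, GAP)
```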
To insert a 100% stacked area chart, we need to calculate the following data points:
- The gap above the line
- Value of the line
- The gap below the line
The “End Position” column decides the order of the lines (connectors) at the diagram’s end.
Apply some additional columns to perform the necessary calculations.
Explanation:
- Above Start: =SUM(Lines[[#Headers],[Value]]:[@Value])-[@Value] The formula adds up the values of all preceding rows, which determines the required size of the blank area above the line at the starting point.
- Above Middle 1: =[@[Above Start]] The value is a simple reference to the Above Start column.
- Above Middle 2: =[@[Above End]] The value is a simple reference to the Above End column.
- Above End: =SUM([Value])-SUMIFS([Value],[End Position],">="&[@[End Position]]) The formula returns the required space above the Sankey line at the endpoint.
- Start Value, Middle 1 Value, Middle 2 Value, End Value: =[@Value]
- Below Start: =SUM([Value])-[@[Above Start]]-[@[Start Value]]
- Below Middle 1: =SUM([Value])-[@[Above Middle 1]]-[@[Middle 1 Value]]
- Below Middle 2: =SUM([Value])-[@[Above Middle 2]]-[@[Middle 2 Value]]
- Below End: =SUM([Value])-[@[Above End]]-[@[End Value]]
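The same arithmetic can be written as a small Python function, which may help when checking the table columns by hand. It is a sketch of the formulas above, not a replacement for them; the function name `stack_positions` and the test values are assumptions for the example.

```python
def stack_positions(values, end_positions):
    """Return the gap above, the value, and the gap below for each line,
    at the start and at the end of the connector."""
    total = sum(values)
    rows = []
    running = 0  # sum of the values in the rows above the current one
    for value, end_pos in zip(values, end_positions):
        above_start = running
        # Total minus everything at or after this end position,
        # which leaves the lines that finish above this one.
        above_end = total - sum(
            v for v, p in zip(values, end_positions) if p >= end_pos
        )
        rows.append({
            "above_start": above_start,
            "above_end": above_end,
            "value": value,
            "below_start": total - above_start - value,
            "below_end": total - above_end - value,
        })
        running += value
    return rows

# Hypothetical lines: three values and the order they arrive in at the end.
positions = stack_positions([10, 20, 30], [2, 3, 1])
```

In every row, the gap above, the value, and the gap below add up to the same total, which is what lets each line be drawn as its own 100% stacked area chart.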
The StartBlock table collects each row category from the source table, and every other row is a gap row.
The formula in the value column: =SUMIFS(Lines[Value],Lines[From],[@From])
The EndBlock table works the same way as the StartBlock table. The difference is that it uses the column categories of the main Data table. After every nth row, insert a Gap row.
Formula: =SUMIFS(Lines[Value],Lines[To],[@To])
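Both SUMIFS formulas are grouped sums over the Lines table: one keyed on From, the other on To. A minimal Python equivalent, using made-up rows and an assumed helper name `sum_by`:

```python
from collections import defaultdict

# Hypothetical rows of the Lines table.
sample_lines = [
    {"From": "North", "To": "Online", "Value": 40},
    {"From": "North", "To": "Retail", "Value": 10},
    {"From": "South", "To": "Online", "Value": 25},
    {"From": "South", "To": "Retail", "Value": 30},
]

def sum_by(lines, key):
    """Add up Value for every distinct entry in the given column (like SUMIFS)."""
    totals = defaultdict(float)
    for line in lines:
        totals[line[key]] += line["Value"]
    return dict(totals)

start_blocks = sum_by(sample_lines, "From")  # sizes of the start blocks
end_blocks = sum_by(sample_lines, "To")      # sizes of the end blocks
```

The start and end totals always add up to the same grand total, so the two column charts end up the same height.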
The final part of the calculations is a named range called “Divider.” We will use this range as a horizontal category axis. It establishes the starting and ending points of the chart’s gradient.
#4. Insert a stacked area chart and use overlapping
All calculations are ready; it is time to create the chart section. Select the data and insert a 100% stacked area chart. Each row of the Lines table represents a 100% stacked area chart that uses three data series.
The next step is to format the chart. Right-click on the axis and click Format Axis. Under “Axis Options,” check Date Axis. Also, check the “Dates in reverse order” checkbox.
Finally, clean up the chart area. Delete legends and the chart title. Apply “No fill” for the chart background.
Repeat these steps for all rows! The Sankey Diagram connectors are based on multiple small charts.
#5. Create Sankey Blocks
The last step is to create the start and the end blocks using two 100% stacked column charts.
Apply the following formatting setup for the blocks:
- Plot series in reverse order
- Fill the blocks.
- Set the “Gap” sections with “No Fill”
- Add data labels.
Use the same method for the end blocks. Finally, align the connectors properly between the start and end blocks; our Sankey Diagram is ready.
Frequently Asked Questions about Sankey Diagrams
The diagram can be used for various purposes. Its main function is to visualize the flow of a particular resource (money, time allocation, various activities) between two or more categories.
One of the great advantages of the Sankey Diagram is that it is customizable, like most charts in Excel, and offers different views.
You can use a data set that contains various categories. The main point is that you can only use numbers in the last column of your data set.
If you want to use another chart type, the Sales Funnel is the recommended type of visualization. If you are in Sales, use a sales funnel to follow the activities from the first cold call to the purchase.
Avoid creating complicated diagrams! Remember the following rule: use a maximum of 3 or 4 nodes and try to keep the number of flows below 10. Without these conditions, the diagram loses its most important property.
The wider node represents greater resource usage between the two categories. It is crucial to identify inefficient or less efficient areas or activities. The Sankey Diagram is a quick analysis tool that makes the critical points easy to spot. Based on the result, just create a different scenario and regenerate the chart.
Conclusion
So, Sankey diagrams are important, whether you’re trying to save energy at home, run a giant company, or figure out traffic patterns. The main point is that they help turn “What is happening?” into “Oh, I see now!”
We hope you enjoyed this small introduction. Learn more about chart templates!