4. Creating Sunburst Charts#
Sunburst charts are a hierarchical data visualization technique that displays relationships and proportions within nested categories using concentric circles. Each layer of the chart represents a level in the hierarchy, starting from the center and expanding outward. For instance, in a company structure, the inner circle could represent the organization as a whole, the next layer the departments, and subsequent layers the teams or individual roles within those departments. The size of each segment typically corresponds to a numerical value, such as revenue or population, making it easy to compare proportions within and across different categories.
Sunburst charts are particularly useful for visualizing hierarchical data with many layers and showcasing the composition of categories within a whole. They are effective in situations where relationships and proportions at multiple levels need to be understood simultaneously, such as organizational structures, file system breakdowns, or market share by product lines and subcategories. Their circular layout makes efficient use of space and can visually engage an audience, making them a great choice for storytelling and presentations.
However, sunburst charts should be avoided when the hierarchy is too deep or the data is too detailed, as this can lead to visual clutter and make the chart difficult to interpret. Additionally, they are not ideal for datasets requiring precise comparisons between categories, as the curved segments can distort perception compared to linear representations. In such cases, alternative visualizations like tree maps or bar charts may be more effective. The key is to use sunburst charts when the hierarchical structure and overall proportions are the primary focus, rather than specific quantitative analysis.
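To make the link between a segment's value and its size concrete, the following minimal sketch converts a set of values into the angle each segment would span within a single ring. The department names and revenue figures are hypothetical, and the helper `segment_angles` only illustrates the arithmetic; it is not how any particular plotting library lays out its sectors.

```python
def segment_angles(values):
    """Map each label to the angle (in degrees) its segment spans in one ring."""
    total = sum(values.values())
    return {label: 360 * value / total for label, value in values.items()}


# Hypothetical revenue per department, in arbitrary units
revenue = {"Sales": 50, "Engineering": 30, "Support": 20}

angles = segment_angles(revenue)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

A segment holding half of the total spans half of the ring, which is why proportions are easy to read at a glance but small differences between similar values are not.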
Getting ready#
For this recipe we will create two data sets. The first one, df1, contains data representing personal goals that fall into different categories. The second one, df2, is an extract from the Gapminder data set.
import pandas as pd
df1 = pd.DataFrame(dict(
    Activity=["My Goals", "Health", "Career", "Personal", "Finance",
              "Exercise", "Diet", "Sleep",
              "New Skills", "Networking",
              "Family", "Friends",
              "Investing", "Saving"],
    Category=["", "My Goals", "My Goals", "My Goals", "My Goals",
              "Health", "Health", "Health",
              "Career", "Career",
              "Personal", "Personal",
              "Finance", "Finance"],
    Value=[0, 25, 25, 25, 25, 10, 10, 10, 10, 10, 10, 10, 10, 10]))
df1.head()
| | Activity | Category | Value |
|---|---|---|---|
| 0 | My Goals | | 0 |
| 1 | Health | My Goals | 25 |
| 2 | Career | My Goals | 25 |
| 3 | Personal | My Goals | 25 |
| 4 | Finance | My Goals | 25 |
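Because the hierarchy is encoded as two parallel columns, a quick consistency check can be useful before plotting. The sketch below copies the Activity and Category values of df1 into plain lists and reports the root entries (those with an empty parent) and any parent label that does not appear among the names. The helper `check_hierarchy` is a hypothetical name introduced for illustration only.

```python
names = ["My Goals", "Health", "Career", "Personal", "Finance",
         "Exercise", "Diet", "Sleep", "New Skills", "Networking",
         "Family", "Friends", "Investing", "Saving"]
parents = ["", "My Goals", "My Goals", "My Goals", "My Goals",
           "Health", "Health", "Health", "Career", "Career",
           "Personal", "Personal", "Finance", "Finance"]


def check_hierarchy(names, parents):
    """Return (roots, unknown_parents) for a names/parents hierarchy."""
    known = set(names)
    # Entries whose parent is the empty string sit at the center of the chart
    roots = [name for name, parent in zip(names, parents) if parent == ""]
    # Parent labels that never appear as a name cannot be attached anywhere
    unknown = sorted({p for p in parents if p and p not in known})
    return roots, unknown


roots, unknown = check_hierarchy(names, parents)
print("roots:", roots)
print("unknown parents:", unknown)
```

For the data above, the check finds a single root and no dangling parent labels.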
import plotly.express as px
df2 = px.data.gapminder().query("year == 2007")
df2.head()
| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num |
|---|---|---|---|---|---|---|---|---|
| 11 | Afghanistan | Asia | 2007 | 43.828 | 31889923 | 974.580338 | AFG | 4 |
| 23 | Albania | Europe | 2007 | 76.423 | 3600523 | 5937.029526 | ALB | 8 |
| 35 | Algeria | Africa | 2007 | 72.301 | 33333216 | 6223.367465 | DZA | 12 |
| 47 | Angola | Africa | 2007 | 42.731 | 12420476 | 4797.231267 | AGO | 24 |
| 59 | Argentina | Americas | 2007 | 75.320 | 40301927 | 12779.379640 | ARG | 32 |
How to do it#
Import the plotly.express module as px
import plotly.express as px
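Create a minimal sunburst chart by calling the function sunburst from Plotly Express and passing the data set df1 together with the following key arguments:
names: the column holding the label of each sector (here Activity)
parents: the column holding the label of each sector's parent (here Category); an empty string marks the root
In addition, the following optional arguments are specified:
height and width: the size of the figure in pixels
title: the title of the figure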
fig = px.sunburst(df1, names='Activity', parents='Category',
height=600, width=900,
title='Personal Goals 2025')
fig.show()
Customise the colors used in the sunburst chart by using the argument color_discrete_sequence
fig = px.sunburst(df1, names='Activity', parents='Category',
color_discrete_sequence=px.colors.sequential.deep,
height=600, width=900,
title='Personal Goals 2025')
fig.show()
Customise the data shown in the hover tooltip by specifying the arguments hover_name and hover_data. When hover_data is a dictionary, each column name maps to True or False to show or hide that column
fig = px.sunburst(df1, names='Activity', parents='Category',
color_discrete_sequence=px.colors.sequential.deep,
hover_name='Category',
hover_data={'Activity':True, 'Category': False},
height=600, width=900,
title='Personal Goals 2025')
fig.show()
For our second example, we are going to use the second data set, df2, which contains a subset of the Gapminder data. The idea is to visualise both population size and life expectancy around the world, with particular interest in the comparison between regions.
Create a sunburst chart using the function sunburst and passing the following arguments:
path: the list of column names (or columns) of a rectangular dataframe that define the hierarchy of sectors, from root to leaves (going from the center outwards). In this case, we pass the list ['continent', 'country'] since we want to be able to visualise the data by region
values: the name of the column (alternatively, you can pass an array-like object) that sets the size of the wedges forming the sunburst chart. In our case we pass pop since we want to visualise the population size
fig = px.sunburst(df2,
path=['continent', 'country'],
values='pop',
height=600, width=900,
title='Life Expectancy across Regions in 2007'
)
fig.show()
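When the hierarchy is given through path, the size of each inner sector follows from the sum of the values of its children. The sketch below reproduces that aggregation with a pandas groupby on a small hypothetical data frame (the countries A to D and their populations are made up and do not come from Gapminder), which can help to anticipate how large each continent will appear.

```python
import pandas as pd

# Hypothetical populations in millions, not taken from Gapminder
toy = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "country": ["A", "B", "C", "D"],
    "pop": [30, 70, 40, 10],
})

# Total per continent: the quantity that sizes the inner ring
continent_totals = toy.groupby("continent")["pop"].sum()

# Share of the whole circle taken by each continent
shares = continent_totals / continent_totals.sum()
print(continent_totals)
print(shares.round(3))
```

Here Asia holds 100 of the 150 million people, so its sector would cover two thirds of the inner ring.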
Use the argument color to add an extra dimension to the data through the color of the wedges. In this case, we associate the color of each wedge with the life expectancy of the corresponding country or region, so we set color='lifeExp'. We also use the argument color_continuous_scale to set the color palette to be used
fig = px.sunburst(df2,
path=['continent', 'country'],
values='pop',
color='lifeExp',
color_continuous_scale='RdBu',
height=600, width=900,
title='Life Expectancy across Regions in 2007'
)
fig.show()
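The continents have no lifeExp value of their own. According to the Plotly Express documentation, the color of a parent sector is computed as the average of the color values of its children, weighted by their values. The sketch below mimics that calculation on a small hypothetical data frame (made-up countries and numbers, not Gapminder data); the helper `weighted_life_exp` is introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical data: population in millions, life expectancy in years
toy = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "country": ["A", "B", "C", "D"],
    "pop": [30, 70, 40, 10],
    "lifeExp": [60.0, 70.0, 80.0, 75.0],
})


def weighted_life_exp(group):
    """Population-weighted mean life expectancy of one group of countries."""
    return float(np.average(group["lifeExp"], weights=group["pop"]))


# One color value per continent, as a parent sector would receive
continent_color = {
    name: weighted_life_exp(group) for name, group in toy.groupby("continent")
}
print(continent_color)
```

In this toy example, country B dominates Asia's population, so the continent's value lands closer to 70 than to 60.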
Set the midpoint of the color scale, using the argument color_continuous_midpoint, to the average life expectancy weighted by the population size of each country
import numpy as np
avg_life_exp_pop = np.average(df2['lifeExp'], weights=df2['pop'])
fig = px.sunburst(df2, path=['continent', 'country'], values='pop',
color='lifeExp',
color_continuous_scale='RdBu',
color_continuous_midpoint=avg_life_exp_pop,
hover_name='country',
hover_data=['iso_alpha'],
height=600, width=900,
title='Life Expectancy across Regions in 2007'
)
fig.show()
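The population-weighted average used for the midpoint can differ noticeably from a plain mean, because populous countries count for more. The sketch below contrasts the two on two hypothetical countries (the figures are made up for illustration).

```python
import numpy as np

# Two hypothetical countries: a small one with high life expectancy
# and a large one with low life expectancy
life_exp = np.array([80.0, 50.0])
population = np.array([1_000_000, 3_000_000])

simple_mean = life_exp.mean()
weighted_mean = np.average(life_exp, weights=population)

print(f"simple mean:   {simple_mean:.1f}")
print(f"weighted mean: {weighted_mean:.1f}")
```

With a diverging palette such as RdBu, the midpoint is mapped to the neutral color at the center of the scale, so choosing the weighted average makes wedges above and below the typical life expectancy of the world's population fall on opposite sides of the palette.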