Creating Sunburst Charts

4. Creating Sunburst Charts

Sunburst charts are a hierarchical data visualization technique that displays relationships and proportions within nested categories using concentric circles. Each layer of the chart represents a level in the hierarchy, starting from the center and expanding outward. For instance, in a company structure, the inner circle could represent the organization as a whole, the next layer the departments, and subsequent layers the teams or individual roles within those departments. The size of each segment typically corresponds to a numerical value, such as revenue or population, making it easy to compare proportions within and across different categories.

Sunburst charts are particularly useful for visualizing hierarchical data with many layers and showcasing the composition of categories within a whole. They are effective in situations where relationships and proportions at multiple levels need to be understood simultaneously, such as organizational structures, file system breakdowns, or market share by product lines and subcategories. Their circular layout makes efficient use of space and can visually engage an audience, making them a great choice for storytelling and presentations.

However, sunburst charts should be avoided when the hierarchy is too deep or the data is too detailed, as this can lead to visual clutter and make the chart difficult to interpret. Additionally, they are not ideal for datasets requiring precise comparisons between categories, as the curved segments can distort perception compared to linear representations. In such cases, alternative visualizations like tree maps or bar charts may be more effective. The key is to use sunburst charts when the hierarchical structure and overall proportions are the primary focus, rather than specific quantitative analysis.

Getting ready

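For this recipe we will create two data sets. The first one, df1, contains personal goals grouped into categories: the column Activity holds the label of each sector, Category holds the label of its parent (an empty string for the root), and Value holds a number for each sector. The second one, df2, is an extract from the Gapminder data set restricted to the year 2007.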

import pandas as pd

df1 = pd.DataFrame(dict(Activity=["My Goals", "Health", "Career", "Personal", "Finance", 
               "Exercise", "Diet", "Sleep", 
               "New Skills", "Networking", 
               "Family", "Friends", 
               "Investing", "Saving"
               ],
    Category=["", "My Goals", "My Goals", "My Goals", "My Goals", "Health", "Health", "Health",  "Career",  "Career", "Personal", "Personal", "Finance", "Finance"],
    Value=[0, 25, 25, 25, 25, 10, 10, 10, 10, 10, 10, 10, 10, 10]))

df1.head()

	Activity	Category	Value
0	My Goals		0
1	Health	My Goals	25
2	Career	My Goals	25
3	Personal	My Goals	25
4	Finance	My Goals	25
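
Because df1 encodes the hierarchy as label/parent pairs, a misspelled parent is easy to miss. The following is a minimal sketch of a sanity check; the helper check_hierarchy is hypothetical (it is not part of pandas or Plotly) and it works on plain lists, so the two columns can be passed as df1['Activity'].tolist() and df1['Category'].tolist().

```python
def check_hierarchy(names, parents):
    """Return a list of problems found in a names/parents hierarchy.

    Hypothetical helper for illustration; not a pandas or Plotly function.
    """
    problems = []
    known = set(names)
    for name, parent in zip(names, parents):
        # An empty string marks a root sector, so it needs no parent entry.
        if parent != "" and parent not in known:
            problems.append(f"{name!r} has unknown parent {parent!r}")
    if len(known) != len(names):
        # Repeated labels are ambiguous when parents are looked up by label.
        problems.append("duplicate names found")
    return problems


# Same labels as the Activity and Category columns of df1.
names = ["My Goals", "Health", "Career", "Personal", "Finance",
         "Exercise", "Diet", "Sleep", "New Skills", "Networking",
         "Family", "Friends", "Investing", "Saving"]
parents = ["", "My Goals", "My Goals", "My Goals", "My Goals",
           "Health", "Health", "Health", "Career", "Career",
           "Personal", "Personal", "Finance", "Finance"]

print(check_hierarchy(names, parents))
```

An empty list means every parent label also appears as a sector label. If labels do repeat across branches, the ids argument of px.sunburst is the usual way to disambiguate them.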

import plotly.express as px

df2 = px.data.gapminder().query("year == 2007")

df2.head()

	country	continent	year	lifeExp	pop	gdpPercap	iso_alpha	iso_num
11	Afghanistan	Asia	2007	43.828	31889923	974.580338	AFG	4
23	Albania	Europe	2007	76.423	3600523	5937.029526	ALB	8
35	Algeria	Africa	2007	72.301	33333216	6223.367465	DZA	12
47	Angola	Africa	2007	42.731	12420476	4797.231267	AGO	24
59	Argentina	Americas	2007	75.320	40301927	12779.379640	ARG	32

How to do it

Import the plotly.express module as px

import plotly.express as px

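Create a minimal sunburst chart by calling the function sunburst from Plotly Express, passing the data set df1 together with the following key arguments:

- names: the column holding the label of each sector (here Activity)
- parents: the column holding the label of each sector's parent (here Category)

In addition, the following extra arguments are specified:

- height and width: the size of the figure in pixels
- title: the title of the figure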

fig = px.sunburst(df1, names='Activity', parents='Category', 
                  height=600, width=900,
                  title='Personal Goals 2025')
fig.show()

Customise the colors used in the sunburst chart with the argument color_discrete_sequence

fig = px.sunburst(df1, names='Activity', parents='Category', 
                  color_discrete_sequence=px.colors.sequential.deep,
                  height=600, width=900,
                  title='Personal Goals 2025')
fig.show()

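Customise the data shown in the hover tooltip with the arguments hover_name and hover_data. Here hover_name='Category' uses the parent label as the tooltip title, while the dictionary passed to hover_data shows the Activity column (True) and hides the Category column (False)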

fig = px.sunburst(df1, names='Activity', parents='Category', 
                  color_discrete_sequence=px.colors.sequential.deep,
                  hover_name='Category',
                  hover_data={'Activity':True, 'Category': False},
                  height=600, width=900,
                  title='Personal Goals 2025')
fig.show()

For our second example, we use the second data set, df2, which contains a subset of the Gapminder data. The idea is to visualise both population size and life expectancy across the world, with particular interest in the comparison between regions.

Create a sunburst chart using the function sunburst and passing the arguments
- path: the list of column names (or columns) of a rectangular dataframe that defines the hierarchy of sectors, from root to leaves (going from the center outwards). In this case, we pass the list ['continent', 'country'] since we want to be able to visualise the data by region
- values: the name of the column (alternatively, you can pass an array object) that sets the size of the wedges which form the sunburst chart. In our case we pass pop since we want to visualise the population size
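
To make the role of path and values concrete: with path=['continent', 'country'] and values='pop', the size of a continent sector corresponds to the summed population of its countries. The following sketch reproduces that aggregation with pandas on a tiny, made-up frame; the country names A, B, C and their numbers are placeholders, not Gapminder data.

```python
import pandas as pd

# Hypothetical toy data with the same column names as df2.
toy = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe"],
    "country": ["A", "B", "C"],
    "pop": [30, 10, 5],
})

# Outer ring: one wedge per country, sized by its own pop value.
# Inner ring: one wedge per continent, sized by the sum over its countries.
continent_totals = toy.groupby("continent")["pop"].sum()
print(continent_totals.to_dict())
```

Running the same groupby on df2 gives the continent totals that the inner ring of the chart represents.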

fig = px.sunburst(df2, 
                  path=['continent', 'country'], 
                  values='pop',
                  height=600, width=900,
                  title='Population across Regions in 2007'
                  )
fig.show()

Use the argument color to add an extra dimension to the data through the color of the wedges. In this case, we associate the color of each wedge with the life expectancy of the corresponding country or region, so we set color='lifeExp'. For the continent wedges, Plotly Express computes the color as the average of the countries' values, weighted by the column given in values. We also use the argument color_continuous_scale to set the color palette

fig = px.sunburst(df2, 
                  path=['continent', 'country'], 
                  values='pop',
                  color='lifeExp', 
                  color_continuous_scale='RdBu',
                  height=600, width=900,
                  title='Life Expectancy across Regions in 2007'
                  )
fig.show()

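Set the midpoint of the color bar to the average life expectancy weighted by the population size of each country, so that the diverging RdBu scale is centred on the world average and wedges above and below it fall on opposite sides of the scale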

import numpy as np
avg_life_exp_pop = np.average(df2['lifeExp'], weights=df2['pop'])

fig = px.sunburst(df2, path=['continent', 'country'], values='pop',
                  color='lifeExp', 
                  color_continuous_scale='RdBu',
                  color_continuous_midpoint=avg_life_exp_pop,
                  hover_name='country',
                  hover_data=['iso_alpha'],
                  height=600, width=900,
                  title='Life Expectancy across Regions in 2007'
                  )
fig.show()
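
The midpoint above is weighted by population rather than being a plain mean over countries. The small sketch below shows why that matters, using two made-up countries; the numbers are illustrative only and do not come from the Gapminder data.

```python
import numpy as np

# Hypothetical values: a large country with low life expectancy
# and a small country with high life expectancy.
life_exp = np.array([50.0, 80.0])
pop = np.array([3, 1])

plain_mean = life_exp.mean()                        # every country counts equally
weighted_mean = np.average(life_exp, weights=pop)   # every person counts equally

print(plain_mean, weighted_mean)  # 65.0 57.5
```

The weighted mean is pulled towards the more populous country, which matches how the wedge sizes in the chart already emphasise population.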
