5. Creating Icicle Charts

Icicle charts are a hierarchical data visualization technique that displays relationships within nested categories using a stacked rectangular layout. Each layer of the hierarchy is represented by a horizontal segment, with the root node (or top-level category) placed at the top and subsequent layers branching downward. The width of each segment reflects a quantitative value, such as size or frequency, making it easy to compare the proportions of different categories within the hierarchy.
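To make the layout rule above concrete, the following is a minimal sketch of how segment positions could be derived from a hierarchy. It is not Plotly's own layout code: the function name icicle_layout and the toy budget data are hypothetical, and the sketch assumes that a parent's size is simply the sum of its leaves.

```python
from collections import defaultdict


def icicle_layout(nodes):
    """Return {name: (depth, x0, x1)} for a hierarchy.

    `nodes` maps each name to (parent, value); the root has parent "".
    A parent's span is split among its children in proportion to their
    totals. In this sketch a parent's own value is ignored.
    """
    children = defaultdict(list)
    for name, (parent, _) in nodes.items():
        children[parent].append(name)

    def total(name):
        # Leaf: its own value. Parent: the sum of its children's totals.
        kids = children.get(name, [])
        return sum(total(k) for k in kids) if kids else nodes[name][1]

    layout = {}

    def place(name, depth, x0, x1):
        layout[name] = (depth, x0, x1)
        span, size = x1 - x0, total(name)
        start = x0
        for kid in children.get(name, []):
            width = span * total(kid) / size
            place(kid, depth + 1, start, start + width)
            start += width

    for root in children[""]:
        place(root, 0, 0.0, 1.0)
    return layout


# Hypothetical toy hierarchy: name -> (parent, value)
nodes = {
    "Budget": ("", 0),
    "Rent": ("Budget", 50),
    "Food": ("Budget", 30),
    "Fun": ("Budget", 20),
    "Groceries": ("Food", 20),
    "Dining": ("Food", 10),
}
layout = icicle_layout(nodes)
for name, (depth, x0, x1) in layout.items():
    print(f"{name:10s} depth={depth} from {x0:.2f} to {x1:.2f}")
```

Each node occupies one layer (its depth) and a slice of its parent's span, which is why proportions can be compared directly along one axis.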

Icicle charts are particularly useful when visualizing hierarchical data with multiple levels, such as organizational structures, file systems, or budget breakdowns. Their linear design allows for straightforward reading, as users can follow the hierarchy from top to bottom without the visual distortions that can occur in circular layouts like sunburst charts. This makes icicle charts well-suited for scenarios where precision and clarity in hierarchical relationships are paramount. Additionally, the rectangular format is space-efficient and can be easier to implement and interpret in digital dashboards or printed reports.

However, icicle charts should be avoided when the hierarchy is overly complex or the dataset contains too many branches at each level. This can lead to clutter and make the chart difficult to read, especially if the segments become too narrow to display labels or support meaningful comparisons. They are also less visually engaging than alternative representations like sunburst charts or treemaps, which might be preferred for presentations or storytelling. To maximize their effectiveness, icicle charts should be used when the focus is on hierarchical structure and precise proportions, rather than aesthetics or high-level overviews.

Getting ready

import pandas as pd

# Each row is one node: Activity is the node label, Category is its parent.
# The root ("My Goals") has an empty string as its parent.
df1 = pd.DataFrame(dict(
    Activity=["My Goals", "Health", "Career", "Personal", "Finance",
              "Exercise", "Diet", "Sleep",
              "New Skills", "Networking",
              "Family", "Friends",
              "Investing", "Saving"],
    Category=["", "My Goals", "My Goals", "My Goals", "My Goals",
              "Health", "Health", "Health",
              "Career", "Career",
              "Personal", "Personal",
              "Finance", "Finance"],
    Value=[0, 25, 25, 25, 25, 10, 10, 10, 10, 10, 10, 10, 10, 10]))
df1.head(3)
   Activity  Category  Value
0  My Goals                0
1    Health  My Goals     25
2    Career  My Goals     25
import plotly.express as px
df2 = px.data.gapminder().query("year == 2007")
df2.head(3)
        country  continent  year  lifeExp       pop    gdpPercap  iso_alpha  iso_num
11  Afghanistan       Asia  2007   43.828  31889923   974.580338        AFG        4
23      Albania     Europe  2007   76.423   3600523  5937.029526        ALB        8
35      Algeria     Africa  2007   72.301  33333216  6223.367465        DZA       12
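Before plotting a names/parents table such as df1, it can help to check that the hierarchy is well formed. The sketch below rebuilds the same hierarchy and looks for three common problems; the variable names roots, orphans and duplicates are our own, and the checks reflect our understanding that every parent should also appear as a label and that labels should be unique.

```python
import pandas as pd

# Rebuild the hierarchy from df1 so the snippet runs on its own
pairs = [("My Goals", ""), ("Health", "My Goals"), ("Career", "My Goals"),
         ("Personal", "My Goals"), ("Finance", "My Goals"),
         ("Exercise", "Health"), ("Diet", "Health"), ("Sleep", "Health"),
         ("New Skills", "Career"), ("Networking", "Career"),
         ("Family", "Personal"), ("Friends", "Personal"),
         ("Investing", "Finance"), ("Saving", "Finance")]
df1 = pd.DataFrame(pairs, columns=["Activity", "Category"])

names = set(df1["Activity"])

# Nodes whose parent is the empty string are roots
roots = df1.loc[df1["Category"] == "", "Activity"].tolist()

# Parents that never appear as a label would leave their children detached
orphans = sorted(set(df1["Category"]) - names - {""})

# Repeated labels are ambiguous when used as parents
duplicates = df1.loc[df1["Activity"].duplicated(), "Activity"].tolist()

print("roots:", roots)
print("orphan parents:", orphans)
print("duplicate labels:", duplicates)
```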

How to do it

  1. Import the plotly.express module as px

import plotly.express as px
  2. Create a minimal icicle chart by calling the function icicle from Plotly Express and passing the data set df1 together with the following key arguments:

  • names: the column holding the label of each rectangle (here Activity)

  • parents: the column holding the parent of each rectangle (here Category)

In addition, we specify the following optional arguments to set the dimensions and the title of the figure:

  • height and width

  • title

fig = px.icicle(df1, names='Activity',parents='Category',
                height=600, width=600,
                title='Personal Goals 2025')
fig.show()
  3. Customise the color of the root rectangle by calling the method update_traces and setting the argument root_color. By default the root rectangle does not stand out against a white background, so here we set its color to lightgrey to make it visible.

fig = px.icicle(df1, names='Activity',parents='Category',
                height=600, width=600,
                title='Personal Goals 2025')
fig.update_traces(root_color="lightgrey")
fig.show()
  4. Customise the palette used for the inner rectangles by setting the argument color_discrete_sequence in the icicle function.

fig = px.icicle(df1, names='Activity',parents='Category',
                color_discrete_sequence=px.colors.sequential.deep,
                height=600, width=600,
                title='Personal Goals 2025')
fig.update_traces(root_color="lightgrey")
fig.show()
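  5. Customise the hover appearance by setting the arguments hover_name and hover_data. Here the hover title shows the parent Category, while hover_data keeps Activity and hides the duplicate Category entry.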

fig = px.icicle(df1, names='Activity',parents='Category',
                color_discrete_sequence=px.colors.sequential.deep,
                hover_name='Category',
                hover_data={'Activity':True, 'Category': False},
                height=600, width=600,
                title='Personal Goals 2025')
fig.update_traces(root_color="lightgrey")
fig.show()

For our second example, we use the second data set df2, which contains the 2007 subset of the Gapminder data. The idea is to visualise both population and life expectancy across the world, with particular interest in comparing regions.

  1. Create an icicle chart using the function icicle and passing the following arguments:

    • path: the list of column names (or columns) of a rectangular dataframe that define the hierarchy of sectors, from root to leaves. Note that we set the root to a constant named World

    • values: the name of the column (alternatively, you can pass an array object) that determines the size of the rectangles which form the icicle chart. In our case we pass pop

fig = px.icicle(df2, path=[px.Constant("World"), 'continent', 'country'], 
                values='pop',
                height=700, width=700,
                title='Life Expectancy across Regions in 2007')
fig.show()
  2. Use the argument color to add a further dimension to the chart through the color of the rectangles. Here we associate the color of each rectangle with the life expectancy of the corresponding country or region by setting color='lifeExp'. We also use the argument color_continuous_scale to choose the color palette.

fig = px.icicle(df2, path=[px.Constant("World"), 'continent', 'country'], 
                values='pop',
                color='lifeExp',
                color_continuous_scale='RdBu',
                height=700, width=700,
                title='Life Expectancy across Regions in 2007')
fig.show()
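  3. Set the midpoint of the color scale to the average life expectancy weighted by the population of each country, using the argument color_continuous_midpoint. We also pass hover_data=['iso_alpha'] to add the ISO country code to the hover information.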

import numpy as np
average_life_exp_pop = np.average(df2['lifeExp'], weights=df2['pop'])

fig = px.icicle(df2, path=[px.Constant("World"), 'continent', 'country'],
                values='pop',
                color='lifeExp', hover_data=['iso_alpha'],
                color_continuous_scale='RdBu',
                color_continuous_midpoint=average_life_exp_pop,
                height=700, width=700,
                title='Life Expectancy across Regions in 2007')
fig.show()