1. Making animated scatter charts

Animated scatter plots are a dynamic and engaging data visualization technique that allows us to track changes in multi-dimensional data over time (or across a continuous variable). Unlike static scatter/bubble plots, which capture data at a single state, animated scatter plots show movement and transitions in the data, revealing trends, patterns, and outliers that might otherwise go unnoticed. Each frame of the animation represents a snapshot of the data at a particular time or state, and the sequential movement provides a narrative of how the data evolves.

This technique is particularly useful when the interest lies in time-series data or in how one or more variables change. For instance, animated scatter plots can effectively show economic shifts, such as changes in GDP per capita and life expectancy across countries over decades. Animating the data makes it easier to spot correlations, trends, and cycles, while engaging the viewer in a way that static graphs may not. They are also a powerful tool for presentations and storytelling, as they can visually guide the audience through the insights and context behind the data.

Getting ready

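For this recipe, we load the Gapminder data set that ships with the plotly.express module. We also drop the rows for Kuwait, whose GDP per capita in the early years is far higher than that of any other country and would otherwise stretch the x-axis.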

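import plotly.express as px

# Load the Gapminder data set bundled with Plotly
df = px.data.gapminder()
# Exclude Kuwait from every chart in this recipe
df = df[df.country != 'Kuwait']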

Inspect the data by calling the head method on the data frame:

df.head()
       country continent  year  lifeExp       pop   gdpPercap iso_alpha  iso_num
0  Afghanistan      Asia  1952   28.801   8425333  779.445314       AFG        4
1  Afghanistan      Asia  1957   30.332   9240934  820.853030       AFG        4
2  Afghanistan      Asia  1962   31.997  10267083  853.100710       AFG        4
3  Afghanistan      Asia  1967   34.020  11537966  836.197138       AFG        4
4  Afghanistan      Asia  1972   36.088  13079460  739.981106       AFG        4

How to do it

  1. Make a simple animated scatter plot using the function px.scatter in the same way as for a static scatter plot (passing the data frame and the names of the two columns to be plotted as x and y, respectively), but adding the following arguments:

  • animation_frame: this column or array-like is used to assign marks to animation frames. In our case, we pass the string year, since we want the animation to run over time

  • animation_group: this column or array-like is used to provide object constancy across animation frames. That is, rows with matching animation_group values are treated as describing the same object in each frame. In our case, we pass the string country, since each dot represents a country that we want to follow over time

Notice that we are also passing the following arguments to set the aesthetics of the plot:

  • color (here the column continent, so that each continent gets its own color)

  • color_discrete_sequence

  • height and width

  • template

Then, use the method show to display the Figure object:

fig = px.scatter(df, x='gdpPercap', y ='lifeExp',
                 animation_frame="year", 
                 animation_group="country",
                 color="continent",
                 color_discrete_sequence=px.colors.qualitative.Bold,
                 height=500, width=800,
                 template='plotly_white',
                 title='Gap Minder Data: GDP per Capita vs Life Expectancy'
                 )
fig.show()

By inspecting the resulting animation, we can quickly make the following observations:

  • the name of the country is not visible when hovering over a point

  • the range of both axes is fixed, and some points fall outside of it as the animation progresses

Let’s improve our animation by fixing these two issues.

  2. Add the argument hover_name and pass the string country. As a result, the name of the country appears in bold at the top of the hover tooltip.

fig = px.scatter(df, x='gdpPercap', y ='lifeExp', 
                 animation_frame="year", 
                 animation_group="country",
                 color="continent",
                 color_discrete_sequence=px.colors.qualitative.Bold,
                 template='plotly_white',
                 height=600, width=800,
                 title='Gap Minder Data: GDP per Capita vs Life Expectancy',
                 hover_name="country",
                 )
fig.show()

  3. To find appropriate ranges for the axes, call the method describe on the data frame df and look at the minimum and maximum values of each relevant column. Then, use the arguments range_x and range_y to set the axes’ ranges for the animation.

df.describe()
              year      lifeExp           pop     gdpPercap      iso_num
count  1692.000000  1692.000000  1.692000e+03   1692.000000  1692.000000
mean   1979.500000    59.407433  2.980259e+07   6803.145639   425.964539
std      17.265365    12.923315  1.065068e+08   8139.536000   249.183165
min    1952.000000    23.599000  6.001100e+04    241.165876     4.000000
25%    1965.750000    48.125000  2.829882e+06   1192.603485   208.000000
50%    1979.500000    60.492000  7.150606e+06   3484.113173   410.000000
75%    1993.250000    70.811250  1.977102e+07   9145.776073   638.000000
max    2007.000000    82.603000  1.318683e+09  49357.190170   894.000000

This shows that the values in the gdpPercap column go from 241 to 49,357. Thus, passing range_x = [0, 55000] covers all the values. Similarly, the values in the lifeExp column go from 23.6 to 82.6. Thus, passing range_y = [20, 90] covers all the values.

fig = px.scatter(df, x='gdpPercap', y ='lifeExp', 
                 animation_frame="year", 
                 animation_group="country",
                 hover_name="country",
                 color="continent",
                 color_discrete_sequence=px.colors.qualitative.Bold,
                 template='plotly_white',
                 height=600, width=800,
                 title='Gap Minder Data: GDP per Capita vs Life Expectancy',
                 range_x = [0, 55000],
                 range_y = [20, 90]
                 )
fig.show()
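
  4. Transform your animated scatter plot into an animated bubble chart by passing the additional arguments

  • size: the column that determines the size of each bubble (here pop, the population)

  • size_max: the maximum mark size

This allows you to display an extra variable in your visualization. Note that the x range in the code below starts at -5000, which leaves room for the bubbles near the left edge of the plot.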

fig = px.scatter(df, x='gdpPercap', y ='lifeExp', 
                 animation_frame="year", 
                 animation_group="country",
                 hover_name="country",
                 color="continent",
                 color_discrete_sequence=px.colors.qualitative.Bold,
                 size="pop",
                 size_max=50, 
                 template='plotly_white',
                 height=600, width=800,
                 title='Gap Minder Data: GDP per Capita vs Life Expectancy',
                 range_x = [-5000, 55000],
                 range_y = [20, 90]
                 )
fig.show()

There is more

Using a logarithmic scale for an axis in data visualization is a powerful technique for displaying data that spans several orders of magnitude. Unlike a linear scale, where each unit increase corresponds to an equal increment, a logarithmic scale represents data in terms of powers of a base (commonly 10).

This approach compresses large values and expands smaller ones, making it easier to visualize and compare data with extreme ranges. Logarithmic scales are particularly useful in fields like finance (e.g., stock price changes), science (e.g., earthquake magnitudes, pH levels), and engineering (e.g., signal strength). They are ideal for identifying proportional relationships, exponential growth, or power-law distributions that might be obscured on a linear scale.

However, it is crucial to clearly label the axis and ensure the audience understands the scale, as logarithmic representations can be less intuitive for those unfamiliar with the concept. Misuse or poor communication of the logarithmic scale can lead to confusion or misinterpretation.

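Set the scale of the x-axis to logarithmic by passing the argument log_x=True. The limits in range_x are still given in data units; here, 100 to 100,000 spans three powers of ten: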

fig = px.scatter(df, x='gdpPercap', y ='lifeExp', 
                 animation_frame="year", 
                 animation_group="country",
                 hover_name="country",
                 color="continent",
                 size="pop",
                 size_max=75, 
                 height=600, width=800,
                 template='plotly_white',
                 title='Gap Minder Data: GDP per Capita vs Life Expectancy',
                 color_discrete_sequence=px.colors.qualitative.Bold,
                 range_x=[100,100000],
                 range_y = [20, 90],
                 log_x=True,
                 )
fig.show()
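
The choice of range_x=[100, 100000] can be motivated with a short calculation. The sketch below takes the minimum and maximum of gdpPercap reported by df.describe() earlier in this recipe, converts them to base-10 exponents, and rounds outward to whole powers of ten; the variable names are illustrative.

```python
import math

# Minimum and maximum of gdpPercap, as reported by df.describe()
gdp_min, gdp_max = 241.165876, 49357.190170

# Position of both values on a base-10 logarithmic scale
low_exp = math.log10(gdp_min)
high_exp = math.log10(gdp_max)
print(round(low_exp, 2), round(high_exp, 2))

# Round outward to whole powers of ten to get limits in data units
range_x = [10 ** math.floor(low_exp), 10 ** math.ceil(high_exp)]
print(range_x)
```

Both exponents fall between 2 and 5, so limits of 100 and 100,000 enclose all the values with tick marks at whole powers of ten.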