# Making line charts


A **[line chart](https://en.wikipedia.org/wiki/Line_chart)** is a very popular type of graph used to represent a **series of data points**, called **markers**, connected by **straight line segments**.


It is particularly useful for visualizing **continuous data**, as well as illustrating **trends over time**. The x-axis of a line chart typically represents time or categories, while the y-axis shows the measured value or quantity. 

Each marker on the graph represents a data value, and the lines connecting them help to illustrate how these values change. Sometimes markers are not shown in line charts, especially when the main message of the chart is the trend.

🚀 When to use them:

- Line charts are most useful when you want to visualize continuous data, such as time series (e.g., stock prices, temperature changes, or website traffic over time). They make it easy to identify upward or downward trends and to compare multiple sets of data at once, which is why they are often employed in financial reports, scientific studies, and business analytics. 


⚠️ Be aware:

- Line charts may not be effective when working with *[categorical variables](https://en.wikipedia.org/wiki/Categorical_variable)* or with data that has no inherent order, because the connecting lines suggest a progression that does not exist.

- Additionally, if there are too many data points or the data is highly variable, the chart can become cluttered, making it harder to spot patterns or extract meaningful insights.

## Getting ready

For this recipe we will load the `gapminder` data set and create two subsets of it.

1. Import the `plotly.express` module as `px`

In [1]:
import plotly.express as px

2. Load the `gapminder` data set and add a new column, `Life Expectancy`, holding the life expectancy rounded to one decimal place.

In [2]:
df = px.data.gapminder()
df['Life Expectancy'] = df['lifeExp'].round(1)

3. Create two data sets:
   - `data1` contains only the data for China
   - `data2` contains only the data for countries in Asia

In [3]:
data1 = df[df.country=='China']

In [4]:
data2 = df[df.continent=='Asia']

## How to do it

### Single Line

1. Make a simple line chart by using the function `line` from the `plotly.express` module. 

Here we are passing the following inputs:

- `data_frame` : this is the data set to be used. Typically this is a `DataFrame` object, but you can also pass a dictionary or another array-like object. If missing, the function builds a `DataFrame` from the rest of the inputs. Note that each row of the data set is represented as a vertex of a polyline mark in 2D space.
- `x` : the name of a column whose values will be used to position the marks along the x axis in cartesian coordinates
- `y` : the name of a column whose values will be used to position the marks along the y axis in cartesian coordinates
- `title` : the title for our chart
  

In [5]:
df = data1
fig = px.line(data_frame=df, x="year", y="Life Expectancy", 
              title="China's Life Expectancy over Time")
fig.show()

2. Customise the size of the figure by using the inputs `height` and `width`. Both have to be integers and correspond to the size of the figure in pixels.

In [11]:
fig = px.line(data_frame=df, x="year", y="Life Expectancy", 
              height=600, width=800,
              title="China's Life Expectancy over Time")
fig.show()

3. Change the color of the line by using `color_discrete_map` and passing a dictionary mapping the values in the column passed as `color` into color names as strings. In this case, we have only one value (`China`) and we map it to the color `green`.

In [15]:
fig = px.line(df, x="year", y="Life Expectancy", 
              color='country', color_discrete_map={'China':'green'},
              height=600, width=800,
              title="China's Life Expectancy over Time")
fig.show()

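Alternatively, you can use the input `color_discrete_sequence`, which takes a list of colors that are assigned in order to the values of the column passed as `color`. Here we use the built-in qualitative palette `Bold`.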

In [16]:
fig = px.line(df, x="year", y="Life Expectancy", 
              color='country', 
              color_discrete_sequence=px.colors.qualitative.Bold,
              height=600, width=800,
              title="China's Life Expectancy over Time")
fig.show()

Get rid of the legend by calling the method `update_layout` on the figure object as follows.

In [17]:
fig = px.line(df, x="year", y="Life Expectancy", 
              color='country', color_discrete_map={'China':'green'},
              height=600, width=800,
              title="China's Life Expectancy over Time")
fig.update_layout(showlegend=False)
fig.show()

4. Add the values of a column as text labels along the line by using the input `text`.

In [23]:
fig = px.line(df, x="year", y="Life Expectancy", 
              text='Life Expectancy',
              height=600, width=800,
              title="China's Life Expectancy over Time")
fig.show()

### Multiple Lines from a single Data Set

Assign our second data set to `df` and inspect its first rows by calling the method `head`.


In [36]:
df = data2
df.head()

| | country | continent | year | lifeExp | pop | gdpPercap | iso_alpha | iso_num | Life Expectancy |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | Asia | 1952 | 28.801 | 8425333 | 779.445314 | AFG | 4 | 28.8 |
| 1 | Afghanistan | Asia | 1957 | 30.332 | 9240934 | 820.85303 | AFG | 4 | 30.3 |
| 2 | Afghanistan | Asia | 1962 | 31.997 | 10267083 | 853.10071 | AFG | 4 | 32.0 |
| 3 | Afghanistan | Asia | 1967 | 34.02 | 11537966 | 836.197138 | AFG | 4 | 34.0 |
| 4 | Afghanistan | Asia | 1972 | 36.088 | 13079460 | 739.981106 | AFG | 4 | 36.1 |


1. Make a chart showing multiple lines by using the `line` function with the inputs:
   - `x` 
   - `y` 
   - `color` 

In this case, we will make a plot to illustrate the size of the population of each country in Asia over time. Each country will be illustrated by a line of a different color.

In [31]:

fig = px.line(df, x="year", y="pop", color='country', 
              height=600, width=900,
              title='Population Size - Asia continent')
fig.show()

2. Another way to differentiate the lines is to use the input `line_dash`. Given either the name of a column in the data set or a pandas series, it assigns a different dash pattern to each distinct value. In this case we pass `country`.

In [30]:
fig = px.line(df, x="year", y="pop",
              line_dash='country',
              height=600, width=900,
              title='Population Size - Asia continent')
fig.show()

3. Alternatively, use the input `symbol` to assign a different marker symbol to each line that is drawn. In this case we pass `country`.

In [29]:
fig = px.line(df, x="year", y="pop", symbol='country',
              height=600, width=900,
              title='Population Size - Asia continent')
fig.show()

4. Use a combination of `color` and `symbol`/`line_dash` to make each line distinctive.

In [34]:
fig = px.line(df, x="year", y="pop", 
              color='country', 
              symbol='country',
              height=600, width=800,
              title='Population Size - Asia continent')
fig.show()

In [35]:
fig = px.line(df, x="year", y="pop", 
              color='country', 
              line_dash='country',
              height=600, width=800,
              title='Population Size - Asia continent')
fig.show()