{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Reading Data from Excel"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Getting Ready\n",
"\n",
"\n",
"Make sure you the library [`xlrd`](https://xlrd.readthedocs.io/en/latest/) is available in your Python environment."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install the library `kaleido` using \n",
"\n",
"```\n",
"pip install -U xlrd\n",
"```\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How to do it\n",
"1. Import `pandas` and `numpy` and create a `DataFrame` to work with"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. Read a `csv` file with the `read_csv` method"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" 2023 | \n",
" May | \n",
" 129.1 | \n",
" 7.9 | \n",
" 0.6 | \n",
" 131.3 | \n",
" 8.7 | \n",
" 0.7 | \n",
" 117.5 | \n",
" 4.2 | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" NaN | \n",
" Jun | \n",
" 129.4 | \n",
" 7.3 | \n",
" 0.2 | \n",
" 131.5 | \n",
" 7.9 | \n",
" 0.1 | \n",
" 118.1 | \n",
" 4.4 | \n",
"
\n",
" \n",
" 1 | \n",
" NaN | \n",
" Jul | \n",
" 129.0 | \n",
" 6.4 | \n",
" -0.3 | \n",
" 130.9 | \n",
" 6.8 | \n",
" -0.4 | \n",
" 118.6 | \n",
" 4.5 | \n",
"
\n",
" \n",
" 2 | \n",
" NaN | \n",
" Aug | \n",
" 129.4 | \n",
" 6.3 | \n",
" 0.4 | \n",
" 131.3 | \n",
" 6.7 | \n",
" 0.3 | \n",
" 119.2 | \n",
" 4.8 | \n",
"
\n",
" \n",
" 3 | \n",
" NaN | \n",
" Sep | \n",
" 130.1 | \n",
" 6.3 | \n",
" 0.5 | \n",
" 132.0 | \n",
" 6.7 | \n",
" 0.5 | \n",
" 119.8 | \n",
" 5.0 | \n",
"
\n",
" \n",
" 4 | \n",
" NaN | \n",
" Oct | \n",
" 130.2 | \n",
" 4.7 | \n",
" 0.1 | \n",
" 132.0 | \n",
" 4.6 | \n",
" 0.0 | \n",
" 120.6 | \n",
" 5.4 | \n",
"
\n",
" \n",
" 5 | \n",
" NaN | \n",
" Nov | \n",
" 130.0 | \n",
" 4.2 | \n",
" -0.1 | \n",
" 131.7 | \n",
" 3.9 | \n",
" -0.2 | \n",
" 121.1 | \n",
" 5.3 | \n",
"
\n",
" \n",
" 6 | \n",
" NaN | \n",
" Dec | \n",
" 130.5 | \n",
" 4.2 | \n",
" 0.4 | \n",
" 132.2 | \n",
" 4.0 | \n",
" 0.4 | \n",
" 121.6 | \n",
" 5.3 | \n",
"
\n",
" \n",
" 7 | \n",
" 2024 | \n",
" Jan | \n",
" 130.0 | \n",
" 4.2 | \n",
" -0.4 | \n",
" 131.5 | \n",
" 4.0 | \n",
" -0.6 | \n",
" 122.1 | \n",
" 5.4 | \n",
"
\n",
" \n",
" 8 | \n",
" NaN | \n",
" Feb | \n",
" 130.8 | \n",
" 3.8 | \n",
" 0.6 | \n",
" 132.3 | \n",
" 3.4 | \n",
" 0.6 | \n",
" 123.2 | \n",
" 6.0 | \n",
"
\n",
" \n",
" 9 | \n",
" NaN | \n",
" Mar | \n",
" 131.6 | \n",
" 3.8 | \n",
" 0.6 | \n",
" 133.0 | \n",
" 3.2 | \n",
" 0.6 | \n",
" 123.9 | \n",
" 6.3 | \n",
"
\n",
" \n",
" 10 | \n",
" NaN | \n",
" Apr | \n",
" 132.2 | \n",
" 3.0 | \n",
" 0.5 | \n",
" 133.5 | \n",
" 2.3 | \n",
" 0.3 | \n",
" 124.6 | \n",
" 6.6 | \n",
"
\n",
" \n",
" 11 | \n",
" NaN | \n",
" May | \n",
" 132.7 | \n",
" 2.8 | \n",
" 0.4 | \n",
" 133.9 | \n",
" 2.0 | \n",
" 0.3 | \n",
" 125.3 | \n",
" 6.7 | \n",
"
\n",
" \n",
" 12 | \n",
" Source: | \n",
" Consumer price inflation from the Office for N... | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" 2023 May 129.1 7.9 \\\n",
"0 NaN Jun 129.4 7.3 \n",
"1 NaN Jul 129.0 6.4 \n",
"2 NaN Aug 129.4 6.3 \n",
"3 NaN Sep 130.1 6.3 \n",
"4 NaN Oct 130.2 4.7 \n",
"5 NaN Nov 130.0 4.2 \n",
"6 NaN Dec 130.5 4.2 \n",
"7 2024 Jan 130.0 4.2 \n",
"8 NaN Feb 130.8 3.8 \n",
"9 NaN Mar 131.6 3.8 \n",
"10 NaN Apr 132.2 3.0 \n",
"11 NaN May 132.7 2.8 \n",
"12 Source: Consumer price inflation from the Office for N... NaN NaN \n",
"\n",
" 0.6 131.3 8.7 0.7 117.5 4.2 \n",
"0 0.2 131.5 7.9 0.1 118.1 4.4 \n",
"1 -0.3 130.9 6.8 -0.4 118.6 4.5 \n",
"2 0.4 131.3 6.7 0.3 119.2 4.8 \n",
"3 0.5 132.0 6.7 0.5 119.8 5.0 \n",
"4 0.1 132.0 4.6 0.0 120.6 5.4 \n",
"5 -0.1 131.7 3.9 -0.2 121.1 5.3 \n",
"6 0.4 132.2 4.0 0.4 121.6 5.3 \n",
"7 -0.4 131.5 4.0 -0.6 122.1 5.4 \n",
"8 0.6 132.3 3.4 0.6 123.2 6.0 \n",
"9 0.6 133.0 3.2 0.6 123.9 6.3 \n",
"10 0.5 133.5 2.3 0.3 124.6 6.6 \n",
"11 0.4 133.9 2.0 0.3 125.3 6.7 \n",
"12 NaN NaN NaN NaN NaN NaN "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.read_csv('data/cpi_rates_uk.csv', skiprows=4)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3. Read an Excel file using the method `read_excel`"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Figure 2: All three main sectors contributed positively to GDP growth in May 2024 | \n",
" Unnamed: 1 | \n",
" Unnamed: 2 | \n",
" Unnamed: 3 | \n",
" Unnamed: 4 | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" Contributions to monthly GDP growth, May 2023 ... | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 1 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 2 | \n",
" Notes | \n",
" 1. Sum of component contributions may not sum ... | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 3 | \n",
" Unit | \n",
" Percentage points (%) | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 4 | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
" 5 | \n",
" Date | \n",
" GDP | \n",
" Services | \n",
" Production | \n",
" Construction | \n",
"
\n",
" \n",
" 6 | \n",
" May 2023 | \n",
" -0.31 | \n",
" -0.12 | \n",
" -0.12 | \n",
" -0.07 | \n",
"
\n",
" \n",
" 7 | \n",
" Jun 2023 | \n",
" 0.77 | \n",
" 0.26 | \n",
" 0.33 | \n",
" 0.18 | \n",
"
\n",
" \n",
" 8 | \n",
" Jul 2023 | \n",
" -0.52 | \n",
" -0.35 | \n",
" -0.13 | \n",
" -0.03 | \n",
"
\n",
" \n",
" 9 | \n",
" Aug 2023 | \n",
" -0.01 | \n",
" 0.1 | \n",
" -0.06 | \n",
" -0.06 | \n",
"
\n",
" \n",
" 10 | \n",
" Sep 2023 | \n",
" 0.01 | \n",
" -0.01 | \n",
" -0.02 | \n",
" 0.04 | \n",
"
\n",
" \n",
" 11 | \n",
" Oct 2023 | \n",
" -0.44 | \n",
" -0.23 | \n",
" -0.17 | \n",
" -0.03 | \n",
"
\n",
" \n",
" 12 | \n",
" Nov 2023 | \n",
" 0.22 | \n",
" 0.19 | \n",
" 0.06 | \n",
" -0.04 | \n",
"
\n",
" \n",
" 13 | \n",
" Dec 2023 | \n",
" -0.07 | \n",
" -0.1 | \n",
" 0.08 | \n",
" -0.04 | \n",
"
\n",
" \n",
" 14 | \n",
" Jan 2024 | \n",
" 0.38 | \n",
" 0.4 | \n",
" -0.09 | \n",
" 0.06 | \n",
"
\n",
" \n",
" 15 | \n",
" Feb 2024 | \n",
" 0.25 | \n",
" 0.23 | \n",
" 0.12 | \n",
" -0.1 | \n",
"
\n",
" \n",
" 16 | \n",
" Mar 2024 | \n",
" 0.43 | \n",
" 0.4 | \n",
" 0.02 | \n",
" 0.01 | \n",
"
\n",
" \n",
" 17 | \n",
" Apr 2024 | \n",
" 0.03 | \n",
" 0.22 | \n",
" -0.12 | \n",
" -0.07 | \n",
"
\n",
" \n",
" 18 | \n",
" May 2024 | \n",
" 0.37 | \n",
" 0.22 | \n",
" 0.03 | \n",
" 0.11 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Figure 2: All three main sectors contributed positively to GDP growth in May 2024 \\\n",
"0 Contributions to monthly GDP growth, May 2023 ... \n",
"1 NaN \n",
"2 Notes \n",
"3 Unit \n",
"4 NaN \n",
"5 Date \n",
"6 May 2023 \n",
"7 Jun 2023 \n",
"8 Jul 2023 \n",
"9 Aug 2023 \n",
"10 Sep 2023 \n",
"11 Oct 2023 \n",
"12 Nov 2023 \n",
"13 Dec 2023 \n",
"14 Jan 2024 \n",
"15 Feb 2024 \n",
"16 Mar 2024 \n",
"17 Apr 2024 \n",
"18 May 2024 \n",
"\n",
" Unnamed: 1 Unnamed: 2 Unnamed: 3 \\\n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 1. Sum of component contributions may not sum ... NaN NaN \n",
"3 Percentage points (%) NaN NaN \n",
"4 NaN NaN NaN \n",
"5 GDP Services Production \n",
"6 -0.31 -0.12 -0.12 \n",
"7 0.77 0.26 0.33 \n",
"8 -0.52 -0.35 -0.13 \n",
"9 -0.01 0.1 -0.06 \n",
"10 0.01 -0.01 -0.02 \n",
"11 -0.44 -0.23 -0.17 \n",
"12 0.22 0.19 0.06 \n",
"13 -0.07 -0.1 0.08 \n",
"14 0.38 0.4 -0.09 \n",
"15 0.25 0.23 0.12 \n",
"16 0.43 0.4 0.02 \n",
"17 0.03 0.22 -0.12 \n",
"18 0.37 0.22 0.03 \n",
"\n",
" Unnamed: 4 \n",
"0 NaN \n",
"1 NaN \n",
"2 NaN \n",
"3 NaN \n",
"4 NaN \n",
"5 Construction \n",
"6 -0.07 \n",
"7 0.18 \n",
"8 -0.03 \n",
"9 -0.06 \n",
"10 0.04 \n",
"11 -0.03 \n",
"12 -0.04 \n",
"13 -0.04 \n",
"14 0.06 \n",
"15 -0.1 \n",
"16 0.01 \n",
"17 -0.07 \n",
"18 0.11 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.read_excel('data/gdp_sectors.xls')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"4. Use the input `skiprows`"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Date | \n",
" GDP | \n",
" Services | \n",
" Production | \n",
" Construction | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" May 2023 | \n",
" -0.31 | \n",
" -0.12 | \n",
" -0.12 | \n",
" -0.07 | \n",
"
\n",
" \n",
" 1 | \n",
" Jun 2023 | \n",
" 0.77 | \n",
" 0.26 | \n",
" 0.33 | \n",
" 0.18 | \n",
"
\n",
" \n",
" 2 | \n",
" Jul 2023 | \n",
" -0.52 | \n",
" -0.35 | \n",
" -0.13 | \n",
" -0.03 | \n",
"
\n",
" \n",
" 3 | \n",
" Aug 2023 | \n",
" -0.01 | \n",
" 0.10 | \n",
" -0.06 | \n",
" -0.06 | \n",
"
\n",
" \n",
" 4 | \n",
" Sep 2023 | \n",
" 0.01 | \n",
" -0.01 | \n",
" -0.02 | \n",
" 0.04 | \n",
"
\n",
" \n",
" 5 | \n",
" Oct 2023 | \n",
" -0.44 | \n",
" -0.23 | \n",
" -0.17 | \n",
" -0.03 | \n",
"
\n",
" \n",
" 6 | \n",
" Nov 2023 | \n",
" 0.22 | \n",
" 0.19 | \n",
" 0.06 | \n",
" -0.04 | \n",
"
\n",
" \n",
" 7 | \n",
" Dec 2023 | \n",
" -0.07 | \n",
" -0.10 | \n",
" 0.08 | \n",
" -0.04 | \n",
"
\n",
" \n",
" 8 | \n",
" Jan 2024 | \n",
" 0.38 | \n",
" 0.40 | \n",
" -0.09 | \n",
" 0.06 | \n",
"
\n",
" \n",
" 9 | \n",
" Feb 2024 | \n",
" 0.25 | \n",
" 0.23 | \n",
" 0.12 | \n",
" -0.10 | \n",
"
\n",
" \n",
" 10 | \n",
" Mar 2024 | \n",
" 0.43 | \n",
" 0.40 | \n",
" 0.02 | \n",
" 0.01 | \n",
"
\n",
" \n",
" 11 | \n",
" Apr 2024 | \n",
" 0.03 | \n",
" 0.22 | \n",
" -0.12 | \n",
" -0.07 | \n",
"
\n",
" \n",
" 12 | \n",
" May 2024 | \n",
" 0.37 | \n",
" 0.22 | \n",
" 0.03 | \n",
" 0.11 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Date GDP Services Production Construction\n",
"0 May 2023 -0.31 -0.12 -0.12 -0.07\n",
"1 Jun 2023 0.77 0.26 0.33 0.18\n",
"2 Jul 2023 -0.52 -0.35 -0.13 -0.03\n",
"3 Aug 2023 -0.01 0.10 -0.06 -0.06\n",
"4 Sep 2023 0.01 -0.01 -0.02 0.04\n",
"5 Oct 2023 -0.44 -0.23 -0.17 -0.03\n",
"6 Nov 2023 0.22 0.19 0.06 -0.04\n",
"7 Dec 2023 -0.07 -0.10 0.08 -0.04\n",
"8 Jan 2024 0.38 0.40 -0.09 0.06\n",
"9 Feb 2024 0.25 0.23 0.12 -0.10\n",
"10 Mar 2024 0.43 0.40 0.02 0.01\n",
"11 Apr 2024 0.03 0.22 -0.12 -0.07\n",
"12 May 2024 0.37 0.22 0.03 0.11"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pd.read_excel('data/gdp_sectors.xls', skiprows=6)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}