{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Reading Data from Excel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Getting Ready\n", "\n", "\n", "Make sure you the library [`xlrd`](https://xlrd.readthedocs.io/en/latest/) is available in your Python environment." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Install the library `kaleido` using \n", "\n", "```\n", "pip install -U xlrd\n", "```\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## How to do it\n", "1. Import `pandas` and `numpy` and create a `DataFrame` to work with" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Read a `csv` file with the `read_csv` method" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
2023May129.17.90.6131.38.70.7117.54.2
0NaNJun129.47.30.2131.57.90.1118.14.4
1NaNJul129.06.4-0.3130.96.8-0.4118.64.5
2NaNAug129.46.30.4131.36.70.3119.24.8
3NaNSep130.16.30.5132.06.70.5119.85.0
4NaNOct130.24.70.1132.04.60.0120.65.4
5NaNNov130.04.2-0.1131.73.9-0.2121.15.3
6NaNDec130.54.20.4132.24.00.4121.65.3
72024Jan130.04.2-0.4131.54.0-0.6122.15.4
8NaNFeb130.83.80.6132.33.40.6123.26.0
9NaNMar131.63.80.6133.03.20.6123.96.3
10NaNApr132.23.00.5133.52.30.3124.66.6
11NaNMay132.72.80.4133.92.00.3125.36.7
12Source:Consumer price inflation from the Office for N...NaNNaNNaNNaNNaNNaNNaNNaN
\n", "
" ], "text/plain": [ " 2023 May 129.1 7.9 \\\n", "0 NaN Jun 129.4 7.3 \n", "1 NaN Jul 129.0 6.4 \n", "2 NaN Aug 129.4 6.3 \n", "3 NaN Sep 130.1 6.3 \n", "4 NaN Oct 130.2 4.7 \n", "5 NaN Nov 130.0 4.2 \n", "6 NaN Dec 130.5 4.2 \n", "7 2024 Jan 130.0 4.2 \n", "8 NaN Feb 130.8 3.8 \n", "9 NaN Mar 131.6 3.8 \n", "10 NaN Apr 132.2 3.0 \n", "11 NaN May 132.7 2.8 \n", "12 Source: Consumer price inflation from the Office for N... NaN NaN \n", "\n", " 0.6 131.3 8.7 0.7 117.5 4.2 \n", "0 0.2 131.5 7.9 0.1 118.1 4.4 \n", "1 -0.3 130.9 6.8 -0.4 118.6 4.5 \n", "2 0.4 131.3 6.7 0.3 119.2 4.8 \n", "3 0.5 132.0 6.7 0.5 119.8 5.0 \n", "4 0.1 132.0 4.6 0.0 120.6 5.4 \n", "5 -0.1 131.7 3.9 -0.2 121.1 5.3 \n", "6 0.4 132.2 4.0 0.4 121.6 5.3 \n", "7 -0.4 131.5 4.0 -0.6 122.1 5.4 \n", "8 0.6 132.3 3.4 0.6 123.2 6.0 \n", "9 0.6 133.0 3.2 0.6 123.9 6.3 \n", "10 0.5 133.5 2.3 0.3 124.6 6.6 \n", "11 0.4 133.9 2.0 0.3 125.3 6.7 \n", "12 NaN NaN NaN NaN NaN NaN " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.read_csv('data/cpi_rates_uk.csv', skiprows=4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. Read an Excel file using the method `read_excel`" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Figure 2: All three main sectors contributed positively to GDP growth in May 2024Unnamed: 1Unnamed: 2Unnamed: 3Unnamed: 4
0Contributions to monthly GDP growth, May 2023 ...NaNNaNNaNNaN
1NaNNaNNaNNaNNaN
2Notes1. Sum of component contributions may not sum ...NaNNaNNaN
3UnitPercentage points (%)NaNNaNNaN
4NaNNaNNaNNaNNaN
5DateGDPServicesProductionConstruction
6May 2023-0.31-0.12-0.12-0.07
7Jun 20230.770.260.330.18
8Jul 2023-0.52-0.35-0.13-0.03
9Aug 2023-0.010.1-0.06-0.06
10Sep 20230.01-0.01-0.020.04
11Oct 2023-0.44-0.23-0.17-0.03
12Nov 20230.220.190.06-0.04
13Dec 2023-0.07-0.10.08-0.04
14Jan 20240.380.4-0.090.06
15Feb 20240.250.230.12-0.1
16Mar 20240.430.40.020.01
17Apr 20240.030.22-0.12-0.07
18May 20240.370.220.030.11
\n", "
" ], "text/plain": [ " Figure 2: All three main sectors contributed positively to GDP growth in May 2024 \\\n", "0 Contributions to monthly GDP growth, May 2023 ... \n", "1 NaN \n", "2 Notes \n", "3 Unit \n", "4 NaN \n", "5 Date \n", "6 May 2023 \n", "7 Jun 2023 \n", "8 Jul 2023 \n", "9 Aug 2023 \n", "10 Sep 2023 \n", "11 Oct 2023 \n", "12 Nov 2023 \n", "13 Dec 2023 \n", "14 Jan 2024 \n", "15 Feb 2024 \n", "16 Mar 2024 \n", "17 Apr 2024 \n", "18 May 2024 \n", "\n", " Unnamed: 1 Unnamed: 2 Unnamed: 3 \\\n", "0 NaN NaN NaN \n", "1 NaN NaN NaN \n", "2 1. Sum of component contributions may not sum ... NaN NaN \n", "3 Percentage points (%) NaN NaN \n", "4 NaN NaN NaN \n", "5 GDP Services Production \n", "6 -0.31 -0.12 -0.12 \n", "7 0.77 0.26 0.33 \n", "8 -0.52 -0.35 -0.13 \n", "9 -0.01 0.1 -0.06 \n", "10 0.01 -0.01 -0.02 \n", "11 -0.44 -0.23 -0.17 \n", "12 0.22 0.19 0.06 \n", "13 -0.07 -0.1 0.08 \n", "14 0.38 0.4 -0.09 \n", "15 0.25 0.23 0.12 \n", "16 0.43 0.4 0.02 \n", "17 0.03 0.22 -0.12 \n", "18 0.37 0.22 0.03 \n", "\n", " Unnamed: 4 \n", "0 NaN \n", "1 NaN \n", "2 NaN \n", "3 NaN \n", "4 NaN \n", "5 Construction \n", "6 -0.07 \n", "7 0.18 \n", "8 -0.03 \n", "9 -0.06 \n", "10 0.04 \n", "11 -0.03 \n", "12 -0.04 \n", "13 -0.04 \n", "14 0.06 \n", "15 -0.1 \n", "16 0.01 \n", "17 -0.07 \n", "18 0.11 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.read_excel('data/gdp_sectors.xls')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. Use the input `skiprows`" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
DateGDPServicesProductionConstruction
0May 2023-0.31-0.12-0.12-0.07
1Jun 20230.770.260.330.18
2Jul 2023-0.52-0.35-0.13-0.03
3Aug 2023-0.010.10-0.06-0.06
4Sep 20230.01-0.01-0.020.04
5Oct 2023-0.44-0.23-0.17-0.03
6Nov 20230.220.190.06-0.04
7Dec 2023-0.07-0.100.08-0.04
8Jan 20240.380.40-0.090.06
9Feb 20240.250.230.12-0.10
10Mar 20240.430.400.020.01
11Apr 20240.030.22-0.12-0.07
12May 20240.370.220.030.11
\n", "
" ], "text/plain": [ " Date GDP Services Production Construction\n", "0 May 2023 -0.31 -0.12 -0.12 -0.07\n", "1 Jun 2023 0.77 0.26 0.33 0.18\n", "2 Jul 2023 -0.52 -0.35 -0.13 -0.03\n", "3 Aug 2023 -0.01 0.10 -0.06 -0.06\n", "4 Sep 2023 0.01 -0.01 -0.02 0.04\n", "5 Oct 2023 -0.44 -0.23 -0.17 -0.03\n", "6 Nov 2023 0.22 0.19 0.06 -0.04\n", "7 Dec 2023 -0.07 -0.10 0.08 -0.04\n", "8 Jan 2024 0.38 0.40 -0.09 0.06\n", "9 Feb 2024 0.25 0.23 0.12 -0.10\n", "10 Mar 2024 0.43 0.40 0.02 0.01\n", "11 Apr 2024 0.03 0.22 -0.12 -0.07\n", "12 May 2024 0.37 0.22 0.03 0.11" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "pd.read_excel('data/gdp_sectors.xls', skiprows=6)" ] } ], "metadata": { "kernelspec": { "display_name": "venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 2 }