Download data from ECB’s SDW
Download data from ECB’s SDW¶
For more info on the ECB’s web API, see here
import io
import pandas as pd
import requests
import matplotlib.pyplot as plt
%matplotlib inline
First, lets import the HICP Code List Description table from the ECB’s SDW site here into a pandas dataframe.
url = 'https://sdw.ecb.europa.eu/datastructure.do?conceptMnemonic=ICP_ITEM&datasetinstanceid=122#cl'
r = requests.get(url, headers={"Accept": "text/csv"})
df_codes = pd.read_html(r.text)[1]
df_codes.set_index('Code')
Code Description | |
---|---|
Code | |
000000 | HICP - Overall index |
010000 | HICP - FOOD AND NON-ALCOHOLIC BEVERAGES |
011000 | HICP - Food |
011100 | HICP - Bread and cereals |
011110 | HICP - Rice |
... | ... |
XESEAS | HICP - All-items excluding energy and seasonal... |
XFROOP | HICP - All-items excluding frequent out-of-poc... |
XFUELL | HICP - All-items excl. liquid fuels and fuels ... |
XRENT0 | HICP - All-items excluding actual rentals for ... |
XSEAS0 | HICP - All-items excluding seasonal food |
3676 rows × 1 columns
The following cell will download “HICP - Overall index” which has code “000000”. That code is part of the series_code
string and must be changed in order to download a different series (e.g. 010000, 011000, etc). In addition U2
stands for “euro area” and can be changed to another geographic area, while ANR
and N
stand for “annual rate of change”, and not adjusted (e.g not seasonally adjusted). Those parts of the series_code
can also be changed to, for example INX
- for index (instead of rate of change), and Y
- for adjusted series. Note, however, that not all combinations of geos, adjustments and variations (INX
, ANR
, etc) are available.
SDW_URL = "https://sdw-wsrest.ecb.europa.eu/service/data"
series_code = 'ICP/M.U2.N.000000.4.ANR'
ecburl = f"{SDW_URL}/{series_code}"
response = requests.get(ecburl, headers={"Accept": "text/csv"})
assert response.status_code == 200, 'Failed to get a correct reponse'
The assert statement above is to make sure that our request was successful, and the response
object has the data we want. The following cell puts the data into a dataframe and then creates a new dataframe with only the columns we want (check the content of all_data
for see what else is there).
all_data = pd.read_csv(io.StringIO(response.text))
df = all_data[["TIME_PERIOD", "OBS_VALUE"]].copy()
df.head()
TIME_PERIOD | OBS_VALUE | |
---|---|---|
0 | 1997-01 | 2.0 |
1 | 1997-02 | 1.8 |
2 | 1997-03 | 1.6 |
3 | 1997-04 | 1.3 |
4 | 1997-05 | 1.4 |
Now, we can make other adjustments to the dataframe - set datetime index, rename the columns, etc.