Download data from ECB’s SDW

Download data from ECB’s SDW

For more info on the ECB’s web API, see here

import io
import pandas as pd
import requests
import matplotlib.pyplot as plt
%matplotlib inline

First, lets import the HICP Code List Description table from the ECB’s SDW site here into a pandas dataframe.

url = 'https://sdw.ecb.europa.eu/datastructure.do?conceptMnemonic=ICP_ITEM&datasetinstanceid=122#cl'
r  = requests.get(url, headers={"Accept": "text/csv"})    
df_codes = pd.read_html(r.text)[1]
df_codes.set_index('Code')
Code Description
Code
000000 HICP - Overall index
010000 HICP - FOOD AND NON-ALCOHOLIC BEVERAGES
011000 HICP - Food
011100 HICP - Bread and cereals
011110 HICP - Rice
... ...
XESEAS HICP - All-items excluding energy and seasonal...
XFROOP HICP - All-items excluding frequent out-of-poc...
XFUELL HICP - All-items excl. liquid fuels and fuels ...
XRENT0 HICP - All-items excluding actual rentals for ...
XSEAS0 HICP - All-items excluding seasonal food

3676 rows × 1 columns

The following cell will download “HICP - Overall index” which has code “000000”. That code is part of the series_code string and must be changed in order to download a different series (e.g. 010000, 011000, etc). In addition U2 stands for “euro area” and can be changed to another geographic area, while ANR and N stand for “annual rate of change”, and not adjusted (e.g not seasonally adjusted). Those parts of the series_code can also be changed to, for example INX - for index (instead of rate of change), and Y - for adjusted series. Note, however, that not all combinations of geos, adjustments and variations (INX, ANR, etc) are available.

SDW_URL = "https://sdw-wsrest.ecb.europa.eu/service/data"
series_code = 'ICP/M.U2.N.000000.4.ANR'
ecburl = f"{SDW_URL}/{series_code}"
response = requests.get(ecburl, headers={"Accept": "text/csv"})

assert response.status_code == 200, 'Failed to get a correct reponse'

The assert statement above is to make sure that our request was successful, and the response object has the data we want. The following cell puts the data into a dataframe and then creates a new dataframe with only the columns we want (check the content of all_data for see what else is there).

all_data = pd.read_csv(io.StringIO(response.text))
df = all_data[["TIME_PERIOD", "OBS_VALUE"]].copy()
df.head()
TIME_PERIOD OBS_VALUE
0 1997-01 2.0
1 1997-02 1.8
2 1997-03 1.6
3 1997-04 1.3
4 1997-05 1.4

Now, we can make other adjustments to the dataframe - set datetime index, rename the columns, etc.