{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "6d52755e-c428-44bd-ba3a-1213b0d9b5f1",
   "metadata": {
    "tags": []
   },
   "source": [
    "# Download data from ECB's SDW\n",
    "\n",
    "For more info on the ECB's web API, see [here](https://sdw-wsrest.ecb.europa.eu/help/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "786a5b18-0290-4138-9d06-8ebc7f719ccd",
   "metadata": {},
   "outputs": [],
   "source": [
    "import io\n",
    "import pandas as pd\n",
    "import requests\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cb985c7d-d2df-4797-8e6b-d10aff1133dc",
   "metadata": {},
   "source": [
    "First, lets import the _HICP Code List Description_ table from the ECB's SDW site [here](https://sdw.ecb.europa.eu/datastructure.do?conceptMnemonic=ICP_ITEM&datasetinstanceid=122#cl) into a pandas dataframe."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "3dc728a6-03ca-4fb0-8c84-d74af2f318c2",
   "metadata": {},
   "outputs": [],
   "source": [
    "url = 'https://sdw.ecb.europa.eu/datastructure.do?conceptMnemonic=ICP_ITEM&datasetinstanceid=122#cl'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "5ca08bbd-15e2-4205-a905-0ca2b3342243",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Code Description</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Code</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>000000</th>\n",
       "      <td>HICP - Overall index</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>010000</th>\n",
       "      <td>HICP - FOOD AND NON-ALCOHOLIC BEVERAGES</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>011000</th>\n",
       "      <td>HICP - Food</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>011100</th>\n",
       "      <td>HICP - Bread and cereals</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>011110</th>\n",
       "      <td>HICP - Rice</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>XESEAS</th>\n",
       "      <td>HICP - All-items excluding energy and seasonal...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>XFROOP</th>\n",
       "      <td>HICP - All-items excluding frequent out-of-poc...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>XFUELL</th>\n",
       "      <td>HICP - All-items excl. liquid fuels and fuels ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>XRENT0</th>\n",
       "      <td>HICP - All-items excluding actual rentals for ...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>XSEAS0</th>\n",
       "      <td>HICP - All-items excluding seasonal food</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>3676 rows × 1 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                         Code Description\n",
       "Code                                                     \n",
       "000000                               HICP - Overall index\n",
       "010000            HICP - FOOD AND NON-ALCOHOLIC BEVERAGES\n",
       "011000                                        HICP - Food\n",
       "011100                           HICP - Bread and cereals\n",
       "011110                                        HICP - Rice\n",
       "...                                                   ...\n",
       "XESEAS  HICP - All-items excluding energy and seasonal...\n",
       "XFROOP  HICP - All-items excluding frequent out-of-poc...\n",
       "XFUELL  HICP - All-items excl. liquid fuels and fuels ...\n",
       "XRENT0  HICP - All-items excluding actual rentals for ...\n",
       "XSEAS0           HICP - All-items excluding seasonal food\n",
       "\n",
       "[3676 rows x 1 columns]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "r  = requests.get(url, headers={\"Accept\": \"text/csv\"})    \n",
    "df_codes = pd.read_html(r.text)[1]\n",
    "df_codes.set_index('Code')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9404e59-58fb-4c42-85a9-a645b3f8a9de",
   "metadata": {},
   "source": [
    "The following cell will download \"HICP - Overall index\" which has code \"000000\". That code is part of the `series_code` string and must be changed in order to download a different series (e.g. 010000, 011000, etc). In addition  `U2` stands for \"euro area\" and can be changed to another geographic area, while `ANR` and `N` stand for \"annual rate of change\", and __not__ adjusted (e.g not seasonally adjusted). Those parts of the `series_code` can also be changed to, for example `INX` - for index (instead of rate of change), and `Y` - for adjusted series. Note, however, that not all combinations of geos, adjustments and variations (`INX`, `ANR`, etc) are available.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "ce983a49-7337-4b2d-8dda-9d5ff2ecd5b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "SDW_URL = \"https://sdw-wsrest.ecb.europa.eu/service/data\"\n",
    "series_code = 'ICP/M.U2.N.000000.4.ANR'\n",
    "ecburl = f\"{SDW_URL}/{series_code}\"\n",
    "response = requests.get(ecburl, headers={\"Accept\": \"text/csv\"})\n",
    "\n",
    "assert response.status_code == 200, 'Failed to get a correct reponse'"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8fe1e278-f37f-4f0c-9302-0b1ed2994a61",
   "metadata": {},
   "source": [
    "The assert statement above is to make sure that our request was successful, and the `response` object has the data we want. The following cell puts the data into a dataframe and then creates a new dataframe with only the columns we want (check the content of `all_data` for see what else is there)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "f62e3a8a-6889-41ae-a446-b99b9be60054",
   "metadata": {},
   "outputs": [],
   "source": [
    "all_data = pd.read_csv(io.StringIO(response.text))\n",
    "df = all_data[[\"TIME_PERIOD\", \"OBS_VALUE\"]].copy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "c8fb9beb-cf5c-4099-a463-45a176f663f2",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>TIME_PERIOD</th>\n",
       "      <th>OBS_VALUE</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1997-01</td>\n",
       "      <td>2.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1997-02</td>\n",
       "      <td>1.8</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1997-03</td>\n",
       "      <td>1.6</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1997-04</td>\n",
       "      <td>1.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1997-05</td>\n",
       "      <td>1.4</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  TIME_PERIOD  OBS_VALUE\n",
       "0     1997-01        2.0\n",
       "1     1997-02        1.8\n",
       "2     1997-03        1.6\n",
       "3     1997-04        1.3\n",
       "4     1997-05        1.4"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5a836bff-60b7-4275-a6b8-eb775339e7c0",
   "metadata": {},
   "source": [
    "Now, we can make other adjustments to the dataframe - set datetime index, rename the columns, etc."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}