HW 2 Part 2
Suppose that you had to redo the work you did in Part 1 for a different group of HICP indices. You could duplicate that notebook (right click on the name of a notebook and choose Duplicate, or press Ctrl+d), replace the list of subindices and then run the remaining cells. However, this is not very efficient and violates a basic principle of programming, known as DRY - don’t repeat yourself. A better approach is to write functions and automate, whenever possible, tasks which are repeated.
Your task in this part of the homework is to replace parts of the code in Part 1 with functions.
# import the necessary packages
2.0 Get the hicp_codes dataframe as in Part 1.
2.1 Write a function which takes as input an item group code pattern (a string) and returns two lists: one with the codes and one with the descriptions of all items in that item group. With that function you should be able to get any group of items for any level of aggregation. Test the function for the group of indices you downloaded in Part 1, and for several other groups, to make sure it returns the correct lists. For example, try “HICP - Overall index” (one item), all items from “HICP - FOOD AND NON-ALCOHOLIC BEVERAGES” (75 items), all items from “HICP - Food” (65 items), etc.
Use the function to get the codes and descriptions of the item group you got in 1.2 of Part 1.
Note: the function will use hicp_codes, which you should have created in 2.0.
#TODO
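One possible shape for such a function is sketched below. It is only an illustration under stated assumptions: the toy hicp_codes dataframe, its column names ("code" and "description"), the function name get_item_group, and the idea that a pattern matches the beginning of a code are all hypothetical and may differ from what you built in Part 1.

```python
import pandas as pd

# Toy stand-in for hicp_codes. The column names and the last two rows
# are made up for illustration; use your own dataframe from 2.0.
hicp_codes = pd.DataFrame({
    "code": ["000000", "010000", "011000", "011100", "020000"],
    "description": [
        "HICP - Overall index",
        "HICP - FOOD AND NON-ALCOHOLIC BEVERAGES",
        "HICP - Food",
        "Toy item A",
        "Toy item B",
    ],
})


def get_item_group(pattern, codes_df=hicp_codes):
    """Return (codes, descriptions) for all items whose code matches pattern."""
    # str.match treats pattern as a regular expression anchored at the
    # start of each code, so "01" selects every code beginning with 01.
    mask = codes_df["code"].str.match(pattern)
    selected = codes_df[mask]
    return selected["code"].tolist(), selected["description"].tolist()


codes, descriptions = get_item_group("01")
print(codes)
print(descriptions)
```

Returning the two lists as a tuple lets you unpack them in one line, as in the last lines of the sketch.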
2.2 Write a function which takes a HICP item code (a string) and returns a dataframe like the one in 1.3 of Part 1 - datetime index and only one series with the item code as column name.
## TODO
2.3 Copy the code you wrote in Part 1 for 1.4 and replace the relevant part in the for loop using the function you created above. Check that it gives you the same dataframe as in Part 1.
## TODO
2.4 Create a new dataframe df_nona by dropping all columns in df that have more than half of their observations missing.
## TODO
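A minimal sketch of one way to do this, using a small made-up dataframe in place of df (the column names and values are illustrative only). It computes the share of missing observations per column and keeps the columns where that share is at most one half.

```python
import numpy as np
import pandas as pd

# Toy stand-in for df from 2.3; the columns and values are made up.
df = pd.DataFrame(
    {
        "A": [1.0, 2.0, 3.0, 4.0],           # nothing missing -> kept
        "B": [np.nan, np.nan, np.nan, 4.0],  # 75% missing -> dropped
        "C": [1.0, np.nan, np.nan, 4.0],     # exactly 50% missing -> kept
    },
    index=pd.date_range("2020-01-01", periods=4, freq="MS"),
)

# isna() gives True/False per cell; the column mean of that is the
# share of missing observations in each column.
missing_share = df.isna().mean()

# Keep only the columns with at most half of their observations missing.
df_nona = df.loc[:, missing_share <= 0.5]
print(df_nona.columns.tolist())
```

Note that a column with exactly half of its observations missing is kept here, since the task asks to drop only columns with more than half missing.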
2.5 Write a function which takes a dataframe and two dates (strings), start and end, as arguments and returns a dictionary with two keys, ‘code’ and ‘std’, whose values are the code and the standard deviation of the most volatile among the series in the dataframe.
Write another function for the least volatile series (with the same arguments and output).
Test your functions with the code you used in 1.7 and 1.8 in Part 1 as follows:
a) Find the respective start and end dates for the full period and the last 24 months. See if you can do it programmatically, but if you cannot think of how, just type them directly.
b) Assign the results to dictionaries such as dict_max, dict_min (up to you what names you use).
c) Use the function you created in 2.1 to get the descriptions for the codes in the two dictionaries from (b)
d) Print the output using f-strings as in 1.7 and 1.8 from Part 1.
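The sketch below shows one way the two functions and steps (a), (b) and (d) could fit together. It rests on assumptions: the function names most_volatile and least_volatile are hypothetical, the dataframe is toy data with a monthly datetime index, and the standard deviation is taken of the columns exactly as they are stored (if Part 1 measured volatility on a transformed series, apply that transformation first). Step (c) is left out because it depends on your function from 2.1.

```python
import numpy as np
import pandas as pd


def most_volatile(df, start, end):
    """Return the code and std of the series with the largest std in [start, end]."""
    stds = df.loc[start:end].std()  # date strings slice a datetime index inclusively
    code = stds.idxmax()
    return {"code": code, "std": stds[code]}


def least_volatile(df, start, end):
    """Return the code and std of the series with the smallest std in [start, end]."""
    stds = df.loc[start:end].std()
    code = stds.idxmin()
    return {"code": code, "std": stds[code]}


# Toy data: 36 monthly observations for two made-up series.
index = pd.date_range("2020-01-01", periods=36, freq="MS")
df = pd.DataFrame(
    {"011000": np.tile([0.0, 2.0], 18), "020000": np.tile([0.0, 1.0], 18)},
    index=index,
)

# (a) Dates found programmatically and converted to strings.
start_full = df.index.min().strftime("%Y-%m-%d")
end = df.index.max().strftime("%Y-%m-%d")
start_24 = df.index[-24].strftime("%Y-%m-%d")  # first of the last 24 months

# (b) Results for the last 24 months.
dict_max = most_volatile(df, start_24, end)
dict_min = least_volatile(df, start_24, end)

# (d) Report with f-strings.
print(f"Most volatile since {start_24}: {dict_max['code']} (std = {dict_max['std']:.2f})")
print(f"Least volatile since {start_24}: {dict_min['code']} (std = {dict_min['std']:.2f})")
```

Since the two functions differ only in idxmax versus idxmin, you could also write one helper with an extra argument and call it from both, in keeping with the DRY principle.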
2.6 Create a function which takes a series (or dataframe with 1 column) and an integer and creates and saves (as a png file) a plot of the rolling window standard deviations of that series, for a window specified by the integer parameter. The name of the file with the figure should change with the code of the series and the integer indicating the window length. For example, if the code is ‘011500’ and the window is 36, the name of the figure should be something like ‘fig_011500_rolling_std_36.png’. The important parts (in this example) are 011500 and 36 - the rest (the fixed part of the file name) you choose as you see best. This can be done using an f-string.
Test that the function produces the same plots as in 1.10 of Part 1.
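A rough sketch of such a function follows, assuming matplotlib is used for plotting and that the series name (or the single column name) holds the item code. The function name plot_rolling_std, the plot title and the toy series are hypothetical. The Agg backend is selected only so that the sketch runs as a plain script without a display; in a notebook you would normally leave that line out.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_rolling_std(series, window):
    """Plot the rolling standard deviation of series and save it as a png file."""
    # Accept a one-column dataframe as well as a series.
    if isinstance(series, pd.DataFrame):
        series = series.iloc[:, 0]
    code = series.name  # assumed to be the HICP item code

    rolling_std = series.rolling(window).std()

    fig, ax = plt.subplots()
    rolling_std.plot(ax=ax)
    ax.set_title(f"{code}: rolling standard deviation, window = {window}")

    # The file name changes with the code and the window length.
    file_path = Path(f"fig_{code}_rolling_std_{window}.png")
    fig.savefig(file_path)
    plt.close(fig)  # free the figure once it is saved
    return file_path


# Toy series with a made-up code as its name.
index = pd.date_range("2015-01-01", periods=60, freq="MS")
toy = pd.Series(np.sin(np.arange(60) / 3), index=index, name="011500")

saved = plot_rolling_std(toy, 36)
print(saved)
```

Returning the path of the saved file is optional, but it makes it easy to check that the file name was built as intended.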