Wednesday, 21 January 2015

FOMC Dates - Full History Web Scrape

As I delve into the existing academic research regarding price patterns around US Federal Open Market Committee (FOMC) meetings, it’s clear that I will need more data than I collected in the previous post, “FOMC Dates - Scraping Data From Web Pages”.

This reminds me of a quote from Google’s Research Director, Peter Norvig:

We don’t have better algorithms. We just have more data.

In particular, I’ll need FOMC dates from at least February 1994, when the Federal Reserve began issuing statements describing monetary policy decisions following an FOMC meeting. I also need to know whether each meeting was scheduled or not: there are 8 scheduled meetings per year, but the committee also holds inter-meeting conference calls as and when needed. For interesting background on the Fed’s move to greater transparency over time, see the article posted on their web site.

With such data it should be relatively easy to reproduce some of the results from the academic research.

All the data is available on the Fed’s web site, but unfortunately it is spread over many web pages and has to be scraped from each of them. To do this I decided to use XPath (XML Path Language) and regular expressions in R. I created 4 R functions and saved them in a separate file, “FOMC Dates Functions.R”, which you will need to download from GitHub and save in your working directory in order to run the R code below.
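To give a flavour of the approach, here is a minimal sketch of combining XPath and regular expressions with the XML package. It is not the code from “FOMC Dates Functions.R”: the HTML fragment, the class name "meeting" and the text layout are all invented for illustration, and the real Fed pages are structured differently.

```r
library(XML)

# Hypothetical HTML fragment (an assumption, not the Fed's actual markup)
html <- '
<html><body>
  <div class="meeting"><p>January 27-28 Meeting - 2009</p></div>
  <div class="meeting"><p>March 17 Meeting - 2009</p></div>
  <div class="other"><p>Not a meeting</p></div>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# XPath: text of every <p> inside a <div> whose class is "meeting"
txt <- xpathSApply(doc, "//div[@class='meeting']/p", xmlValue)

# Regex: month name, first day, optional "-last day", then the year
pattern <- "^([A-Za-z]+) ([0-9]{1,2})(-([0-9]{1,2}))? Meeting - ([0-9]{4})$"
month  <- sub(pattern, "\\1", txt)
begday <- sub(pattern, "\\2", txt)
endday <- sub(pattern, "\\4", txt)
year   <- sub(pattern, "\\5", txt)

# One-day meetings have no end day, so reuse the first day
endday[endday == ""] <- begday[endday == ""]

# month.name is R's built-in vector of English month names,
# which avoids relying on the locale when parsing dates
mon <- match(month, month.name)

meetings <- data.frame(
  begdate = as.Date(sprintf("%s-%02d-%02d", year, mon, as.integer(begday))),
  enddate = as.Date(sprintf("%s-%02d-%02d", year, mon, as.integer(endday)))
)
print(meetings)
```

The XPath expression picks out the page elements of interest, and the regular expression then breaks each text string into its date components.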

## install.packages(c("httr", "XML"), repos = "http://cran.us.r-project.org")
library(httr)
library(XML)

# load fomc date functions
source("FOMC Dates Functions.R")

# extract data from web pages and parse dates
fomcdatespre2009 <- get.fomc.dates.pre.2009(1936, 2008)
fomcdatesfrom2009 <- get.fomc.dates.from.2009()

# combine datasets and order chronologically by meeting start date
fomcdatesall <- rbind(fomcdatespre2009, fomcdatesfrom2009)
fomcdatesall <- fomcdatesall[order(fomcdatesall$begdate), ]

# save as RData format
save(fomcdatesall, file = "fomcdatesall.RData")
# save as csv file
write.csv(fomcdatesall, "fomcdatesall.csv", row.names = FALSE)

# check results
head(fomcdatesall)

This scrapes the full history of FOMC meetings from 1936 to the present. The data is stored in a data frame with one row for each meeting or conference call and 6 columns: the beginning and end dates, whether a press conference was held, whether it was a regularly scheduled meeting, the type of document published to record the meeting details, and the url of that document. For example:

##      begdate    enddate pressconf scheduled
## 1 1936-03-18 1936-03-18         0         1
## 2 1936-03-19 1936-03-19         0         1
## 3 1936-05-25 1936-05-25         0         1
## 4 1936-11-19 1936-11-19         0         1
## 5 1936-11-20 1936-11-20         0         1
## 6 1937-01-26 1937-01-26         0         1
##                                document
## 1       Historical Minutes (400 KB PDF)
## 2 Record of Policy Actions (271 KB PDF)
## 3 Record of Policy Actions (143 KB PDF)
## 4       Historical Minutes (176 KB PDF)
## 5 Record of Policy Actions (198 KB PDF)
## 6 Record of Policy Actions (272 KB PDF)
##                                             url
## 1 /monetarypolicy/files/FOMChistmin19360318.pdf
## 2    /monetarypolicy/files/fomcropa19360319.pdf
## 3    /monetarypolicy/files/fomcropa19360525.pdf
## 4 /monetarypolicy/files/FOMChistmin19361119.pdf
## 5    /monetarypolicy/files/fomcropa19361120.pdf
## 6    /monetarypolicy/files/fomcropa19370126.pdf

The data is also saved to disk as an RData file for use in my future posts and as a csv file (if you need the dates for your own research).
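If you use the csv file, a sketch along the following lines may help to get started. It keeps only the scheduled meetings from February 1994 onwards. The rows below are toy values that mimic the column layout shown above and should not be relied on as real FOMC dates; the temporary file stands in for “fomcdatesall.csv”.

```r
# Toy rows with the same column names as the scraped data (illustrative only)
fomc <- data.frame(
  begdate   = c("1993-12-21", "1994-02-03", "1994-04-18", "1994-05-17"),
  enddate   = c("1993-12-21", "1994-02-04", "1994-04-18", "1994-05-17"),
  pressconf = c(0, 0, 0, 0),
  scheduled = c(1, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Write to a temporary csv file and read it back, as you would with fomcdatesall.csv
csvfile <- tempfile(fileext = ".csv")
write.csv(fomc, csvfile, row.names = FALSE)
dat <- read.csv(csvfile, stringsAsFactors = FALSE)

# Dates come back from csv as character strings, so convert them
dat$begdate <- as.Date(dat$begdate)
dat$enddate <- as.Date(dat$enddate)

# Keep scheduled meetings starting on or after 1 February 1994
sched <- dat[dat$begdate >= as.Date("1994-02-01") & dat$scheduled == 1, ]
print(sched)
```

Filtering on the scheduled column rather than dropping rows at scrape time keeps the inter-meeting conference calls available for later analysis.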

Click here for the above R code on GitHub.

Click here for the “FOMC Dates Functions.R” code file on GitHub.

Click here for the csv file of FOMC dates from 1936.

1 comment:

  1. Updated scripts so no need to manually update year argument when Fed updates its web page once a year. Use these files going forward:
    Click here for the updated main file on GitHub.
    Click here for the updated functions file on GitHub.
