# Panel Data and Fixed Effects
Ian McCarthy | Emory University
Econ 771, Fall 2022

# Understanding Panel Data

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:center;"> Default FE </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Log GDP per Capita </td>
   <td style="text-align:center;"> 9.769 </td>
  </tr>
  <tr>
   <td style="text-align:left;">  </td>
   <td style="text-align:center;"> (0.702) </td>
  </tr>
</tbody>
</table>

---

# Within Estimator (Manually Demean) in practice

.pull-left[
**Stata**<br>
```stata
causaldata gapminder.dta, use clear download
gen lgdp_pc=log(gdppercap)
foreach x of varlist lifeExp lgdp_pc {
  * country-specific mean, then the deviation from it
  egen mean_`x'=mean(`x'), by(country)
  gen demean_`x'=`x'-mean_`x'
}
reg demean_lifeExp demean_lgdp_pc
```
]

.pull-right[
**R**<br>
```r
library(dplyr)       # %>%, mutate(), group_by()
library(causaldata)
# subtract each country's mean from the outcome and the regressor
reg.dat <- causaldata::gapminder %>%
  mutate(lgdp_pc=log(gdpPercap)) %>%
  group_by(country) %>%
  mutate(demean_lifeexp=lifeExp - mean(lifeExp, na.rm=TRUE),
         demean_gdp=lgdp_pc - mean(lgdp_pc, na.rm=TRUE))
lm(demean_lifeexp ~ 0 + demean_gdp, data=reg.dat)
```
]

---

# Within Estimator (Manually Demean) in practice

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:center;"> Default FE </th>
   <th style="text-align:center;"> Manual FE </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Log GDP per Capita </td>
   <td style="text-align:center;"> 9.769 </td>
   <td style="text-align:center;"> 9.769 </td>
  </tr>
  <tr>
   <td style="text-align:left;">  </td>
   <td style="text-align:center;"> (0.702) </td>
   <td style="text-align:center;"> (0.701) </td>
  </tr>
</tbody>
</table>

**Note:** `feols` clusters standard errors at the level of the fixed effect by default; with `lm`, we have to request clustering ourselves.

---

# First differencing (default) in practice

.pull-left[
**Stata**<br>
```stata
causaldata gapminder.dta, use clear download
gen lgdp_pc=log(gdppercap)
* d. needs xtset; assumes country is a string and years are 5 apart
encode country, gen(country_id)
xtset country_id year, delta(5)
reg d.lifeExp d.lgdp_pc, noconstant
```
]

.pull-right[
**R**<br>
```r
library(dplyr)
library(plm)
reg.dat <- causaldata::gapminder %>%
  mutate(lgdp_pc=log(gdpPercap))
# model="fd" takes first differences within each country
plm(lifeExp ~ 0 + lgdp_pc, model="fd", individual="country",
    index=c("country","year"), data=reg.dat)
```
]

---

# First differencing (default) in practice

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:center;"> Default FE </th>
   <th style="text-align:center;"> Manual FE </th>
   <th style="text-align:center;"> Default FD </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Log GDP per Capita </td>
   <td style="text-align:center;"> 9.769 </td>
   <td style="text-align:center;"> 9.769 </td>
   <td style="text-align:center;"> 5.290 </td>
  </tr>
  <tr>
   <td style="text-align:left;">  </td>
   <td style="text-align:center;"> (0.702) </td>
   <td style="text-align:center;"> (0.284) </td>
   <td style="text-align:center;"> (0.291) </td>
  </tr>
</tbody>
</table>

---

# First differencing (manual) in practice

.pull-left[
**Stata**<br>
```stata
causaldata gapminder.dta, use clear download
gen lgdp_pc=log(gdppercap)
* d. needs xtset; assumes country is a string and years are 5 apart
encode country, gen(country_id)
xtset country_id year, delta(5)
reg d.lifeExp d.lgdp_pc, noconstant
```
]

.pull-right[
**R**<br>
```r
library(dplyr)   # dplyr::lag(), not stats::lag()
# difference each variable against its previous observation within country
reg.dat <- causaldata::gapminder %>%
  mutate(lgdp_pc=log(gdpPercap)) %>%
  group_by(country) %>%
  arrange(country, year) %>%
  mutate(fd_lifeexp=lifeExp - lag(lifeExp),
         fd_lgdp_pc=lgdp_pc - lag(lgdp_pc)) %>%
  na.omit()
lm(fd_lifeexp ~ 0 + fd_lgdp_pc, data=reg.dat)
```
]

---

# First differencing (manual) in practice

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:center;"> Default FE </th>
   <th style="text-align:center;"> Manual FE </th>
   <th style="text-align:center;"> Default FD </th>
   <th style="text-align:center;"> Manual FD </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Log GDP per Capita </td>
   <td style="text-align:center;"> 9.769 </td>
   <td style="text-align:center;"> 9.769 </td>
   <td style="text-align:center;"> 5.290 </td>
   <td style="text-align:center;"> 5.290 </td>
  </tr>
  <tr>
   <td style="text-align:left;">  </td>
   <td style="text-align:center;"> (0.702) </td>
   <td style="text-align:center;"> (0.284) </td>
   <td style="text-align:center;"> (0.291) </td>
   <td style="text-align:center;"> (0.291) </td>
  </tr>
</tbody>
</table>

---

# FE and FD with same time period

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;">   </th>
   <th style="text-align:center;"> Default FE </th>
   <th style="text-align:center;"> Default FD </th>
   <th style="text-align:center;"> Manual FD </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Log GDP per Capita </td>
   <td style="text-align:center;"> 8.929 </td>
   <td style="text-align:center;"> 5.290 </td>
   <td style="text-align:center;"> 5.290 </td>
  </tr>
  <tr>
   <td style="text-align:left;">  </td>
   <td style="text-align:center;"> (0.741) </td>
   <td style="text-align:center;"> (0.291) </td>
   <td style="text-align:center;"> (0.291) </td>
  </tr>
</tbody>
</table>

Don't want to read too much into this, but...

- Almost certainly strong serial correlation in this case
- Misspecified model
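
---

# FE and FD: what each transformation removes

A rough sketch of why the two estimators can disagree. The notation is assumed here for illustration and is not taken from the slides: $y_{it}$ is life expectancy, $x_{it}$ is log GDP per capita, $\alpha_i$ is the country effect, and $\bar{y}_i$, $\bar{x}_i$, $\bar{\varepsilon}_i$ are country means over time.

$$
\begin{aligned}
\text{Model:}\quad y_{it} &= \beta x_{it} + \alpha_i + \varepsilon_{it} \\
\text{FE (within):}\quad y_{it} - \bar{y}_i &= \beta \left(x_{it} - \bar{x}_i\right) + \left(\varepsilon_{it} - \bar{\varepsilon}_i\right) \\
\text{FD:}\quad y_{it} - y_{i,t-1} &= \beta \left(x_{it} - x_{i,t-1}\right) + \left(\varepsilon_{it} - \varepsilon_{i,t-1}\right)
\end{aligned}
$$

- Both transformations remove $\alpha_i$; with $T=2$ they give the same $\hat{\beta}$
- With $T>2$ and strict exogeneity, both are consistent for the same $\beta$, so they should differ mainly in precision
- FE is the more efficient of the two when $\varepsilon_{it}$ is serially uncorrelated; FD is when $\varepsilon_{it}$ follows a random walk
- A gap as large as 8.929 versus 5.290 is therefore more consistent with a violated assumption than with sampling noise alone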