
Panel Model

  • $y_i: T\times 1$, $Y_{nT\times 1}=(y_1',\ldots,y_n')'$
  • $x_i: T_i\times k$, $X_{nT\times k}=(x_1',\ldots,x_n')'$

$$\begin{aligned} &y_{it}=\boldsymbol{x}_{it}^{\prime} \boldsymbol{\beta}+e_{it}\\ &Y=X\beta+e\\ &e_{it}=\alpha_{i}+\varepsilon_{it} \quad \text{(individual effect + idiosyncratic error)} \end{aligned}$$

1. Random Effects-GLS

$$y_{it}=\boldsymbol{x}_{it}^{\prime} \boldsymbol{\beta}+\alpha_i+\varepsilon_{it}$$

RE-Identification:

$$\begin{aligned} \mathbb{E}\left(\varepsilon_{it} \mid \mathbf{X}_{i}\right) &=0 \\ \mathbb{E}\left(\varepsilon_{it}^{2} \mid \mathbf{X}_{i}\right) &=\sigma_{\varepsilon}^{2} \\ \mathbb{E}\left(\varepsilon_{ij} \varepsilon_{it} \mid \mathbf{X}_{i}\right) &=0, \quad j \neq t \\ \mathbb{E}\left(\alpha_{i} \mid \mathbf{X}_{i}\right) &=0 \\ \mathbb{E}\left(\alpha_{i}^{2} \mid \mathbf{X}_{i}\right) &=\sigma_{\alpha}^{2} \\ \mathbb{E}\left(\alpha_{i} \varepsilon_{it} \mid \mathbf{X}_{i}\right) &=0 \end{aligned}$$

These assumptions imply an equicorrelated (non-spherical) error covariance within each individual:

$$\begin{aligned} \mathbb{E}\left(\boldsymbol{e}_{i} \mid \mathbf{X}_{i}\right) &=0 \\ \mathbb{E}\left(\boldsymbol{e}_{i} \boldsymbol{e}_{i}^{\prime} \mid \mathbf{X}_{i}\right) &=\mathbf{1}_{i} \mathbf{1}_{i}^{\prime} \sigma_{\alpha}^{2}+\boldsymbol{I}_{i} \sigma_{\varepsilon}^{2}=\sigma_{\varepsilon}^{2} \boldsymbol{\Omega}_{i}. \end{aligned}$$

GLS then yields the BLUE under these Gauss-Markov-type conditions.

$$\widehat{\boldsymbol{\beta}}_{\mathrm{gls}}=\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \boldsymbol{\Omega}_{i}^{-1} \boldsymbol{X}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \boldsymbol{\Omega}_{i}^{-1} \boldsymbol{y}_{i}\right)$$

Feasible GLS: since pooled OLS is still consistent, we use the OLS residuals $\hat{e}$ to estimate the covariance matrix.

$$\widehat{\boldsymbol{e}}_{i}=\boldsymbol{y}_{i}-\mathbf{X}_{i} \widehat{\boldsymbol{\beta}}_{\mathrm{OLS}}$$
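The feasible-GLS steps can be sketched numerically. This is a minimal simulation (the design, seed, and variable names are ours, not from the notes); the $\Omega_i^{-1}$ weighting is applied through the standard quasi-demeaning form of random-effects GLS:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 500, 5, 1.0

# simulate a random-effects panel: y_it = x_it*beta + alpha_i + eps_it
alpha = rng.normal(size=N)                       # individual effect, variance 1
x = rng.normal(size=(N, T))
y = beta * x + alpha[:, None] + rng.normal(size=(N, T))

# step 1: pooled OLS (consistent) and its residuals e_hat
b_ols = (x * y).sum() / (x ** 2).sum()
e = y - b_ols * x

# step 2: variance components sigma_eps^2, sigma_alpha^2 from the residuals
e_bar = e.mean(axis=1)
sigma_eps2 = ((e - e_bar[:, None]) ** 2).sum() / (N * (T - 1))
sigma_alpha2 = max((e_bar ** 2).mean() - sigma_eps2 / T, 0.0)

# step 3: GLS via quasi-demeaning, equivalent to weighting by Omega_i^{-1}
theta = 1 - np.sqrt(sigma_eps2 / (sigma_eps2 + T * sigma_alpha2))
xs = x - theta * x.mean(axis=1, keepdims=True)
ys = y - theta * y.mean(axis=1, keepdims=True)
b_gls = (xs * ys).sum() / (xs ** 2).sum()
```

With $\theta=0$ this collapses to pooled OLS and with $\theta=1$ to the within estimator, which previews the "RE is between FE and pooled OLS" discussion below.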

2. Fixed Effect

In the econometrics literature, if the stochastic structure of $\alpha_i$ is treated as unknown and possibly correlated with $x_{it}$, then $\alpha_i$ is called a fixed effect.

FE-Identification

  • $E(\varepsilon_{it} \mid x_{i},\alpha_i)=0$ for all $t$
  • $E(\alpha_i \mid x_i)\neq 0$: check by the ==Hausman-Wu test==; if it does not reject, use random effects to achieve more efficiency.
    $$\begin{aligned} H &=\left(\widehat{\boldsymbol{\beta}}_{\mathrm{fe}}-\widehat{\boldsymbol{\beta}}_{\mathrm{re}}\right)^{\prime} \widehat{\operatorname{var}}\left(\widehat{\boldsymbol{\beta}}_{\mathrm{fe}}-\widehat{\boldsymbol{\beta}}_{\mathrm{re}}\right)^{-1}\left(\widehat{\boldsymbol{\beta}}_{\mathrm{fe}}-\widehat{\boldsymbol{\beta}}_{\mathrm{re}}\right) \\ &=\left(\widehat{\boldsymbol{\beta}}_{\mathrm{fe}}-\widehat{\boldsymbol{\beta}}_{\mathrm{re}}\right)^{\prime}\left(\widehat{\boldsymbol{V}}_{\mathrm{fe}}-\widehat{\boldsymbol{V}}_{\mathrm{re}}\right)^{-1}\left(\widehat{\boldsymbol{\beta}}_{\mathrm{fe}}-\widehat{\boldsymbol{\beta}}_{\mathrm{re}}\right) \end{aligned}$$

Within-transformation: $MY=MX\beta+M\alpha+M\varepsilon$, where $M\alpha=0$ because $\alpha_i$ is constant over $t$, so the individual effect drops out.

$$\begin{aligned} \widehat{\boldsymbol{\beta}}_{\mathrm{fe}}&=\left(\sum_{i=1}^{N} \sum_{t \in S_{i}} \dot{\boldsymbol{x}}_{it} \dot{\boldsymbol{x}}_{it}^{\prime}\right)^{-1}\left(\sum_{i=1}^{N} \sum_{t \in S_{i}} \dot{\boldsymbol{x}}_{it} \dot{y}_{it}\right) \\ &=\left(\sum_{i=1}^{N} \dot{\boldsymbol{X}}_{i}^{\prime} \dot{\boldsymbol{X}}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \dot{\boldsymbol{X}}_{i}^{\prime} \dot{\boldsymbol{y}}_{i}\right) \\ &=\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \mathbf{M}_{i} \mathbf{X}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \mathbf{M}_{i} \boldsymbol{y}_{i}\right) \end{aligned}$$

Dummy Regression

Take $\alpha_i$ as the coefficient of a dummy variable:

$$\begin{aligned} &d_i=(d_{i1},\cdots,d_{ii},\cdots,d_{in})'=(0,\cdots,1,\cdots,0)'\\ &\alpha=(\alpha_{1},\cdots,\alpha_{i},\cdots,\alpha_{n})' \end{aligned}$$

$$y_{it}=x_{it}'\beta +d_i'\alpha +\varepsilon_{it}$$

Then by OLS and the Frisch-Waugh theorem, $\hat{\beta}$ is the same as the fixed-effects estimator.
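The Frisch-Waugh equivalence is easy to verify numerically: the within (demeaned) estimator and the least-squares dummy-variable (LSDV) regression return the same $\hat{\beta}$. A minimal sketch with one regressor (simulated data and names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta = 50, 4, 2.0
alpha = rng.normal(size=N)
x = rng.normal(size=(N, T)) + 0.5 * alpha[:, None]   # x correlated with alpha
y = beta * x + alpha[:, None] + rng.normal(size=(N, T))

# within (demeaned) estimator
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_fe = (xd * yd).sum() / (xd ** 2).sum()

# LSDV: regress y on x plus one dummy per individual
D = np.kron(np.eye(N), np.ones((T, 1)))              # NT x N dummy matrix
Z = np.hstack([x.reshape(-1, 1), D])
b_lsdv = np.linalg.lstsq(Z, y.reshape(-1), rcond=None)[0][0]
```

The demeaned version is preferred in practice because it avoids inverting an $(N+k)$-dimensional design matrix.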

RE vs. FE

RE is a linear (matrix-weighted) combination of the between estimator $\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \mathbf{P}_{i} \mathbf{X}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \mathbf{P}_{i} \boldsymbol{y}_{i}\right)$ and the fixed-effects estimator $\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \mathbf{M}_{i} \mathbf{X}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \mathbf{M}_{i} \boldsymbol{y}_{i}\right)$.

$$\bar{y}_{i}=\overline{\boldsymbol{x}}_{i}^{\prime} \boldsymbol{\beta}+\alpha_{i}+\bar{\varepsilon}_{i}, \quad \widehat{\boldsymbol{\beta}}_{\mathrm{be}}=\left(\sum_{i=1}^{N} \overline{\boldsymbol{x}}_{i} \overline{\boldsymbol{x}}_{i}^{\prime}\right)^{-1}\left(\sum_{i=1}^{N} \overline{\boldsymbol{x}}_{i} \bar{y}_{i}\right)$$

  • As $T\rightarrow \infty$, RE converges to FE.
  • Under the RE assumptions, $Var(\hat{\beta}_{RE})\leq Var(\hat{\beta}_{FE})$.

3. First-difference

Another way to eliminate the individual effect.

$$\Delta y_{it}=\Delta \boldsymbol{x}_{it}^{\prime} \boldsymbol{\beta}+\Delta \varepsilon_{it}$$

$$\begin{aligned} \widehat{\boldsymbol{\beta}}_{\Delta} &=\left(\sum_{i=1}^{N} \sum_{t \geq 2} \Delta \boldsymbol{x}_{it} \Delta \boldsymbol{x}_{it}^{\prime}\right)^{-1}\left(\sum_{i=1}^{N} \sum_{t \geq 2} \Delta \boldsymbol{x}_{it} \Delta y_{it}\right) \\ &=\left(\sum_{i=1}^{N} \Delta \boldsymbol{X}_{i}^{\prime} \Delta \boldsymbol{X}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \Delta \boldsymbol{X}_{i}^{\prime} \Delta \boldsymbol{y}_{i}\right) \\ &=\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \boldsymbol{D}_{i}^{\prime} \boldsymbol{D}_{i} \boldsymbol{X}_{i}\right)^{-1}\left(\sum_{i=1}^{N} \boldsymbol{X}_{i}^{\prime} \boldsymbol{D}_{i}^{\prime} \boldsymbol{D}_{i} \boldsymbol{y}_{i}\right) \end{aligned}$$

  • $T=2$: identical to the fixed-effects estimator.
  • $T>2$: generally different.
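The $T=2$ equivalence can be checked numerically; the simulated design and function names below are ours:

```python
import numpy as np

def fe_within(x, y):
    """Within (fixed-effects) estimator for a single regressor."""
    xd = x - x.mean(axis=1, keepdims=True)
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

def fd(x, y):
    """First-difference estimator for a single regressor."""
    dx, dy = np.diff(x, axis=1), np.diff(y, axis=1)
    return (dx * dy).sum() / (dx ** 2).sum()

rng = np.random.default_rng(2)

def panel(N, T, beta=2.0):
    """Simulate y_it = beta*x_it + alpha_i + eps_it."""
    alpha = rng.normal(size=N)
    x = rng.normal(size=(N, T))
    y = beta * x + alpha[:, None] + rng.normal(size=(N, T))
    return x, y

x2, y2 = panel(200, 2)   # T = 2: FD and FE coincide exactly
x5, y5 = panel(200, 5)   # T = 5: both consistent, but numerically different
```

For $T=2$ the demeaned pair $(\dot{x}_{i1},\dot{x}_{i2})=(-\Delta x_i/2,\Delta x_i/2)$, so the two formulas reduce to the same ratio.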

4. Dynamic Panel

Weaker identification assumption than the previous FE/RE setups (sequential exogeneity), with $x_{it}=y_{it-1}$:

$$y_{it}=y_{it-1}\beta +\alpha_i + \varepsilon_{it}, \qquad E(\varepsilon_{it}\mid x_{it},\cdots,x_{i1}, \alpha_i)=0$$

In applications it will often be useful to include time effects $f_t$ to eliminate spurious serial correlation.

Inconsistency of FE

The within operator induces correlation between the AR(1) lag and the transformed error: after demeaning, $\dot{x}_{it}$ contains $\bar{x}_i$, which is correlated with $\varepsilon_{it}$. The result is that the within estimator is inconsistent for the coefficients when $T$ is fixed; a thorough explanation appears in Nickell (1981) (an incidental parameter problem):

$$\hat{\beta}_{FE} \xrightarrow{p} \beta + B^{-1}E(x_{it}\varepsilon_{it}-\bar{x}_{i} \varepsilon_{it})=\beta+B^{-1}O(1/T)$$

Solutions

  • Let $T\rightarrow \infty$
  • Anderson-Hsiao estimator (just-identified)
  • Arellano-Bond estimator (over-identified)

Anderson-Hsiao Estimator

Anderson and Hsiao (1982) made an important breakthrough by showing that a simple instrumental variables estimator is consistent for the parameters.

  1. First-difference to eliminate the fixed effects; this creates endogeneity, because $\mathbb{E}\left(\Delta y_{it-1} \Delta \varepsilon_{it}\right)=\mathbb{E}\left(\left(y_{it-1}-y_{it-2}\right)\left(\varepsilon_{it}-\varepsilon_{it-1}\right)\right)=-\sigma_{\varepsilon}^{2}$
  2. Use IV for the endogeneity above: instrument $\Delta y_{it-1}$ with $y_{it-2}$, and with $p$ lags,
    $$\left(y_{it-2}, \ldots, y_{it-p-1}\right) \text{ for }\left(\Delta y_{it-1}, \ldots, \Delta y_{it-p}\right)$$
  3. Given the assumption of no serial correlation in $\varepsilon_{it}$, as $N\rightarrow \infty$,
    $$\widehat{\beta}_{iv} \stackrel{p}{\longrightarrow} \beta-\frac{\mathbb{E}\left(y_{i1} \Delta \varepsilon_{i3}\right)}{\mathbb{E}\left(y_{i1} \Delta y_{i2}\right)}=\beta$$
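A simulation sketch (our own illustrative design, not from the notes) showing both the downward Nickell bias of the within estimator and its removal by the Anderson-Hsiao IV:

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, beta = 5000, 6, 0.5

# simulate y_it = beta*y_{it-1} + alpha_i + eps_it
alpha = rng.normal(size=N)
y = np.zeros((N, T))
y[:, 0] = alpha + rng.normal(size=N)
for t in range(1, T):
    y[:, t] = beta * y[:, t - 1] + alpha + rng.normal(size=N)

# within (FE) estimator: biased downward for fixed T (Nickell bias)
x, yy = y[:, :-1], y[:, 1:]
xd = x - x.mean(axis=1, keepdims=True)
yd = yy - yy.mean(axis=1, keepdims=True)
b_fe = (xd * yd).sum() / (xd ** 2).sum()

# Anderson-Hsiao: first-difference, instrument dy_{t-1} with the level y_{t-2}
dy = np.diff(y, axis=1)          # dy[:, j] = y_{j+1} - y_j
lhs = dy[:, 1:].ravel()          # dy_t  for t = 2..T-1
rhs = dy[:, :-1].ravel()         # dy_{t-1}
z = y[:, :T - 2].ravel()         # instrument y_{t-2}
b_iv = (z @ lhs) / (z @ rhs)
```

With $T=6$ the Nickell bias is roughly $-(1+\beta)/(T-1)$, large enough to see clearly, while the IV estimate concentrates around the true $\beta$.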

Arellano-Bond Estimator

Use $(y_{it-2},y_{it-3},\cdots)$ as IVs for $\Delta y_{it-1}$. Applying more valid IVs increases the ==efficiency== (smaller asymptotic variance).

Using these extra instruments has a complication: there is a different number of instruments for each time period. The solution is to view the model as a system of $T$ equations.

Weak IV issue

The Anderson-Hsiao instrument is weak if $\gamma$, the coefficient of the first-stage regression of $\Delta y_{it-1}$ on the instrument $y_{it-2}$, is small (Blundell and Bond, 1998):

$$\gamma=(\beta-1)\left(\frac{k}{k+\sigma_{\alpha}^{2} / \sigma_{\varepsilon}^{2}}\right), \quad k=\frac{1-\beta}{1+\beta}$$

The instrument is weak when:

  • near unit root: $\beta\approx 1$
  • the idiosyncratic error $\varepsilon$ is small relative to the individual-specific effect $\alpha$ (large $\sigma_{\alpha}^{2}/\sigma_{\varepsilon}^{2}$)
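The formula for $\gamma$ can be tabulated directly to see both weak-instrument cases; a small sketch (`gamma` is our name):

```python
def gamma(beta, var_ratio):
    """First-stage coefficient; var_ratio = sigma_alpha^2 / sigma_eps^2."""
    k = (1 - beta) / (1 + beta)
    return (beta - 1) * k / (k + var_ratio)
```

For example, `abs(gamma(0.99, 1.0))` is tiny compared with `abs(gamma(0.5, 1.0))`, and raising the variance ratio shrinks $|\gamma|$ at any $\beta$.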

Arellano and Bover (1995) and Blundell and Bond (1998) introduced a set of ==orthogonality conditions== which reduce the weak instrument problem.

5. Probit with fixed effects

$$y_{it}^{*} =x_{it}'\beta+\alpha_i+\varepsilon_{it}, \qquad y_{it}=\mathbf{1}\{y_{it}^{*}>0\}$$

Log-likelihood function:

$$l(\beta,\alpha)=\sum_i\sum_t y_{it}\log\Phi(x_{it}'\beta+\alpha_i)+(1-y_{it})\log[1-\Phi(x_{it}'\beta+\alpha_i)]$$

We cannot difference out the fixed effects from the likelihood function, so joint estimation of $\beta$ and all $\alpha_i$ is required.

2-step (profiled likelihood)

  1. $\hat{\alpha}_i(\beta) = \arg\max_{\alpha_i} l(\beta,\alpha)$
  2. $\hat{\beta}=\arg\max_{\beta} l(\beta,\hat{\alpha}(\beta))$
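A brute-force sketch of the joint MLE (maximizing over $\beta$ and all $\alpha_i$ at once rather than via the two-step profile; the simulated design and the use of `scipy` are our choices, not from the notes):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(4)
N, T, beta = 100, 4, 1.0
alpha = rng.normal(size=N)
x = rng.normal(size=(N, T))
y = (beta * x + alpha[:, None] + rng.normal(size=(N, T)) > 0).astype(float)

def negll(params):
    """Negative log-likelihood in (beta, alpha_1, ..., alpha_N)."""
    b, a = params[0], params[1:]
    p = np.clip(norm.cdf(b * x + a[:, None]), 1e-9, 1 - 1e-9)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()

res = minimize(negll, np.zeros(N + 1), method="L-BFGS-B",
               options={"maxfun": 200_000})
b_hat = res.x[0]   # contaminated by the N estimated alpha_i when T is small
```

With $T=4$ the estimate tends to overshoot $\beta$, which is exactly the incidental parameter problem discussed next.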

Incidental parameter problem

  • $\hat{\alpha}_i$ uses only $T$ observations, so it is inconsistent as $N\rightarrow \infty$ with $T$ fixed.
  • The fixed-$T$ error in $\hat{\alpha}$ contaminates $\hat{\beta}$, which is also inconsistent: $\hat{\beta}_{mle} = \beta + \frac{\beta}{T}+O_p(1/T^2)$.
  • When $T\rightarrow \infty$ with $\frac{N}{T}\rightarrow \lambda$, $\sqrt{NT}(\hat{\beta}_{mle}-\beta)=\sqrt{\frac{N}{T}}\beta +O_p(1)$, so the limiting distribution is not centered at zero.

6. Logit with fixed effects

Tip: similar to the probit case.

7. Multinomial response model

$$y_{ij}^{*} =\underbrace{x_{ij}'\beta}_{\text{observed}}+\underbrace{z_i'\gamma_j}_{\text{choice coefficients}}+\underbrace{a_{ij}}_{\text{unobserved factors}}$$

  • non-ordinal (unordered) choice
  • ordinal choice: e.g. bond ratings
  • Let $V_{ij}=x_{ij}'\beta+z_i'\gamma_j$ (mixed logit model)

Unordered choice

We model the choice behavior using a ==utility maximization== argument (McFadden, 1973):

$$y_i=\arg\max_{j} \{y_{i1}^{*},\cdots,y_{iJ}^{*}\}$$

Assume $a_{ij}\sim F(a)=e^{-e^{-a}}$, the ==Type 1 extreme value distribution==, with density $f(a)=e^{-a-e^{-a}}$; assume $a_{ij}$ is independent of $x,z$ and i.i.d. across alternatives.

$$\begin{aligned} P(y_i = j\mid x,z)&=\frac{e^{V_{ij}}}{\sum_{k=1}^J e^{V_{ik}}}\\ &=\frac{e^{x_{ij}'\beta+z_i'\gamma_j}}{\sum_{k=1}^J e^{x_{ik}'\beta+z_i'\gamma_k}} \end{aligned}$$

Estimation then proceeds by MLE.

Limitations

  • IIA (independence of irrelevant alternatives): the odds of choosing between any two alternatives do not depend on the other alternatives. This arises from $a_{ij}$ being i.i.d. across $1,\ldots,J$: $$\frac{P(y_i=j\mid x)}{P(y_i=k\mid x)}=e^{(x_{ij}-x_{ik})'\beta}$$
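Both the logit probabilities and the IIA property can be checked numerically; a small sketch (`choice_probs` is our name):

```python
import numpy as np

def choice_probs(V):
    """Multinomial logit P(y = j) from an n x J matrix of utilities V_ij."""
    V = V - V.max(axis=1, keepdims=True)     # stabilize exp, probs unchanged
    e = np.exp(V)
    return e / e.sum(axis=1, keepdims=True)

V = np.array([[1.0, 0.5, -0.2]])
P3 = choice_probs(V)            # with all three alternatives
P2 = choice_probs(V[:, :2])     # drop the third alternative
# IIA: the odds P(j)/P(k) are unaffected by removing other alternatives
```

Subtracting the row maximum before exponentiating leaves every ratio $e^{V_j}/e^{V_k}$ intact, which is the same invariance that produces IIA.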

8. Interactive effects

$$Y=\sum_{k=1}^{K} \beta_{k}^{0} X_{k}+\varepsilon, \quad \varepsilon=\lambda^{0} f^{0 \prime}+e$$

$$\widehat{\boldsymbol{\beta}}_{R}=\underset{\beta \in \mathbb{R}^{K}}{\operatorname{argmin}}\ \mathcal{L}_{NT}^{R}(\beta)$$

$$\begin{aligned} \mathcal{L}_{NT}^{R}(\beta) &=\min _{\left\{\Lambda \in \mathbb{R}^{N \times R},\, F \in \mathbb{R}^{T \times R}\right\}} \frac{1}{NT}\left\|Y-\beta \cdot X-\Lambda F^{\prime}\right\|_{\mathrm{HS}}^{2} \\ &=\min _{F \in \mathbb{R}^{T \times R}} \frac{1}{NT} \operatorname{Tr}\left[(Y-\beta \cdot X)^{\prime}(Y-\beta \cdot X) M_{F}\right] \\ &=\frac{1}{NT} \sum_{r=R+1}^{T} \mu_{r}\left[(Y-\beta \cdot X)^{\prime}(Y-\beta \cdot X)\right], \end{aligned}$$

The resulting optimization problem for $F$ is a principal components problem: the optimal $F$ is given by the $R$ largest principal components. At the optimum, the projector $M_F$ therefore exactly projects out the $R$ largest eigenvalues of $(Y-\beta\cdot X)^{\prime}(Y-\beta\cdot X)$, which gives rise to the final formulation of the profile objective function as the sum of its $(T-R)$ smallest eigenvalues.
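The eigenvalue form of the profiled objective is simple to implement; a sketch under a simulated rank-one factor design (design, seed, and names are ours):

```python
import numpy as np

def profile_obj(beta, Y, X, R):
    """L_NT^R(beta): sum of the T-R smallest eigenvalues of (Y-beta*X)'(Y-beta*X)/(NT)."""
    W = Y - beta * X
    mu = np.linalg.eigvalsh(W.T @ W)       # eigenvalues in ascending order
    return mu[: W.shape[1] - R].sum() / W.size

rng = np.random.default_rng(5)
N, T, R, beta0 = 100, 20, 1, 1.0
lam, f = rng.normal(size=(N, R)), rng.normal(size=(T, R))
X = rng.normal(size=(N, T))
Y = beta0 * X + lam @ f.T + 0.01 * rng.normal(size=(N, T))

# crude grid search over beta: the factor term lam f' is absorbed by the
# discarded R largest eigenvalues, so the objective dips at the true beta0
grid = np.linspace(0.0, 2.0, 81)
b_hat = grid[np.argmin([profile_obj(b, Y, X, R) for b in grid])]
```

In practice the minimization is done with a proper optimizer rather than a grid, but the grid makes the profiled structure easy to see.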

Strong factor and strong IV

Assumption SF (Strong Factors): (i) $0<\operatorname{plim}_{N, T \rightarrow \infty} \frac{1}{N} \lambda^{0 \prime} \lambda^{0}<\infty$. (ii) $0<\operatorname{plim}_{N, T \rightarrow \infty} \frac{1}{T} f^{0 \prime} f^{0}<\infty$.

9. Spurious Regression

In time series, a spurious regression gives bad results because the estimator converges to a random variable rather than to a constant.

$$\begin{aligned} &x_{it}=x_{i, t-1}+\varepsilon_{it} \\ &y_{it}=y_{i, t-1}+e_{it} \end{aligned}$$

In the panel model:

$$\begin{gathered} y_{it}=\alpha+x_{it} \beta+u_{it} \\ u_{it}=\mu_{i}+\nu_{it} \end{gathered}$$

Kao (1999) showed by sequential asymptotics:

$$\sqrt{n}\, \widehat{\beta}_{FE} \stackrel{d}{\rightarrow} N\left(0, \frac{2 \sigma_{e}^{2}}{5 \sigma_{\varepsilon}^{2}}\right)$$

In a panel, the estimator in the spurious regression still permits valid inference, as it converges to a zero-mean distribution.
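A quick simulation of this setting (our own illustrative implementation): independent random walks for each individual, a within regression of $y$ on $x$, and $\hat{\beta}_{FE}$ concentrating around zero as $n$ grows:

```python
import numpy as np

rng = np.random.default_rng(6)
n, T = 500, 50

# independent random walks: x and y are unrelated by construction
x = rng.normal(size=(n, T)).cumsum(axis=1)
y = rng.normal(size=(n, T)).cumsum(axis=1)

# fixed-effects (within) estimator of the spurious regression
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
b_fe = (xd * yd).sum() / (xd ** 2).sum()
```

Unlike the single time-series case, where $\hat{\beta}$ stays random in the limit, here averaging over the $n$ individuals drives the estimate toward zero at rate $\sqrt{n}$.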