# econometrics.it

Federico Belotti's niche on the web

## Spatial panel data models using Stata

A new command for estimating and forecasting spatial panel data models using Stata is now available: xsmle.

xsmle fits fixed- or random-effects spatial models for balanced panel data. To use xsmle with an unbalanced panel, see the mi prefix command. Consider the following general specification for the spatial panel data model:

$y_{it} = \tau y_{it-1} + \rho W y_{it} + X_{it} \beta + D Z_{it} \theta + a_i + \gamma_t + v_{it}$
$v_{it} = \lambda E v_{it} + u_{it}$

where $u_{it}$ is a normally distributed error term, $W$ is the spatial matrix for the autoregressive component, $D$ is the spatial matrix for the spatially lagged independent variables, and $E$ is the spatial matrix for the idiosyncratic error component. $a_i$ is the individual fixed or random effect and $\gamma_t$ is the time effect. xsmle fits the following nested models:

i) The SAR model with lagged dependent variable ($\theta=\lambda=0$)

$y_{it} = \tau y_{it-1} + \rho W y_{it} + X_{it} \beta + a_i + \gamma_t + u_{it}$,

where the standard SAR model is obtained by setting $\tau=0$.

ii) The SDM model with lagged dependent variable ($\lambda=0$)

$y_{it} = \tau y_{it-1} + \rho W y_{it} + X_{it} \beta + D Z_{it} \theta + a_i + \gamma_t + u_{it}$,

where the standard SDM model is obtained by setting $\tau=0$. xsmle allows the use of different weighting matrices for the spatially lagged dependent variable ($W$) and the spatially lagged regressors ($D$), as well as different sets of explanatory variables ($X_{it}$) and spatially lagged regressors ($Z_{it}$). The default is to use $W=D$ and $X_{it}=Z_{it}$.

iii) The SAC model ($\theta=\tau=0$)

$y_{it} = \rho W y_{it} + X_{it} \beta + a_i + \gamma_t + v_{it}$,
$v_{it} = \lambda E v_{it} + u_{it}$,

for which xsmle allows the use of different weighting matrices for the spatially lagged dependent variable ($W$) and the error term ($E$).

iv) The SEM model ($\rho=\theta=\tau=0$)

$y_{it} = X_{it} \beta + a_i + \gamma_t + v_{it}$,
$v_{it} = \lambda E v_{it} + u_{it}$.

v) The GSPRE model ($\rho=\theta=\tau=0$)

$y_{it} = X_{it} \beta + a_i + v_{it}$,
$a_i = \phi W a_i + \mu_i$,
$v_{it} = \lambda E v_{it} + u_{it}$,

where the random effects also have a spatial autoregressive form.
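To make the notation concrete, here is a minimal Python sketch (not part of xsmle) of the spatial lag $W y$ and of the static SAR equation for a single period. The four-unit contiguity matrix, the values standing in for $X\beta + a_i + \gamma_t + u$, and $\rho = 0.4$ are all made-up assumptions. The fixed-point iteration is just one simple way to solve $y = \rho W y + X\beta$ when $W$ is row-normalized and $|\rho| < 1$; it says nothing about how xsmle actually estimates the model.

```python
def row_normalize(w):
    """Scale each row of a square weights matrix so that it sums to one."""
    out = []
    for row in w:
        total = sum(row)
        out.append([x / total if total else 0.0 for x in row])
    return out


def spatial_lag(w, y):
    """Return W y: for each unit, the weighted average of its neighbours."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


def solve_sar(w, xb, rho, tol=1e-12, max_iter=10_000):
    """Solve y = rho * W y + xb by fixed-point iteration.

    Assumes W is row-normalized and |rho| < 1, so the map is a contraction.
    """
    y = list(xb)
    for _ in range(max_iter):
        y_new = [rho * lag + b for lag, b in zip(spatial_lag(w, y), xb)]
        if max(abs(a - b) for a, b in zip(y_new, y)) < tol:
            return y_new
        y = y_new
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical example: four units on a line, neighbours share a border.
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_normalize(contiguity)

# Stand-in for X*beta + a_i + gamma_t + u in one period (made-up numbers).
xb = [1.0, 2.0, 3.0, 4.0]

print(spatial_lag(W, xb))  # [2.0, 2.0, 3.0, 3.0]
y = solve_sar(W, xb, rho=0.4)
print(y)
```

Setting `rho=0` returns `xb` unchanged, which mirrors how the non-spatial model is nested in the SAR specification.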

The command was written together with Andrea Piano Mortari and Gordon Hughes.

You may install it by typing

net install xsmle, all from(http://www.econometrics.it/stata)

in your Stata command bar.

HTH,
Federico

## sfcross and sfpanel: stochastic frontier analysis using Stata

Two new Stata commands for the estimation and post-estimation of cross-sectional and panel data stochastic frontier models are now available: sfcross and sfpanel. sfcross extends the capabilities of the official frontier command by including additional models (Greene 2003; Wang 2002) and command functionality, such as the ability to handle complex survey data. Similarly, sfpanel allows the estimation of a much wider range of time-varying inefficiency models than the official xtfrontier command. In particular, when estimation is done with likelihood-based methods, the SF model is:

$y_{it} = \alpha + X_{it}\beta + v_{it} \pm u$

where $v_{it}$ is a normally distributed error term and $u$ is a one-sided strictly non-negative term representing inefficiency. The sign of the $u$ term is positive or negative depending on whether the frontier describes a cost or production function, respectively. Among the time-varying inefficiency models $(u=u_{it})$, sfpanel fits:

i) the true fixed-effects (TFE) and the true random-effects (TRE) models developed by Greene (2005), in which both time-invariant unmeasured heterogeneity $(\alpha=\alpha_i)$ and time-varying firm inefficiency are considered;

ii) the Battese and Coelli (1995) model, in which $u_{it}$ is obtained by truncation at zero of a normal distribution with mean $Z_{it} \delta$, where $Z_{it}$ is a set of covariates explaining the mean of inefficiency;

iii) the time decay model by Battese and Coelli (1992), in which $u_{it}=u_i B(t)$, and $B(t)=\{\exp[-\eta(t-T_i)]\}$. $u_i$ is assumed to be truncated-normally distributed with non-zero mean and constant variance, while $\eta$ governs the temporal pattern of inefficiency.

iv) the flexible parametric model by Kumbhakar (1990), in which $u_{it}=u_i B(t)$, and $B(t)=[1+\exp(bt+ct^2)]^{-1}$.
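The two $B(t)$ functions in iii) and iv) are easy to evaluate by hand. The following Python sketch, with made-up values for $u_i$, $\eta$, $b$ and $c$, shows how each one scales the firm-level term $u_i$ over time. It only illustrates the formulas above and is not how sfpanel computes anything.

```python
import math


def bc92_decay(t, T_i, eta):
    """Battese and Coelli (1992): B(t) = exp(-eta * (t - T_i))."""
    return math.exp(-eta * (t - T_i))


def kumbhakar90(t, b, c):
    """Kumbhakar (1990): B(t) = 1 / (1 + exp(b*t + c*t^2))."""
    return 1.0 / (1.0 + math.exp(b * t + c * t * t))


# Hypothetical firm observed for T_i = 5 periods with base inefficiency u_i.
u_i = 0.5
T_i = 5
periods = range(1, T_i + 1)

# With eta > 0, B(t) falls towards 1 at t = T_i, so inefficiency shrinks.
bc92_path = [u_i * bc92_decay(t, T_i, eta=0.1) for t in periods]

# Kumbhakar's B(t) always lies strictly between 0 and 1.
kumb_path = [u_i * kumbhakar90(t, b=-0.3, c=0.05) for t in periods]

for t, u_bc, u_k in zip(periods, bc92_path, kumb_path):
    print(t, round(u_bc, 4), round(u_k, 4))
```

In the time decay model the last period is the reference point, since $B(T_i)=1$ and therefore $u_{iT_i}=u_i$.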

Among the time-invariant inefficiency models $(u=u_i)$, sfpanel fits:

v) the Battese and Coelli (1988) model, in which $u_i$ is truncated-normally distributed with non-zero mean and constant variance;

vi) the Pitt and Lee (1981) model, in which $u_i$ is half-normally distributed with constant variance.
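As a rough illustration of the distributional assumptions in v) and vi) and of the $\pm u$ sign convention, the Python sketch below draws a half-normal and a truncated-normal inefficiency term and builds the composed error for a production and a cost frontier. The parameter values, the seed and the rejection sampler are assumptions chosen for readability; this is a simulation sketch, not the estimator used by sfpanel.

```python
import random


def draw_half_normal(rng, sigma_u):
    """Half-normal draw: absolute value of a N(0, sigma_u^2) variate."""
    return abs(rng.gauss(0.0, sigma_u))


def draw_truncated_normal(rng, mu, sigma_u):
    """N(mu, sigma_u^2) truncated at zero, by simple rejection sampling."""
    while True:
        draw = rng.gauss(mu, sigma_u)
        if draw >= 0.0:
            return draw


def composed_error(v, u, cost=False):
    """Return v + u for a cost frontier and v - u for a production frontier."""
    return v + u if cost else v - u


rng = random.Random(12345)  # fixed seed so the sketch is reproducible

# Time-invariant inefficiency for five hypothetical firms.
u_pl81 = [draw_half_normal(rng, 0.3) for _ in range(5)]            # as in vi)
u_bc88 = [draw_truncated_normal(rng, 0.5, 0.3) for _ in range(5)]  # as in v)

# One idiosyncratic shock, combined with the first firm's inefficiency.
v = rng.gauss(0.0, 0.1)
eps_production = composed_error(v, u_pl81[0])
eps_cost = composed_error(v, u_pl81[0], cost=True)
print(eps_production, eps_cost)
```

Because $u$ is non-negative, the production error never exceeds $v$ and the cost error never falls below it.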

When estimation is done with least squares methods, the SF production model is:

$y_{it} = \alpha + X_{it}\beta + v_{it}$

Among the time-varying inefficiency models $(\alpha=\alpha_{it})$, sfpanel fits:

vii) the Lee and Schmidt (1993) model, in which $\alpha_{it} = \theta_t \delta_i$ and $\theta_t$ are parameters to be estimated. This model is a special case of Kumbhakar (1990), in which $B(t)$ is represented by a set of dummy variables for time.

viii) the Cornwell et al. (1990) model, in which $\alpha_{it} = \delta_{i0} + \delta_{i1} t + \delta_{i2} t^2$.

Among the time-invariant inefficiency models $(\alpha=\alpha_i)$, sfpanel fits:

ix) the Schmidt and Sickles (1984) model in which $\alpha_i$ can be either fixed or random.
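In the least squares models the inefficiency is carried by the intercepts rather than by a separate $u$ term. The Python sketch below evaluates the quadratic intercept of viii) for three hypothetical firms and then measures each firm against the best one in the same period. That last step, $\max_j \alpha_{jt} - \alpha_{it}$, is the usual convention for a production frontier in this literature and is an assumption here, since the text above does not spell it out. All coefficient values are made up.

```python
def css_intercept(delta, t):
    """Cornwell et al. (1990): alpha_it = d0 + d1 * t + d2 * t^2."""
    d0, d1, d2 = delta
    return d0 + d1 * t + d2 * t * t


def relative_inefficiency(alphas):
    """Distance of each intercept from the largest one in the same period.

    Assumed convention for a production frontier: the best firm gets zero.
    """
    best = max(alphas)
    return [best - a for a in alphas]


# Hypothetical firm-specific coefficients (delta_i0, delta_i1, delta_i2).
deltas = {
    "A": (1.0, 0.10, 0.00),
    "B": (1.2, 0.00, 0.00),
    "C": (0.8, 0.30, -0.02),
}

for t in (1, 2, 3):
    alphas = [css_intercept(d, t) for d in deltas.values()]
    print(t, dict(zip(deltas, relative_inefficiency(alphas))))
```

Setting the second and third coefficients to zero for every firm gives time-invariant intercepts $\alpha_i$, as in ix).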

The two commands were written together with Silvio Daidone, Giuseppe Ilardi and Vincenzo Atella.

You may install them by typing

net install sfcross, all from(http://www.econometrics.it/stata)
net install sfpanel, all from(http://www.econometrics.it/stata)

in your Stata command bar.

An accompanying paper is available.

HTH,
Federico

## twopm: estimating two-part models using Stata

A new Stata command to estimate two-part models for mixed discrete-continuous outcomes is now available at SSC/econometrics.it.

In two-part models, a binary choice model is estimated for the probability of observing a zero versus a positive outcome. Then, conditional on a positive outcome, an appropriate regression model is estimated for the positive outcome.

twopm focuses on continuous outcomes modeled using regress or glm. When the outcome is a count variable, such models are known as hurdle models. Of special note is that twopm allows the user to leverage the capabilities of predict and margins to calculate predictions and marginal effects from the combined first- and second-part models.
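The combined prediction is the product of the two parts: $E[y|x] = \Pr(y>0|x) \cdot E[y|y>0,x]$. The Python sketch below works through that arithmetic for one possible pairing, a logit first part and a log-link second part, with made-up coefficients. The function names are hypothetical and the finite-difference marginal effect is only a numerical check of the product rule, not what margins does internally.

```python
import math


def logistic(z):
    """Logit inverse link."""
    return 1.0 / (1.0 + math.exp(-z))


def two_part_mean(x, gamma, beta):
    """E[y|x] = Pr(y>0|x) * E[y|y>0,x].

    Assumes a logit first part (coefficients gamma) and a log-link
    second part (coefficients beta); both are illustrative choices.
    """
    p_positive = logistic(sum(g * xi for g, xi in zip(gamma, x)))
    mean_if_positive = math.exp(sum(b * xi for b, xi in zip(beta, x)))
    return p_positive * mean_if_positive


def marginal_effect(x, k, gamma, beta, h=1e-6):
    """Central finite difference of the combined mean with respect to x[k]."""
    up = list(x)
    down = list(x)
    up[k] += h
    down[k] -= h
    return (two_part_mean(up, gamma, beta) - two_part_mean(down, gamma, beta)) / (2 * h)


# Hypothetical coefficients: constant first, then one covariate.
gamma = [0.0, 1.0]
beta = [0.0, 1.0]
x = [1.0, 0.0]

print(two_part_mean(x, gamma, beta))       # 0.5 = logistic(0) * exp(0)
print(marginal_effect(x, 1, gamma, beta))  # close to 0.75 by the product rule
```

The marginal effect mixes both parts: a covariate can change the expected outcome through the probability of a positive value, through the conditional mean, or through both.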

It was written together with Partha Deb.
You may install the command by typing

ssc install twopm

in your Stata command bar.

HTH,
Federico