R/spflow.R
spflow.Rd
We implement three different estimators of spatial econometric interaction models (Dargel 2021) that allow the user to estimate origin-destination flows with spatial autocorrelation.
By default the estimation will include spatial dependence in the dependent variable and the explanatory variables, which leads to the spatial Durbin model (SDM) (Anselin 1988).
Moreover, the model includes an additional set of parameters for intra-regional flows that start and end in the same geographic site (as proposed by LeSage and Pace (2009)).
Both default options can be deactivated via the estimation_control argument, which gives fine-grained control over the estimation.
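As a hedged illustration of the paragraph above, the sketch below builds a default and a reduced control list. The argument names use_intra and sdm_variables, and the value "none", are assumptions about the signature of spflow_control() that are not stated on this page; check ?spflow_control before relying on them. The data set multi_net_usa_ge and the pair id "ge_ge" are taken from the examples at the end of this page.

```r
library(spflow)

# Defaults: SDM specification with intra-regional parameters
default_control <- spflow_control()

# ASSUMPTION: `use_intra` and `sdm_variables` are the relevant argument names,
# and "none" is an accepted value; verify with ?spflow_control
reduced_control <- spflow_control(use_intra = FALSE, sdm_variables = "none")

# Estimate the model without the two default options
spflow(
  spflow_formula = y9 ~ . + P_(DISTANCE),
  spflow_networks = multi_net_usa_ge,
  id_net_pair = "ge_ge",
  estimation_control = reduced_control
)
```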
spflow(
spflow_formula,
spflow_networks,
id_net_pair = id(spflow_networks)[["pairs"]][[1]],
estimation_control = spflow_control()
)
spflow_formula
A formula specifying the spatial interaction model (for details see section Formula interface)
spflow_networks
A spflow_network_multi() object that contains information on the origins, the destinations and their neighborhood structure
id_net_pair
A character indicating the id of a spflow_network_pair() (only relevant if the spflow_network_multi() contains multiple spflow_network_pair-objects: defaults to the first of them)
estimation_control
A list generated by spflow_control() that provides fine-grained control over the estimation procedure
An S4 object of class spflow_model-class()
Our estimation procedures make use of the matrix formulation introduced by LeSage and Pace (2008) and further developed by Dargel (2021) to reduce the computational effort and memory requirements. Further generalizations to deal with non-Cartesian and rectangular flows are developed by Dargel and Thomas-Agnan (2023).
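To give an idea of why a matrix formulation saves memory, the base R sketch below checks the standard identity vec(A Y B) = (B' %x% A) vec(Y) for the three spatial lags of a flow matrix. It is an illustration under assumptions, not the package's internal code: the names W, Y and vec() are made up for the example, and labeling the lags as destination, origin and origin-to-destination dependence assumes a flow matrix whose rows index destinations and whose columns index origins.

```r
set.seed(42)
n <- 4

# Hypothetical row-normalized neighborhood matrix and n x n flow matrix
W <- matrix(runif(n * n), n, n)
diag(W) <- 0
W <- W / rowSums(W)
Y <- matrix(rnorm(n * n), n, n)

vec <- function(M) as.vector(M)  # stack the columns of a matrix
I_n <- diag(n)

# Vector form: needs (n^2 x n^2) Kronecker products
lag_d_long <- as.vector((I_n %x% W) %*% vec(Y))
lag_o_long <- as.vector((W %x% I_n) %*% vec(Y))
lag_w_long <- as.vector((W %x% W) %*% vec(Y))

# Matrix form: only (n x n) products are required
lag_d_mat <- vec(W %*% Y)
lag_o_mat <- vec(Y %*% t(W))
lag_w_mat <- vec(W %*% Y %*% t(W))

# Both forms yield the same spatial lags
all.equal(lag_d_long, lag_d_mat)
all.equal(lag_o_long, lag_o_mat)
all.equal(lag_w_long, lag_w_mat)
```

The vector form stores matrices with n^4 entries, while the matrix form never goes beyond n^2 entries, which is the kind of saving the matrix formulation exploits.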
The estimation procedure can be adjusted through the estimation_method argument in spflow_control().
Maximum likelihood estimation is the default estimation procedure. The matrix form estimation in the framework of this model was first developed by LeSage and Pace (2008) and then improved by Dargel (2021) .
The S2SLS estimator is an adaptation of the one proposed by Kelejian and Prucha (1998) to the case of origin-destination flows, with up to three neighborhood matrices (Dargel 2021).
A similar estimation is done by Tamesue and Tsutsumi (2016).
The user can activate the S2SLS estimation via the estimation_control argument using the input spflow_control(estimation_method = "s2sls").
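A short usage sketch for the S2SLS option follows. It reuses the data set multi_net_usa_ge, the pair id "ge_ge" and the formula from the examples at the end of this page; the object name model_s2sls is only illustrative.

```r
library(spflow)

# Switch the estimator from the default (MLE) to S2SLS
model_s2sls <- spflow(
  spflow_formula = y9 ~ . + P_(DISTANCE),
  spflow_networks = multi_net_usa_ge,
  id_net_pair = "ge_ge",
  estimation_control = spflow_control(estimation_method = "s2sls")
)

# Printing the returned S4 object shows the estimation summary
model_s2sls
```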
The MCMC estimator is based on the ideas of LeSage and Pace (2009) and incorporates the improvements proposed in Dargel (2021).
The estimation is based on a tuned Metropolis-Hastings sampler for the auto-regressive parameters, and for the remaining parameters it uses Gibbs sampling.
The routine uses 5500 iterations of the sampling procedure and considers the first 2500 as burn-in period.
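The sampling scheme described above can be sketched on a toy problem in base R. The code below is not the package's implementation: it assumes a single neighborhood matrix W, the model y = rho * W y + X beta + e with error variance fixed at 1, flat priors, and a fixed (untuned) random-walk step size. All object names are hypothetical. It only shows how a Metropolis-Hastings step for the auto-regressive parameter alternates with a Gibbs step for the remaining parameters, and how the burn-in draws are discarded.

```r
set.seed(1)
n <- 50

# Hypothetical row-normalized "ring" neighborhood: two neighbors per unit
W <- matrix(0, n, n)
for (i in seq_len(n)) {
  W[i, i %% n + 1] <- 0.5
  W[i, (i - 2) %% n + 1] <- 0.5
}

# Simulate data from the toy model with rho = 0.4 and beta = (1, 2)
X <- cbind(1, rnorm(n))
y <- solve(diag(n) - 0.4 * W, X %*% c(1, 2) + rnorm(n))
Wy <- W %*% y

XtX_inv <- solve(crossprod(X))
chol_XtX_inv <- chol(XtX_inv)  # upper triangular R with t(R) %*% R = XtX_inv

# Log conditional posterior of rho, up to a constant (flat prior on (-1, 1))
log_post_rho <- function(rho, beta) {
  if (abs(rho) >= 1) return(-Inf)
  log_det <- as.numeric(determinant(diag(n) - rho * W, logarithm = TRUE)$modulus)
  log_det - 0.5 * sum((y - rho * Wy - X %*% beta)^2)
}

n_iter <- 5500  # total iterations, as in the text above
n_burn <- 2500  # burn-in period, as in the text above
step <- 0.1     # ASSUMPTION: fixed proposal sd, no tuning
draws <- matrix(NA_real_, n_iter, 3,
                dimnames = list(NULL, c("rho", "beta0", "beta1")))
rho <- 0

for (s in seq_len(n_iter)) {
  # Gibbs step: beta given rho is Gaussian
  beta_mean <- XtX_inv %*% crossprod(X, y - rho * Wy)
  beta <- as.numeric(beta_mean + t(chol_XtX_inv) %*% rnorm(2))

  # Metropolis-Hastings step: random-walk proposal for rho
  rho_prop <- rho + step * rnorm(1)
  log_ratio <- log_post_rho(rho_prop, beta) - log_post_rho(rho, beta)
  if (log(runif(1)) < log_ratio) rho <- rho_prop

  draws[s, ] <- c(rho, beta)
}

# Drop the burn-in draws and summarize the remaining ones
kept <- draws[-seq_len(n_burn), , drop = FALSE]
colMeans(kept)
```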
The user can activate the MCMC estimation via the estimation_control argument using the input spflow_control(estimation_method = "mcmc").
The function offers a formula interface adapted to spatial interaction
models, which has the following structure:
Y ~ O_(X1) + D_(X2) + I_(X3) + P_(X4)
This structure reflects the different data sources involved in such a model.
On the left hand side there is the dependent variable Y, which corresponds to the vector of flows.
On the right hand side we have all the explanatory variables.
The functions O_(...) and D_(...) indicate which variables are used as characteristics of the origins and destinations respectively.
Similarly, I_(...) indicates variables that should be used for the intra-regional parameters.
Finally, P_(...) declares which variables describe origin-destination pairs, which most frequently will include a measure of distance.
All the declared variables must be available in the provided spflow_network_multi() object, which gathers information on the origins and destinations (inside spflow_network() objects), as well as the information on the origin-destination pairs (inside a spflow_network_pair() object).
Using the short notation Y ~ . is possible and will be interpreted as usual, in the sense that we use all variables that are available for each data source.
Mixed formulas, such as Y ~ . + P_(log(X4) + 1), are also possible.
When the dot shortcut is combined with explicit declarations, it will only be used for the non-declared data sources.
The following examples illustrate this behavior.
Consider the case where we have the flow vector Y and the distance vector DIST available as information on origin-destination pairs.
In addition we have the explanatory variables X1, X2 and X3, which describe the regions that are at the same time origins and destinations of the flows.
For this example the four formulas below are equivalent and make use of all explanatory variables X1, X2 and X3 for origins, destinations and intra-regional observations.
Y ~ .
Y ~ . + P_(DIST)
Y ~ X1 + X2 + X3 + P_(DIST)
Y ~ D_(X1 + X2 + X3) + O_(X1 + X2 + X3) + I_(X1 + X2 + X3) + P_(DIST)
Now if we only want to use X1 for the intra-regional model we can do the following (again all four options below are equivalent).
Y ~ . + I_(X1)
Y ~ . + I_(X1) + P_(DIST)
Y ~ X1 + X2 + X3 + I_(X1) + P_(DIST)
Y ~ D_(X1 + X2 + X3) + O_(X1 + X2 + X3) + I_(X1) + P_(DIST)
This behavior is easily combined with transformations of variables, as the example below illustrates.
log(Y + 1) ~ sqrt(X1) + X2 + P_(log(DIST + 1))
Anselin L (1988).
Spatial Econometrics: Methods and Models.
Springer Netherlands.
2020-03-24.
Dargel L (2021).
“Revisiting estimation methods for spatial econometric interaction models.”
Journal of Spatial Econometrics, 10.
https://doi.org/10.1007/s43071-021-00016-1.
Dargel L, Thomas-Agnan C (2023).
“Efficient Estimation of Spatial Econometric Interaction Models for Sparse OD Matrices.”
TSE Working Paper, n. 23-1409, February 2023, https://www.tse-fr.eu/publications/generalized-framework-estimating-spatial-econometric-interaction-models.
Kelejian HH, Prucha IR (1998).
“A Generalized Spatial Two-Stage Least Squares Procedure for Estimating a Spatial Autoregressive Model with Autoregressive Disturbances.”
The Journal of Real Estate Finance and Economics, 99--121.
https://doi.org/10.1023/A:1007707430416.
LeSage JP, Pace RK (2008).
“Spatial Econometric Modeling of Origin-Destination Flows.”
Journal of Regional Science, 941--967.
https://doi.org/10.1111/j.1467-9787.2008.00573.x.
LeSage JP, Pace RK (2009).
Introduction to Spatial Econometrics.
CRC Press.
Tamesue K, Tsutsumi M (2016).
“Dealing with Intraregional Flows in Spatial Econometric Gravity Models.”
In Patuelli R, Arbia G (eds.), Spatial Econometric Interaction Modelling, chapter 6, 105--119.
Springer International Publishing.
https://doi.org/10.1007/978-3-319-30196-9_6.
# Estimate flows between the states of Germany
spflow(spflow_formula = y9 ~ . + P_(DISTANCE),
       spflow_networks = multi_net_usa_ge,
       id_net_pair = "ge_ge")
#> --------------------------------------------------
#> Spatial interaction model estimated by: MLE
#> Spatial correlation structure: SDM (model_9)
#> Dependent variable: y9
#>
#> --------------------------------------------------
#> Coefficients:
#>                est    sd  t.stat p.val
#> rho_d        0.497 0.030  16.499     0
#> rho_o        0.333 0.037   9.001     0
#> rho_w       -0.227 0.044  -5.117     0
#> (Intercept) 10.198 2.161   4.719     0
#> (Intra)      9.871 1.531   6.445     0
#> D_X          0.983 0.069  14.321     0
#> D_X.lag1     0.509 0.115   4.437     0
#> O_X         -0.759 0.038 -19.917     0
#> O_X.lag1    -0.367 0.093  -3.965     0
#> I_X          2.035 0.083  24.650     0
#> P_DISTANCE  -2.622 0.384  -6.829     0
#>
#> --------------------------------------------------
#> R2_corr: 0.9921423
#> Observations: 256
#> Model coherence: Validated
# Same as above with explicit declaration of variables...
# ... X is the only variable available
# ... it is used for origins, destinations and intra-state flows
spflow(spflow_formula = y9 ~ X + P_(DISTANCE),
       spflow_networks = multi_net_usa_ge,
       id_net_pair = "ge_ge")
#> --------------------------------------------------
#> Spatial interaction model estimated by: MLE
#> Spatial correlation structure: SDM (model_9)
#> Dependent variable: y9
#>
#> --------------------------------------------------
#> Coefficients:
#>                est    sd  t.stat p.val
#> rho_d        0.497 0.030  16.499     0
#> rho_o        0.333 0.037   9.001     0
#> rho_w       -0.227 0.044  -5.117     0
#> (Intercept) 10.198 2.161   4.719     0
#> (Intra)      9.871 1.531   6.445     0
#> D_X          0.983 0.069  14.321     0
#> D_X.lag1     0.509 0.115   4.437     0
#> O_X         -0.759 0.038 -19.917     0
#> O_X.lag1    -0.367 0.093  -3.965     0
#> I_X          2.035 0.083  24.650     0
#> P_DISTANCE  -2.622 0.384  -6.829     0
#>
#> --------------------------------------------------
#> R2_corr: 0.9921423
#> Observations: 256
#> Model coherence: Validated
# Same as above
spflow(spflow_formula = y9 ~ O_(.) + D_(.) + I_(.) + P_(DISTANCE),
       spflow_networks = multi_net_usa_ge,
       id_net_pair = "ge_ge")
#> --------------------------------------------------
#> Spatial interaction model estimated by: MLE
#> Spatial correlation structure: SDM (model_9)
#> Dependent variable: y9
#>
#> --------------------------------------------------
#> Coefficients:
#>                est    sd  t.stat p.val
#> rho_d        0.497 0.030  16.499     0
#> rho_o        0.333 0.037   9.001     0
#> rho_w       -0.227 0.044  -5.117     0
#> (Intercept) 10.198 2.161   4.719     0
#> (Intra)      9.871 1.531   6.445     0
#> D_X          0.983 0.069  14.321     0
#> D_X.lag1     0.509 0.115   4.437     0
#> O_X         -0.759 0.038 -19.917     0
#> O_X.lag1    -0.367 0.093  -3.965     0
#> I_X          2.035 0.083  24.650     0
#> P_DISTANCE  -2.622 0.384  -6.829     0
#>
#> --------------------------------------------------
#> R2_corr: 0.9921423
#> Observations: 256
#> Model coherence: Validated
# Same as above
spflow(spflow_formula = y9 ~ O_(X) + D_(X) + I_(X) + P_(DISTANCE),
       spflow_networks = multi_net_usa_ge,
       id_net_pair = "ge_ge")
#> --------------------------------------------------
#> Spatial interaction model estimated by: MLE
#> Spatial correlation structure: SDM (model_9)
#> Dependent variable: y9
#>
#> --------------------------------------------------
#> Coefficients:
#>                est    sd  t.stat p.val
#> rho_d        0.497 0.030  16.499     0
#> rho_o        0.333 0.037   9.001     0
#> rho_w       -0.227 0.044  -5.117     0
#> (Intercept) 10.198 2.161   4.719     0
#> (Intra)      9.871 1.531   6.446     0
#> D_X          0.983 0.069  14.321     0
#> D_X.lag1     0.509 0.115   4.437     0
#> O_X         -0.759 0.038 -19.917     0
#> O_X.lag1    -0.367 0.093  -3.964     0
#> I_X          2.035 0.083  24.650     0
#> P_DISTANCE  -2.622 0.384  -6.829     0
#>
#> --------------------------------------------------
#> R2_corr: 0.9921423
#> Observations: 256
#> Model coherence: Validated