Spatial clusterwise regression by an iterated spatially weighted regression algorithm
Description
This function implements a spatial clusterwise regression based on the procedure suggested by Andreano et al. (2017) and Bille' et al. (2017).
Usage
Awsreg(data, coly, colx, kernel, kernel2, coords, bw, tau, niter, conv, eta, numout, sout)
Arguments
data
A data.frame.
coly
The dependent variable in the c("y_ols") form.
colx
The covariates in the c("x1","x2") form.
kernel
Kernel function used to compute the weights from the distances between units (default is "bisquare"; other values: "exponential", "gaussian", "tricube").
kernel2
Kernel function used to compute the weights from the distances between units in the second step (default is "gaussian"; other value: "exponential").
coords
The coordinates in terms of longitude and latitude.
bw
The bandwidth parameter of the initial weights.
tau
The confidence test parameter of the difference between regression parameters.
niter
The maximum number of iterations.
conv
The smallest accepted difference between the weights in two successive iterations.
eta
The weight given to the previous iteration's weights in the moving average that updates the weights.
numout
The minimum number of areal units accepted for each cluster.
sout
Threshold below which a weight is treated as equal to zero. Used mainly to control clusters consisting of too few areal units.
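The roles of kernel, eta, conv and sout can be illustrated with a minimal sketch. It is not the package's internal code: the kernel formulas are the forms commonly used in geographically weighted regression, and the update rule, the convergence check and the helper names (bisquare_w, gaussian_w, w_prev, w_cand) are assumptions made for illustration.

```r
# Illustrative sketch only: not the internal code of Awsreg.
# Bisquare and Gaussian kernel weights as functions of distance d and
# bandwidth bw (assumed forms, as commonly defined for GWR).
bisquare_w <- function(d, bw) ifelse(d < bw, (1 - (d / bw)^2)^2, 0)
gaussian_w <- function(d, bw) exp(-0.5 * (d / bw)^2)

d <- c(0, 1, 2, 4)                # toy distances from a focal unit
w_prev <- bisquare_w(d, bw = 3)   # weights from the previous iteration
w_cand <- gaussian_w(d, bw = 3)   # hypothetical candidate weights

# Assumed moving-average update: eta is the share given to the previous weights.
eta <- 0.5
w_new <- eta * w_prev + (1 - eta) * w_cand

# Assumed role of sout: weights below the threshold are set to zero.
sout <- 1e-05
w_new[w_new < sout] <- 0

# Assumed role of conv: stop when the weights change by less than conv.
conv <- 0.001
converged <- max(abs(w_new - w_prev)) < conv
w_new
converged
```

With eta close to 1 the weights change slowly between iterations, while eta close to 0 lets each iteration replace the weights almost entirely.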
Details
The author thanks A.G. Bille' for her contribution to revising the original code.
Value
An object of class Awsreg with:
groups
Estimated clusters.
Author(s)
R. Benedetti
References
Andreano, M.S., Benedetti, R., and Postiglione, P. (2017). "Spatial regimes in regional European growth: an iterated spatially weighted regression approach", Quality & Quantity. 51, 6, 2665-2684.
Bille', A.G., Benedetti, R., and Postiglione, P. (2017). "A two-step approach to account for unobserved spatial heterogeneity", Spatial Economic Analysis, 12, 4, 452-471.
Examples
data(SimData)
SimData = SimData[1:50,]
coords = cbind(SimData$long, SimData$lat)
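#######################
# gw.dist and bw.gwr come from GWmodel; SpatialPointsDataFrame comes from sp
library(GWmodel)
library(sp)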
dmat<-gw.dist(coords,focus=0,p=2,theta=0,longlat=FALSE)
bw<-bw.gwr(y_ols~A+L+K,
data=SpatialPointsDataFrame(coords,SimData),
approach="AIC",kernel="bisquare",
adaptive=TRUE,p=2,theta=0,longlat=FALSE,dMat=dmat)
#######################
aws<-Awsreg(data=SimData,
coly=c("y_ols"),
colx=c("A","L","K"),
kernel="bisquare",
kernel2="gaussian",
coords=coords,
bw=bw,
tau=0.001,
niter=200,
conv=0.001,
eta=0.5,
numout=15,
sout=1e-05)
SimData$regimes = aws$groups
plot(lat~long,SimData,col=regimes,pch=16)
Spatial clusterwise regression by a constrained version of the Simulated Annealing
Description
This function implements a spatial clusterwise regression based on the procedure suggested by Postiglione et al. (2013).
Usage
Sareg(data, coly, colx, cont, intemp, rho, niter, subit, ncl, bcont)
Arguments
data
A data.frame.
coly
The dependent variable in the c("y_ols") form.
colx
The covariates in the c("x1","x2") form.
cont
The contiguity matrix.
intemp
The initial temperature.
rho
The temperature decay rate parameter.
niter
The maximum number of iterations.
subit
The number of sub-iterations for each iteration.
ncl
The number of clusters.
bcont
A parameter that regulates the penalty of simulated annealing in non-contiguous configurations of the clusters.
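How intemp, rho and niter interact can be sketched as follows. Geometric cooling and the Metropolis-type acceptance rule are standard choices in simulated annealing and are assumed here; the source does not confirm that Sareg uses exactly this schedule, and accept_prob is a hypothetical helper.

```r
# Illustrative sketch only: geometric cooling is an assumption, not
# necessarily the schedule implemented by Sareg.
intemp <- 0.5   # initial temperature
rho    <- 0.96  # temperature decay rate
niter  <- 30    # number of iterations

# Temperature at each iteration under geometric decay.
temp <- intemp * rho^(0:(niter - 1))

# Metropolis-type acceptance probability for a move that worsens the
# objective by delta >= 0 (hypothetical helper).
accept_prob <- function(delta, temp) pmin(1, exp(-delta / temp))

range(temp)
accept_prob(delta = 0.1, temp = temp[c(1, niter)])
```

Under these assumptions, a rho closer to 1 cools more slowly, so worsening moves stay acceptable for more iterations.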
Value
An object of class Sareg with:
groups
Estimated clusters.
Author(s)
R. Benedetti
References
Postiglione, P., Benedetti, R., and Andreano, M.S. (2013). "Using Constrained Optimization for the Identification of Convergence Clubs", Computational Economics, 42, 151-174.
Examples
data(SimData)
SimData = SimData[1:50,]
coords = cbind(SimData$long, SimData$lat)
#######################
# gw.dist comes from GWmodel
library(GWmodel)
dmat <-gw.dist(coords,focus=0,p=2,theta=0,longlat=FALSE)
W <- matrix(0,nrow(dmat),ncol(dmat))
W[dmat < 0.2] <- 1
diag(W) <- 0
#######################
sa <- Sareg(data=SimData,
coly = c("y_ols"),
colx = c("A"),
cont = W,
intemp=0.5,
rho=0.96,
niter=30,
subit=3,
ncl=2,
bcont=-4)
SimData$regimes = sa$groups
plot(lat~long,SimData,col=regimes,pch=16)
Simulated data for estimating spatial regimes.
Description
Simulated production-function-like data for estimating spatial regimes. The data were generated for the paper: F. Vidoli, G. Pignataro, R. Benedetti, F. Pammolli, "Spatially constrained cluster-wise regression: optimal territorial areas in Italian health care", forthcoming.
Usage
data(SimData)
Format
SimData is a simulated dataset with 500 observations and 7 variables.
- long
Longitude
- lat
Latitude
- A
Land input
- L
Labour input
- K
Capital input
- clu
Real regime
- y_ols
Production output
500 units (100 for each of the 5 regimes) are generated. For each unit, the longitude and latitude coordinates are drawn at random from two Uniform distributions, U(0,50) and U(-70,20) respectively. The matrix of covariates comprises the constant and the variables A, L and K, each drawn from U(1.5,4). Finally, each regime is given its own function, linear in form and differing from the others in its coefficients. In particular, we set 5 different vectors of parameters (including the intercept): beta1 = (13,0.5,0.3,0.2), beta2 = (11,0.8,0.1,0.1), beta3 = (9,0.3,0.2,0.5), beta4 = (7,0.4,0.3,0.3) and beta5 = (5,0.2,0.6,0.2), with a normally distributed error term drawn from N(0,1).
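The data-generating process described above can be sketched as follows. This is not the script used to build SimData: the seed is arbitrary, and the assignment of regimes to units in consecutive blocks of 100 is an assumption, since the text does not say how regimes are mapped to locations.

```r
# Illustrative sketch only: not the script that produced SimData.
set.seed(1)                      # arbitrary seed, for reproducibility only
n_reg <- 5
n_per <- 100
n <- n_reg * n_per

sim <- data.frame(
  long = runif(n, 0, 50),        # longitude from U(0,50)
  lat  = runif(n, -70, 20),      # latitude from U(-70,20)
  A    = runif(n, 1.5, 4),       # covariates from U(1.5,4)
  L    = runif(n, 1.5, 4),
  K    = runif(n, 1.5, 4),
  clu  = rep(seq_len(n_reg), each = n_per)  # assumed block assignment of regimes
)

# One row of coefficients per regime: intercept, A, L, K.
beta <- rbind(c(13, 0.5, 0.3, 0.2),
              c(11, 0.8, 0.1, 0.1),
              c( 9, 0.3, 0.2, 0.5),
              c( 7, 0.4, 0.3, 0.3),
              c( 5, 0.2, 0.6, 0.2))

# Linear function with regime-specific coefficients plus N(0,1) noise.
X <- cbind(1, sim$A, sim$L, sim$K)
sim$y_ols <- rowSums(X * beta[sim$clu, ]) + rnorm(n)

str(sim)
```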
Author(s)
Vidoli F.
References
F. Vidoli, G. Pignataro and R. Benedetti "Identification of spatial regimes of the production function of Italian hospitals through spatially constrained cluster-wise regression", Socio-Economic Planning Sciences (in press) https://doi.org/10.1016/j.seps.2022.101223
Examples
data(SimData)
Spatial constrained clusterwise regression by Spatial 'K'luster Analysis by Tree Edge Removal
Description
This function implements a spatial constrained clusterwise regression based on the Skater procedure by Assuncao et al. (2002).
Usage
SkaterF(edges, data, coly, colx, ncuts, crit, method=1, ind_col, lat, long, tau.ch)
Arguments
edges
A matrix with 2 columns in which each row is an edge.
data
A data.frame with the information on the nodes.
coly
The dependent variable in the c("y_ols") form.
colx
The covariates in the c("x1","x2") form.
ncuts
The number of cuts.
crit
A scalar or two-dimensional vector with criteria for groups, for example limits on group size or on population size. If a scalar, it is the minimum criterion for groups.
method
1 (default) for OLS, 2 for quantile regression, 3 for logit.
ind_col
Parameter still not used in this version.
lat
Parameter still not used in this version.
long
Parameter still not used in this version.
tau.ch
Chosen quantile (for method = 2).
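The relation between edges, ncuts and the resulting groups can be shown on a toy tree. The sketch below is not SkaterF's algorithm: it only illustrates that removing one edge from a tree splits it into two connected groups, so that ncuts removals give ncuts + 1 groups. The helper components and the choice of which edge to cut are hypothetical.

```r
# Illustrative sketch only: shows tree edge removal, not SkaterF itself.
# Toy tree on 6 nodes: each row of the two-column matrix is an edge.
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 5), c(5, 6))

# Hypothetical helper: label the connected components of a graph with
# n nodes and the given edge matrix.
components <- function(n, edges) {
  lab <- seq_len(n)
  for (i in seq_len(nrow(edges))) {
    a <- lab[edges[i, 1]]
    b <- lab[edges[i, 2]]
    if (a != b) lab[lab == b] <- a   # merge the two components
  }
  match(lab, unique(lab))            # relabel groups as 1, 2, ...
}

components(6, edges)                 # the whole tree is one group

edges_cut <- edges[-3, , drop = FALSE]   # remove the edge between nodes 3 and 4
groups <- components(6, edges_cut)       # one cut gives two groups
table(groups)                            # group sizes, to compare with crit
```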
Details
The author thanks Renato M. Assuncao and Elias T. Krainski for their original code (skater, library spdep).
Value
An object of class skaterF with:
groups
A vector with length equal to the number of nodes. Each position identifies the group of the corresponding node.
edges.groups
A list with length equal to the number of groups, in which each element is a set of edges.
not.prune
A vector identifying the groups that are not candidates for partitioning.
candidates
A vector identifying the groups that are candidates for partitioning.
ssto
The total dissimilarity in each step of edge removal.
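Once a groups vector is available, regime-specific coefficients can be inspected by fitting one regression per group. The sketch below uses toy data with two known regimes in place of a real SkaterF result; the data frame toy, its coefficients and the single covariate are assumptions chosen for illustration.

```r
# Illustrative sketch only: toy data stand in for a real clustering result.
set.seed(2)                              # arbitrary seed
toy <- data.frame(A = runif(60, 1.5, 4),
                  regimes = rep(1:2, each = 30))   # stands in for sk$groups
toy$y_ols <- ifelse(toy$regimes == 1,
                    13 + 0.5 * toy$A,
                    5 + 0.2 * toy$A) + rnorm(60, sd = 0.1)

# One OLS fit per estimated regime.
fits <- lapply(split(toy, toy$regimes),
               function(d) coef(lm(y_ols ~ A, data = d)))
coefs <- do.call(rbind, fits)
coefs
```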
Author(s)
F. Vidoli
References
For method = 1: F. Vidoli, G. Pignataro, and R. Benedetti (2022). "Identification of spatial regimes of the production function of Italian hospitals through spatially constrained cluster-wise regression", Socio-Economic Planning Sciences, 101223, https://doi.org/10.1016/j.seps.2022.101223
For method = 2: F. Vidoli, A. Sacchi, and E. Sanchez Carrera (2025). "Spatial regimes in heterogeneous territories: The efficiency of local public spending", Economic Modelling, https://doi.org/10.1016/j.econmod.2025.107139
Examples
data(SimData)
coords = cbind(SimData$long, SimData$lat)
#######################
# tri2nb, nbcosts, nb2listw and mstree come from spdep
library(spdep)
neighbours = tri2nb(coords, row.names = NULL)
bh.nb <- neighbours
lcosts <- nbcosts(bh.nb, SimData)
nb <- nb2listw(bh.nb, lcosts, style="B")
mst.bh <- mstree(nb,5)
edges1 = mst.bh[,1:2]
#######################
ncuts1 = 4
crit1 = 10
coly1 = c("y_ols")
colx1 = c("A","L","K")
# OLS
sk = SkaterF(edges = edges1,
data= SimData,
coly = coly1,
colx= colx1,
ncuts=ncuts1,
crit=crit1,
method=1)
SimData$regimes = sk$groups
# plot(lat~long,SimData,col=regimes,pch=16)
## quantile 0.8
# sk2 = SkaterF(edges = edges1,
# data= SimData,
# coly = coly1,
# colx= colx1,
# ncuts=ncuts1,
# crit=crit1,
# method=2,tau.ch=0.8)
#
# SimData$regimes_q = sk2$groups
# plot(lat~long,SimData,col=regimes_q,pch=16)