Space–Time Conditional Autoregressive Modeling to Estimate Neighborhood-Level Risks for Dengue Fever in Cali, Colombia

Abstract. Vector-borne diseases affect more than 1 billion people a year worldwide, causing more than 1 million deaths, and cost hundreds of billions of dollars in societal costs. Mosquitoes are the most common vectors responsible for transmitting a variety of arboviruses. Dengue fever (DENF) has been responsible for nearly 400 million infections annually. Dengue fever is primarily transmitted by female Aedes aegypti and Aedes albopictus mosquitoes. Because both Aedes species are peri-domestic and container-breeding mosquitoes, dengue surveillance should begin at the local level—where a variety of local factors may increase the risk of transmission. Dengue has been endemic in Colombia for decades and is notably hyperendemic in the city of Cali. For this study, we use weekly cases of DENF in Cali, Colombia, from 2015 to 2016 and develop space–time conditional autoregressive models to quantify how DENF risk is influenced by socioeconomic, environmental, and accessibility risk factors, and lagged weather variables. Our models identify high-risk neighborhoods for DENF throughout Cali. Statistical inference is drawn under Bayesian paradigm using Markov chain Monte Carlo techniques. The results provide detailed insight about the spatial heterogeneity of DENF risk and the associated risk factors (such as weather, proximity to Aedes habitats, and socioeconomic classification) at a fine level, informing public health officials to motivate at-risk neighborhoods to take an active role in vector surveillance and control, and improving educational and surveillance resources throughout the city of Cali.


INTRODUCTION
Vector-borne diseases (VBDs), more specifically mosquitoborne arboviruses, are responsible for 1 billion infectious disease cases each year, globally. 1,2 Mosquitoes transmit a variety of arboviruses and are the most common vectors. Dengue fever (DENF) is a mosquito-borne disease that is responsible for most of the global burden of arboviruses. 3 More than 40% of humans are at risk of transmission, with incidence rising 30-fold in the last 50 years, and it is estimated that there are approximately 390 million DENF infections annually. 4 Dengue fever is primarily transmitted by the Aedes aegypti and Aedes albopictus mosquitoes. 5,6 Both species are container-breeding mosquitoes that have become prolific in urban areas because of the widespread availability of breeding habitats. 7 Dengue is a flavivirus that causes DENF, and there are four serotypes that follow the human cycle. 8 The incubation period ranges from 3 to 14 days after being bit by an infected mosquito, and symptoms can last from 2 to 7 days 9 ; however, approximately 80% of infected individuals are asymptomatic. Infection from one serotype will result in lifelong immunity to that serotype; however, secondary infection with another serotype can lead to severe forms of DENF, 10 such as dengue hemorrhagic fever (DHF) and dengue shock syndrome (DSS). Dengue hemorrhagic fever and DSS primarily not only affect pediatric patients but also have been found among adults (especially the elderly); mortality from dengue is highest among children and those who have experienced DSS. 11 It is critical to implement surveillance strategies that can improve the understanding of DENF transmission. Improving DENF surveillance can facilitate the timely reporting of disease cases, reduce underreporting, inform policy-makers, increase disease awareness, and define funding and research priorities, 12 ultimately reducing the economic and public health burdens in at-risk locations around the world. 13 Approaches and advances in geographic information science and spatial epidemiology play critical roles in DENF surveillance-such as tracking diffusion and cyclic patterns, detecting clusters, mapping disease rates and risk, 14 and understanding the place-based determinants of disease transmission. 15 Dengue fever risks and rates will vary by place, and covariate data are needed to identify significant variables responsible for observable spatial patterns. 16 Therefore, it is critical to examine the social, economic, environmental, biological, and institutional factors that may affect DENF prevalence in a particular area. Urban regions are highly complex, and neighborhoods are the scale that public health departments most effectively operate. 17 Therefore, more small-area studies in spatial epidemiology are required to effectively uncover the spatial and temporal heterogeneity of DENF rates across urban landscapes at these fine levels of granularity. For example, education, income, age, access to care, and quality of prevention strategies are known to strongly influence an individual's susceptibility to VBDs. 18,19 Likewise, the dynamics of how temperature, precipitation, and humidity affect vector abundance and DENF transmission are critical to developing and implementing effective sub-seasonal risk model 20 and long-term mitigation in response to climate change. 21 Conditional autoregressive (CAR) models can be used to examine how DENF risk is influenced by socioeconomic, environmental, and accessibility risk factors, and lagged weather variables. For example, a particular location may be influenced by DENF rates and a variety of explanatory variables (e.g., socioeconomic status and proximity to Aedes habitats) contained in surrounding locations (spatial spillover/diffusion effects). Conditional autoregressive models can be fitted to data under Bayesian paradigm (i.e., relying on prior beliefs/ borrowing information to inform future estimations) using Bayesian hierarchical models (BHMs), which are widely used techniques in geography and public health to model spatial and spatiotemporal data. 22 In short, BHMs can model complicated spatial and space-time processes by conditionally modeling the variations in data, the process, and unknown parameters. 23 The temporal extension, space-time (ST)-CAR, can estimate the value of a variable (e.g., disease rates) at a particular location and time, which will be related to current and past values of the surrounding locations and time periods, essentially testing for spatiotemporal dependence. ST-CAR models have been used to study the effect of air pollution on human health, 24 substance abuse, and its relationship with child abuse, 25 influenza, 26 and DENF. [27][28][29] More research is necessary to examine the local variations in DENF transmission dynamics at very fine spatial and temporal scales. Delmelle et al. 30 used a geographically weighted regression (GWR 31 ) model which identified six significant socioeconomic and environmental independent variables (including proximity to tire shops and population density) of DENF rates in Cali, Colombia, at the neighborhood level. However, the explanatory power of GWR and its temporal extension-GTWR 32 -is limited. Both CAR and ST-CAR models produce model-based estimates and inference derived from varying effects via spatial random fields-for example, borrowing strength from neighborhood spatial and temporal proximity, whereas GWR models allow the covariates to vary in space (and time in the case of GTWR), but inference is ad hoc. In other words, ST-CAR models can estimate spatially and temporally varying associations between dependent (e.g., disease rates) and independent variables based on locally weighted regressions in both geographic and attribute spaces, whereas GTWR can only produce local FIGURE 1. Neighborhoods in Cali, Colombia, and their ranking by socioeconomic strata (1)(2)(3)(4)(5)(6). This figure appears in color at www.ajtmh.org. SPACE-TIME MODELING OF NEIGHBORHOOD-LEVEL RISKS FOR DENGUE estimates in geographic space. It is therefore worthwhile to use ST-CAR modeling in small-area DENF studies at fine temporal scales.
This study uses a ST-CAR modeling approach to examine the influence of socioeconomic, environmental, weather, and climatic variables on DENF outbreaks in Cali, Colombia, at the neighborhood and weekly levels between 2015 and 2016. The approach can determine if DENF rates and covariates in one neighborhood are influenced by rates and covariates in surrounding neighborhoods and time periods. We also estimate disease risks using temporally lagged weather variables. Our modeling approach is capable of identifying regions with FIGURE 2. Temporal distribution of weekly dengue fever (DENF) cases in Cali, Colombia from January 2015 through December 2016 (top); spatial distribution of DENF cases for the study period (bottom). This figure appears in color at www.ajtmh.org. high-risk clusters at the neighborhood level. The results provide detailed insight about the spatial heterogeneity of disease risk and the associated risk factors at a fine level, informing public health officials to motivate at-risk neighborhoods to take an active role in vector surveillance and control, and improving educational and surveillance resources throughout the city of Cali.
The remainder of this article is as follows: Section 2 provides information about Cali, the DENF cases, and candidateindependent variables; the technique to select the lagged weather variables; and the ST-CAR modeling approach. Section 3 provides the modeling results, including maps of the estimated DENF rates by week for each neighborhood in Cali. Section 4 discusses key findings, strengths, limitations, and avenues of future research. Section 5 summarizes findings with concluding remarks.

DATA AND METHODS
Study area and data. Cali is the second largest city in Colombia and the third most populous, with an estimated 2010 population of 2.3 million (average density of 4,000 km 2 ). The city comprises 340 neighborhoods (Spanish: barrios), which are classified by socioeconomic stratum (ranging from 1 to 6), where a ranking of 1-2 is low, 3-4 is middle, and 5-6 is high. The classifications are defined by the external physical characteristics of the dwelling, its immediate surroundings, and its urban context. For example, urban context includes variables such as poverty, social deviation, urban decay, industry, and commercial; immediate surroundings include access roads, and sidewalks; and the characteristics of the dwelling include front lawn, garage, façade material, door material, front of the house dimensions, and windows (income is not considered). This stratification is only applied to residential constructions. 33 Figure 1 provides a map of neighborhoods in Cali and their corresponding ranking. The average size of neighborhoods in Cali is 0.35 km 2 . Some of the smaller neighborhoods are in the city core, which is where the city was founded. The largest neighborhood is to the south and corresponds to newer developments, and is an area that houses three of the largest universities in Cali.  Figure 2, top). The cases were geocoded to the neighborhood level using each neighborhood's name as the address locator in the geocoder algorithm in ArcGIS 10.6 (ESRI, Redlands, CA). Each DENF case record contained a neighborhood where the infected individual lived (individual addresses were not available), and then the geocoder aggregated the cases to a particular neighborhood after a successful match. As a result, 26,503 of 35,498 DENF cases (74.6%) were successfully geocoded and aggregated to the neighborhood level in Cali. Cases that were not geocoded did not have an address or a neighborhood; therefore, it was impossible to assign coordinates to the unmatched cases. Figure 2 (bottom) provides a map of the total DENF cases per neighborhood between 2015 and 2016 in Cali. The eastern portion of Cali observed the highest number of DENF cases between the 2 years in our study period. These neighborhoods were majority low-strata (1 and 2); however, there are numerous middle-strata (3 and 4) neighborhoods in the central and southern portions of Cali with a high number of DENF cases. There are also high strata neighborhoods with a high proportion of cases in the central and western regions of Cali, especially those that are adjacent to lower strata neighborhoods.
The socioeconomic and demographic data were provided by the Colombian census (either 2005 or 2010 estimates provided by the city of Cali), including population density, age, race, households with sewer and water access, educational attainment, employment status, and socioeconomic stratum, among others. The last national census occurred in 2005, whereas the new 2018 census has yet to be released. The location of healthcare centers and the environmental variables were provided by the city of Cali (2010 data)-green zones, rivers, tire shops, water pumps, cemeteries, and plant nurseries, which were geocoded as point layers with the exception of green zones (area: polygons) and rivers (lines). The environmental variables are included as potential Aedes habitats. For green zones, the area of the green zones for each neighborhood was computed in square kilometers. Similar to Delmelle et al., 30 relative proximity to rivers, tire shops, water pumps, cemeteries, plant nurseries, and healthcare centers was computed by using kernel density estimation (KDE)representing the density of points for each layer. Kernel density estimation was also computed to produce the density of trees. Zonal statistics in ArcGIS 10.6 was used to summarize the average KDE for each neighborhood in Cali.
Finally, the weather variables selected for evaluation ( Table 1) are consistent with current understanding of how weather conditions impact Aedes survival, abundance, and behavior, including viral transmission rates 20,34 (Table 5). Given our goal of developing a risk model using 2015-2016 weekly-level DENF data, the weather variables entail weekly summaries of 2014-2016 daily meteorological observations collected at the Cali international airport and obtained from the Global Historical Climate Network archive 35 maintained by the National Centers for Environmental Information (http:// www.ncdc.noaa.gov). All weekly weather variables were computed following the methods outlined in the study by Eastin et al., 20 and summary statistics for the 2014-2016 period are provided in Table 1. A total of 49 candidate-predictor variables of DENF were evaluated in this study (Table 1). Because of the differences in units of measurements, each variable was normalized between 0 and 1 (i.e., between the maximum and minimum values listed in Table 1) for all subsequent analyses.
Methodology. Principal component analysis (PCA) and variance inflation factor (VIF) testing. Because of the large number of socioeconomic and demographic variables, and environmental variables (n = 36), we used VIF 36 testing and a PCA 37 in Stata. We did not include all variables in the PCA because we wanted to interpret particular independent variables of DENF individually, which is important for potential decision-making. For example, identifying the effects of individual Aedes habitats (e.g., tire shops and trees) can provide more intuitive results than grouping them in a PCA. As a result, five variables were selected for subsequent modeling with a VIF value < 3: population density, tree density, relative proximity to rivers, relative proximity to tire shops, and relative proximity to plant nurseries.
The PCA's purpose is to reduce and simplify the variables into new variables (components) that explain a large degree of variation without collinearity between the components. Tables 2 and 3 provide the results of the PCA analysis and include the variables that were included-which were not included in the VIF testing. Table 2 describes the variance explained by the top three principal components, and Table 3 shows the variables that belong to each component. After examining the eigenvalues and component loadings with coefficients > 0.40, the first two components were selected for inclusion in the space-time modeling, which explains 60.8% of the variance (42% and 18.8%, respectively).
Principal component (PC) 1 is strongly associated with (correlation close to 0.5 and greater) neighborhood strata, individuals who are employed, retired persons, married individuals, empty households, and individuals with high education. PC2 mainly explains occupied households, individuals who do housework, and individuals who are students. PC3 explains the combined effect from disabled individuals but was excluded from further analysis, and it was included as a separate variable in further VIF testing with the remaining covariates. PC1 can be interpreted as employed, higherincome people who are likely to be older and married because of the large coefficients of retired and married individuals. PC2 can be interpreted as people who are likely to spend more time at home than those in PC1 because they are either students or do housework for a living. PC2 probably also captures younger individuals because of the inclusion of students.
Selecting lagged weather variables. First, VIF testing was used to assess multicollinearity among the candidate weekly weather variables (Table 1) and remove the most intercorrelated variables while retaining the most independent. Seven variables (with VIF values < 3) were selected for the subsequent cross-correlation analysis: Tavg, DTRmax, RHrng, RainT, RainD, CoolD, and WarmD. Next, lagged crosscorrelations were computed between weekly DENF rates and each weather variable over an 8-week window (Table 4). Such time frame was considered the most biologically plausible based on the existing literature of how weather affects the full vector life cycle and viral transmission dynamics 20,34 (see Table 5). Finally, an optimal lag for each weather variable was selected among those weeks exhibiting a statistically significant (at the 5% level; adjusted for multiple testing) crosscorrelation within the 1-to 8-week-lag window (0-week lags were not considered because of their lack of predictive potential). For six variables (Tavg, DTRmax, RainT, RainD, CoolD, and WarmD), the selected optimal lags (5 weeks, 5 weeks, 3 weeks, 5 weeks, 2 weeks, and 5 weeks, respectively) exhibited the maximum cross-correlation coefficient (Table 4). For RHrng, the maximum correlation occurred at 6 weeks, but the 3-week lag was deemed optimal because the existing literature suggests that relative humidity is most influential on adult vectors and the gonotrophic cycle ( Table 5). The optimal lags were included in the ST-CAR models.
Poisson generalized linear models (GLMs). First, a Poisson GLM is computed to detect significant effects of independent variables on a dependent variable (DENF risk), and the presence of spatiotemporal autocorrelation in the residuals. The Poisson GLM is defined as where Y ij is the observed DENF count in neighborhood i at week j, E ij is the expected DENF count in neighborhood i at week j, and R ij is the disease risk in neighborhood i at week j (see Supplemental Materials for the results of the Poisson GLM). Global Moran's I 54 was then computed to detect spatial autocorrelation of the Poisson GLM residuals for each time period. Essentially, the Global Moran's I test determines if there is evidence of unexplained spatial autocorrelation in the residuals, and if positive spatial autocorrelation is detected, then the assumption of independence is not valid for the data, and spatiotemporal autocorrelation should be considered when estimating covariate effects on the dependent variable. The global Moran's I index ranges from −1 to 1, whereas −1 indicates strong negative spatial autocorrelation, 0 indicates complete spatial randomness, and one indicates strong positive spatial autocorrelation, and the statistic is defined as where n is the total number of neighborhoods, w ij is the spatial weight between neighborhood i and l, x is the mean of residuals for all neighborhoods, x i is the residual value in neighborhood i, * Neighborhood strata classification between 1 and 6. † Proportion (%) of individuals in each category: 1) occupied households, 2) work from home, 3) retired persons, 4) disabled persons, 5) students, 6) married persons, 7) empty households, and 8) holds college degree or higher.  ST-CAR modeling. Next, a BHM 24,25 is defined using a Poisson data model (for case/population data). The model "represents the spatiotemporal pattern in the mean response with a single set of spatially and temporally autocorrelated random effects. The effects follow a multivariate autoregressive process of order 1. 54 In other words, when going from 1 week to another (e.g., j + 1), it will yield an effect on the dependent variable (DENF risk). Therefore, this model examines linear trends which can be interpreted as how DENF risk is influenced across time.
It is assumed that the estimated effect on DENF risk in the ST-CAR model is not specific to a particular week, but a process that is influenced by the covariate data across the weeks (temporal unit). Suppose a study region is divided into a collection of N nonoverlapping areal units (e.g., neighborhoods) indexed by i 2 f1, . . . ,Ng and the data are observed for multiple time periods, that is: j 2 f1, . . . ,Tg. As suggested before, ST-CAR models use prior distributions, where the CAR distributions state that adjacent variables in space or time are conditionally autocorrelated, and nonadjacent variables are conditionally independent. The spatial weight matrix is defined as W = ðw ik Þ, where a value of 1 indicates that i and k are spatially adjacent, and 0 otherwise. Because it is unknown where a person was infected, the abovementioned adjacency matrix is a proxy for Aedes' maximum range-because they do not fly more than 400 m from where they emerged as larvae. 10 A temporal weight matrix can also be defined as D = ðd jt Þ, where a value of 1 is given if t − j = 1, and 0 otherwise. The first part of the model is defined as β k ∼ Nð0;1;000Þ k 2 f1, . . . ,pg, where Y ij is the observed DENF count in neighborhood i at week j, E ij is the expected disease count in neighborhood i at week j, and R ij is the disease risk in neighborhood i at week j. X T ij ðx ij1 . . . ,x ijp Þ is a vector of known covariates p for neighborhood i and week j. The parameter β is an associated p × 1 vector of regression parameters, which can come from the initial Poisson GLM in Equations (1) and (2). The term O is a vector of known offsets ðO 1 , . . . ,O N Þ K × N , where O j is a K × 1 column vector of offsets (expected DENF cases) for week j ðO 1j , . . . ,O Kj Þ. An offset variable is used to scale the modeling of the mean in Poisson's regression with a log link, which is the case in the aforementioned model. For example, because the dependent variable is rates, the offset can enforce that 10 cases of DENF in 1 week is not the same magnitude as 10 cases of DENF in 6 weeks. The parameter f ij denotes spatiotemporally autocorrelated random effects for neighborhood i and week j. A variety of spatiotemporal structures can be fit for f ij . Here, we use a model that estimates the evolution of the spatial response surface over time without forcing it to be the same for each time period (Rushworth et al. 55 ).
where f j = ðf 1j , . . . ,f Nj Þ is a vector of random effects for week j. Temporal autocorrelation is enforced because f j depends on f jÀ1 . fðf 1 Þ enforces spatial autocorrelation in the random effects, where the spatial structure is defined in the CAR prior in Equation (8): where ρ controls the spatial autocorrelation, with ρ = 1 indicating strong spatial autocorrelation, which is conditional on the mean random effects of adjacent neighborhoods. ρ = 0 represents independent random effects with a constant mean and constant variance. The conditional precision is controlled by τ, where precision is higher when more prior information (e.g., adjacent neighborhoods) is borrowed to determine the posterior estimates. Equation (9) (CAR prior) enforces temporal autocorrelation in the random effects and is defined as . . . ,Tg; (9) where Qðρ,WÞ is a precision matrix that is defined as ρðdiagonal½W1 À WÞ + ð1 À ρÞI, where I is a N × N identity matrix and "1" is a vector of ones (N × 1). The α controls the temporal autocorrelation, where 0 is temporally independent and 1 is strongly temporally dependent. The CAR priors also include weakly informative hyperpriors (i.e., probability distribution from priors to inform/update posterior values), which are the three parameters defined in the following text: τ ∼ Uniform½0; 1; 000, α ∼ Uniform½0; 1; ρ ∼ Uniform½0; 1: The values of the hyperpriors are selected in a way so that our Bayesian inferences are robust and not sensitive to these choices. For example, a nonstationary spatial process would occur when ρ = 1, and a nonstationary temporal process would occur if α = 1. Overall, the ST-CAR model states that when going from 1 week to another (j + 1), it yields an effect on the dependent variable (DENF risk), which is influenced by spatially and temporally dependent covariates. In other words, DENF risk in a target neighborhood is influenced by current and past values of DENF risk and covariates at surrounding neighborhoods and time periods (which is a process that evolves over time). Conceptually, a spatial example would suggest that a neighborhood with a low risk of DENF would have an increased risk of DENF if an adjacent neighborhood reported a high risk of DENF (dependence/ autocorrelation). Statistical inference is derived from Markov chain Monte Carlo (MCMC) simulations. 56 Markov chain Monte Carlo is a popular technique for Bayesian inference when posterior distributions are not available in closed forms. We used MCMC to generate samples from the posterior distributions induced by our ST-CAR models and used them for parameter and density estimations. For this study, we selected 220,000 MCMC samples; initial 20,000 samples are removed as burnin, and the thinning parameter is set to 10 (keeping every 10th value and removing all others), which thins the samples to reduce autocorrelation of the Markov chain. As a result, 20,000 samples are used for statistical inference. The deviance information criterion (DIC) was used for model assessment and comparison, where lower values of the DIC indicate a better model fit. The results of two models are presented in Section 3: model 1 (DENF with no lags) and model 2 (DENF with lagged weather variables). Using CARBayesST 24 in R, the two Poisson GLM models (DENF with no lags and DENF with lags) were each fitted to the ST-CAR model described earlier.

RESULTS
ST-CAR results. The following subsections contain summaries of model results, relative risk estimates for each independent variable, and mapping the posterior estimates of DENF rates in Cali for particular time periods (temporal cross sections).
Model 1. Table 6 (left) summarizes the results of model 1 (DENF with no lags). The 95% credible intervals do not contain SPACE-TIME MODELING OF NEIGHBORHOOD-LEVEL RISKS FOR DENGUE a value of 0, indicating that all the covariates exhibit influential relationships with DENF risk at the neighborhood level in Cali between January 2015 and December 2016. The spatial autocorrelation value of 0.98 indicates that there is very strong spatial dependence of the data after adjusting for covariate effects. The temporal autocorrelation value of 0.11 suggests that there is some presence of temporal dependence of the data after adjusting for covariate effects. Therefore, DENF risk in Cali at the neighborhood level is influenced by DENF rates and covariates in surrounding neighborhoods and time periods (weeks). Table 6 (left) also shows the relative risk estimates (%) of DENF in Cali between January 2015 and December 2016. The results suggest a 4.7% increase in DENF risk for PC1, a 4.7% decrease for PC2, a 12.9% increase for proximity to plant nurseries, a 5.5% increase for proximity to tire shops, a 1.5% decrease for population density, an 11.6% decrease for proximity to rivers/ravines, a 36% increase for tree density, a 635% increase for average temperature, a 26.6% increase for days with maximum temperature > 32°C, a 139% increase for relative humidity range, a 45.8% decrease for total rainfall, a 16.8% decrease for total rainy days, a 76.4% increase for cool days, and a 73.6% decrease for warm days when adjusted for other variables. The results will be further explained in the Discussion.
Model 2. Table 6 (right) summarizes the results of model 2. The 95% credible intervals do not contain a value of 0, indicating that all the covariates exhibit influential relationships with DENF risk at the neighborhood level in Cali between January 2015 and December 2016. The spatial autocorrelation value of 0.98 indicates that there is very strong spatial dependence of the data after adjusting for covariate effects. The temporal autocorrelation value of 0.11 suggests that there is some presence of temporal dependence of the data after adjusting for covariate effects. Therefore, DENF risk in Cali at the neighborhood level is influenced by DENF rates and covariates in surrounding neighborhoods and time periods (weeks). The DIC (66,964.16) of model 2 is slightly lower than that of model 1 (DENF with no lags, DIC = 67,055.98). In general, the lagged weather variables in model 2 shrunk the CIs of the coefficients (most notably for average temperature and days temp max), which decreases the uncertainty of our model's risk estimates. Table 6 (right) also shows relative risk estimates of DENF in Cali between January 2015 and December 2016. The results suggest a 7.6% increase in DENF risk for PC1, a 3.9% increase for PC2 (negative relationship [decreased risk] in model 1), a 19.8% increase for proximity to plant nurseries, an 11.8% increase for proximity to tire shops, a −16% decrease for population density, a 12.9% decrease for proximity to rivers/ ravines, a 30% increase for tree density, a 25.2% decrease for average temperature, a 125.1% increase for days with maximum temperature > 32°C, an 85% increase for relative humidity range, a 91.2% decrease for total rainfall, a 50.7% decrease for total rainy days, a 25.3% decrease for cool days, and an 89.3% increase for warm days. The negative to positive relationship between DENF and PC2 observed when comparing models 1 and 2 is difficult to interpret. This could be because of the lagged weather variables affecting the posterior estimates. The magnitude of PC2 (low RR) is much lower than that of other independent variables; therefore, we hypothesize that people who spend more time at home (PC2) may or may not be more susceptible to DENF, and further investigation is required.
Mapping the posterior estimates of DENF. When comparing the six temporal cross sections, July 2016 experienced the highest DENF risk (estimated posterior mean values) after accounting for the 14 covariates (including the lagged weather variables). Interestingly, some locations with high rates of DENF are classified as middle-(3 or 4) or highstrata neighborhoods (5 or 6). These middle-and high-strata neighborhoods are adjacent to low-strata neighborhoods (1 or 2), which suggests that there is spatiotemporal dependence between them. In other words, there is evidence that middleand high-strata neighborhoods are at higher risk when surrounded by lower-strata neighborhoods after accounting for the covariates in the models. Some of the highest proportion of cases were not only observed in the eastern portions of Cali (now shown here) but also include some of the highest and most densely populated neighborhoods of the city. After computing rates per 1,000 persons (Figure 3), the eastern portions of Cali (which include some of the poorest neighborhoods) have lower reported rates of DENF than central, southern, and western regions of Cali.

DISCUSSION
This study is the first of its kind to model space-time risk at the neighborhood and weekly levels of DENF across 2 years of disease surveillance data, while also incorporating temporally lagged weather variables in a ST-CAR approach. Coupling the lagged weather variables with the spatial covariates of DENF risk, the models include neighborhood-level effects explaining where and why certain locations are more at risk than others. A quintessential spatiotemporal model decomposes the variability in the outcome in large-scale variations (the mean function) and small-scale variations (the spatiotemporal random effects). 22 Independent variables were used to characterize large-scale variations, whereas spatiotemporal autocorrelation was used to characterize small-scale variations. The lagged independent variables modified the large-scale variations, and, for these data, it seems they have no effect on small-scale variations. The predictive power of the models remains the same, very close DIC values (see Table 6).
That is why we are seeing quite robust estimates of spatial autocorrelation parameters in both models. There are many key findings that warrant further investigation and explanation.
Influence of socioeconomic and environmental factors. First, there is very strong evidence that there are both spatial and temporal dependences between DENF risk and the significant covariates for adjacent neighborhoods in Cali. Although DENF had a much lower temporal autocorrelation (value of 0.11 for both models 1 and 2), this can be explained by the distribution of cases between 2015 and 2016-there were three distinct peaks, but DENF remains a persistent threat because of the four serotypes of the virus. The very strong spatial autocorrelation values (although extremely high) for DENF suggest that outbreaks in adjacent neighborhoods are strongly related (i.e., living next to a neighborhood with high cases will strongly influence DENF risk and cases in your neighborhood of residence).
When examining the results of the three socioeconomic covariates (PC1, PC2, and population density), the increased risk of DENF for PC1 is an interesting finding. This corroborates with Delmelle et al., 30 who suggest that in the southern part of Cali, houses are typically bigger with relatively larger yards, which may provide a suitable habitat for Aedes to breed. Neighborhoods with a high proportion of people in the PC1 category (e.g., employed, older, and more educated) are also typically adjacent to middle-or lower-strata neighborhoods (with the exception of the extreme south), which also may increase the risk of disease due to the very strong evidence of space-time dependence between the locations, as suggested by the models. An increased risk of DENF was reported for neighborhoods with higher proportion of individuals in the PC2 category (i.e., work from home and students). This finding corroborates with strong evidence that Aedes proliferates in and around homes, [57][58][59] and it has also been found that cases can substantially decline if Aedes trap interventions are put in place in at-risk communities. 60 Another unexpected result was the negative relationship between population density and DENF. One explanation can be that some of the densely populated neighborhoods in the eastern part of Cali have a high concentration of Afro-Colombian population, which has been suggested to be less susceptible to the viruses. 61,62 Furthermore, there is evidence that shows that low-density areas with poor infrastructure may have increased Aedes presence 63 and DENF's complex immunology, and the herd immunity resulting after infection from the viruses 64 may have contributed to these patterns. Among other factors, high density of populations may not necessarily be a main risk factor of VBD transmission. 65 Although increases in population and urbanization will undoubtedly increase the risk of VBD transmission, the true effect of population density may vary at fine spatial levels (e.g., neighborhoods). Public health authorities typically carry out fumigation and education programs in neighborhoods with higher population densities, which may result in a lack of targeted interventions in less dense areas with Aedes populations (e.g., sewers and green areas). Vector control efforts in Cali include storm sewer control, spraying, use of guppy fish, visits to health facilities and homes, and community educational campaigns. 35 Closer proximity to plant nurseries, tire shops, and higher tree density all exhibited an increased risk of DENF. Plant nurseries and tire shops are common breeding grounds of Aedes; thus, neighborhoods within close proximity are generally at a higher risk of disease transmission. Tree density may also be a significant risk factor of Aedes presence because studies have shown that high tree shade density stimulates breeding, and tree holes are suitable water containers where Aedes have been found in abundance. 66,67 Closer proximity to rivers and ravines (i.e., any moving bodies of water) resulted in a significantly lower risk of DENF. Aedes require stagnant water as a breeding ground; therefore, the flowing water of a river or ravine would prove to be an unsuitable habitat for the mosquitoes. Although flooding events during the rainy season could create stagnant water sources surrounding the rivers, however, floods may also act as a disruptive force on Aedes habitat by flushing out their breeding sites and eggs. 68,69 Further research using remotesensing techniques (such as flow analysis) could provide insight to areas prone to stagnant water.
Influence of weather factors. In general, the use of optimally lagged weather variables shrunk the CIs about the mean model coefficients and relative risk estimates. Specifically, the individual relationships between each lagged weather covariate and disease risk became either consistently positive or consistently negative (except for average temperature). Moreover, because the optimal lags were different among the weather covariates, our results imply that individual covariates impact different stages of the vector life cycle (see Table 5). Given that such relationships could inform future space-time modeling and mitigation efforts, the most likely physical connections between local weather conditions and the Aedes life cycle (Tables 4 and 5) are discussed in greater detail based on our model 2 results (Table 6).
Variables with an optimal 5-week lag (Tavg, DTRmax, RainD, and WarmD) are most influential on larval development. Both DTRmax and WarmD exhibit the expected positive relationship (Table 5) and large predictive importance based on relative risk estimations (Table 6), whereas RainD exhibits the expected negative relationship with moderate importance. Despite Tavg having an unexpected negative relationship with DENF, its predictive importance is relatively small. Five weeks before above-average DENF rates, the weather is often characterized by multiple days with short-lived rain showers (RainD is above average, its correlation with DENF is negative, and the regression coefficient is negative), yet each day experiences sufficient sunshine to allow the daily maximum temperature to exceed 32°C (WarmD is above average, its correlation with DENF is positive, and the regression coefficient is positive).
The short-lived rain showers also induce evaporative cooling that lowers the daily minimum temperature and increases the daily temperature range (DTRmax is above average, its correlation with DENF is positive, and the regression coefficient is positive), leading to slightly cooler, but above-average daily mean temperatures (Tavg remains above average, its correlation with DENF remains positive, but the regression coefficient is slightly negative, and the relative risk is small). Overall, regular rainfall combined with average-to above-average temperatures produces numerous stagnant pools within a favorable thermal environment for prolific larval development.
Variables with an optimal 3-week lag (RainT and RHrng) are most influential on the gonotrophic cycle. Both RainT and RHrng exhibit the expected relationships (Table 5) with similar moderate levels of predictive importance (Table 6). Weather conditions 3 weeks before above-average DENF rates are often characterized by minimal rainfall (RainT is below average, its correlation with DENF is negative, and the regression coefficient is negative) and clear skies, which allows the relative humidity to fluctuate between small daytime values and large nighttime values (RHrng is above average, its correlation with DENF is positive, and the regression coefficient is positive). Overall, the relatively dry conditions maximize solar heating, minimize evaporative cooling, and promote the warm temperatures most favorable for Aedes feeding.
Finally, variables with an optimal 2-week lag (CoolD) are most influential on the gonotrophic cycle and extrinsic incubation. Specifically, CoolD exhibits the expected negative relationship (Table 5), but its relative importance is small and roughly equivalent to that of Tavg. Weather conditions 2 weeks before above-average DENF rates often exhibit above-average temperatures (CoolD is below average, its correlation with DENF is negative, and the regression coefficient is negative), whereby the warmer temperatures will accelerate both vector feeding and viral replication within the vector, which increases the potential for transmission to humans.
Overall, such consistent multi-lag relationships reinforce the idea that DENF outbreak dynamics are dependent on a complex combination of weather conditions 2-5 weeks prior (Eastin et al. 20 ). Careful monitoring of weather conditions within this time window could optimally inform any subseasonal DENF mitigation efforts. Moreover, it should be noted that a moderate El Niño occurred during our 2015-2016 study period, 70 and there is strong evidence that major DENF epidemics are more severe during El Niño events. 20,71 Therefore, any sub-seasonal mitigation efforts should also account for interseasonal climate variability.
Limitations. Despite the strengths and contributions of this research, there are notable limitations and areas of future work that are worth discussing. First, the underreporting of cases and unmatched addresses during the geocoding process likely undermines the true burden of DENF. Underreporting is also a major issue in the low-strata neighborhoods and also not uniform across the city, whereas surveillance systems can improve the identification of risk factors throughout the city. 72 Second, the socioeconomic and demographic data were a mix from the Colombian National Census and 2010 population estimates. Colombia recently administered a new national census (the first since 2005) but is currently unavailable. Using 2005 and 2010 data for this study will bias the results, but the neighborhood classifications (strata) mostly remained unchanged. The uncertainty resulting from using outdated census data is a common limitation found in many studies in Latin America and developing countries. Several articles examining patterns of DENF in Colombia have recognized outdated population census being a critical issue that could affect the findings of their research. [73][74][75][76][77][78] Until more recent census are available, different population growth models could address this issue, or relying on disaggregated population projection, using dasymetric mapping. 79,80 Third, including vector surveillance data (presence/absence) in each neighborhood would improve the accuracy of the relative risk estimates. 17 Fourth, the spatial weight matrix only considered adjacent neighborhoods as "neighbors"; we recognize that individual activity spaces expand far beyond locations nearby their home. Future research can implement different spatial and temporal weight matrices for sensitivity analysis purposes. Fifth, the weather conditions and severity of outbreaks for DENF varied between 2015 and 2016, which may have affected the model results. Future work can disaggregate the years and run two separate models for further examination. Sixth, the space-time patterns of the DENF outbreaks did not exhibit much seasonality, which is likely because of only using 2 years of data. Further work can use 5-10 years of DENF and weather data to detect potential seasonal patterns of the epidemics. Finally, weather variability is represented by a single weather station (and thus limited to the temporal domain).
Future work would benefit from multiple weather stations that can document spatial variability across the region.
Opportunities. Chikungunya and Zika are transmitted by the same vector (Aedes), and further investigation can develop a multivariate space-time (MVST) CAR model to examine which neighborhoods are at higher risk for one disease, two diseases, or all three concurrently. Multivariate space-time approach in CAR modeling is still in its infancy stages, 24 and the development of such models can more accurately examine and compare the co-occurrence of diseases transmitted by the same vector. A long-term goal is to develop a multiparameter early warning system (EWS) that informs public health officials and motivates effective vector surveillance and control measures. The space-time models and methods described herein represent progress toward that goal. However, effective EWS development would require larger and more comprehensive databases (for both robust model development and independent validation) than those currently available. As noted earlier, an EWS system would benefit from longitudinal information regarding socioeconomic, demographic, and environmental variables combined with spatial information regarding weather variability across the region.

CONCLUSION
A ST-CAR modeling approach was used to examine significant socioeconomic, demographic, environmental, and meteorological risk variables of DENF in Cali, Colombia, during 2015 and 2016. The temporally lagged weather covariates can significantly estimate when risk of transmission is highest, and the spatial covariates can help explain the differences in disease risk at the neighborhood level. Adding weather and climate data to a space-time model can improve disease surveillance, especially for VBDs that require specific conditions for transmission to occur. This study demonstrated that there was strong spatial and temporal dependence between adjacent neighborhoods and time periods, which provides strong evidence that DENF transmission is influenced by characteristics and phenomena occurring in surrounding locations. We also provide evidence that DENF is not just a disease of the poor; although risk factors may be higher in neighborhoods of lower socioeconomic status, we have shown that the transmission dynamics of DENF are place and temporally based. Despite this study being retrospective in nature, the modeling approach can be applied in a contemporary surveillance setting when significant outbreaks have not yet occurred, highlighting at-risk areas to help promote proactive community health and improve public health educational campaigns and targeted interventions. We hope that this research influences further small-area space-time analysis because we support the notion that disease prevention (in general) should start at the neighborhood and community levels.