| ABSTRACT |
|
|
|---|
| INTRODUCTION |
|
|
|---|
An important epidemiologic feature of schistosomiasis is its focal distribution, which is governed by deficient access to sanitation and clean water, the occurrence of suitable freshwater habitats for the intermediate host snail and its abundance within these habitats, and human water contact activities.6–8 With the launch of the Schistosomiasis Control Initiative (http://www.schisto.org) in 2002, new control efforts have been initiated in sub-Saharan Africa. The goal in high-burden areas in Africa is morbidity control through regular administration of praziquantel.3,7 These efforts are in line with recommendations set forth by the World Health Organization to systematically treat communities with schistosome infection prevalences ≥ 50%.1 As an initial step of a control program, it is thus necessary to identify high-risk communities so that chemotherapy interventions can be targeted. Remote sensing (RS) of environmental features and geographic information system (GIS) platforms, coupled with Bayesian spatial statistics, are powerful tools for risk prediction and mapping of schistosomiasis and other helminth diseases.9–11 One of the latest RS products is a digital elevation model (DEM) derived from the shuttle radar topography mission (SRTM), which was launched in early 2000.12 The digital topographic data acquired are freely available at a resolution of three arc-seconds for Africa. Previously, such data were mostly generated from optical stereo data, which are often affected by cloud cover and lack of sunlight.13
The purpose of this study was to derive RS environmental data, to extract variables with GIS software, and to test their suitability for spatially distributed risk prediction of intestinal schistosomiasis caused by Schistosoma mansoni, focusing on a 79 × 76 km area in Côte d'Ivoire. The accompanying risk map can guide control interventions and spatial targeting against this neglected tropical disease.
| MATERIALS AND METHODS |
|
|
|---|
The climate is tropical wet-dry (Aw; groupings of Köppen-Geiger-Pohls climatic types) and has two distinct seasons: a wet season from March to September and a dry season from October to February. The annual precipitation is between 500 mm and 1,750 mm, with most of the rainfall occurring between June and September. The average temperatures are 19–20°C during the winter months and 24–27°C during the summer months.16
Demographic and parasitologic data. The demographic and parasitologic data presented were obtained during cross-sectional surveys conducted between October 2001 and February 2002. All schools in the two education districts of Man, with the exception of schools in the district town and the ones with < 100 pupils, were included in the surveys. Overall, 57 schools with 5,448 school children registered in grades 3–5 were enrolled for questionnaire and parasitologic surveys.
The name, age, and sex of school children were obtained from existing education registries for the school year of 2001–2002. School children were then examined for S. mansoni, soil-transmitted helminths, intestinal protozoa, and Plasmodium infections. We focus on S. mansoni. Field and laboratory investigations have been described in detail elsewhere.10 Briefly, a single Kato-Katz thick smear was prepared from each stool sample and examined under a light microscope by experienced laboratory technicians. The number of S. mansoni eggs was counted and recorded.
School children who were egg positive for S. mansoni were treated with a single 40 mg/kg oral dose of praziquantel, and those with a soil-transmitted helminth infection were given 400 mg of albendazole in a single oral dose.1,3
Drainage systems and stream order. Runoff travels from higher to lower altitudes and usually becomes organized in a branched network of stream channels, which then form a drainage system. We focused on the overground movement of water across a surface. This is of interest because surface waters serve as the habitat for the intermediate host snail of S. mansoni, namely Biomphalaria pfeifferi.17
Watershed boundaries run along the crests of ridges between drainage basins. A watershed is an area that drains water to an outlet. All runoff within a watershed is routed to the same outlet. An outlet is the point at which water flows out of a watershed, usually the point with the lowest elevation along the boundary of the drainage basin. The catchment and its corresponding stream network can be visualized as a pear with a tree growing from the stem inside the pear, the tree branches being the streams and the base of the stem being the outlet.16
Water flowing across a surface will always be routed in the steepest downslope direction. To obtain a flow direction raster from a DEM, we determined the flow direction for every single grid cell. There are eight valid output directions, relating to the eight adjacent cells into which flow could travel. This approach is commonly referred to as a D8 (eight-direction) flow model and follows the methodology of Jenson and Domingue.18 Once the flow direction of each grid cell is determined, it is possible to identify which upslope cells flow into which downslope cells.
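The D8 rule described above can be sketched in a few lines. This is a minimal illustration on a toy grid and not the ArcHydro implementation; the function name `d8_flow_direction`, the offset encoding of directions, and the 3 × 3 elevation values are assumptions made for the example.

```python
import math

# Offsets (row, column) to the eight adjacent cells.
NEIGHBOURS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def d8_flow_direction(dem, cell_size=1.0):
    """For each cell, return the offset of its steepest downslope neighbour.

    Cells with no lower neighbour (pits or flat areas) get None.
    """
    rows, cols = len(dem), len(dem[0])
    direction = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best_slope = 0.0
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    # Diagonal neighbours are farther away than cardinal ones.
                    distance = cell_size * math.hypot(dr, dc)
                    slope = (dem[r][c] - dem[nr][nc]) / distance
                    if slope > best_slope:
                        best_slope = slope
                        direction[r][c] = (dr, dc)
    return direction


# Hypothetical elevations (arbitrary units) that fall towards the lower right.
toy_dem = [
    [9, 8, 7],
    [8, 6, 4],
    [7, 4, 1],
]
flow = d8_flow_direction(toy_dem)
```

The lowest cell of the toy grid has no downslope neighbour and therefore acts as the outlet. In a real DEM, such cells in the interior would be sinks that need to be filled first.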
To extract streams, we used the flow direction raster to assign each cell a value equal to the number of grid cells flowing into it, which yields a flow accumulation raster. All cells with a value greater than a selected threshold are considered stream channels. To delineate catchments, outlet points need to be identified. We used the stream network and considered the junctions where two streams or network segments flow together as outlets. Following this approach, we obtained a watershed for each stream segment with the stream junctions as the outlet points.18 We used the stream ordering of Strahler;19,20 headwater streams, i.e., the outermost tributaries, are given order 1, and the stream resulting from the confluence of two such tributaries has order 2. If two streams of the same order $\omega$ join, the continuing stream is given the order $\omega + 1$. However, if two streams of different order flow together, the downstream link keeps the order of the higher-order stream.
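The two steps above, flow accumulation with a threshold and Strahler ordering, can be illustrated on a hand-made network. This is a sketch under simplifying assumptions: cells and stream segments are given as small dictionaries with hypothetical labels, and the network contains no loops (i.e., sinks have already been removed).

```python
def flow_accumulation(downstream):
    """Count, for every cell, how many upslope cells drain through it.

    `downstream` maps each cell to the cell it flows into (None at the outlet).
    """
    accumulation = {cell: 0 for cell in downstream}
    for cell in downstream:
        target = downstream[cell]
        while target is not None:
            accumulation[target] += 1
            target = downstream[target]
    return accumulation


def strahler_order(tributaries, segment):
    """Strahler order of a segment, given segment -> list of upstream segments."""
    upstream = tributaries.get(segment, [])
    if not upstream:
        return 1  # headwater stream
    orders = sorted((strahler_order(tributaries, s) for s in upstream), reverse=True)
    # The order rises only if at least two tributaries share the highest order.
    if len(orders) > 1 and orders[0] == orders[1]:
        return orders[0] + 1
    return orders[0]


# Hypothetical cells: a1 and a2 drain into a; a and b drain into the outlet.
cells = {"a1": "a", "a2": "a", "a": "out", "b": "out", "out": None}
acc = flow_accumulation(cells)
stream_cells = {cell for cell, value in acc.items() if value > 1}  # threshold of 1

# Hypothetical stream segments for the ordering step.
network = {
    "outlet": ["main", "c"],   # order 3 joins order 1 -> stays 3
    "main": ["a", "b"],        # order 2 joins order 2 -> 3
    "a": ["a1", "a2"],         # two headwaters -> 2
    "b": ["b1", "b2"],         # two headwaters -> 2
}
```

With a higher accumulation threshold, fewer cells qualify as stream cells, which is why the choice of threshold controls the density of the extracted network.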
Digital elevation model. The DEM used in our study is a raster representation of a continuous surface, namely the elevation of the Earth's surface, and originates from the SRTM. Data from SRTM version 2 (also called the finished version) were downloaded from ftp://e0srp01u.ecs.nasa.gov/srtm/. For the scientific community, digital topographic data are available for 80% of the Earth's surface at a 90-meter spatial resolution.21 For verification purposes, we used a Landsat ETM+ scene acquired from the U.S. Geological Survey (USGS), which was taken in December 2002 during the dry period.
The tiles needed to cover the area under investigation were mosaicked, missing values were replaced with ENVI version 4.0 (Research Systems, Inc., Boulder, CO), and the mosaic was projected to WGS 84 UTM 29 N. Errors in DEMs are usually classified as either sinks or peaks. A sink is an area surrounded by higher elevation values, i.e., an area of internal drainage. Sinks may be natural, particularly in glacial or karst areas, but most are artifacts from sampling errors. Therefore, we removed sinks before further processing steps were carried out.
Watersheds and rivers were delineated and ordered according to Strahler19,20 with ArcGIS version 9.0 (ESRI, Redlands, CA) and the ArcHydro Tools, public domain utilities developed by the Center for Research in Water Resources (University of Texas, Austin, TX) and the Environmental Systems Research Institute (ESRI). For the delineation of the catchments, a threshold value of 60,000 was applied to the flow accumulation raster, which yields a less dense stream network from which the outlets were derived. The stream ordering was done with a stream network derived with a threshold of 2,000. With these threshold values, the largest streams in our study area were assigned order 4, and the surveyed schools were located in four different catchments, which were labeled arbitrarily as catchments I, II, III, and IV. The hydrologic modeling was carried out in an extended area exceeding 100,000 km², with the area under investigation approximately in the center. This was done to take into account that, for example, a nearby mountain slope outside the study area may still contribute a relevant amount of precipitation to the area.
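To give a feel for the two thresholds, the following sketch converts them into contributing areas. It rests on two assumptions that are not stated explicitly in the text: that the thresholds are counts of grid cells, and that each cell is a 90 × 90 m square (the nominal SRTM resolution; the actual cell size after projection may differ).

```python
CELL_SIZE_M = 90.0  # assumed nominal cell size of the SRTM DEM


def contributing_area_km2(threshold_cells, cell_size_m=CELL_SIZE_M):
    """Upstream area (km^2) corresponding to a flow accumulation threshold."""
    return threshold_cells * cell_size_m ** 2 / 1e6


catchment_threshold_area = contributing_area_km2(60_000)  # about 486 km^2
ordering_threshold_area = contributing_area_km2(2_000)    # about 16.2 km^2
```

Under these assumptions, a cell would count as a stream for ordering purposes once it drains roughly 16 km², whereas catchment outlets were derived from streams draining roughly 490 km².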
Statistical analysis. School children were subdivided into two age groups (6–10 years and 11–16 years). The chi-square test was used to compare proportions between groups in STATA version 9.0 (Stata Corporation, College Station, TX). Demographic and environmental covariates were fitted in logistic regression models for S. mansoni infection. Covariates significant at a level of 15% (P value based on the likelihood ratio test) were fitted in multiple Bayesian logistic regression models using WinBUGS version 1.4 (Imperial College and Medical Research Council, London, United Kingdom).
Bayesian spatial models. For the spatial modeling of S. mansoni infection, we considered both stationary (i.e., spatial correlation was modeled as a function of distance only; models 1 and 2) and non-stationary (i.e., spatial correlation was modeled as a function of both distance and location; models 3 and 4) underlying spatial dependence. We followed the approach outlined in our previous studies, which introduces location-specific random effects to model a latent spatial process.
Let $Y_{ij}$ and $p_{ij}$ be the status and probability of S. mansoni infection, respectively, of school child $j$ in village $i$. We assumed that $Y_{ij}$ arises from a Bernoulli distribution, $Y_{ij} \sim \mathrm{Be}(p_{ij})$, and modeled covariates $X_{ij}$ and a village-specific random effect $\phi_i$ on the $\mathrm{logit}(p_{ij})$, that is, $\mathrm{logit}(p_{ij}) = X_{ij}^T \beta + \phi_i$, where $\beta$ is the vector of regression coefficients. We introduced the spatial correlation on the $\phi_i$'s by assuming that $\phi = (\phi_1, \phi_2, \ldots, \phi_N)^T$ has a multivariate normal distribution, $\phi \sim \mathrm{MVN}(0, \Sigma)$, with variance-covariance matrix $\Sigma$. We also assumed an isotropic spatial process, i.e., $\Sigma_{kl} = \sigma^2 \exp(-\rho d_{kl})$, where $d_{kl}$ is the shortest straight-line distance between villages $k$ and $l$, $\sigma^2$ is the geographic variability known as the sill, and $\rho$ is a smoothing parameter that controls the rate of correlation decay with increasing distance. The range is defined as the minimum distance at which spatial correlation between locations is less than 5%, and it is calculated as $3/\rho$ for the exponential correlation structure we have adopted.22
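The relation between the decay parameter and the range can be checked numerically. The sketch below assumes the exponential covariance described above (sill multiplied by the exponential of minus decay times distance); the function names and the decay value of 0.1 per km are hypothetical and chosen only for illustration.

```python
import math


def exponential_covariance(distance, sill, decay):
    """Covariance between two locations under an exponential structure."""
    return sill * math.exp(-decay * distance)


def effective_range(decay):
    """Distance at which the correlation has dropped to just under 5%."""
    return 3.0 / decay


decay = 0.1                       # hypothetical decay parameter (per km)
range_km = effective_range(decay)  # 30 km for this decay value
# With a sill of 1, the covariance equals the correlation.
correlation_at_range = exponential_covariance(range_km, sill=1.0, decay=decay)
```

At the range, the correlation equals exp(−3), which is slightly below 0.05. This is why the factor 3 appears in the formula for the exponential structure.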
To take into account non-stationarity, we partitioned the study area into $K$ ecologic sub-regions, i.e., water catchments, and assumed a local stationary spatial process $\omega_k = (\omega_{k1}, \omega_{k2}, \ldots, \omega_{kN})^T$ in each ecologic sub-region $k = 1, \ldots, K$, with multivariate normal distribution and variance-covariance matrix $\Sigma_k$, i.e., $\omega_k \sim \mathrm{MVN}(0, \Sigma_k)$ and $(\Sigma_k)_{ij} = \sigma_k^2 \, \mathrm{corr}(d_{ij}; \rho_k)$. We then viewed spatial correlation in our area as a mixture of the different spatial processes and modeled the spatial random effect $\phi_i$ at location $i$ as a weighted average of the sub-region-specific (independent) stationary processes as follows: $\phi_i = \sum_{k=1}^{K} a_{ik} \omega_{ki}$, where the weights $a_{ik}$ are decreasing functions of the distance between location $i$ and the centroids of the sub-regions $k$.23,24 Under the above specifications, $\phi$ has a multivariate normal distribution, $\phi \sim \mathrm{N}(0, \sum_{k=1}^{K} A_k^T \Sigma_k A_k)$ with $A_k = \mathrm{diag}\{a_{1k}, a_{2k}, \ldots, a_{Nk}\}$.
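The covariance of the mixed process can be sketched element by element: the covariance between two locations is the sum, over sub-regions, of the product of the two location weights and the sub-region covariance. The sketch below assumes an exponential correlation within each sub-region and normalized exponential weights; the text only states that the weights decrease with distance to the centroid, so the weight function, `weight_scale`, and all coordinates and parameter values are hypothetical.

```python
import math


def distance(p, q):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def mixture_covariance(locations, centroids, sills, decays, weight_scale=10.0):
    """Covariance matrix of a weighted sum of independent stationary processes."""
    # Weights decrease with distance to each centroid (assumed functional form).
    weights = []
    for loc in locations:
        raw = [math.exp(-distance(loc, c) / weight_scale) for c in centroids]
        total = sum(raw)
        weights.append([w / total for w in raw])

    n = len(locations)
    covariance = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = distance(locations[i], locations[j])
            covariance[i][j] = sum(
                weights[i][k] * weights[j][k] * sills[k] * math.exp(-decays[k] * d)
                for k in range(len(centroids))
            )
    return covariance


# Hypothetical school locations and sub-region centroids (km).
schools = [(0.0, 0.0), (10.0, 0.0), (40.0, 0.0)]
centroids = [(0.0, 0.0), (40.0, 0.0)]
cov = mixture_covariance(schools, centroids, sills=[1.0, 2.0], decays=[0.1, 0.2])
```

Because each sub-region has its own sill and decay parameter, the covariance between two locations depends on where they lie and not only on how far apart they are, which is what makes the resulting process non-stationary.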
Given our Bayesian modeling framework, we chose vague normal prior distributions with large variances (i.e., 10,000) for the $\beta$ parameters, inverse gamma priors for $\sigma_k^2$, and uniform priors for $\rho_k$, $k = 1, \ldots, K$. Markov chain Monte Carlo simulation was used to estimate the model parameters.25 We ran a single-chain sampler with a burn-in of 5,000 iterations. We assessed convergence by inspection of ergodic averages of selected model parameters. The deviance information criterion (DIC) was used to assess the goodness-of-fit of the different models.26 The smaller the DIC, the better the model fit. Finally, Bayesian kriging was used to generate smooth risk maps for S. mansoni infection prevalence with covariates from both multivariate stationary and non-stationary models.27
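The DIC mentioned above is commonly computed as the posterior mean deviance plus the effective number of parameters, where the latter is the mean deviance minus the deviance at the posterior mean. The sketch below illustrates this for Bernoulli outcomes. As a simplification, it plugs in the posterior mean of the infection probabilities rather than of the underlying model parameters, which is not necessarily how WinBUGS evaluates it; the outcomes and draws are made up.

```python
import math


def bernoulli_deviance(outcomes, probabilities):
    """Minus twice the Bernoulli log-likelihood."""
    return -2.0 * sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(outcomes, probabilities)
    )


def dic(outcomes, posterior_draws):
    """Return (DIC, effective number of parameters).

    `posterior_draws` is a list of draws; each draw holds one infection
    probability per observation.
    """
    deviances = [bernoulli_deviance(outcomes, draw) for draw in posterior_draws]
    mean_deviance = sum(deviances) / len(deviances)
    n_draws = len(posterior_draws)
    mean_probabilities = [
        sum(draw[i] for draw in posterior_draws) / n_draws
        for i in range(len(outcomes))
    ]
    p_d = mean_deviance - bernoulli_deviance(outcomes, mean_probabilities)
    return mean_deviance + p_d, p_d


# Hypothetical infection status of three children and two posterior draws.
outcomes = [1, 0, 1]
draws = [[0.6, 0.3, 0.7], [0.8, 0.2, 0.9]]
dic_value, effective_parameters = dic(outcomes, draws)
```

Two models fitted to the same data can then be compared by their DIC values, with the smaller value indicating the better trade-off between fit and complexity.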
Model validation. For the model validation, a training sample from the current database was used. The Bayesian spatial model was fitted to 40 (70.2%) randomly selected schools out of the 57 schools. The remaining 17 schools (29.8%) were used for model validation. The accuracy was assessed by comparing the model prediction with the observed prevalence at the 17 school locations. The S. mansoni infection prevalence of a school was considered correctly predicted when the observed prevalence was within the 95% Bayesian credible interval (BCI) or within the 75%, 50%, 25%, and 5% BCIs, respectively, of the posterior predictive distribution of that location.28
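The validation criterion described above amounts to counting how many test schools have an observed prevalence inside a central credible interval of their predictive distribution. The following sketch shows one way to do this from posterior predictive samples; the quantile rule, the function names, and the toy samples are assumptions made for illustration.

```python
def credible_interval(samples, level):
    """Central credible interval from samples, using simple order statistics."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    lower_index = int(tail * (len(ordered) - 1))
    upper_index = len(ordered) - 1 - lower_index
    return ordered[lower_index], ordered[upper_index]


def coverage(observed, predictive_samples, level):
    """Fraction of locations whose observed value lies inside the interval."""
    hits = 0
    for value, samples in zip(observed, predictive_samples):
        lower, upper = credible_interval(samples, level)
        if lower <= value <= upper:
            hits += 1
    return hits / len(observed)


# Two hypothetical test schools with identical, evenly spread predictive samples.
samples = [i / 100 for i in range(101)]
observed_prevalence = [0.50, 0.90]
coverage_95 = coverage(observed_prevalence, [samples, samples], 0.95)
coverage_50 = coverage(observed_prevalence, [samples, samples], 0.50)
```

Narrow intervals are a stricter test: a model can cover nearly all test locations at the 95% level and still cover few of them at the 25% or 5% level.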
| RESULTS |
|
|
|---|
χ² = 8.15, degrees of freedom [df] = 1, P = 0.004). The infection prevalence in the younger age group was 37.0% and that in the older group was 41.8% (χ² = 11.37, df = 1, P = 0.001). We found a highly statistically significant difference in S. mansoni infection prevalence between catchment areas; the prevalence in catchments I, II, III, and IV was 11.2%, 38.3%, 59.9%, and 62.9%, respectively (χ² = 660.20, df = 3, P < 0.001). The same was true for stream order; schools in close proximity to stream order 1 had a mean infection prevalence of 22.6%, whereas those in proximity to stream order 3 showed a 2.6-fold higher infection prevalence (Table 1).
Association between S. mansoni and demographic and hydrologic indicators. Table 2 shows the results of the bivariate logistic regression analysis. Altitude, stream order, and catchment were all significantly associated with the infection prevalence of S. mansoni. The odds of an S. mansoni infection decreased with increasing altitude (odds ratio [OR] = 0.48, 95% confidence interval [CI] = 0.44–0.53). School children living near a stream of order 2 or 3 were significantly more likely to be infected with S. mansoni than their counterparts living near a stream of order 1. The respective ORs were 3.10 (95% CI = 2.72–3.52) and 6.10 (95% CI = 4.91–7.57). Pupils attending schools located in catchments II, III, and IV had significantly higher odds of S. mansoni infection than those attending schools in catchment I (catchment II, OR = 4.91, 95% CI = 3.99–6.03; catchment III, OR = 11.81, 95% CI = 9.52–14.65; and catchment IV, OR = 13.38, 95% CI = 9.56–18.72).
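The catchment odds ratios can be roughly cross-checked against the catchment prevalences reported above. The sketch below computes crude odds ratios from the rounded prevalences, with catchment I as the reference; it is a plausibility check and not a re-analysis, and small differences from the reported values are to be expected because of rounding.

```python
def odds(prevalence):
    """Convert a prevalence (proportion) into odds."""
    return prevalence / (1.0 - prevalence)


def odds_ratio(prevalence, reference_prevalence):
    """Crude odds ratio of one group relative to a reference group."""
    return odds(prevalence) / odds(reference_prevalence)


# Prevalences by catchment as reported in the text (rounded to one decimal).
prevalence = {"I": 0.112, "II": 0.383, "III": 0.599, "IV": 0.629}
crude_or = {
    catchment: odds_ratio(p, prevalence["I"])
    for catchment, p in prevalence.items()
    if catchment != "I"
}
```

The resulting crude values (about 4.92, 11.84, and 13.44) are close to the reported odds ratios of 4.91, 11.81, and 13.38.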
Spatial analyses. As shown in Table 2, the geographic variation of the S. mansoni infection prevalence was largely explained by the covariates age, sex, altitude, and stream order, regardless of whether stationary (models 1 and 2) or non-stationary (models 3 and 4) multiple logistic regression models were used.
To model the sub-region-specific spatial processes in models 3 and 4, we only used the centroids of catchments I, II, and III because catchment IV included only three schools. These schools were at the border with catchment III (Figure 1); thus, for subsequent analyses they were assigned to catchment III. As shown by almost identical DIC values, the two stationary and the two non-stationary regression models showed similar model fits.
2) in one of the three sub-regions.
Model validation. Table 3 shows the number and percentage of the test locations with observed S. mansoni prevalences that fell into each of the selected BCIs of the posterior predictive distribution. When a 95% BCI was considered, model 1 correctly predicted the lowest percentage of test locations (88.2%), whereas each of the other three models correctly predicted 94.1% of the test locations. Model 2 correctly predicted the highest percentage of test locations when a 75% BCI was considered, whereas model 3 correctly predicted the highest percentage of test locations (35.3%) at a narrow BCI of 25%. At the narrowest BCI (i.e., 5%), all four models predicted only one test location correctly.
| DISCUSSION |
|
|
|---|
We were unable to find recent and georeferenced topographic maps from the area under investigation at the desired scale and quality, which we would have used for digitizing rivers and other water bodies. In the absence of high-quality topographic maps, we also considered the use of conventional data derived from optical space-borne sensors to delineate lakes and rivers. However, the scenes were mostly compromised by cloud cover, smoke from forest fires, or lack of sunlight.30 In addition, such scenes at spatial resolution < 15 meters (e.g., Landsat, ASTER, SPOT) are often only available at relatively high costs, but would have a resolution approximately six times higher or more than the SRTM DEM. This led us to an alternative approach, which we applied here using readily available SRTM data to delineate rivers. When we followed this approach, additional advantages became apparent, such as the extraction of watersheds and altitude and estimation of the slope.
On the basis of the current experiences, we encourage public-health workers to incorporate SRTM data or, more generally, high-resolution DEM data, in their work. The life cycles of many parasitic diseases include vectors or intermediate hosts, for which water bodies serve as their habitats or their breeding sites. Thus, these SRTM data not only can be used to extract the elevation and the slope, which has been done or suggested,10,31–33 but also to calculate further valuable features for disease mapping and spatial prediction.
The nine SRTM DEM tiles that were mosaicked to cover the whole study area in the western part of Côte d'Ivoire proved to be of good quality. For example, the heights of mountain peaks taken from a topographic map agreed well with those in the DEM. Derivatives of the DEM, e.g., the delineated rivers or the shaded relief, corresponded with a Landsat enhanced thematic mapper plus (ETM+) scene, taken in 2002 and acquired from the USGS, that was used for verification purposes.
One shortcoming of the SRTM DEM used here is that the radar signal is backscattered from various hard objects such as buildings and trees that project above the Earth's surface. To be more precise, the model that we used is a digital surface model rather than a DEM. The accuracy of the hydrologic information extracted from the DEM is directly related to the quality and resolution of the DEM. For example, it is possible that flow paths of extracted rivers are influenced by forests with a dense tree canopy. In addition, the resolution of the DEM does not capture all the hydrologic characteristics that determine the flow paths of streams, e.g., watershed boundaries are not automatically delineated with a high accuracy if multiple shallow drainages cross an area of low relief.18 Another shortcoming is that the stream order and the elevation are correlated. Low stream orders are more likely to occur in high areas and vice versa.
Biomphalaria pfeifferi, the intermediate host snail of S. mansoni in the current study area, is an aquatic species.17 This means that suitable habitats for this snail have to be perennial freshwater bodies and cannot be intermittent or ephemeral. Thus, B. pfeifferi can only migrate along streams and is not able to cross watershed boundaries. Additionally, the water body should be stagnant or the flow velocity should not exceed 0.3 meters/second.34,35 The slope of streams has been suggested as a proxy for the water velocity.32 However, spatial modeling of this covariate did not show a significant association with S. mansoni infection prevalence in the present study area.10 The stream order was considered as an additional potential proxy. Generally, stream order correlates with gradient, drainage area, channel width, and discharge.20 Beyond structural changes in the stream channel, there are observable changes in stream ecosystems from the headwaters to the mouth. Because stream order increases with the downstream distance, it might be used as a proxy for the suitability or quality of habitats for B. pfeifferi. Our assumption was that the velocity decreases with increasing stream order, which means that the living conditions for the intermediate host snails become more suitable as the stream order increases. Streams with a low order are more likely to desiccate during the dry season. Thus, there is a higher probability that these streams represent less suitable habitats for B. pfeifferi. This is one possible explanation why children living near a stream with a low stream order were at a significantly lower risk of an S. mansoni infection than children living near a stream with a high order.
The associations between the prevalence of S. mansoni and stream order showed that ordering according to Strahler19,20 with a threshold value of 2,000 confirmed our expectations. We also applied different threshold values and a different stream ordering method, namely that of Shreve.36 The odds of an S. mansoni infection increased with increasing stream order.
During the hydrologic modeling to derive streams and to order them according to the approach suggested by Strahler, a vast amount of data was generated, for example, the areas of all derived watersheds and the length of the draining river within each watershed. These two hydrologic parameters could be used to estimate the drainage density of each watershed, and thus could be used for further statistical analyses. The topographic wetness index (TWI), which is a function of the upstream contributing area and the slope of the landscape,37 could also be derived and used for subsequent analyses. The TWI has been used before, for example, to identify annual net primary production, vegetation patterns38 or the distribution of Echinococcus multilocularis infections of foxes in Germany.39 These variables could be uniquely attributed to the watershed and could replace our arbitrarily set watershed labels. An interesting feature of the spatially distributed models is that the respective ORs were higher than in the corresponding non-spatial models. In addition, the 95% BCIs were somewhat larger, which is a result of the introduction of the spatial component. When we compared the ORs and the 95% BCIs of the stationary with those of the non-stationary spatial models, they were virtually the same, in line with the negligible differences in the DIC values of the different models.
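The two derived quantities mentioned above can be sketched as follows. The drainage density is taken as stream length divided by watershed area, and the TWI is written in its common form, the natural logarithm of the specific contributing area divided by the tangent of the slope. The function names and the example values are hypothetical, and the slope must be greater than zero for the TWI to be defined.

```python
import math


def drainage_density(stream_length_km, watershed_area_km2):
    """Stream length per unit watershed area (km per km^2)."""
    return stream_length_km / watershed_area_km2


def topographic_wetness_index(specific_contributing_area, slope_radians):
    """TWI = ln(a / tan(slope)); requires a slope greater than zero."""
    return math.log(specific_contributing_area / math.tan(slope_radians))


# Hypothetical watershed: 120 km of streams draining 480 km^2.
density = drainage_density(120.0, 480.0)

# Same contributing area on a gentle and on a steep slope.
twi_gentle = topographic_wetness_index(100.0, math.radians(1.0))
twi_steep = topographic_wetness_index(100.0, math.radians(30.0))
```

For a given contributing area, gentler slopes give higher TWI values, i.e., cells where water is more likely to accumulate.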
A next logical step is to model stream flow and/or the water velocity in conjunction with rainfall data. Additional environmental data will be needed, such as measurements from stream gauging stations, soil infiltration rates, and information about the stream channel morphology. Obtaining this kind of data for sub-Saharan Africa and other developing regions represents a formidable challenge. In this context RS data hold promise to complete missing data and to enhance the existing models. Alternatively, there is a need to validate other proxies for missing information.
In conclusion, the approach presented in this report uses only a selected set of covariates for multivariate Bayesian geostatistical modeling of risk profiles and spatial prediction of the distribution of S. mansoni, namely two age categories, sex, altitude, stream order, and water catchments. In a rigorous model validation, the predicted risk agreed well with the observed infection prevalence of S. mansoni. The data collection and preparation are relatively straightforward, which is an important feature for broader public-health application. We encourage other groups to adopt and further develop our spatial risk profiling approach for different geographic areas and other neglected tropical diseases to facilitate spatial targeting of control interventions in a timely, equity-based, and cost-effective manner.
Received December 7, 2006. Accepted for publication February 6, 2007.
Acknowledgments: We thank the education officers, directors, teachers, children of the schools surveyed, and the field and laboratory team (A. Allangba, A. Fondio, K. L. Lohourignon, F. Sangaré, B. Sosthène, and M. Traoré) for their participation in the study.
Financial support: This study was supported by the Swiss National Science Foundation through grants to C. Beck-Wörner and J. Utzinger (PP00B-102883), G. Raso (PBBSB-109011) and P. Vounatsou (3252B0-102136). Giovanna Raso was supported by the Novartis Foundation.
* Address correspondence to Jürg Utzinger, Department of Public Health and Epidemiology, Swiss Tropical Institute, P.O. Box, CH-4002 Basel, Switzerland. E-mail: juerg.utzinger{at}unibas.ch
Authors' addresses: Christian Beck-Wörner, Penelope Vounatsou, and Jürg Utzinger, Department of Public Health and Epidemiology, Swiss Tropical Institute, P.O. Box, CH-4002 Basel, Switzerland. Giovanna Raso, Molecular Parasitology Laboratory, Queensland Institute of Medical Research, 300 Herston Road, Herston, Queensland 4006, Australia. Eliézer K. N'Goran, Centre Suisse de Recherches Scientifiques, BP 1303, Abidjan 01, Côte d'Ivoire and UFR Biosciences, Université d'Abidjan-Cocody, Abidjan, Côte d'Ivoire. Gergely Rigo and Eberhard Parlow, Institute of Meteorology, Climatology and Remote Sensing, Department of Environmental Sciences, University of Basel, Klingelbergstrasse 27, CH-4056 Basel, Switzerland.
Reprint requests: Jürg Utzinger, Department of Public Health and Epidemiology, Swiss Tropical Institute, P.O. Box, CH-4002 Basel, Switzerland, Telephone: 41-61-284-8129, Fax: 41-61-284-8105, E-mail: juerg.utzinger{at}unibas.ch.
| REFERENCES |
|
|
|---|