## Introduction

Malaria has become one of the greatest health concerns in Bangladesh, in common with many other developing countries. The malaria endemic region of Bangladesh is located along the eastern and southern border, with about 35 million people at risk. Malaria constrains regional and nationwide economic development, causing deaths, morbidity, and economic losses. Economists believe that vector-borne diseases are responsible for an economic growth penalty of up to 1.3% per year.1 The risk of contracting malaria in endemic areas deters investment, which has a negative impact on economic productivity.

Efforts by the Government of Bangladesh to eradicate malaria have met with only limited success. To manage malaria epidemics effectively, especially in the years of their intensification, public health agencies can benefit from advance knowledge of when and where climate conditions favorable to vector proliferation could be expected, enabling epidemic containment and treatment efforts to be focused where they will be most needed.

Here, we look at remotely sensed climatic correlates of years of heavy malaria outbreaks in Bandarban district, on the southeastern border of Bangladesh, bounded by 22°N and 22.5°N latitude and 91.5°E and 92.5°E longitude. This is a hilly area mostly covered with tropical evergreen forest.

## Malaria, Mosquito, and Environment

*Anopheles dirus* (*An. dirus*) is the most widespread malaria vector mosquito in forested areas of Southeast Asia.2,3 Breeding habitats of *An. dirus* include puddles on footpaths and turbulence pits at the heads of drainage gullies, which hold water for some time without supplemental rainfall.4 In Bangladesh, *An. dirus* and the malaria parasite (*Plasmodium* spp.) are mostly restricted to the hilly districts in the east and south. The forested, green hills of Bandarban district are at the center of the malaria endemic area. The forest and forest fringe areas report more than 90% of total positive cases and more than 70% of total *Plasmodium falciparum* cases in all districts of Bangladesh.5–7 Because mosquitoes are sensitive to moisture and temperature conditions, weather monitoring can give an indication of their activity level and hence the risk of malaria transmission.8 Therefore, this work investigates the application of satellite remote sensing for the characterization of weather conditions in the Bandarban district and their impacts on interannual variation in malaria incidence.

Bandarban district is located in Chittagong division, in the south-eastern part of Bangladesh (http://www.bangladeshgov.org/bdmaps). The ecosystem is highland covered with evergreen forest. Bandarban district has a subtropical warm, wet, and humid climate.3 Total annual rainfall is around 3,000 mm, the average temperature is 21–22°C, and relative humidity is around 80%. Monthly temperature and humidity are stable from year to year (standard deviations are on the order of 1°C and 1%, respectively), but interannual variation in total annual rainfall can be large (1,000 mm/year). In general, two seasons are defined in the annual cycle: wet and warm during April–October and cool and dry during December–February. During the wet season, the maximum temperature can reach 30°C or above.1,9 Variation in monthly rainfall between dry and wet months is the main influence on mosquito activity. In summer, the wet, warm, and humid climate is quite favorable for mosquito activity and produces an increase in the number of malaria cases.

Three principal environmental factors for mosquito activity and malaria epidemiology are important: temperature, humidity, and rainfall. The optimum temperatures for mosquitoes' development and activity are 25–27°C. If daytime temperature exceeds 40°C, mosquitoes are less active and parasite transmission is limited.10 In general, a larger amount of rainfall stimulates mosquitoes' development. However, frequent rainfall during the monsoon period might produce a reduction in malaria transmission, because it washes out eggs and reduces chances for development of adult mosquitoes.11 In the environment of Bandarban district, *An. dirus* females stay active during the period when precipitation exceeds 50 mm per month. However, a combination of large rainfall and hot weather during June–August might reduce mosquito activity. Mosquito activity and malaria transmission are also reduced if the humidity drops below 60%.8,12,13

In our previous work, we analyzed 10 years of malaria data for Bangladesh as a whole.1 Correlation and regression analyses were used to build a predictive model using satellite remote sensing data. In this work, we extend the malaria and satellite series to 14 years, focus on a single district, investigate principal component regression for building a more robust estimator model, and quantify model skill using leave-one-out cross-validation.

## Data and Methods

Available public health data included district population, the number of patients whose blood samples were tested for malaria, and the number of positive tests for malaria. Satellite remote sensing data were used in the form of vegetation indices derived from surface radiance in visible and infrared wavelengths.

### Public health data.

Malaria statistics were collected from the Director General of Health, Bangladesh Ministry of Health. Malaria data were represented by the annual number of people presenting at a local hospital with fever or a history of fever within the last 48 hours, an absence of signs of other disease, and inadequate antimalarials (or none) during the 4 weeks before the present illness. Standard blood slide examination was done by a medical officer trained in malaria microscopy, according to World Health Organization (WHO) guidelines, and slide readers were blinded to the clinical diagnoses. Ten percent each of the positive and negative slides were reexamined blindly by the National Malaria Reference Laboratory to evaluate the accuracy of the study's slide examination results. Tests could detect both *P. falciparum* and *P. vivax*. The hospital data were aggregated by local administrative unit health centers and further by administrative districts.7,14

The available malaria statistics are the number of persons who were tested for malaria in a hospital and the number of positive malaria cases. The dynamics of the annual number of persons tested and the number of positive cases for the Bandarban district during the investigated period, along with population size, are shown in Figure 1. Although the district population increased, the number of persons tested and the number of positive cases decreased slightly, indicating some success of malaria prevention measures such as bed net distribution. If the number of positive malaria cases is expressed as a percent of the number of persons tested, as is normally done in malaria research as an indicator of malaria incidence, a decreasing trend in malaria incidence is also seen.

Interannual variability was expressed relative to this trend as DY = 100 × *Y*/*Y*_{trend}, where *Y* is the observed malaria incidence, *Y*_{trend} is the malaria estimated from trend, and DY is the deviation (%) from the trend. The DY is a good relative metric of malaria cases. In 1998, DY was 119%, or 19% above the trend, whereas in 1992, DY was 87%, or 13% below the trend. Thus, malaria outbreaks were relatively light in 1992 (lower DY) and relatively severe in 1998 (higher DY).15,16
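The detrending step can be sketched as follows. This is a minimal illustration, assuming a straight-line trend fitted by least squares (the form of the trend is an assumption here, not something this section states); the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def linear_trend(years: Sequence[float], values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line values ~ intercept + slope * year (assumed trend form)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def deviation_from_trend(years: Sequence[float], values: Sequence[float]) -> List[float]:
    """DY in percent: 100 means on trend, 119 means 19% above trend."""
    intercept, slope = linear_trend(years, values)
    return [100.0 * y / (intercept + slope * x) for x, y in zip(years, values)]
```

A year whose observed value lies exactly on the fitted line gets DY = 100; values above or below the line give DY above or below 100.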

### Satellite data: Vegetation health indices.

The Advanced Very High Resolution Radiometer (AVHRR) is a satellite-mounted instrument that measures radiances from the earth's surface in several visible and infrared bands, achieving global coverage several times daily at a spatial resolution of 1 km^{2}. The AVHRR-measured satellite data for solar energy reflected or emitted from the land surface were obtained from the United States National Oceanic and Atmospheric Administration (NOAA) Global Vegetation Index (GVI) data set for 1992 through 2005. The AVHRR sensor counts in the visible (VIS, 0.58–0.68 μm, band 1), near infrared (NIR, 0.72–1.00 μm, band 2), and infrared (IR, 10.3–11.3 μm, band 4, and 11.3–12.3 μm, band 5) spectral regions were used to generate the GVI. Post-launch-calibrated VIS and NIR counts were converted to reflectance17 and used to calculate the Normalized Difference Vegetation Index (NDVI = [NIR − VIS]/[NIR + VIS]). The band 4 counts were converted to brightness (radiative) temperature (BT).
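The NDVI formula above translates directly into code. The sketch below assumes the inputs are already calibrated reflectances; the function name and the example values are illustrative only.

```python
def ndvi(nir: float, vis: float) -> float:
    """Normalized Difference Vegetation Index from NIR and VIS reflectance."""
    total = nir + vis
    if total == 0:
        raise ValueError("NIR + VIS must be non-zero")
    return (nir - vis) / total


# Illustrative reflectances, not values from the GVI data set.
example_ndvi = ndvi(nir=0.5, vis=0.1)
```

Dense green vegetation reflects strongly in the NIR and weakly in the VIS, so it pushes NDVI toward 1, whereas equal reflectance in both bands gives 0.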

In the composite vegetation health index (VHI), which combines the moisture (VCI) and thermal (TCI) condition indices, *a* is a coefficient quantifying the share of VCI versus TCI contribution in overall vegetation health. Because this share is not known for a specific location, it is assumed that the shares are equal, *a* = 0.5.19 All three indices are scaled to range from 0 (severe vegetation stress: low NDVI, hot BT) to 100 (exceptionally favorable conditions: high NDVI, cool BT).1,20 The GVI data set, processed at 16 km^{2} spatial resolution and weekly time resolution, was averaged over land pixels in Bandarban, Bangladesh.17
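As a rough sketch of how such 0–100 indices can be computed, the code below uses the min-max scaling commonly published for vegetation health indices. The defining equations are not shown in this section, so these forms, the function names, and the multi-year minimum and maximum arguments are assumptions for illustration.

```python
def vci(ndvi: float, ndvi_min: float, ndvi_max: float) -> float:
    """Moisture condition: 0 at the multi-year NDVI minimum, 100 at the maximum."""
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)


def tci(bt: float, bt_min: float, bt_max: float) -> float:
    """Thermal condition: 0 at the hottest BT on record, 100 at the coolest."""
    return 100.0 * (bt_max - bt) / (bt_max - bt_min)


def vhi(vci_value: float, tci_value: float, a: float = 0.5) -> float:
    """Composite index; a is the share of VCI, with equal shares by default."""
    return a * vci_value + (1.0 - a) * tci_value
```

Note the reversed numerator in `tci`: a cooler brightness temperature gives a higher index, matching the scaling described above (hot BT near 0, cool BT near 100).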

## Results

### Correlation analysis.

We begin by examining the correlation of the yearly malaria deviation from trend (DY) with the VH indices for each week of the year, where DY is the detrended index of malaria incidence intended to capture interannual variability in the intensity of malaria outbreaks. Figure 2 shows the dynamics of correlation coefficients for DY versus VCI, TCI, and VHI for the Bandarban district. During cooler months, when mosquitoes are less active, correlation with DY is low. At the beginning of the rainy season (Week 16), when mosquito activity starts, the correlation of TCI with the annual malaria case index starts increasing, reaching a maximum of +0.56 at the end of June (Weeks 25–26) during the first peak malaria season (May–June). After the first peak malaria season, the correlation gradually decreases to a low level, indicating that TCI has low predictive ability in those weeks. From Week 32 the correlation again increases, reaching a secondary maximum of +0.53 at the end of September (Week 35) during the second peak malaria season (August–September).9 After this second maximum, the correlation decreases to a low level by the beginning of the next cool season in November. Correlations of malaria cases with VCI and VHI are lower (|*r*| < 0.50).

It is important to emphasize that the correlation of DY with TCI is positive, indicating that a larger number of malaria cases (DY is above the trend) is associated with TCI greater than 50, which indicates cooler weather. A smaller number of malaria cases (DY is below trend) is associated with lower TCI (below 50, hotter weather).1
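The week-by-week correlation analysis can be sketched as below: for each week of the year, the series of that week's TCI across years is correlated with the annual DY series. The data structure (a dictionary from week number to yearly values) and all numbers are hypothetical, chosen only to show the mechanics.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weekly_correlations(dy: Sequence[float],
                        tci_by_week: Dict[int, Sequence[float]]) -> Dict[int, float]:
    """Correlate annual DY with the across-year TCI series of each week."""
    return {week: pearson(series, dy) for week, series in sorted(tci_by_week.items())}


# Hypothetical values for four years, not the study data.
dy_example = [87.0, 101.0, 119.0, 95.0]
tci_example = {2: [55.0, 50.0, 51.0, 56.0], 25: [40.0, 52.0, 66.0, 47.0]}
r_by_week = weekly_correlations(dy_example, tci_example)
```

In this toy example the Week 25 series rises and falls with DY and so yields a strongly positive coefficient, the same sign of relationship described above (higher TCI, meaning cooler conditions, in years with DY above trend).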

The differences in TCI dynamics between years with relatively high and low malaria incidence can be seen by comparing TCI for individual years. For example, the TCI time series for 1992 (a year with low DY) and 1998 (high DY), shown in Figure 3, illustrate the observed positive correlation: above-average malaria cases are associated with higher TCI (cooler thermal conditions) and below-average malaria cases with lower TCI (hotter conditions) during Weeks 22–27 and 33–37 and in the weeks before and after.19

### Regression analysis.

Table 1. Results of multiple linear regression of DY on weekly TCI (Equations 5 and 6)

Equation 5 (Weeks 22–27)

Variable | Parameter estimate | Standard error | *t* value | Pr > \|*t*\| |
---|---|---|---|---|
TCI_{22} | 0.04359 | 0.33019 | 0.13 | 0.8987 |
TCI_{23} | 0.27024 | 0.50766 | 0.53 | 0.6110 |
TCI_{24} | −1.75771 | 0.83040 | −2.12 | 0.0721 |
TCI_{25} | 2.01674 | 0.83730 | 2.41 | 0.0469 |
TCI_{26} | 1.58026 | 0.93359 | 1.69 | 0.1344 |
TCI_{27} | −1.67617 | 0.61073 | −2.74 | 0.0287 |

Equation 5: *R*^{2} = 0.73, RMSE = 6.40, *F* = 3.11, *P* < 0.08.

Equation 6 (Weeks 33–37)

Variable | Parameter estimate | Standard error | *t* value | Pr > \|*t*\| |
---|---|---|---|---|
TCI_{33} | −1.66775 | 0.62139 | −2.68 | 0.0278 |
TCI_{34} | 1.15251 | 0.82263 | 1.40 | 0.1988 |
TCI_{35} | 1.53209 | 0.57741 | 2.65 | 0.0291 |
TCI_{36} | −0.6309 | 0.77003 | −0.82 | 0.4363 |
TCI_{37} | −0.19507 | 0.43840 | −0.44 | 0.6681 |

Equation 6: *R*^{2} = 0.79, RMSE = 5.25, *F* = 6.09, *P* < 0.01.

RMSE = root mean-square error.

A comparison of the statistical significance of the overall models with that of the partial regression coefficients reveals multicollinearity. The overall model for Equation 6 is highly significant (*F* = 6.09, *P* < 0.01), whereas the model for Equation 5 is not significant at the *P* < 0.05 level (*F* = 3.11, *P* < 0.08). In both models, most partial regression coefficients are not significant at the *P* < 0.05 level, with *P* values as large as 0.90 (Equation 5) and 0.67 (Equation 6). This type of result is a natural consequence of multicollinearity: the overall model may fit the data quite well, but because several independent variables measure similar phenomena (vegetation index for consecutive weeks), it is difficult to determine which of the individual variables contribute significantly to the regression relationship. The TCI values of neighboring weeks are highly correlated, with correlation coefficients of over 0.97.
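One way to see why such high correlations blur the individual coefficients is the variance inflation factor (VIF), a standard diagnostic that is not part of the analysis reported here. The sketch below covers only the two-predictor case, where the VIF depends on the single pairwise correlation; the function name is hypothetical.

```python
def pairwise_vif(r: float) -> float:
    """Variance inflation factor for two predictors with correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Correlation of 0.97 between neighboring weeks, as quoted in the text.
vif_neighbors = pairwise_vif(0.97)
```

With *r* = 0.97 the coefficient variance is inflated by a factor of about 17 relative to uncorrelated predictors, and adding further correlated weeks can only raise this factor, since the VIF for a predictor never decreases when more predictors are included.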

### Principal component regression.

To reduce the effects of multicollinearity, we used an alternative method of estimation, principal component regression, which can give more stable estimates than ordinary least squares when the independent variables are highly correlated. This alternative has the potential to produce more precise estimated coefficients and smaller estimation errors when the model is applied to independent data.22,23

Using principal components regression methodology, the variables in model Equations 5 and 6 were transformed into new orthogonal or uncorrelated variables called principal components of the correlation matrix. Principal components were sequentially tested for their contribution to improving the regression model for malaria cases, keeping only those that resulted in a significant (at the 0.05 level) reduction in residual variance. Once the regression coefficients for the reduced set of orthogonal variables are calculated, they are transformed into a new set of coefficients that correspond to the original or initial correlated set of variables in the model equations (5 and 6). These new coefficients are called principal component estimators.4
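The procedure just described can be sketched with NumPy as follows. This is an illustrative outline, not the software used in the study: the function names are hypothetical, the retained components are passed in through the hypothetical `keep` argument rather than chosen by significance testing, and NumPy is assumed to be installed.

```python
import numpy as np


def pcr_fit(X, y, keep):
    """Principal component regression on the correlation matrix of X.

    X: (n_years, n_weeks) predictors; y: (n_years,) response;
    keep: indices of retained components (0 = largest eigenvalue).
    Returns (intercept, coefficients) on the original variables.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                      # standardized predictors
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    V = eigvecs[:, order][:, list(keep)]
    scores = Z @ V                            # uncorrelated component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta = (V @ gamma) / std                  # back to original variables
    intercept = y.mean() - mean @ beta
    return intercept, beta


def pcr_predict(intercept, beta, X):
    """Apply principal component estimators to new predictor rows."""
    return intercept + np.asarray(X, dtype=float) @ beta
```

Keeping every component reproduces ordinary least squares; any stabilizing effect comes from dropping components that do not contribute to the fit.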

The first part of Table 2A shows the eigenvalues of TCI for Weeks 22–27. The “Eigenvalue” column shows that the first principal component (PC) accounts for most of the total variance in interannual TCI over those weeks (5.59), the second accounts for much less variance (0.33), and the others account for still smaller fractions of the variance. The “Difference” column gives the difference between adjacent eigenvalues, or the rate of decrease in the variances of the PCs. The proportion of total variation accounted for by each of the components is given in the “Proportion” column. The first component accounts for 93% of the total variation, a result that is typical when the original variables are highly correlated. The “Cumulative” column indicates that more than 99% of the total variation in the six variables is explained by the first four components.

Table 2A. Principal component results for first malaria peak (Weeks 22–27)

Eigenvalues

Component | Eigenvalue | Difference | Proportion | Cumulative |
---|---|---|---|---|
1 | 5.59215618 | 5.25778218 | 0.9320 | 0.9320 |
2 | 0.33437401 | 0.29885956 | 0.0557 | 0.9878 |
3 | 0.03551445 | 0.00857204 | 0.0059 | 0.9937 |
4 | 0.02694241 | 0.01823296 | 0.0045 | 0.9982 |
5 | 0.00870945 | 0.00640596 | 0.0015 | 0.9996 |
6 | 0.00230349 | | 0.0004 | 1.0000 |

Eigenvectors

Variable | Prin1 | Prin2 | Prin3 | Prin4 | Prin5 | Prin6 |
---|---|---|---|---|---|---|
TCI_{22} | 0.404343 | −0.440092 | 0.750207 | 0.157439 | −0.166207 | 0.166144 |
TCI_{23} | 0.407259 | −0.424366 | −0.411644 | 0.447469 | 0.177628 | −0.502815 |
TCI_{24} | 0.416436 | −0.245754 | −0.425313 | −0.291000 | 0.157872 | 0.689703 |
TCI_{25} | 0.418769 | 0.095212 | 0.105669 | −0.754476 | 0.046412 | −0.482714 |
TCI_{26} | 0.409027 | 0.412989 | −0.194202 | 0.183256 | −0.767929 | 0.033529 |
TCI_{27} | 0.393138 | 0.621460 | 0.194852 | 0.295781 | 0.569237 | 0.098720 |

The second part of Table 2A (“Eigenvectors”) shows the eigenvectors for each of the PCs. These coefficients relate the components to the original variables listed in the first column and are scaled so that their sum of squares is unity. The first PC approximates an average of the weekly TCIs, with slightly larger weights for TCI_{24} (0.416) and TCI_{25} (0.419). Note that while the first PC accounts for most of the interannual variation in TCI, this does not necessarily mean that it encapsulates the relationship between TCI and malaria incidence (DY); this has to be determined by regression of DY against the PCs.
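The “Proportion” and “Cumulative” columns and the unit-length scaling of the eigenvectors can be checked directly from the tabulated values. The short sketch below does this for the Weeks 22–27 eigenvalues and the first eigenvector; the helper name is hypothetical.

```python
from typing import List, Sequence, Tuple


def explained_variance(eigenvalues: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return per-component and cumulative shares of total variance."""
    total = sum(eigenvalues)
    proportions = [value / total for value in eigenvalues]
    cumulative = []
    running = 0.0
    for share in proportions:
        running += share
        cumulative.append(running)
    return proportions, cumulative


# Eigenvalues and first eigenvector as tabulated for Weeks 22-27.
eigenvalues_22_27 = [5.59215618, 0.33437401, 0.03551445,
                     0.02694241, 0.00870945, 0.00230349]
prin1_22_27 = [0.404343, 0.407259, 0.416436, 0.418769, 0.409027, 0.393138]

proportions, cumulative = explained_variance(eigenvalues_22_27)
prin1_sum_of_squares = sum(weight * weight for weight in prin1_22_27)
```

Because the components are extracted from a correlation matrix of six variables, the eigenvalues sum to six, so each proportion is simply the eigenvalue divided by six.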

The first part of Table 2B shows the eigenvalues of TCI for Weeks 33–37. The “Eigenvalue” column shows that the first PC accounts for most of the total variance in interannual TCI over those weeks (4.95), and the others account for much smaller fractions of the variance. The first component accounts for 99% of the total variation, a result that is typical when the original variables are highly correlated.

Table 2B. Principal component results for second malaria peak (Weeks 33–37)

Eigenvalues

Component | Eigenvalue | Difference | Proportion | Cumulative |
---|---|---|---|---|
1 | 4.94689827 | 4.91321826 | 0.9894 | 0.9894 |
2 | 0.03368001 | 0.01911162 | 0.0067 | 0.9961 |
3 | 0.01456839 | 0.01182582 | 0.0029 | 0.9990 |
4 | 0.00274257 | 0.00063180 | 0.0005 | 0.9996 |
5 | 0.00211077 | | 0.0004 | 1.0000 |

Eigenvectors

Variable | Prin1 | Prin2 | Prin3 | Prin4 | Prin5 |
---|---|---|---|---|---|
TCI_{33} | 0.445969 | −0.566017 | 0.583334 | 0.345175 | 0.145989 |
TCI_{34} | 0.448409 | −0.314283 | −0.116108 | −0.745235 | −0.362353 |
TCI_{35} | 0.447864 | −0.107851 | −0.674776 | 0.149300 | 0.556931 |
TCI_{36} | 0.448478 | 0.295429 | −0.183769 | 0.494097 | −0.658549 |
TCI_{37} | 0.445338 | 0.694220 | 0.396417 | −0.243018 | 0.321759 |

The second part of Table 2B (“Eigenvectors”) shows that the first PC approximates an average of the weekly TCIs, with slightly larger weights to TCI_{34} (0.448) and TCI_{36} (0.448).

Table 3 shows a summary of stepwise selection of PCs for TCI for both peak malaria periods. For Weeks 22–27 TCI (Equation 5) the first and sixth PCs were significant estimators, whereas for Weeks 33–37 (Equation 6) the first, second, and third PCs were significant estimators. In each case, the first PC corresponds approximately to the average value of TCI over Weeks 22–27 and 33–37, respectively, whereas the other PCs describe trends in TCI over the period.

Table 3. Selection of principal components for prediction based on stepwise regression

Model | Principal components | *R*^{2} | *F* value | Pr > *F* |
---|---|---|---|---|
TCI_{22–27} (Equation 5) | Prin1 Prin6 | 0.57 | 7.24 | 0.009 |
TCI_{33–37} (Equation 6) | Prin1 Prin2 Prin3 | 0.74 | 9.60 | 0.003 |

## Validation of Model

Validation is the step in which the ability of the chosen model to adequately describe the phenomenon under study, in this case interannual variation in malaria incidence, is tested using independent evidence. Because the data record is short, leave-one-out cross-validation19 (the jackknife technique) was used to verify the predictive value of vegetation indices derived from satellite imaging for malaria cases later in the season. Each of the 14 years was removed from the data set in turn, and a principal components regression model (DY = *f* [TCI]) using TCI from Weeks 33–37 was fit to the remaining 13 years, employing the same criteria as those used above for fitting the entire set. This model was then applied to the satellite data of the eliminated year to estimate that year's deviation of malaria cases from trend (DY). The eliminated year was then returned to the data set and the next year was removed for model development and testing. As a result of this procedure, 14 independent estimations were obtained.
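The leave-one-out loop can be sketched as follows. To keep the example short and dependency-free, a one-predictor straight-line fit stands in for the principal components regression model; that substitution, the function names, and the `fit`/`predict` arguments are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Model = Tuple[float, float]  # (intercept, slope)


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Least-squares straight line, used here as a stand-in model."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_line(model: Model, x: float) -> float:
    intercept, slope = model
    return intercept + slope * x


def leave_one_out(xs: Sequence[float], ys: Sequence[float],
                  fit: Callable = fit_line,
                  predict: Callable = predict_line) -> List[float]:
    """Estimate each year from a model fit to all the other years."""
    estimates = []
    for held_out in range(len(ys)):
        train_x = [x for i, x in enumerate(xs) if i != held_out]
        train_y = [y for i, y in enumerate(ys) if i != held_out]
        model = fit(train_x, train_y)          # never sees the held-out year
        estimates.append(predict(model, xs[held_out]))
    return estimates
```

With 14 years of data the loop fits 14 models, each on 13 years, and returns 14 estimates, none of which used its own year for fitting.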

Figure 4A displays observed versus independently estimated time series of percent positive malaria cases, which shows that years of both high and low malaria incidence are generally estimated well, with an *R*^{2} between estimated and observed percent of malaria cases of 0.78. The root-mean-square estimation error is 2.08, compared with 3.28 for a naive forecast of the mean incidence of malaria cases for each year, corresponding to a reduction of about 60% in estimation error variance (1 − [2.08/3.28]^{2} ≈ 0.60).
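The skill measures used here can be written out as below. This is a generic sketch with hypothetical function names; for simplicity the naive reference forecasts the overall mean of the observations, which is an assumption about how the reference was formed.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], estimated: Sequence[float]) -> float:
    """Root-mean-square difference between observed and estimated values."""
    squared = [(o - e) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def naive_mean_rmse(observed: Sequence[float]) -> float:
    """RMSE of a reference forecast that always predicts the mean."""
    mean = sum(observed) / len(observed)
    return rmse(observed, [mean] * len(observed))


def error_variance_reduction(rmse_model: float, rmse_reference: float) -> float:
    """Fractional reduction in mean squared error relative to the reference."""
    return 1.0 - (rmse_model / rmse_reference) ** 2
```

Because error variance scales with the square of the RMSE, halving the RMSE relative to the reference corresponds to a 75% reduction in error variance.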

We also performed a similar validation for Weeks 22–27. Figure 4B displays observed versus estimated time series of percent of malaria cases, with an *R*^{2} between estimated and observed percent of malaria cases of 0.65.

## Discussion

We tested indices derived from satellite remote sensing that characterize moisture (VCI) and thermal (TCI) conditions as estimators of malaria cases in the Bandarban district, Bangladesh, over 1992–2005. Correlation of the deviation of malaria cases from trend (DY) with TCI was strong during June and September (the main malaria season) and weak during November through May. Beginning in April, when the warm season starts, the correlation with TCI rapidly increases, and in June (Weeks 22–27) it reaches a maximum, corresponding to the first of the two seasonal peaks in malaria transmission (before the heaviest summer monsoon rains). A second correlation maximum is reached in September (Weeks 33–37), during the second peak in malaria transmission. We found that interannual variation of malaria incidence was more sensitive to thermal conditions (TCI) than to moisture conditions (VCI) or the composite vegetation health index (VHI). The correlation between TCI and malaria incidence was large enough to enable statistical models to be developed that skillfully estimated interannual variation in malaria incidence based on TCI.

Although this correlation is qualitatively consistent with established understanding of the meteorology and mosquito ecology in this region of Bangladesh, comparing the VHI with available surface and satellite data for traditional meteorological variables, such as temperature, precipitation, and cloudiness, should offer further insight into exactly how TCI reflects variables relevant to malaria transmission. Further work will also study the distribution of malaria cases within each year, and whether remotely sensed VHI from early in the year can provide advance warning, several weeks or months ahead, of conditions conducive to severe malaria outbreaks. This would enable better deployment of public health resources to prevent and treat malaria.

## Conclusion

Correlation and regression analyses show that malaria epidemic risk can be estimated with VHI. Similar models based on VHI might be calibrated to estimate malaria and other vector-borne diseases in other regions as well. The Vegetation Health Indices used in this study are available in real time (weekly) at http://orbit.nesdis.noaa.gov/smcd/emcb. Further study might include other high-resolution data such as sea surface temperature and soil moisture from satellite instruments, such as MODIS, AMSR–E, and RADARSAT, and might combine satellite sensor data with ground-based weather data.

## References

- 1. Rahman A, Kogan F, Roytman L, 2006. Analysis of malaria cases in Bangladesh with remote sensing data. *Am J Trop Med Hyg* 74: 17–19.
- 3. Rosenberg R, Maheswary N, 1982. Forest malaria in Bangladesh. I. Parasitology. *Am J Trop Med Hyg* 31: 175–191.
- 4. Russel F, West L, Manwell D, Macdonald G, 1963. *Practical Malariology*. London, UK: Oxford University Press.
- 5. Elias M, Rahman M, 1987. The ecology of malaria carrying mosquito *Anopheles philippinensis* Ludlow and its relation to malaria in Bangladesh. *Medical Research Council Bulletin, Bangladesh* 13: 15–28.
- 6. Ingrid F, Van B, 2004. Drug resistance in *Plasmodium falciparum* from the Chittagong Hill Tracts, Bangladesh. *Trop Med Int Health* 9: 680–687.
- 7. Paresul A, 2008. *Malaria country report. Malaria and Parasitic Disease Control Unit*. Bangladesh: Directorate General of Health Services.
- 8. Hay I, Rogers J, Randolph E, Stern I, Cox J, Shanks D, Snow W, 2002. Hot topic or hot air? Climate change and malaria resurgence in east African highlands. *Trends Parasitol* 18: 530–534.
- 9. Faiz M, Yunus B, Rahman R, Hossain A, Pang W, Rahman E, Bhuiya N, 2002. Failure of national guidelines to diagnose uncomplicated malaria in Bangladesh. *Am J Trop Med Hyg* 67: 396–399.
- 10. Mcmichael J, Haines A, Slooff R, 1996. *Climate Change and Human Health*. Geneva, Switzerland: World Health Organization, 29.
- 11. Githeko A, Lindsay S, Confalonieero U, Patz J, 2000. Climate change and vector-borne diseases: a regional analysis. *Bull World Health Organ* 78: 1136–1147.
- 12. Bouma M, 2003. Methodological problems and amendments to demonstrate effects of temperature on the epidemiology of malaria. A new perspective on the highland epidemics in Madagascar, 1972–89. *Trans R Soc Trop Med Hyg* 97: 133–139.
- 13. Pampana E, 1969. *A Text Book of Malaria Eradication*. London, UK: Oxford University Press, 17–63.
- 14. Wickramasinghe R, Gunawardena M, Mahawithanage T, 2002. Use of routinely collected past surveillance data in identifying and mapping high risk areas in a malaria endemic area of Sri Lanka. *SE Asian J Trop Med Publ Health* 33: 678–684.
- 15. Brockwell P, Davis R, 2000. *Introduction to Time Series and Forecasting*. New York: Springer, 15–39.
- 16. Salazar L, Kogan F, Roytman L, 2008. Use of remote sensing data for estimation of winter wheat yield in the United States. *Int J Remote Sens* 29: 175–189.
- 17. Kidwel B, 1997. *Global Vegetation Index User's Guide*. Camp Springs, MD: U.S. Dept. of Commerce, National Oceanic and Atmospheric Administration, National Environmental Satellite Data and Information Service, National Climatic Data Center, Satellite Data Services Division.
- 19. Jensen R, 2000. *Remote Sensing of the Environment: An Earth Resource Perspective*. Upper Saddle River, NJ: Prentice Hall.
- 20. Kogan F, Bangjie Y, Guo W, Pei Z, Jiao X, 2005. Modeling corn production in China using AVHRR-based vegetation health indices. *Int J Remote Sens* 26: 2325–2336.