| ABSTRACT |
|
|
|---|
| INTRODUCTION |
|
|
|---|
Several researchers have investigated methods for estimating prevalence of infection from data collected from screening pools of insects. The alternative term "group testing" has been used by a number of these. Chiang and Reeves3 considered the maximum likelihood estimator (MLE) for the case of equal pool sizes. They also presented graphs giving exact confidence intervals (CIs) for selected pool sizes. Thompson4 studied the properties of the MLE for the case of equal pool sizes and suggested an optimal choice of pool size based on minimization of the mean square error. Burrows5 produced a bias-reduced estimator and explored its properties. Katholi and others6 also considered the MLE and in addition gave formulas for the calculation of exact CIs suitable for hand calculation given only a table of the F-distribution. Barker7 extended the study of the MLE to the case where the sizes of the samples tested were no longer equal. In addition, she produced a simple expansion for the MLE suitable for hand calculation. She also studied moment estimators, a bias-reduced estimator for the equal pool size case based on the jackknife method (different from the one given by Burrows), and a Bayesian approach to the problem. Finally, she considered several different CIs based on classic methods and the Bayesian approach. Hepworth8 produced similar results for clusters of pools with equal sample sizes within the clusters. He also produced exact CIs that are different from those produced by Barker.7
Each of these approaches assumes that a collection of samples (pools) was gathered and that each pool was tested with the assay to see if it was positive. Point estimates and CIs are then calculated from the total number of pools screened, the number of positive pools observed, and the pool sizes.
Pool screening is now commonly used as a tool to monitor both parasitic9–15 and viral16–19 arthropod-borne infections. However, it is not often appreciated that each of the methods for calculating the prevalence of infection in a vector population from pool screen data is based on a model that is in turn built on certain underlying assumptions concerning the methods used for collecting the samples. Furthermore, it is often unappreciated that deriving estimates of infection prevalence from a sampling procedure (whether it be by screening individual insects collected in a trap or by screening pools) is subject to errors introduced by the sampling process and that this source of error is influenced both by the number of insects examined and by the methods used to examine them. This can lead to invalid estimates of infection rates in the vector population, because the underlying assumptions made by the statistical models may be violated. The overall goal of this study is to clarify these assumptions and indicate how different sampling protocols may be in concordance with, or in violation of, these assumptions. First, we discuss the assumptions that underlie the methods that have been developed for calculating infection rates based on the results obtained by pool screening. Second, we discuss the consequences of sampling pools on the estimates of infection rates, and in particular, how sample numbers and pool sizes affect the precision of these estimates. Finally, we discuss the correct interpretation of the CIs that accompany estimates of infection prevalence obtained by these methods.
| RESULTS AND DISCUSSION |
|
|
|---|
If individual insects are tested, the estimate of the probability that a given insect is infected is simply the ratio of the number of positive insects observed divided by the total number of insects examined. The underlying probability model for this situation is the binomial distribution, and CIs for the probability of infection are calculated from this by standard methods.20
When pools are sampled, the probability that a pool is not infected is $(1 - p)^n$, where p is the probability that any given individual insect is infected, and n is the pool size. The probability model for the number of positive pools observed is again a binomial distribution. Again, standard methods are used to calculate point estimates and CIs. When the pools are not all the same size, denote the size of the ith pool by $n_i$; the probability that the pool is not infected is again $(1 - p)^{n_i}$. The distribution of the number of positive pools is no longer binomial, however, and the calculation of point estimators and CIs requires considerable computation.7
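The two cases above can be illustrated with a short Python sketch. It is not taken from any of the cited programs: the function names are hypothetical, the equal-pool estimate uses the closed form that follows from the binomial model, and the unequal-pool estimate is found by a simple bisection on the derivative of the log-likelihood, which is one of several possible numerical approaches.

```python
def mle_equal_pools(positives, pools, pool_size):
    """MLE of p when T of M pools of common size K test positive."""
    return 1.0 - (1.0 - positives / pools) ** (1.0 / pool_size)


def mle_unequal_pools(sizes, results, tol=1e-12):
    """MLE of p for pools of differing sizes (illustrative bisection).

    sizes   -- pool sizes n_i
    results -- True for a positive pool, False for a negative pool
    """
    if not any(results):
        return 0.0          # no positive pools: the likelihood peaks at p = 0
    if all(results):
        return 1.0          # every pool positive: the likelihood peaks at p = 1

    def score(p):
        # Derivative of the log-likelihood with respect to p.
        q = 1.0 - p
        total = 0.0
        for n, positive in zip(sizes, results):
            if positive:
                total += n * q ** (n - 1) / (1.0 - q ** n)
            else:
                total -= n / q
        return total

    # The log-likelihood is concave in q = 1 - p, so the score changes sign once.
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print(mle_equal_pools(3, 10, 25))
    print(mle_unequal_pools([25, 25, 50, 50, 10], [True, False, False, True, False]))
```

When all pools happen to have the same size, the two functions should agree, which gives a simple check on the numerical routine.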
The calculation methods using either equal or unequal pools assume that the samples are collected from an essentially infinite population and that the methods used in the sampling draw a random sample from this population. In these situations, the model assumes that the sample that is collected is assayed in its entirety. As shown below, the larger the sample (i.e., the more individual insects examined), the more accurate the estimate of the prevalence of infection will be. Thus, when one plans to use a pool screening approach to screen collected vectors, it is best to collect and screen as many individual insects as possible, given the limitations imposed by time and cost.
It is also possible to take another approach to screening equal sized pools that in some cases may be more efficient than screening the entire collection. Here the investigator collects and screens pools of equal size, each containing K insects, until R positive pools are obtained. Data collected in this way may be used to provide an estimate of the infection rate in the vector population. However, it is important to note that the assumptions underlying this method are different from those of the model described above. In this case, the probability model is based on the total number of trials required to find R positive pools, and the appropriate probability model is no longer the binomial; it is the negative binomial. Therefore, the methods used to calculate the point estimates of infection rate and the associated CIs using this model will not be the same as those described above, and the values for the estimate of prevalence of infection will also differ somewhat from those obtained when one has screened an entire collection of insects. It is also important to note that this negative binomial model will not support the use of unequal pool sizes. A technical report describing this method and the properties of the estimates obtained is currently in preparation by the authors. In summary, one can calculate the prevalence of infection from screens involving equal or unequal pools when screening an entire collection, or by adopting a sequential collection and screening protocol if one has pools of equal size. However, it is imperative that care be taken to ensure that the collection protocol does not violate the underlying assumptions of the probability model that was used to construct the computational algorithm that one plans to use to analyze the resulting data.
The most important of these assumptions when one is using the methods developed by Katholi and others,6 Hepworth,8 or Barker7 is that the collection screened must represent a truly random sample of an essentially infinite population.
What if one actually collects a number of insects that is greater than the laboratory is capable of screening? For example, if one faces a trap in which 1,500 mosquitoes have been collected, one might be tempted to just pull 500 of these from the trap to test. As described in detail in the appendix, such sub-sampling is allowable as long as one is sampling from an essentially infinite population, although at first glance the situation seems otherwise. In particular, the contents of each trap can be viewed as analogous to a bowl containing balls of two different colors: say red and green. A sub-sample is drawn from this bowl, and the question of interest is, "are there any red balls in the sub-sample?" A "red ball" detection assay is performed on the sub-sample and returns a yes or no answer. Sampling from a finite population containing only two kinds of objects follows the hypergeometric distribution. When this is done, the usual assumption is that the size of the overall population is known and that the proportion of red balls in this overall population is also known. The hypergeometric distribution gives the probabilities that a sub-sample of a given size contains X red balls. Because our test is a yes or no test, the probabilities of interest are the probability that our sample contains no red balls (i.e., is negative) and the complementary probability that there is at least one red ball. Again, our testing procedure is a Bernoulli trial; however, the probability of a negative sample conditional on there being M red balls in the original collection of size N is the complex quantity
$$P(X = 0 \mid M) = \frac{\binom{N-M}{K}}{\binom{N}{K}},$$
where N is the number in the trap, K is the size of the sub-sample, and M is the unknown number of red balls in the trap. However, the random variable M has a binomial distribution with parameters N and p. Hence, as is shown in the Appendix, the unconditional probability that we observe a negative test is $(1 - p)^K$. This being the case, samples found and tested as sub-samples from traps or other collection methods fit the same probability model discussed by Chiang and Reeves,3 Thompson,4 Katholi and others,6 Barker,7 Burrows,5 and Hepworth.8 However, it is important to note that these models all assume that the sample (and any subsequent sub-sample) is drawn from an essentially infinite population of insects. Thus, it is necessary to ensure that any sampling scheme to be used in conjunction with pool screening will draw an overall sample that is small compared with the total insect population present.
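The sub-sampling argument can be checked numerically. The following Python sketch is only an illustration with a hypothetical function name: it averages the hypergeometric probability of a negative sub-sample over a binomial number of infected insects in the trap and compares the result with $(1 - p)^K$. A small trap (60 insects) is assumed so that the binomial coefficients stay within floating-point range.

```python
from math import comb


def prob_negative_subsample(trap_size, subsample_size, p):
    """Unconditional probability that a sub-sample of size K from a trap of
    size N contains no infected insect, when the number infected in the trap
    (M) is Binomial(N, p)."""
    total = 0.0
    # For M > N - K the sub-sample must contain an infected insect,
    # so those terms contribute nothing.
    for m in range(trap_size - subsample_size + 1):
        p_m = comb(trap_size, m) * p ** m * (1 - p) ** (trap_size - m)
        p_neg_given_m = comb(trap_size - m, subsample_size) / comb(trap_size, subsample_size)
        total += p_m * p_neg_given_m
    return total


if __name__ == "__main__":
    n_trap, k_sub, prevalence = 60, 20, 0.02   # assumed illustrative values
    print(prob_negative_subsample(n_trap, k_sub, prevalence))
    print((1 - prevalence) ** k_sub)
```

The two printed values should agree to within rounding error, whatever trap size is chosen.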
How many insects should be included in each pool, how many insects should be screened and how should pools be constructed?
Statisticians generally consider two measures of merit when considering the quality of a point estimate. These are bias and mean square error (MSE). Because the maximum likelihood estimator of a parameter is a function of the observed random variable, it is also a random variable and hence has an expected value (mean) and a variance. Bias is a measure of the extent to which the expected value of the estimator fails to equal the true value of the parameter. Thus, if we denote the estimate of p by $\hat{p}$, $\mathrm{Bias}(\hat{p}) = E(\hat{p}) - p$. Similarly, the MSE is defined as $\mathrm{MSE}(\hat{p}) = E[(\hat{p} - p)^2]$. The ideal estimator will have zero bias and minimum MSE. Unfortunately, it is not always easy to achieve these goals. Many investigators argue that they are willing to accept some bias in exchange for a smaller MSE. It is important, therefore, when considering an estimator to have a good grasp of the bias and MSE for the estimator and how they are influenced by ancillary factors (e.g., pool size).
For the pool screening estimator (when pool sizes are equal), it is not difficult to show that it is biased when the pool size is greater than 1 and that the bias increases as the pool size increases (see Appendix for mathematical details). The bias is in an upward direction (i.e., on average, the point estimate is somewhat larger than the actual infection rate). The size of the bias is also influenced by the value of the unknown parameter, p. Again, it can be shown that the bias increases as p increases. However, the bias is quite small when the true value of p is small and becomes substantial only as p (or the infection rate) gets larger than, say, 0.1 (or 10%).
The MSE also is influenced by both p and the pool size. Thompson4 showed that for any p, the MSE is approximately minimized when the pool size is taken to be
![]() |
Tables 1–3 give some data for the influence of the pool size (K) on the bias and the expected values of the endpoints of a 95% CI for the cases when the infection rates are 1/50 (2.0%), 1/400 (0.25%) and 1/1,000 (0.1%), respectively. It is shown in the Appendix that the key factor in the ultimate size of the bias is a combination of the pool size (K) and the number of pools (M). As discussed below, the chemistry of the assay will generally be the limiting factor in determining the upper bound of the pool size. Thus, the main factor that one may influence in a sampling scheme will be the number of pools. As this number becomes large for any pool size (K) and any unknown p, the bias tends to zero at about the rate 1/M; the MSE gets small at the same rate as well.
Another point made by the calculations summarized in Tables 1–3 is that the range between the endpoints of a 95% CI is generally much larger than the degree of bias introduced into the point estimator by the pool screening process. For example, in Table 2, the range between the upper and lower bounds of a 95% CI is roughly 4-fold (0.001, 0.004). The 95% CI is also fairly stable, increasing by just 9% when comparing the interval obtained from screening pools of 100 insects to the interval obtained from screening a pool size of 1 (i.e., screening each insect individually; Table 2). Thus, the majority of the error surrounding the point estimate of the infection rate is associated with the random variation in the sampling process itself and not with the process of pool screening. That is, most of this error is also present when one screens individual insects (by dissection for example) and needs to be considered when calculating infection rates obtained by this method as well.
Given that the point estimates derived from pool screening will be biased to some extent, questions immediately arise concerning how large the pools should be. As pointed out by Hepworth,8 if the pools are too large, it is likely that they will all test positive, leading to the estimate $\hat{p} = 1$; if they are too small, we are likely to have a very large number of negative pools, and the costs associated with the testing will be unreasonable. The sensitivity and specificity of the assay will also impose limits on just how large the pools can be. The Bernoulli model assumes that the assay has perfect sensitivity for the size of the pool tested and that the specificity is also perfect; that is, there are never any false positives or false negatives. If this is not the case, the estimates produced will be biased upward by the presence of the false positives or downward by false negatives. In this regard, it is important to note that the specificity of the assay becomes particularly important when the number of true positives is likely to be very low.
Thus, the optimal pool size will be a function of both the statistical model and the chemistry of the assay used to screen the pools. For example, if one suspects that the infection rate is 1/400, the estimator of Thompson4 recommends pools of size 635. Chiang and Reeves3 suggest using a pool size of $K = \log(1/2)/\log(1 - p)$, which would give an equal chance of a positive or negative pool. For an infection rate of 1/400, this formula leads to a pool size of 277. Both of these are likely to be larger than can possibly be handled by the chemistry of most assays used in pool screening. For example, most PCR-based assays have been shown to be capable of handling a maximum of 100 insects in a pool while retaining an acceptable level of sensitivity and specificity.6,10–15,21 Similarly, the currently commercially available antibody-based tests may be used with a maximum pool size of 50 mosquitoes.22–26 When taken together, these findings suggest that the biochemical properties of the assay are more likely to impose limits on the optimal pool size than are the constraints imposed by the statistical analysis of the results. Therefore, when one is searching for a rare event (i.e., examining a population in which the prevalence of positive insects is low), it is probably best to use the largest pool size deemed possible by the chemical limitations of the assay. Furthermore, because the pool screening models all assume a sensitivity and specificity of 100%, it is imperative that the sensitivity and specificity of the assay be rigorously evaluated on synthetic pools containing different numbers of insects to ensure that the assay is performing optimally at any given pool size before that pool size is chosen for screening unknown samples.
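As a small worked illustration of the Chiang and Reeves rule quoted above, the following Python sketch evaluates $K = \log(1/2)/\log(1 - p)$ and then caps the result at an assay limit. The function names and the idea of passing the assay limit as a parameter are assumptions made for illustration only.

```python
import math


def equal_odds_pool_size(p):
    """Pool size at which a pool is equally likely to be positive or negative."""
    return math.log(0.5) / math.log(1.0 - p)


def capped_pool_size(p, assay_limit):
    """Equal-odds pool size, rounded up and capped at the largest pool the
    assay can handle (assay_limit is supplied by the user)."""
    return min(math.ceil(equal_odds_pool_size(p)), assay_limit)


if __name__ == "__main__":
    prevalence = 1 / 400
    print(round(equal_odds_pool_size(prevalence)))       # statistical suggestion
    print(capped_pool_size(prevalence, assay_limit=100))  # e.g., a PCR-based assay
    print(capped_pool_size(prevalence, assay_limit=50))   # e.g., an antibody-based test
```

For low prevalences the cap, not the statistical rule, determines the result, which is the point made in the paragraph above.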
How many insects do I need to screen to get an accurate estimate of the prevalence of infection? Before beginning the process of estimating the occurrence rate of a rare event, it is instructive to consider exactly what you might expect while collecting specimens. Suppose that the proportion of the population being sampled that exhibits the characteristic of interest is p, 0 < p < 1 (i.e., the proportion of positive insects is between 0% and 100%). If we consider the random variable X, where X = number of specimens examined up to and including the first one with the characteristic (i.e., a positive), it is well known that X has a geometric distribution. That is,
$$\Pr(X = x) = p(1 - p)^{x-1}, \qquad x = 1, 2, \ldots$$
From this it is easily shown that the probability that X is less than or equal to any particular value, say z, is equal to
$$\Pr(X \le z) = 1 - (1 - p)^{z}.$$
This formula may be used to give estimates of how probable it would be that we will observe a specimen with the characteristic of interest (i.e., a positive insect) when we have examined z specimens. Tables 4–6 summarize the probability of detecting a positive insect at different prevalences of infection when one tests varying numbers of insects in pools of varying sizes. For example, if the event rate is 1/50, screening as few as 40 individual insects yields a better than 50% chance of finding a positive insect (Table 4). When the true prevalence is 1/400, screening 10 pools of size 25 yields a 47% chance of finding a positive pool, and when the pool size is 50, the chance is 71% after screening 10 pools (Tables 5 and 6). Tables 4–6 can be used to help plan the sampling process with respect to whether to screen individual insects or to screen pools, and if screening pools, how large the pools should be. Tables like these are easily calculated from the equation $\Pr(X \le z) = 1 - (1 - p)^{Kz}$, where K is the pool size and z is the number of pools.
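A planning table of this kind can be generated with a few lines of Python. The sketch below is illustrative only; the function names are hypothetical, and the second function simply inverts the same equation to give the smallest number of pools that reaches a chosen detection probability.

```python
import math


def prob_detect(p, pool_size, n_pools):
    """Probability of at least one positive pool among n_pools pools of size
    pool_size when the prevalence is p."""
    return 1.0 - (1.0 - p) ** (pool_size * n_pools)


def pools_needed(p, pool_size, target):
    """Smallest number of pools giving a detection probability >= target."""
    return math.ceil(math.log(1.0 - target) / (pool_size * math.log(1.0 - p)))


if __name__ == "__main__":
    print(f"{prob_detect(1 / 50, 1, 40):.2f}")     # 40 individual insects, p = 1/50
    print(f"{prob_detect(1 / 400, 25, 10):.2f}")   # 10 pools of 25, p = 1/400
    print(f"{prob_detect(1 / 400, 50, 10):.2f}")   # 10 pools of 50, p = 1/400
    print(pools_needed(1 / 400, 50, 0.5))
```

The three probabilities correspond to the examples quoted in the text above.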
How should pools be constructed? As noted above, the sampling models on which all of the pool screening models are based assume that one is collecting a random sample from an essentially infinite population. Thus, it is important to try to ensure that when one is devising a sampling strategy that one attempts to obtain a sample from the overall population that is as random as possible. For example, this means that traps (and therefore collections) should be distributed as randomly as possible throughout the study area to ensure that the insects collected are as representative as possible of the overall population.
A second assumption inherent in the pool screen model is that each infection in the insect population is independent of all others (i.e., that infected insects are distributed randomly throughout the overall insect population). However, it is known that this is often not the case, and temporal and spatial aggregation of infected insects is often observed.27 One way to correct for aggregation will be to combine collections from all traps into a single population and to create pools from the combined population. This approach will produce the most accurate estimates of the overall infection prevalence in the vector population. However, spatial and temporal data may be lost when using this approach.
Finally, some mention should be made of the fact that sampling schemes are often devised to specifically target infected insects. For example, traps are often set in areas where evidence exists for ongoing transmission, or sampling strategies are used that specifically attract a sub-population of insects that are likely to be infected. These sampling strategies are therefore most effective in documenting ongoing transmission but may lead to an upwardly biased estimate of the actual prevalence of infection in the vector population.
What is the meaning of a CI in a pool screen calculation? CIs are often reported and interpreted as a measure of precision of a point estimate that has been calculated from the data. This view is inaccurate, so it is important to understand exactly what a CI is and what it says. First of all, the endpoints of a CI form a bivariate random variable that is a function of the original observations. The distribution of this bivariate random variable depends on the distribution of the original observations and the functions used to define the values. Usually this is very complex. The values computed for endpoints from the observed data are only one of possibly an infinite number of values that can result from the experiment depending on the distribution of the original data and the sample actually observed. A CI is said to "cover" the true value of the parameter of interest if the interval contains that value. Thus, for any CI derived from the experimental observations, you can never be absolutely sure that the CI actually contains the true infection rate. When we speak of a 95% CI, we mean that if we repeated the sampling process (i.e., the experiment) a very large number of times and calculated the interval each time, the proportion of times that the computed CI would cover the true value is 0.95. To put this another way, an investigator who plans to use an algorithm that produces a 95% CI knows before gathering the data that the odds are 19:1 that the algorithm, when applied to those data, will produce an interval that covers the true value. The investigator does not know, however, whether the interval actually calculated from the data covers the true value or not, because there is still a 1/20 chance that the true value will fall outside of the calculated interval.
Many CI procedures have coverage probabilities that vary over the range of the parameter of interest (in this case p, the rate of the event). The CI is called exact if the smallest of these coverage probabilities is at the desired level. Hence, the CI calculated will often be too conservative (i.e., have a coverage > 95%) but will never yield an interval that is too short. Clopper-Pearson intervals were included in a widely used computer program to calculate prevalence of infection from equal size pools (Pool screen v1.0).6 Barker7 obtained equivalent intervals for unequal pool sizes. These CIs are exact. However, they tend to be quite conservative for small values of p (i.e., when the infection rate is low). It should be noted that this dependence of the coverage probability of the CI for a parameter on the parameter is a feature in discrete distributions.
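For readers who wish to experiment with exact intervals for equal pool sizes, the following Python sketch computes Clopper-Pearson limits for the probability that a pool is positive and then maps them to the insect-level prevalence through $p = 1 - (1 - \pi)^{1/K}$. This is a minimal illustration under stated assumptions and not the code of the Pool screen program: the function names are hypothetical, the limits are found by bisection on the binomial distribution function rather than from F-distribution tables, and the mapping relies on the fact that p is a monotone function of the pool-positivity probability $\pi$.

```python
from math import comb


def binom_cdf(t, m, pi):
    """P(T <= t) for T ~ Binomial(m, pi)."""
    return sum(comb(m, j) * pi ** j * (1 - pi) ** (m - j) for j in range(t + 1))


def _bisect_decreasing(func, target):
    """Solve func(x) = target on (0, 1) for a function decreasing in x."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def pool_ci(positives, pools, pool_size, alpha=0.05):
    """Exact (Clopper-Pearson type) CI for prevalence p from equal-size pools."""
    t, m = positives, pools
    # Limits for the probability pi that a single pool tests positive.
    pi_lo = 0.0 if t == 0 else _bisect_decreasing(
        lambda pi: binom_cdf(t - 1, m, pi), 1 - alpha / 2)
    pi_hi = 1.0 if t == m else _bisect_decreasing(
        lambda pi: binom_cdf(t, m, pi), alpha / 2)
    # Map pool-level limits to insect-level prevalence.
    to_p = lambda pi: 1 - (1 - pi) ** (1 / pool_size)
    return to_p(pi_lo), to_p(pi_hi)


if __name__ == "__main__":
    print(pool_ci(positives=3, pools=10, pool_size=25))
```

Because the binomial distribution is discrete, intervals built this way have coverage of at least the nominal level, which is the conservatism described above.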
One final observation needs to be made here as well. The failure of a specific CI to contain a pre-specified value of the parameter, p, is not the same as a statistical test of a hypothesis in general. Some CIs are found by "unwinding" a test; however, this is not always the case. Thus, the CIs calculated by the methods described above should not be used as a stand-alone statistical method for testing hypotheses regarding the prevalence of infection in a vector population (e.g., have we succeeded in lowering the transmission rate below a defined cut-off with our control program?). Tests for specific hypotheses can be developed without too much effort, particularly in the case of equal pool sizes. For example, it is easy to show (see Appendix at www.ajtmh.org) that a test of whether the observed prevalence of infection is statistically equal to or lower than a defined cut-off value (i.e., $H_0: p \le p_0$ versus $H_a: p > p_0$), based on the observed number of positive pools, T, rejects $H_0$ whenever $T > t_\alpha$, where $t_\alpha$ is the critical point for a level $\alpha$ test. To clarify, recall that the usual test procedure for a statistical test (e.g., a t test) involves the calculation of some test statistic. An appropriate table is then consulted to obtain the "critical value," and the null hypothesis is rejected at the predetermined level (e.g., $\alpha = 0.05$) if the test statistic exceeds the critical value. The value $\alpha$ represents what statisticians call the level of type I error; that is, if the test were conducted a large number of times, the type I error represents the proportion of times we would incorrectly reject the null hypothesis when it is true. For the pool screen model the value of the critical point depends on the pool size (K), the number of pools (M), the value of $p_0$, and the choice of $\alpha$ and can be calculated from the formula
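$$t_\alpha = \min\left\{ t \in \{0, 1, \ldots, M\} : \sum_{j=t+1}^{M} \binom{M}{j} \left[1 - (1 - p_0)^K\right]^{j} \left[(1 - p_0)^K\right]^{M-j} \le \alpha \right\}.$$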
This test is the uniformly most powerful test for this hypothesis.28 Because of the dependence of the critical values on M, K, and p0, it is impossible to calculate a table of critical values a priori. Although this formula looks very complex, the computations can easily be carried out using the probability calculator included in almost any statistical package.
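The calculation can also be sketched directly in Python without a statistical package. The code below is an illustration only: the function names are hypothetical, and it assumes the critical point is the smallest t for which the upper-tail probability of T under $p_0$ does not exceed the chosen level, with T binomial with pool-positivity probability $1 - (1 - p_0)^K$. The direct summation is adequate for a modest number of pools; for very large M a log-scale computation would be needed.

```python
from math import comb


def upper_tail(t, n_pools, pool_size, p0):
    """P(T > t) when T ~ Binomial(M, 1 - (1 - p0)^K)."""
    pi0 = 1 - (1 - p0) ** pool_size
    return sum(
        comb(n_pools, j) * pi0 ** j * (1 - pi0) ** (n_pools - j)
        for j in range(t + 1, n_pools + 1)
    )


def critical_value(n_pools, pool_size, p0, alpha=0.05):
    """Smallest t with P(T > t) <= alpha under p = p0."""
    for t in range(n_pools + 1):
        if upper_tail(t, n_pools, pool_size, p0) <= alpha:
            return t
    return n_pools  # not reached: the tail is zero at t = M


def reject_null(positives, n_pools, pool_size, p0, alpha=0.05):
    """True if H0: p <= p0 is rejected in favor of Ha: p > p0."""
    return positives > critical_value(n_pools, pool_size, p0, alpha)


if __name__ == "__main__":
    # Assumed example: 40 pools of 50 insects, cut-off prevalence 1/400.
    print(critical_value(40, 50, 1 / 400))
    print(reject_null(12, 40, 50, 1 / 400))
```

Because T is discrete, the attained type I error is usually somewhat below the nominal level.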
| APPENDIX |
|
|
|---|
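$$P(X = x \mid M) = \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}}$$
Thus, the conditional probability that the sub-sample of size K will contain no infected experimental units is given by
$$P(X = 0 \mid M) = \frac{\binom{N-M}{K}}{\binom{N}{K}}.$$
From this conditional probability we obtain the unconditional probability that X = 0 by using the fact that M is a random variable with Binomial (N, p) distribution. Thus, the unconditional probability is given by
$$P(X = 0) = \sum_{M=0}^{N} \frac{\binom{N-M}{K}}{\binom{N}{K}} \binom{N}{M} p^{M} (1 - p)^{N-M}.$$
Note that when $M = N - K + l$, $l = 1, 2, \ldots, K$, $P(X = 0 \mid M) = 0$. Hence, using the identity $\binom{N-M}{K}\binom{N}{M}/\binom{N}{K} = \binom{N-K}{M}$, the unconditional distribution is given by
$$P(X = 0) = \sum_{M=0}^{N-K} \binom{N-K}{M} p^{M} (1 - p)^{N-M} = (1 - p)^{K} \sum_{M=0}^{N-K} \binom{N-K}{M} p^{M} (1 - p)^{N-K-M}.$$
However, from the binomial theorem we have that
$$\sum_{M=0}^{N-K} \binom{N-K}{M} p^{M} (1 - p)^{N-K-M} = \left[p + (1 - p)\right]^{N-K} = 1,$$
so the unconditional probability that the sub-sample contains no positive experimental units is $P(X = 0) = (1 - p)^K$. This establishes that the model used for the Pool screen programs is appropriate in the case of sub-sampling as well.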
Statistical properties of the pool screen estimator.
The estimator based on testing pools gives rise to the MLE $\hat{p} = 1 - (1 - T/M)^{1/K}$, where K is the pool size, M is the number of pools tested, and T is the number of positive pools observed. If this expression is expanded in a Taylor expansion about E(T) and the expectation taken, we see that the dominant term in the expansion for the bias is given by the expression
$$\mathrm{Bias}(\hat{p}) \approx \frac{(K - 1)(1 - p)}{2K^{2}M} \cdot \frac{1 - (1 - p)^{K}}{(1 - p)^{K}}.$$
The next term in the expansion has multiplier $1/M^2$, and successive terms involve increasing powers of 1/M. Some simple hand calculations show that this first term accounts for most of the bias shown in Table 1 in the body of the paper. Some comments are in order concerning the size of the bias. The quantity $\{[1 - (1 - p)^K]/[(1 - p)^K]\}$ is easily shown to be a strictly monotone increasing function of K for any fixed p. Similarly, it is strictly monotone increasing as a function of p, 0 < p < 1, for any fixed K. Clearly, it is possible for the bias to become arbitrarily large as K increases for any fixed p and as p approaches 1 for any fixed K. On the other hand, because of the restrictions placed on the size of K by the chemistry of the assay, there is a practical upper bound on the size of K. Similarly, if the infection potential is moderate to large (say > 1 in 50), one would not do pool screening. In either case, however, the bias can be made acceptably small by increasing the number of pools, M. A similar approach shows that the leading term in the MSE is given by
$$\mathrm{MSE}(\hat{p}) \approx \frac{(1 - p)^{2}}{K^{2}M} \cdot \frac{1 - (1 - p)^{K}}{(1 - p)^{K}}.$$
This also grows under the same circumstances discussed above and the same restrictions apply. Thus, the MSE also becomes small as M becomes large. Note that this formula for the MSE is equal to the asymptotic variance of $\hat{p}$ as calculated from Fisher's information. Thus, both the bias and the MSE are negligible when the number of pools (M) grows large for any pool size (K).
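The behavior of the bias can be explored numerically. The Python sketch below is illustrative only and uses hypothetical function names: it computes the exact bias of the MLE by summing over the binomial distribution of T and compares it with the first-order term obtained from the Taylor expansion described above, written here as $(K-1)(1-p)[1-(1-p)^K] / [2K^2M(1-p)^K]$. That closed form is a reconstruction from the stated expansion and should be checked against the original article before being relied on.

```python
from math import comb


def exact_bias(p, pool_size, n_pools):
    """E(p_hat) - p, summing over T ~ Binomial(M, 1 - (1 - p)^K)."""
    pi = 1 - (1 - p) ** pool_size
    expected = sum(
        comb(n_pools, t) * pi ** t * (1 - pi) ** (n_pools - t)
        * (1 - (1 - t / n_pools) ** (1 / pool_size))
        for t in range(n_pools + 1)
    )
    return expected - p


def leading_bias(p, pool_size, n_pools):
    """First-order (1/M) term of the bias from the Taylor expansion."""
    q_k = (1 - p) ** pool_size
    return ((pool_size - 1) * (1 - p) / (2 * pool_size ** 2 * n_pools)
            * (1 - q_k) / q_k)


if __name__ == "__main__":
    for m in (50, 100, 200):
        print(m, exact_bias(1 / 400, 25, m), leading_bias(1 / 400, 25, m))
```

Doubling the number of pools should roughly halve both quantities, in line with the 1/M rate noted in the text.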
Testing a simple hypothesis.
A test of the simple hypothesis $H_0: p \le p_0$ versus $H_a: p > p_0$ can be constructed when the pool sizes are equal using the statistic T, the number of positive pools observed. Under the null hypothesis, T has the distribution
$$g(t \mid p_0) = \binom{M}{t} \left[1 - (1 - p_0)^{K}\right]^{t} \left[(1 - p_0)^{K}\right]^{M-t}, \qquad t = 0, 1, \ldots, M.$$
It is easily shown that the family of distributions $\{g(t \mid p), 0 < p < 1\}$ has monotone likelihood ratio and that the statistic T is sufficient for p. By the Karlin-Rubin theorem,28 the test that rejects when $T > t_{crit}$ is uniformly most powerful. Because the distribution of T is discrete, $t_{crit}$ is defined to be the minimum $t \in \{0, 1, \ldots, M\}$ such that $P_{p_0}(T > t) \le \alpha$. Critical values can be calculated using the following formula:
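$$t_{crit} = \min\left\{ t \in \{0, 1, \ldots, M\} : \sum_{j=t+1}^{M} \binom{M}{j} \left[1 - (1 - p_0)^K\right]^{j} \left[(1 - p_0)^K\right]^{M-j} \le \alpha \right\}.$$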
Received August 19, 2005. Accepted for publication January 13, 2006.
Acknowledgments: The authors thank Drs. Naomi Lang-Unnasch and Eddie W. Cupp for critically reading the manuscript. We also thank the two anonymous reviewers of an earlier version of this manuscript for constructive comments. In particular, we thank the reviewer who pointed out a major conceptual error that we have corrected. We also acknowledge the role of the Onchocerciasis Control Program in the Americas and the former Onchocerciasis Control Programme in West Africa in supporting our work.
* Address correspondence to Thomas R. Unnasch, University of Alabama at Birmingham, Division of Geographic Medicine, BBRB 203, 1530 3rd Avenue South, Birmingham, AL 35294-2170. E-mail: trunnasch{at}geomed.dom.uab.edu
Authors' addresses: Charles R. Katholi, PhD, University of Alabama in Birmingham, Department of Biostatistics, Ryals 317, 1665 University Blvd., Birmingham, AL 35294-0022, E-mail: ckatholi{at}uab.edu. Thomas R. Unnasch, University of Alabama at Birmingham, Division of Geographic Medicine, BBRB 203, 1530 3rd Avenue South, Birmingham, AL 35294-2170, E-mail: trunnasch{at}geomed.dom.uab.edu.
| REFERENCES |
|
|
|---|