Evaluation of Commercial Rapid Lateral Flow Tests, Alone or in Combination, for SARS-CoV-2 Antibody Testing

ABSTRACT. Antibody tests can be tools for detecting current or past severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2 [coronavirus disease 2019 (COVID-19)]) infections. Independent test evaluations are needed to document the performance with different sample sets. We evaluated six lateral flow assays (LFAs) and two laboratory-based tests (EUROIMMUN-SARS-CoV-2 ELISA and Abbott-Architect-SARS-CoV-2-IgG). We tested 210 plasma samples from 89 patients diagnosed with acute COVID-19. These samples were collected at different time points after the onset of symptoms. In addition, 80 convalescent plasma samples, and 168 pre-pandemic samples collected from adults in the United States and in Africa were tested. LFA performance varied widely, and some tests with high sensitivity had low specificity. LFA sensitivities were low (18.8–40.6%) for samples collected 0 to 3 days after symptom onset, and were greater (80.3–96.4%) for samples collected > 14 days after symptom onset. These results are similar to those obtained by ELISA (15.6% and 89.1%) and chemiluminescent microparticle assay (21.4% and 93.1%). The range of test specificity was between 82.7% and 97%. The combined use of two LFAs can increase specificity to more than 99% without a major loss of sensitivity. Because of suboptimal sensitivity with early COVID-19 samples and background reactivity with some pre-pandemic samples, none of the evaluated tests alone is reliable enough for definitive diagnosis of COVID-19 infection. However, antibody testing may be useful for assessing the status of the epidemic or vaccination campaign. Some of the LFAs had sensitivities and specificities that were comparable to those of more expensive laboratory tests, and these may be useful for seroprevalence surveys in resource-limited settings.


INTRODUCTION
As of October 13, 2020, about 37.9 million severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections (coronavirus disease 2019 [COVID-19]) and 1.01 million deaths have been recorded worldwide. 1 Reverse transcription-polymerase chain reaction (RT-PCR) tests are currently the gold standard for diagnosis of COVID-19. 2,3 Because these assays detect viral RNA, they indicate active infection and potential infectivity. However, molecular tests are expensive, they require significant laboratory infrastructure, and they are in short supply. As the infection progresses, viral titers in the upper respiratory tract often decrease so that viral RNA may be undetectable in respiratory specimens. 4 Antibody tests for COVID-19 detect human antibodies to viral proteins. Thus, antibody tests may be useful for the diagnosis of recent infections after antibodies have been produced or for verifying past infections in persons who were not tested by RT-PCR when they were ill; they may also have value as markers for immunity to the virus. Potential use cases for COVID-19 antibody testing have been reviewed previously. 5 It is clear that antibody test results alone for individual patients do not provide diagnostic certainty, because no test has 100% sensitivity and specificity. However, together with the medical history and clinical signs, antibody test results can be very helpful for individual diagnosis. Similarly, although positive antibody test results do not guarantee immunity, they suggest prior infection and immunity. Despite these limitations, antibody tests can be useful tools for assessing COVID-19 activity in communities, and they may have some value for individual diagnosis. This is especially true in low- and middle-income countries that lack resources for widespread molecular testing. Another advantage of antibody tests over laboratory-based molecular diagnostic tests is that they do not require biosafety level 3 laboratory containment. 6
Antibody tests for COVID-19 use a variety of test platforms, but they use a limited number of viral antigens. Most tests detect antibodies to SARS-CoV-2 spike (full length, receptor binding domain, variable domain of the spike protein) or nucleocapsid proteins, alone or in combination. 7–12 Some tests detect IgG antibodies only, whereas others detect IgG, IgA, and/or IgM. Even when tests use the same viral antigen target, test performance can vary based on the expression system used, antigen purity, secondary antibodies, diagnostic platform, and quality control. In recent months, more than 150 rapid format antibody tests have been developed and marketed. 13 Only a few have been evaluated independently to date, and only a small number have received emergency use authorization from the U.S. Food and Drug Administration 14 or other authorities. Although several test evaluation studies have recently been published or posted online, most compare a limited number of antibody tests with a relatively small panel of samples from a single geographic area. 11,15–17 Therefore, more independent test evaluation data from various geographic regions are urgently needed.
Our study focused on the evaluation of rapid, point-of-care tests using lateral flow assay (LFA) technology that could be especially helpful for antibody testing in low- and middle-income countries. We tested a panel of plasma samples that included not only confirmed COVID-19 cases from the United States, but also pre-COVID-19 samples from the United States and sub-Saharan Africa for broader specificity testing. The primary objectives of the study were to assess the sensitivity and specificity of tests for antibodies to SARS-CoV-2 proteins, and the kinetics of antibody responses relative to the time of symptom onset. Furthermore, we compared the background reactivity (false-positive rates) for sera from the United States and sera from areas in Africa where people are exposed to diseases that are absent or uncommon in the United States.

METHODS
Ethical approval and patient samples. The study was approved by the Human Research Protection Office (HRPO) at Washington University (institutional review board identification no. 202004088). We tested de-identified plasma samples that were collected with informed consent under protocol HRPO 2020003085 and archived serum/plasma samples that were collected pre-COVID-19 under HRPO 201102546. The study is registered at ClinicalTrials.gov with identifier no. NCT04360954.
Plasma samples from subjects with confirmed, symptomatic COVID-19 infections (WU-350) were collected at Barnes-Jewish Hospital, an affiliated teaching hospital of Washington University School of Medicine. Metadata associated with these de-identified samples included the date of sample collection, the type and date of onset of symptoms, gender, age, and other demographic information. The date of symptom onset was considered to be day 0. Convalescent plasma samples (WU-353) were collected from patients in the St. Louis area with previously documented COVID-19 infections who were at least 14 days past symptom resolution. In this group, the earliest collection time was 21 days and the latest was 56 days after symptom onset. Initial molecular testing for these patients was performed with commercially available tests at several certified diagnostic laboratories (Quest Diagnostics, BJC Healthcare, Barnes-Jewish Hospital, Mercy Hospital, LabCorp, Missouri Baptist Medical Center).
Archived U.S. pre-COVID samples were collected at Barnes-Jewish Hospital before October 2019 (Table 1). The African pre-COVID samples were collected in western Uganda (58 samples) and in southern Côte d'Ivoire (30 samples). 18,19 All archived plasma samples had been stored at −20°C. All samples in this study were divided into aliquots in identical plastic tubes and labeled with study-specific barcodes. The study data manager held the key linking barcodes to sample numbers and their metadata. Aliquots of samples were stored at 4°C for no longer than 2 weeks prior to use.
Antibody test kits. We evaluated six rapid-format, point-of-care LFA antibody test kits that detect both IgM and IgG antibodies to recombinant viral proteins (Table 2). The kits were selected based on availability and provision of tests by the sponsor. A seventh rapid test kit (qSARS-COV-2 IgG/IgM cassette rapid test; Cellex, Research Triangle Park, NC) was evaluated, but results were excluded from this report based on requests from the manufacturer and the donor of the test kits (Foundation for Innovative New Diagnostics, Geneva, Switzerland) related to concerns about product integrity and performance of the lots tested. No results were excluded from the analysis based on requests from funders or manufacturers after results were disclosed to them.
Test procedures. We performed the EUROIMMUN ELISA and LFA tests according to the manufacturers' instructions for use. The Abbott Architect chemiluminescent microparticle assay (CMIA) IgG test was performed in a clinical laboratory at Barnes-Jewish Hospital in St. Louis, MO (https://www.barnesjewish.org/Medical-Services/Laboratory-Services) according to the company's instructions for use. Persons who performed the tests did not have access to study sample numbers, RT-PCR results, or metadata. Samples were tested in several batches and randomized within each batch. LFA tests were read by two independent readers, and results were recorded on a paper form that contained the matching barcode. Results were compared, and a third reader was used in the case of discordant results. All LFA and ELISA testing was performed with the same panel of 458 samples (Table 1). Only 413 samples were tested with the Abbott CMIA test, because this test required 200 µL of plasma, and we did not have enough plasma to test some of the samples.
Data analysis. Antibody test results for each test kit (IgM, IgG antibody) were entered into a test result database with the participant's unique identifier barcode number using double data entry. Results were merged into a database that contained the unique identifier barcode number, the participant identification number for the parent studies WU-350 and WU-353, and metadata. Practical characteristics related to test acceptability were scored by technicians who performed the tests using a preprinted test evaluation form. Sensitivity and specificity were calculated using standard methods, and binomial 95% CIs were calculated for these estimates. Pairwise comparison analyses of differences in test results were performed with McNemar's test, and P values were adjusted using a false-discovery rate adjustment. 20 All analyses were conducted using SAS v. 9.4 (SAS Institute, Cary, NC).
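The sensitivity and specificity estimates described above, together with their binomial confidence intervals, can be illustrated with a minimal Python sketch. This is not the study's SAS code. It assumes that the "binomial 95% CIs" are exact (Clopper-Pearson) intervals, which the text does not specify, and the function names and the counts in the usage example are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p). Adequate for panel sizes of a few hundred."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(below_root, iterations=60):
    """Bisect on [0, 1]; below_root(p) must be True while p lies below the root."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if below_root(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials (assumed CI method)."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("need n > 0 and 0 <= k <= n")
    # Lower limit: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


def proportion_with_ci(k, n):
    """Return (estimate, lower, upper) for a proportion such as sensitivity."""
    lower, upper = exact_binomial_ci(k, n)
    return k / n, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study tables.
    true_pos, n_cases = 90, 100        # antibody-positive among confirmed cases
    true_neg, n_controls = 160, 168    # antibody-negative among pre-pandemic samples
    print("sensitivity:", proportion_with_ci(true_pos, n_cases))
    print("specificity:", proportion_with_ci(true_neg, n_controls))
```

Sensitivity is the proportion of samples from confirmed cases that test positive, and specificity is the proportion of pre-pandemic samples that test negative; the same interval function serves both.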

RESULTS
Sensitivity. The sensitivity of the six LFAs and two laboratory-based tests varied by test and by time after the onset of COVID-19 symptoms (Table 3). The sensitivity of the LFAs to detect IgM or IgG antibodies was between 18.8% and 40.6% at 0 to 3 days after the onset of symptoms, and peaked between 80.3% and 96.4% more than 14 days after the onset of symptoms (Table 3). IgM positivity rates tended to be greater than IgG rates for the first 2 weeks after symptom onset, but there was little difference at later time points. The sensitivities of the laboratory-based EUROIMMUN ELISA and the Abbott CMIA IgG tests were 15.6% and 21.4%, respectively, 0 to 3 days after the onset of symptoms, and 89.1% and 93.1%, respectively, more than 14 days after the onset of symptoms. High sensitivity was not linked to the antigen or antigen combination used in the tests (compare Tables 2 and 3).
Samples from three patients (2.6% of all patients) collected 27, 31, and 38 days after the onset of symptoms were antibody negative with all eight tests. None of these three study participants had a history of diseases or treatments known to affect immune responses to infections or vaccines. Sixteen patients provided samples at multiple time points > 14 days after onset of symptoms, and most samples were positive with all tests. One patient tested negative with two samples at days 16 and 23 with the BioMedomics test. A different patient was negative for antibodies on day 15 with five tests, but tested positive with all tests at day 18.
We performed pairwise comparisons of the sensitivity of the tests. There were no significant differences between tests for any outcome (IgM, IgG, and IgM or IgG) 0 to 3 days after the onset of symptoms; there were between 0 and 10 significant differences 4 to 7 days after symptom onset and between 4 and 10 significant differences in the 8- to 14-day time period. The largest differences in sensitivity occurred > 14 days after symptom onset, with 14, 13, and 11 significant differences for IgM or IgG, IgG, and IgM, respectively (Supplemental Tables S1 and S2). At that time, the BTNX and Sienna/COVIBlock tests had the greatest sensitivity estimates, with 96.4% and 94.2%, respectively. Sienna had significantly greater sensitivity compared with three of seven tests, and the BTNX LFA had significantly greater sensitivity compared with five of the seven tests (Supplemental Tables S1 and S2).
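The pairwise comparisons can be illustrated with a short Python sketch of McNemar's test followed by a false-discovery rate adjustment. The sketch rests on two assumptions that the text does not confirm: that the exact (binomial) form of McNemar's test is used, and that the adjustment is the Benjamini-Hochberg procedure. The discordant-pair counts in the usage example are hypothetical, and the study's own analyses were run in SAS.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar P value.

    b = samples positive with test A only, c = samples positive with test B only.
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # One-sided tail of Binomial(n, 0.5), doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical discordant counts for three test pairs.
    discordant = {"A vs B": (1, 12), "A vs C": (4, 6), "B vs C": (0, 7)}
    raw = [mcnemar_exact(b, c) for b, c in discordant.values()]
    for name, p, q in zip(discordant, raw, benjamini_hochberg(raw)):
        print(f"{name}: P = {p:.4f}, adjusted P = {q:.4f}")
```

Because every test was run on the same sample panel, the comparison is paired: only samples on which two tests disagree carry information about a difference in sensitivity.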
Specificity. Test specificity was highly variable, with a range between 82.7% (BTNX) and 95.2% to 95.8% (BioMedomics, Innovita, Sienna) (Table 4). Specificity rates tended to be greater for IgG (92.9–100%) than for IgM (83.7–95.2%). Also, test specificity was slightly greater (especially for IgM) for pre-pandemic samples from the United States than for those from Africa (Supplemental Table S3). The combined specificity of LFAs with pre-pandemic samples was 81.3% for U.S. pre-COVID-19 samples and only 68.2% for sub-Saharan pre-COVID-19 samples (P = 0.053); this difference was mostly a result of lower IgM specificity (82.5% versus 68.2%, [...]
[...] Table 2) for evaluation after our main testing had been completed. We evaluated this test with a subset of 74 samples from subjects with COVID-19 at various time points after the onset of symptoms and 74 samples collected before the pandemic (Supplemental Table S6). Sensitivities for IgM or IgG antibodies in the sample subset were 62.2% and 64.9% for the first BTNX test, and 60.8% and 62.2% for the BTNX Liberty test, respectively. However, the numbers of positive IgM or IgG tests for pre-COVID-19 samples were 27 and 29 for the first BTNX test, but only five and six for the Liberty test, respectively. These results suggest the BTNX Liberty test had similar sensitivity but improved specificity compared with the first BTNX test.
Test performance characteristics. All six LFAs were easy to perform, but there were small differences related to packaging, the amount of plasma needed, incubation time, and difficulty in reading the test (Supplemental Table S7). For example, the BioMedomics test produced control lines with inconsistent intensity and more diffuse positive test lines than other tests.

DISCUSSION
We evaluated six LFAs and two laboratory-based tests independently for detecting antibodies to SARS-CoV-2 proteins. Our results show that these tests have low sensitivity in the first week after the onset of symptoms. Sensitivity rates were much greater for samples collected more than 14 days after symptom onset. These results are consistent with results from previous studies and a number of meta-analyses that evaluated some of the same tests (Table 6). 15,21,22 This means that although LFAs have relatively limited value for diagnosing COVID-19 shortly after symptom onset, they have good sensitivity later during the course of the infection. Because antigen detection LFAs are readily available, the main use case for antibody LFAs is for detecting anti-spike protein antibodies after natural infection or vaccination. Because currently available vaccines induce antibodies against the spike protein, only tests that detect antibodies against that protein are suitable for assessing responses to vaccines.
We compared the sensitivity of the six LFAs and two laboratory-based tests evaluated in our study with data from other independent evaluation studies or with data provided by the manufacturer (Table 6). This analysis focused on sensitivity for samples from subjects collected more than 14 days after the onset of symptoms. The most extensive data were available for the EUROIMMUN ELISA. The sensitivity of the Abbott CMIA has been reported to be between 83% and 100%, and it was 91.8% in our study. Sensitivities for the LFAs in our study were in the same range as in previous studies (80–95%). Specificities in our study were also generally similar to those noted in previous reports. However, we did not confirm exceptionally high specificities of 99% to 100%, which have been reported for some of the tests we evaluated. 14 This may be a result, in part, of the fact that our study included pre-COVID-19 samples from sub-Saharan Africa.
Specificity is a challenge for SARS-CoV-2 antibody tests. This is especially true for LFAs when looking at IgM. Although IgM appears earlier than IgG after the onset of symptoms (Table 3), we found that its specificity is lower than that of IgG, and IgM results alone add little reliable information. However, specificity of LFAs could be increased to > 99% by requiring positive test results with certain two-test combinations, with only minor reductions in sensitivity. In addition, our results suggest that the best LFAs had sensitivities and specificities similar to those of the two laboratory-based antibody tests (ELISA and CMIA). Thus, LFAs may be a good alternative to expensive and technically demanding laboratory-based tests. This is especially true for settings in which immediate results for individual samples are desired and in low-resource settings in the developing world. On the other hand, automated tests may be preferable for mass testing in high-resource settings.
The WHO developed a target product profile for rapid antibody tests. 23 According to this profile, rapid point-of-care tests to detect prior infection should have a minimal sensitivity of 90% and a minimal specificity of 97%. Our results show that none of the tests we evaluated satisfied these targets. However, certain two-test combinations satisfied the specificity target and provided good sensitivity for samples collected more than 14 days after the onset of symptoms.
All the tests we evaluated were newly developed and in the early stages of commercialization at the time of our study. We originally evaluated seven LFAs, but results from the Cellex test were redacted from the study because of logistical problems and shipping delays that might have compromised test performance. Furthermore, the BioMedomics and BTNX tests we evaluated were recalled 8 and 10 weeks after we evaluated these tests, respectively. The companies thought that delays during transit may have affected test performance. However, we included the results from these tests because no obvious problems were noted during our evaluation. We think it is unlikely that shipping delays would decrease test specificity. The BTNX Liberty test that we evaluated with a subset of samples had greater specificity than the original BTNX test, and this may justify a more thorough evaluation. The BioMedomics test produced a rather diffuse positive test band compared with those in other tests. Four of the six LFAs produced inconsistent intensities of control lines. Although this is not important for qualitative detection, it may be a problem for semiquantitative or quantitative detection if the control line is used for comparison. The companies may be able to fix this in future versions of the tests. These experiences and others illustrate problems that can occur when tests are moved rapidly to the market and when shipments are delayed because of shipping and customs clearance.
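The two-test rule discussed above, in which a sample is called positive only when both LFAs are positive, can be sketched in Python as follows. The result vectors are hypothetical toy data, not study results, and the function names are illustrative. The thresholds in meets_who_targets are the minimal values of the WHO target product profile cited above (90% sensitivity, 97% specificity).

```python
def combine_and(results_a, results_b):
    """Call a sample positive only if both tests are positive."""
    return [a and b for a, b in zip(results_a, results_b)]


def sensitivity(results, truth):
    """Fraction of true cases that test positive."""
    cases = [r for r, t in zip(results, truth) if t]
    return sum(cases) / len(cases)


def specificity(results, truth):
    """Fraction of true non-cases that test negative."""
    controls = [r for r, t in zip(results, truth) if not t]
    return sum(not r for r in controls) / len(controls)


def meets_who_targets(sens, spec, min_sens=0.90, min_spec=0.97):
    """Check an estimate against the WHO minimal sensitivity and specificity."""
    return sens >= min_sens and spec >= min_spec


if __name__ == "__main__":
    # Hypothetical panel: 10 confirmed cases followed by 10 pre-pandemic controls.
    truth = [True] * 10 + [False] * 10
    test_a = [True] * 9 + [False] + [True, True] + [False] * 8
    test_b = [False] + [True] * 9 + [False, False, True] + [False] * 7
    both = combine_and(test_a, test_b)
    for name, res in (("A", test_a), ("B", test_b), ("A and B", both)):
        sens, spec = sensitivity(res, truth), specificity(res, truth)
        print(name, sens, spec, meets_who_targets(sens, spec))
```

In this toy panel the false-positive results of the two tests fall on different control samples, so the combined rule removes them all, while each test's missed case lowers the combined sensitivity. This is the trade-off described in the text: a gain in specificity at the cost of some sensitivity.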
Although we do not know the retail costs for the LFAs that we evaluated, LFAs are generally less expensive than laboratory-based assays when equipment and personnel costs are included in the cost analysis. The LFAs work with very small sample volumes and with a variety of specimens (plasma, serum, whole blood). They enable rapid point-of-care testing for antibodies with capillary blood samples. As alluded to earlier, point-of-care testing may be especially useful in low- and middle-income countries where transport of specimens to centralized testing facilities and delays in reporting results are often major challenges.
A recent systematic review and meta-analysis concluded that the available evidence does not support the continued use of existing point-of-care antibody tests. 21 Although this may be true for persons with early infections, we think these tests have real value for certain use cases. Another systematic review concluded that the sensitivity of antibody tests is too low in the first week after symptom onset to have a primary role in the early diagnosis of COVID-19, but that they may still have a role for documenting recent infections in individuals later in the course of their illness, when RT-PCR tests are either missing or negative. The same study acknowledged that antibody tests are likely to have a role for documenting previous SARS-CoV-2 infections with samples that are collected 15 or more days after the onset of symptoms. 22 Our study supports this view, with additional data obtained with six LFAs and a large panel of well-characterized plasma samples.
Our study has a few limitations. We focused on plasma samples that were collected in ethylenediaminetetraacetic acid-coated vacutainers and were stored/handled optimally, which might not always be the case in remote settings where rapid LFAs could be especially valuable. The positive samples we tested were from persons with clinical symptoms who presented to a hospital or testing station. Thus, we have no data on the performance of these tests with samples from persons with asymptomatic infections. Another limitation of our study is that we have no data on the duration of antibody responses after SARS-CoV-2 infection. We also do not know whether antibodies detected by LFAs correlate with protective immunity or whether these tests might be useful for assessing immune responses after vaccination. These are important questions that merit further study.
In conclusion, our study shows that a subset of the LFAs that we examined had comparable sensitivities and specificities to laboratory-based ELISA or CMIA tests for antibodies to SARS-CoV-2. Sensitivities for active or recent infection increased with time after the onset of symptoms, and these values were very good 14 days or longer after symptom onset. The best tests we evaluated had good specificity, but there is room for improvement. Dual testing with certain test combinations provided excellent specificity. Thus, we believe that currently available LFAs provide clinically useful information regarding current or recent COVID-19 in individuals. Additional studies should be performed to assess their value as surveillance tools for SARS-CoV-2 in populations and for documenting antibody responses to vaccines.