Keywords
tuberculosis; tuberculin skin test; diagnostic accuracy; sensitivity; specificity; predictive value of tests; receiver operating characteristic curve; case-control study
This article is included in the Emerging Diseases and Outbreaks gateway.
During the past decade, the frequency of extrapulmonary forms of tuberculosis (TB) has increased. These forms are often misdiagnosed. This change in the epidemiological profile of TB led us to reflect on the utility of the tuberculin skin test (TST) for detecting active TB. This study aimed to evaluate the diagnostic accuracy of the TST for the detection of active tuberculosis.
This was a multicenter case-control study conducted in 11 anti-TB centers in Tunisia (June-November 2014). The cases were adults aged between 18 and 55 years with newly diagnosed and confirmed tuberculosis. Controls were free from tuberculosis. A data collection sheet was filled out and a TST was performed for each participant.
Diagnostic accuracy measures of the TST were estimated using the receiver operating characteristic (ROC) curve and the area under the curve (AUC), and sensitivity and specificity were estimated at the selected cut-off point.
Overall, 1050 participants were enrolled, comprising 336 cases and 714 controls. The mean age was 38.3±11.8 years for cases and 33.6±11 years for controls.
The mean diameter of the TST induration was significantly higher among cases than controls (13.7 mm vs. 6.2 mm; p = 10⁻⁶). The AUC was 0.789 (95% CI: 0.758-0.819; p = 0.01), corresponding to moderate discriminative performance. The most discriminative TST cut-off value, which gave the best pairing of sensitivity (73.7%) and specificity (76.6%), was ≥ 11 mm, with a Youden index of 0.503. Positive and negative predictive values were 3.11% and 99.52%, respectively.
The TST could be a useful tool for active tuberculosis detection, with moderate overall performance and acceptable sensitivity and specificity at the cut-off point of 11 mm. However, it cannot be considered a gold-standard test because of its multiple disadvantages.
The title of the manuscript was modified. Some grammar mistakes were corrected in the current version. One figure (Figure 2) was deleted, and the underlying data were modified.
See the authors' detailed response to the review by Oliver Stirrup
See the authors' detailed response to the review by Paula Rodríguez-Molino
The tuberculosis (TB) epidemic poses a serious public health problem, with high associated morbidity and mortality rates worldwide.1 It has been estimated that almost a quarter of the world’s population has latent TB infection and that approximately 10 million people have developed active TB, with about 1.6 million attributed deaths worldwide in 2021.2
According to World Health Organization (WHO) estimates, diagnosing and treating TB saved 66 million lives between 2000 and 2020.2 Therefore, there is a need to improve the TB diagnostic procedure. An accurate diagnostic tool is a priority for early detection and appropriate treatment, and for setting up timely control measures to limit the spread of the infection.3 However, there is no international agreement on a gold-standard test for the detection of either active or latent TB.3,4
Detecting active TB with a simple, rapid, and inexpensive test such as the tuberculin skin test (TST) may make the clinical diagnosis of the disease easier.
The TST or Mantoux test is an old, widely used test for TB infection screening and for estimating the delayed hypersensitivity reaction induced by BCG vaccination.5,6
Over the past decade, the frequency of extrapulmonary forms of TB has increased.7,8 These forms, especially lymph node TB, are often misdiagnosed because of nonspecific symptoms and pose a diagnostic challenge for clinicians, requiring invasive methods for diagnostic confirmation.9,10
Tunisia is a country of intermediate TB endemicity, with an estimated incidence of 29/100,000 inhabitants in 2017. Pulmonary TB accounted for 38% of all forms of the disease in 2017, while extra-pulmonary TB accounted for 62% of cases. The frequency of lymph node forms is relatively high, with a constant increase from 2.3/100,000 in 1993 to 18/100,000 in 2017.11
These changes in the epidemiological profile of TB in Tunisia (an increase in the frequency of lymph node forms at the expense of pulmonary forms), together with the arguments cited above, lead us to reflect on the utility and the performance of the TST for active TB detection.
The utility of the TST for the detection of latent TB infection has been widely discussed and debated in the literature,12,13 but its utility in diagnosing active TB has not been sufficiently addressed in previous studies.
Therefore, we aimed, through this multicenter case-control study, to evaluate the diagnostic accuracy of the TST for the diagnosis of active TB, and to select its best cut-off value, using the ROC curve and Fagan’s Nomogram methodology.
This was a multicenter case-control study designed to evaluate a diagnostic test, conducted in 11 anti-TB centers in Tunisia from June to November 2014.
Inclusion and exclusion criteria
Included TB cases were adult patients aged between 18 and 55 years who were newly diagnosed with TB. They had a confirmed TB diagnosis (pulmonary or extra-pulmonary) based on a combination of anamnestic, clinical, chest X-ray, and bacteriological findings. They were recruited from 11 anti-TB centers in the north, center and south of Tunisia (Ariana, Tunis, Sfax, Gafsa, Ben Arous, Bizerte, Sousse, Kairouan, Sidi Bouzid, Kasserine, Tataouine), at the first delivery of their anti-TB treatment.
Controls were recruited from basic health centers and district hospitals located in the same geographical zones as the anti-TB centers in which cases were recruited, and during the same study period. The sex distribution of controls was kept approximately the same as that of TB cases. None of the included controls had clinically manifest active TB; they did not present any respiratory or extra-respiratory symptoms that could be of TB origin.
Exclusion criteria
The following were excluded from the study:
‐ TB cases already treated for pulmonary or extra-pulmonary TB.
‐ TB cases and controls with a pathological condition that may lead to tuberculin anergy: acute viral infections (measles, infectious mononucleosis, influenza), lymphomas, neoplastic pathologies, sarcoidosis, severe bacterial infection, HIV infection, among others.
‐ TB cases and controls having undergone immunosuppressive treatment, corticosteroid therapy for more than one month or vaccination with live attenuated vaccines two months prior to the test.
‐ TB cases and controls with a history of known allergic reaction to one of the components of TST or during a previous administration.
Data were collected using a standardized questionnaire for all participants, including demographic variables (age, sex, educational level), medical history of any pathological condition that may lead to tuberculin anergy, the status of the participant (case or control), the type of TB infection for cases (pulmonary or extra-pulmonary), and the dates of administration and reading of the TST.
Tuberculin skin testing (TST)
It should be noted that the TST was performed for all participants, cases, and controls, not as an investigator-mandated intervention, but as part of the normal process of diagnosing the etiology of their disease.
The test was performed using the Mantoux method, in all participants (cases and controls) by injecting 0.1 mL of tuberculin solution (tuberculin PPD [Purified Protein Derivative RT23 from Copenhagen]), strictly intradermally, on the forearm away from any other scar.6
The TST results were read 72 hours after administration by the same trained investigator. Diameter of induration was determined by calculating the average of the transverse and longitudinal diameter of induration (in mm).6
Continuous and categorical variables were summarized using the mean (± standard deviation) and relative frequencies (expressed as percentages), respectively. Pearson’s chi-square test and Student’s t-test were used to compare two percentages or two means for independent samples, respectively. For all statistical tests, the significance level adopted was 0.05. All statistical analyses were performed using SPSS version 23.0 software.
Sensitivity and specificity
To assess the performance of the TST, we first calculated the sensitivity and specificity of the TST induration diameter as well as the Youden index for different possible thresholds (from a diameter of TST ≥ 5 mm to a TST diameter ≥ 15 mm).
The calculation of the 95% confidence intervals (95% CI) of the sensitivity and specificity was done using an Excel calculator and using the properties of the exact binomial distribution.
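As an illustration of an interval based on the exact binomial distribution, the following is a minimal sketch of the Clopper-Pearson method. It assumes that this is the "exact" interval meant here; the Excel calculator actually used is not described in the text, and the function names are hypothetical.

```python
from math import exp, lgamma, log


def binom_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p), computed in log space for stability."""
    log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
               + i * log(p) + (n - i) * log(1 - p))
    return exp(log_pmf)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for the proportion k/n, found by bisection."""
    def solve(p_is_too_small):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if p_is_too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k | p) reaches alpha / 2.
    lower = 0.0 if k == 0 else solve(
        lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= k | p) falls to alpha / 2.
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper
```

Calling `exact_ci(tp, tp + fn)` would give an interval for sensitivity and `exact_ci(tn, tn + fp)` one for specificity, where the counts come from the 2×2 table at a given threshold.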
Sensitivity (Se) was defined as the proportion (ranging from 0 to 1) of tuberculosis cases who tested positive (true positives = TP) with the TST. Cases that were not identified by the TST were false negative (FN) results.14
Specificity (Sp) was defined as the proportion (ranging from 0 to 1) of controls who tested negative for the disease (true negatives = TN) with the TST. Controls that tested positive with the TST were false positive (FP) results.14,15
The two intrinsic qualities of the test, sensitivity and specificity, were aggregated into an index known as the Youden index, denoted J, such that: J = (Se + Sp) – 1.
Youden index varies between -1 and +1; a value less than or equal to 0 reflects the diagnostic ineffectiveness of the test. The test performance is better when its Youden index is close to 1.16
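The threshold scan described above can be sketched as follows. The induration diameters are made-up illustrative values, not study data, and the function names are hypothetical; a test is counted as positive when the diameter is greater than or equal to the cut-off.

```python
def sensitivity_specificity(case_mm, control_mm, cutoff):
    """Se and Sp when a diameter >= cutoff (in mm) counts as a positive test."""
    true_pos = sum(d >= cutoff for d in case_mm)
    true_neg = sum(d < cutoff for d in control_mm)
    return true_pos / len(case_mm), true_neg / len(control_mm)


def youden_table(case_mm, control_mm, cutoffs=range(5, 16)):
    """One row (cutoff, Se, Sp, J) per threshold from >=5 mm to >=15 mm."""
    rows = []
    for cutoff in cutoffs:
        se, sp = sensitivity_specificity(case_mm, control_mm, cutoff)
        rows.append((cutoff, se, sp, se + sp - 1))
    return rows


def best_cutoff(rows):
    """Row with the highest Youden index (first one wins in case of ties)."""
    return max(rows, key=lambda row: row[3])


# Made-up diameters (mm), for illustration only.
example_cases = [9, 12, 14, 15]
example_controls = [0, 5, 8, 12]
example_best = best_cutoff(youden_table(example_cases, example_controls))
```

In this toy data set the scan selects the cut-off ≥9 mm, where all four cases and three of the four controls are classified correctly.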
Positive and negative predictive values (PPV and NPV) (or post-test probabilities) of TST were also determined for different possible TST thresholds, using the properties of Bayes’ theorem17 and using Fagan’s nomogram graph.18
The predictive values were derived from Fagan’s nomogram based on the TB prevalence (or pre-test probability) according to TB patients’ data from three university hospitals in Tunis, Tunisia (The Rabta Hospital, Charles Nicolle Hospital and Abderrahmane Mami hospital). The estimated TB prevalence was around 1% in 2016.
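For readers who prefer a calculation to a graphical reading, Bayes' theorem can be applied directly. The sketch below is a minimal illustration with a hypothetical function name; the inputs in the example are the sensitivity and specificity reported for the ≥11 mm threshold and the 1% pre-test probability stated above. Values read off a nomogram may differ slightly from computed ones.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from Bayes' theorem.

    All three inputs are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported Se and Sp at the >=11 mm threshold, with a 1% pre-test probability.
ppv, npv = predictive_values(0.737, 0.766, 0.01)
```

Because the pre-test probability is low, the positive predictive value stays small even with reasonable sensitivity and specificity, which is the pattern the nomogram also shows.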
The positive and negative likelihood ratios (LR) were also determined for different possible thresholds of the TST, and their 95% confidence intervals (95% CI) were calculated using the Excel calculator.
The positive LR (which varies from 1 to +∞ for an informative test) indicates better discrimination of TB cases from non-cases the farther it is from 1, which occurs as specificity approaches 1.
The negative LR (which varies from 0 to 1 for an informative test) indicates better discrimination of TB cases from non-cases the closer it is to 0, which occurs as sensitivity approaches 1.
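The two ratios follow directly from sensitivity and specificity, and the odds form of Bayes' theorem, which Fagan's nomogram represents graphically, converts a pre-test probability into a post-test probability. A minimal sketch with hypothetical function names:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-). Undefined when specificity is exactly 0 or 1."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Post-test probability via odds: post-test odds = pre-test odds x LR."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Sensitivity and specificity reported for the >=11 mm threshold.
lr_pos, lr_neg = likelihood_ratios(0.737, 0.766)
```

A likelihood ratio of 1 leaves the pre-test probability unchanged, which is why values far from 1 indicate a more informative test.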
ROC and area under the curve (AUC)
The overall discriminative performance of the TST was evaluated using the ROC curve,19 and the optimal cut-off value was taken as the one giving the best pairing of sensitivity and specificity.
We established a ROC curve first for all participants (age, sex and site of infection combined), then according to site of infection (pulmonary and lymph node tuberculosis) and according to age group.
The AUC was calculated for each ROC curve and presented with its 95% confidence interval (95% CI). The AUC can vary between 0.5 and 1. The closer it is to 1, the better the discriminating ability or overall diagnostic value of the TST.
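The empirical AUC equals the probability that a randomly chosen case has a larger induration than a randomly chosen control, with ties counted as one half (the Mann-Whitney formulation). The sketch below illustrates this equivalence with a hypothetical function name; it is not the SPSS procedure used in the study.

```python
def auc_mann_whitney(case_values, control_values):
    """Empirical AUC as the share of case-control pairs ranked correctly."""
    score = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                score += 1.0   # case correctly ranked above control
            elif case == control:
                score += 0.5   # ties count as half
    return score / (len(case_values) * len(control_values))
```

A value of 0.5 corresponds to no discrimination; an empirical value below 0.5 would mean that controls tend to have larger readings than cases.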
All included participants were informed about the purpose of the study and gave their written informed consent to participate in the study and to undergo the TST. They were also informed about their right to refuse participation or to drop out at any moment during data collection. All collected information and data analyses were kept confidential and anonymous during and after data collection.
In Tunisia, there was only one national ethics committee at the time of the study (2014), which was in charge of requests for clinical trial type studies only. Since no blood samples were taken or procedures performed on the participants for research purposes, and since the normal process of diagnosis and management of all patients was respected without any intervention (the TST was performed in all participants for a diagnostic purpose and not for a research purpose), we did not submit our study to that committee at the time. The study was retrospectively approved in August 2023 by the ethics committee of the Faculty of Medicine of Tunis under the approval number CE-FMT/2023/03/HCN/V1.
Overall, 1050 participants were included (336 cases and 714 controls) with a mean age of 35 years. Of the cases, more than half were female (n = 179, 53.3%), sex ratio (M/F) = 0.87, with a mean (± standard deviation) age of 38.3 (±11.8) years. Of the controls, half were female (n = 358, 50.1%), sex ratio (M/F) = 0.99, with a mean (± standard deviation) age of 33.6 (±11.0) years (see Table 1). The Bacillus Calmette-Guérin vaccination scar was present among all included controls and among most TB cases (83.8%).
Characteristics | | Cases: n (%) or mean (±SD) | Controls: n (%) or mean (±SD) |
---|---|---|---|
Sex | Male | 157 (46.7) | 356 (49.9) |
 | Female | 179 (53.3) | 358 (50.1) |
Age (years) | | 38.3 (11.8) | 33.6 (11.0) |
Education level | Illiterate | 56 (16.7) | 29 (4.1) |
 | Primary/Secondary | 233 (69.3) | 451 (63.1) |
 | University | 47 (14.0) | 234 (32.8) |
Site of infection (for cases) | Pulmonary | 121 (36.0) | |
 | Extra-pulmonary: lymph node | 180 (53.6) | |
 | Extra-pulmonary: other* | 35 (10.4) | |
The mean diameter of the TST induration was significantly higher among cases than controls (13.7 mm (SD: 0.7 mm) versus 6.2 mm (SD: 6.4 mm); p = 10⁻⁶) (see Figure 1).
There was no significant difference in the mean diameter of the TST induration between male and female participants (8.5 mm (SD: 6.9 mm) versus 8.7 mm (SD: 7.9 mm); p = 0.6).
The TST induration diameter disaggregated by sex and by participant status is represented in Table 2.
For the overall performance of the TST among all 1050 participants, the most discriminative cut-off value was an induration diameter ≥11 mm, with a sensitivity and specificity of 73.7% (95% CI: 68.8% – 78.1%) and 76.6% (95% CI: 73.3% – 79.5%), respectively, a Youden index of 0.503 and an area under the curve (AUC) of 0.789 (95% CI: 0.758 – 0.819; p = 0.01). Sensitivity exceeded 80% for cut-off values from ≥9 mm down to ≥5 mm (see Table 3, Figure 2A). Using Fagan’s nomogram for the selected threshold (≥11 mm), the positive and negative predictive values were 3.11% and 99.52%, respectively (see Figure 3).
(A) All 1050 participants. (B) 180 cases of lymph node tuberculosis. (C) 121 cases of pulmonary tuberculosis.
By sex, the best cut-off value of the TST was also ≥11 mm for both male participants (sensitivity: 77.2%; specificity: 75.2%; Youden index: 0.524) and female participants (sensitivity: 70.7%; specificity: 77.9%; Youden index: 0.486). Sensitivity exceeded 80% for cut-off values from ≥10 mm down to ≥5 mm for male participants and from ≥8 mm down to ≥5 mm for female participants (see Table 4).
By site of infection, the best cut-off value for pulmonary TB (121 cases versus 714 controls) and lymph node TB (180 cases versus 714 controls) was also ≥11 mm, with an area under the curve (AUC) of 0.778 (95% CI: 0.733-0.822; p = 0.02) and 0.814 (95% CI: 0.777-0.851; p = 0.01), respectively (see Table 5, Figure 2B and C). Sensitivity exceeded 80% for cut-off values from ≥8 mm down to ≥5 mm for pulmonary TB and from ≥10 mm down to ≥5 mm for lymph node TB (see Table 5).
Depending on age groups, the best cut-off value of TST among participants aged <35 years old (n = 148 cases; sensitivity: 79.1%; specificity: 79.8%; Youden Index: 0.589; AUC: 0.822 (95% CI: 0.782 – 0.862; p = 0.02)) and participants aged ≥35 years (n = 188 cases; sensitivity: 70.2%; specificity: 71.3%; Youden Index: 0.415; AUC: 0.750 (95% CI: 0.703 – 0.798; p = 0.02)) was also ≥11 mm for both groups.
TB remains a serious public health threat. In this study, we evaluated the performance of the TST in a diagnostic situation and measured its accuracy indicators at different possible cut-off points in a large population of 1050 participants. The most discriminative cut-off value was chosen as the one giving the best pairing of sensitivity and specificity, that is, the highest Youden index.
The TST is widely used for the detection of a latent TB form and for screening the contacts of a TB patient. This test has the advantage of being easy, rapid and safe to conduct, and low-cost.20
In view of our results, it may be concluded that the TST can be used for active TB diagnosis with moderate overall performance (AUC = 0.789),19,21 and a good pairing of sensitivity and specificity at the cut-off point of 11 mm.
The positive (3.1) and negative (0.34) LRs at the selected cut-off value also indicate moderate performance and utility of the TST.22
There is no international agreement on the best TST cut-off value that clearly distinguishes a positive from a negative test result. Different values have been proposed, varying between 5 mm and 10 mm of induration diameter, but the choice depends on many epidemiological and immunological patient risk factors.23
Considering our results, sensitivity exceeded 80% for cut-off values from ≥9 mm down to ≥5 mm. The induration diameter threshold could therefore be reduced from 11 mm (the best cut-off value) to 9 mm to enhance the sensitivity of the TST and, consequently, to reduce the frequency of false negatives and limit the misdiagnosis of TB cases, regardless of their pulmonary or extra-pulmonary form.
Moreover, extrapulmonary TB is difficult to diagnose and requires invasive diagnostic procedures, which poses a problem for clinicians.24
In Tunisia, diagnostic confirmation of lymph node TB requires needle aspiration or surgical excision with cytological or pathological and bacteriological study (direct examination, Xpert MTB/RIF test and culture). Generally, the lymph node TB lesions are paucibacillary with a rarely positive direct examination (about 10% of positive rate only).11
Since lymph node tuberculosis is usually more difficult to clinically diagnose than pulmonary tuberculosis, it is thus possible, in case of lymph node tuberculosis suspicion, to use the 10 mm threshold to guarantee a good sensitivity (>80%) of the test.
However, the TST has some limitations. The skin induration measurement is operator-dependent and remains subjective; its interpretation depends on many other factors, such as immunological status, co-morbid conditions, and prior Bacillus Calmette-Guérin (BCG) vaccination.20,25
Indeed, the TST is not a specific test that distinguishes between TB infection and post-vaccination allergy. Cross-reaction of the TST with antigens of non-tuberculous mycobacteria and with the antigen used for the BCG vaccine generates false positives, inducing an apparent increase in sensitivity and a decrease in specificity.3,26
Certainly, progress has been made in developing other rapid tests for TB screening like Xpert MTB/RIF, Xpert MTB/RIF Ultra2 and Interferon-gamma release assay (IGRA) tests. However, they are often unavailable and expensive, especially in low- and middle-income countries, which have the highest TB burden.3,27 These novel tests are therefore not an accessible option for TB diagnosis in such settings.27
The strengths of our study include the large sample size (1050 participants), the easy application of the TST and the clear and detailed description of the diagnostic test evaluation methodology, which supports repeatability. As with any epidemiological study, there were some limitations, chiefly that the sample of controls may not be representative of all non-TB individuals in real life, and that the test was not administered blind, which may have influenced the operator's interpretation of the test result.
Therefore, for a more suitable evaluation of TST performance in diagnosing active TB, further prospective studies are needed. We propose following up a sample of closely exposed contacts of TB patients for a minimum of two years, for prospective detection of incident cases. At the end of the follow-up period, the contacts will be divided into newly diagnosed active cases and controls, which constitutes a nested case-control study. This method will minimize selection bias and improve the representativeness of included cases and controls. All included participants will receive the TST and the IGRA test at the same time. It will therefore be possible to compare the performance of the two tests by comparing their ROC curves.
A stepwise diagnostic approach based on a decision flowchart using consecutive tests can also be a good alternative. The TST can be used as the first-line test because it is the easiest and cheapest test to apply at a large scale (with good sensitivity); then, depending on the TST result, the IGRA test can be used as a second-line test for more specific results.28–30
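To illustrate the trade-off behind such a two-step strategy, the sketch below computes the net sensitivity and specificity of serial testing, in which the second test is applied only to those positive on the first and a person is classified as positive only if both tests are positive. It assumes the two tests are conditionally independent given disease status, which is a strong assumption for two immune-response tests; the function name and the accuracy figures in the example are hypothetical placeholders, not results from this study or from the cited literature.

```python
def serial_testing(se_first, sp_first, se_second, sp_second):
    """Net (Se, Sp) when test 2 is run only on test-1 positives.

    Assumes conditional independence of the two tests given disease status.
    """
    # A case is detected only if both tests are positive.
    net_se = se_first * se_second
    # A control is cleared by test 1, or flagged by test 1 then cleared by test 2.
    net_sp = sp_first + (1 - sp_first) * sp_second
    return net_se, net_sp


# Hypothetical accuracy values, chosen only to show the direction of the effect.
net_se, net_sp = serial_testing(0.80, 0.70, 0.90, 0.90)
```

Under these assumptions, serial testing raises specificity at the cost of sensitivity, which is why a sensitive first-line threshold matters in this kind of strategy.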
Economic evaluation of the cost-effectiveness of this diagnostic approach must be conducted to support the usefulness and the rational application of this strategy for public health purposes.
Finally, the ROC curve certainly has an important role in determining the best TST cut-off for TB diagnosis or screening. However, it is important to consider that the interpretation of the TST characteristics is based on probabilities. Also, its predictive values cannot be properly assessed without considering all other risk factors. There is therefore a need to develop a predictive clinical score for TB based on a range of epidemiological (TB prevalence, demographic characteristics, notion of contagion), clinical (individual risk factors, TB symptoms) and paraclinical (chest X-ray, among others) factors. This approach can optimize and guide the use of the various diagnostic methods for TB. Unnecessary invasive examinations and procedures would thereby be avoided while reducing the cost of TB diagnosis.
The TST is a simple, low-cost test. It can be used for active TB detection with moderate overall performance and acceptable sensitivity and specificity at the cut-off point of 11 mm. However, it has some disadvantages, especially its low specificity, with high rates of false positives in areas with mass BCG vaccination. Combining the TST with another test such as the IGRA test would be a good alternative for early and accurate diagnosis of TB.
Harvard Dataverse: Tuberculin Skin Test, https://doi.org/10.7910/DVN/630CJG
This project contains the following underlying data:
• TST-final.tab (anonymised underlying data collected from the 1050 included participants)
Data are available under the terms of the Creative Commons Zero “No rights reserved” data waiver (CC0 1.0 Public domain dedication).
The authors would like to thank all TB center coordinators involved in this study for their contribution and participation in facilitating the data collection process for the investigators.
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Medical Statistician, with limited experience of TB research
Competing Interests: No competing interests were disclosed.
Is the work clearly and accurately presented and does it cite the current literature?
Yes
Is the study design appropriate and is the work technically sound?
Yes
Are sufficient details of methods and analysis provided to allow replication by others?
Yes
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Yes
Are the conclusions drawn adequately supported by the results?
Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Pediatric Infectious and Tropical Diseases
Is the work clearly and accurately presented and does it cite the current literature?
Partly
Is the study design appropriate and is the work technically sound?
Partly
Are sufficient details of methods and analysis provided to allow replication by others?
Partly
If applicable, is the statistical analysis and its interpretation appropriate?
Yes
Are all the source data underlying the results available to ensure full reproducibility?
Partly
Are the conclusions drawn adequately supported by the results?
Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Medical Statistician, with limited experience of TB research
Alongside their report, reviewers assign a status to the article:
Invited Reviewers | 1 | 2 |
---|---|---|
Version 2 (revision), 01 Jul 24 | read | read |
Version 1, 10 Oct 23 | read | read |