Predicting acute uncomplicated urinary tract infection in women: a systematic review of the diagnostic accuracy of symptoms and signs

Background: Acute urinary tract infections (UTI) are among the most common bacterial infections in women presenting to primary care. However, there is no consensus on the optimal reference standard threshold for diagnosing UTI. The objective of this systematic review is to determine the diagnostic accuracy of symptoms and signs in women presenting with suspected UTI across three different reference standards (10², 10³ or 10⁵ CFU/ml). We also examine the diagnostic value of individual symptoms and signs combined with dipstick test results in terms of clinical decision making.
Methods: Searches were performed through PubMed (1966 to April 2010), EMBASE (1973 to April 2010), the Cochrane Library (1973 to April 2010), Google Scholar and reference checking. Studies were included if they assessed the diagnostic accuracy of symptoms and signs of an uncomplicated UTI using a urine culture from a clean-catch or catheterised urine specimen as the reference standard, with a diagnostic threshold of at least 10² CFU/ml. Synthesised data from a high quality systematic review were used regarding dipstick results. Studies were combined using a bivariate random effects model.
Results: Sixteen studies incorporating 3,711 patients are included. The weighted prior probability of UTI varies across diagnostic thresholds: 65.1% at ≥ 10² CFU/ml, 55.4% at ≥ 10³ CFU/ml and 44.8% at ≥ 10⁵ CFU/ml. Six symptoms are identified as useful diagnostic symptoms when a threshold of ≥ 10² CFU/ml is the reference standard. Presence of dysuria (+LR 1.30, 95% CI 1.20-1.41), frequency (+LR 1.10, 95% CI 1.04-1.16), hematuria (+LR 1.72, 95% CI 1.30-2.27), nocturia (+LR 1.30, 95% CI 1.08-1.56) and urgency (+LR 1.22, 95% CI 1.11-1.34) all increase the probability of UTI. The presence of vaginal discharge (+LR 0.65, 95% CI 0.51-0.83) decreases the probability of UTI. Presence of hematuria has the highest diagnostic utility, raising the post-test probability of UTI to 75.8% at ≥ 10² CFU/ml and 67.4% at ≥ 10³ CFU/ml. The probability of UTI increases to 93.3% and 90.1% at ≥ 10² CFU/ml and ≥ 10³ CFU/ml respectively when presence of hematuria is combined with a positive dipstick result for nitrites. Subgroup analysis shows improved diagnostic accuracy using the lower reference standards of ≥ 10² CFU/ml and ≥ 10³ CFU/ml.
Conclusions: Individual symptoms and signs have a modest ability to raise the pre-test risk of UTI. Diagnostic accuracy improves considerably when symptoms and signs are combined with dipstick tests, particularly tests for nitrites.


Background
Acute uncomplicated urinary tract infections (UTI) are among the most common bacterial infections in women presenting to primary care, with an annual incidence of 7% across all ages and peaks among women aged 15-24 years and women older than 65 years [1]. Approximately one third of all women have had at least one physician-diagnosed uncomplicated UTI by the age of 26 years [2].
The original reference standard for diagnosing UTI was the presence of significant bacteriuria, defined as the isolation of at least 10⁵ colony-forming units (CFU) per ml of a single uropathogen in a clean-catch or catheterised urine specimen [3]. However, this cut-off has been debated in recent years, resulting in the use of reduced diagnostic thresholds of 10² [4][5][6][7] and 10³ [8][9][10][11] CFU/ml.
The pre-test probability of asymptomatic bacteriuria in women of reproductive age is approximately 5% [12,13]. However, the pre-test probability of an uncomplicated UTI is shown to increase from 5% to 50% among women presenting with at least one symptom of an uncomplicated UTI [14]. Symptoms of an uncomplicated UTI include dysuria (painful voiding), frequency (frequent voiding of urine), urgency (the urge to void immediately), and hematuria (presence of blood in urine). In contrast, patients presenting with vaginal discharge or irritation have a decreased risk of an uncomplicated UTI [14]. The presence or absence of symptoms therefore functions as a useful diagnostic test. Near-patient testing in the form of urinary dipsticks is also commonly used in primary care to improve the precision of UTI diagnosis, providing immediate results which can be interpreted alongside patient symptoms.
Although empirical treatment of UTI is most cost-effective [15,16], prescribing without confirmation of diagnosis contributes to the growing problem of resistance among uropathogens in primary care [17].
A previous systematic review established the diagnostic accuracy of symptoms and signs for UTI [14]. However, it remains unclear whether the diagnostic accuracy of symptoms and signs varies when alternative reference standards are applied. The aim of this systematic review is to determine the diagnostic accuracy of symptoms and signs of UTI in adult women across three different reference standards: 10², 10³ and 10⁵ CFU/ml. In addition, we aim to determine the diagnostic accuracy of symptoms and signs combined with dipstick test results.

Methods
The PRISMA guidelines for reporting on systematic reviews and meta-analysis were followed to conduct this review (Additional file 1).

Search strategy
We performed a systematic search of three online databases: PubMed (1966 to April 2010), Embase (1973 to April 2010) and the Cochrane Library (1973 to April 2010). A combination of MeSH terms and text words was used, including: 'urinary tract infection/pyelonephritis/cystitis/urethritis', 'physical examination/medical history taking/professional competence', 'sensitivity and specificity', 'reproducibility of results/diagnostic tests, routine/decision support techniques/bayes theorem/predictive value of tests'. All combinations were restricted to 'women and female'. This search was supplemented by checking references of filtered papers and searching Google Scholar [18]. No restrictions were placed on language.

Study selection
To be eligible for inclusion, the studies had to fulfil the following criteria: 1) Have a study population of adult symptomatic women with suspected uncomplicated UTI presenting to a primary care setting. 2) Use a cohort or cross-sectional study design. Case-control studies were excluded. 3) Investigate the diagnostic accuracy of symptoms and signs of UTI using a urine culture from a clean-catch or catheterised urine specimen as the reference test, with a diagnostic threshold of at least 10² CFU/ml. 4) Include sufficient data to allow for the calculation of sensitivity, specificity, negative and positive predictive values and the prevalence of uncomplicated UTI.

Data extraction
The number of true positives, false positives, true negatives and false negatives for each sign and symptom were extracted from each of the studies and a 2 × 2 table was constructed. Discrepancies were resolved by discussion between the two reviewers (LG and GC). Authors were contacted to provide further information when there was insufficient detail in an article to construct a 2 × 2 table.
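As an illustration of how the four extracted counts translate into accuracy measures, the following minimal Python sketch computes sensitivity, specificity, predictive values, prevalence and likelihood ratios from a single 2 × 2 table. The names `TwoByTwo` and `accuracy_measures` and the example counts are hypothetical and are not taken from any included study. The sketch assumes no zero cells, and it is a per-study calculation only; it is not the pooling method used in this review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one symptom in one study (hypothetical container)."""
    tp: int  # symptom present, urine culture positive
    fp: int  # symptom present, urine culture negative
    fn: int  # symptom absent, urine culture positive
    tn: int  # symptom absent, urine culture negative


def accuracy_measures(t: TwoByTwo) -> dict:
    """Per-study accuracy measures; assumes every margin is non-zero."""
    total = t.tp + t.fp + t.fn + t.tn
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": t.tp / (t.tp + t.fp),          # positive predictive value
        "npv": t.tn / (t.tn + t.fn),          # negative predictive value
        "prevalence": (t.tp + t.fn) / total,  # culture-positive fraction
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
    }


# Hypothetical counts, for illustration only.
example = accuracy_measures(TwoByTwo(tp=25, fp=15, fn=75, tn=85))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

A positive likelihood ratio above 1 means the symptom raises the probability of UTI when present; a value below 1 means it lowers it.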

Quality assessment
The methodological quality of the selected studies was evaluated independently by two reviewers (LG and GC) using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool, a validated tool for the quality assessment of diagnostic accuracy studies [19]. This tool was modified to ensure appropriateness to the present study and included twelve questions from the QUADAS tool with two additional questions extracted from a different review [20]. If no consensus was achieved, studies were evaluated by a third independent reviewer (TF).

Data synthesis and analysis

Summary estimates across different reference standards
We used the bivariate random effects model to estimate summary estimates of sensitivity and specificity and their corresponding 95% confidence intervals. This approach was used as it preserves the two-dimensional nature of the original data and takes into account both study size and heterogeneity beyond chance between studies [21]. In addition, the bivariate model estimates and incorporates the negative correlation which may arise between the sensitivity and specificity of a given sign or symptom as a result of differences in the reference standards used in different studies. These alternative thresholds are important when attempting to understand the diagnostic accuracy of symptoms and signs predicting uncomplicated UTI, as studies have used thresholds of ≥ 10² CFU/ml, ≥ 10³ CFU/ml and ≥ 10⁵ CFU/ml. However, pooled estimates cannot be calculated using the bivariate model with fewer than four studies.
We plotted the individual and summary estimates of sensitivity and specificity for each symptom and sign at the different threshold levels in a receiver operating characteristic graph, plotting a symptom's sensitivity (true positive rate) on the y axis against 1 − specificity (false positive rate) on the x axis. We also plotted the 95% confidence region and 95% prediction region around the pooled estimates to illustrate the precision with which the pooled values were estimated (confidence ellipse around the mean value) and to illustrate the amount of between-study variation (prediction ellipse). We assessed heterogeneity visually using the summary ROC plots and statistically by using the variance of logit-transformed sensitivity and specificity, with smaller values indicating less heterogeneity among studies.
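To make the logit scale concrete, the short Python sketch below transforms a set of per-study sensitivities and takes their sample variance. This is only a crude descriptive analogue: the variance reported in this review is the between-study variance estimated by the bivariate model, whereas the sample variance of observed logits also contains within-study sampling error. The function names and the sensitivities are hypothetical.

```python
import math
import statistics


def logit(p: float) -> float:
    """Log-odds transform; p must lie strictly between 0 and 1."""
    return math.log(p / (1 - p))


def logit_variance(proportions: list[float]) -> float:
    """Sample variance of logit-transformed proportions (needs >= 2 values)."""
    return statistics.variance([logit(p) for p in proportions])


# Hypothetical per-study sensitivities for one symptom (not from the review).
tight = [0.70, 0.72, 0.75, 0.73]
spread = [0.40, 0.65, 0.85, 0.95]

# A larger variance on the logit scale suggests more heterogeneity.
print(logit_variance(tight) < logit_variance(spread))
```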

Bayesian analysis and near patient testing (dipstick)
To examine the influence of threshold effects when considering alternative reference standards, we conducted subgroup analysis across the three different thresholds: ≥ 10² CFU/ml, ≥ 10³ CFU/ml and ≥ 10⁵ CFU/ml. Using Bayes' theorem, the post-test odds of a UTI were estimated by multiplying the pre-test odds by the likelihood ratio, where the pre-test odds are calculated by dividing the pre-test probability by (1 − pre-test probability) and the post-test probability equals the post-test odds divided by (1 + post-test odds) [22]. Finally, the diagnostic accuracy of individual symptoms and signs combined with dipstick test results for nitrites, leucocyte-esterase, and nitrites and leucocyte-esterase combined was determined using data synthesised in a previous high quality systematic review of the diagnostic accuracy of dipstick urinalysis [23].
We used Stata version 10.1 (StataCorp, College Station, TX, USA), particularly the metandi commands, for all statistical analyses.
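The odds form of Bayes' theorem described above can be sketched in a few lines of Python. The function name `post_test_probability` is hypothetical. Multiplying several likelihood ratios in sequence, for example a symptom followed by a dipstick result, assumes the findings are conditionally independent; that is an assumption of this sketch.

```python
def post_test_probability(pre_test_probability: float,
                          *likelihood_ratios: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Chaining more than one likelihood ratio assumes the findings are
    conditionally independent (an assumption of this sketch).
    """
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    # Pre-test odds = p / (1 - p)
    odds = pre_test_probability / (1 - pre_test_probability)
    # Post-test odds = pre-test odds x likelihood ratio(s)
    for lr in likelihood_ratios:
        odds *= lr
    # Post-test probability = odds / (1 + odds)
    return odds / (1 + odds)


# Inputs reported in this review: pre-test probability 65.1% at the
# >= 10^2 CFU/ml threshold and a positive likelihood ratio of 1.72 for hematuria.
p_hematuria = post_test_probability(0.651, 1.72)
print(f"{p_hematuria:.3f}")
```

With these rounded inputs the result is roughly 76%, close to the 75.8% reported for hematuria at this threshold; the small difference is presumably due to rounding of the published inputs.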

Characteristics of included studies
The sixteen studies included 3,711 patients and were carried out in a primary care setting. One study was based in the USA [39], two in Canada [4,6], one in New Zealand [38], eight in the UK [8,9,24,25,37,40,41,43] and four in other European countries [7,10,26,42]. The mean weighted prior probability is 65.1% using a reference test of ≥ 10² CFU/ml. The mean weighted prior probability using a reference test of ≥ 10³ CFU/ml and ≥ 10⁵ CFU/ml is 55.4% and 44.8% respectively. Summary characteristics of each included study are presented in Table 1.
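A weighted prior probability can be read as a prevalence in which larger studies count for more. The Python sketch below assumes weighting by study sample size, which is equivalent to dividing the total number of culture-positive patients by the total number of patients. The exact weighting scheme is not specified in this section, so this is an assumption, and the function name and study counts are hypothetical.

```python
def weighted_prior_probability(studies: list[tuple[int, int]]) -> float:
    """Sample-size-weighted prevalence.

    studies: one (culture_positive, total_patients) pair per study.
    """
    positives = sum(pos for pos, _ in studies)
    patients = sum(n for _, n in studies)
    if patients == 0:
        raise ValueError("no patients in the supplied studies")
    return positives / patients


# Hypothetical studies at a single reference standard threshold.
studies = [(60, 100), (150, 300), (45, 50)]
weighted = weighted_prior_probability(studies)
unweighted = sum(pos / n for pos, n in studies) / len(studies)
print(f"weighted: {weighted:.3f}, unweighted: {unweighted:.3f}")
```

The weighted and unweighted values differ whenever prevalence varies with study size, which is why the weighting choice matters for the pre-test probability fed into Bayes' theorem.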

Quality assessment
The summary diagram of the quality assessment is shown in Figure 2. The overall quality of the included studies ranges from moderate to good. It is important to note that several studies were conducted before the introduction of standards for reporting diagnostic accuracy studies [37][38][39][40][41]. Spectrum bias is identified as a potential source of bias in some studies, which either included both complicated and uncomplicated patients [7,38] or failed to report clearly whether the study focused on complicated or uncomplicated UTI [26,40]. Partial verification bias is also noted in two studies, in which the symptoms of only a selected sample of patients are verified by the reference test [24,41]. Furthermore, the presence of uninterpretable test results and the blinding of symptoms and signs and of reference test results are poorly reported.

Diagnostic test accuracy of symptoms and signs

Summary estimates across different reference standard thresholds
Sixteen studies examined the accuracy of ten different symptoms and signs of UTI. The pooled sensitivities, specificities and the respective variance of the logit-transformed sensitivity and specificity for individual symptoms and signs at each of the three reference standard threshold levels are presented in Tables 2, 3 and 4 respectively. Furthermore, the summary estimates of positive and negative likelihood ratios for individual symptoms and signs at each of the three threshold levels are presented in Table 5. Six symptoms are identified as having useful diagnostic value at a reference standard threshold of ≥ 10² CFU/ml, as the 95% confidence intervals of their likelihood ratios do not cross the line of no effect. Presence of dysuria, frequency, hematuria, nocturia and urgency is found to increase the probability of UTI. Presence of vaginal discharge decreases the probability of UTI. Presence of hematuria has the highest diagnostic utility (LR+ 1.72), with a specificity of 0.85 and a sensitivity of 0.25; thus hematuria, when present, is more useful in 'ruling in' UTI. In contrast, all other significant symptoms are identified as being more useful in 'ruling out' UTI. A similar pattern of results emerges using the higher reference standard threshold of ≥ 10³ CFU/ml. Consistent with the findings at the lower thresholds, dysuria, frequency and urgency remain significant symptoms for ruling out a urinary tract infection at ≥ 10⁵ CFU/ml.
The individual and summary estimates of sensitivity and specificity, the 95% confidence region and the 95% prediction region for each symptom and sign at each of the threshold levels are presented in receiver operating characteristic graphs in Figures 3, 4 and 5. The 95% confidence region remains large for several symptoms and signs across the different diagnostic thresholds, with the exception of dysuria, frequency and hematuria. This indicates greater precision of the pooled estimates for dysuria, frequency and hematuria.
Table footnote: one study reported flank pain as loin pain (Lawson 1973).

Principal findings
Individual symptoms and signs suggestive of a UTI have modest diagnostic discriminative value when assessed against three reference standard thresholds for UTI. Dysuria, frequency and urgency have a higher sensitivity than specificity and, when absent, are more useful in ruling out a UTI diagnosis across all three reference standard thresholds: ≥ 10² CFU/ml, ≥ 10³ CFU/ml and ≥ 10⁵ CFU/ml. In contrast, hematuria has a higher specificity than sensitivity and, when present, is more useful in ruling in a diagnosis of UTI at the reference standard thresholds of ≥ 10² CFU/ml and ≥ 10³ CFU/ml. Combining positive dipstick test results, particularly tests for nitrites, with symptoms increases the post-test probability of a UTI. In particular, presence of hematuria combined with a positive dipstick test result for nitrites increases the probability of UTI to 93.3% at ≥ 10² CFU/ml and 90.1% at ≥ 10³ CFU/ml.

Context of previous studies
The findings of this systematic review are consistent with a previous systematic review which concluded that no sign or symptom on its own is powerful enough to 'rule in' or 'rule out' the diagnosis of UTI [14]. However, the relative diagnostic importance of individual symptoms and signs varies between this review and the previous systematic review [14]. The previous systematic review found that presence of dysuria, frequency, hematuria, back pain and costovertebral angle tenderness increases the probability of UTI using diagnostic thresholds ranging between ≥ 10² CFU/ml and ≥ 10⁵ CFU/ml, and that history of vaginal discharge, history of vaginal irritation and vaginal discharge on examination decrease the probability of a UTI. In this systematic review we found that dysuria and frequency increase the probability of UTI across the different reference standard thresholds of ≥ 10² CFU/ml, ≥ 10³ CFU/ml and ≥ 10⁵ CFU/ml. Hematuria is also significant in the present study using diagnostic thresholds of ≥ 10² CFU/ml and ≥ 10³ CFU/ml. However, in contrast to the previous systematic review, back pain is not significantly associated with UTI across the different reference standard thresholds. Vaginal discharge is identified as an important symptom for decreasing the probability of UTI in the present study. Such differences may be an artefact of the different methodological approaches taken. Firstly, the previous systematic review pooled all studies irrespective of the reference standard threshold used, whereas the present study sought to determine the importance of individual symptoms and signs at different reference standard thresholds. In addition, our inclusion criteria were more conservative, excluding studies which involved self-diagnosis, case-control study designs and different healthcare settings (i.e. not primary care settings) where the prevalence of symptoms may differ and increase the chance of spectrum bias.

Strengths and limitations of this study
The systematic search, the conservative inclusion criteria, the inclusion of additional data from authors, and the quality assessment of the included studies can be seen as strengths of this study. In addition, given the lack of consensus regarding reference standard thresholds for UTI, the current study is the first to determine the diagnostic accuracy of symptoms and signs across the three thresholds of ≥ 10² CFU/ml, ≥ 10³ CFU/ml and ≥ 10⁵ CFU/ml. Lastly, this study highlights the importance of using dipstick tests, particularly tests for nitrites, as an additional diagnostic tool when ruling in a UTI diagnosis based on particular symptomatology.
We acknowledge that this review has limitations. Variability of diagnostic accuracy estimates across studies is high. This may be because we did not restrict the age of women included in the meta-analysis. It is known that the prevalence of UTI differs across age groups, peaking at 15-24 years and greater than 65 years [45]. In addition, the definitions used to describe individual symptoms and signs vary across studies. For example, 'lower abdominal pain' has been defined as 'suprapubic pain' [4,6,7], 'suprapubic pressure' [42] or 'abdominal pain' [24]. Furthermore, because the bivariate random effects model is used, symptoms and signs are analysed only when at least four studies are included. Therefore a few symptoms and signs are excluded from our meta-analysis, particularly at the higher reference standard threshold of ≥ 10⁵ CFU/ml. Finally, while the probability of UTI increases when the presence of certain symptoms is combined with positive dipstick test results, it is important to acknowledge that the predictive value of the dipstick test result was taken from a meta-analysis which included men and pregnant women [23].

Implications for practice
Individual symptoms and signs will modestly increase the post-test probability and cannot accurately 'rule in' or 'rule out' the diagnosis of a UTI. Subgroup analysis shows improved diagnostic accuracy using lower reference standards of ≥ 10² CFU/ml and of ≥ 10³ CFU/ml. In addition, combining nitrite dipstick test results with clinical symptoms and signs is useful for ruling in a UTI diagnosis and deciding on the optimal patient management strategy. More recently, formal clinical prediction rules for UTI that incorporate the independent effects of symptoms and signs into a "risk score" have been proposed as an alternative management strategy when considering antibiotic treatment [8]. In terms of duration or severity of symptoms, this approach appears to be equivalent to alternative management strategies for UTI in women, including empirical immediate antibiotics, empirical delayed antibiotics, or use of near-patient testing with a dipstick. However, in terms of antibiotic usage, use of a dipstick results in fewer antibiotics being prescribed when compared to immediate empirical antibiotics or use of a UTI "risk score" [46].
Figure 4 Receiver operating characteristic graphs with 95% confidence region and 95% prediction region for each sign and symptom (10³).

Future studies
The current approach of evaluating symptoms and signs as a diagnostic test is in general two-dimensional, and ignores symptom severity [8,9,28]. In the future, focusing on severity of symptoms may provide more valuable diagnostic information.

Conclusions
Individual symptoms and signs, independent of reference standard threshold, have a modest ability to 'rule in' or 'rule out' the diagnosis of UTI. Use of a dipstick test enhances diagnostic utility and reduces the chance of prescribing unnecessary antibiotics. Future studies should focus on the refinement of diagnosis utilising information on severity and duration of symptoms, alone, in combination and alongside dipstick testing.
Figure 5 Receiver operating characteristic graphs with 95% confidence region and 95% prediction region for each sign and symptom (10⁵).
Table 9 Post-test probability of significant symptoms with a positive (LR 2.57) or negative (LR 0.15) dipstick test for nitrites and leucocyte-esterase combined [23].

Additional material
Additional file 1: PRISMA checklist. Guidelines for the reporting on systematic reviews and meta-analyses