Detecting depressive and anxiety disorders in distressed patients in primary care; comparative diagnostic accuracy of the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS)

Background: Depressive and anxiety disorders often go unrecognized in distressed primary care patients, despite the overtly psychosocial nature of their demand for help. This is especially problematic in more severe disorders needing specific treatment (e.g. antidepressant pharmacotherapy or specialized cognitive behavioural therapy). A screening tool to detect (more severe) depressive and anxiety disorders may help to avoid overlooking such disorders. We examined the accuracy with which the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS) detect (more severe) depressive and anxiety disorders in distressed patients, and which cut-off points should be used.

Methods: Seventy general practitioners (GPs) included 295 patients on sick leave due to psychological problems. They excluded patients with recognized depressive or anxiety disorders. Patients completed the 4DSQ and HADS. Standardized diagnoses of DSM-IV defined depressive and anxiety disorders were established with the Composite International Diagnostic Interview (CIDI). Receiver Operating Characteristic (ROC) analyses were performed to obtain sensitivity and specificity values for a range of scores, and area under the curve (AUC) values as a measure of diagnostic accuracy.

Results: With respect to the detection of any depressive or anxiety disorder (180 patients, 61%), the 4DSQ and HADS scales yielded comparable results with AUC values between 0.745 and 0.815. With respect to the detection of moderate or severe depressive disorder, the 4DSQ and HADS depression scales also performed comparably (AUC 0.780 and 0.739, p = 0.165). With respect to the detection of panic disorder, agoraphobia and social phobia, the 4DSQ anxiety scale performed significantly better than the HADS anxiety scale (AUC 0.852 versus 0.757, p = 0.001). The recommended cut-off points of both HADS scales appeared to be too low, while those of the 4DSQ anxiety scale appeared to be too high.

Conclusion: In general practice patients on sick leave because of psychological problems, the 4DSQ and the HADS are equally able to detect depressive and anxiety disorders. However, for the detection of cases severe enough to warrant specific treatment, the 4DSQ may have some advantages over the HADS, specifically for the detection of panic disorder, agoraphobia and social phobia.


Background
Depressive and anxiety disorders are prevalent in primary care patients, although these disorders are often not recognized as such by the general practitioner (GP) [1][2][3]. One explanation for non-recognition of psychiatric disorders concerns the time constraints in general practice, and the competing demands put upon the GP [4]. A special situation occurs in patients with an overt psychosocial presentation. These distressed patients present with psychological complaints, such as nervousness or feeling depressed, or psychosocial problems, such as occupational or marital problems, which are readily available for discussion with the GP [5]. Because these complaints and problems often arise in the context of 'life stress', the discussion usually remains exclusively focused on psychosocial problems, not on complaints and symptoms that might indicate a specific psychiatric disorder. In order to diagnose such disorders, GPs need to make a clinical assessment of certain symptoms. However, it is not always easy for the GP to change the subject of the discussion away from the (presumed) causes of the symptoms, towards an assessment of the symptoms themselves. Therefore, whereas GPs are fully aware of the presence of psychological problems, they may fail to establish specific psychiatric diagnoses of depressive and anxiety disorders in many cases where such diagnoses are justified [6]. This is especially problematic in more severe psychiatric disorders since these often require specific treatment. The use of questionnaires can be a useful strategy to detect (more severe) depressive and anxiety disorders in patients with relatively high risks, i.e. in patients presenting with distress in general practice [7].
In this study we examined and compared the diagnostic accuracy of two questionnaires, the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS), with respect to the detection of DSM-IV defined depressive and anxiety disorders in a sample of distressed patients in primary care with and without such disorders. In distressed patients we do not need questionnaires to detect mild depressive or anxiety disorders (let alone depressive or anxiety symptoms) since these patients often need no other interventions than non-specific counselling or problem-solving focused on how to cope with the stress in the patients' lives [8]. On the other hand, it is specifically important to identify patients with more severe disorders who may be in need of specific treatment (e.g. antidepressant pharmacotherapy or specialized cognitive behavioural therapy). Therefore, we did not only look at how well the 4DSQ and the HADS detected any DSM-IV defined depressive and anxiety disorders but we also analyzed, more specifically, how well these questionnaires detected moderate and severe major depressive disorders, panic disorder, agoraphobia and social phobia.

Patients and procedure
We performed a cross-sectional analysis of baseline data collected for a randomized controlled trial concerning an activating social work intervention for distressed patients in general practice [9]. Patients were included in the study if they were aged 18-60, had a 'nervous breakdown' according to the GP, were employed, and had been on sick leave for no longer than three months. Dutch GPs use the diagnostic label 'nervous breakdown' (or 'overstressed', in Dutch 'overspanning') for a syndrome that is associated with too much stress, to the extent that the patient can no longer cope [10,11]. This condition is characterized by psychological distress, failure to cope, and social dysfunctioning (i.e. sick leave in employed people) [12]. The GPs were instructed not to include patients with obvious depressive and anxiety disorders (i.e. patients in whom they had clinically diagnosed such disorders). Adequate command of the Dutch language and no current psychological treatment were additional inclusion criteria. Between August 2001 and July 2003, 70 GPs in the city of Almere, the Netherlands, assessed and referred 370 patients to the study centre. The patients were contacted by telephone, given information about the study, and checked against the inclusion criteria. Thirty patients were unwilling to participate, and a further 33 patients did not meet the inclusion criteria. With the remaining patients an appointment was made for a baseline interview in their homes, and they were sent written information and baseline questionnaires by post. The home visit took place on average 5.7 days after the initial telephone contact. After the patients had been fully informed about the study, written informed consent was obtained. The participating patients (n = 307) were then interviewed, and subsequently their questionnaires were checked for missing values by the interviewer. Unfortunately, the date on which the questionnaires were completed was not recorded. Therefore, the exact time interval between the questionnaires and the interviews is unknown, but we estimate it to be on average between 2 and 5 days. Because 12 patients failed to complete the questionnaires, the sample for the present study comprised 295 patients. Prior to the start of the study, approval was obtained from the ethical committee of the Netherlands Institute of Mental Health and Addiction.

Measurements
The interview comprised the mood and anxiety disorder sections of the Composite International Diagnostic Interview (CIDI), a standardized diagnostic interview, developed to be applied by trained lay interviewers, resulting in psychiatric diagnoses according to DSM-IV and ICD-10 criteria [13]. The specific phobia section was omitted because these problems are (if not accompanied by other mental disorders) associated with relatively little disability and impairment [14,15] and, therefore, isolated (non-comorbid) specific phobia appears to be of relatively little importance in patients with a 'nervous breakdown'. The obsessive-compulsive disorder section was omitted because of the low prevalence of this disorder [14,16]. Accordingly, the present study used the following current DSM-IV [17] diagnoses: major depressive disorder, dysthymia, bipolar disorder, generalized anxiety disorder, panic disorder with and without agoraphobia, agoraphobia without panic disorder, and social phobia. The CIDI was administered by five interviewers who received a training course at the Dutch WHO-CIDI Training and Reference Centre at Amsterdam, after which they were certified to deliver the fully structured CIDI interview. They used a computer-assisted version in which the questions were presented according to diagnostic algorithms and responses were entered directly into the computer. None of the interviewers had any particular expertise in psychology or psychiatry. The interviewers remained ignorant of the CIDI diagnoses as such.
The questionnaires encompassed the Four-Dimensional Symptom Questionnaire (4DSQ) and the Hospital Anxiety and Depression Scale (HADS). The 4DSQ is a 50-item self-rating questionnaire measuring 'distress', 'depression', 'anxiety' and 'somatization' [18]. The 4DSQ assesses psychological and psychosomatic symptoms experienced during the past seven days. The distress scale (16 items, score range 0-32) measures symptoms of general psychological distress, which is conceptualized as the most general, most basic expression of human psychological suffering [18,19]. The depression scale (6 items, score range 0-12) measures severe anhedonia and depressive cognitions (including suicidal ideation), symptoms considered to be characteristic of depressive disorder [20,21]. The anxiety scale (12 items, score range 0-24) measures irrational fears, panic and avoidance, characteristic features of most anxiety disorders [17]. The somatization scale (16 items, score range 0-32) measures a range of 'psychosomatic' symptoms, characteristic of bodily distress and somatoform disorders [22]. For all 4DSQ scales, higher scores represent higher symptom levels. In this study, we examined the 4DSQ distress scale for its ability to detect any depressive or anxiety disorder, and the 4DSQ depression and anxiety scales for their abilities to detect depressive and anxiety disorders. The 4DSQ depression and anxiety scales are supposed to detect depressive and anxiety disorders severe enough to consider specific treatment. For the 4DSQ scales two cut-off points are recommended, dividing the scales into low, moderate and high scores (Table 1). High scores indicate a relatively high probability of caseness, prompting an immediate clinical diagnosis. On the other hand, low scores indicate the probable absence of a clinically relevant disorder. Moderate scores indicate a relatively low probability of caseness, warranting follow-up and reassessment after a few weeks [7].
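The division of a scale into low, moderate and high scores by two cut-off points can be sketched as follows. This is a minimal illustration, not part of the 4DSQ manual: the names `CUTOFFS`, `RANGES` and `band` are hypothetical, and the cut-off pairs are the recommended values quoted in the Main findings of this paper (≥3 and ≥6 for depression, ≥8 and ≥13 for anxiety).

```python
from typing import Literal

Band = Literal["low", "moderate", "high"]

# Recommended (lower, upper) cut-off points as quoted in this paper.
CUTOFFS = {
    "4dsq_depression": (3, 6),
    "4dsq_anxiety": (8, 13),
}

# Score ranges of the scales as described in the Measurements section.
RANGES = {
    "4dsq_depression": (0, 12),
    "4dsq_anxiety": (0, 24),
}


def band(scale: str, score: int) -> Band:
    """Classify a scale score as low, moderate or high."""
    minimum, maximum = RANGES[scale]
    if not minimum <= score <= maximum:
        raise ValueError(f"{scale} score must lie in {minimum}-{maximum}")
    lower, upper = CUTOFFS[scale]
    if score >= upper:
        return "high"      # relatively high probability of caseness
    if score >= lower:
        return "moderate"  # follow-up and reassessment after a few weeks
    return "low"           # probable absence of a clinically relevant disorder


if __name__ == "__main__":
    for value in (2, 4, 7):
        print(value, band("4dsq_depression", value))
```

The same function could be applied to the HADS 'possible' and 'probable' cut-off points once those values are taken from Table 1.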
The HADS is a 14-item self-rating questionnaire measuring 'depression' and 'anxiety' [23][24][25][26]. Like the 4DSQ, the HADS uses seven days as reference period. The depression scale (7 items, score range 0-21) measures mostly anhedonia, a phenomenon considered to be the central characteristic of major depressive disorder [21]. The anxiety scale (7 items, score range 0-21) measures mostly symptoms of generalized anxiety disorder [27]. For both HADS scales higher scores represent higher symptom levels. The depression and anxiety scales are intended to detect depressive and anxiety disorders in general medical settings. The depression scale is specifically intended to select those depressed patients who may be helped by the prescription of an antidepressant drug [21]. Like the 4DSQ, the HADS scales employ two cut-off points, one for the detection of 'possible' and one for the detection of 'probable' depressive or anxiety disorder (Table 1). Because the HADS total score is sometimes recommended as a measure of general psychological distress [28,29], we examined the total score for its ability to detect any depressive or anxiety disorder. There is a noteworthy difference in item content between the 4DSQ and the HADS. Unlike the 4DSQ, the HADS contains six positively worded items, of which five belong to the depression scale (e.g. 'I can laugh and see the funny side of things'; 'I feel cheerful'). These five depression items are assumed to measure anhedonia (i.e. loss of the ability to enjoy ordinary things in life) if the scores are reversed. In contrast, the 4DSQ depression scale rests heavily on depressive cognitions, including suicidal ideation (e.g. 'did you feel that everything is meaningless?'; 'did you ever think "If only I was dead"?'). At first glance, the 4DSQ depression scale seems to tap more severe depressive symptoms than the HADS depression scale. Regarding anxiety, unlike the HADS anxiety scale, the 4DSQ anxiety scale contains items on phobic fears and avoidance behaviour (e.g. 'were you afraid to travel on buses, trains or trams?'; 'were you afraid of becoming embarrassed when with other people?'; 'did you have to avoid certain places because they frightened you?'). In contrast, the HADS anxiety scale contains a few items of which the equivalents are included in the 4DSQ distress scale (e.g. 'feeling tense', 'worrying', 'feeling restless'). Again, at first glance, the 4DSQ anxiety scale seems to tap more severe symptoms than the HADS anxiety scale.
The interviewers collected the questionnaires after the CIDI-interview and checked the questionnaires for missing values. They did not have the knowledge, means or time to calculate scale scores. The questionnaire scores were later entered into a database by research assistants who were ignorant of the CIDI diagnoses.

Description of the sample
The sample consisted of 178 women and 117 men with a mean age of 39.5 years (SD 9.2). Dysthymia was diagnosed only in seven patients of whom six also fulfilled criteria for major depressive disorder. Bipolar disorder was diagnosed only in five patients who all had a 'mild manic episode'. Because of the small numbers, we decided not to include dysthymia and bipolar disorder in the further analyses (but we did not exclude the patients). Most patients with a major depressive disorder had a single episode; only 11% were diagnosed as having a recurrent disorder. Table 2 provides details about the distribution of the CIDI/DSM-IV diagnoses, as well as the internal consistency of the questionnaire scales, and the mean scores of the scales per diagnostic category. Not surprisingly, patients with the diagnosis of severe major depression had the highest scores on the 4DSQ depression and HADS depression scales. Patients with panic disorder with agoraphobia had the highest scores on the 4DSQ anxiety scale, while patients with panic disorder without agoraphobia had the highest scores on the HADS anxiety scale. Furthermore, it is shown that patients with any anxiety disorder scored about the same as patients with major depression on both depression scales, but they scored higher than depressed patients on both anxiety scales. These observations can partly be attributed to overlap of the diagnostic categories (i.e. comorbidity). Of those patients with a depressive disorder diagnosis 45% had one or more comorbid anxiety disorder diagnoses. Of those patients with an anxiety disorder diagnosis 66% had a comorbid depressive disorder diagnosis. Table 3 presents the results of the ROC analyses. For four of the five comparisons there was no statistically significant difference between the 4DSQ and HADS scales. Only with respect to the ability to detect panic disorder, agoraphobia and social phobia, the 4DSQ anxiety scale performed significantly better than the HADS anxiety scale. 
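The AUC values from the ROC analyses can be read as the probability that a randomly chosen patient with the disorder scores higher on the scale than a randomly chosen patient without it, with ties counting one half. A minimal Python sketch of that interpretation follows; the function name `auc` and the toy scores are illustrative assumptions, not the software or the data used in this study.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], noncase_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    Every case is compared with every non-case: a higher case score counts
    as 1, an equal score as 0.5, and a lower score as 0.
    """
    if not case_scores or not noncase_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for case in case_scores:
        for noncase in noncase_scores:
            if case > noncase:
                wins += 1.0
            elif case == noncase:
                wins += 0.5
    return wins / (len(case_scores) * len(noncase_scores))


if __name__ == "__main__":
    # Hypothetical scale scores, for illustration only.
    cases = [5, 7, 9]
    noncases = [2, 5, 6]
    print(round(auc(cases, noncases), 3))
```

An AUC of 0.5 corresponds to a scale that does not discriminate at all, and 1.0 to perfect separation of cases and non-cases.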
Table 4 shows the diagnostic performance of the 4DSQ distress score and the HADS total score with respect to the detection of any depressive or anxiety disorder. The 'optimal' cut-off point of ≥24 on the 4DSQ distress scale, where there is a balance between sensitivity and specificity, is found in the upper third of the scale. This indicates that severe distress is a hallmark of any CIDI/DSM-IV depressive or anxiety disorder. A lower cut-off point of ≥20 with a relatively high sensitivity of 0.90 can be used to exclude any depressive or anxiety disorder. The optimal cut-off point on the HADS total scale was ≥22 with a sensitivity of 0.76 and a specificity of 0.75. A lower cut-off point of ≥20 with a sensitivity of 0.86 can be used to exclude disorders (NPV 0.74), while a higher cut-off point of ≥25 with a specificity of 0.85 can be used to include disorders (PPV 0.85).

Sensitivity, specificity, PPV and NPV
The diagnostic performance of the 4DSQ and HADS depression scales is shown in Table 5 for any major depressive disorder and Table 6 for moderate or severe major depressive disorder. The optimal 4DSQ depression cut-off point for any major depressive disorder was ≥4 while for moderate or severe major depressive disorder this cut-off point was ≥5. Lower cut-off points with sensitivity values of at least 0.85 to exclude caseness were ≥2 for any and ≥3 for moderate or severe major depressive disorder. A higher cut-off point with a specificity value of at least 0.85 to include caseness was ≥7 for both severity levels. The optimal HADS depression cut-off point for any major depressive disorder was ≥11 while for moderate or severe major depressive disorder this cut-off point was ≥12. Lower cut-off points with sensitivity values of at least 0.85 to exclude caseness were ≥9 for any and ≥10 for moderate or severe major depressive disorder. Higher cut-off points with specificity values of at least 0.85 to include caseness were ≥14 and ≥15.
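The three kinds of cut-off point reported here can be derived from raw scores and reference diagnoses as sketched below. This is an illustrative reading rather than the procedure actually used: it assumes that 'optimal' means the smallest absolute difference between sensitivity and specificity, that the lower cut-off is the highest score still giving sensitivity ≥0.85, and that the higher cut-off is the lowest score giving specificity ≥0.85. The function names and the toy data are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple


def sens_spec(scores: Sequence[int], is_case: Sequence[bool],
              cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score >= cutoff' counts as positive."""
    tp = sum(1 for s, c in zip(scores, is_case) if c and s >= cutoff)
    fn = sum(1 for s, c in zip(scores, is_case) if c and s < cutoff)
    tn = sum(1 for s, c in zip(scores, is_case) if not c and s < cutoff)
    fp = sum(1 for s, c in zip(scores, is_case) if not c and s >= cutoff)
    # Assumes at least one case and one non-case are present.
    return tp / (tp + fn), tn / (tn + fp)


def pick_cutoffs(scores: Sequence[int], is_case: Sequence[bool],
                 max_score: int, target: float = 0.85) -> Dict[str, Optional[int]]:
    """Return the optimal, rule-out and rule-in cut-off points."""
    table = {k: sens_spec(scores, is_case, k) for k in range(1, max_score + 1)}
    optimal = min(table, key=lambda k: abs(table[k][0] - table[k][1]))
    rule_out = max((k for k, (se, _) in table.items() if se >= target), default=None)
    rule_in = min((k for k, (_, sp) in table.items() if sp >= target), default=None)
    return {"optimal": optimal, "rule_out": rule_out, "rule_in": rule_in}


if __name__ == "__main__":
    # Hypothetical scores on a 0-12 scale: six non-cases, then six cases.
    scores = [0, 1, 1, 2, 3, 4, 2, 4, 5, 6, 7, 8]
    is_case = [False] * 6 + [True] * 6
    print(pick_cutoffs(scores, is_case, max_score=12))
```

Scores below the rule-out point make a disorder unlikely, and scores at or above the rule-in point make it likely; the band in between is where follow-up is most informative.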
The diagnostic performance of the 4DSQ and HADS anxiety scales is shown in Table 7 for any anxiety disorder and Table 8 for panic disorder, agoraphobia and social phobia. The optimal 4DSQ anxiety cut-off point for any anxiety disorder was ≥6 while for panic disorder, agoraphobia and social phobia the optimal cut-off point was ≥8. Lower cut-off points with sensitivity values of at least 0.85 to exclude caseness were ≥4 for any anxiety disorder and ≥6 for panic disorder, agoraphobia and social phobia. Higher cut-off points with specificity values of at least 0.85 to include caseness were ≥9 and ≥10. The optimal HADS anxiety cut-off point for any anxiety disorder as well as for panic disorder, agoraphobia and social phobia was ≥13. Lower cut-off points with sensitivity values of at least 0.85 to exclude caseness were ≥10 for any anxiety disorder and ≥11 for panic disorder, agoraphobia and social

Main findings
We examined and compared the abilities of the 4DSQ and the HADS to detect CIDI/DSM-IV defined depressive and anxiety disorders in distressed patients in primary care. Regarding any major depressive or any anxiety disorder, we found that the 4DSQ and the HADS depression and anxiety scales yielded comparable and satisfactory results with AUC values well above 0.70. Moreover, the 4DSQ distress scale turned out to be as effective as the HADS total score in detecting any depressive or anxiety disorder, which is in agreement with the conceptualization of distress as the most basic expression of human psychological suffering. However, in distressed patients, it is specifically important to identify the more severe disorders as these patients may be in need of specific treatment. Therefore, we looked at how well the 4DSQ and the HADS detected moderate and severe major depressive disorder and panic disorder, agoraphobia and social phobia. As it turned out, the 4DSQ anxiety scale performed significantly better than the HADS anxiety scale, while the 4DSQ depression scale performed slightly better than the HADS depression scale, although that difference was not statistically significant. Given the content difference between the 4DSQ and HADS scales, and the assumption that the 4DSQ items refer to more severe symptoms than the HADS items, these findings are not particularly surprising.
Furthermore, the cut-off points that we established for the 4DSQ depression scale compared reasonably well with the recommended cut-off points of ≥3 and ≥6. However, the cut-off points that we established for the 4DSQ anxiety scale appeared to be substantially lower than the recommended cut-off points of ≥8 and ≥13, and, therefore, the cut-off points of the 4DSQ anxiety scale may need downward revision. In contrast, the cut-off points that we established for the HADS depression and anxiety scales proved to be substantially higher than the recommended cut-off points of both HADS scales. Therefore, the HADS cut-off points may need upward revision, perhaps to ≥9 and ≥14 for depression, and ≥10 and ≥16 for anxiety.

Comparison with other studies
The present 4DSQ results can be compared with the only previous 4DSQ validation study published [18] in which the performance of the depression scale was studied in distressed primary care patients. This yielded an AUC value of 0.83 and an optimal cut-off point of ≥6. In the present study we found slightly lower AUC values and lower optimal cut-off points, which is probably due to a

Limitations and strengths
Our study sample was limited by the selection of distressed general practice patients (considered to have a 'nervous breakdown'), aged 18-60 years, being employed, and being on sick leave. Therefore, some caution should be observed in generalizing our results to distressed older patients (>60 years), distressed patients without paid employment, and employed distressed patients who are not on sick leave. Moreover, even more caution should be observed in generalizing our results to patients who do not disclose their distress to their GP. Such patients, who nevertheless have a relatively high risk for depressive and anxiety disorders, are, in particular, patients with medically unexplained physical symptoms [41] and patients with chronic somatic diseases [42]. Future research should clarify whether the results of the present study can be generalized to these important patient groups.
The GPs were instructed to exclude patients with obvious (i.e. already diagnosed) depressive or anxiety disorders. Recognizing that case mix differences may impact the performance characteristics of diagnostic tests [39], let us consider what our results might have looked like if we had not excluded these patients. Patients with GP diagnosed depressive and anxiety disorders would probably have had more severe symptoms and disorders than those included in the present study sample. Therefore, they would have had a relatively high probability of fulfilling the DSM-IV criteria for caseness (according to the CIDI interview), and a smaller chance of obtaining false-negative test results than the present sub-sample of CIDI/DSM-IV cases in the study sample. Addition of such a more severe group of patients to our sample would, therefore, have resulted in increased sensitivity values as relatively more cases would have been correctly classified. In addition, because of the rise in prevalence, the positive predictive values would have increased and the negative predictive values would have decreased. Moreover, due to an increase in contrast between cases and non-cases, the diagnostic accuracy (AUC values) would probably have increased, and due to a shift in the severity spectrum, the optimal cut-off points would probably have shifted to higher values. In other words, it is unlikely that the exclusion of patients with obvious depressive and anxiety disorders in our study has resulted in an overestimation of the diagnostic accuracy of the questionnaires.
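The dependence of the predictive values on prevalence follows from Bayes' theorem. The sketch below is illustrative only: the function names are hypothetical, sensitivity and specificity are held fixed at the values reported for the HADS total score at cut-off ≥22 (0.76 and 0.75), and the higher prevalence of 0.75 is an arbitrary assumption rather than an estimate. The accompanying rise in sensitivity discussed above is not modelled.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value: P(case | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value: P(non-case | negative test)."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    sens, spec = 0.76, 0.75  # HADS total score at cut-off >=22 (this study)
    for prev in (0.61, 0.75):  # observed prevalence, then an assumed higher one
        print(f"prevalence {prev:.2f}: "
              f"PPV {ppv(sens, spec, prev):.2f}, NPV {npv(sens, spec, prev):.2f}")
```

With test characteristics unchanged, raising the prevalence raises the PPV and lowers the NPV, which is the direction of change described in the text.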
The exclusion of patients with obvious depressive and anxiety disorders can also be considered to be a strength of the study. Considering that a diagnostic instrument is meant to be used in situations where there is a diagnostic problem, diagnostic research should be performed in relevant patient populations (the 'relevant domain') [40]. With respect to detecting depressive and anxiety disorders, there is no diagnostic problem in patients in whom depressive and anxiety disorders are clearly discernible. When the patients' depressive and anxiety disorders are obvious, the application of tools to detect these disorders is superfluous. In the present study, we have selected a relevant population of distressed patients, in whom depressive and anxiety disorders often remain undetected as such [6].

Further considerations
In our study, GPs were requested to recruit 'nervous breakdown' patients on sick leave, and to exclude patients with readily diagnosable depressive and anxiety disorders. Nevertheless, when the patients were subsequently subjected to a standardized diagnostic interview (the CIDI), one third of them (34%) received a diagnosis of moderate or severe major depressive disorder and/or panic disorder, agoraphobia or social phobia. As many as 61% of the patients received any DSM-IV depressive or anxiety disorder diagnosis. Our results thus illustrate that, when patients present themselves with a story of a 'nervous breakdown', GPs tend to overlook depressive and anxiety disorders, even those that are severe enough to justify a specific diagnosis and treatment. Our concern is explicitly with the latter group, the 26% of the patients with a moderate or severe major depressive disorder and the 19% of the patients with a panic disorder, agoraphobia or social phobia (together 34% of the sample). We do not worry about the patients with mild major depressive disorders as they probably would not benefit from a specific antidepressant treatment [43,44]. We do recognize that these 'mild' patients need their distress to be recognized and managed, but they do not need to be given a label of 'major depressive disorder'. Nor do we worry about patients with 'generalized anxiety disorder', as the essence of this diagnosis is distress for a period of more than six months. In a sample of distressed patients not all patients with distress for more than six months are chronic worriers; some or perhaps most of them are just confronted with long-lasting social difficulties and stressors, and, although we recognize that these patients need their distress to be recognized and taken care of, most of them are unlikely to benefit from a diagnosis of 'anxiety disorder'.
In addition, there is one more argument in favour of refraining from screening for just any depressive or anxiety disorder (as opposed to more severe disorders only) in a sample of relatively severely distressed patients (who have gone on sick leave) in primary care. The argument is that, in this particular case, screening is not very successful in separating out patients with probabilities for caseness small enough to justify no further investigation or follow-up. We have seen that, in this sample with a prevalence of 61% for any depressive or anxiety disorder (49% for any major depressive disorder and 34% for any anxiety disorder), it is difficult to exclude caseness because of relatively low predictive values of negative scores (NPV; see Table 4). If, for some reason, one insists on diagnosing any DSM-IV depressive or anxiety disorder, one virtually has to investigate every patient in a situation where the prior probability of being a case is as high as 61%. On the other hand, if one is dealing with a sample comprising more mildly, and perhaps less obviously, distressed patients, one might be interested in detecting any DSM-IV depressive or anxiety disorder in order to find patients distressed severely enough to warrant special attention (not specific treatment). The HADS (both the depression and anxiety scales, or the total scale) can be used for that purpose, as can the 4DSQ distress scale. The distress scale measures 'normal' everyday distress in the score range 0-10. From the score of 11 upwards the distress scale measures moderate to severe distress that is associated with (occupational) stress, social dysfunctioning and sickness absence [45,46]. Our findings suggest that in the score range 11-20 there is distress with a relatively small probability of having any DSM-IV defined depressive or anxiety disorder, while in the range 21-32 there is an increasing probability for such a disorder.

Conclusion
In a sample of distressed general practice patients, the 4DSQ and the HADS are both able to detect any DSM-IV depressive or anxiety disorder. However, when one is specifically interested in detecting cases severe enough to warrant specific treatment (e.g. antidepressant pharmacotherapy or specialized cognitive behavioural therapy), the 4DSQ has some advantage over the HADS, especially for the detection of panic disorder, agoraphobia and social phobia. Future research should focus on the diagnostic accuracy in high-risk groups, in particular in patients presenting their distress to their GP, patients with medically unexplained physical symptoms and patients with chronic physical conditions. Special attention needs to be given to optimal cut-off points in these groups as these may very well differ from those determined in unselected primary care patients.