Reliability of DSM and empirically derived prototype diagnosis for mood, anxiety and personality disorders
Introduction
The need to classify psychiatric disorders has led to the development of numerous classification systems, of which the most prominent for clinical and research purposes are DSM-5 and ICD-10. In both frameworks, a psychiatric assessment requires evaluating each of several dozen symptoms for presence or absence and then applying idiosyncratic rules for combining them to determine a categorical diagnosis. Although these systems have shown sufficient reliability in research protocols, they have not done so in clinical practice, where cumbersome lists of symptoms with complex coding algorithms that vary by disorder have proven largely untenable, resulting in low interrater reliability [1, 2].
Related to these problems is the evidence that psychiatric disorders, mainly personality disorders (PDs), are distributed continuously rather than categorically in nature, suggesting the importance of considering dimensional approaches to diagnosis [3, 4, 5]. A pressing question, however, is how to implement dimensional scoring. One possibility, which has become the norm in PD research, is to sum the number of diagnostic criteria met for each disorder. The advantage of dimensionalizing current criteria is continuity with the current diagnostic approach. The disadvantage is that clinicians already find DSM diagnosis cumbersome [6, 7]. Clinicians rarely use the DSM diagnostic system in the way it was intended and often fall short of collecting sufficient diagnostic information, resulting in diagnostic bias [6, 7, 8].
An alternative approach to the classification of psychiatric disorders is prototype matching. This method takes into account the cognitive processing parameters of clinicians and naturally fits the way humans categorize [9, 10]. Furthermore, the system offers the advantages of dimensional scoring (patients are rated on a continuum assessing the extent to which they match the disorder rather than on a yes/no binary symptom checklist), while maintaining the advantages of standard categorical diagnosis (e.g., a patient is categorized as either having a disorder or not). To date, few studies have examined the reliability of prototype matching diagnostic systems. Studies so far have largely focused on the assessment of PDs and did not include a comparison to a gold standard diagnosis (e.g., Structured Diagnostic Interview). For example, a study by Westen et al. [11] reported high interrater reliability (median r = 0.72) for personality prototype diagnoses made by clinicians and clinically trained independent observers in a sample of adult patients.
Yet the prototype matching approach may carry significant shortcomings. The absence of clear guidelines or rules as to what features need to be considered and how they should be integrated may result in clinicians focusing on different components of the narrative. This could open the door to clinicians' subjective judgments and cognitive heuristics, which are likely to be biased when clinicians match what they know about a patient in an unstructured, idiosyncratic manner [12, 13, 14, 15]. Importantly, more rigorous empirical research is needed to examine the reliability of the prototype matching approach to diagnosing mood, anxiety and personality disorders.
In the current study, we examined the reliability of assessing mood, anxiety and personality disorders using a multi-method, multi-informant approach. Specifically, treating clinicians completed three measures of diagnostic assessment (DSM-IV symptoms, prototypes based on DSM and empirically derived prototypes) based on their practice as usual, while trained independent clinicians completed the same assessment measures following administration of the Structured Clinical Interview for DSM-IV Axis I and Axis II Disorders (SCID). We hypothesized that dimensional ratings, whether based on the DSM or empirically derived, would result in better interrater reliability than DSM categorical diagnosis or symptom count. Since this is the first study to assess interrater reliability of mood, anxiety and personality disorders using both categorical diagnosis and dimensional scoring methods across multiple independent raters, we cannot make specific predictions regarding the comparison of the two prototype dimensional ratings.
Setting and sample
The study was conducted in eight community mental health clinics in Israel. All participating clinics offer mental health services to an ethnically and socio-economically diverse adult patient population. A convenience sample of clinicians (N = 80) and patients (N = 170) participated in the study. To maximize generalizability, we imposed minimal exclusion criteria for patient participation (i.e., only actively suicidal and psychotic patients were excluded). The patients were adult males (n = 69, 40.6%) and females
Results
The degree of agreement between raters (treating clinician and independent interviewer) was assessed using two approaches (following Stolarova et al. [24]): (1) To assess the accuracy of the rating process, we calculated interrater reliability for treating clinician and independent interviewer using the intraclass correlation coefficient (ICC) for continuous ratings (i.e., DSM symptom count, DSM prototype and empirically derived prototype ratings), and using Cohen's kappa for dichotomous ratings (i.e., categorical DSM,
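The two agreement statistics named above can be sketched in a few lines of Python. This is an illustration only, not the authors' analysis code: the text does not state which ICC form was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater). The function names and the sample ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels."""
    n = len(rater_a)
    # Observed proportion of patients on whom the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined (division by zero) if chance agreement is exactly 1.
    return (observed - expected) / (1 - expected)


def icc_2_1(ratings):
    """ICC(2,1), assumed form: two-way random effects, absolute agreement.

    `ratings` is a list of rows: one row per patient, one column per rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares: patients, raters, residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical dichotomous diagnoses (1 = disorder present).
    clinician = [1, 1, 0, 0, 1, 0, 1, 0]
    interviewer = [1, 0, 0, 0, 1, 0, 1, 1]
    print(cohens_kappa(clinician, interviewer))  # 0.5

    # Hypothetical dimensional ratings: second rater is always one point higher.
    dimensional = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
    print(icc_2_1(dimensional))  # about 0.83
```

In the second example the two raters rank patients identically, yet the ICC is below 1 because the absolute-agreement form penalizes the systematic one-point offset between raters. A consistency-type ICC would not, which is why the choice of form matters when comparing reported values.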
Discussion
We assessed the interrater reliability for mood, anxiety and personality disorders based on prototype diagnostic systems (based on DSM and empirically derived) and on the DSM-IV categorical diagnostic system, using a multi-method, multi-informant study design in practice-as-usual settings. Overall, dimensional scoring, based on either DSM symptom count or prototype matching approaches, for both Axis I and Axis II disorders seems to result in better interrater reliability compared to categorical
Conclusions
Prototype matching diagnostic systems, whether based on the DSM or empirically derived, yield similar interrater reliability for mood, anxiety and personality disorders. Our study provides further support for using dimensional psychiatric scoring over categorical diagnosis. By using a dimensional approach that also preserves the advantages of the categorical system (presence/absence of disorder), prototype matching offers richer and more complex data on symptoms and diagnosis.
Financial support
This study was supported by the United States-Israel Binational Science Foundation (# 2011163 to Nakash and Westen).
Role of sponsor
The sponsor had no role in the study design or conduct of the study; in the collection, analysis, and interpretation of the data; or in the preparation or approval of the manuscript.
Conflict of interest
None reported.
Acknowledgment
The authors gratefully acknowledge Dr. Daphne Bentov-Gofrit, Dr. Evelyn Stiener, Dr. Revital Amiaz, Dr. Shaol Lev Ran, Dr. Eli Danilovich, Dr. Ido Lurie, Dr. Geva Shenkman, Prof. Shlomo Fening, Dr. Keren Neeman, Dr. Eti Berant, and Ms. Ruth Riesel for their support during data collection, as well as all participating patients and therapists.
References (43)
- et al. Classification, assessment, prevalence, and effect of personality disorder. Lancet (2015)
- et al. DSM-5 field trials in the United States and Canada, part II: test-retest reliability of selected categorical diagnoses. Am J Psychiatry (2013)
- et al. DSM-5: how reliable is reliable enough? Am J Psychiatry (2012)
- et al. Categories versus dimensions in personality and psychopathology: a quantitative review of taxometric research. Psychol Med (2012)
- et al. Clinical utility of 5-dimensional systems for personality diagnosis: a “consumer preference” study. J Nerv Ment Dis (2008)
- et al. How missing information in diagnosis can lead to disparities in the clinical encounter. J Public Health Manag Pract (2008)
- et al. Assessment of diagnostic information and quality of working alliance with clients diagnosed with personality disorders during the mental health intake. J Ment Health (2017)
- et al. The clinical utility of DSM categorical diagnostic system during the mental health intake. J Clin Psychiatry (2015)
- et al. Multidimensional rule, unidimensional rule, and similarity strategies in categorization: event-related brain potential correlates. J Exp Psychol Learn Mem Cogn (2004)
- Concepts and conceptual structure. Am Psychol (1989)