Study inclusion criteria
The studies to be included must fulfill the following criteria:
We will include studies recruiting unselected adult patients presenting with chest pain in office-based primary care practice. Studies are not eligible if the patients were recruited in hospital emergency departments or by paramedics, or if the patients were pre-selected by PCPs or other health professionals based on the likelihood of an underlying CHD. If patients of all age groups are included in the original study, we will exclude patients aged < 18 years from the analysis.
The diagnostic tests under evaluation will include items of the medical history and physical examination, such as symptoms, signs, age, sex, and coronary risk factors. We will not exclude studies or discard individual items because the studies defined the items differently or used different wording for the questions in the standard data set.
Target disease/reference standard
CHD is the reference condition. We will not exclude studies because of differing case definitions (e.g. ACS vs. stable CHD). Coronary angiography is regarded as the definitive reference standard for diagnosing CHD. However, since the likelihood of CHD in most patients presenting in primary care is low, this reference test is too invasive to be applied to all patients. Knottnerus and colleagues suggest that follow-up of the clinical course over an appropriate period is a good alternative; they refer to this as a delayed-type cross-sectional design. However, we will not exclude studies because of differing methods for establishing the reference diagnosis (e.g. independent reference panel; diagnosis by the GP who cared for the patient).
We will exclude studies in which the clinical findings were obtained retrospectively from medical records.
Using the synopsis of the variables, we will recode the original data sets. We will then check whether the values in the recoded data sets are plausible, realistic, consistent, and at least similar to the results published by the researchers. Additionally, the investigators of each original study will validate the results of the translation and recoding process.
Cases aged < 18 years and cases with missing values in the diagnostic outcome variable will be excluded. If results on individual index tests are available at study level but missing at individual patient level, we will treat them as missing at random and impute the missing values. Following the well-established approach of multiple imputation, we will create five imputed data sets for each original study [17, 18] and merge them into five imputed meta data sets. Translation and recoding are a potential source of between-study heterogeneity. If substantial between-study heterogeneity occurs in the analyses of single symptoms and signs, the study team will discuss the translation and matching of the respective variables.
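The five imputed data sets yield five estimates of each quantity, which are usually combined with Rubin's rules. The following is a minimal sketch of that pooling step, not the protocol's actual implementation. The function name `pool_rubin` and the example numbers are hypothetical, and the sketch assumes that the estimates are on a scale where a normal approximation is reasonable (for example logit sensitivity rather than sensitivity itself).

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets with Rubin's rules.

    estimates: point estimates, one per imputed data set
    variances: squared standard errors, one per imputed data set
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)              # average of the point estimates
    within = mean(variances)              # average within-imputation variance
    between = variance(estimates)         # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log odds ratios from five imputed data sets of one study
estimate, std_error = pool_rubin([1.0, 1.2, 0.8, 1.1, 0.9], [0.04] * 5)
print(round(estimate, 3), round(std_error, 3))
```

The between-imputation term is inflated by (1 + 1/m) to reflect that only a finite number of imputations (here five) is used.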
Primary objectives of the statistical analyses are to investigate the diagnostic accuracy of symptoms and signs and to explore possible associations between study- or patient-level characteristics and measures of diagnostic accuracy. In order to achieve these aims we will use different approaches.
In the bivariate analysis we will investigate the diagnostic accuracy of single symptoms and signs. In a first, study-specific step we will calculate sensitivity, specificity and likelihood ratios for the single index tests within the individual studies. All statistical models in this step will be formulated as logistic regression models, with test status as the response and disease status as the explanatory variable. To deal with sparsity (small numbers of cases in certain combinations of CHD status and symptoms and signs), we will apply Firth’s method to correct for small-sample bias. We will combine the results of the imputed data sets within the individual studies using recommended techniques of multiple imputation [17, 18]. We will plot the results for each clinical feature in forest plots. It is possible to extend the model to evaluate the impact of age and gender within the individual studies. However, the ultimate goal of this approach is to pool the results for a single symptom or sign across studies while accounting for individual patient characteristics (age and gender) and study characteristics. The feasibility of the bivariate random effects meta-analysis framework that Riley and colleagues recommended for this task largely depends on the number of studies. Since the studies do not provide data on all index tests, the number of studies considered in the individual analyses will vary between 2 and n, where n is the number of all included studies.
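To illustrate the study-specific step, the sketch below derives sensitivity, specificity and likelihood ratios for one binary index test from a 2x2 table, together with the coefficients of the corresponding logistic model with test status as response and disease status as explanatory variable. It is an illustration under stated assumptions: the function name and the cell counts are hypothetical, and adding 0.5 to each cell is used as a simple stand-in for Firth's correction (for a single binary covariate the two are commonly described as equivalent; this is not verified here for the more general models mentioned above).

```python
import math


def accuracy_from_2x2(tp, fp, fn, tn, add=0.5):
    """Diagnostic accuracy of one binary index test from a 2x2 table.

    tp/fn: diseased patients with a positive/negative test
    fp/tn: non-diseased patients with a positive/negative test
    add:   constant added to every cell (0.5 as small-sample correction;
           with add=0 an empty cell causes a division by zero)
    """
    tp, fp, fn, tn = (x + add for x in (tp, fp, fn, tn))
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    # Logistic model logit P(test positive | D) = b0 + b1 * D
    b0 = math.log(fp / tn)          # logit(1 - specificity)
    b1 = math.log(tp / fn) - b0     # log diagnostic odds ratio
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": sens / (1 - spec),
        "lr_neg": (1 - sens) / spec,
        "b0": b0,
        "b1": b1,
    }


# Hypothetical counts for one symptom in one study
result = accuracy_from_2x2(tp=30, fp=20, fn=10, tn=140)
print({name: round(value, 3) for name, value in result.items()})
```

The inverse logit of b0 + b1 returns the sensitivity and the inverse logit of b0 returns 1 minus the specificity, which is why a single logistic model with test status as response covers both accuracy measures.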
In the multivariate approach we will determine optimal combinations of symptoms and signs. As in the bivariate analysis, we will use a stepwise approach. In a first step, the study-specific multivariate analysis, we will determine optimal combinations of symptoms and signs within the individual studies. To achieve this aim we will use a random forest algorithm to identify the most important index tests in each study. The random forest algorithm is a powerful data-driven method for identifying important variables [21]. Next, we will fit a logistic regression model with all index tests selected in this way and their two-way interaction terms. We will perform these steps separately for each imputed data set of each original study. All index tests that are significant (α = 5%) in at least one of the study-specific imputed data sets will be included in a candidate list. We will combine the results of the imputed data sets within each individual study according to the inferential methodology of multiple imputation [17]. As a result of this step we will identify study-specific models that include only symptoms and signs that are independently associated with the presence of CHD. These study-specific analyses will provide important exploratory insights into the heterogeneity between studies. We expect differences in study design, slightly varying definitions of variables and varying population characteristics to play a role here. The results will guide the model-building process for the next step, the multivariate meta-analysis.
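The rule for building the candidate list (an index test enters if it is significant in at least one study-specific imputed data set) can be sketched as follows. The function name, the data layout (one dictionary of p-values per fitted model) and the test names are hypothetical and serve only as an illustration; the random forest selection and the model fitting themselves are not shown.

```python
def build_candidate_list(p_values_by_dataset, alpha=0.05):
    """Collect index tests that are significant in at least one data set.

    p_values_by_dataset: list of dicts mapping index test name -> p-value,
                         one dict per study-specific imputed data set
    alpha:               significance level (5% in the protocol)
    """
    candidates = set()
    for p_values in p_values_by_dataset:
        candidates.update(name for name, p in p_values.items() if p < alpha)
    return sorted(candidates)


# Hypothetical p-values from two imputed data sets
fits = [
    {"symptom_a": 0.20, "sign_b": 0.03},
    {"symptom_a": 0.04, "sign_b": 0.50, "risk_factor_c": 0.30},
]
print(build_candidate_list(fits))
```

Taking the union across data sets makes the list inclusive: a test that reaches significance only once is still carried forward to the meta-analysis.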
In this approach we will merge the study-specific imputed data sets into five imputed meta data sets. We will enter all index tests included in the candidate list into the model and fit the logistic regression model (1) to each imputed meta data set:

\[
\operatorname{logit}\bigl(\Pr(Y_{ij} = 1)\bigr) = \beta_{0i} + \sum_{k=1}^{P} \beta_{k} I_{ki} X_{kij}, \qquad i = 1, \dots, n, \quad j = 1, \dots, n_{i} \tag{1}
\]

where $Y_{ij}$ is the outcome variable for patient j in study i (0 indicates that CHD is absent, while 1 indicates that CHD is present), $X_{kij}$ represents the result of the kth index test, $\beta_{0i}$ is the study-specific intercept for the ith study, $\beta_{k}$ is the coefficient for the index test $X_{k}$, $I_{ki}$ is the study indicator for the kth index test (1 if the test result is available in individual study i, or 0 if the test result is not available in individual study i), P is the number of index tests, $n_{i}$ is the number of patients in study i, and n is the number of studies. Study-specific intercepts will account for heterogeneity across studies and for the fact that the individual studies will contribute data on different sets of index tests. However, model (1) assumes that the effect of a single index test is fixed across studies. A random effects model is possible, but more complex, and might not be feasible if the number of studies is small. The multivariate meta-analysis will result in a clinical prediction rule with optimal diagnostic accuracy characteristics. The unique linear combination of selected symptoms and signs with their estimated regression coefficients from the final model defines a “clinical chest pain score”. If the score exceeds a threshold value, the patient is labeled CHD positive; if the score is below the threshold value, the patient is labeled CHD negative. Such a prediction rule is “personalized” on the one hand and, on the other hand, applicable to individuals in a broad community covering several countries or regions, thus exceeding the validity of individual studies.
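How such a score would be applied to one patient can be sketched as follows: the score is the intercept plus the sum of coefficient times test result over the index tests that are available, and it is then compared with a threshold. All names, coefficients and the threshold below are hypothetical placeholders, not estimates from any study. Treating a score exactly equal to the threshold as negative is an assumption, since the text leaves ties undefined.

```python
def chest_pain_score(intercept, coefficients, findings):
    """Linear score: intercept plus coefficient times result for available tests.

    coefficients: dict mapping index test name -> regression coefficient
    findings:     dict mapping index test name -> 0/1 result; a missing key
                  or None means the test is not available (indicator = 0)
    """
    score = intercept
    for name, beta in coefficients.items():
        result = findings.get(name)
        if result is not None:
            score += beta * result
    return score


def is_chd_positive(score, threshold):
    """Label the patient CHD positive if the score exceeds the threshold."""
    return score > threshold


# Hypothetical coefficients and patient findings
coefficients = {"symptom_a": 1.0, "sign_b": 1.5, "risk_factor_c": 1.2}
patient = {"symptom_a": 1, "risk_factor_c": 1}  # sign_b was not recorded
score = chest_pain_score(-3.0, coefficients, patient)
print(round(score, 2), is_chd_positive(score, threshold=-1.0))
```

Because the intercept is study-specific in the fitted model, applying the score to a new patient requires choosing an intercept (or working with the score without it and calibrating the threshold); the sketch simply takes the intercept as an input.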
All statistical models considered in the multivariate approach will be formulated as logistic regression models, with CHD as the response and the symptoms and signs as explanatory variables. In order to describe the discriminatory power of the study-specific models and the meta model we will calculate the area under the receiver operating characteristic curve (AUC). In addition, we will calculate sensitivity, specificity and likelihood ratios for different thresholds of the clinical prediction rule.
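As an illustration of these measures, the sketch below computes the AUC as the proportion of diseased/non-diseased patient pairs in which the diseased patient has the higher score (ties count one half), and sensitivity and specificity at a given threshold. Function names and data are hypothetical; the sketch assumes that both outcome classes are present and that scores above the threshold are labeled positive.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score > threshold means positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s <= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted scores and CHD status (1 = present)
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(auc(scores, labels))
for threshold in (0.3, 0.5):
    print(threshold, sens_spec(scores, labels, threshold))
```

The pairwise definition is quadratic in the number of patients, which is acceptable for a sketch; rank-based formulas give the same value more efficiently.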
Since the maximum likelihood (ML) estimates of a logistic regression model do not necessarily maximize the AUC, we will compare the ML results with the alternative approach of Pepe et al., in which the estimates directly maximize the AUC. By definition, the “clinical chest pain score” based on the AUC-maximizing estimates will have improved diagnostic accuracy characteristics, but the improvement might be so small that this more computationally demanding approach is not worthwhile. We will examine both approaches, and the final choice will depend on the magnitude and the practical relevance of the improvement. Furthermore, we will apply internal validation techniques in order to validate the “clinical chest pain score” from the final model of the multivariate meta-analysis.
A secondary objective of the statistical analysis is the external validation of clinical prediction rules (CPRs) intended to support PCPs in diagnosing myocardial ischemia in patients with chest pain. We are aware of several rules developed or validated in a primary care setting and based on items of the medical history and clinical examination [8, 23–26]. These CPRs used different predictor variables and definitions of myocardial ischemia, e.g. any CHD or myocardial infarction. If a study provides data on the required predictors of one or more CPRs, we will calculate measures of overall discrimination (area under the curve) and diagnostic accuracy (sensitivity, specificity, likelihood ratios) for recommended thresholds within the individual data set. We will present the results separately for each study and CPR. If the number of studies providing data on a single CPR is sufficient, we will calculate pooled estimates of sensitivity and specificity across studies using the approach recommended by Riley et al.