GPAQ-R: development and psychometric properties of a version of the General Practice Assessment Questionnaire for use for revalidation by general practitioners in the UK

Background The General Practice Assessment Questionnaire (GPAQ) has been widely used to assess patient experience in general practice in the UK since 2004. In 2013, new regulations were introduced by the General Medical Council (GMC) requiring UK doctors to undertake periodic revalidation, which includes assessment of patient experience for individual doctors. We describe the development of a new version of GPAQ, GPAQ-R, which addresses the GMC’s requirements for revalidation as well as additional NHS requirements for surveys that GPs may need to carry out in their own practices.
Methods Questionnaires were given out by doctors or practice staff after routine consultations in line with the guidance given by the General Medical Council for surveys to be used for revalidation. Data analysis and practice reports were provided independently.
Results Data were analysed for questionnaires from 7258 patients relating to 164 GPs in 29 general practices. Levels of missing data were generally low (typically 4.5-6%). The number of returned questionnaires required to achieve reliability of 0.7 was around 35 for individual doctor communication items and 29 for a composite score based on doctor communication items. This suggests that the responses to GPAQ-R had similar reliability to the GMC’s own questionnaire, and we recommend 30 completed GPAQ-R questionnaires as sufficient for revalidation purposes. However, where an initial screen raises concern, the survey might be repeated with 50 completed questionnaires in order to increase reliability.
Conclusions GPAQ-R is a development of a well-established patient experience questionnaire used in general practice in the UK since 2004. This new version can be recommended for use in order to meet the UK General Medical Council’s requirements for surveys to be used in revalidation of doctors. It also meets the needs of GPs to ask about patient experience relating to aspects of practice care that are not specific to individual general practitioners (e.g. receptionists, telephone access), which meets other survey requirements of the National Health Service in England. Use of GPAQ-R has the potential to reduce the number of surveys that GPs need to carry out in their practices to meet the various regulatory requirements which they face.


Background
Patient experience surveys have increasingly been used to assess the quality of care in general practice. In the UK, these were first used on a wide scale as part of the Quality and Outcomes Framework, a pay-for-performance scheme introduced in 2004 [1]. At the time, doctors were given a financial incentive to carry out patient surveys, and two surveys were approved for the purpose: the General Practice Assessment Questionnaire (GPAQ) [2] and the Improving Practice Questionnaire (IPQ) [3]. The development and validation of GPAQ from an earlier version of the survey (GPAS) has been described elsewhere [4][5][6], along with research carried out using GPAQ data [7][8][9][10][11][12].
In 2008, the financial incentive to carry out patient surveys using GPAQ and IPQ was removed following the introduction of a new national survey, the General Practice Patient Survey (GPPS) [13]. However, financial incentives attached to GPPS were subsequently withdrawn, partly as a result of large random variations in the payments associated with patient experience scores [14]. In 2011, responsibility for conducting surveys was returned to practices, and practices again received payments for carrying out and acting on the results of patient surveys [15]. This time, there was no restriction on the questionnaires that practices could use, but many practices returned to using GPAQ.
In 2012 the UK General Medical Council (GMC) introduced a requirement for all doctors in the UK to undertake periodic revalidation. The supporting evidence for revalidation includes a requirement for patient experience to be assessed periodically at individual doctor level. The GMC has published its own questionnaire that can be used for revalidation [16], with associated publications on the development and validation of the survey [17][18][19]. However, because the GMC survey has been designed to be used by all doctors (GPs and hospital doctors), it does not meet the needs of GPs for surveys in their own practices, which include capturing patients' views on a wider range of aspects of care, e.g. ease of getting appointments and ability to get through on the phone. A number of other questionnaires, including GPAQ-R, have been approved by the Royal College of General Practitioners for use in revalidation [20].
The GMC has published guidance on the development of surveys that would be approved for use in revalidation [21], and we used this guidance to develop a new version of GPAQ (GPAQ-Revalidation or GPAQ-R) that would be suitable both for revalidation and for a range of other NHS purposes, such as the incentive given to GPs to carry out patient surveys and engage patients in planning improvements based on the results [22]. The aim of our approach was to reduce the number of different surveys that GPs and practices might need to use. We describe the development and psychometric properties of this new instrument.

Development of the new questionnaire
GPAQ-R was developed from the current version of GPAQ (V3 [2]) with the following steps: 1. Moving the questions relating to the doctor-patient consultation to the front of the questionnaire. These are the items necessary for revalidation. The purpose of this was so that the survey could be used with the front page alone if other items relating to wider practice organisation were not required. This also ensured that the instructions to the patient relating to the purpose of the survey were as close as possible to questions relating to the individual doctor's performance. We did not undertake specific consultation with patient or professional groups in developing this version of GPAQ. Items for inclusion in GPAQ were originally based on systematic reviews of aspects of care that are important to patients, updated by a more recent systematic review by one of the authors [24]. In addition, GPAQ questions have been modified over the past 10 years, based in part on feedback from patient and practice groups which have used the questionnaire as part of the Quality and Outcomes Framework.
The final tested version of GPAQ-R is shown in Additional file 2. Updated versions and conditions for use are available at www.gpaq.info. In general, GPAQ-R is freely available for individual practices to download and use, but commercial organisations may not use GPAQ without a license.
Practices were instructed to give questionnaires to consecutive patients attending practices in line with GMC guidance on the completion of surveys (instructions to practices are given in Additional file 1). The questionnaires were administered by practice staff but the completed questionnaires were not seen by practice staff and all analyses were carried out independently. Surveys were carried out for GPs who were partners or salaried doctors but trainees were not included. We chose to administer the questionnaires in the way in which they would be used for revalidation in line with GMC guidance and therefore we do not have details of response rates or whether any patients were deliberately excluded from the survey. Data were supplied by CMI Publishing Ltd and Intime Data, two commercial firms with licenses from the University of Cambridge to supply GPAQ services to practices.

Statistical analysis
We described the demographic profile of the patient sample, the frequency distributions of responses to the eleven core doctor-patient communication and confidence items (Q1 to Q11) that related to the elements required by the GMC for revalidation, and the rates of missing or spoilt responses to these items. Valid responses to these items were scored linearly from 0 (least favourable) to 100 (most favourable) ignoring 'Doesn't apply' and 'Don't know' responses. No attempt was made to impute missing values. We calculated the reliability of these core item scores from the intraclass correlation coefficients (ICCs) and estimated the number of patient responses needed to achieve 0.7 or 0.8 reliability for the doctor's mean score on each item.
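The scoring rule and the link between a single-response ICC and the number of responses needed can be illustrated with a short sketch. This is not the SPSS procedure used in the study: it assumes the standard Spearman-Brown relation for the reliability of a mean of n responses, and the function names, the five-option response scale and the ICC value of 0.0625 are hypothetical choices made only for the example.

```python
def score_linear(position, n_options):
    """Map a response option to a 0-100 score.

    position: 0 for the least favourable option, n_options - 1 for the most.
    Non-scored responses ('Doesn't apply', 'Don't know') are passed as None
    and stay None, so they are ignored rather than imputed.
    """
    if position is None:
        return None
    return 100.0 * position / (n_options - 1)


def responses_needed(icc, target):
    """Spearman-Brown: responses per doctor needed for the doctor's
    mean score to reach the target reliability, given a single-response ICC."""
    return (target / (1.0 - target)) * ((1.0 - icc) / icc)


# Hypothetical five-option scale: the fourth option scores 75.
print(score_linear(3, 5))

# Hypothetical ICC of 0.0625 (not a value taken from the paper).
for target in (0.7, 0.8):
    print(target, round(responses_needed(0.0625, target), 1))
```

Because the required number scales with (1 - ICC) / ICC, items on which doctors differ little relative to the spread between patients need many more responses for the same reliability.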
We conducted an exploratory factor analysis for doctors where data were complete on the eleven core items. The analysis used principal components extraction, applying a Varimax rotation with Kaiser normalisation to improve the interpretability of the solution and retaining factors with eigenvalues greater than one.
For each of the doctors with at least six patient questionnaires, 'communication' and 'confidence' scores were calculated as follows. We averaged each of the communication items (Q1 to Q8) and confidence items (Q9 to Q11) across all patients rating the doctor, provided that at least six valid patient responses for that doctor were present. Finally, we averaged the mean communication item scores and averaged the mean confidence item scores, provided, in each case, that more than half of them were present. The reliabilities of the communication and confidence scores were evaluated using Generalisability Theory [25].
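The two-stage averaging described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the data layout (one dictionary of item scores per patient, with None for a missing or non-scored response), the function names and the example data are all assumptions.

```python
from statistics import mean

MIN_VALID = 6  # at least six valid patient responses per item, as in the text

COMMUNICATION = [f"Q{i}" for i in range(1, 9)]   # Q1 to Q8
CONFIDENCE = [f"Q{i}" for i in range(9, 12)]     # Q9 to Q11


def item_means(patients, items):
    """Mean score per item across one doctor's patients, keeping only
    items that have at least MIN_VALID valid responses."""
    means = {}
    for item in items:
        valid = [p[item] for p in patients if p.get(item) is not None]
        if len(valid) >= MIN_VALID:
            means[item] = mean(valid)
    return means


def composite_score(patients, items):
    """Average of the item means, provided more than half are present."""
    means = item_means(patients, items)
    if len(means) > len(items) / 2:
        return mean(list(means.values()))
    return None


# Hypothetical data: six patients rating one doctor.
patients = [{q: 100.0 for q in COMMUNICATION + CONFIDENCE} for _ in range(6)]
patients[0]["Q9"] = None   # one patient skipped Q9, so Q9 has only five valid responses
patients[1]["Q1"] = 50.0   # one less favourable rating on Q1

print(composite_score(patients, COMMUNICATION))
print(composite_score(patients, CONFIDENCE))  # based on Q10 and Q11 only
```

In this example Q9 is dropped from the confidence score because it falls below the six-response threshold, but the score is still reported because two of the three confidence items remain.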
All analyses were conducted in SPSS version 20 except for the generalisability analysis which used G_String_IV. No attempt was made to impute missing values except in the generalisability analysis where missing item scores were replaced by the grand mean in line with accepted practice for generalisability analysis [26].

Results
Data were analysed for questionnaires from 7258 patients relating to 164 GPs in 29 practices (mean 44 responses per GP). The majority of respondents (70.8%) were under 65, with 17.1% and 12.1% of respondents 65-74 and 75 or over respectively. Most (64%) were female and 55.4% recorded that they had a long-standing health condition. 90.9% of respondents recorded their  Table 2 shows the frequency distribution of responses to the questions on communication with the doctor which were the core questions relating to the GMC criteria for revalidation. Responses were, as expected, skewed with many more positive than negative responses. Rates of missing or spoilt responses to the core questions varied between 4.5% and 6.0%, except question 11 ('Would you be completely happy to see this GP again?') where only 87.8% of patients recorded valid responses. 'Spoilt' responses included those where the patient had recorded free text instead of checking one of the boxes, but also included data entry errors.
The factor analysis (Table 3) used data from 5,569 patients with complete data on the eleven core items. A two-factor solution (Table 4) explained 66% of the total variance, relating to communication (Qs 1-8, 56% of the variance) and trust / confidence (Qs 9-11, 10% of the variance), with eigenvalues of 6.154 and 1.120 respectively. These correspond to the results of factor analysis previously reported for the GMC patient questionnaire [18]. Table 4 shows the intra-class correlations (ICCs) for the core items along with the number of patient responses per doctor needed to achieve 0.7 or 0.8 reliability for the doctor's mean score for each item. These results suggest that around 35 responses (range 30 to 43 for individual items) would be needed to give reliability of 0.7 on individual communication items.
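The reported variance percentages can be checked against the eigenvalues. The short sketch below assumes that the principal components were extracted from the correlation matrix of the eleven items, in which case each standardised item contributes one unit of variance and a component's share is its eigenvalue divided by eleven. The variable names are illustrative.

```python
N_ITEMS = 11  # core items Q1 to Q11

# Eigenvalues as reported in the text.
eigenvalues = {"communication": 6.154, "trust / confidence": 1.120}

# Share of total variance = eigenvalue / number of items
# (assumes extraction from a correlation matrix).
shares = {name: value / N_ITEMS for name, value in eigenvalues.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"total: {sum(shares.values()):.1%}")
# communication: 55.9%
# trust / confidence: 10.2%
# total: 66.1%
```

These values round to the 56%, 10% and 66% quoted above.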
148 doctors provided data allowing calculation of a mean score for both communication and confidence. The mean (SD) of the mean communication scores was 92.8 (3.90) and of the confidence scores was 98.0 (2.15). Variance component analysis of the communication score identified that 58% of variance was attributable to patients, whilst only 5% was attributable to doctors, and only 1% to items. Corresponding figures for the confidence score were 28%, 2% and 1% respectively. Reliability is thus best improved by increasing the number of patients returning the questionnaire rather than by varying the number of items in the questionnaire. The generalisability analysis showed that a generalisability coefficient (reliability) of 0.70 can be achieved for the communication score with 29 patient questionnaires per doctor, or of 0.8 with 50 questionnaires per doctor. The confidence score was less reliable: the corresponding figures were 81 and 191 questionnaires respectively.
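The way the number of questionnaires drives reliability can be illustrated with a simple decision-study calculation for the communication score. This is a sketch under stated assumptions, not a reproduction of the G_String_IV analysis: it assumes patients nested within doctors and crossed with items, treats the variance not attributed to patients, doctors or items (100 - 58 - 5 - 1 = 36%) as a residual that is averaged over both patients and the eight items, and works from the rounded percentages given in the text. The function names are hypothetical, and the confidence score is not attempted because the rounded figures do not determine it under this simple model.

```python
def g_coefficient(var_doctor, var_patient, var_residual, n_patients, n_items):
    """Generalisability coefficient for a doctor's mean score.

    Relative error variance = patient variance averaged over patients,
    plus residual variance averaged over patients and items (assumed design).
    """
    rel_error = var_patient / n_patients + var_residual / (n_patients * n_items)
    return var_doctor / (var_doctor + rel_error)


def patients_needed(var_doctor, var_patient, var_residual, n_items, target):
    """Patients per doctor needed to reach the target coefficient."""
    error_per_patient = var_patient + var_residual / n_items
    return (target / (1.0 - target)) * error_per_patient / var_doctor


# Communication score: rounded variance percentages from the text;
# the 36% residual is an assumption (the remainder after the three named sources).
DOCTOR, PATIENT, RESIDUAL, ITEMS = 5.0, 58.0, 36.0, 8

for target in (0.7, 0.8):
    n = patients_needed(DOCTOR, PATIENT, RESIDUAL, ITEMS, target)
    print(target, round(n, 1))
```

Under these assumptions the sketch gives roughly 29 and 50 patients for coefficients of 0.7 and 0.8, close to the figures reported for the communication score. It also shows why adding items helps little: the patient component, which dominates the error, is divided only by the number of patients.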

Discussion
The results suggest that GPAQ-R, a development of previous versions of the General Practice Assessment Questionnaire, is suitable for use in revalidation of doctors in the National Health Service, meeting the requirements for survey development set out by the General Medical Council and with psychometric properties similar to those of the GMC's own questionnaire [17]. We recommend that 30 completed questionnaires should be obtained to give sufficiently reliable results for scoring doctor-patient communication for individual doctors. However, where this initial screen raises concern, a survey might be repeated with 50 returned questionnaires to give greater reliability, increasing the reliability coefficient from 0.7 to 0.8. While these numbers give satisfactory levels of reliability for both the individual items and the composite scale for doctor-patient communication, they do not for the scale on trust and confidence or for two out of the three individual items on trust and confidence. In particular, for the item 'Would you be completely happy to see this GP again?', where over 99% of patients replied 'Yes', over 300 responses would be needed to achieve reliability of 0.7. Although taken from the GMC questionnaire, this item is unlikely to be discriminating as a screen for poor performance.
The strengths of this questionnaire compared to the GMC's questionnaire are that it intentionally incorporates a range of practice characteristics to be assessed and is therefore suitable for a wider range of uses within the NHS than the GMC questionnaire which focuses solely on items relevant to revalidation. However, because of this, GPAQ-R is also considerably longer than the GMC questionnaire and this could affect response rate. It is important to note that the GMC's recommended methodology (handing out questionnaires after a consultation) does not require response rates to be recorded. For GPs only wishing to use GPAQ-R for revalidation purposes, we have designed the survey so that the front page can be used on its own, which significantly shortens the questionnaire.
The relatively high non-completion rate for one item is of concern, namely the 12% of patients who did not provide valid responses to the question "Would you be completely happy to see this GP again?", although some of these were data entry errors. We do not think this is due to the wording of the question, as the phrasing of this item is virtually identical to that in the GMC's own questionnaire, where lower non-completion rates have been reported. The high non-completion rate for this item may be in part due to the proximity of space for patients to make free comments about their experience with the GP. Thirty-five of the blank items on this question had associated handwritten comments, and we have now modified the instruction on the first page to include a comment on the importance of completing all questions. Patients may also be concerned that doctors would see the response to this question. GMC guidance is that patients should return questionnaires in a sealed envelope, which may increase their confidence that their answers will remain confidential, and we are not certain that this guidance was always followed in this study. Where patients choose to give a free text comment as an alternative to ticking a box, we believe that this is likely to indicate that the patient regards this as more valuable information, and we have adopted this approach in other research which we are carrying out. We therefore recommend that free text comments should form part of the feedback that doctors receive on their performance. However, if this is done, some comments need to be anonymised before being fed back, which substantially increases the costs of processing the questionnaire data.
GPAQ-R, like the GMC questionnaire, takes an approach of asking about the quality of communication (e.g. 'How good was the doctor at ….'), sometimes called evaluation questions. This contrasts with some other surveys which focus on whether particular questions were asked (e.g. 'Did the doctor ask you about …'), sometimes called report questions, which are sometimes regarded as less subjective and easier to interpret [27]. A commonly cited cognitive model of how patients respond to questionnaire items was developed by Tourangeau [28], who suggests that completion of survey questions requires (1) comprehension of the question, (2) retrieval from memory of the relevant information, (3) use of the information to make a judgment if the question calls for one, and (4) selection and reporting of the response. Although report and evaluation approaches are sometimes contrasted, we believe that the difference between the two is modest provided very specific questions are asked, partly because 'report' items often have an evaluative component implied in their wording and in many circumstances both require a judgement to be made (stage 3 of Tourangeau's model). Items in GPAQ-R ask for the patient's evaluation of very specific aspects of care and do not include questions on general satisfaction.

Conclusions
GPAQ-R is a development of a well-established patient experience questionnaire used in general practice in the UK since 2004. This new version can be recommended for use in order to meet the UK General Medical Council's requirements for surveys to be used in revalidation of doctors. It also meets the needs of GPs to ask about patient experience relating to aspects of practice care that are not specific to individual general practitioners (e.g. receptionists, telephone access) which meet other survey requirements of the National Health Service in England. Use of GPAQ-R has the potential to reduce the number of surveys that GPs need to carry out in their practices to meet the various regulatory requirements which they face.

Ethical consent
Ethical permission was not required for these analyses.