

The “Friends and Family Test” – lessons from the “Do you masturbate often?” question



[Image: kiosk for the Friends and Family Test]

However you choose to put the question to an expert, any reasonable statistician will spit bullets at the methodology of the "Friends and Family Test". The Friends and Family Test (FFT) is a single-question survey which asks patients whether they would recommend the NHS service they have received to friends and family who need similar treatment or care. Conceptually, it is of course a terrific idea to ask patients what they think of the NHS, but the test is susceptible to too many uncontrolled variables for the result to be particularly meaningful. Like Trip Advisor, it is vulnerable to a phenomenon called 'shilling', where fake respondents bias the sample with fake appraisals; and there are basic unresolved questions, such as how long after the clinical event you should ask the patient to 'rate' the episode. The responses to the FFT question are used to produce a score that can be aggregated to ward, site, specialty and trust level, and also to national level.

Most members of the public are only slightly interested in the geeky way in which statisticians produce the results. The scores are calculated by categorising responses as promoters, detractors or neutral. The proportion of responses that are promoters and the proportion that are detractors are calculated, and the proportion of detractors is then subtracted from the proportion of promoters to produce an overall 'net promoter' score. NHS England has not prescribed a specific method of collection, and decisions on how to collect the data have been taken locally: each trust has been able to choose a data collection method that works best for its staff and the people who use its services. The guidance suggests a range of methods that can be adopted, including tablet devices, paper-based questionnaires and SMS/text messages, amongst others. How you collect the data adds a further level of complexity to the meaningless nature of these data.
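For anyone curious about the arithmetic, the sketch below shows how such a 'net promoter' score might be computed. It is only an illustration: the mapping of FFT answer categories to promoters, detractors and neutrals used here is assumed for the purposes of the example and is not necessarily NHS England's official categorisation.

```python
# Minimal sketch of a 'net promoter'-style score, as described above.
# The category mapping below is illustrative only, not NHS England's
# official scheme.

from collections import Counter

PROMOTERS = {"extremely likely"}
DETRACTORS = {"neither likely nor unlikely", "unlikely",
              "extremely unlikely", "don't know"}
# Anything else (e.g. "likely") is treated as neutral in this sketch.

def net_promoter_score(responses):
    """Percentage of promoters minus percentage of detractors."""
    counts = Counter(r.strip().lower() for r in responses)
    total = sum(counts.values())
    if total == 0:
        return None  # no responses, so the score is undefined
    promoters = sum(n for r, n in counts.items() if r in PROMOTERS)
    detractors = sum(n for r, n in counts.items() if r in DETRACTORS)
    return 100.0 * (promoters - detractors) / total

# Example: 60 'extremely likely', 30 'likely', 10 'unlikely'
sample = ["extremely likely"] * 60 + ["likely"] * 30 + ["unlikely"] * 10
print(net_promoter_score(sample))  # (60 - 10) / 100 * 100 = 50.0
```

The same responses collected by a different mode (tablet at the bedside versus postcard at home, say) would feed the same formula but could produce a very different score, which is precisely the problem discussed below.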

There is a phenomenon in survey research called 'socially desirable responding' (SDR): a tendency of respondents to answer questions in a manner that will be viewed favourably by others. It can take the form of over-reporting 'good' behaviour or under-reporting 'bad' or undesirable behaviour. This tendency poses a serious problem for research that relies on self-reports, especially questionnaires, because the bias interferes with the interpretation of average tendencies as well as of individual differences. There might also be an age-related effect: older patients have tended to hold the NHS in greater reverence, whatever their political loyalties might be, compared with younger patients who believe in a market and/or believe they are 'entitled' to the NHS.

Topics where socially desirable responding is of special concern are self-reports of abilities, personality, sexual behaviour and drug use. When confronted with the question "How often do you masturbate?", for example, respondents may be pressured by the societal taboo against masturbation, and either under-report the frequency or avoid answering the question altogether. The mean rates of masturbation derived from self-report surveys are therefore likely to be severe underestimates. Social desirability bias tends to be highest for telephone surveys and lowest for web surveys, which makes web surveys particularly well suited to studies of sexual behaviour, illicit activities, bigotry and other threatening topics. Fundamentally, respondents do not answer questions the same way in person, on the phone, on paper or via the web: different survey modes produce different results. Robert Groves, in his 1989 book "Survey Errors and Survey Costs", argues that each survey mode puts the respondent into a different frame of mind (a mental "script"). Face-to-face surveys prompt a "guest" script: respondents are more likely to treat face-to-face interviewers graciously and hospitably, which makes them more agreeable. Phone interviews prompt a "solicitor" script: respondents are more likely to treat phone interviews the way they treat calls from telemarketers, going through the motions of answering questions in order to get the interviewer off the phone.

That is why reasonable statisticians take care when comparing the results of surveys conducted by different modes. Humans process language differently when reading, when listening to someone over the phone, and when listening to someone in the same room (where visual cues and body language kick in), so it is no surprise that these different modes lead to different behaviour by respondents. The lack of a standardised methodology in the FFT means that there are likely to be what are known as 'mode effects': the phenomenon whereby different methods of administering a survey lead to differences in the data returned. For example, we may expect to see differences in responses at a population level when comparing paper-based questionnaires with tablet devices. On a positive note, mode effects do not prevent trusts from comparing their own data over time where they have conducted the test in the same way throughout, as any biases inherent in the individual approaches remain constant over the period.

Much money has been pumped into the FFT, and one wonders whether it would ever stop another Harold Shipman or another Mid Staffs. Even trusts with excellent ratings, such as Lewisham, are in the "firing line" for being shut down. So one really is left wondering what on earth is the point of flogging this dead policy horse?
