The draft systematic evidence review on the Diagnosis and Treatment of ME/CFS was published online last week. It’s a monster – 416 pages in total. I know many ME/CFS patients may not be able to read this report, so in this post I’m going to focus on three things: the purpose of the report, the lumping of multiple case definitions, and the high quality rating given to the PACE trial. If you read nothing else about this systematic review, these are the biggest takeaway messages.
The Purpose of the Systematic Review
NIH requested the review for the purposes of the P2P Workshop, and the Agency for Healthcare Research and Quality contracted with Oregon Health & Science University to perform the review for about $350,000.
The primary purpose of the review is to serve as the cornerstone of knowledge for the P2P Panel. The Panel will be made up entirely of non-ME/CFS experts. In order to give them some knowledge base for the Workshop presentations, the Panel will receive this review and a presentation by the review authors (behind closed doors). Until the Workshop itself, this review will be the Panel’s largest source of information about ME/CFS.
But that is not the only use for this report. AHRQ systematic reviews are frequently published in summary form in peer reviewed journals, as was the 2001 CFS review. The report will be available online, and will be given great credence simply because it is an AHRQ systematic review. The conclusions of this review – including the quality rating of the PACE trial – will be entrenched for years to come.
You can expect to see this review again and again and again. In the short term, this review will be the education given to the P2P Panel of non-ME/CFS experts in advance of the Workshop. But the review will also be published, cited, and relied upon by others as a definitive summary of the state of the science on diagnosing and treating ME/CFS.
Case Definition: I Told You So
When the protocol for this systematic review was published in May 2014, I warned that the review was going to lump all case definitions together, including the Oxford definition. After analyzing the review protocol and the Workshop agenda, Mary Dimmock and I wrote that the entire P2P enterprise was based on the assumption that all the case definitions described the same single disease, albeit in different ways, and that this assumption put the entire effort at risk. Some people may have hoped that a systematic review would uncover how different Oxford and Canadian Consensus Criteria patients were, and would lead to a statement to that effect.
Unfortunately, Mary and I were correct.
The systematic review considered eight case definitions, including Oxford, Fukuda, Canadian, Reeves Empirical, and the International Consensus Criteria, and treated them as describing a single patient population. They lumped all these patient cohorts together, and then tried to determine what was effective in diagnosing and treating this diverse group. The review offers no evidence to support their assumption, beyond a focus on the unifying feature of fatigue.
What I find particularly disturbing is that the review did acknowledge that maybe Oxford didn’t belong in the group:
We elected to include trials using any predefined case definition but recognize that some of the earlier criteria, in particular the Oxford (Sharpe, 1991) criteria, could include patients with 6 months of unexplained fatigue and no other features of ME/CFS. This has the potential of inappropriately including patients that would not otherwise be diagnosed with ME/CFS and may provide misleading results. (p. ES-29, emphasis added)
But then they did it anyway.
This is inexplicably bad science. How can they acknowledge that Oxford patients may not have ME/CFS and acknowledge that including them may provide misleading results, and then include them anyway? Is it just because Oxford papers claim to be about CFS and include people with medically unexplained fatigue? The systematic review authors clearly believed that this was a sufficient minimum standard for inclusion in analysis, despite the acknowledged risk that it could produce misleading results.
I will have a lot more to say on this topic and the problems in the review’s analysis. For now, the bottom line takeaway message is that the systematic review combined all the case definitions, including Oxford, and declared them to represent a single disease entity based on medically unexplained fatigue.
PACE is Ace
One of the dangers of the review’s inclusion of the Oxford definition and related studies was the risk that PACE would be highly regarded. And that is exactly what happened.
The PACE trial is one of seven treatment studies (out of a total of thirty-six) to receive the “Good” rating, which has a specific technical meaning in this context (Appendix E). In the systematic review, a randomized controlled trial is “Good” if it includes comparable groups, uses reliable and valid measurement instruments, considers important outcomes, and uses an intention-to-treat analysis. I’m certainly no expert in these issues, but I can spot a couple of problems.
First of all, the PACE trial may have used comparable groups within the study, but that internal consistency is different from whether the PACE cohort was comparable to other ME/CFS patients. The systematic review already acknowledged that the Oxford cohort may include people who do not actually have ME/CFS, and in my opinion that is the comparable group that matters.
In terms of important outcomes, the systematic review focused on patient-centered outcomes related to overall function, quality of life, ability to work, and measures of fatigue. Yet there is no discussion or acknowledgement that patient performance on a 6-minute walking test at the end of PACE showed that participants remained severely impaired. There is also no acknowledgement that a patient could enter PACE with an SF-36 score of 65, leave the trial with a score of 60, and be counted as recovered. That is because so many changes were made to the study in post-hoc analysis, including a change to the measures of recovery. Incredibly, the paper in which the PACE authors admit to those post-hoc changes is not cited in the systematic review. It is also important to point out that much of the discussion of the PACE flaws has occurred in Letters to the Editor and other types of publications, many of which were wholly excluded from the systematic review.
Again, I will have a lot more to say about how the systematic review assessed treatment trials, particularly trials like PACE. For now, the takeaway message is that the systematic review gave PACE its highest quality rating, willfully ignoring all the evidence to the contrary.
Where does this leave us, at the most basic and simple level?
- The review lumped eight case definitions together.
- The review acknowledged that the Oxford definition could include patients without ME/CFS, but forged ahead and included those patients anyway.
- The review included nine treatment studies based on the Oxford definition.
- The review rated the PACE trial and two other Oxford CBT/GET/counseling studies as “Good.”
- The review concluded that it had moderate confidence in the finding that CBT/GET are effective for ME/CFS patients, regardless of definition.
If that does not make sense to you, join the club. I do not understand how it can be scientifically acceptable to generalize treatment trial results from patients who have fatigue but not ME/CFS to patients who do have ME/CFS. Can anyone imagine generalizing treatment results from a group of patients with one disorder to patients with another disease? For example, would the results of a high cholesterol medicine trial be generalized to patients with high blood pressure? No. Even though some patients with high blood pressure may also have elevated cholesterol, we would not take the risk of generalizing results from one patient population to another.
But the systematic review’s conclusion is the predictable output of an equation that begins with treating all the case definitions as a single disease entity.
I will be submitting a detailed comment on the systematic evidence review. I encourage everyone to do the same because the report authors must publicly respond to all comments. More detailed info will be forthcoming this week on possible points to consider in commenting.
This review is going to be with us for a long time. I think it is fair and reasonable to ask the authors to address the multitude of mistakes they have made in their analysis.
Edited to add: Erica Verrillo posted a great summary of problems with the review, as well.