The Oxford Problem

Today, I’m very pleased to share this guest post by Chris Heppner.

I loved Oxford when I was there as an undergraduate (1951-4)–truly a city of dreaming spires, peaceful libraries, walks in the country to a lovely old pub by a waterfall with peacocks in the gardens–a great place on a warm summer evening. Et in Arcadia Ego–a Greek pastoral on a quiet English river. But I now invite you to join me in declaring independence from a very different Oxford that appeared many years later.

In 1990 a group of doctors, Dr. P.D. White and Dr. S. Wessely prominently among them, met to create a new research definition for what was then called Myalgic Encephalomyelitis in England. They did this because there were then current several definitions, which they alleged caused “Contradictory findings… largely because research has been carried out by investigators trained in different disciplines, using different criteria to define the condition. … A number of clinical syndromes have been described…but differing sufficiently to preclude comparison of published studies.” (“A report-chronic fatigue syndrome: guidelines for research,” Journal of the Royal Society of Medicine, Vol.84, Feb.1991, p.118). The group were set upon changing this situation, and the result was this paper, henceforth called “the Oxford definition.” The undeclared strategy would seem to have been to widen the accepting mouth enough to swallow all the rival definitions–one ring to rule them all (yes, Tolkien was a Professor at Oxford while I was there).

From our perspective, however, the inclusion of studies done under the Oxford definition over the past 23 years, with its rejection of post-exertional malaise as a requirement and its permissive inclusion of depression, has confused attempts to grapple with the dimensions and nature of our disease. In fact, much has been lost– Ramsay and various other researchers had, by 1990 or so, already grasped some key aspects of our disease, which were then swamped by the imperialistic energy and funding available to the psychiatric lobby.

Let us take a close look at this beast. It demands only one “symptom,” prolonged fatigue, which is carefully defined in the Glossary that forms a key part of the document as merely a “subjective sensation,” which is EXPLICITLY not “to be confused with impairment of performance as measured by physiological or psychological testing. The physiological definition of fatigue is of a failure to sustain muscle force or power output.” There may be other “symptoms,” such as “Mood disturbance,” “Myalgia,” “Sleep disturbances,” but “fatigue” is the principal and only required “symptom.”

However, there are now multiple trials (which the followers of this definition insist on ignoring) showing very clearly that we do indeed suffer from “a failure to sustain muscle force or power output.” Our miserable performance on day 2 of a 2 day VO2Max test reveals this vividly–and painfully, as a reading of Jennifer Spotila’s blog on her test will convince. The work by Snell and Stevens has now been replicated by others such as Betsy Keller. We have multiple studies showing vividly the physiological results of even one shot exercise–Julia Newton’s work showing our inability to clear lactic acid build-up in our muscles, cytokine and gene expression studies from various sources, especially the Lights, showing a morbid response to exercise quite different from that of a healthy, or indeed depressed, person. We do not have the mere “symptom” of a “subjective sensation” of fatigue–we have the real thing, with multiple physiological “signs”–a very different word from “symptom” in the vocabulary of this definition. Just exactly what these “signs” indicate by way of etiology and treatment remains to be elucidated, but they are objective, not subjective. We have not only fatigue, but also PEM, and we have it in spades. And our fatigue can be clearly distinguished from the fatigue caused by heart failure, MS, and most other fatiguing conditions.

There is another dimension to fatigue as defined here: “Mental fatigue is a subjective sensation characterized by lack of motivation and of alertness”–and, to repeat, “The symptom of fatigue should not be confused with impairment of performance as measured by physiological or psychological testing.” Here too we now have a fair number of studies; and a very interesting talk by Dane B. Cook on an ongoing, not yet published study, showing exactly that–impairment of aspects of cognitive performance under the stress of fatigue inducing exercise, in such areas as multiple tasking etc. We do not belong inside the Oxford definition.

Another feature of the Oxford definition is the list of exclusionary conditions, which includes “proven organic brain disease.” Once again we now have multiple studies that show things like white spots similar to those in MS, brain hypoperfusion made worse by exercise, and very recently the Stanford study proving brain abnormalities by highly sophisticated imaging techniques; we do indeed have “proven organic brain disease.” On this ground too we are excluded from studies using the Oxford definition, and it is the definition itself that says so.

How do NIH and the AHRQ draft handle the Oxford Definition?

A partial answer seems to be that NIH is permissive, though that may change when results come in from the IOM and P2P projects. We still have multiple definitions, “differing sufficiently to preclude comparison of published studies,” but the NIH appears to find this a good thing. Dr. Shirley told Medscape Medical News that “Multiple case definitions for ME/CFS have been developed to meet clinical and research needs. NIH encourages researchers to use the case definition that best meets their needs for rigorous scientific exploration of the …underlying pathology…” Let a thousand flowers bloom. It seems that in their view multiple case definitions are just fine, including the definition that explicitly excludes all of us who have measurable impairment of physical or cognitive performance.

The AHRQ systematic evidence review commissioned by the NIH for their P2P workshop did the same thing, despite expressing concerns about the Oxford definition. The Executive Summary admits that “Experts consider post-exertional malaise (PEM) and memory or concentration problems critical components.” They return to the issue in the Discussion: “Multiple case definitions have been used to define ME/CFS and those that require the symptoms of PEM and neurological and autonomic manifestations appear to represent a smaller but more involved subset of the broader population.” Right: in fact, they almost stumble into an admission that ME as defined in, say, the Canadian Consensus Criteria, represents a different population than the general simply fatigued and/or depressed populations included by the Oxford.

The review goes on to admit “We elected to include trials using any predefined case definition but recognizing that some of the earlier criteria, particularly the Oxford …, could include patients with 6 months of unexplained fatigue and no other features of ME/CFS. This has the potential of inappropriately including patients that would not otherwise be diagnosed with ME/CFS and may provide misleading results.”

I find this decision to “include trials using any predefined case definition” in spite of the problems identified quite extraordinary; good evidence is provided that some definitions, and in particular the Oxford, probably include patients who do not have the disease being considered for a new definition, that such inclusion will probably skew the results–and then they go ahead and include them anyway.

And not only include them, but make interventions crafted specifically for them into the key recommendation for treatment, albeit with some reservation. The Conclusions to the Discussion section of the AHRQ draft say this: “Multiple case definitions for ME/CFS exist with those that require symptoms of PEM, neurological impairment, and autonomic dysfunction representing a more severe form of the condition….. Although CBT and GET have shown benefit in some measures of fatigue, function and global improvement…GET appears to be associated with harms in some patients….” They note a problem, but again draw back from the obvious conclusion–that GET will make most patients who genuinely have ME worse, not better, unless they wisely withdraw from the study. The high rate of withdrawal from most studies using GET, most of which are done under the Oxford definition, is noted, but again the clear implications are avoided. This was not only unwise, but the review will serve to perpetuate the erroneous application of Oxford-based studies to ME patients who may be harmed by them.

It is abundantly clear that this systematic evidence review, unless drastically revised, cannot form the basis for a helpful refinement of the diagnosis and treatment of our disease–call it what you will. Maybe the IOM and/or the P2P will produce a new definition that will please nearly all stakeholders, and allow us to forget bitter feelings aroused when the HHS rejected the request from 50 of the top researchers in the field to accept and use henceforth the CCC definition.

But I propose a Declaration of Independence from the Oxford Definition, as follows:

A Declaration of Independence:

1) Whereas the current use of multiple definitions has had the effect of drowning out the evidence that shows the serious condition of most patients who have ME according to more specific definitions;

2) And whereas the Oxford definition demands only one “symptom,” prolonged fatigue, which is carefully defined in the Glossary as a “subjective sensation” which is explicitly not “to be confused with impairment of performance as measured by physiological or psychological testing. The physiological definition of fatigue is of a failure to sustain muscle force or power output.”

3) And whereas there are now many published studies showing from various perspectives both a physiological inability to “maintain muscle force or power output” and also impairment of cognitive performance accompanied by dysfunction visible on proper scanning;

4) And whereas the list of “exclusionary conditions” lists “organic brain disease,” and evidence is accumulating of such signs as white spots, brain hypoperfusion, and brain dysfunction;

We therefore declare that the claimed results of studies done under the Oxford definition cannot be represented or interpreted as addressing patients with ME, but only as covering patients with unexplained fatigue that may legitimately be interpreted as “subjective sensations.”

We sufferers from ME further declare that the Oxford definition be interpreted as written: that it explicitly excludes us both from participation in studies done under its name and from reviews that include such studies. We request relief from the suffering that has been inflicted upon us by this careless and erroneous practice.

Oxford, besides its dreaming spires, has also been known as the “last home of lost causes,” and the tag “Et in Arcadia Ego” has been also interpreted as meaning “I, death, am to be found even in Arcadia,” as in Poussin’s painting. Oxford has been death to us—we demand to be freed from its control.

10 Responses to The Oxford Problem

  1. emma594 says:

    When I was diagnosed, we used the Canadian Criteria. Are they not used any more?

    • Jennie Spotila says:

      I was diagnosed using Fukuda! Canadian Criteria are very much in use, and preferred by many experts. Oxford is more than 20 years older than Canadian. I heard one expert describe Oxford as an embarrassment. But the P2P is incorporating Oxford studies as applicable to all ME/CFS patients.

  2. Lisa Petrison, Ph.D. says:

    My own Ph.D. studies (Kellogg School of Management, Northwestern University, 1998) and subsequent research work focused on learning and using methodologies used in psychology work. However, I do not recall ever seeing an article in a higher-tier journal with as many serious flaws as the PACE study (a study designed to eliminate objections to the previous papers from that group).

    The idea that the government would be classifying this study as “good-quality” is therefore remarkable to me. In order to make the problems easily understandable to non-academics as well as academics, I have presented them in this summary.

  3. N A Wright says:

    For those having problems with hyperlink, the original article is here:

    The name “Oxford Definition” has taken on fetishistic meaning in patient circles, far beyond any significance of the original paper; in my view this is very much to the detriment of effective advocacy. There’s a danger of back projecting the relevance of Sharpe et al (JRSoc Med 1991) simply because it was used in PACE and its associated studies of the last decade, with the failings of PACE confused for the failings of Sharpe et al 1991.

    To set, as Chris Heppner does, the context of Sharpe et al 1991 as “In 1990 a group of doctors, Dr. P.D. White and Dr. S. Wessely prominently among them” is to fall into the trap of back projecting. In 1991 Sharpe, Wessely and White, who all later became closely associated with psychological approaches to ME/CFS, were far from prominent in their field; indeed they were insignificant in the company of those attending the Oxford meeting, which included 5 professors and a departmental head. And that is to ignore the 6 other named non-attending contributors, who included two other Professors, notably Peter Behan*, whose work on PVFS and (with Chaudhuri) on ME/CFS remains important if largely unreplicated.

    It is true that the psychiatrist representation at the Oxford meeting was proportionally large, but it was still outnumbered 2:1 by other specialisms. The Oxford Definition may have been flawed but it was flawed across a range of perspectives, not as a product of deliberate psychologisation. This is important to understand because it goes to the heart of why the Oxford Definition should (as noted at the P2P meeting) be retired. Sharpe et al 1991 provides what should now be a ‘null’ position in terms of research criteria; as Jenny points out in her latest post, science is an iterative process, in which denigrating past efforts isn’t necessary; it is only required that an improved position is demonstrable.

    A final point on prominence amongst the Oxford contributors. In 1991 of the eight psychiatric contributors, the name Clare was by far the most notable. Anthony Clare, more than anyone else in the third quarter of the 20th century, through his television and radio broadcasts made psychiatry an open and accessible issue in Britain. It was Clare sometime in the mid 80s, who was one of the first broadcasters to actually cover ME seriously, identifying it as crossing medical boundaries – his ideas may not be acceptable 30 years on but if one wants to look for a motive force behind the Oxford Definition, then Clare would be a more substantial candidate than any of his then juniors.

    For any art pedants, “et in arcadia ego” comes from Guercino’s 1618 painting of that name; Poussin rather jollied up the scene in his version of 20 years later.


  4. Rivka says:

    thank you, Chris Heppner, for this good work!

  5. Ren says:

    “A long habit of not thinking a thing wrong, gives it a superficial appearance of being right, and raises at first a formidable outcry in defense of custom. But the tumult soon subsides. Time makes more converts than reason.”

  6. Mary Dimmock says:

    @N A Wright

    Based on my reading, Chris is not back projecting Wessely’s relevance, importance or views onto 1991 when Oxford was published. Based on sources from that time, we know that by at least the late 1980s, Wessely had emphasized psychological issues and promoted the idea that the symptoms were the result of inactivity.

    Regarding his importance… in 1991, NIH held a conference to discuss the issues in the case definition and make forward recommendations (which included broadening the criteria that happened in Fukuda). NIH invited participants from the Holmes, Australian and Oxford definition efforts.

    Wessely attended that meeting and as far as I can tell was the only Oxford author to do so. I can’t tell whether his presence reflected prominence or simple availability but the resultant conference report reflects a strong psychiatric focus that he undoubtedly endorsed.

    Regarding “science is iterative”
    What is remarkable to me about Oxford is that it’s so impossibly broad that it can’t help but fail as a protocol for a research study regardless of your beliefs about the causation of the disease.

    Tightly defined and controlled subject selection is a first principle in research – it’s essential for your science to make sense. And if you don’t have well defined subject selection criteria, the science won’t make sense no matter how well designed everything else is. You can’t iterate on the back of bad subject selection. All you can do is start over again.

  7. N.A.Wright says:

    @Mary Dimmock. I’ve only just come back to read replies and I guess the discussion has moved on – however:

    “Tightly defined and controlled subject selection is a first principle in research – it’s essential for your science to make sense. And if you don’t have well defined subject selection criteria, the science won’t make sense no matter how well designed everything else is. You can’t iterate on the back of bad subject selection. All you can do is start over again.”

    This may be true for RCTs but certainly does not cover all the statistical demands of medical science (and not in any way the generality of science). Overly strict criteria can mean that results can not be generalised outside of the test group, something which is fine if the criteria define something real, but in the case of symptom only classification the probability is that one will get false positives that have no meaning when replication is attempted. You simply end up with study specific results which require another layer of hypothesis to explain variation between studies – pretty much where all the supposed biomarker work has led and which shows no evidence of being relatable to symptom based selection no matter how tight that could be made.

  8. Chris says:

    Maybe I can try to mediate between N.A. Wright and Mary Dimmock? I agree with Mary Dimmock–fortunately, since she is largely defending my position–both in seeing the Oxford as impossibly broad and in that “tightly defined and controlled subject selection is a first principle in research.”

    But N.A. Wright also has a point in writing that “overly strict criteria can mean the results can not be generalized outside the test group…in the case of symptom only classification the probability is that one will get false positives that have no meaning when replication is attempted.” This may be true, but the whole point of current research is to get miles beyond the “symptom only” stage–whichever meaning you give to that key word “symptom”–we know what meaning Oxford gives, “subjective sensation.” Fatigue and PEM are not symptoms in that sense–research has shown that “fatigue” correlates with increased mitochondrial dysfunction, with rapid increase of lactic acid in muscles, which defects in clearance mechanisms prolong, with increasing cognitive difficulties, with brain hypoperfusion, etc. etc. And PEM can be demonstrated physiologically with a 2 day VO2 Max test and graphically by showing the gene expression and cytokine release triggered by exercise in people with ME, and other tests. These are physiological events made visible and measurable. It is the refusal to allow entry to evidence like this that is one of the things that makes us so hostile to the Oxford definition. Fortunately, that is now almost certainly out of the game, and attention is now shifting to the Fukuda and its CDC derivative, while we await the outcome of the IOM study.

    Clearly finding the “ideal” definition is still a work in progress, but most of the active researchers (outside the closed self-regarding circle of Oxford psychiatrists) seem to be finding the Canadian pretty good in practice.

    When we know more about the physiological basis of subgroups, when a panel of cytokine levels or a simpler, less damaging 2 day exercise test can be found, when NK function can be more easily tested, when…. (fill in your favourite candidates for biomarkers–they are coming), then smart people will be able to refine the definition process much further. Until then, they can only hunt for the target, try a shot, see what results, and hone the definition further. Which they are already doing–in some ways the International is an advance on the Canadian, but in practice Jason has shown it has unexpected and unwelcome effects, so on to the next stage. “Tightly defined and controlled subject selection” is still the target, though finality has by no means been reached.

