There has been much buzz in the CFS community over the latest CBT (cognitive behavioral therapy) study claiming astounding recovery rates in CFS. This time, the study is called FITNET, a test of internet-based CBT for adolescents with CFS in the Netherlands. I want to focus on two issues with the study: experiment design and results interpretation. For a detailed review of what the paper says, you might want to check out the summary on Research 1st. There has also been a great deal of discussion in places like the Phoenix Rising forum, and I found a lot of helpful information there. Other advocates are much more knowledgeable about the issues related to CBT for CFS than I am, and I hope we will see critical examinations of the FITNET study from them as well.
FITNET compared two groups of adolescent CFS patients in the Netherlands. One group received “usual care,” consisting of face-to-face CBT, graded exercise therapy and/or other treatments. The other group received internet-based CBT for six months. School attendance, fatigue severity, and functioning were measured at study outset, 6 months and 12 months. The authors claim that 63% of the FITNET group recovered, compared to only 8% of the usual care group (Table 3). But a close examination of the paper shows that the results are not so straightforward.
Study design is arguably the most important part of any experiment. If the experiment design is flawed, then even perfect execution of that design will not give reliable results. In my reading of FITNET, I found many flaws in the design of the study.
First, FITNET was intended to test the effectiveness of internet-based CBT. But this is not as simple as putting some questionnaires on a secure website. Software design is a very complex field, and evidence is mounting that human beings process visual information on a computer in particular ways. Best practices in information design are emerging, but when the software in question is intended to deliver cognitive behavioral therapy, there are many layers of complexity to ensuring that the therapy is delivered effectively and as intended. The FITNET authors report in their study protocol that the textual content was revised by two children’s authors “on readability for adolescents.” That was a good step to take, but there is no indication that any other work was done on the design besides the “cooperation with adolescents with CFS who critically appraised text, lay-out and structure.” From an information design perspective, these steps are woefully inadequate. The paper provided a link to the FITNET module, but I could not access it to evaluate the language and presentation.
Second, the paper reports almost nothing about how completion of the FITNET modules was measured. Progress was monitored in terms of whether the modules were “completed.” But what counted as completion? Were page views or time per page tracked? Were there essays or other assignments submitted to show the participant had understood and acted on each module? Or was there a questionnaire with check boxes that a participant could breeze through without really processing the information? Was any effort made to verify or confirm the answers reported by participants? How was comprehension of the modules assessed? In face-to-face therapy, the therapist can use body language and facial expressions to help assess whether the patient “gets it.” How did this internet model ensure that participants were internalizing the lessons? In addition, the paper states that “parents followed a parallel program” but there is no data reported about what those modules consisted of, what advice the parents were given, and how their participation was monitored (if at all). Without any of this data, we cannot assess the quality of participation in FITNET.
Third, FITNET did not seem to make adequate allowances for the fact that in some people, CFS tends to have a relapsing-remitting pattern. Data on school attendance and completion of three symptom inventories was collected at outset, 6 months and 12 months. The school attendance data was based on the two weeks previous to the measurement point. There is no data on school attendance and fatigue severity during the intervening months. This is a weakness in the study because if a participant was having a particularly good or bad two-week period, the data would capture that without capturing a more normal or average pattern over a longer period of time.
Fourth, and most importantly, the long list of variables between the FITNET and control groups makes it very difficult to discern what might account for the different results. The FITNET group had strong parental involvement. Their CBT was overseen by cognitive behavioral psychotherapists. FITNET participants did not receive graded exercise therapy or any other treatment for fatigue. The goal of FITNET was return to full-time school, and a back to school plan was discussed early in the process. Finally, the FITNET participants received their CBT at home on their own schedule, a huge energy saving measure.
In contrast, the control group received “usual care” which is not defined with much detail in the paper. Usual care consisted of one or more of the following: group or individual rehabilitation programs, face-to-face CBT, graded exercise therapy, “alternative treatment” or nothing at all. More than half of the usual care group was receiving more than one of those treatments, and 10% received none. There is no information available on how long those therapies were delivered, whether psychotherapists were involved, whether parents were involved, whether full-time school attendance was presented as a goal, or what other treatments may have been delivered.
In treatment studies, particularly psychological treatment studies, it is very difficult to control for all variables. But to me, this seems like a long and significant list of variables that were not controlled. This study has been portrayed as a test of internet-delivered CBT. If the control group had received the same CBT face to face, then I think the results would tell us what benefit (if any) was derived from the internet delivery. FITNET doesn’t tell us this, and in fact, the authors acknowledged that in the study protocol published in 2011. We can’t determine the cause and effect for the FITNET results. Was it the flexibility of receiving internet therapy at home? Was it the support and involvement of parents? Was it the high number of contacts between therapist and participant? Or the focus on the goal of return to school? Or was it a result of the FITNET group not having to participate in GET? There is just no way to know.
We have to pay close attention to the design of CFS studies, especially studies testing psychological therapies. There are more of these studies in the pipeline, including the very expensive NIH-funded study by Dr. Fred Friedberg. Sloppy design will play out in the study results and, as we have seen with FITNET, the results are trumpeted far and wide.
In addition to the multiple differences between the FITNET and usual care groups, there are a number of other questions about how to interpret the reported results.
First, the study protocol published in 2011 promised collection of certain data that was not reported in the final paper. Specifically, the protocol stated that activity patterns would be assessed with an actometer in order to separate the FITNET group into relatively active and passive groups. CBT modules would then be customized based on the pattern of the patient. None of this data was reported or referenced in the paper. Why? In addition, actometer data collected during treatment or at its conclusion would have been particularly helpful, but there is no indication that this data was collected. Were the FITNET participants able to become more active over the course of the study? The study focused on school attendance as the primary outcome, but anyone familiar with CFS knows that one activity can be emphasized to the detriment of others. Were the FITNET participants going to school but then bedridden the rest of the time? Was there any change in their socialization or physical activity? One study of graded exercise that originally reported increased activity by CFS patients later revised that conclusion based on further data analysis. They found that the patients did sustain more walking, but cut back on other activities, resulting in no increase in activity overall. We don’t know if this happened in FITNET or not, but I think it is fair to ask whether return to school actually represented any overall increase in physical capacity.
Second, I am confused by the study’s definition of “recovery.” The authors state that recovery was defined post hoc. This means that recovery was not defined in the study protocol, but only after all the data was collected. I do not know how typical this is in behavioral studies, but it seems to me to be a practice fraught with bias. Looking at your data clusters and then defining a subset as recovered seems to be the exact opposite of how it should be done. Recovery, defined by objective measures as in this paper, should be established in advance. And in a sense, the authors did this by setting return to school as the primary outcome of the study. But as I mentioned above, return to school does not actually equate to recovery. The participants could have been in school at the cost of other parts of their lives, and there is the further issue that attendance in school says nothing about performance in school. Can we define a previously A student as recovered if she attends school full time but gets Cs and Ds?
Third, another aspect of the recovery definition is troubling to me. Recovery is defined as having a fatigue severity score of less than 40, physical functioning score of 85% or more, and school absence of 10% or less in the past two weeks. But in order to be eligible for the study, patients were considered to have severe fatigue if they had a fatigue severity score of 40 or more, physical functioning score of 85% or less, or a school attendance of 85% or less. This puts a single bright line down the fatigue severity and physical functioning scores: fatigue severity of 40 or more = severe fatigue, but less than 40 = recovered; a physical functioning score of 85% or less = severe fatigue, but 85% or more = recovery. I don’t think you have to be familiar with CFS to recognize that people are not severely fatigued or recovered with no gray area in between. The school attendance requirement does allow for a little wiggle room. Missing 1 day of school or less in the previous two weeks = recovered; missing 1.5 days or more in the previous two weeks = severe fatigue. No correction appears to have been made for missing school for other reasons, such as other illness or family obligations.
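To make the bright line concrete, here is a minimal Python sketch of the two sets of thresholds exactly as I have described them in this post. The function names and the assumption of 10 school days in a two-week period are mine, not the paper's, the example scores are invented, and the self-rated improvement item is left out.

```python
def absence_pct(days_missed, school_days=10):
    """Percent of school missed; 10 school days per two weeks is an assumption."""
    return 100.0 * days_missed / school_days


def eligible_as_severe(fatigue, physical_function, days_missed):
    """Eligibility as described in this post: any one criterion is enough."""
    attendance = 100.0 - absence_pct(days_missed)
    return fatigue >= 40 or physical_function <= 85 or attendance <= 85


def meets_recovery_cutoffs(fatigue, physical_function, days_missed):
    """Recovery cutoffs as described in this post: all three must hold."""
    return (fatigue < 40
            and physical_function >= 85
            and absence_pct(days_missed) <= 10)


# Hypothetical (fatigue, physical function, days missed) scores,
# invented for illustration only.
one_point_apart = [(40, 90, 0), (39, 90, 0)]
on_the_line = (39, 85, 1)

for scores in one_point_apart + [on_the_line]:
    print(scores, eligible_as_severe(*scores), meets_recovery_cutoffs(*scores))
```

In this sketch, a single point on the fatigue score moves a participant from "severely fatigued" to meeting the recovery cutoffs. A participant with a physical functioning score of exactly 85 would satisfy both definitions at once, if the thresholds are read literally.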
Fourth, the authors chose to use a cutoff of 2 standard deviations in analyzing their results. What is standard deviation? Basically, standard deviation represents how much variation exists from the average result. If you picture a typical bell curve, the middle of the curve is the average result. As you move away from the center in either direction, the results on the curve are more and more different from the average. The curve can be “sliced” into standard deviations using a mathematical formula. The lower the standard deviation, the less variability there is among individual results. A higher standard deviation means there is more variation among individual results. Defining recovery based on a cutoff of 2 standard deviations means that the outcomes for a “recovered” patient can be further from the average outcome than they could be with a cutoff of 1 standard deviation. This made a huge difference in the results of this study, as can be seen in the appendix to the paper.
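For readers who like to see the arithmetic, here is a small Python sketch of how a cutoff placed 1 or 2 standard deviations from an average works. It assumes a scale where higher scores are worse and that the cutoff is measured from a reference average, which is my reading of the approach. The scores are invented and are not data from the paper.

```python
from statistics import mean, pstdev

# Hypothetical reference scores on a scale where higher means worse.
# Invented for illustration; not data from the paper.
reference_scores = [10, 20, 10, 20]

average = mean(reference_scores)   # 15
sd = pstdev(reference_scores)      # 5.0


def cutoff(k):
    """Score lying k standard deviations above the reference average."""
    return average + k * sd


def within_cutoff(score, k):
    """True if a score falls below the k-standard-deviation cutoff."""
    return score < cutoff(k)


score = 23  # hypothetical participant
print(cutoff(1), cutoff(2))
print(within_cutoff(score, 1), within_cutoff(score, 2))
```

With these made-up numbers, the hypothetical participant falls outside the 1 standard deviation cutoff but inside the 2 standard deviation cutoff. The wider cutoff lets in scores that sit further from the average.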
The results reported in the main paper were calculated with a standard deviation of 2. In order to be counted as recovered, participants needed to have school absence in the previous two weeks of 10% or less (1 day), fatigue score of less than 40, physical function score of 85% or greater, and report either “I have completely recovered” or “I feel much better but still experience some symptoms.” With these parameters, 63% of the FITNET group was recovered compared to 8% of the usual care group. (Online appendix Table 3a)
When the authors calculated the results with a standard deviation of 1, the numbers were very different. In order to be counted as recovered in this calculation, participants needed to have school absence in the previous two weeks of 6% or less (~ half a day), fatigue score of less than 35, physical function score of 90% or greater, and report “I have completely recovered.” With these stricter parameters, only 36% of the FITNET group was recovered compared to 5% of the usual care group (Online appendix Table 3b).
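The two definitions can be laid side by side in a short Python sketch. The thresholds are the ones summarized in this post. The class and function names are my own, and the four participants are invented purely to show how the same people can be counted differently under each definition.

```python
from dataclasses import dataclass

COMPLETELY = "I have completely recovered"
MUCH_BETTER = "I feel much better but still experience some symptoms"


@dataclass
class Definition:
    max_absence_pct: float
    fatigue_below: float
    min_physical_function: float
    accepted_ratings: frozenset


@dataclass
class Participant:
    absence_pct: float
    fatigue: float
    physical_function: float
    self_rating: str


# Thresholds as summarized in this post for the two appendix tables.
TWO_SD = Definition(10, 40, 85, frozenset({COMPLETELY, MUCH_BETTER}))
ONE_SD = Definition(6, 35, 90, frozenset({COMPLETELY}))


def is_recovered(p, d):
    """All four criteria of a definition must hold."""
    return (p.absence_pct <= d.max_absence_pct
            and p.fatigue < d.fatigue_below
            and p.physical_function >= d.min_physical_function
            and p.self_rating in d.accepted_ratings)


# Hypothetical participants, invented for illustration only.
participants = [
    Participant(0, 30, 95, COMPLETELY),
    Participant(10, 38, 85, MUCH_BETTER),
    Participant(5, 33, 92, MUCH_BETTER),
    Participant(20, 45, 70, MUCH_BETTER),
]

for name, d in (("2 SD", TWO_SD), ("1 SD", ONE_SD)):
    count = sum(is_recovered(p, d) for p in participants)
    print(name, count, "of", len(participants))
```

In this made-up group, the third participant meets every numeric threshold of the stricter definition but is not counted as recovered under it, because they report still having symptoms.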
Examining this data, it is very hard for me to understand why the authors chose to report the more dramatic results from Table 3a, except for the fact that the results were so dramatic. The recovery rate of 63% sounds so impressive, and supports the authors’ conclusion that “Cognitive behavioral therapy for adolescents with chronic fatigue syndrome can now be broadly made available as FITNET.” Any treatment that can cure 2/3 of CFS patients should be widely adopted, right? The 36% recovery rate under the tighter definition just doesn’t make as impressive a marketing statement. But that tighter definition is much more reasonable. It allows for a small gray area between the measurements that indicate severe fatigue vs. recovery, as I discussed above. It also applies common sense in requiring the participant to say “I have completely recovered” in order to count as recovered.
What has been lost in the discussion of FITNET is that the 36% recovery for FITNET users vs. only 5% for usual care is a significant result. The real question is WHY? Why did FITNET users do so much better than usual care? Now we’ve come full circle to my point about all the variables between the FITNET and control groups. There are many possible explanations, including parental involvement, focus on return to school, therapy delivery at home, and differences in the use of graded exercise therapy: the FITNET group did not have GET, but 49% of the usual care group did. Because of the study design, and the failure to control for all these variables, we cannot accurately identify why the FITNET group responded so well. And because we cannot identify why, we cannot draw any reliable conclusions about internet delivery of CBT and the likelihood of its success.