Spotlight on Methodology Fundamentals

The COVID-19 pandemic has stimulated a lot of discussion about the design of clinical trials. From this came our idea to highlight essential methodological concepts.

In order to become more familiar with methodological issues, the Cochrane Neurological Sciences Field decided to interview two experts in neurology and clinical methodology, both from the world of stroke, international clinical trials and Cochrane, on various topics.

The Methodology Fundamentals are aimed primarily at young people, but can also be particularly useful to health professionals who wish to reorganize their knowledge in the methodological field.

We hope that these brief and informative discussions will tempt readers to investigate the issues raised further. To this end, we will select and attach papers that help to highlight the topics at hand.

Who are the experts?
Stefano Ricci - Editor, Cochrane Stroke Group, Perugia, Italy, and
Peter Sandercock - Emeritus Professor of Medical Neurology, University of Edinburgh, UK.

What are the Methodological Questions?

1) When are observational studies enough?
2) Why are RCTs better than observational studies?
3) What kind of RCTs are needed today?
4) Why is it so difficult to plan and arrange large pragmatic trials?
5) What should regulatory authorities modify to allow this kind of trial to be carried out?
6) How can we ensure that clinical research is done where the results will eventually be applied?

Q1:  When are observational studies enough? 

Observational studies are extremely useful and indeed irreplaceable in various relevant settings:

a)   Evaluation of epidemiological characteristics of a disease (e.g. how common is the condition? what is its prognosis? What are the risk factors?).

b)   Evaluation of the external validity of a trial result.

c)   Planning a trial, by estimating the likely effect of a treatment from observational epidemiology.

To give an example: an observational study shows that an increase of “Y” mmHg in systolic blood pressure is associated with an X% increase in the risk of stroke. In a randomized trial we can use this information to evaluate whether the administration of a drug “A” that lowers the BP by “Y” mmHg reduces the risk of stroke compared to control. Is the reduction that we expect to find confirmed by the results of the trial? And if it is not confirmed, what are the reasons for this difference?

d)   To put the results of a treatment trial in clinical context, including the so-called phase 4 studies (mostly to evaluate side effects).

The role of observational studies in these specific areas of medicine should not be underestimated; however, they cannot substitute for randomized controlled trials when the aim of the study is to evaluate the effect of a new treatment on a specific outcome. This is because in modern medicine the problem is not to pick up a large treatment effect (for which no RCT is indeed needed) but a relatively moderate difference which, in terms of absolute effect, would nevertheless modify the outcome of hundreds of thousands of patients worldwide. In fact, a 3 or 4% absolute difference in death and disability in acute stroke looks like a modest result, but if that treatment were widely applied a huge number of patients would benefit from it. If our aim is to test efficacy, we cannot rely on non-randomized observational studies of therapy, because this kind of study is prone to many biases, including selection and attrition bias, which can completely distort the result. So, let’s give observational studies their merits, but not give them more room than they actually deserve.
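The arithmetic behind the "modest but worthwhile" argument can be sketched in a few lines of Python. This is only an illustration: the 3% and 4% absolute differences come from the text above, while the figure of one million treated patients and the function names are assumptions chosen for the example.

```python
import math

def number_needed_to_treat(absolute_risk_reduction: float) -> int:
    """Patients who must be treated to avoid one bad outcome (rounded up)."""
    if not 0 < absolute_risk_reduction <= 1:
        raise ValueError("absolute risk reduction must be in (0, 1]")
    # round() guards against floating-point noise before rounding up
    return math.ceil(round(1 / absolute_risk_reduction, 9))

def outcomes_avoided(patients_treated: int, absolute_risk_reduction: float) -> int:
    """Expected number of bad outcomes avoided among the treated patients."""
    return round(patients_treated * absolute_risk_reduction)

# Hypothetical figure, for illustration only: one million patients treated.
PATIENTS_TREATED = 1_000_000

for arr in (0.03, 0.04):  # the 3% and 4% absolute differences quoted in the text
    print(
        f"absolute difference {arr:.0%}: "
        f"treat {number_needed_to_treat(arr)} patients to avoid one bad outcome, "
        f"{outcomes_avoided(PATIENTS_TREATED, arr)} avoided per {PATIENTS_TREATED} treated"
    )
```

Under these assumptions, a difference that looks small for an individual patient translates into tens of thousands of avoided bad outcomes once the treatment is widely used.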

For further reading:


Q2: Why are RCTs better than observational studies?

The simplest answer to this question is: because they reduce (or should we say almost eliminate) many important sources of bias. Biases in clinical research are usually described as follows:

Selection bias is a systematic difference between patients who are selected for treatment with the new therapy and those who are not. This difference may be directly induced by a researcher who believes, for instance, that patients with a less severe disease should be treated with the new treatment, while more severe cases receive only standard care; in which case, the milder disease in the treated cases will mean they have a better outcome compared to controls, thus making the treatment appear more effective. The magic of randomization ensures that, apart from the play of chance, the two groups differ only with respect to the treatment being tested. However, even if the sequence of treatments is randomized, selection bias can occur if the researcher has prior knowledge of which treatment (active or control) the next subject to be enrolled in the study will receive. So, for example, in a trial where the treatment allocation is determined by opening the next treatment pack in the randomized sequence, and if there are small differences between the active and control packs, the researcher may discover that the next patient will be allocated the control pack and decide not to enroll that individual in the study. This can be prevented by adequate allocation concealment (e.g. use of a secure web-based treatment allocation system). Finally, a study with only a small number of patients (i.e. a very small sample size) may, by chance, leave the groups imbalanced on various variables, but this can usually be avoided by a correct and prudent sample size calculation in advance.
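The selection-bias mechanism described above can be illustrated with a small simulation. It is a sketch under stated assumptions, not a model of any real trial: the treatment is made completely inert, the risk model (20% risk of a bad outcome for the mildest patient, rising to 80% for the most severe) is invented, and the function names are hypothetical.

```python
import random

def simulate(assign, n_patients=20_000, seed=1):
    """Return (bad-outcome rate in treated, bad-outcome rate in controls).

    The treatment is inert by construction: outcome depends on severity only.
    """
    rng = random.Random(seed)
    counts = {True: [0, 0], False: [0, 0]}  # treated? -> [bad outcomes, patients]
    for _ in range(n_patients):
        severity = rng.random()                    # 0 = mild, 1 = severe
        treated = assign(severity, rng)
        bad = rng.random() < 0.2 + 0.6 * severity  # hypothetical risk model
        counts[treated][0] += bad
        counts[treated][1] += 1
    return tuple(counts[arm][0] / counts[arm][1] for arm in (True, False))

def clinician_choice(severity, rng):
    """Non-randomized: the researcher gives the new treatment to milder cases."""
    return severity < 0.5

def randomised(severity, rng):
    """Randomized: allocation ignores severity entirely."""
    return rng.random() < 0.5

for name, rule in (("clinician's choice", clinician_choice), ("randomised", randomised)):
    treated_rate, control_rate = simulate(rule)
    print(f"{name}: treated {treated_rate:.1%} vs control {control_rate:.1%} bad outcomes")
```

When the clinician chooses, the inert treatment appears strongly beneficial simply because the treated patients were milder; under random allocation the two rates are close to each other, as they should be for a treatment that does nothing.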

Performance bias arises when the two groups receive different care and ancillary treatments, so that the group receiving better background treatment has a better outcome because of this, and not because of the new treatment. In general, this bias can be avoided with treatments that are truly blinded, so that neither researcher nor subject knows whether the subject is in the active or control arm of the trial. Some interventions, such as those that test the organization of care (e.g. stroke unit care vs general medical ward), surgery or therapist intervention (e.g. physiotherapy), cannot be blinded, so the trial must include strategies to minimize the impact of performance bias between study groups. Some (new) treatments carry with them ancillary procedures (e.g. more frequent monitoring of BP and neurological status) which can be considered part of the treatment. If it is not possible to blind both the person delivering the treatment and the study participant, then it is vital, as far as possible, to prevent the person who measures the outcome of treatment from being aware of the treatment allocation for that subject.

Attrition bias is a difference in outcome due to differences in compliance with treatment or in follow-up. It has been frequently shown in randomised placebo-controlled trials that good compliance is associated with better outcomes than poor compliance with the scheduled procedure, even among patients allocated placebo. Loss of patients to follow-up is an additional source of bias, especially if it differs between treatment groups. This bias can be avoided by applying the intention-to-treat principle, that is, analysing every randomized patient in the group to which he or she was allocated, regardless of whether he or she actually took or received the assigned treatment, and by ensuring that follow-up is complete for both treatment groups.
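The difference between analysing patients by allocated group and by treatment actually taken can be shown on a small invented dataset. Everything here is hypothetical: the treatment is inert by construction, each arm has 100 patients, and good compliers do better than poor compliers in both arms (20% vs 50% bad outcomes), mirroring the placebo observation mentioned above.

```python
from collections import namedtuple

Patient = namedtuple("Patient", "assigned complied bad_outcome")

def make_arm(assigned):
    """Hypothetical arm of 100 patients; the treatment is inert by construction."""
    compliers = [Patient(assigned, True, i < 16) for i in range(80)]       # 20% bad
    non_compliers = [Patient(assigned, False, i < 10) for i in range(20)]  # 50% bad
    return compliers + non_compliers

patients = make_arm("active") + make_arm("placebo")

def bad_rate(group):
    """Proportion of patients in the group with a bad outcome."""
    return sum(p.bad_outcome for p in group) / len(group)

def intention_to_treat(patients):
    """Compare patients by the arm they were randomised to."""
    active = [p for p in patients if p.assigned == "active"]
    placebo = [p for p in patients if p.assigned == "placebo"]
    return bad_rate(active), bad_rate(placebo)

def as_treated(patients):
    """Compare by treatment actually taken: active non-compliers count as untreated."""
    took = [p for p in patients if p.assigned == "active" and p.complied]
    did_not = [p for p in patients if not (p.assigned == "active" and p.complied)]
    return bad_rate(took), bad_rate(did_not)

print("intention to treat:", intention_to_treat(patients))
print("as treated:        ", as_treated(patients))
```

In this toy dataset the intention-to-treat comparison shows identical bad-outcome rates in the two arms, which is correct for an inert treatment, whereas the as-treated comparison suggests a benefit that is produced entirely by the better prognosis of good compliers.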

Detection bias arises when the outcome is evaluated differently because the treatment the patient actually took is known. Apart from double-blind studies, this bias can be avoided (and indeed is in most recent big trials) with the so-called PROBE (Prospective Randomized Open, Blinded End-point) design, in which both patients and the doctors who care for them know the treatment arm, but a third person, who was not involved in the patient’s care, does the follow-up and the outcome evaluation.

Therefore, if the randomization procedure is correct (today, a web-based system is usually used), and the above-mentioned biases have been avoided, the trial will have good “internal validity”, meaning the results can be trusted. A clinician reading the trial report, however, should always ask this question: “Ok, they did it, but what about me?” This is the big problem of external validity, or, in other words, the extent to which the results of a study can be generalised to the “real world” population. We will come back to this in the next short note.

For further reading: 

The Magic of Randomization versus the Myth of Real-World Evidence, Rory Collins, F.R.S., Louise Bowman, M.D., F.R.C.P., Martin 

Q3: What kind of RCTs are needed today?

In stroke medicine, as well as in many other fields of therapy, treatments that offer only small absolute reductions for an individual subject in important bad outcomes (e.g. survival with significant disability) may still be very worthwhile. If the treatment is safe, inexpensive and easy to deliver, and is applied to almost all stroke patients it could avoid a significant number of bad outcomes across the whole population (aspirin, statins and blood pressure lowering are all good examples of this type of substantial population-wide benefit from modest treatment effects).

But how can we pick up these modest reductions in negative outcome (i.e. death and disability) with reasonable certainty? We need trials with both high internal and high external validity. Internal validity is defined as “The degree to which observed changes in a dependent variable can be attributed to changes in an independent variable”, or “The integrity of the experimental design and, consequently, the degree to which a study establishes the cause-and-effect relationship between the treatment and the observed outcome”. External validity is defined as “the extent to which the results of a study can be generalised to the “real world” population”. But my “real world” may well be different from yours… so we are actually talking about how appropriately a study result can be applied to non-study patients or populations. So, external validity asks the question of generalizability: to what populations, settings, and treatment variables can this effect be generalized? The answer may well be different when different clinicians in different hospitals are asked the question; the more the various clinicians agree, the higher the external validity is.

Internal validity is maximized when trials are large (adequately powered), well designed (simple efficient design, ensuring high adherence to the protocol) and achieve high completeness of data with no loss to follow-up. External validity is highest when the trial has broad entry criteria (to ensure relevance to the widest variety of people with the disease) and incorporates a range of settings (for example, in hospital-based studies, by including primary, secondary and tertiary level care hospitals) spread across different regions, thus testing the intervention in the type of settings where it will be applied in practice. To have the highest chance of eventually being implemented in routine practice, the trial should select a clear measure of outcome that is relevant to patients, clinicians and health care planners.
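To see why trials aiming at modest effects need to be large, one can sketch a standard normal-approximation sample size calculation for comparing two proportions. The formula is a textbook approximation rather than anything prescribed by the authors, and the 60% control-arm risk of death or disability, the 5% two-sided significance level and the 90% power are assumptions chosen purely for illustration.

```python
import math
from statistics import NormalDist

def patients_per_arm(control_risk, treated_risk, alpha=0.05, power=0.90):
    """Approximate patients per arm to detect a difference between two proportions.

    Uses n = (z_alpha + z_beta)^2 * [p1(1 - p1) + p2(1 - p2)] / (p1 - p2)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    variance = control_risk * (1 - control_risk) + treated_risk * (1 - treated_risk)
    delta = control_risk - treated_risk
    return math.ceil((z_alpha + z_beta) ** 2 * variance / delta ** 2)

# Hypothetical control-arm risk of death or disability: 60%.
modest = patients_per_arm(0.60, 0.56)  # 4% absolute reduction
large = patients_per_arm(0.60, 0.50)   # 10% absolute reduction
print(f"4% absolute reduction:  about {modest} patients per arm")
print(f"10% absolute reduction: about {large} patients per arm")
```

Because the required number grows with the inverse square of the difference to be detected, the modest effect needs several times as many patients as the large one under these assumptions, which is why simple, efficient designs that can recruit large numbers matter so much.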

We therefore need trials with minimal exclusion criteria, reflecting what happens in daily clinical practice (where we have to treat patients who present with the same condition at varying levels of severity and with differing clinical manifestations). The treatment under test should be feasible in everyday practice (or, if the treatment is complex, there should be a possibility of applying it to the majority of patients in the near future). In turn, this requires trial procedures that are easy to “embed” in routine clinical practice. Busy physicians have very little time to dedicate to trials, and to include large numbers of patients the extra work for the trialists should be minimal (efficient and high quality trial design).

We will return to the question of improving the design, quality and efficiency of trials in a future edition.

Further reading (if you are interested):
"Fundamentals of Clinical Trials."  Friedman, Furberg, DeMets.  Springer 2010. A clear, well written and very wise book on trials!