I am always interested in the level of evidence of scientific medical studies—simply because it is the level of evidence that will determine my actions as a doctor, in continuing medical education, and as an academic teacher. The level of evidence of the arguments presented is also of central importance in guideline development.

The level of evidence is determined by probability theory, more precisely by the p-value, which describes the probability of obtaining results at least as extreme as those reported if the null hypothesis is true. To achieve a confirmatory level of evidence, one or more equivalent or hierarchically ordered hypotheses to be tested have to be set out a priori and in writing. Further, the required number of cases needs to be calculated, taking multiple testing into account if necessary.
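
To illustrate how multiple testing can enter the calculation of the required number of cases, the following is a minimal sketch and not a substitute for a proper biometrical plan. It assumes a two-sided comparison of means between two equally sized groups, a normal approximation, and a Bonferroni-adjusted significance level. The function name `n_per_group`, its parameters, and the example values (effect size, power, number of hypotheses) are assumptions made for illustration only.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80, n_tests=1):
    """Approximate number of cases per group (hypothetical helper).

    Assumes a two-sided, two-group comparison of means, a normal
    approximation, and a Bonferroni split of alpha across n_tests
    confirmatory hypotheses. effect_size is the standardized
    difference (difference in means divided by the standard deviation).
    """
    adjusted_alpha = alpha / n_tests  # Bonferroni: share alpha among the tests
    z_alpha = NormalDist().inv_cdf(1 - adjusted_alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Illustrative values only: medium standardized effect, one vs. three hypotheses.
single = n_per_group(0.5)
triple = n_per_group(0.5, n_tests=3)
print(single, triple)
```

Under these assumptions, planning three equivalent confirmatory hypotheses instead of one increases the required number of cases per group, which is why the hypotheses have to be fixed before the calculation is made.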

In addition to the results for the primary outcome variables, for which a confirmatory level of evidence can be achieved if they were planned as described, it is quite common for results of secondary outcome variables to be reported, which have merely hypothesis-generating power. Adjusting the p-values of the results of secondary outcome variables is possible only if the variables were fixed a priori. If the secondary outcome variables are defined only at the end of the study or even after the evaluation of the collected data, the suspicion arises (and cannot easily be dismissed) that the reported secondary outcome variables were selected from a plethora of possible variables because they suited results that had arisen by chance. Adjusting the p-values is not possible in that scenario.
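
As a sketch of what such an adjustment can look like when the secondary outcome variables were fixed a priori, the following applies the Bonferroni-Holm procedure to a set of p-values. The choice of procedure, the function name `holm_adjust`, and the three p-values are assumptions made for illustration; they are not taken from any study.

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted p-values in the original order.

    The adjustment is only meaningful if the number of tests
    (len(p_values)) was fixed before the data were evaluated.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # smallest p first
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values for three pre-specified secondary outcome variables.
raw = [0.01, 0.04, 0.03]
print(holm_adjust(raw))
```

The multiplier depends on the total number of variables tested. If the variables were selected after the data had been seen, that number is unknown, and no such calculation can be carried out.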

DOI: 10.3238/arztebl.2010.0417a

Prof. Dr. med. Dipl.-Chem. Frank Pohlandt

Guideline Officer of the GNPJ

Fünf-Bäume-Weg 138/1

89081 Ulm, Germany

frank.pohlandt@uni-ulm.de

1. Victor A, Elsäßer A, Hommel G, Blettner M: Judging a plethora of p-values: How to contend with the problem of multiple testing – Part 10 of a series on evaluation of scientific publications [Wie bewertet man die p-Wert-Flut? Hinweise zum Umgang mit dem multiplen Testen – Teil 10 der Serie zur Bewertung wissenschaftlicher Publikationen]. Dtsch Arztebl Int 2010; 107(4): 50–6.
