

Here we store material for the Data Analysis part of the CB2030, Systems Biology course.

This project is maintained by statisticalbiotechnology

Questions and answers – Hypothesis Testing

Here we store selected questions and answers from previous years' incarnations of the course.

Definition of probabilities

  1. What property of p values makes them uniformly distributed under H0?

    A1. Under H0, the observed test statistic is sampled from the same distribution that it is tested against, so its rank within that distribution is uniformly distributed. An analogy: the probability that a randomly selected Swede is among the p% tallest people in Sweden is p%.
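The uniformity can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the null distribution is taken to be a standard normal, the p value is one-sided (upper tail), and the function names `simulate_null_p_values` and `fraction_below` are made up for this example.

```python
import random
from statistics import NormalDist


def simulate_null_p_values(n_tests=20000, seed=0):
    """Draw test statistics from an assumed null N(0, 1) and convert each to a p value."""
    rng = random.Random(seed)
    null = NormalDist(mu=0.0, sigma=1.0)
    # One-sided p value: probability under H0 of a statistic at least as
    # large as the one observed.
    return [1.0 - null.cdf(rng.gauss(0.0, 1.0)) for _ in range(n_tests)]


def fraction_below(p_values, threshold):
    """Fraction of the p values that are less than or equal to threshold."""
    return sum(p <= threshold for p in p_values) / len(p_values)


p_values = simulate_null_p_values()
for threshold in (0.01, 0.05, 0.25, 0.5):
    # For uniformly distributed p values this fraction should be close to threshold.
    print(f"fraction of p values <= {threshold}: {fraction_below(p_values, threshold):.3f}")
```

Because the statistics are drawn from the very distribution they are compared against, about 5% of the p values should fall below 0.05, about 25% below 0.25, and so on, which is what a uniform distribution on [0, 1] looks like.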

Definition of p value

Significance levels

One- vs two-sided tests

Q: When to use one-tailed p value and when two-tailed p value?

A: Use a two-tailed p value when a deviation in either direction is of interest. For instance, if you are interested in differential expression between two samples, you usually only want to know whether a gene is expressed at different levels, regardless of whether it is over-expressed or under-expressed. In this example the significant “over-expression outcomes” (i.e. when the average difference is positive) fall in the extreme right tail of the null distribution, while the significant “under-expression outcomes” (i.e. when the average difference is negative) fall in the extreme left tail. Since you are interested in both types of outcome, you consider them jointly by using the two-sided p value. If you, for some reason, are only interested in testing for either under-expression or over-expression, you are free to use the more sensitive one-sided p value, provided that you choose the direction before looking at the data. Wikipedia has an entry on the topic.
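The difference in sensitivity can be made concrete with a few lines of code. This is a minimal sketch, assuming the test statistic is a z-score that follows a standard normal distribution under H0. The value `z = 1.8` and the function names are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # assumed null distribution of the z statistic


def p_greater(z):
    """One-sided p value for the alternative 'over-expressed' (right tail)."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


def p_less(z):
    """One-sided p value for the alternative 'under-expressed' (left tail)."""
    return STANDARD_NORMAL.cdf(z)


def p_two_sided(z):
    """Two-sided p value: both tails at least as extreme as |z|."""
    return 2.0 * (1.0 - STANDARD_NORMAL.cdf(abs(z)))


z = 1.8  # hypothetical standardized difference in expression
print(f"one-sided (over-expression):  {p_greater(z):.4f}")
print(f"one-sided (under-expression): {p_less(z):.4f}")
print(f"two-sided:                    {p_two_sided(z):.4f}")
```

For a positive statistic, the one-sided p value for over-expression is half the two-sided one, which is why the one-sided test is more sensitive in the chosen direction. The price is that the same test has essentially no power to detect an effect in the opposite direction.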

p value hacking

Null hypothesis (H0)