05. Permutation Tests
Simulating scenarios, simulating datasets, simulating random variables.
Agenda
September 30, 2024
- Review: logic of hypothesis testing
- Logic of permutation tests
- Examples - 2 samples and beyond
October 2, 2024
- Conditions, exchangeability, random structure
- Different statistics within the permutation test
Readings
Class notes: Permutation Tests
Baumer, Horton, and Kaplan (2021), Simulation (Chapter 13) in Modern Data Science with R.
Reflection questions
What is a test statistic?
What is a p-value?
For a two-sample comparison (treatment A vs. treatment B), why is it okay to use \(\overline{X}_A - \overline{X}_B\) as the test statistic in a permutation test, while a t-test requires the test statistic \(t^* = \frac{\overline{X}_A - \overline{X}_B}{\sqrt{s^2_A/n_A + s^2_B/n_B}}\) (that is, the difference divided by a measure of variability)?
How do you know what to permute in order to create a null sampling distribution?
What does “exchangeability” mean (as a technical condition) when discussing permutation tests?
What is the difference between a permutation test and a randomization test? Are there times when doing a randomization test is possible?
What is power? What are type I and type II errors?
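As a starting point for the questions above, here is a minimal sketch of a two-sample permutation test that uses \(\overline{X}_A - \overline{X}_B\) as the test statistic. It is written in Python purely for illustration (the course materials use R). The data values and the function name `permutation_test` are hypothetical, and the sketch assumes the group labels are exchangeable under the null hypothesis.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative sketch)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)

    # Under the null hypothesis the group labels are exchangeable,
    # so we pool the responses and reshuffle which ones get label "A".
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Small tolerance guards against floating-point ties.
        if abs(stat) >= abs(observed) - 1e-12:
            at_least_as_extreme += 1

    # Count the observed arrangement itself so the p-value is never 0.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical responses for two treatments (not from the course data).
treatment_a = [12.1, 14.3, 13.8, 15.2, 14.9, 13.5]
treatment_b = [10.2, 11.8, 12.5, 11.1, 12.9, 10.7]

observed, p_value = permutation_test(treatment_a, treatment_b)
print(f"observed difference = {observed:.3f}, p-value = {p_value:.4f}")
```

The statistic is never divided by a standard error because the null sampling distribution is built from the shuffled data themselves rather than looked up from a reference t-distribution.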
Ethics considerations
In a permutation test, sometimes there are many test statistics to choose from (which address the same hypotheses). Why wouldn’t you want to try them all and choose the one that gives you the highest level of significance?
When is it acceptable to claim that the resulting “significant” outcome is actually a causal relationship (and not just an association)?
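One way to explore the first question is by simulation. The sketch below (again in Python for illustration only; the three statistics, the sample sizes, and every function name are assumptions, not part of the course materials) generates data where the null hypothesis is true, runs a permutation test with each statistic, and compares how often a single pre-chosen statistic rejects with how often the smallest of the three p-values rejects.

```python
import random
from statistics import median


def diff_means(a, b):
    return sum(a) / len(a) - sum(b) / len(b)


def diff_medians(a, b):
    return median(a) - median(b)


def diff_trimmed_means(a, b):
    # Drop the smallest and largest value in each group before averaging.
    ta, tb = sorted(a)[1:-1], sorted(b)[1:-1]
    return sum(ta) / len(ta) - sum(tb) / len(tb)


STATISTICS = [diff_means, diff_medians, diff_trimmed_means]


def perm_p_values(a, b, n_perm, rng):
    """Two-sided permutation p-value for each statistic, using the same shuffles."""
    pooled = a + b
    n_a = len(a)
    observed = [abs(f(a, b)) for f in STATISTICS]
    counts = [0] * len(STATISTICS)
    for _ in range(n_perm):
        rng.shuffle(pooled)
        pa, pb = pooled[:n_a], pooled[n_a:]
        for i, f in enumerate(STATISTICS):
            if abs(f(pa, pb)) >= observed[i] - 1e-12:
                counts[i] += 1
    return [(c + 1) / (n_perm + 1) for c in counts]


def simulate(n_sims=200, n_per_group=10, n_perm=199, alpha=0.05, seed=1):
    """Estimate rejection rates when the null hypothesis is true."""
    rng = random.Random(seed)
    reject_single = 0  # always use the difference in means
    reject_best = 0    # report whichever statistic gives the smallest p-value
    for _ in range(n_sims):
        # Both groups come from the same distribution, so every rejection is a type I error.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        p = perm_p_values(a, b, n_perm, rng)
        reject_single += p[0] <= alpha
        reject_best += min(p) <= alpha
    return reject_single / n_sims, reject_best / n_sims


rate_single, rate_best = simulate()
print(f"type I error, one pre-chosen statistic: {rate_single:.3f}")
print(f"type I error, smallest of three p-values: {rate_best:.3f}")
```

Because the smallest p-value is never larger than any individual p-value, the "pick the best" strategy rejects at least as often as the pre-chosen statistic, even though no real effect exists in the simulated data.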
Slides
In-class slides for both 9/30/24 and 10/02/24.
Additional Resources
Rossman & Chance applets
Statistics without the agonizing pain - John Rauser at Strata + Hadoop 2014
Posit cheatsheets – there is one on purrr!