Simulating scenarios, simulating datasets, simulating random variables.
Class notes: Permutation Tests
Baumer, Horton, and Kaplan (2021), Simulation (Chp 13) in Modern Data Science for R.
What is a test statistic?
What is a p-value?
For a two-sample comparison (treatment A vs. treatment B), why is it okay to use \(\overline{X}_A - \overline{X}_B\) as the test statistic in a permutation test, when a t-test requires the test statistic \(t^* = \frac{\overline{X}_A - \overline{X}_B}{\sqrt{s^2_A/n_A + s^2_B/n_B}}\) (that is, the difference divided by a measure of its variability)?
How do you know what to permute in order to create a null sampling distribution?
What does “exchangeability” mean (as a technical condition) when discussing permutation tests?
What is the difference between a permutation test and a randomization test? Are there times when doing a randomization test is possible?
What is power? What are type I and type II errors?
In a permutation test, sometimes there are many test statistics to choose from (which address the same hypotheses). Why wouldn’t you want to try them all and choose the one that gives you the highest level of significance?
When is it acceptable to claim that the resulting “significant” outcome is actually a causal relationship (and not just an association)?
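As a companion to the questions above on test statistics, p-values, and what to permute, here is a minimal sketch of a two-sample permutation test that uses the difference in means as the test statistic. It is written in Python purely for illustration. The function name `permutation_test`, its parameters, and the two small samples are hypothetical and not taken from the class materials. The sketch assumes that, under the null hypothesis, the group labels are exchangeable, so shuffling the pooled observations mimics re-assigning treatments at random.

```python
import random


def permutation_test(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test using the difference in means.

    Hypothetical helper for illustration only. Returns the observed
    difference and a Monte Carlo estimate of the p-value.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)

    # Observed test statistic: mean of A minus mean of B.
    observed = sum(group_a) / n_a - sum(group_b) / n_b

    # Under the null hypothesis the labels carry no information, so we
    # pool the observations and repeatedly re-deal them into two groups.
    pooled = list(group_a) + list(group_b)
    tol = 1e-12  # guards against floating-point ties
    at_least_as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(stat) >= abs(observed) - tol:
            at_least_as_extreme += 1

    # Count the observed arrangement itself so the estimate is never 0.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Made-up measurements for two treatments (illustrative data only).
treatment_a = [12.1, 14.3, 13.8, 15.2, 16.0, 14.9]
treatment_b = [10.2, 11.5, 9.8, 12.0, 10.9, 11.1]

observed_diff, p_value = permutation_test(treatment_a, treatment_b)
print("observed difference in means:", round(observed_diff, 3))
print("estimated two-sided p-value:", round(p_value, 4))
```

No standardization by a standard error is needed here because the null distribution is built from the data by shuffling, rather than looked up from a reference t distribution. Swapping in a different statistic (for example, a difference in medians) only changes the two lines that compute `observed` and `stat`.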
In-class slides for both 9/28/21 and 9/30/21.
Rossman & Chance applets:
Statistics without the agonizing pain - John Rauser at Strata + Hadoop 2014
RStudio cheatsheets – there is one on purrr!
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hardin47/m154-comp-stats, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".