class: center, middle, inverse, title-slide # Statistical modelling ## #1.a Hypothesis tests ###
Dr. Léo Belzile
HEC Montréal --- layout: true <div class="my-footer"> <span> <a href="https://lbelzile.github.io/modstat/" target="_blank"><b>Statistical Modelling</b></a> - Dr. Léo Belzile, HEC Montréal </span> </div> --- # Hypothesis testing .center[**Trial analogy**] <img src="img/c1/tests/01-intro-12_Angry_Men_scene.jpg" width="65%" style="display: block; margin: auto;" /> .figcaption[Screenshot of the courtoom drama _Twelve Angry Men_ (1957)] --- ### Recipe An hypothesis test is a binary decision rule Below are the different steps to undertake: -- 1. Define the variables of interest 2. Formulate the alternative and the null hypotheses, `\(\mathscr{H}_a\)` and `\(\mathscr{H}_0\)` 3. Choose the test statistic and compute the latter on the sample 4. Compare the numerical value with the null distribution 5. Obtain the *p*-value 6. Conclude in the setting of the problem --- ### Tech3Lab <img src="img/c1/tests/01-intro-Tech3.png" width="85%" style="display: block; margin: auto;" /> --- ### Texting while walking: a dangerous habit? <img src="img/c1/tests/01-intro-texter.jpg" width="85%" style="display: block; margin: auto;" /> --- ### Study details - 35 participants took part in the study. - Each person had to walk on a treadmill in front of a screen where obstacles were projected. - In one of the sessions, the subjects walked while talking on a cell phone, whereas in another session, they walked while texting. - The order of these sessions was determined *at random*. - Different obstacles were randomly projected during the session. - We are only interested in one kind of scenario: a cyclist riding towards the participant. --- ### Characteristics - Population: adults - Sample: 35 individuals - Variables: - Time to perceive an obstacle: quantitative - Distraction type (cellphone call or texting): nominal variable - Variable of interest: time (in seconds) that it takes for a person to notice the obstacle when walking while texting or talking on a cell phone (measure through an encephalogram) --- ### \#1. Define the variables of interest - `\(\mu_{\texttt{c}}\)` be the average reaction time (in seconds) during a call (`c`) - `\(\mu_{\texttt{t}}\)` be the average reaction time (in seconds) while texting (`t`) -- ### \#2. Formulate the null and alternative hypothesis - Hypothesis of interest: does texting increases distraction? - `\(\mathscr{H}_a: \mu_{\texttt{t}} > \mu_{\texttt{c}}\)` (one-sided) - Null hypothesis (Devil's advocate) - `\(\mathscr{H}_0: \mu_{\texttt{t}} \leq \mu_{\texttt{c}}\)` Express the hypothesis in terms of the difference of means `\begin{align*} \mathscr{H}_a: \mu_{\texttt{t}} - \mu_{\texttt{c}}>0. \end{align*}` --- ### \#3.Choose the test statistic We compare the difference of the mean reaction time - one-sample _t_-test for `\(\texttt{t}-\texttt{c}\)` (paired _t_-test) `\begin{align*} T_D=\frac{\overline{D}-\mu_0}{\mathsf{se}(\overline{D})} \end{align*}` - `\(\overline{D}\)` is the mean difference in the sample. - Under `\(\mathscr{H}_0: \mu_0=\mu_{\texttt{t}}-\mu_{\texttt{c}}=0\)`. - The standard error of `\(\overline{D}\)` is `\(\mathsf{se}(\overline{D})=S_D/\sqrt{n}\)`, where `\(S_D\)` is the standard deviation of the variables `\(D_i\)` and `\(n\)` the sample size. --- .panelset[ .panel[.panel-name[**SAS** code] ```sas proc ttest data=statmod.distraction side=u; paired t*c; run; ``` ] .panel[.panel-name[**SAS** output] .pull-left[ <img src="img/c1/tests/tests_sas_g1.png" width="100%" style="display: block; margin: auto;" /> ] .pull-right[ <img src="img/c1/tests/tests_sas_g2.png" width="100%" style="display: block; margin: auto;" /> ] ] .panel[.panel-name[**R** code] ```r data(distraction, package = "hecstatmod") with(distraction, t.test(t-c, alternative = "greater", mu = 0) ) ``` ] .panel[.panel-name[**R** output] ``` ## ## One Sample t-test ## ## data: t - c ## t = 2.91, df = 34, p-value = 0.0032 ## alternative hypothesis: true mean is greater than 0 ## 95 percent confidence interval: ## 0.13092 Inf ## sample estimates: ## mean of x ## 0.31297 ``` ] ] --- ### \#4. Compare the statistic with the null distribution The null distribution is Student-_t_ with 34 degrees of freedom, `\(\mathsf{St}_{34}\)`. We are only interested in the probability that `\(T_D > 2.91\)` under `\(\mathscr{H}_0\)`. -- ### \#5. Obtain a *p*-value The _p_-value is `\(0.0032\)`, which is smaller than `\(\alpha=5\%\)`. ### \#6. Conclude in the setting of the problem We reject `\(\mathscr{H}_0\)`, meaning that the reaction time is significantly higher when texting than talking on the cellphone while walking. The estimated mean difference is `\(0.313\)` seconds.