# Redownload the package to get the data
# remotes::install_github("lbelzile/hecedsm")
data(teller, package = "hecedsm")
library(ggplot2)
library(dplyr)
|>
teller group_by(course, nweeks) |>
summarize(merror = mean(error)) |>
ggplot(aes(x = nweeks,
y = merror,
group = course,
color = course)) +
geom_line() +
theme_classic() +
theme(legend.position = "bottom") +
labs(y = "",
subtitle = "average monthly error (USD)",
x = "number of weeks of 1-1 training")
Problem set 5
Complete this task in teams of two or three students.
Submission information: please submit on ZoneCours
- a PDF report
- your code
Write the name of the teammates on your report.
We consider a two-way analysis of variance using fictional data from Example 6.4 of Berger et al. (2018) on bank tellers: 100 tellers were offered one-on-one training with an experienced clerk for a certain number of weeks on the job, possibly in addition to formal training period. The goal of the study is to determine which combination is most efficient at reducing the monthly error in balance (in dollars). You can access these data directly from R from the hecedsm
package or download the SPSS data.
Download and answer the following questions. Write a report as if you were doing an analysis for a scientific publication.
- Produce a plot of the monthly error per teller as a function of the number of weeks of one-on-one training.
- Produce a quantile-quantile plot of the residuals and use Levene’s test to check whether the variance in each subgroup is the same. Report on these preliminary checks.
- Assess whether there an interaction between number of weeks of one-on-one training and the formal training using an hypothesis test.
- Compute pairwise contrasts between weeks of one-on-one supervision for the case of people with formal training.
- Compare the difference in errors between 8 and 4 weeks of one-to-one supervision, for people with and without formal training.
Hint: to create a pretty interaction plot in R, try the following: