(Adapted from ISRS 3.17) Some people claim that they can tell the difference between a diet soda and a regular soda in the first sip. A researcher wanting to test this claim randomly sampled 80 such people. He then filled 80 plain white cups with soda, half diet and half regular through random assignment, and asked each person to take one sip from their cup and identify the soda as diet or regular. 53 participants correctly identified the soda.
\[H_0: p=0.5\] \[H_A: p \ne 0.5\] where \(p\) is the proportion of all people who can correctly tell the difference between a diet soda and a regular soda in the first sip
Each dot represents one simulation. For each simulation, 80 values are randomly generated, each equally likely to be "correct" or "incorrect", since under the null hypothesis people are equally likely to be right or wrong in identifying the type of soda. Each dot is placed at the proportion of the 80 values that are "correct" in that simulation. Together, the dots estimate the distribution of the test statistic assuming the null hypothesis is true.
Our test statistic is \(\hat{p}=53/80=0.6625\). There are no dots in the plot at values greater than or equal to 0.6625 or less than or equal to 0.3375. (0.3375 is the same distance below 0.5 that 0.6625 is above it.) So the estimate of the P-value is 0.
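A simulation like the one behind the dot plot can be written in R. Below is a minimal sketch in the style of the code used for the later questions; the seed and the number of repetitions (100) are assumptions, since neither is stated for the original plot.
set.seed(17)  # arbitrary; not specified in the original
library(tidyverse)
repetitions <- 100  # assumed number of dots in the plot
simulated_stats <- rep(NA, repetitions)
n_observations <- 80
for (i in 1:repetitions) {
  # simulate 80 tasters who are equally likely to be right or wrong
  new_sim <- sample(c("correct", "incorrect"), size = n_observations,
                    prob = c(0.5, 0.5), replace = TRUE)
  simulated_stats[i] <- sum(new_sim == "correct") / n_observations
}
sim <- tibble(p_correct = simulated_stats)
# estimated P-value: proportion of simulations at least as far from 0.5
# as the observed test statistic 53/80 = 0.6625
sim %>%
  filter(p_correct >= 0.6625 | p_correct <= 0.3375) %>%
  summarise(p_value = n() / repetitions)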
We would conclude that we have strong evidence that people tasting the sodas are not just randomly guessing, and so strong evidence that people have some ability to tell the difference between a diet soda and a regular soda in the first sip.
Only iv. is a valid interpretation. iii. is missing the fact that the P-value is calculated by comparing the value of the test statistic to results generated assuming that the null hypothesis is true. i. and ii. are incorrect because a P-value is not an estimate of the parameter we are testing; rather, it measures how likely or unlikely our data are, assuming that the value of the parameter specified in the null hypothesis is true.
(Adapted from ISRS 3.15) A 2012 survey of 2,254 American adults indicates that 17% do their browsing on their phone rather than a computer or other device. According to an online article, a report from a mobile research company indicates that 38% of Chinese adults only access the internet through their cell phones. Suppose you wanted to conduct a hypothesis test to determine if the American survey data provide strong evidence that the proportion of Americans who only use their cell phones to access the internet is different from the Chinese proportion of 38%.
\[H_0: p=0.38\] \[H_A: p \ne 0.38\]
\(p\) is the proportion of all American adults who do their browsing on their phone rather than a computer or other device. In Question 1, we were determining whether people were doing better than random guessing, with equal chance of being right or wrong. Here we are testing whether the proportion calculated from the survey data (0.17) is consistent with the population proportion being another value (0.38).
The test statistic is 0.17.
library(tidyverse)
repetitions <- 1000
simulated_stats <- rep(NA, repetitions)
n_observations <- 2254
null_prob <- 0.38  # value of p under the null hypothesis
test_stat <- 0.17  # observed proportion from the survey
for (i in 1:repetitions) {
  # simulate one survey of 2,254 adults, assuming the null hypothesis is true
  new_sim <- sample(c("phone", "other"), size = n_observations,
                    prob = c(null_prob, 1 - null_prob), replace = TRUE)
  sim_p <- sum(new_sim == "phone") / n_observations
  simulated_stats[i] <- sim_p
}
sim <- tibble(p_phone = simulated_stats)
ggplot(sim, aes(p_phone)) +
  geom_histogram(binwidth = 0.01) +
  labs(x = "Proportion who use phone assuming that it is 0.38") +
  geom_vline(xintercept = test_stat, color = "red") +
  geom_vline(xintercept = null_prob + (null_prob - test_stat), color = "red")
sim %>%
  filter(p_phone >= null_prob + (null_prob - test_stat) | p_phone <= test_stat) %>%
  summarise(p_value = n() / repetitions)
## # A tibble: 1 x 1
## p_value
## <dbl>
## 1 0
The estimated P-value is 0 since, assuming the proportion who use their phones is 0.38, no simulation gave a value as extreme as or more extreme than 0.17. We have very strong evidence that the proportion of American adults who do their browsing on their phone is different from the proportion of Chinese adults.
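As a rough check on why none of the 1000 simulations came anywhere near 0.17 (this calculation is not part of the original solution): under the null hypothesis the simulated proportions have standard deviation \(\sqrt{0.38 \times 0.62 / 2254} \approx 0.010\), so the observed value of 0.17 is roughly 20 standard deviations below 0.38.
# normal-approximation check: how many SDs below 0.38 is 0.17?
(0.38 - 0.17) / sqrt(0.38 * 0.62 / 2254)  # approximately 20.5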
A Scottish woman noticed that her husband’s smell changed. Six years later he was diagnosed with Parkinson’s disease. His wife joined a Parkinson’s charity and met other people with that odour. She mentioned this to researchers who decided to test her abilities. They recruited 6 people with Parkinson’s disease and 6 people without the disease. Each of the recruits wore a t-shirt for a day, and the woman was asked to smell the t-shirts and determine which were worn by someone with Parkinson’s disease. She was correct for 11 of the 12 t-shirts. You can read about this here.
set.seed(11) # assume the last 2 digits of my student number are 11
repetitions <- 1000
simulated_stats <- rep(NA, repetitions)
n_observations <- 12
test_stat <- 11/12  # observed proportion of correct identifications
for (i in 1:repetitions) {
  # simulate 12 t-shirts classified by random guessing
  new_sim <- sample(c("correct", "incorrect"), size = n_observations,
                    prob = c(0.5, 0.5), replace = TRUE)
  sim_p <- sum(new_sim == "correct") / n_observations
  simulated_stats[i] <- sim_p
}
sim <- tibble(p_correct = simulated_stats)
ggplot(sim, aes(p_correct)) +
  geom_histogram(binwidth = 0.01) +
  geom_vline(xintercept = test_stat, color = "red") +
  geom_vline(xintercept = 0.5 - (test_stat - 0.5), color = "red")
sim %>%
  filter(p_correct >= test_stat | p_correct <= 0.5 - (test_stat - 0.5)) %>%
  summarise(p_value = n() / repetitions)
## # A tibble: 1 x 1
## p_value
## <dbl>
## 1 0.005
The hypotheses being tested are \[H_0: p=0.5\] \[H_A: p \ne 0.5\] where \(p\) is the probability that the woman correctly identifies whether or not a t-shirt was worn by someone with Parkinson's disease.
The test statistic is \(\hat{p}=11/12=0.9167\). We simulate 1000 values presuming that the woman is randomly guessing, so the probability she is correct is 0.5. From these 1000 simulations, 5 give a value greater than or equal to 0.9167 or less than or equal to 0.0833. So the proportion of observations in our simulation that are as extreme or more extreme than what we observed is 0.005. This is our estimate of the P-value.
We conclude that we have strong evidence that the woman is identifying people with Parkinson’s differently than she would have had she been randomly guessing.
set.seed(11) # assume the last 2 digits of my student number are 11
repetitions <- 100000
simulated_stats <- rep(NA, repetitions)
n_observations <- 12
test_stat <- 11/12  # observed proportion of correct identifications
for (i in 1:repetitions) {
  # simulate 12 t-shirts classified by random guessing
  new_sim <- sample(c("correct", "incorrect"), size = n_observations,
                    prob = c(0.5, 0.5), replace = TRUE)
  sim_p <- sum(new_sim == "correct") / n_observations
  simulated_stats[i] <- sim_p
}
sim <- tibble(p_correct = simulated_stats)
ggplot(sim, aes(p_correct)) +
  geom_histogram(binwidth = 0.01) +
  geom_vline(xintercept = test_stat, color = "red") +
  geom_vline(xintercept = 0.5 - (test_stat - 0.5), color = "red")
sim %>%
  filter(p_correct >= test_stat | p_correct <= 0.5 - (test_stat - 0.5)) %>%
  summarise(p_value = n() / repetitions)
## # A tibble: 1 x 1
## p_value
## <dbl>
## 1 0.00614
The hypotheses being tested are \[H_0: p=0.5\] \[H_A: p \ne 0.5\] where \(p\) is the probability that the woman correctly identifies whether or not a t-shirt was worn by someone with Parkinson's disease.
The test statistic is \(\hat{p}=11/12=0.9167\). We simulate 100,000 values presuming that the woman is randomly guessing, so the probability she is correct is 0.5. From these 100,000 simulations, 614 give a value greater than or equal to 0.9167 or less than or equal to 0.0833. So the proportion of observations in our simulation that are as extreme or more extreme than what we observed is 0.00614. This is our estimate of the P-value.
We conclude that we have strong evidence that the woman is identifying people with Parkinson’s differently than she would have had she been randomly guessing.
Note: in tutorial you compared your answers to a. and b. with your classmates. A larger number of repetitions results in a more precise estimate of the P-value. So you should have seen less variability among the P-values you and your classmates obtained for b. than for a.
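Because each simulated statistic is a Binomial(12, 0.5) count divided by 12, the quantity these simulations estimate can also be computed exactly. The following check is not part of the original solution, but it shows the simulation estimates above are close to the exact value.
# exact two-sided P-value: P(X >= 11) + P(X <= 1) for X ~ Binomial(12, 0.5)
sum(dbinom(c(0, 1, 11, 12), size = 12, prob = 0.5))  # 26/4096, about 0.00635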
The hypotheses and the number of observations (12) would be the same, so there would be no need to re-run the simulation. To get the P-value, we would now count how many of the simulations resulted in a proportion correct of 1 (all 12 correct) or 0 (none correct). Nothing else would change. In this case, our estimated P-value would be 0 (or we would report it as \(<0.001\)).
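To see why an estimated P-value of 0 is the likely outcome in that case (again, a check that is not part of the original solution): under random guessing, the exact chance of getting all 12 correct or all 12 incorrect is tiny, so a simulation with 1000 repetitions will usually contain no such outcome.
# exact chance of 12/12 or 0/12 correct under random guessing
2 * dbinom(12, size = 12, prob = 0.5)  # 2/4096, about 0.00049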
R Markdown source