Tutorial grades will be assigned according to the following marking scheme.
| | Mark |
|---|---|
| Attendance for the entire tutorial | 1 |
| Assigned homework completion | 1 |
| In-class exercises | 4 |
| Total | 6 |
One survey showed that among 785 randomly selected subjects who completed four years of college, 144 smoke and 641 do not (based on data from the American Medical Association). The rate of smoking in the general population is reported to be 27%. Researchers are interested in finding out if the rate of smoking is different for college graduates than for the general population.
\(H_0: p = 0.27\)
\(H_1: p \neq 0.27\)
where \(p\) is the proportion of college graduates who are smokers.
Each dot represents one simulation. For each simulation, \(144 + 641 = 785\) values are randomly generated under the assumption that 27% of college graduates are smokers and 73% are nonsmokers, and each dot corresponds to the proportion of smokers in one such simulated sample of 785 people.
The test statistic here is \(144/785 \approx 0.183\). There are no dots in the plot that are lower than \(0.183\) or larger than \(0.27 + (0.27 - 0.183) = 0.357\) (0.357 is the same distance above 0.27 as 0.183 is below it). Thus, the estimate of the p-value is 0, which means that the value of our test statistic would be very unusual under the null hypothesis.
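As a concrete illustration (not part of the original output above), here is a minimal R sketch of this simulation. It assumes the tidyverse is loaded and uses variable names of my own choosing; the estimated p-value should come out as 0 or very nearly 0.

```r
library(tidyverse)

set.seed(130)                          # any seed; for reproducibility only
repetitions <- 10000
n_observations <- 785
test_stat <- 144 / 785                 # observed proportion of smokers, about 0.183
other_extreme <- 0.27 + (0.27 - test_stat)

simulated_stats <- rep(NA, repetitions)
for (i in 1:repetitions) {
  # simulate 785 people under H0: 27% smokers, 73% nonsmokers
  new_sim <- sample(c("smoker", "nonsmoker"), size = n_observations,
                    replace = TRUE, prob = c(0.27, 0.73))
  simulated_stats[i] <- sum(new_sim == "smoker") / n_observations
}

sim_smoking <- tibble(p_smoker = simulated_stats)

# estimated p-value: proportion of simulated values farther from 0.27
# than the observed proportion is, in either direction
num_extreme <- sim_smoking %>%
  filter(p_smoker < test_stat | p_smoker > other_extreme) %>%
  nrow()
num_extreme / repetitions
```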
Only (iii) is a valid conclusion. Conclusions (i) and (ii) are incorrect because the null hypothesis is that there is no difference between the smoking rate among college graduates and the general population. Conclusion (iv) is wrong because a small p-value (<0.01) provides strong evidence against the null hypothesis, not for the null hypothesis.
In August 2017, researchers at Dalhousie University conducted a poll to learn more about attitudes towards the legalization of marijuana across Canada. They found that 14.5% of respondents strongly disagreed with the legalization of recreational marijuana, based on a sample of 1,087 adults who had been living in Canada for at least 12 months.
Suppose a headline reads “One out of six Canadians strongly disagree with the legalization of recreational marijuana in Canada”.
\(H_0: p = 1/6\)
\(H_1: p \neq 1/6\)
where \(p\) is the proportion of individuals who strongly disagree with the legalization of marijuana in Canada.
## [1] 0.0537
The test statistic here is \(\hat{p} = 0.145\). We simulated 10,000 samples of size 1087 under the assumption that 1/6 of people strongly disagree with marijuana legalization. For each of these simulated samples, we calculated the proportion of people who strongly disagreed with legalization. In 5.37% of the 10,000 simulations, the proportion of “strongly disagree” responses was either smaller than 0.145 or larger than \(1/6 + (1/6 - 0.145) \approx 0.1883\). In other words, the proportion of simulated statistics that are as extreme as or more extreme than what we observed is 0.0537, and this is our estimate of the p-value. We conclude that we have weak evidence against the null hypothesis that 1/6 of people strongly disagree with the legalization of marijuana. In other words, there is weak evidence that the data contradicts the headline.
Use the `Sys.time()` function to record the time before and after your simulation, and then calculate the difference. For example:

startTime <- Sys.time()
YOUR CODE
endTime <- Sys.time()
endTime - startTime # This is the time it took to run your code
Which of the estimated p-values do you think is the best estimate of the true p-value? What happens to the runtime as the number of simulations increases?
calculate_pvalue_legalization <- function(repetitions){
# create a vector of missing values to store results
simulated_stats <- rep(NA, repetitions) # one missing value per repetition, filled in below
n_observations <- 1087
test_stat <- 0.145
distance_from_null <- 1/6 - test_stat; # note that 1/6=0.166667
other_extreme <- 1/6 + distance_from_null
for (i in 1:repetitions)
{
new_sim <- sample(c("strongly_disagree", "dont_strongly_disagree"), size=n_observations, replace=TRUE, prob=c(1/6, 5/6))
sim_p <- sum(new_sim == "strongly_disagree") / n_observations
# add the new simulated value to the ith entry in the vector of results
simulated_stats[i] <- sim_p
}
# turn results into a data frame
sim2 <- tibble(p_strongly_disagree = simulated_stats)
# plot (wrapped in print() so the histogram is displayed when the function is called)
print(
  ggplot(sim2, aes(p_strongly_disagree)) +
    geom_histogram(binwidth = 0.01) +
    geom_vline(xintercept = test_stat, color = "red") +
    geom_vline(xintercept = other_extreme, color = "blue")
)
# Calculate p-value
num_extreme <- sim2 %>%
filter(p_strongly_disagree < test_stat | p_strongly_disagree > other_extreme) %>%
nrow()
pvalue <- num_extreme / nrow(sim2)
return(pvalue)
} # end calculate_pvalue_legalization function
set.seed(123)
startTime_100 <- Sys.time();
pvalue_100 <- calculate_pvalue_legalization(repetitions = 100)
endTime_100 <- Sys.time();
runtime_100 <- endTime_100 - startTime_100;
startTime_1000 <- Sys.time();
pvalue_1000 <- calculate_pvalue_legalization(repetitions = 1000)
endTime_1000 <- Sys.time();
runtime_1000 <- endTime_1000 - startTime_1000;
startTime_10000 <- Sys.time();
pvalue_10000 <- calculate_pvalue_legalization(repetitions = 10000)
endTime_10000 <- Sys.time();
runtime_10000 <- endTime_10000 - startTime_10000;
startTime_50000 <- Sys.time();
pvalue_50000 <- calculate_pvalue_legalization(repetitions = 50000)
endTime_50000 <- Sys.time();
runtime_50000 <- endTime_50000 - startTime_50000;
data.frame(repetitions=c(100,1000,10000,50000),
pvalue=c(pvalue_100, pvalue_1000, pvalue_10000, pvalue_50000),
runtime=c(runtime_100, runtime_1000, runtime_10000, runtime_50000))
As the number of repetitions increases, we expect the estimated p-value to get closer to the true p-value. However, as the number of repetitions increases, the runtime increases as well. When using simulation to test a hypothesis, we need to consider the trade-off between accuracy and runtime: we want enough repetitions to get reliable results (at least 10,000, if feasible) without the runtime becoming unreasonably long.
In this case, I think that choosing 10,000 to 20,000 repetitions is sensible, as this would take approximately 1 second to run (note that runtimes will vary depending on your device and the other programs you have running).
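A rough way to quantify this trade-off (an aside, not part of the tutorial solution): the estimated p-value is the proportion of \(R\) simulated yes/no outcomes, so its Monte Carlo standard error is approximately \(\sqrt{\hat{p}(1-\hat{p})/R}\). Plugging in the estimate \(\hat{p} \approx 0.054\) from above:

```r
# Monte Carlo standard error of the estimated p-value for each repetition count
# (base R only; p_hat = 0.054 is the approximate estimate obtained above)
p_hat <- 0.054
reps <- c(100, 1000, 10000, 50000)
data.frame(repetitions = reps,
           mc_se = sqrt(p_hat * (1 - p_hat) / reps))
```

Going from 10,000 to 50,000 repetitions shrinks this standard error only by a factor of about \(\sqrt{5} \approx 2.2\) while multiplying the runtime by roughly 5, which is another way to see why 10,000 to 20,000 repetitions is a reasonable compromise here.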
In a Gallup poll of 1012 randomly selected adults, 9% said that cloning of humans should be allowed. We want to test the hypothesis that a proportion \(c\) of adults believe cloning should be allowed.
\(H_0: p = c\)
\(H_1: p \neq c\)
where \(p\) is the proportion of adults who believe cloning should be allowed.
The function `calculate_pvalue_cloning()` below takes two arguments:

prop_cloning
: the proportion of adults who believe cloning should be allowed (under the null hypothesis)

repetitions
: the number of repetitions

calculate_pvalue_cloning <- function(prop_cloning, repetitions){
# create a vector of missing values to store results
simulated_stats <- rep(NA, repetitions)
n_observations <- 1012
test_stat <- 0.09
for (i in 1:repetitions)
{
new_sim <- sample(c("allowed", "not_allowed"), size=n_observations, replace=TRUE, prob=c(prop_cloning, 1-prop_cloning))
sim_p <- sum(new_sim == "allowed") / n_observations
# add the new simulated value to the ith entry in the vector of results
simulated_stats[i] <- sim_p
}
# turn results into a data frame
sim3 <- tibble(p_allowed = simulated_stats)
# Calculate p-value
num_extreme <- sim3 %>%
filter( abs(p_allowed - prop_cloning) > abs(test_stat - prop_cloning)) %>%
nrow()
pvalue <- num_extreme / nrow(sim3)
return(pvalue)
}
set.seed(31)
calculate_pvalue_cloning(prop_cloning = 0.12, repetitions = 10000);
## [1] 0.0043
The test statistic here is \(\hat{p} = 0.09\). We simulate 10,000 datasets under the null hypothesis that the proportion of people who believe cloning should be allowed is 0.12. The estimated p-value from these 10,000 repetitions is 0.0043, which means that the proportion of simulated statistics that are as extreme as or more extreme than what we observed in the actual sample is 0.0043.
We conclude that we have strong evidence against the null hypothesis that 12% of people believe that cloning humans should be allowed. In other words, we conclude that the percentage of people who believe cloning should be allowed is not equal to 12%, and we have strong evidence that the data contradicts the headline.
(ii) "Poll reports that 90% of people believe cloning humans should not be allowed"
set.seed(32)
calculate_pvalue_cloning(prop_cloning = 0.10, repetitions = 10000);
## [1] 0.2921
The test statistic here is \(\hat{p} = 0.09\). We simulate 10,000 datasets under the null hypothesis that the proportion of people who believe cloning should not be allowed is 0.90 (or equivalently, that the proportion of people who believe cloning should be allowed is 0.10). The estimated p-value from these 10,000 repetitions is 0.2921, which means that the proportion of simulated statistics that are as extreme as or more extreme than what we observed in the actual sample is 0.2921.
We conclude that we have no evidence against the null hypothesis that 10% of people believe that cloning humans should be allowed (or equivalently, that 90% of people believe that cloning humans should NOT be allowed). In other words, the data does not contradict the headline in this case.
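As an optional sanity check (not the simulation approach used in this tutorial), a normal approximation to these two-sided p-values, using the same symmetric “distance from the null” definition of extreme, gives similar answers. Here \(\hat{p} = 0.09\) and \(n = 1012\) are taken from the poll:

```r
# Normal-approximation p-values for H0: p = p_null vs H1: p != p_null,
# with observed proportion 0.09 from n = 1012 respondents (base R only)
p_hat <- 0.09
n <- 1012
normal_approx_pvalue <- function(p_null) {
  se <- sqrt(p_null * (1 - p_null) / n)   # standard error under the null
  2 * pnorm(-abs(p_hat - p_null) / se)    # two-sided tail probability
}
normal_approx_pvalue(0.12)   # about 0.003, close to the simulated 0.0043
normal_approx_pvalue(0.10)   # about 0.29, close to the simulated 0.2921
```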