Bring your answers to Questions 1 and 2.
file_url <- "https://raw.githubusercontent.com/ntaback/UofT_STA130/master/week3/Week3PracticeProblems-student.Rmd"
download.file(url = file_url, destfile = "Week3PracticeProblems-student.Rmd")
Look for the file “Week3PracticeProblems-student.Rmd” under the Files tab then click on it to open.
Change the subtitle to “Week3PracticeProblemsSolutions” and change the author to your name and student number.
Type your answers below each question. Remember that R code chunks can be inserted directly into the notebook by choosing Insert R from the Insert menu (see Using R Markdown for Class Assignments). In addition, the R Markdown cheatsheet and reference guide are great resources as you get started with R Markdown.
Create a vector coin that represents the outcomes of flipping a coin.
coin <- c("H", "T")
Use the sample() function to simulate flipping a coin 10 times. Run it twice: do you get the same results?
coin <- c("H", "T")
sample(coin, 10, replace = TRUE)
## [1] "T" "T" "H" "H" "T" "T" "H" "T" "T" "H"
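Because sample() draws at random, two runs will usually give different sequences. A minimal sketch, assuming you want repeatable results: calling set.seed() with the same value before each draw reproduces the same flips. The seed value 130 is an arbitrary choice for illustration and is not part of the original exercise.

```r
coin <- c("H", "T")

# Without a seed, repeated calls usually give different sequences
sample(coin, 10, replace = TRUE)
sample(coin, 10, replace = TRUE)

# Resetting the same seed before each call makes the draws identical
set.seed(130)  # arbitrary seed, chosen only for illustration
first <- sample(coin, 10, replace = TRUE)
set.seed(130)
second <- sample(coin, 10, replace = TRUE)
identical(first, second)  # TRUE
```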
coin <- c("H","T")
results <- sample(coin, 10, replace = TRUE)
sum(results == "H")
## [1] 3
flipcoin <- function() {
  coin <- c("H", "T")
  results <- sample(coin, 10, replace = TRUE)
  sum(results == "H")  # number of heads in 10 tosses
}
flipcoin()
## [1] 8
flipcoin <- function(n) {
  coin <- c("H", "T")
  results <- sample(coin, n, replace = TRUE)
  sum(results == "H")  # number of heads in n tosses
}
flipcoin(20)
## [1] 17
flipcoin <- function(n) {
  coin <- c("H", "T")
  results <- sample(coin, n, replace = TRUE)
  (sum(results == "H") / n) * 100  # percentage of heads in n tosses
}
flipcoin(20)
## [1] 75
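A single call such as flipcoin(20) changes from run to run. One way to see how much it varies, sketched here with base R's replicate() and the percentage version of flipcoin(), is to repeat the 20-toss experiment many times. The 1000 repetitions are an arbitrary choice, not something the exercise asks for.

```r
flipcoin <- function(n) {
  coin <- c("H", "T")
  results <- sample(coin, n, replace = TRUE)
  (sum(results == "H") / n) * 100  # percentage of heads in n tosses
}

# Repeat the 20-toss experiment 1000 times (1000 is an arbitrary choice)
percents <- replicate(1000, flipcoin(20))

summary(percents)  # spread of the percentage of heads across repetitions
mean(percents)     # average percentage across repetitions
```

Individual runs can land far from 50, but the average over many repetitions tends to sit near it.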
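Use flipcoin() to simulate 1, 2, ..., 500 tosses and plot the percentage of heads against the number of tosses.
library(tidyverse)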
numtoss <- 1:500
perchead <- sapply(numtoss, flipcoin)
coinflips <- data.frame(numtoss, perchead)
coinflips %>%
  ggplot(aes(x = numtoss, y = perchead)) +
  geom_point(alpha = 0.5) +
  geom_hline(yintercept = 50, colour = "red")
What do you observe as the number of tosses approaches 500?
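The percentage of heads gets closer to 50%, and the points scatter less widely around the red line at 50 as the number of tosses increases.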
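The same behaviour can be seen in a single long run of tosses. The sketch below is a variation on the simulation above, not the same one: instead of drawing a fresh sample for each number of tosses, it tracks the running percentage of heads within one sequence using cumsum(). The seed and the 10000 tosses are arbitrary choices.

```r
set.seed(130)  # arbitrary seed so the sketch is repeatable

n_max <- 10000  # arbitrary number of tosses
flips <- sample(c("H", "T"), n_max, replace = TRUE)

# Running percentage of heads after 1, 2, ..., n_max tosses
running_pct <- cumsum(flips == "H") / seq_len(n_max) * 100

# Percentage of heads after selected numbers of tosses
running_pct[c(10, 100, 1000, 10000)]
```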
library(tidyverse)
fludat_prov <- read_csv("fludat_prov.csv") # import data from file
popdat <- read_csv("popdat.csv") # import data from file
Examine the data frames using the Environment tab in RStudio. Do the data frames have the same number of rows and columns?
Yes.
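The Environment tab shows the dimensions at a glance, but they can also be checked in code. A minimal sketch using dim() on two made-up data frames; df_a and df_b are hypothetical stand-ins, and their values are invented for illustration, not taken from fludat_prov.csv or popdat.csv.

```r
# Hypothetical stand-ins; the real data frames come from read_csv()
df_a <- data.frame(prov = c("Ontario", "Quebec"), fluA = c(10, 20))
df_b <- data.frame(prov = c("Ontario", "Quebec"), region = c("Central", "East"))

dim(df_a)  # number of rows, then number of columns
dim(df_b)

# TRUE only when both the row counts and the column counts match
identical(dim(df_a), dim(df_b))
```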
The province names differ between fludat_prov and popdat. Recode so that the names are the same in both data frames.
fludat_prov$prov <- recode(fludat_prov$prov,
  "Province of Québec" = "Quebec",
  "Province of Ontario" = "Ontario",
  "Province of Saskatchewan" = "Saskatchewan",
  "Province of Alberta" = "Alberta")
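Before recoding, it can help to list exactly which names fail to match. A sketch using base R's setdiff() on made-up name vectors that stand in for fludat_prov$prov and popdat$prov. Note that it strips the prefix with sub() rather than using recode(), which is a simpler swapped-in technique that would not convert "Québec" to "Quebec".

```r
# Made-up name vectors standing in for fludat_prov$prov and popdat$prov
flu_names <- c("Province of Ontario", "Province of Alberta", "Manitoba")
pop_names <- c("Ontario", "Alberta", "Manitoba")

# Names that appear in the first vector but not in the second
setdiff(flu_names, pop_names)

# Strip the prefix, then confirm that no mismatches remain
flu_fixed <- sub("^Province of ", "", flu_names)
setdiff(flu_fixed, pop_names)  # character(0)
```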
How can you use the popdat data frame to calculate the number of provinces/territories in each region?
popdat %>% group_by(region) %>% summarise(n = n())
In the popdat data, fill in the missing regions in the region variable and repeat part (c).
popdat$region[popdat$prov == "Alberta"] <- "West" # recode only the region value for Alberta
popdat$region[popdat$prov == "Quebec"] <- "East" # recode only the region value for Quebec
popdat %>% group_by(region) %>% summarise(n = n())
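To check how many region values are still missing before and after filling them in, a base R alternative to the grouped summary is table() with useNA = "ifany". The sketch below uses a made-up data frame named toy; its provinces and regions are illustrative values, not the contents of popdat.

```r
# Made-up stand-in for popdat with two missing regions
toy <- data.frame(
  prov   = c("Ontario", "Quebec", "Alberta", "Manitoba"),
  region = c("Central", NA, NA, "West"),
  stringsAsFactors = FALSE
)

# Count rows per region, including the missing values
table(toy$region, useNA = "ifany")

# Fill in one missing value by indexing on prov, then recount
toy$region[toy$prov == "Alberta"] <- "West"
counts <- table(toy$region, useNA = "ifany")
counts
```

After the assignment, one row in toy still has a missing region, which the NA column of the table makes visible.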
Join the popdat and fludat_prov tables to create a new data frame. How many variables and observations are in the new data frame? Is this what you expected? Explain.
fludat_prov %>% inner_join(popdat, by = "prov")
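An inner join keeps only the rows whose key appears in both tables, and its columns are the key plus the remaining columns from each side. A small sketch with made-up tables shows how to reason about the expected size; flu_toy and pop_toy are hypothetical, and their values are invented for illustration.

```r
library(dplyr)

# Made-up tables: "Yukon" appears only in the second one
flu_toy <- data.frame(prov = c("Ontario", "Quebec"), fluA = c(10, 20))
pop_toy <- data.frame(
  prov   = c("Ontario", "Quebec", "Yukon"),
  region = c("Central", "East", "North")
)

joined <- inner_join(flu_toy, pop_toy, by = "prov")

# 2 rows (matched provinces only) and 3 columns (prov, fluA, region)
dim(joined)
```

If the joined data frame has fewer rows than expected, unmatched key values, such as province names spelled differently in the two tables, are a likely cause.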
fludat_prov %>%
  inner_join(popdat, by = "prov") %>%
  group_by(prov) %>%
  mutate(rate = fluA / testpop_size) %>%
  arrange(desc(rate))