Tutorial grades will be assigned according to the following marking scheme.
| | Mark |
|---|---|
| Attendance for the entire tutorial | 1 | 
| Assigned homework completion | 1 | 
| In-class exercises | 4 | 
| Total | 6 | 
These problems are based on the lesson Joining Data Frames.
The file heroes_information_exer.csv contains some information on superheroes, and super_hero_powers_exer.csv contains some information on the powers of superheroes.
The following questions are based on data in heroes_information.csv and super_hero_powers.csv.
Read heroes_information.csv and super_hero_powers.csv into R using read_csv from the tidyverse library. If you are using rstudio.cloud then here is the R code.
library(tidyverse)
hero_info <- read_csv("heroes_information_exer.csv")
hero_power <- read_csv("super_hero_powers_exer.csv")
If you are using RStudio on your own computer then use this R code (internet connection required).
heroinfo_url <- "https://raw.githubusercontent.com/ntaback/UofT_STA130/master/Fall2018/week5/heroes_information_exer.csv"
heropower_url <- "https://raw.githubusercontent.com/ntaback/UofT_STA130/master/Fall2018/week5/super_hero_powers_exer.csv"
hero_info <- read_csv(heroinfo_url)
hero_power <- read_csv(heropower_url)
How many variables and observations are in each data frame?
Suggest a key to join the two data frames.
What proportion of superheroes in heroes_information also have data in super_hero_powers?
What is the number of observations, average, median, standard deviation, and inter-quartile range of weight for superheroes for each category of marksmanship? (HINT: use the group_by() function then summarise())
Are superheroes with marksmanship thinner compared to those without marksmanship? Create a visualization to compare the distribution of weight between superheroes that have marksmanship and those that don’t have marksmanship. Which distribution has more variability?
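As a reminder of the mechanics in the hint, here is a minimal sketch of a join followed by group_by() and summarise(). It uses two small made-up data frames, not the superhero files: the names toy_info, toy_power, name, hero_name, weight and archery are all hypothetical, so the key and column names in the real data will need to be checked first (for example with glimpse()).

```r
library(dplyr)  # part of the tidyverse

# Toy data: every name and value here is invented for illustration only.
toy_info <- tibble(
  name   = c("A", "B", "C", "D", "E"),
  weight = c(60, 80, 100, 70, 90)
)
toy_power <- tibble(
  hero_name = c("A", "B", "C", "D"),
  archery   = c(TRUE, TRUE, FALSE, FALSE)
)

# semi_join() keeps the rows of toy_info that have a match in toy_power,
# so the ratio of row counts is the proportion of matched rows.
matched <- semi_join(toy_info, toy_power, by = c("name" = "hero_name"))
prop_matched <- nrow(matched) / nrow(toy_info)

# Join the two tables on the key, then compute one row of summary
# statistics per category of the (hypothetical) archery column.
toy_summary <- toy_info %>%
  inner_join(toy_power, by = c("name" = "hero_name")) %>%
  group_by(archery) %>%
  summarise(
    n             = n(),
    mean_weight   = mean(weight),
    median_weight = median(weight),
    sd_weight     = sd(weight),
    iqr_weight    = IQR(weight)
  )

print(prop_matched)
print(toy_summary)
```

If the weight column in the real data contains missing values, the summary functions will likely need na.rm = TRUE to return a number.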