Instructions

What should I bring to tutorial on October 12?

  • R output (e.g., plots and explanations) for Question 1 (a)-(e). You can either bring a hardcopy or bring your laptop with the output.

Tutorial Grading

Tutorial grades will be assigned according to the following marking scheme.

Component                             Mark
Attendance for the entire tutorial    1
Assigned homework completion          1
In-class exercises                    4
Total                                 6

These problems are based on the lesson Joining Data Frames.

Practice Problems

The file heroes_information_exer.csv contains some information on superheroes, and super_hero_powers_exer.csv contains some information on the powers of superheroes.

The following questions are based on data in heroes_information.csv and super_hero_powers.csv.

Question 1

  1. Read both data sets heroes_information.csv and super_hero_powers.csv into R using read_csv from the tidyverse library.

If you are using rstudio.cloud then here is the R code.

library(tidyverse)
hero_info <- read_csv("heroes_information_exer.csv")
hero_power <- read_csv("super_hero_powers_exer.csv")

If you are using RStudio on your own computer then use this R code (internet connection required).

heroinfo_url <- "https://raw.githubusercontent.com/ntaback/UofT_STA130/master/Fall2018/week5/heroes_information_exer.csv"
heropower_url <- "https://raw.githubusercontent.com/ntaback/UofT_STA130/master/Fall2018/week5/super_hero_powers_exer.csv"

hero_info <- read_csv(heroinfo_url)
hero_power <- read_csv(heropower_url)

How many variables and observations are in each data frame?
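As a reminder of how to count rows and columns, here is a minimal sketch on a made-up data frame. The name toy and its columns are hypothetical and are not part of the course data.

```r
library(tidyverse)

# Hypothetical toy data frame, not the course data.
toy <- tibble(
  name   = c("Ada", "Bolt", "Crane"),
  height = c(170, 185, 178)
)

n_obs  <- nrow(toy)  # number of observations (rows)
n_vars <- ncol(toy)  # number of variables (columns)

dim(toy)      # both at once: rows first, then columns
glimpse(toy)  # prints the dimensions plus each column's type and first values
```

The same calls should work unchanged on hero_info and hero_power once they have been read in.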

  2. Suggest a key to join the two data frames.

  3. What proportion of superheroes in heroes_information also have data in super_hero_powers?

  4. What is the number of observations, average, median, standard deviation, and inter-quartile range of weight for superheroes for each category of marksmanship? (HINT: use the group_by() function then summarise())

  5. Are superheroes with marksmanship thinner compared to those without marksmanship? Create a visualization to compare the distribution of weight between superheroes that have marksmanship and those that don’t have marksmanship. Which distribution has more variability?
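The joining and summarising steps in these questions can be sketched on toy data. Everything below is an assumption made for illustration: the data frames toy_info and toy_power, their column names (name, hero_name, weight, marksmanship), and all values are made up, so the key and column names in the course files may differ.

```r
library(tidyverse)

# Toy stand-ins for the two course files; all names and values are made up.
toy_info <- tibble(
  name   = c("Ada", "Bolt", "Crane", "Dusk", "Echo"),
  weight = c(60, 95, 72, 110, 58)
)
toy_power <- tibble(
  hero_name    = c("Ada", "Bolt", "Crane", "Echo", "Flux"),
  marksmanship = c(TRUE, FALSE, TRUE, FALSE, TRUE)
)

# semi_join() keeps the rows of toy_info that have a match in toy_power,
# so the ratio of row counts is the proportion of heroes with power data.
matched <- semi_join(toy_info, toy_power, by = c("name" = "hero_name"))
prop_matched <- nrow(matched) / nrow(toy_info)

# inner_join() keeps only matching rows and combines the columns.
joined <- inner_join(toy_info, toy_power, by = c("name" = "hero_name"))

# One row of summary statistics per marksmanship category.
weight_summary <- joined %>%
  group_by(marksmanship) %>%
  summarise(
    n             = n(),
    mean_weight   = mean(weight),
    median_weight = median(weight),
    sd_weight     = sd(weight),
    iqr_weight    = IQR(weight)
  )

# Side-by-side boxplots compare the centre and spread of the two groups.
weight_plot <- ggplot(joined, aes(x = marksmanship, y = weight)) +
  geom_boxplot()
```

The by = c("name" = "hero_name") form is used here because the key is assumed to have a different name in each data frame. If the key column has the same name in both, by = "name" is enough. Printing weight_plot displays the boxplots.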