Poster Presentation

The final project for this course will be a statistical analysis of data from Geotab on the topic below. You will present your findings in the style of a poster display of a professional scientific conference. You and your team will post your work on a board, give a short oral presentation of your findings when visited by evaluators, and be prepared to answer questions about your work.

Deliverables

  1. A poster to be displayed at the STA130H1 poster fair. See more details about the poster and poster fair below.
  2. The R markdown document (.Rmd) that produced the poster and the html (.html) output file, to be submitted before 9:30 on Monday, April 2 if you are in section LEC0101 and before 13:30 on Monday, April 2 if you are in section LEC0201. Submission details to follow.
  3. A 5 minute oral presentation summarizing your work that you will give at the poster fair.

The Poster Fair

When

  • Date: April 2, 2018

  • Time: During the lecture time that you are registered. If you are registered in LEC0101 then the time is 10:10 - 12:00, or if you are registered in LEC0201 then the time is 14:10 - 16:00. Make sure to come in time to post your work before 10 minutes past the hour. Take down your poster at the end of your class time.

Location

  • Atrium of Bahen Centre, 40 St George Street
    (Note: come to the Bahen atrium and not your lecture room on April 2.)

The Poster

  • The poster boards on which you will display your work are 6 feet wide by 3 feet high. You must ensure that what your group decides to post will fit on one board.
  • Groups can attach individual sheets of paper to the poster board using the velcro that will be provided. No other methods of attachment to the board are permitted.
  • You should produce your poster using R markdown to create slides. We recommend ioslides. You can then print the slides on 8.5 x 11 inch sheets of paper, to be posted on the poster board. The skeleton of an R markdown document to produce the slides for your poster is here. When planning your pages, think of your poster as a grid of these pages, which you will arrange in columns, to effectively show your work.
    For more details on using R markdown see:
  • Your poster should include R output, but not R code. See the template for an example of how to write code chunks to do this.
  • Your poster must include the names of your team members, your tutorial section (e.g., TUT0101A), and your group number as assigned by your TA.
  • Your poster should include the following sections: Introduction, Statistical Methods, Results, Conclusion. See the template for optional sections.
  • See the Self-Evaluation Checklist for STA130 Project and the template R markdown document for details on what to include.

A few guidelines for an effective poster display:

  • Your poster should be clear, concise, and easy to read quickly.
  • Do not use small fonts as it will need to be read at a distance of about 6 feet.
  • Figures display information more efficiently than text.
  • Numbered or bulleted lists convey points in a poster setting more effectively than blocks of text.

The Oral Presentation

At the poster fair you will be asked to give a 5 minute presentation summarizing your work. This time limit is firm and you will be asked to stop when time is up. Each team member must speak during this presentation.

Teamwork

  • Preparatory work will be carried out in tutorial and will form part of your tutorial grade.
  • You will work in a group comprised of either 3 or 4 students.
  • You must work with students in your tutorial only. Some tutorial time will be used for project work.
  • Your TA can help you form a group with other students in your tutorial.
  • All team members are expected to contribute equally to the completion of the project. All team members must present part of the oral presentation at the poster fair.
  • Your group may not work with members of another group. You may not discuss the project with anyone except for your team, professors, and the course TAs.

Evaluation

Grade component* Value
Poster Content 50%
Reproducibility of Poster Content 10%
Oral Presentation 40%

* Students will be evaluated as a team.

Poster Content and Reproducibility

  • The rubric for the poster evaluation is here.

  • Before the poster fair, you must send your TA the html and R Markdown files of your poster. These are due at 9:30 if you are in section LEC0101 and at 13:30 if you are in section LEC0201.

  • Your TA will attempt to reproduce your poster using the R Markdown (.rmd) files you submit.

  • If your TA cannot run the .rmd files you submit to reproduce your poster content then your group will receive 0; if the TA has to make minor changes to get it to run then your group will receive 1; and if it runs with no changes then your group will recieve 2.

  • If the R Markdown and HTML files are submitted after the deadline every member of your group will lose 10% of your overall final project mark as long as they are submitted at most 24 hours after the deadline. The R Markdown and HTML files will not be accepted more than 24 hours after the deadline.

Oral Presentation

  • During the poster fair, you will be visited by members of the STA130H1 teaching team. You will give them a 5 minute presentation about your work. The rubric for the oral presentation is here.

  • Every member of the team is expected to speak as part of the oral presentation.

  • If a student in a group isn’t present at their group’s presentation then they will need a valid excuse (e.g., UofT illness form), otherwise they will receive 50% of the group mark.

  • If a student doesn’t speak at all during the presentation and is unable to answer a direct question then they will receive 50% of the group mark. If a student neither speaks nor responds to any questions they will receive 0.

Data Analysis Expectations

You will carry out a data analysis on data from Geotab using R to address the topic below.

We expect that your analysis may require data wrangling, exploratory data analysis (plots and summary statistics), tests, confidence intervals, classification trees, and regression models. Your project does not need to include all of these statistical methods nor does it need to include all of the variables in the data set. You might also choose not to include all observations, or to make new variables from the data that may be more suitable for answering your questions of interest.

The goal is not to carry out an exhaustive analysis, nor to apply everything you have learned in the course. The goal is to demonstrate that you have learned how to use R, that you can appropriately apply the methods we have covered in class to address a question, and that you can effectively interpret and present the results.

The Data

The red dots on the map below show hazardous driving instances in Canada recorded in Geotab’s hazardous driving data set.

library(tidyverse)
library(ggmap)

hazardousdriving <- read_csv("hazardousdriving.csv")
hazardousdriving_can <- hazardousdriving %>% filter(Country == "Canada")
qmplot(AvgLongitude, AvgLatitude, data = hazardousdriving_can, maptype = "toner-lite", color = I("red"))

Your group will explore the hazardous driving data set. We have proposed questions for your group to address in your project. There are many ways you can address these questions in the data. Your group will need to focus your project, choosing what you will consider. You do not need to consider every variable in the data set.

The hazardous driving data set is described here. The “Datasheet” link provides an explanation of each of the variables in the data.

Frequently Asked Questions about the Data

Question: Can the severity score be derived from other variables in the data set?

Answer: No. The severity score cannot be derived from other variables in the data set.

Question: Is it correct to say that there is no data on the number of people involved in an incident? So, for example, we can’t assume anything about the number of people involved in a multi-passenger vehicle incident or any other incident.

Answer: Yes.

Question: What is the definition of multi-passenger vehicle?

Answer: MPV or multi-purpose vehicles is a classification specified by the car manufacturer. SUV and vans are also included this classification.

Final Project Questions

There is conflicting evidence about which province in Canada has the most dangerous driving conditions. A recent study claimed that Halifax is the most dangerous Canadian city to drive. Does this mean that Nova Scotia has the most dangerous driving in Canada? According to a report by the Allstate Insurance Company New Brunswick had the highest collision claims. dangerousroads.org claims some of the most dangerous highways in Canada are in Manitoba and Nova Scotia.

Use the hazardous driving data to answer the following questions:

  1. Define what your group considers to be “hazardous driving” for the purposes of your project. Note that we expect that each group may have a different definition. The analysis your group does will follow from how you choose to define hazardous driving.
  2. According to your definition of hazardous driving, which province has the most driving hazards?
  3. How does hazardous driving in the province you identified in 2. compare to each of the other provinces?

To answer questions 1 - 3, you must use the hazardous driving data. It is not necessary to use data from any other source. However you are allowed to supplement the hazardous driving data with other data if you’d like.