Collecting and Summarizing Data

Notes and in-class exercises

Download the .qmd file for this activity here. For this and all future activities, you will download the .qmd file, open it in RStudio, and use it to record your answers and code for the exercises, along with your reflection on the in-class activity. A rendered (“printed”) version of the activity, with solutions, will always be available on this course page.

Notes

Welcome to our first in-class activity! Today we will collect and summarize some data. Our goals are to get to know the people in this class and to start working with data.

By the end of this lesson, you should be able to:

  • Define cases and variables
  • Apply the 5 W’s + H (who, what, when, where, why, and how) to data collection

Links to related reading(s):

This activity is structured a bit differently from the activities for the remainder of the class. In future activities, this Notes section will typically contain a mini-lecture, review material, or guided examples, followed by exercises.

For today, you’ll have some steps to follow for an interactive, tactile activity at your tables before working through the exercises below together.

Step 1

At your table, write a 1-2 word answer to each of the following 8 prompts, each on a separate post-it note (do this individually). If a prompt contains two questions, write your answer to the first question on the left-hand side of the post-it note, and your answer to the second question on the bottom of the post-it note.

  1. How many hours of sleep did you get last night? How many cups of coffee did you drink this morning?

  2. What is your declared or potential major? (If you are a double major, just pick whichever one you think of first.)

  3. What is your class year? (first year, sophomore, junior, senior)

  4. How many stats courses have you taken in the past?

  5. On a scale of 1 (get me out of here) to 10 (yay!), how excited are you about this course?

  6. Is it your birthday this semester? (yes/no)

  7. How many unread emails do you have in your inbox right now?

  8. Have you used R/RStudio a lot, a little, or never?

Step 2

Designate one person from your table to distribute your group’s answers to the questions to the corresponding “station” around the classroom. Each table should have a number on it, corresponding to the “station”/question.

Step 3

Fill out an electronic version of these questions. We’ll come back to this in a future class. Wait until everyone in your group is done with this before moving on to the exercises.

Exercises

You’ll now answer a few questions related to the data collected at your group’s table. Data is anything that contains information.

Exercise 1

With your group, in no more than two sentences per question, respond to the questions posed by the 5 W’s + H for the data at your group’s table. If you need a refresher on the 5 W’s + H, check out the related readings posted at the top of this activity!

Who

Response: Type your response here.

What

Response: Type your response here.

When

Response: Type your response here.

Where

Response: Type your response here.

Why

Response: Type your response here.

How

Response: Type your response here.

Exercise 2

Move the post-it notes around to construct a visualization of the responses at your station.

Exercise 3

Calculate at least one numerical summary of the post-it note responses at your station, and record your numerical summary in the response text below. Write a complete sentence, not just the numerical summary alone! This is good practice for summarizing data in more formal writing.

Response: Type your response here.
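
If you would like to double-check a hand calculation in R later, a minimal sketch might look like the following. The values in `sleep_hours` and `class_year` are made up for illustration; they are not data from this class, and the object names are hypothetical.

```r
# Hypothetical post-it responses (hours of sleep); NOT real class data
sleep_hours <- c(7, 6.5, 8, 5, 7.5, 6)

# Common numerical summaries for a quantitative question
mean(sleep_hours)    # average
median(sleep_hours)  # middle value
range(sleep_hours)   # smallest and largest values

# For a categorical question (e.g., class year), counts are a natural summary
class_year <- c("first year", "sophomore", "first year",
                "junior", "first year", "senior")
table(class_year)    # number of responses in each category
```

Which summary is most useful depends on the question at your station: averages and medians suit numeric answers, while counts suit answers that fall into categories.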

Exercise 4

In no more than three sentences, describe what you learn from the visual and numerical summaries. Try to write your description in a way that tells an interesting story about the people in this class.

Response: Type your response here.

Exercise 5

Imagine that you took all the post-it notes in this room and organized them into a spreadsheet. What would each row in the underlying data set represent? What would each column of the data set represent? Check out the first related reading, linked at the top of this activity, if you need assistance!

Response: Type your response here.
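
Once you have written your own answer, one way to picture such a spreadsheet in R is as a data frame. The sketch below uses made-up values for three imaginary students and hypothetical column names; it is only an illustration, not the data set we will actually use.

```r
# Hypothetical responses from three students; NOT real class data
responses <- data.frame(
  sleep_hours   = c(7, 6.5, 8),
  class_year    = c("first year", "junior", "sophomore"),
  used_r_before = c("never", "a little", "a lot")
)

nrow(responses)   # number of rows in the data set
names(responses)  # names of the columns
str(responses)    # compact overview: one line per column
```

Compare what `nrow()` and `names()` report with the cases and variables you identified in your response.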

Reflection

In three to four sentences, reflect upon today’s activity. Some reflection prompts are found below:

  • Do you think the context in which the data was collected influenced the results you found?
  • Who would the numerical summaries you calculated potentially be representative of? Could you generalize what you learned to a broader population (all students at Mac, perhaps), or would you have ethical concerns about generalizing your results?
  • What (if anything) surprised you about the numerical summaries calculated by your group, or other groups?
  • Did your data visualization help you better understand the data, or discover things about it that would otherwise have been difficult to see? Why or why not?

Response: Type your response here.