One Big Post to End Biostats Night School
Let's put it all together and do some work in R. Lots of code and videos coming your way!
Welcome to the final post of Biostats Night School! If you made it this far, congratulations. Over the past weeks, we’ve explored the fundamental tools of biostatistics, and now it’s time to bring it all together. In this post, we’ll walk through every major concept using a fictional dataset we created specifically for this class. Think of this as your grand tour through the world of biostatistics, one thoughtful step at a time.
Meet the Data
Our dataset includes 500 made-up individuals. Each person has information about their age, gender, race, ethnicity, BMI, blood pressure, cholesterol level, smoking status, disease diagnosis, treatment group assignment, and whether they were flagged as high risk. This isn’t real data, but it was designed to feel real: messy, imperfect, and full of stories waiting to be uncovered.
You can download the data (.csv file) here: http://bit.ly/3TErs7B
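# Load the tidyverse (readr, dplyr, ggplot2, ...) and janitor for name cleaning
library(tidyverse)
library(janitor)
# Read the CSV from the working directory and convert column names to snake_case
data <- read_csv("Biostatistics_Teaching_Dataset.csv") %>% clean_names()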
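If you want to see what clean_names() is doing before you run it on the real file, here is a minimal sketch on a tiny stand-in table. The column headers and values below are hypothetical, made up for illustration; the actual CSV's headers may look different. The sketch also counts missing values per column, a quick first check on a dataset described as messy.

```r
library(tibble)
library(janitor)

# Hypothetical raw headers and values; the real CSV may differ.
raw <- tibble(
  `Age` = c(34, 58, NA),
  `BMI` = c(22.1, 31.4, 27.8),
  `Blood Pressure` = c(118, 142, 130),
  `Smoking Status` = c("Never", "Current", NA)
)

# clean_names() lowercases headers and replaces spaces with underscores
cleaned <- clean_names(raw)
names(cleaned)
# "age" "bmi" "blood_pressure" "smoking_status"

# Count missing values in each column
missing_per_column <- colSums(is.na(cleaned))
missing_per_column
```

Once the real data is loaded, the same two checks, names(data) and colSums(is.na(data)), should show the actual column names and where the gaps are.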
If you are a paid subscriber, keep going to see more code and examples. Not a paid subscriber yet? Why not consider trying it out and checking out this and other lessons in the Public Health Night School?