Linear Regression

For this project, we used linear regression to model housing prices in San Luis Obispo (SLO), California. We built the dataset ourselves by randomly sampling Zillow listings for properties sold in SLO during a 90-day window from February 2022 through April 2022. For each property, we recorded the sold price, square footage, number of bedrooms, bathrooms, and parking spaces, the year the house was built, lot size, home type, and whether the property had a pool, cooling, or heating.

This project includes exploratory data analysis, variable transformations, and model-building techniques to find the model that best fits the data. Additionally, we performed statistical inference and model validation to answer key research questions and assess the overall model fit. This work was done in collaboration with Martin Hsu and Rachel Roggenkemper.

This work was done using R.
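
As a rough illustration of the workflow described above, here is a minimal sketch in base R. It is not the code from the final report: the data are simulated stand-ins, the column names (sold_price, sqft, beds, baths, pool) are hypothetical, and the log transformation of price and the 80/20 train/test split are assumptions chosen only to show the general shape of fitting and validating a linear model.

```r
# Simulated stand-in data. These values are NOT from the Zillow sample;
# all column names are hypothetical.
set.seed(1)
n <- 200
homes <- data.frame(
  sqft  = round(runif(n, 800, 4000)),
  beds  = sample(1:6, n, replace = TRUE),
  baths = sample(1:4, n, replace = TRUE),
  pool  = factor(sample(c("no", "yes"), n, replace = TRUE))
)

# Assumed price process (illustration only): roughly linear on the log scale.
homes$sold_price <- exp(
  11 + 0.0004 * homes$sqft + 0.05 * homes$baths +
    0.1 * (homes$pool == "yes") + rnorm(n, sd = 0.2)
)

# Hold out 20% of the rows for validation (assumed split).
test_idx <- sample(n, size = n %/% 5)
train <- homes[-test_idx, ]
test  <- homes[test_idx, ]

# Fit a linear model for log price on the training rows.
fit <- lm(log(sold_price) ~ sqft + beds + baths + pool, data = train)

# Coefficient estimates, standard errors, t-tests, and R-squared.
print(summary(fit))

# Validation: root mean squared error on the held-out rows, on the log scale.
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((log(test$sold_price) - pred)^2))
print(rmse)
```

In a sketch like this, calling plot(fit) would show the standard residual diagnostics, which is one common way to judge whether a transformation of the response is helping.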

Final Report