The purpose of Least Squares Regression is to find an appropriate linear model for two quantitative variables in order to make valid predictions. We have gathered data on the heights of fathers and sons, measured in inches. We would like to use the height of the father to predict the height of the son:
We are using Father’s Height as our explanatory variable and Son’s Height as our response variable. Please use software to answer the following questions.
- Calculate the summary statistics for the explanatory and response variables. That is, calculate the mean and standard deviation of each.
- Calculate the correlation r between the heights of Fathers and Sons. Interpret this correlation.
- Draw a scatter plot of the data collected and place the regression line on your graph.
- Compute the equation of the Least Squares Regression Line.
- Interpret the slope of the Least Squares Regression Line.
- I am 70 inches tall. If we use this equation to predict how tall my son will be, what is his predicted height?
- R² is a useful summary statistic for judging how well the regression line fits. It is the fraction of the variation in the response variable that is explained by the least squares regression of the response on the explanatory variable. Another way to assess fit is to make a Residual Plot: a scatter plot with the residuals on the y-axis and the explanatory variable on the x-axis. In a sense, we have laid the regression line horizontally at zero while keeping each point's vertical distance from the line the same. This plot is a good way to spot possible outliers or influential observations. Observations with unusually large residuals (in absolute value) are potential outliers. Make the Residual Plot and report R². Is the regression model a good fit?
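If your software is Python, the calculations in the questions above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `father` and `son` lists hold made-up placeholder values, not the data collected for this exercise, and the helper name `least_squares` is hypothetical. Replace the two lists with the actual measurements before reading anything into the output.

```python
from statistics import mean, stdev

# Hypothetical placeholder heights in inches, NOT the worksheet's data.
father = [65.0, 67.0, 68.0, 70.0, 72.0, 74.0]
son = [67.0, 68.0, 69.5, 70.0, 72.5, 73.0]


def least_squares(x, y):
    """Return (slope, intercept, r) for the least squares line of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (n - 1 divisor)
    # Correlation: sum of products of deviations, scaled by (n - 1) * s_x * s_y.
    r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (
        (n - 1) * s_x * s_y
    )
    slope = r * s_y / s_x                # b = r * s_y / s_x
    intercept = y_bar - slope * x_bar    # line passes through (x_bar, y_bar)
    return slope, intercept, r


slope, intercept, r = least_squares(father, son)

# Predicted son's height for a father who is 70 inches tall.
predicted = intercept + slope * 70

# Residual = observed response minus predicted response.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(father, son)]

# For a one-variable least squares line, R^2 is the square of r.
r_squared = r ** 2

print("father mean, sd:", mean(father), stdev(father))
print("son mean, sd:", mean(son), stdev(son))
print("r:", r)
print("line: son =", intercept, "+", slope, "* father")
print("prediction at 70 in:", predicted)
print("R^2:", r_squared)
```

For the scatter plot and the Residual Plot, plot `son` against `father` and `residuals` against `father` with whatever plotting tool you have available. The slope is computed as r times the ratio of the standard deviations, so it can be checked directly against the summary statistics from the first two questions.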