Calculate the line of best fit and R-squared, and make predictions
Linear regression is a statistical method for modeling the relationship between a dependent variable (Y) and an independent variable (X) by fitting a linear equation to observed data. The method finds the "line of best fit": the line that minimizes the sum of squared vertical distances (residuals) between the actual Y values and the values the line predicts.
The regression equation is y = mx + b, where:
m: Slope (rate of change of Y with respect to X)
b: Y-intercept (value of Y when X = 0)
The slope is m = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)², where x̄ is the mean of the X values and ȳ is the mean of the Y values. The slope represents how much Y changes for each unit change in X.
The Y-intercept is calculated after finding the slope: b = ȳ − m·x̄. This ensures the regression line passes through the point (x̄, ȳ).
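The slope and intercept calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name fit_line and the sample data are made up for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    # Sum of products of deviations (numerator of the slope)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    # Sum of squared X deviations (denominator of the slope)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    m = sxy / sxx
    # The intercept is chosen so the line passes through (x_mean, y_mean)
    b = y_mean - m * x_mean
    return m, b


# Illustrative data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = fit_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")
```

For this sample data the fitted line is y = 0.60x + 2.20.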
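R² measures how well the regression line fits the data. It ranges from 0 to 1 and represents the proportion of variance in Y that can be explained by X: R² = 1 − SS_res / SS_tot, where SS_res = Σ(y − ŷ)² is the sum of squared residuals (ŷ being the predicted value) and SS_tot = Σ(y − ȳ)² is the total sum of squares.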
For example, R² = 0.75 means 75% of the variation in Y can be explained by changes in X, while 25% is due to other factors.
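As a rough sketch of how R² can be computed once a line has been fitted, the snippet below compares the residual sum of squares with the total sum of squares. The function name r_squared, the sample data, and the slope and intercept values (assumed to come from an earlier least-squares fit of that same data) are illustrative assumptions.

```python
def r_squared(xs, ys, m, b):
    """Proportion of the variance in ys explained by the line y = m*x + b."""
    y_mean = sum(ys) / len(ys)
    # Variation left unexplained by the line
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    # Total variation of Y around its mean
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    if ss_tot == 0:
        raise ValueError("all Y values are identical; R² is undefined")
    return 1 - ss_res / ss_tot


# Illustrative data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
# Slope and intercept assumed to come from a least-squares fit of this data
r2 = r_squared(xs, ys, m=0.6, b=2.2)
print(f"R² = {r2:.2f}")
```

Here R² comes out to 0.60, meaning the line accounts for 60% of the variation in these Y values.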
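The correlation coefficient (r) measures the strength and direction of the linear relationship: r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² · Σ(y − ȳ)²). It ranges from −1 to 1, and in simple linear regression r² equals R².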
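The correlation coefficient can be sketched the same way. The example below uses the standard Pearson formula; the function name pearson_r and the sample data are illustrative assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Illustrative data, invented for this example
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(f"r = {r:.3f}, r squared = {r * r:.2f}")
```

The sign of r matches the sign of the slope, and squaring r gives the R² of the fitted line (0.60 for this data).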
Correlation measures the strength and direction of a relationship between two variables (r), while regression creates a predictive equation (y = mx + b) that can be used to estimate Y values from X values. Correlation tells you "how related," regression tells you "how to predict."
You need at least 2 data points to define a line, but a line through exactly 2 points fits them perfectly and says nothing about reliability. As a rough guide, 10 to 30 data points give reasonable results. More data points help identify the true relationship and reduce the impact of outliers.
A negative slope means there's an inverse relationship: as X increases, Y decreases. For example, in a study of car age vs. value, the slope would be negative because older cars typically have lower values.
Linear regression works best when the relationship between variables is approximately linear. If your data shows a curved pattern, exponential growth, or other non-linear relationships, you may need polynomial regression or other methods. Always visualize your data first.
What counts as a good R² depends on your field. In physical sciences, R² > 0.9 is often expected. In social sciences, R² > 0.5 can be considered good. In business, R² > 0.7 is typically strong. The key is whether the model is useful for your specific purpose, not just the R² value alone.
Outliers can significantly affect the regression line because the method minimizes squared distances. A single extreme point can pull the line away from the majority of data. Always check for outliers and consider whether they should be included or investigated separately.
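To see how strongly one extreme point can pull the line, the sketch below fits the same data with and without a single outlier. The helper fit_line and both data sets are invented for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_mean - m * x_mean


# Five points lying exactly on y = 2x (illustrative data)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
m_clean, b_clean = fit_line(xs, ys)

# The same points plus one extreme value at x = 6
m_out, b_out = fit_line(xs + [6], ys + [40])

print(f"without outlier: slope = {m_clean:.2f}")
print(f"with outlier:    slope = {m_out:.2f}")
```

In this example the single added point raises the slope from 2 to 6, even though the other five points are unchanged.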
Extrapolation (predicting outside your data range) is risky because the linear relationship may not hold beyond observed values. Interpolation (predicting within your data range) is generally safer. Use extrapolation cautiously and only when you have good reason to believe the relationship continues.
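One way to keep the distinction visible in practice is to have the prediction step report whether the requested X lies outside the observed range. The sketch below assumes a line has already been fitted; the function name predict, the coefficients, and the observed range are hypothetical.

```python
def predict(x, m, b, x_min, x_max):
    """Return (predicted_y, is_extrapolation) for the line y = m*x + b.

    x_min and x_max are the smallest and largest X values in the data
    the line was fitted to.
    """
    y = m * x + b
    is_extrapolation = not (x_min <= x <= x_max)
    return y, is_extrapolation


# Hypothetical fitted line and observed X range
m, b = 0.6, 2.2
x_min, x_max = 1, 5

for x in (3, 10):
    y, outside = predict(x, m, b, x_min, x_max)
    label = "extrapolation, treat with caution" if outside else "interpolation"
    print(f"x = {x}: y = {y:.2f} ({label})")
```

The prediction at x = 3 falls inside the observed range, while the one at x = 10 is flagged as an extrapolation.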