📊 Linear Regression Calculator

Calculate line of best fit, R-squared, and make predictions

📚 Understanding Linear Regression

What is Linear Regression?

Linear regression is a statistical method for modeling the relationship between a dependent variable (Y) and an independent variable (X) by fitting a linear equation to observed data. The method finds the "line of best fit": the line that minimizes the sum of squared vertical distances (residuals) between the observed Y values and the values the line predicts.

The Regression Equation

Standard Form

y = mx + b

m: Slope (rate of change of Y with respect to X)
b: Y-intercept (value of Y when X = 0)

How to Calculate the Slope (m)

Slope Formula

m = Σ((xᵢ - x̄)(yᵢ - ȳ)) / Σ(xᵢ - x̄)²

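Where x̄ is the mean of the X values and ȳ is the mean of the Y values. The slope is the predicted change in Y for each one-unit increase in X. The formula is undefined when all X values are identical, because the denominator is then zero.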
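
The slope formula can be sketched in Python as follows. This is a minimal illustration, not the calculator's own code; the function name slope and the sample data are assumptions made for the example.

```python
def slope(xs, ys):
    """Least-squares slope: sum((x - x_bar)(y - y_bar)) / sum((x - x_bar)^2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    if denominator == 0:
        raise ValueError("all X values are identical; slope is undefined")
    return numerator / denominator


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(slope(xs, ys))  # 0.6
```

Here x̄ = 3 and ȳ = 4, the numerator is 6 and the denominator is 10, so m = 0.6.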

How to Calculate the Y-Intercept (b)

Y-Intercept Formula

b = ȳ - m × x̄

The Y-intercept is calculated after finding the slope, ensuring the regression line passes through the point (x̄, ȳ).
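
A short Python sketch of the two steps together, assuming the same hypothetical sample data as a running example. The function name fit_line is illustrative; the final check confirms that the fitted line passes through (x̄, ȳ).

```python
import math


def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are identical; no unique line exists")
    m = sxy / sxx
    b = y_bar - m * x_bar  # intercept is computed from the slope
    return m, b


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = fit_line(xs, ys)
print(round(m, 4), round(b, 4))  # 0.6 2.2

# The line evaluated at x_bar returns y_bar (up to floating-point error).
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
print(math.isclose(m * x_bar + b, y_bar))  # True
```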

Understanding R² (Coefficient of Determination)

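R² measures how well the regression line fits the data. For a least-squares line with an intercept, it ranges from 0 to 1 and represents the proportion of the variance in Y that is explained by X. The bands below are rough rules of thumb, not strict thresholds: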

R² = 1.0: Perfect fit - all points lie exactly on the line
R² = 0.8-1.0: Very strong relationship
R² = 0.6-0.8: Strong relationship
R² = 0.4-0.6: Moderate relationship
R² = 0.2-0.4: Weak relationship
R² < 0.2: Very weak or no relationship

For example, R² = 0.75 means 75% of the variation in Y is explained by the linear relationship with X, while the remaining 25% is left unexplained (other factors or random noise).
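
One common way to compute R² is 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below assumes that definition; the function name r_squared, the sample data, and the coefficients m = 0.6 and b = 2.2 (the least-squares fit for that sample) are all illustrative.

```python
def r_squared(xs, ys, m, b):
    """R^2 = 1 - SS_res / SS_tot for the line y = m*x + b."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                    # total
    if ss_tot == 0:
        raise ValueError("all Y values are identical; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical sample data and its least-squares coefficients.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(r_squared(xs, ys, m=0.6, b=2.2), 4))  # 0.6
```

For this sample the residual sum of squares is 2.4 and the total sum of squares is 6, so R² = 0.6: the line explains 60% of the variation in Y.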

Correlation Coefficient (r)

The correlation coefficient measures the strength and direction of the linear relationship. It ranges from -1 to +1 and always has the same sign as the slope:

r = +1: Perfect positive correlation
r = 0.7 to 1: Strong positive correlation
r = 0.3 to 0.7: Moderate positive correlation
r = 0 to 0.3: Weak positive correlation
r = 0: No linear correlation
r = 0 to -0.3: Weak negative correlation
r = -0.3 to -0.7: Moderate negative correlation
r = -0.7 to -1: Strong negative correlation
r = -1: Perfect negative correlation
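
A minimal Python sketch of the Pearson correlation coefficient, assuming the standard formula r = Σ((xᵢ - x̄)(yᵢ - ȳ)) / √(Σ(xᵢ - x̄)² × Σ(yᵢ - ȳ)²). The function name correlation and the sample data are illustrative. With a single X variable, r² equals R².

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when X or Y has zero variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical sample data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))       # 0.7746
print(round(r ** 2, 4))  # 0.6
```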

Real-World Applications

Business: Sales forecasting, revenue prediction, market analysis
Economics: Analyzing relationships between economic indicators
Science: Studying relationships between variables in experiments
Healthcare: Predicting patient outcomes based on treatment data
Education: Analyzing test scores and study time relationships
Real Estate: Predicting house prices based on features
Finance: Portfolio analysis and risk assessment

Assumptions of Linear Regression

Linearity: The relationship between X and Y is linear
Independence: Observations are independent of each other
Homoscedasticity: Variance of residuals is constant across all values of X
Normality: Residuals are normally distributed
No influential outliers: Extreme values can significantly affect the line

Interpreting the Results

Positive slope: Y increases as X increases
Negative slope: Y decreases as X increases
Slope magnitude: Larger absolute values indicate a steeper line (Y changes more per unit of X); the value depends on the units of X and Y
Y-intercept: Starting value of Y when X is zero (may not always be meaningful)

Frequently Asked Questions

What's the difference between correlation and regression?

Correlation measures the strength and direction of a relationship between two variables (r), while regression creates a predictive equation (y = mx + b) that can be used to estimate Y values from X values. Correlation tells you "how related," regression tells you "how to predict."

How many data points do I need?

You need at least 2 data points with different X values to define a line, but two points always lie exactly on the fitted line, so the fit tells you nothing about how reliable the relationship is. Generally, 10-30 data points give reasonable results. More data points help identify the true relationship and reduce the impact of outliers.

What does a negative slope mean?

A negative slope means there's an inverse relationship: as X increases, Y decreases. For example, in a study of car age vs. value, the slope would be negative because older cars typically have lower values.

Can I use linear regression for any data?

Linear regression works best when the relationship between variables is approximately linear. If your data shows a curved pattern, exponential growth, or other non-linear relationships, you may need polynomial regression or other methods. Always visualize your data first.
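
When plotting is not at hand, the residuals (observed Y minus predicted Y) can expose a curved pattern. The sketch below fits a line to deliberately curved data, y = x², which is an assumption chosen for illustration, as is the helper name fit_line.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar


# Hypothetical curved data: y = x^2.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

m, b = fit_line(xs, ys)
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
print(m, b)       # 6.0 -7.0
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]

# R^2 is still high even though a straight line is the wrong model.
y_bar = sum(ys) / len(ys)
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((y - y_bar) ** 2 for y in ys)
print(round(1 - ss_res / ss_tot, 4))  # 0.9626
```

The residuals are positive at both ends and negative in the middle. That systematic U shape signals a non-linear relationship, even though R² is about 0.96.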

What's a good R² value?

It depends on your field. In physical sciences, R² > 0.9 is often expected. In social sciences, R² > 0.5 can be considered good. In business, R² > 0.7 is typically strong. The key is whether the model is useful for your specific purpose, not just the R² value alone.

How do outliers affect linear regression?

Outliers can significantly affect the regression line because the method minimizes squared residuals, so a point far from the line counts disproportionately. A single extreme point can pull the line away from the majority of data. Always check for outliers and consider whether they should be included or investigated separately.
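
A small Python sketch of this effect, using made-up data: five points that lie exactly on y = 2x, then the same points plus one extreme value. The function name slope and all data values are assumptions for illustration.

```python
def slope(xs, ys):
    """Least-squares slope of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data lying exactly on y = 2x.
clean_x = [1, 2, 3, 4, 5]
clean_y = [2, 4, 6, 8, 10]

# Same data plus one extreme point; a point on the trend would be (6, 12).
outlier_x = clean_x + [6]
outlier_y = clean_y + [40]

print(round(slope(clean_x, clean_y), 4))      # 2.0
print(round(slope(outlier_x, outlier_y), 4))  # 6.0
```

One added point triples the estimated slope in this example, which is why outliers deserve a separate look before the fit is trusted.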

Can I extrapolate beyond my data range?

Extrapolation (predicting outside your data range) is risky because the linear relationship may not hold beyond observed values. Interpolation (predicting within your data range) is generally safer. Use extrapolation cautiously and only when you have good reason to believe the relationship continues.
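
One way to keep this distinction visible in code is to have the prediction step report whether the requested X lies outside the observed range. The sketch below is an assumption about how that could look, not a description of the calculator; predict, its parameters, and the coefficients are hypothetical.

```python
def predict(x, m, b, x_min, x_max):
    """Return (predicted_y, is_extrapolation) for the line y = m*x + b.

    x_min and x_max are the smallest and largest X values in the data
    the line was fitted to.
    """
    y_hat = m * x + b
    is_extrapolation = not (x_min <= x <= x_max)
    return y_hat, is_extrapolation


# Hypothetical fit: y = 0.6x + 2.2, from data with X between 1 and 5.
y_in, flag_in = predict(3, m=0.6, b=2.2, x_min=1, x_max=5)
y_out, flag_out = predict(10, m=0.6, b=2.2, x_min=1, x_max=5)
print(round(y_in, 4), flag_in)    # 4.0 False
print(round(y_out, 4), flag_out)  # 8.2 True
```

The second prediction is still computed, but the flag marks it as an extrapolation so the caller can treat it with more caution.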
