Correlation and Regression
Published on October 22, 2025 • By EduResHub Team
Simple Correlation and Regression Analysis
Step 1: Enter the data
x <- c(0, 15, 45, 60, 75, 90, 105, 120)
y <- c(3.3, 3.5, 4.0, 4.2, 4.6, 5.0, 5.3, 5.8)
dat <- data.frame(x, y)
dat
  x   y
1 0 3.3
2 15 3.5
3 45 4.0
4 60 4.2
5 75 4.6
6 90 5.0
7 105 5.3
8 120 5.8
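Before any analysis, it is worth confirming the data were entered correctly. A quick optional check, reusing the dat object created above:

# Optional: quick checks that the data were entered correctly
str(dat)      # 8 observations of two numeric variables, x and y
summary(dat)  # minimum, quartiles, mean, and maximum of each variable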
Step 2: Correlation analysis
# Correlation coefficient (Pearson)
r <- cor(x, y)
r
[1] 0.9901078

# Correlation test (shows r, p-value, and confidence interval)
cor.test(x, y)
Pearson's product-moment correlation
data: x and y
t = 17.285, df = 6, p-value = 2.402e-06
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
0.9442173 0.9982792
sample estimates:
cor
0.9901078
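To see what cor() is computing, r can also be obtained directly from its definition: the covariance divided by the product of the standard deviations. A minimal sketch, reusing x and y from Step 1:

# Pearson's r from its definition
cov(x, y) / (sd(x) * sd(y))   # should match r, about 0.9901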
Step 3: Simple linear regression
# Fit the model
model <- lm(y ~ x)
# Summary of the regression model
summary(model)
Call:
lm(formula = y ~ x)
Residuals:
Min 1Q Median 3Q Max
-0.18559 -0.08176 -0.00473 0.06430 0.18378
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.154955 0.088995 35.45 3.36e-08 ***
x 0.020511 0.001187 17.29 2.40e-06 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.1326 on 6 degrees of freedom
Multiple R-squared: 0.9803, Adjusted R-squared: 0.977
F-statistic: 298.8 on 1 and 6 DF,  p-value: 2.402e-06
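This output is closely linked to the correlation from Step 2: in simple linear regression the slope equals r × sd(y)/sd(x), and Multiple R-squared equals r². A quick optional check, reusing r and model from the steps above:

# How the regression output relates to the correlation coefficient
r * sd(y) / sd(x)   # equals the slope estimate (about 0.0205)
r^2                 # equals Multiple R-squared (about 0.9803)
coef(model)         # intercept and slope of the fitted line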
Step 4: ANOVA table
anova(model)
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
x 1 5.2533 5.2533 298.78 2.402e-06 ***
Residuals 6 0.1055 0.0176
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
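R-squared can also be read straight off this ANOVA table: it is the regression sum of squares divided by the total sum of squares. A short optional check, reusing model:

# R-squared from the ANOVA sums of squares
ss <- anova(model)[["Sum Sq"]]   # regression SS and residual SS
ss[1] / sum(ss)                  # 5.2533 / (5.2533 + 0.1055), about 0.9803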
Step 5: 95% Confidence intervals for coefficients
confint(model, level = 0.95)
                 2.5 %     97.5 %
(Intercept) 2.93719242 3.37271749
x           0.01760701 0.02341401
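These intervals come from estimate ± t × standard error, with the t critical value taken at 6 residual degrees of freedom. A sketch reproducing the slope interval by hand, using the estimate and standard error printed by summary(model):

# 95% CI for the slope by hand: estimate +/- t * SE with df = 6
t_crit <- qt(0.975, df = 6)
0.020511 + c(-1, 1) * t_crit * 0.001187   # about 0.0176 to 0.0234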
Step 6: Fitted values and residuals
fitted_values <- fitted(model)
residuals_values <- resid(model)
# Add them to the dataset
dat$fitted <- fitted_values
dat$resid <- residuals_values
dat
  x   y   fitted        resid
1 0 3.3 3.154955 0.1450450450
2 15 3.5 3.462613 0.0373873874
3 45 4.0 4.077928 -0.0779279279
4 60 4.2 4.385586 -0.1855855856
5 75 4.6 4.693243 -0.0932432432
6 90 5.0 5.000901 -0.0009009009
7 105 5.3 5.308559 -0.0085585586
8 120 5.8 5.616216 0.1837837838
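Two quick checks on this table: the residuals of a least-squares fit with an intercept sum to (essentially) zero, and each fitted value is just the regression equation applied to x. A minimal sketch, reusing model and dat:

# Residuals sum to approximately zero (up to rounding error)
sum(dat$resid)
# Fitted values reproduced from the estimated coefficients
coef(model)[1] + coef(model)[2] * dat$x   # matches dat$fitted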
Step 7: Predict new values (optional)
# Example: predict Y for new X values
new_x <- data.frame(x = c(0, 30, 60, 90, 120))
predict(model, newdata = new_x, interval = "confidence", level = 0.95)
fit lwr upr
1 3.154955 2.937192 3.372717
2 3.770270 3.619400 3.921141
3 4.385586 4.270356 4.500815
4 5.000901 4.863176 5.138626
5 5.616216 5.416634 5.815799
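These are confidence intervals for the mean of Y at each new x. To bound a single new observation instead, predict() can also return wider prediction intervals; a sketch using the same new_x:

# Prediction intervals for individual new observations
# (wider than the confidence intervals for the mean response)
predict(model, newdata = new_x, interval = "prediction", level = 0.95)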
Step 8: Scatter plot with regression line
# ===============================
# Step 8: Scatter plot with regression line and equation
# ===============================
# Scatter plot of the data
plot(x, y,
     xlab = "X values",
     ylab = "Y values",
     main = "Scatter Plot with Regression Line and Equation",
     pch = 19)

# Add the regression line
abline(model, lwd = 2, col = "blue")

# Get coefficients and R² for labeling
b0 <- round(coef(model)[1], 3)
b1 <- round(coef(model)[2], 3)
R2 <- round(summary(model)$r.squared, 4)

# Create equation and R² text
eq_text <- bquote(hat(y) == .(b0) + .(b1)*x)
r2_text <- bquote(R^2 == .(R2))

# Add the equation and R² to the plot
legend("top",
       legend = c(as.expression(eq_text), as.expression(r2_text)),
       bty = "n",
       text.col = "black")
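Optionally, the uncertainty in the fitted line can be shown as a 95% confidence band around it. A sketch that overlays the band on the plot above; x_grid and ci are illustrative names for a grid of x values and the predicted limits:

# Optional: add a 95% confidence band for the mean response
x_grid <- data.frame(x = seq(min(x), max(x), length.out = 100))
ci <- predict(model, newdata = x_grid, interval = "confidence")
lines(x_grid$x, ci[, "lwr"], lty = 2, col = "blue")   # lower limit
lines(x_grid$x, ci[, "upr"], lty = 2, col = "blue")   # upper limit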

Step 9: Interpretation (for students)
- Correlation (r) shows the strength and direction of the relationship between X and Y.
- The regression equation gives a prediction line: Ŷ = b0 + b1X (here, Ŷ ≈ 3.155 + 0.0205X).
- R-squared shows how much of the variation in Y is explained by X.
- ANOVA and the p-values test whether the relationship is statistically significant.
- Residuals show how far the observed values are from the predicted line (see the diagnostic sketch after this list).
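A natural follow-up to the last point is to look at R's standard regression diagnostic plots, which help judge whether the linearity and constant-variance assumptions are reasonable. A minimal sketch, reusing model:

# Standard diagnostic plots for the fitted model
par(mfrow = c(2, 2))   # arrange the four plots in a 2 x 2 grid
plot(model)            # residuals vs fitted, Q-Q, scale-location, leverage
par(mfrow = c(1, 1))   # restore the default layout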