

The model predicts that 12.867% (cell P7) of the population will be below the poverty level when infant mortality is 7.0 (per 1,000 births), 70% of the population is white and crime is 400 (per 100,000 people). Note that this formula and the one in cell P11 are array formulas, and so you need to press Ctrl-Shft-Enter, even though they produce a single value. The calculations for the prediction interval are identical except that the standard error (cell Q11) is calculated by the formula The formulas used for the confidence interval are shown in column S of Figure 3.įigure 3 – Key formulas for the confidence interval Now we calculate the confidence and prediction intervals, as shown in range O3:Q13. We have also inserted the matrix ( X T X) -1 in range J6:M9, which we calculate using the Real Statistics formula =CORE(C4:E52), referencing the data in Figure 1. We have added the required data for which we want to calculate the confidence/prediction intervals in range O18:O22. Alternatively, we can use Excel’s Regression data analysis tool or, as we have done on the left side of Figure 2, by using the Real Statistics Linear Regression data analysis tool.įigure 2 – Calculation of Confidence and Prediction Intervals We next perform a regression analysis using columns C, D and E as the independent variables and column B as the dependent variable. Observation: Click here for proofs of the above formulas.Įxample 1: Find the 95% confidence and prediction intervals for Poverty where Infant Mortality is 7.0, White = 80 and Crime = 400 based on the data in Example 2 of Multiple Regression Analysis using Excel, which is reproduced in Figure 1 (in two blocks to fit better on the page). Here, the square-root term is called the standard error of the prediction. The 1 – α prediction interval of ŷ 0 is therefore


The prediction interval is calculated in a similar way, except that now the variance is the variance of the residual y − ŷ, which is Here t crit is the critical value of the t distribution with df Res = n − k − 1 degrees of freedom with significance level α/2, i.e. The 1 – α confidence interval for the true value of ŷ 0 is therefore If X 0 is the column array with values 1, x 01, x 02, …, x 0 k, then an unbiased estimate of the standard error of ŷ 0, called the standard error of the fit, is given by the formula Where the regression coefficients b j are based on the n × ( k +1) data array X (with ones in the first column).įor any specific values of the x j, say x 01, x 02, …, x 0 k, we have the predicted value Let’s assume that we have a regression line
