
Summary of Regression Analysis

by 재르미온느 2024. 6. 7.

Matrix approach to regression analysis

1. Random vectors and matrices

- Mean vector

- Covariance matrix : a symmetric matrix

 

[Basic theorems]

w = Ay

- A : constant matrix

- y : random vector

(1) E(w) = E(Ay) = A·E(y)

(2) Cov(w) = A·Cov(y)·A^T
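These two identities can be checked numerically by Monte Carlo simulation. The matrix A, mean vector, and covariance matrix below are hypothetical values chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: a 2x3 constant matrix A and a 3-dim random vector y
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
mu = np.array([1.0, 2.0, 3.0])        # E(y)
Sigma = np.array([[2.0, 0.5, 0.0],    # Cov(y): symmetric, positive definite
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])

# Simulate many draws of y and transform each to w = Ay
y = rng.multivariate_normal(mu, Sigma, size=200_000)
w = y @ A.T

# Empirical moments of w should approach A·E(y) and A·Cov(y)·A^T
print(np.abs(w.mean(axis=0) - A @ mu).max())
print(np.abs(np.cov(w.T) - A @ Sigma @ A.T).max())
```

With 200,000 draws, both maximum deviations shrink toward zero, as the theorems predict.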

 

2. Simple linear regression model in matrix terms

y = Xβ + ε

- y : response vector, X : design matrix, β : coefficient vector, ε : error vector

- E(ε) = 0

- Cov(ε) = σ^2·I

 

ε ~ MVN(0, σ^2·I)

 

3. LSE of β

β̂ = (X^T X)^(-1) X^T y, provided (X^T X)^(-1) exists
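A minimal sketch of the normal-equations formula on simulated data (the model y = 3 + 2x + ε and all sample sizes are hypothetical), cross-checked against NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data generated from y = 3 + 2x + error
n = 50
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])   # design matrix: intercept column + x
y = 3.0 + 2.0 * x + rng.normal(0.0, 1.0, n)

# Normal equations: beta_hat = (X^T X)^(-1) X^T y
# (solve the linear system rather than inverting X^T X explicitly)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's built-in least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
print(np.allclose(beta_hat, beta_lstsq))
```

Using `np.linalg.solve` instead of forming the inverse is the numerically preferred way to evaluate (X^T X)^(-1) X^T y.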

 

4. Fitted values and residuals

H = X(X^T X)^(-1) X^T   : hat matrix

- Symmetric

- idempotent

(1) Fitted values : ŷ = Hy

(2) Residuals : e = (I − H)y

* Unbiased estimator of Cov(e) : s^2{e} = MSE·(I − H)
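The hat-matrix properties and the fitted/residual decomposition can be verified directly on a small hypothetical dataset:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical design matrix and response
n = 30
X = np.column_stack([np.ones(n), rng.uniform(0, 5, n)])
y = 1.0 + 0.5 * X[:, 1] + rng.normal(0.0, 0.3, n)

# Hat matrix H = X (X^T X)^(-1) X^T
H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(H, H.T))     # symmetric: H = H^T
print(np.allclose(H @ H, H))   # idempotent: HH = H

y_hat = H @ y                  # fitted values: y_hat = Hy
e = (np.eye(n) - H) @ y        # residuals: e = (I - H)y
print(np.allclose(y_hat + e, y))
```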

 

5. ANOVA

(1) SSTO = y^T (I − (1/n)J) y, where J is the n×n matrix of ones

(2) SSE = y^T (I − H) y

(3) SSR = SSTO − SSE = y^T (H − (1/n)J) y
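Since (I − (1/n)J) = (I − H) + (H − (1/n)J), the three quadratic forms decompose exactly. A quick check on hypothetical data:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical simple-regression data
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 2.0 + 1.5 * X[:, 1] + rng.normal(0.0, 1.0, n)

I = np.eye(n)
J = np.ones((n, n))                     # n x n matrix of ones
H = X @ np.linalg.inv(X.T @ X) @ X.T    # hat matrix

SSTO = y @ (I - J / n) @ y
SSE = y @ (I - H) @ y
SSR = y @ (H - J / n) @ y

print(np.isclose(SSTO, SSR + SSE))      # the quadratic forms decompose exactly
```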

 

6. Inferences

(1) Cov(β̂) = σ^2 (X^T X)^(-1)

- s^2{β̂} = MSE·(X^T X)^(-1)
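The covariance formula can be illustrated by simulation: refitting on many responses drawn from a fixed design with known σ^2 (all values below are hypothetical), the empirical covariance of the estimates should approach σ^2 (X^T X)^(-1):

```python
import numpy as np

rng = np.random.default_rng(4)

# Fixed hypothetical design, known error standard deviation sigma
n, sigma = 25, 2.0
X = np.column_stack([np.ones(n), rng.uniform(0, 10, n)])
beta = np.array([1.0, 0.5])
theory = sigma**2 * np.linalg.inv(X.T @ X)   # Cov(beta_hat) = sigma^2 (X^T X)^(-1)

# Refit on many simulated responses; the sampling covariance of the
# estimates should match the theoretical covariance matrix
fits = np.empty((20_000, 2))
for i in range(fits.shape[0]):
    y = X @ beta + rng.normal(0.0, sigma, n)
    fits[i] = np.linalg.solve(X.T @ X, X.T @ y)
empirical = np.cov(fits.T)

print(np.abs(empirical - theory).max())
```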

 

 

Multiple linear regression - I

<Analysis of Variance>

- SSTO = y^T y − y^T J y / n

- SSE = y^T y − β̂^T X^T y

- SSR = β̂^T X^T y − y^T J y / n
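These computational forms agree with the direct residual-based definitions; a check on hypothetical two-predictor data:

```python
import numpy as np

rng = np.random.default_rng(9)

# Hypothetical data with two predictors
n = 35
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(0.0, 1.0, n)

b = np.linalg.solve(X.T @ X, X.T @ y)   # beta_hat
J = np.ones((n, n))                     # n x n matrix of ones

SSTO = y @ y - y @ J @ y / n
SSE = y @ y - b @ (X.T @ y)
SSR = b @ (X.T @ y) - y @ J @ y / n

e = y - X @ b
print(np.isclose(SSE, e @ e))           # SSE matches the residual sum of squares
print(np.isclose(SSTO, SSR + SSE))
```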

 

 

Multiple linear regression - II

SSTO = SSR + SSE, and the decomposition holds for any set of predictors: SSTO = SSR(x1,x2) + SSE(x1,x2) = SSR(x1) + SSE(x1) = SSR(x2) + SSE(x2)

 

SSR(x2|x1) = SSR(x1,x2) - SSR(x1) = SSE(x1) - SSE(x1,x2)

 

SSR(x3|x1,x2) = SSR(x1,x2,x3) - SSR(x1,x2) = SSE(x1,x2)-SSE(x1,x2,x3)

SSR(x2,x3|x1) = SSR(x1,x2,x3) - SSR(x1) = SSE(x1) - SSE(x1,x2,x3)

 

----

SSTO = SSR(x1) + SSE(x1) = SSR(x1) + SSR(x2|x1) + SSE(x1,x2)

SSR(x1,x2) = SSTO - SSE(x1,x2)

SSR(x1,x2) = SSR(x1)+SSR(x2|x1)
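The extra-sum-of-squares identities above can be confirmed numerically. The data below are hypothetical; `sse` is a small helper fitting OLS by least squares:

```python
import numpy as np

rng = np.random.default_rng(5)

def sse(X, y):
    """Residual sum of squares of an OLS fit of y on X (intercept included in X)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

# Hypothetical data where both predictors carry signal
n = 60
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(0.0, 1.0, n)

ones = np.ones(n)
SSTO = sse(ones[:, None], y)            # intercept-only model: SSE = SSTO
SSE_1 = sse(np.column_stack([ones, x1]), y)
SSE_12 = sse(np.column_stack([ones, x1, x2]), y)

SSR_1 = SSTO - SSE_1
SSR_12 = SSTO - SSE_12
SSR_2g1 = SSR_12 - SSR_1                # SSR(x2|x1)

print(np.isclose(SSR_2g1, SSE_1 - SSE_12))
```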

 

 

<Testing on regression coefficients>

1. Are all β coefficients zero? (overall F-test)

2. Is a particular βk zero? (t-test)

3. Is a specific subset of the β coefficients zero? (partial F-test)

 

<Coefficients of partial determination (r^2)>

r^2_{y1.2} = {SSE(x2) − SSE(x1,x2)} / SSE(x2) = SSR(x1|x2) / SSE(x2)

r^2_{y2.1} = {SSE(x1) − SSE(x1,x2)} / SSE(x1) = SSR(x2|x1) / SSE(x1)

 

r^2_{y4.123} = SSR(x4|x1,x2,x3) / SSE(x1,x2,x3)
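A short sketch computing one of these coefficients on hypothetical data, reading it as the fraction of the reduced model's SSE removed by adding the extra predictor:

```python
import numpy as np

rng = np.random.default_rng(6)

def sse(X, y):
    """Residual sum of squares of an OLS fit of y on X (intercept included in X)."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

# Hypothetical data where both predictors carry signal
n = 80
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 + 1.0 * x2 + rng.normal(0.0, 1.0, n)

ones = np.ones(n)
SSE_2 = sse(np.column_stack([ones, x2]), y)
SSE_12 = sse(np.column_stack([ones, x1, x2]), y)

# r^2_{y1.2}: fraction of SSE(x2) explained by additionally including x1
r2_y1_2 = (SSE_2 - SSE_12) / SSE_2
print(r2_y1_2)
```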

 

Variable selection techniques

1. All possible regression procedure

(1) R^2_p (SSE_p) criterion

(2) R^2_{a,p} (MSE_p) criterion

(3) Mallows' Cp criterion

- Cp should be small and close to p.
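A sketch of the all-possible-regressions procedure with the Cp criterion, on hypothetical data where only two of three candidate predictors matter. Here σ^2 is estimated by the full model's MSE, so by construction the full model's Cp equals p exactly:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(7)

# Hypothetical data: 3 candidate predictors, only the first two matter
n, P = 100, 3
Xc = rng.normal(size=(n, P))
y = 1.0 + 2.0 * Xc[:, 0] - 1.0 * Xc[:, 1] + rng.normal(0.0, 1.0, n)

def sse(cols):
    """SSE of the OLS fit using the predictors indexed by cols."""
    X = np.column_stack([np.ones(n)] + [Xc[:, c] for c in cols])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

# sigma^2 is estimated by the MSE of the full model
mse_full = sse(range(P)) / (n - P - 1)

# Cp = SSE_p / MSE_full - (n - 2p), p = number of parameters (incl. intercept)
for k in range(1, P + 1):
    for cols in combinations(range(P), k):
        p = k + 1
        cp = sse(cols) / mse_full - (n - 2 * p)
        print(cols, round(cp, 2))
```

Subsets that omit a relevant predictor show Cp far above p, while the true subset (0, 1) lands near p = 3.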

 

2. Stepwise regression methods

(1) Forward selection

(2) Backward elimination

(3) Forward stepwise regression
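The forward-selection idea can be sketched as a greedy loop: at each step, add the candidate giving the largest partial F statistic and stop once it falls below a threshold F_in. All data, the threshold value, and the helper names below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(8)

# Hypothetical data: 4 candidates, only the first two are true predictors
n = 120
Xc = rng.normal(size=(n, 4))
y = 3.0 * Xc[:, 0] + 1.5 * Xc[:, 1] + rng.normal(0.0, 1.0, n)

def sse(cols):
    """SSE of the OLS fit using the predictors indexed by cols."""
    X = np.column_stack([np.ones(n)] + [Xc[:, c] for c in cols])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ b
    return r @ r

# Greedy forward selection: add the predictor that most reduces SSE,
# stop when its partial F statistic drops below F_in
F_in = 4.0
selected = []
remaining = [0, 1, 2, 3]
while remaining:
    best = min(remaining, key=lambda c: sse(selected + [c]))
    p_new = len(selected) + 2   # parameters after adding (intercept included)
    F = (sse(selected) - sse(selected + [best])) / (sse(selected + [best]) / (n - p_new))
    if F < F_in:
        break
    selected.append(best)
    remaining.remove(best)

print(selected)
```

With this strong signal, the two true predictors enter first (the larger coefficient earliest), and the noise predictors are typically rejected by the F_in threshold.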