# Linear Regression in 3 Forms

Linear regression can be implemented a few different ways that can lead to the trade-off of accuracy versus speed. Rereading this post by Steven G. Johnson on the Julia Discourse site will give you summary

Here is a screenshot if you don't want to link out.

### Well and ill conditioned

For reference to what conditioning is and what *well* and *ill* condition mean see this wikipedia article. If you have access to the book *Numerical Linear Algebra, *Trefethen and Bau, you'll find section III useful (also in the screenshot Johnson refers to this book in terms of QR factorization as the basis for how Julia handles `A \ b`

, which can be found in section II).

## Some Code

The three forms of linear regression is a good thing to remember if you are implementing your own function. Julia makes this easy. Here are the the three forms.

`using LinearAlgebra, BenchmarkTools`

`#set up some data to regress on`

`N = 10000`

`xn = rand(N);`

`Xp = [ones(N) xn];`

`yp = (10 .+ xn .* 0.3);`

`(sum(Xp), sum(yp))`

### Fast but not accurate

`# normal equations approach`

`#fastest but least accurate; doubles the number of digits you lose to rounding.`

`#only use when you know you have well-conditioned matrices`

`function linreg1(X, y)`

` #β_hat`

` return (X' * X) \ (X' * y)`

`end`

### Not the fastest and not the least accurate

`#QR factorization`

`#slower than linreg1 but uses pivoted QR factorization.`

`#more accurate for badly conditioned matrices than "normal equations" approach`

`#it does not square the condition number`

`function linreg2(X, y)`

` #β_hat`

` return X \ y`

`end`

### Slow but accurate

`#SVD approach`

`#uses SVD to apply psuedo-inverse. the slowest of the linreg* approaches`

`#most accurate of linreg*`

`function linreg3(X, y)`

` #β_hat`

` return pinv(X) * y`

`end`

### Benchmarks

`println("linreg1: ", linreg1(Xp, yp))`

`linreg1(Xp, yp) `

`#show that we are not modifying the inputs`

`(sum(Xp), sum(yp))`

`println("linreg2: ", linreg2(Xp, yp))`

`linreg2(Xp, yp) `

`#show that we are not modifying the inputs`

`(sum(Xp), sum(yp))`

`println("linreg3: ", linreg3(Xp, yp))`

`linreg3(Xp, yp) `

`#show that we are not modifying the inputs`

`(sum(Xp), sum(yp))`