# Variance: regression, clustering, residual and variance – Liyun Chen ’11

GSE is a great school, and its thoughts are worth spending time on.

Liyun Chen ’11 (Economics) is Senior Analyst for Data Science at eBay in Shanghai, China. The following post originally appeared on her economics blog in English and in Chinese. Follow her on Twitter @cloudlychen

Variance is an interesting word. When we use it in statistics, it is defined as the “deviation from the center”, which corresponds to the formula $\mathrm{Var}(X) = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2$, or in matrix form $\mathrm{Var}(X) = \frac{1}{N}X^\top X - \left(\frac{1}{N}\mathbf{1}^\top X\right)^2$ (where $\mathbf{1}$ is an $N \times 1$ column vector of ones). By definition it is the second (order) central moment, i.e. the average of the squared distances to the center. It measures how much the distribution deviates from its center: the larger the variance, the more spread out the data; the smaller, the more concentrated. This is how it works in the one-dimensional world. Many of you will be familiar with this.
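The two ways of writing variance described above can be checked numerically. The following is a minimal sketch, assuming the population convention (dividing by N rather than N − 1); the function names and the sample data are made up for illustration and are not from the original post.

```python
from statistics import pvariance

def variance_scalar(xs):
    """Population variance: average squared distance from the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

def variance_matrix(xs):
    """Same quantity written as X'X/N - (1'X/N)^2, with 1 a vector of ones."""
    n = len(xs)
    ones = [1.0] * n                                # the N x 1 vector of ones
    xtx = sum(x * x for x in xs)                    # X'X
    one_tx = sum(o * x for o, x in zip(ones, xs))   # 1'X, i.e. the sum of X
    return xtx / n - (one_tx / n) ** 2

# Illustrative data only (hypothetical sample)
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
v1 = variance_scalar(data)
v2 = variance_matrix(data)
print(v1, v2, pvariance(data))
```

Both functions should agree with the standard library's `statistics.pvariance`. If the sample (N − 1) convention is intended instead, `statistics.variance` is the matching standard-library call.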

Variance has a close relative called standard deviation, which is essentially the square root of variance, denoted by $\sigma$. There is…
