Mathematics/Statistics/Regression Lines
A regression line is effectively an attempt to fit some data to a straight line.
Basically, given some set of data points, you create a line that as closely as possible fits the data. This is accomplished by trying to minimize the total distance between all points and the line itself.
Since this is regression line is a standard straight line, it can be represented by a standard Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle y = mx + b}
equation.
Residuals
Given a single data point, a residual is the distance between that single point and the regression line.
A positive residual value indicates that the data point is somewhere above the regression line. A negative residual value indicates that the data point is somewhere below the regression line. Larger values indicate the point is farther away.
To calculate the residual for some point at Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle (p_x, p_y)} , we have the following equation for some Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle m} and Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle b} from our regression line:
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle residual_p = p_y - (m(p_x) + b)}
We can then take this a step further and combine all residuals in our dataset to get an overal view of "how closely the regression line matches our dataset".
Failed to parse (SVG (MathML can be enabled via browser plugin): Invalid response ("Math extension cannot connect to Restbase.") from server "https://wikimedia.org/api/rest_v1/":): {\displaystyle \sum_{i=1}^n (r_i)^2}
Squaring means means that negative residual values don't cancel out positive values. It also means points that are farther away from the line end up with more weight than points closer to the line.
Least Squares Regression
Least Squares Regression is one of the more popular ways to minimize the sum of all residuals in our dataset.