Mathematics/Statistics/Core Measurements: Difference between revisions
Brodriguez (Create range section)
Brodriguez (Add mean notation)
Revision as of 13:43, 11 May 2020
Below are some of the most basic and regularly used measurements in statistics.
All of these measurements are used to gather information about a list of items.
Mean/Average
The "mean" and "average" are effectively two different words for the same thing.
Both attempt to get the most "middle" value given a set of items.
Standard (Unweighted) Mean
Unweighted Mean is what most people think of when someone says "mean" or "average".
To compute it, take all values in a list and add them together. Then divide this sum by the total count of original values.
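Scientific Notation:
Given a list of n terms, [x_0, x_1, ..., x_{n-1}], the mean <math>\bar{x}</math> is:
<math>\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i</math>
Direct Notation:
Given a list of n terms, [x_0, x_1, ..., x_{n-1}], the mean <math>\bar{x}</math> is:
<math>\bar{x} = \frac{x_0 + x_1 + \cdots + x_{n-1}}{n}</math>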
Example:
Given a list of [1, 4, 7, 5, 9, 9, 2, 10].
Step 1) Sum: 1 + 4 + 7 + 5 + 9 + 9 + 2 + 10 = 47
Step 2) Divide sum by count: 47 / 8 = 5.875
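The two steps above can be sketched in Python. The function name `unweighted_mean` is a hypothetical helper used only for illustration; the standard library's `statistics.mean` is shown alongside it as a cross-check.

```python
from statistics import mean


def unweighted_mean(values):
    """Sum all values, then divide by how many values there are."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


data = [1, 4, 7, 5, 9, 9, 2, 10]
print(unweighted_mean(data))  # 5.875
print(mean(data))             # 5.875
```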
Weighted Mean
Weighted Mean is similar to the above, except that each value has a "weight" associated with it, which determines how much that value contributes to the result.
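Scientific Notation:
Given a list of n terms, [x_0, x_1, ..., x_{n-1}], with associated weights [w_0, w_1, ..., w_{n-1}], the weighted mean <math>\bar{x}</math> is:
<math>\bar{x} = \frac{\sum_{i=0}^{n-1} w_i x_i}{\sum_{i=0}^{n-1} w_i}</math>
Direct Notation:
Given a list of n terms, [x_0, x_1, ..., x_{n-1}], with associated weights [w_0, w_1, ..., w_{n-1}], the weighted mean <math>\bar{x}</math> is:
<math>\bar{x} = \frac{w_0 x_0 + w_1 x_1 + \cdots + w_{n-1} x_{n-1}}{w_0 + w_1 + \cdots + w_{n-1}}</math>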
Example:
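Given a list of [1, 4, 7, 5, 9, 9, 2, 10], the first 4 values are twice as important as the last 4, so they get a weight of 2 and the rest get a weight of 1.
Step 1) Sum: 2*1 + 2*4 + 2*7 + 2*5 + 1*9 + 1*9 + 1*2 + 1*10 = 64
Step 2) Divide sum by total weight: 64 / (2 + 2 + 2 + 2 + 1 + 1 + 1 + 1) = 64 / 12 = 5.333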
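As a rough sketch, the same calculation in Python might look like the following. The helper name `weighted_mean` and the explicit weight list are assumptions made for illustration; the weights encode "the first 4 values are twice as important" as 2 versus 1.

```python
def weighted_mean(values, weights):
    """Sum each value times its weight, then divide by the total weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    weighted_sum = sum(w * x for x, w in zip(values, weights))
    return weighted_sum / total_weight


data = [1, 4, 7, 5, 9, 9, 2, 10]
weights = [2, 2, 2, 2, 1, 1, 1, 1]  # illustrative weights: first four count double
print(round(weighted_mean(data, weights), 3))  # 5.333
```

When every weight is equal, this reduces to the unweighted mean.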
Median
Similarly to the mean, the median attempts to get the most "middle" item given a set of items.
However, instead of doing so by literal value, it does this by position: once the items are sorted, the median is the item with an equal count of items on either side.
For sets with an odd number of items, the median is the exact middle number.
Given a list of [1, 2, 3, 5, 7], the median is 3.
For sets with an even number of items, there are two middle numbers, and the median is conventionally taken as their mean.
Given a list of [1, 2, 3, 4, 5, 7], the middle numbers are 3 and 4, so the median is 3.5.
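A minimal Python sketch of the odd and even cases follows. It assumes the common convention of averaging the two middle values for an even-sized list, and it sorts the input first since the median depends on order. The function name `median` here is a hypothetical helper, not the standard library one.

```python
def median(values):
    """Return the middle value of the sorted list.

    For an even count, this sketch averages the two middle values.
    """
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]  # odd count: single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two


print(median([1, 2, 3, 5, 7]))     # 3
print(median([1, 2, 3, 4, 5, 7]))  # 3.5
```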
Mode
The mode is the value that occurs most frequently in a set of items.
Given a list of [1, 2, 2, 3, 3, 4, 4, 4], the mode is 4.
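One way to sketch this in Python is to count occurrences with `collections.Counter`. The helper name `modes` is hypothetical, and returning every tied value as a list is an assumption of this sketch, since the text above does not say how ties should be handled.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most frequently, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3, 3, 4, 4, 4]))  # [4]
```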
Range
{{ ToDo | Differentiate between statistical range and mathematical function range. }}
Range is the difference between the lowest and highest values. Unlike the mean, median, and mode, it does not attempt to get the most "middle" value out of a set of items.
Instead, it attempts to measure how much values tend to spread apart in a given set.
Given a list of [2, 4, 5, 7, 8], the lowest and highest values are 2 and 8. Thus, the range is 8 - 2 = 6.
Note that the range is sensitive to outliers, since it depends only on the two most extreme values in the data set.
If we introduce a new value of 100 to our above list, we get [2, 4, 5, 7, 8, 100]. The lowest and highest values are now 2 and 100. Thus, the range is now 100 - 2 = 98. This no longer describes our data well, since five of the six values still lie between 2 and 8.
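The effect of the outlier can be reproduced with a short Python sketch. The helper name `value_range` is hypothetical, chosen to avoid shadowing Python's built-in `range`.

```python
def value_range(values):
    """Return the difference between the highest and lowest values."""
    if not values:
        raise ValueError("the range of an empty list is undefined")
    return max(values) - min(values)


print(value_range([2, 4, 5, 7, 8]))       # 6
print(value_range([2, 4, 5, 7, 8, 100]))  # 98
```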