Mathematics/Statistics/Core Measurements: Difference between revisions

From Dev Wiki

Revision as of 13:43, 11 May 2020

Below are some of the most basic and regularly used measurements in statistics.
All of these measurements are used to gather information about a list of items.

Note: Most of these are easier to use when the list of items is sorted by some meaningful ordering. Some, such as the median, only work on sorted lists.


Mean/Average

"Mean" and "average" are effectively two words for the same thing.
The mean attempts to describe the most "middle" value of a set of items, based on the values themselves.

Tip: Often, this is represented as <math>\bar{x}</math>, pronounced "x bar".

Standard (Unweighted) Mean

Unweighted mean is what most people think of when someone says "mean" or "average".
To compute it, add all values in a list together, then divide this sum by the total count of original values.

Scientific Notation:

Given a list of n terms, [x_0, x_1, ..., x_{n-2}, x_{n-1}]:
 <math>\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i</math>

Direct Notation:

Given a list of n terms, [x_0, x_1, ..., x_{n-2}, x_{n-1}]:
 <math>\bar{x} = \frac{x_0 + x_1 + ... + x_{n-2} + x_{n-1}}{n}</math>

Example:

Given a list of [1, 4, 7, 5, 9, 9, 2, 10].
 
Step 1) Sum: 1 + 4 + 7 + 5 + 9 + 9 + 2 + 10 = 47
Step 2) Divide sum by count: 47 / 8 = 5.875
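
The two steps above can be sketched in Python. This is a minimal illustration rather than a canonical implementation; the function name unweighted_mean is an assumption made for this example.

```python
def unweighted_mean(values):
    """Return the sum of the values divided by how many values there are."""
    if not values:
        # Dividing by a count of zero is undefined, so reject empty input.
        raise ValueError("mean of an empty list is undefined")
    return sum(values) / len(values)


values = [1, 4, 7, 5, 9, 9, 2, 10]
print(unweighted_mean(values))  # 47 / 8 = 5.875
```

For real projects, the standard library's statistics.mean computes the same quantity.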

Weighted Mean

Weighted mean is similar to the above, except that each value has a "weight" associated with it. The weight controls how much that value contributes to the result.

Scientific Notation:

Given a list of n terms, [x_0, x_1, ..., x_{n-2}, x_{n-1}], with associated weights [w_0, w_1, ..., w_{n-2}, w_{n-1}]:
 <math>\bar{x} = \frac{\sum_{i=0}^{n-1} w_i x_i}{\sum_{i=0}^{n-1} w_i}</math>

Direct Notation:

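Given a list of n terms, [x_0, x_1, ..., x_{n-2}, x_{n-1}], with associated weights [w_0, w_1, ..., w_{n-2}, w_{n-1}]:
 <math>\bar{x} = \frac{w_0 x_0 + w_1 x_1 + ... + w_{n-2} x_{n-2} + w_{n-1} x_{n-1}}{w_0 + w_1 + ... + w_{n-2} + w_{n-1}}</math>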

Example:

Given a list of [1, 4, 7, 5, 9, 9, 2, 10], the first 4 values are twice as important as the last 4.
 
Step 1) Sum: 2*1 + 2*4 + 2*7 + 2*5 + 1*9 + 1*9 + 1*2 + 1*10 = 64
Step 2) Divide sum by total weight: 64 / (2 + 2 + 2 + 2 + 1 + 1 + 1 + 1) = 64 / 12 ≈ 5.333
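
The same calculation can be sketched in Python. The function name weighted_mean and the explicit weights list are assumptions made for this example; the weights encode "the first 4 values are twice as important" as 2 versus 1.

```python
def weighted_mean(values, weights):
    """Return sum(weight * value) divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    weighted_sum = sum(w * x for x, w in zip(values, weights))
    return weighted_sum / total_weight


values = [1, 4, 7, 5, 9, 9, 2, 10]
weights = [2, 2, 2, 2, 1, 1, 1, 1]
print(round(weighted_mean(values, weights), 3))  # 64 / 12, roughly 5.333
```

If every weight is the same, this reduces to the unweighted mean.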


Median

Similar to the mean, the median attempts to get the most "middle" item given a set of items.
However, instead of doing so by literal value, it does this by position in the sorted list.

For sets with an odd number of items, the median is the exact middle number.

Given a list of [1, 2, 3, 5, 7], the median is 3.

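For sets with an even number of items, there are two middle numbers. By the usual convention, the median is the mean of those two numbers.

Given a list of [1, 2, 3, 4, 5, 7], the middle numbers are 3 and 4, so the median is (3 + 4) / 2 = 3.5.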
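
A minimal Python sketch of finding the median follows. It assumes the common convention of averaging the two middle values when the count is even, and it sorts a copy of the input first, since the median only makes sense on sorted data. The function name median is chosen for this example.

```python
def median(values):
    """Return the middle value of the sorted data."""
    if not values:
        raise ValueError("median of an empty list is undefined")
    ordered = sorted(values)  # sort a copy; the caller's list is untouched
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd count: exactly one middle item.
        return ordered[middle]
    # Even count: average the two middle items.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([1, 2, 3, 5, 7]))     # 3
print(median([1, 2, 3, 4, 5, 7]))  # 3.5
```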


Mode

The mode is the value that occurs most frequently in a set of items.

Given a list of [1, 2, 2, 3, 3, 4, 4, 4], the mode is 4.
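
Counting occurrences can be sketched in Python with collections.Counter. The function name modes is an assumption for this example; it returns a list because more than one value can be tied for the highest count.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest number of occurrences."""
    if not values:
        raise ValueError("mode of an empty list is undefined")
    counts = Counter(values)          # maps each value to its count
    highest = max(counts.values())    # the largest count seen
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 3, 3, 4, 4, 4]))  # [4]
```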


Range

ToDo: Differentiate between statistical range and mathematical function range.
Range is the difference between the lowest and highest values. Unlike the measurements above, it does not describe the "middle" of a set of items.

Instead, it attempts to measure how much values tend to spread apart in a given set.

Given a list of [2, 4, 5, 7, 8], the lowest and highest values are 2 and 8.
Thus, the range is 8 - 2 = 6.

Note that the range can be misleading when the data set has outliers, because it depends only on the two most extreme values.

If we introduce a new value of 100 to our above list, we get [2, 4, 5, 7, 8, 100].
The lowest and highest values are now 2 and 100.
Thus, the range is now 100 - 2 = 98.
This is no longer very descriptive of our data, since all but one of the values lie between 2 and 8.
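
Both range calculations above can be reproduced with a short Python sketch. The function name value_range is an assumption for this example, chosen to avoid shadowing Python's built-in range.

```python
def value_range(values):
    """Return the difference between the highest and lowest values."""
    if not values:
        raise ValueError("range of an empty list is undefined")
    return max(values) - min(values)


print(value_range([2, 4, 5, 7, 8]))       # 8 - 2 = 6
print(value_range([2, 4, 5, 7, 8, 100]))  # 100 - 2 = 98
```

A single outlier changes the result from 6 to 98, even though the other five values are unchanged.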