Mathematics/Statistics/Distributions: Difference between revisions
Brodriguez (talk | contribs) (Create page)
Brodriguez (talk | contribs) (Expand normal distribution info)
Revision as of 00:11, 14 May 2020
A distribution of a data set describes which values occur in the data and how often each one occurs. It is usually drawn as a graph.
For an ordered data set, the x-axis denotes the values of the data and the y-axis denotes the frequency of each value.
Alternatively, if the distribution is used to predict future outcomes, the x-axis still denotes values, but the y-axis denotes the probability of each value occurring (for continuous data, the probability density).
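As a minimal sketch of the two y-axis choices described above, the following Python snippet tabulates how often each value occurs (frequency) and what share of the data it makes up (relative frequency, which can be read as an estimated probability). The function name frequency_table and the sample data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, relative frequency) rows, ordered by value."""
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], counts[v] / total) for v in sorted(counts)]


# Hypothetical sample data, e.g. the results of ten die rolls.
rolls = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]

for value, count, share in frequency_table(rolls):
    # "count" is the frequency axis; "share" is the probability-style axis.
    print(f"value={value} count={count} share={share:.2f}")
```

Plotting the values against the counts gives the frequency view; plotting them against the shares gives the probability view. The shares always sum to 1.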
Normal Distribution
Also known as a Bell Curve, or Gaussian Distribution.
A normal distribution describes continuous numerical data and is always symmetric about the average value, also known as the mean.
The width of the curve is determined by the standard deviation of the dataset: the larger the standard deviation, the wider and flatter the curve.
In a normal distribution, about 68% of all data falls within 1 standard deviation of the mean, and about 95% falls within 2 standard deviations of the mean.
In other words, the farther from the mean, the less frequently values occur.
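The percentages above can be checked numerically. The sketch below uses NormalDist from Python's standard statistics module (available from Python 3.8); the helper name share_within and the example mean and standard deviation are assumptions made for illustration.

```python
from statistics import NormalDist


def share_within(k, mean=0.0, stdev=1.0):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=stdev)
    return dist.cdf(mean + k * stdev) - dist.cdf(mean - k * stdev)


# The result depends only on k, not on the particular mean or standard deviation.
print(round(share_within(1), 4))
print(round(share_within(2), 4))
print(round(share_within(1, mean=100, stdev=15), 4))

# The height of the curve drops as values move away from the mean.
standard = NormalDist()
print(standard.pdf(0) > standard.pdf(1) > standard.pdf(2))
```

The first two values come out close to 0.68 and 0.95, and the final comparison shows that values farther from the mean are less likely.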