Mathematics/Statistics/Chi-Square Test

The '''Chi-Square''' test, alternatively called the <math>\chi^2</math> test, is used to measure two possible things:
* '''Chi-Square Goodness of Fit''' test - Determines if a set of sample data matches a larger population.
* '''Chi-Square Test for Independence''' - Determines if two variables are at all correlated.
== General Notation ==
In general, Chi-Square is represented by a formula. The notations are as follows:
* <math>O</math> stands for "Observed/Actual".
* <math>E</math> stands for "Expected".
* <math>\sum_{i=1}^{n}</math> spans across some data set of size <math>n</math>.
With this in mind, we have the following formulas:
=== Scientific Notation ===
Formal:
 <math>\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}</math>
Less Formal:
 <math>\chi^2 = \sum_{i=1}^{n} \frac{(Actual_i - Expected_i)^2}{Expected_i}</math>
=== Direct Notation ===
Formal:
 <math>\chi^2 = \frac{(O_1-E_1)^2}{E_1} + \frac{(O_2-E_2)^2}{E_2} + ... + \frac{(O_{n-1}-E_{n-1})^2}{E_{n-1}} + \frac{(O_n-E_n)^2}{E_n}</math>
Less Formal:
 <math>\chi^2 = \frac{(Actual_1-Expected_1)^2}{Expected_1} + \frac{(Actual_2-Expected_2)^2}{Expected_2} + ... + \frac{(Actual_{n-1}-Expected_{n-1})^2}{Expected_{n-1}} + \frac{(Actual_n-Expected_n)^2}{Expected_n}</math>
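For readers who prefer code to notation, here is a minimal Python sketch of the same calculation. The function name and variable names are purely illustrative, not part of any standard library.
<syntaxhighlight lang="python">
def chi_square_statistic(observed, expected):
    """Compute the Chi-Square statistic for paired observed/expected values."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must be the same length")
    # Each category contributes (O - E)^2 / E to the total.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Example: the multiple-choice answer data used in the example later on this page.
print(chi_square_statistic([20, 20, 25, 35], [25, 25, 25, 25]))  # 6.0
</syntaxhighlight>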
== Initial Variables ==
The following variables are used in both types of Chi-Square tests. Determining these is the first step before doing anything else.
=== Null Hypothesis ===
The '''Null Hypothesis''' is integral to the Chi-Square test. This is effectively a special way of saying "this is what we're testing for."
This null hypothesis is represented by <math>H_0</math>, and should essentially always be a declaration in support of our "expected" outcome.
=== Alternative Hypothesis ===
To counter <math>H_0</math>, we have an '''Alternative Hypothesis''', represented as <math>H_a</math>. Often, this is as simple as "Our <math>H_0</math> is not true."
=== P-Value ===
Finally, we have a '''P-Value''', or <math>\alpha</math>. In layman's terms, this can be thought of as a '''Significance Level'''. We use it at the end to determine whether our result is significant or not.
Our P-Value is always between 0 and 1, and is generally chosen based on the expected [[Statistics/Distributions|distribution]] for our population.<br>
Since a [[Statistics/Distributions#Normal Distribution|normal distribution]] is one of the most common distribution types, most of the time our P-Value = 0.05.<br>
This is because, in a normal distribution, roughly 95% of values fall within two [[Statistics/Core Measurements#Standard Deviation|standard deviations]] of the mean, so we only care about instances that land outside of that range.
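As a quick sanity check on that 95% figure, a short Python sketch (assuming scipy is available) shows how much probability mass sits within two standard deviations of the mean:
<syntaxhighlight lang="python">
from scipy.stats import norm

# Probability that a normally distributed value falls within 2 standard
# deviations of the mean; the figure is roughly 95% (more precisely ~95.4%).
within_two_sd = norm.cdf(2) - norm.cdf(-2)
print(round(within_two_sd, 4))  # ~0.9545
</syntaxhighlight>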
== Goodness of Fit Test ==
The '''Goodness of Fit''' Chi-Square test is used to evaluate whether a smaller sample matches the larger population it was drawn from.<br>
A lot of times, we'll doubt that something works the way it claims to, so we take a sample subset and compare results.<br>
If the probability associated with our final result is below our P-Value, it means our sample was unlikely enough to reject the initial claim.
=== Running the Test ===
Once we determine our [[#Initial Variables|initial variables]], we can conduct our test. We plug our values into the [[#General Notation|above formulas]] and get a result.
We then calculate '''Degrees of Freedom''' (Df), which is just a fancy way of saying "number of possible outcomes, minus 1".<br>
We use this Df value to look up a Chi-Square probability table (it's probably best to just google this) and find the appropriate row.<br>
On this row, we find the rough equivalent of what our above formula gave us, and then note the value at the very top of this column.
Finally, we look back at our P-Value. If the probability at the top of the column is greater than our P-Value, we accept <math>H_0</math> as valid. If it is less than our P-Value, we reject <math>H_0</math> as invalid and instead accept our <math>H_a</math>.
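If you would rather not hunt down a printed table, the same lookup can be done in Python with scipy's chi-square distribution. This is a sketch assuming scipy is installed, using the statistic and degrees of freedom from the worked example below:
<syntaxhighlight lang="python">
from scipy.stats import chi2

statistic = 6.0  # value produced by the Chi-Square formula
df = 3           # degrees of freedom = number of possible outcomes - 1

# Survival function = probability of seeing a statistic at least this large
# purely by chance; compare this against the chosen P-Value (alpha).
probability = chi2.sf(statistic, df)
print(round(probability, 4))  # ~0.1116
</syntaxhighlight>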
=== Example ===
==== Background ====
Hypothetically, let's say you're a student. A big, national test is coming up, and all the questions are multiple choice with answers of A, B, C, or D.<br>
The publisher of the test claims that "all our tests are structured this way, and there is an equal chance of any letter being correct for any question."<br>
In other words, one would expect each answer letter to come up as the correct answer exactly 25% of the time.
When you get the results back, you're not confident in what the publisher claimed. For your specific version of the test, the correct answers were dispersed as follows:
A: 20%
B: 20%
C: 25%
D: 35%
Note that, due to random chance, there is always going to be some variation from that original 25%. But you feel like these results are a bit extreme, so you investigate further.<br>
One way to do so is the Chi-Square Goodness of Fit test.
==== Initial Variables ====
The first step is always to determine our <math>H_0</math>, <math>H_a</math>, and <math>\alpha</math>.
<math>H_0</math> can be "The publisher's claim is correct. Each possible answer has a 25% chance of occurring."<br>
<math>H_a</math> can be "The publisher's claim is incorrect and there is a bias towards one or more possible answers."<br>
Since we expect a normal distribution, our <math>\alpha</math> will be 0.05.
==== Using the Formula ====
We can consider our version of the test to be an adequate sample of a larger population. The "larger population" is considered to be "all tests and test versions created by the publisher".<br>
To word it another way, our test answers are the "actual" values and the publisher's claim gives the "expected" values. (For simplicity, treat the percentages as if they were counts out of 100 questions.) Thus we can proceed without any additional information.
In this case, we have the following formula values:
<math>\chi^2 = \frac{(A_{actual}-A_{expected})^2}{A_{expected}} + \frac{(B_{actual}-B_{expected})^2}{B_{expected}} + \frac{(C_{actual}-C_{expected})^2}{C_{expected}} + \frac{(D_{actual}-D_{expected})^2}{D_{expected}}</math>
&nbsp;
<math>\chi^2 = \frac{(20-25)^2}{25} + \frac{(20-25)^2}{25} + \frac{(25-25)^2}{25} + \frac{(35-25)^2}{25}</math>
&nbsp;
<math>\chi^2 = \frac{(-5)^2}{25} + \frac{(-5)^2}{25} + \frac{(0)^2}{25} + \frac{(10)^2}{25}</math>
&nbsp;
<math>\chi^2 = \frac{25}{25} + \frac{25}{25} + \frac{0}{25} + \frac{100}{25}</math>
&nbsp;
<math>\chi^2 = 1 + 1 + 0 + 4</math>
&nbsp;
<math>\chi^2 = 6</math>
We also have 4 possible outcomes (aka four different possible test answers), so our '''Degrees of Freedom''' is:
<math>Df = 4 - 1 = 3</math>
Looking up an external Chi-Square table on google, the Df = 3 row indicates that a value of 6.251 occurs at column 0.10.<br>
In other words, with Df = 3, a random sample will produce a value of 6.251 or higher about 10% of the time purely by chance.<br>
Comparing this 0.10 back to our <math>\alpha=0.05</math>, we note that the probability associated with our result is higher than our cutoff.
With this, we can conclude that the values of our test are actually not as extreme as we originally thought, and our null hypothesis holds.
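For completeness, the whole example can be checked in one call with scipy's built-in goodness-of-fit helper (a sketch, assuming scipy is available); it confirms both the hand calculation and the table lookup above:
<syntaxhighlight lang="python">
from scipy.stats import chisquare

observed = [20, 20, 25, 35]  # answer distribution from our version of the test
expected = [25, 25, 25, 25]  # publisher's claim of a uniform 25% each

result = chisquare(f_obs=observed, f_exp=expected)
print(result.statistic)  # 6.0
print(result.pvalue)     # ~0.11, which is above alpha = 0.05, so H0 stands
</syntaxhighlight>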
== Test for Independence ==
The '''Test for Independence''' Chi-Square test is used to evaluate if two variables are correlated in some way.
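As a preview of what this looks like in practice, here is a minimal sketch using scipy's contingency-table helper. The counts in the table are invented purely to show the mechanics of the call.
<syntaxhighlight lang="python">
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table: rows = two groups, columns = two outcomes.
table = [[30, 10],
         [20, 40]]

# Returns the statistic, its probability, degrees of freedom, and the
# expected counts under the assumption that the two variables are independent.
statistic, pvalue, dof, expected = chi2_contingency(table)
print(statistic, pvalue, dof)
</syntaxhighlight>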
