What is Correlation Analysis and How is it Performed?
April 3, 2025
Correlation analysis is a vital tool in the hands of any Six Sigma team. As the Six Sigma team enters the analyze phase, they have access to data on various variables. They now need to synthesize this data and ensure that they are able to find a conclusive relationship. What is Correlation Analysis? One…
The 5 why method or the Root Cause analysis method that has been described in the Tools section plays an important role in determining that the X’s are recorded at actionable level. In this implementation of the 5 why tool, there is a slight variation from the standard methodology and hence it has been explained…
Once the Scatter plot has been used to find out the correlation between the inputs being measured as well as the desired outputs, it is now time to come up with an equation which shows the precise relationship. This is called Regression. Regression is a technique which summarizes the relationships observed in the Scatter plot…
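The P-Value is the probability of observing data at least as extreme as the data actually collected, assuming that the null hypothesis is true. It is often loosely described as the likelihood that the null hypothesis is true, but that is not what it measures. A small P-Value means that the observed result would be unlikely if the output (Y) really did not change as a result of the variation that we are deliberately introducing in the input (X).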
Example: If we have a null hypothesis that says that there is no difference between the efficiency of workers in the New York factory and the efficiency of workers in the Sacramento, California factory, then the P-Value tells us how likely it would be to see a difference at least as large as the one we measured if that statement were true.
Suppose that for testing this hypothesis we choose a threshold of 0.05 against which the P-Value will be compared. This threshold is called the significance level. Choosing 0.05 means we accept a 5% chance of concluding that the efficiencies are different when in fact they are not. The null hypothesis will only be rejected if the P-Value calculated from the data falls below 0.05. A P-Value of 0.03 would lead us to reject it, whereas a P-Value of 0.07 would not.
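As a rough sketch of how such a comparison could be run, the Python snippet below estimates a P-Value for the two factories with a permutation test. This technique is swapped in here because it needs only the standard library; it is not a method named in this article. The efficiency scores, the function name `permutation_p_value` and the constant `ALPHA` are assumptions invented for illustration.

```python
import random
from statistics import mean

def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the group labels are interchangeable,
        # so reshuffling them shows which differences arise by chance alone.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_resamples + 1)

# Hypothetical efficiency scores (units per hour), made up for this example.
new_york = [52, 55, 49, 58, 54, 53, 57, 51]
sacramento = [50, 48, 53, 47, 51, 49, 52, 46]

ALPHA = 0.05  # threshold the P-Value is compared against
p = permutation_p_value(new_york, sacramento)
print(f"P-Value = {p:.4f}")
print("reject the null hypothesis" if p < ALPHA else "fail to reject the null hypothesis")
```

The decision rule is the last line: the null hypothesis is rejected only when the computed P-Value is smaller than the chosen threshold.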
The significance level, the threshold against which the P-Value is compared, is an important part of the hypothesis problem. Changing it slightly can change which hypothesis is selected and which is rejected. Therefore the selection of the significance level must be carefully done. There are different types of errors associated with choosing the wrong threshold. These errors have been described later. The management must decide which error they can afford to make before selecting the significance level.
The significance level (the P-Value threshold chosen for the test) and the confidence level are intertwined. In fact, if you have the value of one you can automatically derive the value of the other. The formula used is significance level = 1 - confidence level. Therefore for a significance level of 0.05, the confidence level is 0.95 or 95%.
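One way to see this link in practice is a simple z-test, sketched below under the assumption that the population standard deviation is known. The function name `z_test_and_interval` and all the numbers passed to it are hypothetical. The sketch checks that the test rejects the null hypothesis at the 0.05 threshold exactly when the 95% confidence interval leaves out the value claimed by the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist

def z_test_and_interval(sample_mean, null_mean, sigma, n, confidence=0.95):
    """Two-sided z-test plus the matching confidence interval (sigma known)."""
    alpha = 1 - confidence                      # threshold = 1 - confidence level
    se = sigma / sqrt(n)                        # standard error of the mean
    z = (sample_mean - null_mean) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    interval = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    return alpha, p_value, interval

# Hypothetical figures: 16 workers, average efficiency 53.6, claimed mean 50.
NULL_MEAN = 50.0
alpha, p_value, (low, high) = z_test_and_interval(53.6, NULL_MEAN, sigma=6.0, n=16)

rejects = p_value < alpha
excludes = not (low <= NULL_MEAN <= high)
print(f"P-Value = {p_value:.4f}, interval = ({low:.2f}, {high:.2f})")
print("test rejects:", rejects, "| interval excludes null value:", excludes)
```

Both checks give the same answer because they use the same critical value, which is why fixing the confidence level also fixes the threshold.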
Statistical conclusions cannot be taken literally. One needs to carefully understand how to interpret them before any decisions are made which are based on them. For instance, a threshold of 0.05 means that when the null hypothesis is true, the test will still wrongly reject it about 5% of the time. Does this mean that if we were to conduct 100 such experiments right away, exactly 95 of them would fail to reject the null hypothesis? Well, not really.
Although this is what is expected on average, such percentages are long-run frequencies, and the law of large numbers tells us that they show up reliably only over many trials. In a single batch of 100 experiments the count could just as easily be 92 or 98. It is only when the number of trials runs into the thousands that the observed proportion settles close to 95%.
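The effect of the number of trials can be illustrated with a small simulation. The sketch below assumes a setting where the null hypothesis really is true (samples drawn from a normal distribution with mean 0 and known standard deviation 1) and counts how often a two-sided z-test at the 0.05 threshold fails to reject it. The function name `fraction_not_rejected`, the sample size and the seed are assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96

def fraction_not_rejected(n_experiments, sample_size=30, seed=1):
    """Share of experiments that keep a null hypothesis which is actually true."""
    rng = random.Random(seed)
    kept = 0
    for _ in range(n_experiments):
        # The null hypothesis (mean = 0) holds for every simulated sample.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        z = (sum(sample) / sample_size) * sqrt(sample_size)  # sigma = 1
        if abs(z) <= Z_CRIT:
            kept += 1
    return kept / n_experiments

for n in (100, 1_000, 10_000):
    print(f"{n:>6} experiments: {fraction_not_rejected(n):.3f} not rejected")
```

With only 100 experiments the share can sit noticeably above or below 0.95; with more experiments it tends to move closer to that figure.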