# How to Draw a Scatter Plot

When scatter plots were first introduced, drawing them was a laborious task that often required statisticians and scientists. In the more recent past, most of the drawing has been automated. However, a good deal of human involvement and judgement is still required. The steps are as follows:

**Step 1: Decide the Two Variables**

The most important step of the analysis is performed even before the analysis begins. In textbook problems we assume that we already know the variables between which correlation is to be found. In real life, however, there are many variables and therefore many possible pairs to correlate. It is important to select two variables between which a material relationship exists, one that will benefit the process if it is understood.

**Step 2: Collect Data**

Once the variables have been selected, relevant data needs to be collected so that meaningful conclusions can be drawn about them. This can be done by applying an appropriate design of experiments and taking the measurements that will serve as inputs. Like every other process, this one follows the principle of GIGO (Garbage In, Garbage Out), so due care must be taken over the quality of the input data.

**Step 3: Map the Data**

Once the data has been collected, it must be plotted on the X and Y axes of the Cartesian coordinate system, with each observation becoming one point. This gives the viewer an idea of where the majority of the points are centred and where the outliers are, and prompts the question of why this is the case. Nowadays, this does not have to be done manually. Software is available that will automatically fetch incoming data in real time and plot it on a scatter plot.
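To make the mapping concrete, here is a minimal sketch that places each (x, y) pair on a small text grid using only the Python standard library. The function name `ascii_scatter`, the grid size and the sample data are illustrative assumptions. In practice a plotting library or a spreadsheet chart would normally do this job.

```python
def ascii_scatter(xs, ys, width=20, height=10):
    """Return a text grid with one '*' per (x, y) pair.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("".join(line) for line in grid)


# Hypothetical data: hours of machine use (x) vs. defects found (y).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
defects = [2, 3, 5, 4, 6, 8, 7, 15]

plot = ascii_scatter(hours, defects)
print(plot)
```

Even in this crude form, the last observation (8 hours, 15 defects) sits visibly apart from the rest of the points, which is the kind of outlier the plot is meant to expose.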

**Step 4: The Line of Best Fit**

The next step is to statistically compute the line of best fit for the scattered data points. This means that a line is worked out mathematically that lies as close as possible to all of the points taken together, typically by minimizing the sum of the squared vertical distances between the points and the line. This line has an equation that describes the nature of the relationship between the variables and can be used for prediction. This step, too, once required complex calculations that were prone to human error. Now software can do it almost instantly.
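One common way to compute such a line is ordinary least squares, sketched below in plain Python. The function name `line_of_best_fit` and the sample data are illustrative assumptions, and the sketch assumes the x values are not all identical.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = line_of_best_fit(xs, ys)
print(f"y = {slope:.2f}x + {intercept:.2f}")  # prints "y = 0.60x + 2.20"
```

The slope gives the direction and steepness of the relationship: here y rises by about 0.6 for each unit increase in x.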

**Step 5: Come Up With an Exact Number**

The next step is to come up with a correlation coefficient. This number, as stated earlier, is the best metric for understanding correlation and lies between -1 and +1. The software will work out the correlation coefficient for you. Expensive software is not required. Something as simple as an Excel spreadsheet can be used.
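As an illustration of what the software computes, the sketch below calculates the Pearson correlation coefficient, the usual choice for linear relationships, in plain Python. The function name `pearson_r` and the sample data are assumptions made for the example, and the sketch assumes neither variable is constant.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Covariation scaled by the spread of each variable, so the result stays in [-1, 1].
    return sxy / math.sqrt(sxx * syy)


# Hypothetical measurements.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")  # prints "r = 0.77"
```

In a spreadsheet, Excel's `CORREL` function returns the same quantity for two ranges of cells.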

**Step 6: Interpret the Number**

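The last step is to interpret the number. As a rule of thumb, a value above +0.5 or below -0.5 suggests a strong correlation. A value of 0 represents no linear correlation, while -1 or +1 represents perfect correlation. The sign shows the direction: positive when the variables rise together, negative when one falls as the other rises. Even a perfect correlation may hint at causation, but it does not by itself imply causation.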
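The rule of thumb above can be written as a small helper. This is only a sketch: the function name `interpret` is hypothetical, the 0.5 cut-off comes from the text, and the "weak" label for values between 0 and the cut-off is an assumption added for completeness.

```python
def interpret(r, strong=0.5):
    """Turn a correlation coefficient into a short verbal description."""
    if not -1 <= r <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if r == 0:
        return "no linear correlation"
    if abs(r) == 1:
        strength = "perfect"
    elif abs(r) > strong:
        strength = "strong"
    else:
        strength = "weak"  # assumed label; the rule of thumb does not name this band
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


print(interpret(0.77))  # prints "strong positive correlation"
print(interpret(-0.3))  # prints "weak negative correlation"
```

The description says nothing about cause and effect. It only summarizes how closely the two variables move together along a straight line.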

