# Correlating fish and water

Correlation:

• Correlation is the tendency of two or more things to vary together.
• There is a reciprocal and mutual relationship between them.
• There is a regular association or connection.
• They go together in a somewhat predictable way.
• There is some relation existing between several phenomena or things.
• Most often it is applied to things that can be measured, although measurement is not part of the core meaning.
• There are various mathematical measures in use to describe correlation numerically.

Let’s look at an example:

Height and weight tend to go together. Tall people tend to be heavier. Short people tend to be lighter. Since the relationship is mutual, we can say that heavy people tend to be taller, and light people tend to be shorter. However, we know that some short people are very heavy, being either muscular or overweight. Some tall people are relatively light for the opposite reasons: they are skinny or carry little muscle. The correlation is real, but it is far from perfect.

Normally, correlation is computed mathematically. The common method is to compute a statistic called the Pearson Correlation Coefficient, also called Pearson r. It gives a measure ranging from -1 to +1. A Pearson r of -1 indicates that two measures are perfectly correlated, but move in opposite directions. A Pearson r of +1 indicates that two measures are perfectly correlated, and move in the same direction. A Pearson r of 0 indicates that two measures show no linear relationship.
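To make the computation a little more concrete, here is a minimal Python sketch of the textbook Pearson r formula. The function name `pearson_r` is my own, and the heights and weights are made-up numbers for illustration only, not real measurements.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one measure never varies")

    return cov / math.sqrt(var_x * var_y)


# Hypothetical sample: five people, heights in cm and weights in kg.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 61, 68, 83, 86]

r = pearson_r(heights_cm, weights_kg)
print(f"Pearson r for the made-up sample: {r:.2f}")
```

With these invented numbers the result comes out close to +1, because I made the weights rise almost in step with the heights. Real height and weight data would be much messier.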

Many studies, particularly the medical and nutrition studies reported in the media, report correlation numbers (if they even do that). What the reports often do not tell you is that correlation by itself does not show that one factor causes the other. A correlation may be purely accidental, it may show only that some other factor is causing both measures to vary together, or one measure may actually be causally linked to the other.

For spurious and humorous correlations, see: https://www.tylervigen.com/spurious-correlations

Since the correlation coefficient varies from 0 to +/- 1, you might be forgiven for thinking that a correlation coefficient of .5 is a big deal. However, a better measure of the degree of association is given by the square of the coefficient. This statistic is called the coefficient of determination, and it gives the proportion of the variance in one measure that can be predicted from the other. So, a Pearson r of .5, when squared, shows that only 25 percent of the variability is accounted for. In similar fashion, a Pearson r of .4 gives a coefficient of determination of 16 percent. Maybe this correlation is not quite the big deal that you initially thought it to be.
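The squaring is easy to check for yourself. Here is a small Python sketch; the function name `coefficient_of_determination` is just a label I chose, and the list of r values is arbitrary.

```python
def coefficient_of_determination(r):
    """Square a Pearson r to get the proportion of variance accounted for."""
    if not -1 <= r <= 1:
        raise ValueError("a Pearson r must lie between -1 and +1")
    return r * r


# A few arbitrary correlation values, including the two discussed above.
for r_value in (0.3, 0.4, 0.5, 0.7, 0.9):
    share = coefficient_of_determination(r_value)
    print(f"r = {r_value:.1f}  ->  {share:.0%} of the variability")
```

Notice how quickly the share shrinks as r drops: a correlation has to be well above .7 before it accounts for even half of the variability.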

So, what is the correlation between the presence of living fish and the presence of water? When you find living fish, you normally find water. It does not work the other way around, does it? Where you find water, you do not necessarily find living fish. It seems to me that the mutuality part is not in evidence. The correlation coefficient should not be all that high, if my thinking is correct.
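One way to test this thinking is to put numbers on it. For two yes/no measures, the Pearson r reduces to what statisticians call the phi coefficient, computed from a two-by-two table of counts. The sketch below uses counts that I invented purely for illustration: 200 imaginary locations, of which 20 have both water and fish, 60 have water but no fish, 120 have neither, and none have living fish without water. The function name `phi_coefficient` is my own.

```python
import math


def phi_coefficient(both, fish_only, water_only, neither):
    """Pearson r for two yes/no measures, from a 2x2 table of counts."""
    fish_total = both + fish_only
    no_fish_total = water_only + neither
    water_total = both + water_only
    no_water_total = fish_only + neither

    denominator = math.sqrt(
        fish_total * no_fish_total * water_total * no_water_total
    )
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")

    return (both * neither - fish_only * water_only) / denominator


# Invented counts for 200 imaginary locations (not real data).
r = phi_coefficient(both=20, fish_only=0, water_only=60, neither=120)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

With these invented counts, r comes out at about .41 and the coefficient of determination at about 17 percent, even though every single fish was found in water. Different counts would give a different answer, so treat this as an illustration of the one-way relationship rather than a measurement.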