Statistical tests evaluate the results of an investigation against what chance alone could produce. They let the researcher assess the probability that results like those observed would arise by chance. For example, a study comparing the recall performance of two groups might find mean correct responses of 25 and 42 for Groups A and B, respectively. The researcher needs to determine how often a difference that large would be expected by chance alone.
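One way to make "how often by chance" concrete is a permutation test: shuffle the group labels many times and count how often the shuffled difference in means is at least as large as the observed one. The sketch below is illustrative only. The individual scores are hypothetical, chosen so that the group means are 25 and 42, and the function name and parameters are assumptions rather than part of any particular statistics package.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Estimate how often a mean difference at least as large as the
    observed one arises when group labels are assigned at random."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random relabelling of all participants
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical recall scores; the means are 25 (Group A) and 42 (Group B).
group_a = [22, 27, 24, 26, 25, 23, 28, 25]
group_b = [40, 44, 41, 43, 42, 39, 45, 42]

p_difference = permutation_p_value(group_a, group_b)
print(f"Estimated probability of such a difference by chance: {p_difference:.4f}")
```

With scores this cleanly separated, almost no random relabelling reproduces a 17-point gap, so the estimated probability is very small. Noisier or more overlapping scores would give a larger value.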
The term statistically significant is applied to results for which the probability of occurring by chance is at or below an agreed-upon level. Most psychologists accept a 5% (or lower) probability as the threshold for statistical significance. This is typically reported as p < .05, which means that if chance alone were operating and the study were replicated 100 times, results at least this extreme would be expected fewer than five times. Another frequently used level of significance is the 1% level, generally reported as p < .01. In this case one would expect to observe such results by chance fewer than one time in 100. These significance levels are conventions; they are widely accepted, but they are not derived from any mathematical justification. Disciplines differ somewhat in what they consider statistically significant.
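The meaning of the 5% and 1% levels can be checked by simulation. The sketch below runs many imaginary studies in which both groups are drawn from the same population, so any "significant" result is due to chance. It assumes normally distributed scores with a known standard deviation of 1 and uses a simple two-sided z-test. The function name, sample sizes, and seed are illustrative choices.

```python
import random
from statistics import NormalDist, fmean


def false_positive_rate(n_studies=4000, n_per_group=30, alpha=0.05, seed=1):
    """Fraction of no-effect studies that still reach p < alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of a difference between two means when sigma = 1.
    se = (2 / n_per_group) ** 0.5
    significant = 0
    for _ in range(n_studies):
        # Both groups come from the same population: no true difference.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (fmean(a) - fmean(b)) / se
        p = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p value
        if p < alpha:
            significant += 1
    return significant / n_studies


rate_05 = false_positive_rate(alpha=0.05)
rate_01 = false_positive_rate(alpha=0.01)
print(f"Share of chance-only studies with p < .05: {rate_05:.3f}")
print(f"Share of chance-only studies with p < .01: {rate_01:.3f}")
```

Under these assumptions the two shares should land near 5% and 1% respectively, which is what the thresholds are meant to control.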
Statistical significance may be determined for both tests of difference and tests of relationship. For difference questions, one can establish how often differences of the observed size would be obtained by chance. For relationship questions, one can assess how often a relationship of a given strength (correlation coefficient) would be observed by chance. Statistical significance, and the resulting decision about whether an effect is likely due to chance, is influenced considerably by the research design employed.
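For a relationship question, the same shuffling idea applies: break the pairing between the two variables at random and count how often the shuffled correlation is at least as strong as the observed one. The following is a minimal sketch with hypothetical data (hours of study and test score for ten participants). The helper names are assumptions made for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_p_value(x, y, n_resamples=5000, seed=0):
    """Estimate how often a correlation at least as strong as the
    observed one appears when the x-y pairing is random."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)  # destroy any real pairing between x and y
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


# Hypothetical data: hours of study and test score for ten participants.
hours = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
score = [12, 15, 14, 19, 21, 20, 25, 27, 26, 31]

r_observed = pearson_r(hours, score)
p_relationship = correlation_p_value(hours, score)
print(f"r = {r_observed:.2f}, estimated chance probability = {p_relationship:.4f}")
```

The difference test and the relationship test share one logic: each asks how often chance alone would produce a result at least as extreme as the one observed.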