Helpful Statistical Interpretations

stats research

Correlation

Biased samples can generate correlations that are not present in the population. This happens when membership in the observed subpopulation depends on the very variables whose correlation is being examined. If celebrities are always attractive, talented, or both, but never neither attractive nor talented, then these two traits might be correlated in the subpopulation of celebrities even though this correlation is not present in the general population.
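The celebrity example can be simulated in a few lines. The following is only a minimal sketch under made-up assumptions: both traits are independent standard-normal variables in the population, and the selection rule (at least one trait above 1.0) is an arbitrary choice for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)  # fixed seed for reproducibility

# Hypothetical population: attractiveness and talent are independent.
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20_000)]

# Hypothetical selection rule: a person becomes a "celebrity" if at least
# one of the two traits is clearly above average (threshold chosen freely).
celebrities = [(a, t) for a, t in population if a > 1.0 or t > 1.0]

r_population = pearson(*zip(*population))
r_celebrities = pearson(*zip(*celebrities))

print(f"population:  r = {r_population:+.3f}")
print(f"celebrities: r = {r_celebrities:+.3f}")
```

Under these assumptions the population correlation stays close to zero, while the selected subgroup shows a clearly negative correlation that is produced by the selection rule alone.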

Be careful when observing a correlation across multiple groups. The correlations within and between groups can point in completely opposite directions. Example: the low birth-weight paradox.
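A tiny constructed data set shows how within-group and pooled correlations can disagree. The numbers below are invented purely for illustration and are not related to the birth-weight data.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Two hypothetical groups: within each group, y falls as x rises,
# but group B has both higher x and higher y than group A.
group_a_x = [1, 2, 3, 4, 5]
group_a_y = [9, 8, 7, 6, 5]
group_b_x = [6, 7, 8, 9, 10]
group_b_y = [15, 14, 13, 12, 11]

r_within_a = pearson(group_a_x, group_a_y)
r_within_b = pearson(group_b_x, group_b_y)
r_pooled = pearson(group_a_x + group_b_x, group_a_y + group_b_y)

print(f"within group A: r = {r_within_a:+.2f}")
print(f"within group B: r = {r_within_b:+.2f}")
print(f"pooled:         r = {r_pooled:+.2f}")
```

Both within-group correlations are perfectly negative, yet the pooled correlation is positive because the difference between the group means dominates.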

Hypothesis Testing and Statistical Significance

A non-significant statistical test is not to be treated as support for the null hypothesis. Conclusions can only be drawn when a priori statistical power is additionally taken into account.
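To get a feeling for why power matters when reading a non-significant result, here is a minimal sketch that assumes a two-sided one-sample z-test with known standard deviation. The function name `z_test_power`, the effect size of 0.2 and the sample sizes are illustrative assumptions.

```python
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-sided one-sample z-test.

    effect_size is the true mean in units of the known standard deviation.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5  # mean of the test statistic under H1
    # Probability that the statistic lands in either rejection region.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


for n in (10, 50, 200):
    print(f"n = {n:3d}: power = {z_test_power(0.2, n):.2f}")
```

With a small sample, the test misses a true effect of this size most of the time, so a non-significant result says very little about whether the null hypothesis holds.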

When looking only at the subset of significant results in a distribution of effect sizes, low-powered tests cause the effect sizes in this subset to be unrealistically inflated. This is especially problematic when presupposing that significant results are more likely to be published. For further reading, see this blog post by Andrew Gelman.
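The inflation of significant effect sizes can be made visible with a small simulation. This is a sketch under assumed settings: a true effect of 0.2 standard deviations, samples of size 20, and a z-test with known standard deviation of 1.

```python
import random
from statistics import NormalDist, mean

rng = random.Random(1)  # fixed seed for reproducibility
TRUE_EFFECT = 0.2       # hypothetical true mean, in standard-deviation units
N = 20                  # small sample, hence low power
Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided alpha = 0.05

all_estimates = []
significant_estimates = []
for _ in range(5_000):
    sample = [rng.gauss(TRUE_EFFECT, 1) for _ in range(N)]
    estimate = mean(sample)
    z_stat = estimate * N ** 0.5  # z-test with known sd = 1
    all_estimates.append(estimate)
    if abs(z_stat) > Z_CRIT:
        significant_estimates.append(estimate)

mean_all = mean(all_estimates)
mean_abs_significant = mean([abs(e) for e in significant_estimates])

print(f"true effect:                     {TRUE_EFFECT:.2f}")
print(f"mean of all estimates:           {mean_all:.2f}")
print(f"mean of |significant estimates|: {mean_abs_significant:.2f}")
```

Averaged over all simulated studies the estimate is close to the true effect, but every estimate that passes the significance threshold must exceed 1.96 / sqrt(20), which is more than twice the true effect in this setup.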

A non-significant multivariate test implies that no test in a set of corresponding univariate tests with appropriately adjusted alpha levels would reach statistical significance. The inverse of this conclusion does not hold: a significant multivariate test does not imply significance in any test in a set of corresponding univariate tests with adjusted alpha. Rather, a significant multivariate test only implies significance for at least one linear combination of the corresponding univariate statistical tests, which need not coincide with an “actual” univariate test.

Regression

A one-unit increase in X changes the predicted outcome variable by the value of beta, given that all other predictors remain constant. This conditional interpretation is important since beta can have a completely different value in a simple regression with just one predictor. As meaningful interpretations are hard to grasp under this conditional interpretation, I regard it as an additional argument for favouring parsimonious models when carrying out research.
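How much beta can differ between a simple and a multiple regression is easy to demonstrate with constructed data. The sketch below uses invented, noise-free values in which y = 1 * x1 + 2 * x2 and the two predictors are negatively correlated. The least-squares estimates are obtained from the normal equations for two centered predictors.

```python
def centered(values):
    """Subtract the mean from every value."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical data: x2 tends to be low when x1 is high.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [5, 6, 3, 4, 1, 2]
y = [a + 2 * b for a, b in zip(x1, x2)]  # y = 1*x1 + 2*x2, no noise

c1, c2, cy = centered(x1), centered(x2), centered(y)
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)

# Simple regression of y on x1 alone.
beta_simple = s1y / s11

# Multiple regression of y on x1 and x2 (normal equations, Cramer's rule).
det = s11 * s22 - s12 ** 2
beta1 = (s22 * s1y - s12 * s2y) / det
beta2 = (s11 * s2y - s12 * s1y) / det

print(f"simple regression:   beta for x1 = {beta_simple:+.2f}")
print(f"multiple regression: beta for x1 = {beta1:+.2f}, beta for x2 = {beta2:+.2f}")
```

In this example the coefficient of x1 is positive when x2 is held constant but negative in the simple regression, because the simple slope also absorbs the effect of the correlated predictor x2.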

The value of the intercept represents the prediction of the outcome variable given that all predictors have the value 0. This interpretation generalizes across groups with differing fitted intercepts, for example in linear mixed models.

Statistical Thinking and Decision-Making

This blog post on Andrew Gelman’s blog nicely illustrates how not taking the base rate of events, classes, and the like into account results in wrong probability estimates. This is especially eye-opening when base rates are very unevenly distributed, for example when testing for a rare disease. Depending on the diagnostic utility of a test, positive results might be false positives much more often than intuitively assumed. Related: prosecutor’s fallacy.
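The rare-disease case follows directly from Bayes' theorem. Here is a minimal sketch with hypothetical numbers: the prevalence, sensitivity and specificity below are assumptions chosen for illustration and are not taken from the linked post.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical rare disease: 1 in 1000 people affected.
ppv_rare = positive_predictive_value(
    prevalence=0.001, sensitivity=0.99, specificity=0.95
)

# Same hypothetical test, but with a balanced base rate.
ppv_balanced = positive_predictive_value(
    prevalence=0.5, sensitivity=0.99, specificity=0.95
)

print(f"rare disease:  P(disease | positive) = {ppv_rare:.3f}")
print(f"balanced case: P(disease | positive) = {ppv_balanced:.3f}")
```

With these numbers, only about 2% of positive results in the rare-disease setting are true positives, even though the same test looks very accurate when both classes are equally common.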

After observing red 20 times in a row, the gambler put all his money on black. As each event remained statistically independent, the roulette landed on red again.

Any questions, comments or corrections? Do not hesitate to get in touch with me via Twitter or send me an e-mail.

