20 August 2015

Specialisation, statistical errors in science and P-hacking

Background: Can we quantify levels of specialisation?

Like many other links posted on this blog, those included in my post of 18 August promoted specialisation as one of the best tickets to the top end of the translation market.
Specializing: a ticket to the high end of the profession? and Future-proofing the translation profession also focus on just what top-end translators mean by specialisation.

Here then is a further example for translators (and would-be translators) specialising (or aiming to specialise) in scientific papers and science journalism.

A case study for the scientific translator

First, anyone working in these fields should be aware of the debate on statistical errors, which is currently gathering pace and which many hope will soon become truly momentous. See, for instance, Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume, by Regina Nuzzo, published in Nature (vol. 506) in February 2014.

The issue gives me an opportunity to quantify my ideas on the depth of understanding a high-end translator requires to work proficiently on texts that use p-values or a similar level of statistical analysis.
  1. The translator should have a reasonable grasp of what p-values are and how they are calculated.
  2. Given that tens of thousands of scientists are accused of misunderstanding the proper use of p-values, as explained in the link above, it is probably too much to ask that translators know more in this area than the scientists they work for. But ...
  3. The translator should definitely be aware of the debate and its impact on the document in hand and on science in general.
  4. The translator should be familiar with the relevant terminology and all relevant subtleties and nuances.

On P-hacking

Quote #1 from Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume:
Perhaps the worst fallacy is the kind of self-deception for which psychologist Uri Simonsohn of the University of Pennsylvania and his colleagues have popularized the term “P-hacking” ... (which), says Simonsohn, “is trying multiple things until you get the desired result” — even unconsciously.
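
For readers who like to see the arithmetic, here is a minimal Python sketch of "trying multiple things until you get the desired result". It is my own illustration, not taken from the Nature article. It assumes a study in which there is no real effect at all, 30 hypothetical participants per group, a simple z-test with known unit variance, and a researcher who checks up to 20 unrelated outcome measures and stops at the first one with p < 0.05. All names and numbers are assumptions chosen for the example.

```python
import random
from statistics import NormalDist, fmean

def two_sided_p(sample_a, sample_b):
    """Two-sided z-test p-value for a difference in means (unit variance assumed)."""
    n_a, n_b = len(sample_a), len(sample_b)
    z = (fmean(sample_a) - fmean(sample_b)) / (1 / n_a + 1 / n_b) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def study_finds_something(rng, n_outcomes, n_per_group=30):
    """Test up to n_outcomes measures; both groups come from the same distribution."""
    for _ in range(n_outcomes):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if two_sided_p(group_a, group_b) < 0.05:
            return True  # a "significant" result, although no effect exists
    return False

def false_positive_rate(n_outcomes, n_studies=1000, seed=1):
    """Share of simulated null studies that report at least one p < 0.05."""
    rng = random.Random(seed)
    hits = sum(study_finds_something(rng, n_outcomes) for _ in range(n_studies))
    return hits / n_studies

honest = false_positive_rate(1)    # one pre-planned outcome
hacked = false_positive_rate(20)   # keep trying outcomes until one "works"
print(f"one outcome tested:  {honest:.2f}")
print(f"20 outcomes tested:  {hacked:.2f}")
```

With a single planned test the false-positive rate should sit near the nominal 5%. With 20 independent tries, theory gives 1 - 0.95^20, or about 64%, which is the point of the quote: the threshold stays the same, but the chance of crossing it by luck does not.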

P-hacking terminology

Quote #2 (my bold):
P-hacking; it is also known as data-dredging, snooping, fishing, significance-chasing and double-dipping.
Quote #3:
It may be the first statistical term to rate a definition in the online Urban Dictionary, where the usage examples are telling: “That finding seems to have been obtained through p-hacking, the authors dropped one of the conditions so that the overall p-value would be less than .05”, and “She is a p-hacker, she always monitors data while it is being collected.”
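
The second usage example, monitoring data while it is being collected, can also be simulated. The following is a rough sketch under my own assumptions, not a description of any real study: two groups with no real difference, a z-test with known unit variance, a first look after 10 participants per group and a further look after every 5, up to 100 per group. The "peeking" researcher stops as soon as p drops below 0.05; the other researcher tests once at the end.

```python
import random
from statistics import NormalDist

def p_value(sum_a, sum_b, n):
    """Two-sided z-test for equal means; n points per group, unit variance assumed."""
    z = (sum_a / n - sum_b / n) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

def run_study(rng, peek, start=10, step=5, max_n=100):
    """Return True if the study ends with a 'significant' result (no real effect exists)."""
    sum_a = sum_b = 0.0
    n = 0
    while n < max_n:
        batch = start if n == 0 else step
        for _ in range(batch):
            sum_a += rng.gauss(0, 1)
            sum_b += rng.gauss(0, 1)
        n += batch
        # The p-hacker checks after every batch and stops at the first p < 0.05.
        if peek and p_value(sum_a, sum_b, n) < 0.05:
            return True
    return p_value(sum_a, sum_b, n) < 0.05

def significant_share(peek, n_studies=2000, seed=2):
    rng = random.Random(seed)
    return sum(run_study(rng, peek) for _ in range(n_studies)) / n_studies

fixed_rate = significant_share(peek=False)
peeking_rate = significant_share(peek=True)
print(f"test once at the end:   {fixed_rate:.2f}")
print(f"peek after every batch: {peeking_rate:.2f}")
```

The exact figures depend on the assumed schedule of looks, but the pattern is what matters for the translator: repeated peeking pushes the share of spurious "significant" findings well above the nominal 5%.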


Want to learn still more?
Read We found only one-third of published psychology research is reliable – Now what?

Key quote (my bold):
There are two major ways that researchers quantify the nature of their results. The first is a p-value, which estimates the probability that the result was arrived at purely by chance and is a false positive. (Technically, the p-value is the chance that the result, or a stronger result, would have occurred even when there was no real effect.) Generally, if a statistical test shows that the p-value is lower than 5%, the study’s results are considered “significant” – most likely due to actual effects.
Another way to quantify a result is with an effect size – not how reliable the difference is, but how big it is. Let’s say you find that people spend more money in a sad mood. Well, how much more money do they spend? This is the effect size.
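
To make the distinction between the two measures concrete, here is a small Python sketch based on the sad-mood example. The spending figures are invented for illustration only. The p-value is estimated with a permutation test, which follows the technical definition in the quote quite literally: it shuffles the group labels (as if mood made no difference) and counts how often a difference at least as large as the observed one turns up. The effect size is reported both as a raw difference in means and as Cohen's d, one common standardised measure; the choice of these two is my assumption, not the article's.

```python
import random
from statistics import mean, stdev

# Hypothetical amounts spent (in euros) by participants in each mood condition.
sad = [12.0, 15.5, 9.0, 14.0, 18.5, 11.0, 16.0, 13.5]
neutral = [10.0, 12.5, 8.0, 11.0, 14.5, 9.5, 13.0, 10.5]

def cohens_d(a, b):
    """Standardised effect size: difference in means divided by the pooled SD."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5

def permutation_p(a, b, n_shuffles=10000, seed=0):
    """Share of label shufflings giving a difference at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

raw_effect = mean(sad) - mean(neutral)   # "how much more money do they spend?"
d = cohens_d(sad, neutral)
p = permutation_p(sad, neutral)
print(f"raw effect: {raw_effect:.2f} euros, Cohen's d: {d:.2f}, p-value: {p:.3f}")
```

The two numbers answer different questions: the p-value says how surprising the difference would be if mood had no effect, while the effect size says how large the difference is. A translator needs to keep the corresponding terms apart in the target language too.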

