Before starting, I would like to clarify that Kelly's call for evidence to counter the calls for culling is a responsible thing to do (although, arguably, calling for a cull before researching the issue may not be). He has received a huge amount of abuse on social media, and very little in the way of constructive argument against culling. I do think the calls for a cull were a genuine response of concern for victims of the attacks. However, I hope to show here that 1) there is no evidence that shark attacks are out of control in Reunion, and they pose a low risk; 2) there is a likely reason for a 'slight' increase in the number of attacks over time; and 3) culling may make the problem of attacks worse. The links all point to peer-reviewed journal articles – while these are official links, you may find you can access them without paying by copying and pasting the article title into Google or Google Scholar.

**The Reunion problem in numbers**

The headlines are 20 shark attacks since 2011 (eight of which have been fatal). However, 2011 itself saw five attacks, meaning there have been fewer than three per year on average since 2012. Scientific studies have shown that the rate is slowly increasing (see below for why), and based on long-term data collected since 1980, we would now expect about two attacks per year. So, with the exception of 2011, nothing major has changed: there has not been a recent surge of attacks that is statistically distinguishable from the long-term trend.

The risk is also relatively low. The population of Reunion is around 850,000, with another 405,000 visitors each year. As a quick, sweeping assumption, if half of these people go in the sea, then going in the sea in a typical recent year carries roughly a 1 in 200,000 chance of being attacked by a shark. By comparison, the odds of *dying* in a transportation accident in 2013 were approximately 1 in 48,000 – over four times higher than the odds of even being bitten by a shark in Reunion.
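The back-of-envelope arithmetic behind that figure can be sketched in a few lines (the "half go in the sea" assumption is the author's, and the attacks-per-year figure is a rough recent average, not measured data):

```python
# Back-of-envelope risk estimate for Reunion, using the figures from the post.
population = 850_000
visitors = 405_000
sea_users = (population + visitors) / 2   # sweeping assumption: half enter the sea
attacks_per_year = 3                      # rough recent average (excluding 2011)

odds = sea_users / attacks_per_year       # a "1 in odds" chance per sea user per year
print(f"Roughly 1 in {odds:,.0f}")        # about 1 in 209,000
```

Even halving the assumptions only moves the answer by a factor of two or so, which is the point: the per-person risk stays tiny.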

**Reasons for increasing attacks**

Since 1980, there has been a steady but small increase in the risk of a shark attack on Reunion Island. To some extent, this can be attributed to more tourism and more water-related activities. Simply put, if sharks bite humans randomly (albeit with a very low probability of doing so), then more humans in the water will mean more bites. However, there are also other factors affecting likely shark bite rates in Reunion. While many theories exist, including overfishing of prey species, a recent study suggests that for Reunion the main issue is a rise in poorly regulated agriculture. The study states: "Agriculture […] represents an important component of the island's economy. However, run-off and waste-water are poorly contained." Agricultural run-off affects water clarity, and therefore sharks' vision. It is well known that areas of poor visibility see higher numbers of shark attacks. Indeed, the attack on a bodyboarder last week was in a river mouth (which had been closed to all water users by the authorities, due to the increased risk of shark attacks).

**The problems with culling**

Bull sharks (the species thought most likely to have caused the majority of attacks) are seasonal visitors to Reunion. While studies have shown good site fidelity for bull sharks (many return to the same site each year), some do not, and new individuals arrive. Culling may reduce the population in one season, or even for a few seasons, but it will have little mid- to long-term effect on numbers and cannot eliminate the risk of an attack.

Attracting sharks (to catch them or to cage dive) usually involves the use of chum (dead fish, blood and oil thrown into the water). Ultimately, the use of chum may lead to further attacks on humans, and could attract sharks into shallower water. The evidence is far from certain, but a recent review of the scientific literature says: "We are not aware of any published studies that have examined [whether chumming increases shark attacks] and therefore, this is certainly an area requiring further research. However, research on a variety of other species has shown that some increased risk of aggression toward humans is possible in provisioned animals."

**Conclusions**

Sharks are apex predators which control marine ecosystems and have even been shown to help alleviate climate change by storing or regulating carbon dioxide release. The risk of a shark attack at Reunion is very low, and culling is unlikely to be effective and may even be counterproductive. The solution to preventing attacks is to respect sharks, learn about their behaviour, and avoid the more dangerous areas. Along with that, we need to protect the ocean from overfishing (so sharks have food) and pollution (so they can distinguish humans from their main prey) – as protecting the ocean also protects us from the risk of attack.

However, our recent paper (Spiers et al., 2016) demonstrates the potential role of predatory fish, including sharks, and other large marine predators such as whales and dolphins, in influencing the carbon budget of the ocean. In short, reducing the numbers of predators may increase the carbon dioxide production of the ocean's ecosystems and result in greater levels of climate change.

The reason is simple, and something almost every student of biology from GCSE or high school upwards will have been taught – trophic inefficiency. Energy (and biomass) is lost as it passes up the food chain: typically only about 10% is passed from prey to predators. So, removing large numbers of predators (and almost all commercial fish we eat are predatory and a long way up the food chain) results in greater levels of biomass further down the food chain. In the marine environment, this is normally zooplankton and smaller fish. Studies have shown that respiration is proportional to biomass, so a greater biomass of animals means more carbon dioxide production.
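A toy numerical sketch of that argument (my numbers, invented purely for illustration – the paper itself uses a Bayesian network, not this two-level model):

```python
# Toy illustration of trophic inefficiency: assume ~10% transfer efficiency
# and respiration proportional to total biomass. All numbers are made up.

prey_biomass = 1000.0   # arbitrary units
efficiency = 0.10       # ~10% of biomass passed up each trophic level

def total_respiring_biomass(prey, predators):
    # Respiration is proportional to biomass, so this sum tracks CO2 output.
    return prey + predators

# With predators present: they hold 10% of the prey biomass equivalent, and
# their feeding suppresses prey (crudely, to 60% of capacity here).
with_pred = total_respiring_biomass(0.6 * prey_biomass, efficiency * prey_biomass)

# Remove the predators: prey biomass recovers towards capacity.
without_pred = total_respiring_biomass(prey_biomass, 0.0)

print(with_pred, without_pred)  # total respiring biomass rises without predators
```

However the suppression factor is chosen, as long as predators hold prey below capacity, removing them leaves more total respiring biomass in the system.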

While our study is based on a theoretical model, studies on the role of predators on carbon production have been conducted before, in simple systems consisting of only a few species (Atwood et al., 2013; Strickland et al., 2013). Again, they have shown that carbon production can be increased by removing the top predators from the system. More recent work has also demonstrated how predators can effect feeding of prey species and subsequent storage of carbon in marine ecosystems (Atwood et al., 2015). Furthermore, there are other studies such as that by Nicol et al. (2010) that demonstrate the potential role of whales on climate change, though providing nutrients for phytoplankton (the ‘plants’ of the sea) to grow (largely by transporting iron across thermoclines, by feeding at depth and defecating at the surface, allowing these nutrients to come to the surface waters where light is also available for photosynthesis).

It is important to realise the negative effects that abusing ocean ecosystems are having, and to begin to realise the potential consequences are far more far reaching than just for the fish being removed. Despite the recent reduction in demand for shark fin soup, shark finning is still continuing at a very high level in many areas of the world, decimating shark populations even from the levels present 20 years ago. Even sustainable fishing practices involve reduction of fish stocks (typically to half of their natural size), as this allows the greatest catch to be taken each year – and as such, even sustainable fishing could have big effects on ocean carbon production.

However, there are positives from this research too. Allowing predatory fish, sharks, whales and other marine animals to increase (i.e. their populations to recover), will mean that carbon production of the ocean is likely to decrease (from current levels). Creating sustainable fisheries (as a first step), keeping the ban on commercial whaling and outlawing (and enforcing existing bans) on shark finning will have a double positive effect. Not only will numbers of these large, beautiful and enigmatic creatures increase, but also the amount of carbon dioxide entering the oceans (causing acidification) and the atmosphere (causing climate change) will be reduced.

Read the full paper here: http://dx.doi.org/10.1016/j.ecoinf.2016.10.003

Contact me for more information at rstafford – at – bournemouth – dot – ac – dot – uk

References:

Atwood, T.B., Hammill, E., Greig, H.S., Kratina, P., Shurin, J.B., Srivastava, D.S. and Richardson, J.S. (2013). Predator-induced reduction of freshwater carbon dioxide emissions. Nature Geoscience, 6, 191-194.

Atwood, T.B., Connolly, R.M., Ritchie, E.G., Lovelock, C.E., Heithaus, M.R., Hays, G.C., et al. (2015). Predators help protect carbon stocks in blue carbon ecosystems. Nature Climate Change, 5, 1038-1045.

Nicol, S., Bowie, A., Jarman, S., Lannuzel, D., Meiners, K.M. and Van Der Merwe, P. (2010). Southern Ocean iron fertilization by baleen whales and Antarctic krill. Fish and Fisheries, 11, 203-209.

Richardson, A.J., Bakun, A., Hays, G.C. and Gibbons, M.J. (2009). The jellyfish joyride: causes, consequences and management responses to a more gelatinous future. Trends in Ecology & Evolution, 24, 312-322.

Spiers, E.K.A., Stafford, R., Ramirez, M., Vera Izurieta, D.F. and Chavarria, J. (2016). Potential role of predators on carbon dynamics of marine ecosystems as assessed by a Bayesian belief network. Ecological Informatics. In press. DOI: 10.1016/j.ecoinf.2016.10.003

Strickland, M.S., Hawlena, D., Reese, A., Bradford, M.A. and Schmitz, O.J. (2013). Trophic cascade alters ecosystem carbon exchange. Proceedings of the National Academy of Sciences, 110, 11035-11038.

So, the purpose of this work is to predict which teams from the English Premier League will finish in the Champions League places (the top 4). I've already done some work based on current points and form (see here), which is my hard quantitative data. However, things can change – there is a transfer window in January to buy new players, etc. So expert opinion might be useful too. And, in football, everyone is an expert… so 'public opinion' might be useful too.

The data I have are:

The previous points and form data (see here).

'Expert' opinion. This was actually difficult to find (as would probably always be the case), so the best I have is a summary of which players each team needs to buy in the transfer window to maximise their success (courtesy of Match magazine – 30th Dec 2014).

Given that transfers could greatly affect the team, I have then created a ‘money’ variable, which basically looks at the likelihood of being able to purchase the recommended players.

Finally, I've found a public survey asking which teams will be the top 4 finishers (from quibblo.com on the 4th Jan 2015).

Of course, these data are in a wide range of forms – so how do I integrate them?

I’m going to convert each data type into a probability (between 0 and 1) of finishing in the top 4. The previous form data are already in this format.

Expert opinion. I've used part formula, part intuition to create this. For example, for both Man City and Chelsea there were no specific signings recommended – however, the tone of the writing (as judged by me – so, yes, slightly subjective) was more positive for Chelsea. Man City's entry said: "..a big name transfer would be a massive boost for the players and fans", compared to Chelsea's: "we'd sign a massive star to make their squad even more unstoppable". Both are clearly high, as no weaknesses were identified – so Chelsea get a value of 0.95 and Man City 0.9. Other than that, it was relatively simple: one player identified as vital (in a role such as defence or striker) = 0.6, two players = 0.5, three players = 0.4.

Money. This was very subjective, and wasn’t really researched here. However, it is well known (even by me) that some teams have more money than others. Hence, this looked at the likelihood of being able to buy the players identified above – lower scores for poorer clubs and those needing more players.

Public opinion – This was largely numeric data anyway. Votes were available for each team; in this case, the highest number of votes (26) was for Chelsea – converting to a 0.95 probability. Man City and Arsenal were on 22 and 23 votes respectively, both getting probabilities of 0.9. Liverpool had 17 votes (p = 0.7), Man U 14 (p = 0.6), Tottenham had 5 votes (p = 0.2), Southampton had 2 votes (p = 0.1) and West Ham were not on the list (p = 0.1 – all other teams were low, so it is likely that if they were included, they would be low too). Obviously there may be bias here, with people voting for teams they support over true opinion, but that is largely the nature of public opinion: it is biased – and it doesn't necessarily need to be treated as equal to other data (see below).

The probabilities for each team (as well as the overall probability – prior to interactions) are shown below:

Integrating the data:

This was done by setting up a Bayesian belief network using JavaBayes (available here), in the same way as for the political data previously. In this case, all four data sets fed into a final posterior distribution (however, as this was to be used in further analysis, it was called the 'new prior'). The nice aspect of JavaBayes here is that it becomes intuitive that some parts of the data deserve more weighting than others. For example, in the screenshot below, it is clear that previous form is given more weighting than the other variables: if form suggests the team will not make the top 4 (i.e. it is FALSE), but all the other variables suggest they will, then there is only a 0.4 probability that they will in fact make the top four (in these calculations). Full details of the probabilities used are in the XML file (here – right click and 'Save link as' to access), which can be loaded into JavaBayes.

Interactions between teams:

From previous form and current points (at the start of January), we identified 8 teams which could finish in the top 4. However, there are interactions between teams – if one wins, then by default the team they play against loses. Equally, there are only 4 places in the top 4 (an obvious fact, but perhaps one that needs stating…). So, if one team is in the top 4, the chances of the others getting there are decreased.

Incorporating reciprocal interactions in Bayesian networks is difficult statistically, and as such is not really done in an intuitive manner by most BBN software. However, it is quite easy computationally (see here for details). This Excel file, with associated VBA code, runs reciprocal interactions – how it works can probably be followed from this paper. In this case, the interaction probabilities (the third tab of the worksheet) are key. It is obvious from the data above that Man City and Chelsea are extremely unlikely to finish outside the top 4, so really the competition is for the remaining two places. Hence, interaction strengths between teams and either Chelsea or Man City are weaker than for the others (closer to 0.5, meaning an equal chance of being or not being in the top 4).

Running the simulation produces the following predictions:

So, top 4 finishes likely for Chelsea, Man City and Man U. The final place is equally likely to be either Southampton or Arsenal (based on predictions and data from the start of 2015).

The current (24th Feb) positions are (and yes, this is my first look at this since early Jan):

So, these really are looking pretty good at the moment.

In this case, we have a simple interaction – UKIP influences Labour and the Conservatives (we don't really need to consider the reciprocal; it's unlikely UKIP will get the most votes, even if they get a lot). Bayesian belief networks handle these simple interactions well. We have a new structure for the network:

And some new functions – quite simply, what happens to expert opinion should the UKIP vote increase (or decrease). Everything else is the same.

In the network, the UKIP node is blue. The reason is that this node can be 'observed'. Being observed gives the node a probability of 1 (of increasing – or of decreasing). Of course, we can't be that certain, but we can use this to help predict what might happen, depending on what happens to UKIP.

Fundamentally, if we don't know what happens to UKIP, then the pollsters have no idea who will win – we get the same probabilities as before (62% likely for a Labour win). However, if the UKIP vote decreases (let's assume the highly unlikely event of Nigel Farage saying something offensive in the next four and a half months…), then we observe that the UKIP vote goes down and… it's now only 53% likely that Labour will get the most votes.

Simple Bayesian belief networks are fine for these simple interactions – but let's go back to the football. Only four teams can qualify for Europe, so if one becomes more likely to do so, this should affect the chances of the others. Simple Bayesian networks don't cope well with this – but in what is likely to be the final part of these blogs, I'll show how this can be overcome and how we can incorporate interactions, data and opinion to make some final predictions.

However, converting qualitative data into beliefs (or probabilities between 0 and 1 of an event occurring) is actually easier. Essentially it is just educated guesswork. An easy example – expert opinion on who will win the most votes in the general election is largely: it is too close to call. A justification of that can be found in this paragraph from the Observer newspaper (see here for the full article – http://www.theguardian.com/politics/2014/dec/27/2015-general-election-unpredictable-green-party-ukip):

“Political pundits are hedging their bets as never before. Their crystal balls reveal only a thick fog of uncertainty. They can agree on one thing – that it is impossible to say who will be prime minister after the election in five months’ time. “The 2015 election is the most unpredictable in living memory,” says Robert Ford, co-author of a book about the rise of Ukip, Revolt on the Right. “Past elections have been close but none has featured as many new and uncertain factors with the capacity to exert a decisive impact on the outcome.””

So, in answer to the question – will the Conservatives get the most votes – the belief is simple – 0.5, or I have no idea… there’s a 50:50 chance…

It’s easy enough to combine this (possibly not insightful, but at least honest) expert opinion with our predictions from yesterday’s opinion poll analysis using a Bayesian belief network (BBN). The following diagram (and parameterised belief network) was made in the free JavaBayes software, available here: http://www.cs.cmu.edu/~javabayes/Home/

You can download the code for the network (in XML format) here

http://www.rickstafford.com/software/basic_network.xml

It's not as scary as it all looks: essentially, a BBN is just a way of formalising the combination of probabilities, although it does use the standard Bayesian equation to do so. Here, the 'beliefs' from yesterday's opinion poll analysis are combined with the 'expert' opinion (the 50:50 split) to give an overall probability of each party winning the most votes. The final node then tells us the probability of Labour or the Conservatives having the most votes (in this case, they do add to one, as no other party is thought able to actually get the highest number of votes).

There are some simple functions to include here – for example, how do we weight the different types of evidence? The function is simple enough to complete, and looks like this:

What it means is: given that the opinion poll data AND expert opinion both indicate Labour is definitely winning (i.e. have values of 1), the probability of Labour winning in reality (given the election is a long time away) will be 90%. In practice, the input values (or priors, if you like) are not 1, but 0.97 and 0.5 respectively. Combining these gives a probability of Labour getting the most votes of 69%. Such an approach seems realistic – if expert opinion were absolutely certain that Labour would win, and they were even higher in the opinion polls, then even with four months to the election, it would seem right to be 90% sure of the final result. Incidentally, the probability of the Conservatives getting the most votes is 0.312 (essentially through the same parameter set) – the two add up to 1, but this isn't essential at this stage, as there is a final node to consider.
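The arithmetic of combining the two priors through a conditional probability table can be sketched directly. The exact table is in the linked XML file; the CPT values below are a plausible parameterisation I have chosen to match the quoted numbers (0.9 when both parents are certain, 0.5 when they disagree), so treat them as illustrative rather than the network's actual parameters:

```python
# Marginalising a two-parent node in a BBN:
# P(win) = sum over parent states of P(win | poll, expert) * P(poll) * P(expert).
# CPT values are illustrative guesses, not the post's actual XML parameters.
cpt = {  # (poll_says_labour, expert_says_labour) -> P(Labour wins)
    (True, True): 0.9,
    (True, False): 0.5,
    (False, True): 0.5,
    (False, False): 0.1,
}
p_poll, p_expert = 0.97, 0.5  # the priors quoted in the post

p_win = sum(
    cpt[(a, b)]
    * (p_poll if a else 1 - p_poll)
    * (p_expert if b else 1 - p_expert)
    for a in (True, False)
    for b in (True, False)
)
print(round(p_win, 3))  # ≈ 0.69, matching the figure quoted above
```

The 0.5 prior from the experts pulls the poll's 0.97 heavily back towards the middle, which is exactly the moderating effect described.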

The final node in the BBN provides the result – the function for working this out looks like this:

Essentially, if the node has data from the two feeding nodes that Labour win (with probability 1) and Conservatives lose (with probability 0 of winning) then it calls a Labour win. If the two nodes disagree, then it doesn’t know what to make of it. The final node here gives the following outcome:

Labour 69%, Conservatives 31% – identical to before, but this is because both sides of the network are in agreement. If they weren’t, then we’d get a different result here.

As you can see, we've now combined two types of data to get a better prediction of the current state of knowledge (based on data on or before 4th Jan) as to who will win the election. Why is this a better prediction? Because the opinion poll alone is a snapshot in time, and very likely to change once the campaign starts. The expert opinion recognises this and moderates the results.

Next I’ll look at how to combine semi-opinion, semi-quantitative data together. The election story becomes a bit more complicated, as experts haven’t stopped at the ‘I don’t know’ stage, but have taken a more in depth analysis – and obviously football pundits have a lot to say…

Conservatives – 28%

Labour – 33%

Lib Dem – 14%

UKIP – 14%

Green – 5%

Others – 6%

The question I want answered is: what is the likelihood of the Conservatives getting the most votes at the election? This is a very different question from who will win the most seats, who the prime minister will be, and so on. But again, as with the football league, converting the data above to answer the question I want to ask isn't straightforward. Here's how I approached it.

Typically, there is a margin of error of around 3% in an opinion poll (based only on the sample size of people polled). The margin of error is calculated as follows:

1.96 × √((p × q)/n), where q = 1 − p

The important part here is the 1.96, indicating that it is 95% likely (given a normal distribution) that the true value falls within the stated margin of error. Assuming the normal distribution (and in the absence of other information), this means we can simulate scenarios with the standard deviation of the error being 3/1.96.

So, to simulate the actual percentage of votes cast for any party, we calculate a random number from a normal distribution with the mean as the value above for each party, and the SD as 3/1.96.

If we replicate this a lot of times (10,000), then we can calculate what we are in fact interested in – the likelihood of the Conservatives getting the most votes. This means their percentage of the vote must be higher than Labour's.
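The post links the R code used; as a rough stand-in, the unconstrained version of the simulation can be sketched in Python. The exact probabilities will differ somewhat from the post's, since details such as the random seed and the sum-to-100% filtering matter:

```python
import random

# Poll shares (%) and the error SD implied by a ±3% margin at 95% confidence.
shares = {"Con": 28, "Lab": 33, "LD": 14, "UKIP": 14, "Green": 5, "Other": 6}
sd = 3 / 1.96
random.seed(1)

reps = 10_000
con_wins = 0
for _ in range(reps):
    # Draw each party's "true" share from a normal around its poll figure.
    draw = {party: random.gauss(mean, sd) for party, mean in shares.items()}
    if draw["Con"] == max(draw.values()):  # Conservatives top the simulated vote
        con_wins += 1

print(f"P(Con most votes) ≈ {con_wins / reps:.2%}")
```

With a 5-point gap and an error SD of about 1.5 points per party, a Conservative lead needs roughly a 2+ SD swing in each direction at once, which is why the probability comes out so small.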

The simulation gives a probability of 2.73% for the Conservatives getting the most votes, compared to Labour's 97.27%. Given the percentages in the opinion polls, no other parties stand a chance.

However, the total number of votes must add to 100%, and the current simulation does not account for this. If we enforce this – basically, each party's percentage is out by a certain margin of error, and we only count replicate runs where the totals for all parties sum to 100 – then we get a slightly different result: a 3.39% chance of the Conservatives getting the most votes, compared to 96.61% for Labour. The difference persists over multiple runs of the simulation, probably because a margin of error in favour of the Conservatives is more likely to be offset by a margin of error against Labour (more likely, not certain).

So, there we have it – values from the quantitative data which I can begin to use. There are a number of factors which need to be taken into account to give me the best possible prediction, and given that manifestos are not written yet, expert opinion may be very important here. How these get combined will be the subject of further posts.

R code for these simulations is given here.

My solution is this:

I work out the overall form of the team this season (I am not considering current form, recent form, changes in management or anything else at the moment).

Prob of winning = Games Won / Games Played

And the same for drawing and losing

I then simulate all remaining games (18 for each team on the 4th Jan)

Suppose I take Chelsea, currently top of the table:

Won 14, drawn 4, lost 2

Prob of winning = 14/20 = 0.7

Prob of drawing = 4/20 = 0.2

Prob of losing = 2/20 = 0.1

So for each game, I generate a random number between 0 and 1. If it is less than 0.1, I assume they lose; if it is between 0.1 and 0.3, they draw; if it is over 0.3, they win.

I do this for all 20 teams, then calculate their points (3 for each win, 1 for each draw) and pick the four highest teams. As I am only interested in the top four, I am not worried if aspects of these calculations are not fully correct – for example, in this calculation all the top teams could (although it is unlikely) win all their remaining games. In practice this could not happen, as they have to play each other and cannot all win those games.

Since this is all highly stochastic, I run the simulation 10,000 times and calculate the percentage of times each team ends up in the top four – this gives me my probability.
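The linked R code does the full 20-team, 10,000-run version; the core of the per-team simulation can be sketched in Python for Chelsea's record above (played 20, won 14, drawn 4, lost 2, 18 games left):

```python
import random

# Simulate one team's remaining season from its win/draw/loss record so far.
random.seed(42)

won, drawn, lost, played = 14, 4, 2, 20   # Chelsea's record on 4th Jan
p_lose, p_draw = lost / played, drawn / played
current_points = 3 * won + drawn          # 46 points
remaining = 18

def simulate_final_points():
    points = current_points
    for _ in range(remaining):
        r = random.random()
        if r < p_lose:
            pass                # loss: no points
        elif r < p_lose + p_draw:
            points += 1         # draw
        else:
            points += 3         # win
    return points

runs = [simulate_final_points() for _ in range(10_000)]
print(sum(runs) / len(runs))    # expected ≈ 46 + 18 * (3*0.7 + 1*0.2) = 87.4
```

The full version repeats this for every team in each replicate, ranks the simulated points totals, and counts how often each team lands in the top four.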

So, for those of you interested, here are the probabilities – if your team isn’t here, then they have no chance, sorry….

Arsenal – 20.01%

Chelsea – 99.99%

Liverpool – 2.19%

Man City – 99.99%

Man United – 71.53%

Newcastle – 0.32%

Southampton – 59.54%

Stoke City – 0.19%

Swansea City – 2.12%

Tottenham – 31.12%

West Ham – 13.00%

The R code for the calculations may be horribly inefficient, but is available here (it took an hour to calculate, so if you want to play with it, then reduce the bootstrap size).

The league table as of the 4th Jan – formatted for the R code – is here.

Of course, there should be consideration of new players, current form, new managers and so on, and that is something which can be built on now I have my probabilities. More to come on this in the next few days.

To combine data types, I'm going to use Bayesian belief networks, or minor modifications of these – see here and here, and my previous post here. However, it is important to understand the basics of these first. Essentially, you need to know the certainty of a given event occurring (this certainty then becomes modified by other interacting activities). However, the certainty of something happening isn't always obvious. For example, in my Premiership football example, I have a league table and some rankings and points for different teams – but let's ask a specific question: what is the probability of any given team qualifying for Europe (the Champions League, to be precise)? To do this, they need to finish in the top four places. How do I infer that from points or rankings at the moment (5 months before the end of the season)? How can I then modify this, based on expert opinion and so on? I'm going to work through examples of this as I have the time – but the aim is to have robust predictions based on data already available today (4th Jan 2015), even if it is February or March by the time I get around to finishing the predictions. So, first up, how to calculate my hard, quantitative priors for the Premiership…

An approach I’m currently working on is using Bayesian networks to model community interactions by formalising data from multiple sources, knowing that there is likely to be some inaccuracy and guess work in each of these sources: http://eventmobi.com/intecol2013/agenda/34843/183954

A current piece of research is to study the dynamics of rocky shore communities using these networks. Not only do I want to know if they can predict the dynamics correctly, but also if expert opinion can correctly parameterise the networks to make these predictions.

Rocky shores are very well studied, especially through manipulative experiments. As such, it should be easy for an expert on a particular system to make judgements about what will happen – based on known data and published results. However, this is not normally the case for most marine environments. Species interactions are poorly known, and some degree of guess work is required to make predictions.

Therefore, to make expert opinion 'less certain', I would like to ask rocky shore ecologists who have not directly worked in the UK, or other marine biologists (UK or otherwise) who haven't worked on rocky shores, to take a questionnaire about species interactions on UK shores. It is likely you will know some of the species, or taxonomically or functionally related species, so you will be able to provide educated guesses about what will happen. However, I don't want you to research what is most likely to happen, or look up any of the species which you don't know. This will hopefully mimic the kind of knowledge experts are often required to produce for less well studied environments.

The survey can be found here:

http://rickstafford.com/expert_survey.html

I don’t ask you for your opinion on every interaction, as that would make a very long survey. Please be as confident as you can be when answering questions, and try to avoid ‘don’t know’ unless you really don’t have a clue.

If you enter your email, I’ll get back to you once I have results I can share.

http://www.nature.com/news/scientific-method-statistical-errors-1.14700?WT.mc_id=PIN_NatureNews

The underlying argument of the piece is that p values are unreliable. The headline of the figure forming the major argument in the text states: "A P value measures whether an observed result can be attributed to chance. But it cannot answer a researcher's real question: what are the odds that a hypothesis is correct? Those odds depend on how strong the result was and, most importantly, on how plausible the hypothesis is in the first place."

However, this argument appears to relate to p-values calculated from a test such as a chi-squared or Fisher's exact test, and only these tests, as I'll explain. Such a test might be something along these lines: you set up a classic choice experiment with a fish and some bait. The fish swims through a tube and reaches a branch in the tube (left or right). The food is in the left branch. You run the experiment 100 times and the fish turns left 73 times. Is this down to chance?

If only chance operated, then in theory the fish has a 50:50 chance of going either way each time (ignoring memory or any other preferences for now). In practice, it is unlikely to go left exactly 50 times and right exactly 50 times. If the left:right ratio was 49:51, it would be a good fit. So would 48:52 – but would 73:27? You can test this statistically, and it will tell you whether these results are down to chance or not. For this kind of test, the argument in the article holds – see the figure here:

http://www.pinterest.com/pin/449093394064037711/

This is a weak test, with no replication. There are a whole range of reasons the fish might turn left more often which have nothing to do with the hypothesis of sensing food. However, even if you overcome the shortcomings of the method (the scientist's job, not the statistician's), it is easy to see how a different fish might respond differently (unlike most tests, replication isn't really ingrained in these in the same way). If we got a p value < 0.05, however, it would tell us that we are 95% certain that this fish turned left more often than would be expected under our assumption of what should happen by chance (i.e. chance may not actually be 50:50, even if we think it is). In this sense, the test is robust.
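For the fish example above, the exact binomial tail can be computed directly in a few lines (a sketch of the calculation such a test is based on, using only the numbers already given):

```python
from math import comb

# Exact binomial tail: how likely are 73 or more left turns out of 100
# if the fish really chooses 50:50 each time?
n, k, p = 100, 73, 0.5
p_one_sided = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))
print(f"P(X >= {k}) = {p_one_sided:.2e}")  # far below 0.05
```

So 73:27 would be extremely hard to attribute to a fair 50:50 choice, whereas 49:51 or 48:52 would not be.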

In addition to this, most statistical tests are different. Most are based around linear models (ANOVAs or regressions). The purpose of an ANOVA is to tell us if there is a difference in sample means between categories (are there more snails in site 1, than site 2 or site 3?). If we count every snail at the site, then we have the true population, and there’s no need for a test. However, if we sample, and work out the number per quadrat (the replicate), then we are only estimating the population mean with a sample mean (the mean is the average number per quadrat).

All an ANOVA does is work out whether having a mean for each category gives a better fit to the data than an overall mean across all categories. If so, we can conclude there is a 'significant difference in the means'. So p < 0.05 for an ANOVA means we are 95% sure that at least one of the category means is different from the others. Given you have taken a representative sample of sufficient size, this is robust and repeatable (although you wouldn't necessarily expect to get the exact same p-value each time, and, at 95% confidence, consideration of type 1 and type 2 errors is of course relevant).

Equally, a regression is very similar to an ANOVA. A p < 0.05 means that a line of best fit through the points fits the data better than a horizontal line running through the mean of the samples.
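The 'model comparison' view of ANOVA described in the last two paragraphs can be made concrete with a few lines of code (the snail counts are invented for illustration; a library routine such as SciPy's one-way ANOVA would compute the same F statistic):

```python
# One-way ANOVA as model comparison: does one mean per site fit better than
# a single grand mean? Toy snail counts per quadrat, invented for illustration.
site1 = [12, 15, 14, 10, 13]
site2 = [22, 19, 24, 21, 20]
site3 = [16, 14, 18, 15, 17]
groups = [site1, site2, site3]

def sse(values, mean):
    # Sum of squared errors of the values around a given mean.
    return sum((v - mean) ** 2 for v in values)

all_vals = [v for g in groups for v in g]
grand_mean = sum(all_vals) / len(all_vals)

ss_total = sse(all_vals, grand_mean)                      # fit of one grand mean
ss_within = sum(sse(g, sum(g) / len(g)) for g in groups)  # fit of per-site means

k, n = len(groups), len(all_vals)
f_stat = ((ss_total - ss_within) / (k - 1)) / (ss_within / (n - k))
print(f"F = {f_stat:.1f}")  # large F: separate means fit much better
```

The p value then simply asks how often an F this large would arise if the per-site means were no better than the grand mean.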

For these tests, the assumption that "a p value measures whether an observed result can be attributed to chance" does not hold. What the p value measures is whether there is a better way of fitting a model to the data than just looking at how samples cluster around the mean.

There are some other issues in the article which do need addressing, such as the lack of association between significance and effect size in tests such as ANOVA (although a nice graph of the results helps in this case). The r² in a regression also gives a good measure of fit, normally providing more meaning than a p value. Additional information is good, but this doesn't mean the p value is bad.

However, overall, I simply don’t buy the argument that p values are unreliable. Knowing what a statistics test does is important. Interpreting results and hypotheses in the knowledge of what a stats test does is important. But p values are very useful.

As scientists, we normally want to know if an effect can be attributed to chance. Surely we all know that this doesn't prove our hypothesis? This just sounds like the classic rehash of not understanding cause and effect, which, if you're not easily offended, is best described here:

However, removing p values removes the ability to test whether something occurs by chance. I've recently been working with GLMMs, and the fact that there is no reliable p value is an issue for me: I can't tell whether what I have found is likely to be a real effect or not. I'm sure those with much more understanding of GLMMs can tell me how to proceed. But let's be honest, I'm lost, and I'm sure anyone who is a non-statistician reading my paper would be as well. With a bit of common sense, they shouldn't be lost if there are p-values.

For science to progress, we need a greater understanding of what statistics do, but we also need to make sure that these statistics are easy to use and easy to understand for the average scientist, even if not for the general public (but wouldn't it be nice if we could explain them to the public too?). The p value is fit for purpose; we just need to understand what it tells us. Losing it would set scientific understanding back a long way.

Nuzzo, R. (2014). Statistical errors. Nature, 506, 150-152. DOI: 10.1038/506150a
