So, this blog has been pretty dormant for a while now, but a New Year (in capitals, so it must be new) brings a new intention to actually do something with it. I’m going to post a number of articles based on a talk I was asked to give a few years ago about combining diverse data to make better predictions. My interest comes from conservation management: how can we incorporate local knowledge, qualitative research and diverse quantitative measures into meaningful predictions to successfully manage and protect ecosystems? However, the two examples I’m going to work on are not in this area at all: 1) predicting the general election result, and 2) predicting which football teams will qualify for Europe in this year’s Premier League. I don’t really know much about either, which is probably good for keeping bias from creeping in. There may be a few mistakes in what I say, but the point is to quantify different kinds of data in the same way; I don’t really care about the predictions themselves…
To combine data types, I’m going to use Bayesian belief networks, or minor modifications of these (see here and here, and my previous post here). However, it is important to understand the basics first. Essentially, you need to know how certain a given event is to occur, i.e. its probability; this probability is then modified by other interacting factors. However, that starting probability isn’t always obvious. For example, in my Premiership football example, I have a league table with the rankings and points of the different teams, but let’s ask a specific question: what is the probability of any given team qualifying for Europe (the Champions League, to be precise)? To do this, they need to finish in the top four places. How do I infer that from points or rankings at the moment (five months before the end of the season)? How can I then modify this, based on expert opinion and so on? I’m going to work through examples as I have the time, but the aim is to have robust predictions based on data already available today (4th Jan 2015), even if it is February or March by the time I get around to finishing them. So, first up, how to calculate my hard, quantitative priors for the Premiership….
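Before getting to that, here is a minimal sketch in Python of what "modifying" a probability looks like at its very simplest: Bayes' rule for a single yes/no event. All the numbers are made up for illustration (they are not real priors for any team), and the function name `update_belief` is just my own label. A proper belief network chains many of these updates together, so treat this as a toy rather than the method I will end up using.

```python
def update_belief(prior, p_evidence_if_true, p_evidence_if_false):
    """Return P(event | evidence) for a binary event using Bayes' rule."""
    # Probability of seeing the evidence AND the event being true
    numerator = p_evidence_if_true * prior
    # Total probability of seeing the evidence at all
    denominator = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / denominator


# Hypothetical prior: a 30% chance that some team finishes in the top four.
prior_top_four = 0.30

# Hypothetical expert opinion: a pundit tips the team for the top four.
# Assume pundits tip 80% of teams that go on to qualify,
# but also tip 40% of teams that do not.
posterior = update_belief(prior_top_four, 0.80, 0.40)

print(round(posterior, 3))  # 0.462
```

With these invented inputs the pundit's tip moves the belief from 0.30 to about 0.46: the evidence helps, but because pundits also tip plenty of teams that miss out, it is far from decisive.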