{"id":44,"date":"2015-01-05T16:35:34","date_gmt":"2015-01-05T16:35:34","guid":{"rendered":"http:\/\/blog.rickstafford.com\/?p=44"},"modified":"2015-01-05T16:35:34","modified_gmt":"2015-01-05T16:35:34","slug":"probability-of-winning-an-election","status":"publish","type":"post","link":"http:\/\/blog.rickstafford.com\/?p=44","title":{"rendered":"Probability of winning an election"},"content":{"rendered":"<p>While the time to the UK general election is roughly the same as that to the end of the football season (see <a href=\"http:\/\/blog.rickstafford.com\/?p=38\">here<\/a>), the type of data available for making predictions differs considerably. The biggest difference is that for the football league tables, half of the games have already been played \u2013 and those points are \u2018in the bag\u2019. For the election, nothing has been decided. However, a good place to start the predictions is with opinion polls. An opinion poll available at the start of January (Guardian\/ICM \u2013 17th Dec) gave the following figures.<\/p>\n<p>Conservatives \u2013 28%<br \/>\nLabour \u2013 33%<br \/>\nLib Dem \u2013 14%<br \/>\nUKIP \u2013 14%<br \/>\nGreen \u2013 5%<br \/>\nOthers \u2013 6%<\/p>\n<p>The question I want answered is: what is the likelihood of the Conservatives getting the most votes at the election? This is a very different question from who will win the most seats, who the prime minister will be and so on. But again, as with the football league, converting the data above into an answer to that question isn\u2019t straightforward. Here\u2019s how I approached it.<\/p>\n<p>A typical opinion poll has a margin of error of 3% (based only on the number of people sampled). The margin of error is calculated as follows:<\/p>\n<p>1.96 * (\u221a((p.q)\/n))<\/p>\n<p>Here p is the proportion of the sample supporting a given party, q = 1 - p, and n is the number of people polled. The important bit is the 1.96: given a normal distribution, it indicates that it is 95% likely that the true value lies within the margin of error of the polled value. 
However, assuming the normal distribution (and in the absence of other information), this means we can simulate scenarios in which the standard deviation of the error is 3\/1.96.<\/p>\n<p>So, to simulate the actual percentage of votes cast for any party, we draw a random number from a normal distribution with the polled value above as the mean for each party, and 3\/1.96 as the SD.<\/p>\n<p>If we replicate this a lot of times (10,000) then we can calculate what we are in fact interested in \u2013 the likelihood of the Conservatives getting the most votes. In practice, this means their percentage of the vote must be higher than Labour\u2019s.<\/p>\n<p>The simulation gives a probability of 2.73% for the Conservatives getting the most votes, compared to Labour\u2019s 97.27%. Given the percentages in the opinion polls, no other parties stand a chance.<\/p>\n<p>However, the total number of votes must add to 100%, and the current simulation does not account for this. If we enforce this constraint (each party\u2019s percentage is still out by its own random error, but we only count replicate runs where the total for all parties sums to 100), then we get a slightly different result \u2013 a 3.39% chance of the Conservatives getting the most votes, compared to 96.61% for Labour. The difference persists across multiple runs of this simulation, probably because, once the total is constrained, an error in favour of the Conservatives is more likely (though not certain) to be offset by an error against Labour. <\/p>\n<p>So, there we have it \u2013 values from the quantitative data which I can begin to use. There are a number of factors which need to be taken into account to give me the best possible prediction, and given that manifestos have not been written yet, expert opinion may be very important here. How these get combined will be the subject of further posts. 
<\/p>\n<p>R code for these simulations is given <a href=\"http:\/\/www.rickstafford.com\/software\/elec.r\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>While the time to the UK general election is roughly the same as that to the end of the football season (see here), there are considerable differences in the type of data presented on which to make predictions. The biggest &hellip; <a href=\"http:\/\/blog.rickstafford.com\/?p=44\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=\/wp\/v2\/posts\/44"}],"collection":[{"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=44"}],"version-history":[{"count":1,"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=\/wp\/v2\/posts\/44\/revisions"}],"predecessor-version":[{"id":45,"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=\/wp\/v2\/posts\/44\/revisions\/45"}],"wp:attachment":[{"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=44"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=44"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/blog.rickstafford.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=44"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}