Computers still struggling with sporting schedules

As a child, like many others, I was an obsessive sports fan. But my geeky side soon surfaced and I realized I was much more interested in the statistics and logistics behind competitions.

For example, while I may have struggled with algebra and chemical equations, I knew by heart the formula for how many total games are played in a round-robin tournament (total teams, multiplied by total teams minus one, then divided by two) and the most efficient way to organize a knockout/elimination tournament (the difference between the number of entries and the nearest higher number from the series 2,4,8,16,32,etc, is the number of byes needed in the first round.)

So you can imagine my enthusiasm for an article by the BBC’s Paul Fletcher looking at the computer system used to decide the fixture list in the English football league. It’s a fascinating look at both the benefits and limitations of computing for real world solutions.

To put the task into context, there are several differences in the league from most US sports leagues: divisions are much bigger (20 or 24 teams), teams only play teams from their own division (twice a season, once at each side’s stadium),

Now, you might think it was a simple enough process as all you’d need is a randomizer function. However, there are several factors which are built into the process. The key one is that each team is paired to a nearby rival (which may be in a different division) and the system aims to avoid both playing at their home ground on the same day as this can cause problems with policing, traffic and public transport.

But there are several other factors which complicate matters, including (among many others):

  • Some ‘unpaired’ teams share safety officials or are on the same public transport line, meaning its also best they don’t play at home on the same day. (This can cause a chain effect where one team’s fixture can affect a seemingly unconnected team many miles away.)
  • League rules say teams must not play more than two consecutive games at home.
  • Certain pairings of teams with an increased potential for crowd trouble should be avoided on the final day of the season when there’s an increased chance of the game’s outcome being decisive to a team’s final position.

What’s perhaps most surprising is that the solution is found each season by one man working on a laptop. Paul Snellgrove has a standard program for the system, but begins the process by manually building in around 90 specific requests from teams and local authorities to avoid certain match-ups on certain dates. The computer then works out the entire list in under an hour… but that’s just the start of it.

The list then goes through as many as 40 revisions after consultations with league officials, clubs and representatives of supporters. Most of these revisions aim to minimize traveling time for fans, particularly on weekdays (when there is less time to travel after work) and on public holidays (when public transport is more limited.) Each time one game is changed, such as moving dates, as many as 48 other games can be affected. The entire process must be completed in the brief period between the last game of the previous season and the publication of the lists, which this year was just 23 days.

The main reason for the manual revisions and the human intervention is that the standard version of the system does not take geography into account. This is partially because doing so would add a major degree of complexity (witness this university study into the problem) and partly because dealing with such logistics is what computer scientists call an ‘NP complete‘ problem, meaning that not only is there no known perfect system for solving it, but that each added level of detail exponentially increases the time needed to solve it.

(The most common variation is the traveling salesman problem, which aims to find the most efficient route which visits a set number of cities once each. The Journal of Problem Solving once dedicated an entire issue to the subject, including whether humans may be able to solve the problem as, or even more, efficiently than a computer.)

Some people commenting on the BBC article have complained that it should be simple enough to add geographical information, for example by cross-referencing the system with Google Earth co-ordinates. The problem here is that distance is not necessarily a guide to ease of travelling time. For example, a city equidistant from two neighbours may have a high-speed road to one, but only a narrow, winding road through hilly terrain to another. Public transport links may also vary greatly. And that’s before you even start thinking about weekday vs weekend traffic patterns.

This year the system has been tweaked so that distance is built-in to the calculations for midweek games. Only time will tell how well it works and how long humans will continue having to tweak the computer’s output.

(Picture via Wikipedia)


Comments are closed.