The prize has a transformative effect on the careers and finances of literary novelists. Last year’s winner, The Luminaries (Granta), sold only modestly on its initial publication but has since gone on to shift 560,000 copies worldwide. Meanwhile, since taking the prize in 2002, Yann Martel’s Life of Pi (Canongate) has sold 3 million copies, and its film adaptation has taken more than half a billion dollars at the box office. The book has changed the fortunes of its publisher along the way. Heady stuff.
Imagine it were possible for publishers and booksellers to tell which books will strike that knife-edge balance between literary accomplishment and commercial success. It could have powerful implications for the publishing industry. And we think it might just be possible to do that by mining the data publicly available on Twitter.
We decided to test this theory
We gathered hundreds of thousands of tweets that mentioned the six books on the Man Booker Prize shortlist. Then we created an algorithm that looked at this data, aggregating what was being said about each book, by whom, and in what quantity, to see whether this information could be used to predict the outcome of the prize. And we were surprised at the accuracy of the results.
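For readers who want a feel for what such an aggregation might look like, here is a minimal sketch in Python. It is not the algorithm we used. It assumes each tweet is a dictionary with hypothetical "user", "followers" and "text" fields, matches titles by simple substring search, and weights each mention by the logarithm of the author's follower count as a rough, assumed stand-in for "by whom".

```python
import math


def score_books(tweets, titles):
    """Aggregate tweets per title: mention volume, distinct authors, weighted score.

    The tweet fields ("user", "followers", "text") and the log-follower
    weighting are illustrative assumptions, not the method used in the article.
    """
    stats = {t: {"mentions": 0, "authors": set(), "weighted": 0.0} for t in titles}
    for tweet in tweets:
        text = tweet["text"].lower()
        for title in titles:
            # Naive case-insensitive substring match on the title.
            if title.lower() in text:
                entry = stats[title]
                entry["mentions"] += 1                # in what quantity
                entry["authors"].add(tweet["user"])   # by whom
                # Dampen very large accounts so one celebrity does not dominate.
                entry["weighted"] += math.log1p(tweet["followers"])
    return stats


def rank_books(stats):
    """Return titles ordered from highest to lowest weighted score."""
    return sorted(stats, key=lambda t: stats[t]["weighted"], reverse=True)


if __name__ == "__main__":
    # Made-up sample data with placeholder titles, purely for illustration.
    sample = [
        {"user": "alice", "followers": 1200, "text": "Just finished Book A, stunning"},
        {"user": "bob", "followers": 40, "text": "Book B is overrated"},
        {"user": "carol", "followers": 90000, "text": "Book B or Book A for the prize?"},
    ]
    print(rank_books(score_books(sample, ["Book A", "Book B"])))
```

A real system would also need to handle retweets, ambiguous titles and the sentiment of what is being said, none of which this sketch attempts.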