08 Feb 2008

Titlesquatting: The 85,759 Books of Philip M. Parker

[Prompted by this MetaFilter discussion.]

New technologies create new ways of communicating, thinking, and producing, but they also inevitably create new ways for con-men and hucksters to make an easy buck. Email brought us instantaneous, nearly zero-cost global communication; it also brought us spam. Webpages and search engines brought us more information at our fingertips than ever before in history; they also brought us domain squatting, domain tasting, typosquatting, and blog and link spam. It’s an iron law of human nature that wherever there is a way to take advantage of a system for profit, someone will do it.

Amazon.com is poised to make the so-called “long tail” of book publishing available to all of us, by allowing ‘print on demand’ publishers to list their books in Amazon’s online catalog, and then print the copies individually, whenever an order comes in. It’s an idea with a lot of promise: by eliminating overhead, PoD allows books on incredibly niche subjects – which traditionally would have had a single short-run printing and then gone out of print, or not been printed at all – to stay available and in print.

But now this technology has found its own problem, eerily reminiscent of email’s spam and the web’s ad-ridden pages: automatically produced ‘books’ consisting of database dumps on a particular subject. Like a typosquatter who buys up thousands of domain names, knowing that it only takes a few ad hits to recoup the cost, or an email spammer who sends out billions of messages, knowing that only a few will lead to sales, a ‘titlesquatter’ can create thousands of ‘books’ in a database like Amazon’s, each on an almost ridiculously niche subject. If an order comes in, the information is quickly assembled from publicly available sources and the tome is sent out.

Philip M. Parker, a professor of marketing at INSEAD, seems to be taking this route. He has over 80,000 books listed on Amazon, on subjects ranging from obscure medical conditions to toilet-bowl brushes. According to a Guardian article, they are written by a computer, at a rate of approximately one every 20 minutes.

Although some of the books do get positive reviews (not that this is saying much; Amazon’s review system is anything but unbiased), even the books’ supporters note that they are mainly compendia of Internet sources. This review of “The Official Patient’s Sourcebook on Interstitial Cystitis,” which retails for $24.95, is fairly representative:

I was very disappointed when I reviewed this book. It was almost as if the author(s) went to a search engine, and the NIH’s Medline, and the National Library of Medicine (PubMed) did a search for IC then made a book out of the results. … In my opinion, just a few hours on the web “today” will yield more current and useful information than that provided by this book. For those seeking information on IC, I suggest a search on “google.com” instead.

Others are more blunt:

The is downloaded copy of the NIAM website, and a list other research websites. I learned more from Google.

Although there may be a place and a market for ‘sourcebooks’ of this type when they are clearly described and marked as machine-written or machine-compiled, the reviews suggest that many consumers buy them expecting more and are consequently disappointed. This is bad news for print-on-demand, and for the ‘long tail’ in general: if Amazon and others do not work to keep the quality of their catalogs high, consumers may learn to mistrust anything that’s not highly ranked in sales numbers. PoD already has a poor reputation within the publishing industry, and if machine-generated books with plausible-sounding titles become more common, to the point where users have to sort through dozens of infodump ‘sourcebooks’ to find one offering new information, that reputation could get far worse. At worst, it could turn users away from reference books completely – why bother buying reference books if the majority of them just reprint what you can find in an online search anyway?

Although nothing that Parker is doing is illegal or even contrary to Amazon’s current policies, it makes sense for Amazon and other retailers that catalog PoD books to nip this behavior in the bud, before it becomes a full-fledged epidemic. If there’s anything that we should have learned from email and web spam, it’s that what begins as an oddity and an annoyance can quickly become a major waste of time and resources.
