Monday, April 04, 2011

Essay: Will ePub and Mobi be Obsolete in the Future?

One corner of eBook discussions is the format wars. Should we go ePub or Mobi? Amazon or iBooks? But one question consumers should be asking is whether the books they buy now will still be usable in the future. With video, for example, we saw Blu-ray replace DVDs, which replaced VCDs, which replaced VHS. On the other end of the spectrum, we have CDs and MP3s, which have been resilient over the decades. What I want to talk about in this essay concerns ePub and Mobi. By purchasing books in these formats, are we future-proof, or will they be replaced down the line?

Honestly, ePub and Mobi have a lot of shortcomings, which I'll detail below, that lead me to believe they will be outdated in a few years' time. Some of these problems can be rectified in the future, but unless the publisher re-issues the eBook for free (just like what happens with Apps in iTunes), you'll still be stuck with the older version (and unlike the print market, I don't think 1st-edition eBooks will become collector's editions). But that is just one possibility, and at the end of the day it's the market that decides. For example, txt is honestly a horrendous format for eBooks: no formatting, no font choices, no way of tracking the pages, etc. Yet whether it's legit vendors like Smashwords or book piracy sites, it still has a significant (albeit perhaps dwindling) presence.

ISBN: To clarify, this isn't an issue specific to either ePub or Mobi, but to eBooks in general. The point of an ISBN is to have a standard, unique identifier for each edition of a book. That's why when you buy a hardcover, it has a different ISBN from the paperback. Unfortunately, the eBook industry is still so much in its infancy that no one yet has universal standards, or at least the capability to enforce them. Right now, eBooks fall into one of three categories: those without ISBNs, those with non-unique ISBNs (i.e. they're the same as the print ISBNs), and those with unique ISBNs. The last two can cause confusion, as it's not immediately clear which eBooks have unique ISBNs and which don't.

Now most consumers won't really care whether a book has an ISBN or not (the fact that we've gone on this long buying ISBN-less eBooks says something), but from a holistic industry perspective, it's invaluable in terms of reference and standardization. Aside from identifying which books are legit and which aren't, it also keeps track of the book's "version", as well as being useful in tracking specific titles (whether you're using Amazon or Shelfari, for example). Now as a publisher, having an ISBN is an extra expense, especially if we are to implement it correctly (the appropriate response is that each eBook format should have its own unique ISBN). I won't even discuss the problems magazine publishers are facing right now (no ISBN, an ISBN for each issue, or an ISSN). But when taking the long term into account, we need those unique ISBNs, at least in the majority of eBooks if not in all of them.

The good news is that once everyone agrees to implement unique ISBNs, the current eBook formats can easily accommodate them. The bad news is that publishers will have to re-issue those same books, and while it may not impact the way you read your current eBook, one day you might be wondering why eBook A has (at least) two different ISBNs.
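To make "can easily accommodate" a little more concrete for ePub: the format's package file (the OPF document inside the archive) already carries a metadata block with `dc:identifier` entries. What follows is only a rough sketch in Python, assuming a hand-written ePub 2 style fragment. The sample title, the placeholder ISBN, the all-zero UUID, and the helper name `isbn_identifiers` are made up for illustration and don't come from any real book or tool.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ePub 2 package (OPF) documents.
DC = "http://purl.org/dc/elements/1.1/"
OPF = "http://www.idpf.org/2007/opf"

# Hypothetical, hand-written OPF fragment; every value is a placeholder.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Example Book</dc:title>
    <dc:identifier id="BookId" opf:scheme="ISBN">9780000000002</dc:identifier>
    <dc:identifier opf:scheme="UUID">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
</package>
"""


def isbn_identifiers(opf_xml):
    """Return the text of every dc:identifier whose opf:scheme is ISBN."""
    root = ET.fromstring(opf_xml)
    found = []
    for element in root.iter("{%s}identifier" % DC):
        scheme = element.get("{%s}scheme" % OPF, "")
        if scheme.upper() == "ISBN" and element.text:
            found.append(element.text.strip())
    return found


print(isbn_identifiers(SAMPLE_OPF))
```

An eBook that reuses its print ISBN and one with its own unique ISBN look identical at this level, which is exactly why the confusion described above can't be resolved by inspecting the file alone.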

Reference Tracking: Most people will probably recognize this as page numbers, but let's not be limited to that model. The Bible, for example, could do away with page numbers because it has another effective method of tracking which part you're referring to: chapter and verse numbers. The problem with today's eBooks is that they don't have such reference points. It's not a problem when used in isolation (i.e. you're reading it for yourself), but when we start discussing books with others, it's handy to have a common reference point. You can't suddenly cite "12% of Book A" in your term paper because 12% will vary from device to device, from book to book. Amazon is currently adding page numbers to its books, but my Kindle books still don't have page numbers (i.e. the publisher will have to re-issue them), and it also raises the question: which print book is Amazon using as a reference point? Page 100 of a hardcover will be different from page 100 of a paperback.

Now I'm not sure about ePub but I think page numbers can be implemented in the current format, at least with slight tweaking of the code and the standards (that's a "can be remedied now" in Mobi and a "maybe in the immediate future" for ePub). But again, the problem is similar to that of ISBNs: publishers will need to re-issue those same eBooks for consumers who previously bought them in order for everyone to have a standard reference point.

Footnotes: This is an example of the current limitations of ePub and Mobi. They can't do footnotes, period. They can create endnotes, but there's a significant difference between reading footnotes and endnotes, whether it's fiction or non-fiction. This limits the kind of books that are functional in either format.

Again, it's not absolutely impossible to implement considering the roots (HTML) of ePub and Mobi, but it will need an updating of the standards of either format (i.e. not anytime soon). One solution is to use the note-taking ability of each format/device and allow publishers to load notes/footnotes into the eBook itself (of course this has its own set of complications, such as whether such text is editable by users). Another method is to expand the HTML capabilities of ePub and Mobi and allow modern HTML techniques which can remedy the problem. (See what my web designer did for this footnote-heavy short story for example.)

Layout Limitations: There are really a lot of layouts that can't be implemented with either ePub or Mobi (and the lack of footnotes mentioned above is just one symptom). RPG books, for example, are illustration- and table-heavy, which is why they tend to be released as PDFs. Comics are similarly sub-optimal for either ePub or Mobi. And it's not just texts with images. The layout of "North Shore Friday" by Nick Mamatas got screwed up on the Kindle.

For me, this is one of the biggest limitations of both formats, and why a "better" format might pop up in the future. It's either that or the ecosystem adapts and instead of having a single "universal" format, consumers will now buy eBooks with specialized formats depending on the type of book (it's not necessarily a horrible paradigm shift, although it does mean you'll need several programs--or even several specialized devices--to read a wide variety of books).

Again, the good news is that neither ePub nor Mobi has explored the full potential of the HTML format. Their future versions might include full access to the capabilities of HTML5, for example.


So what does this all mean? I do think it's likely that while ePub and Mobi might survive in the future if they take the right steps, the eBooks you currently own will one day become obsolete (and honestly no technology is 100% future-proof). That doesn't mean I'll stop buying eBooks now, but it has to be taken into consideration when you're building a digital archive (i.e. an archival library) today. In what format will you store your files? ePub and Mobi might be consumer-centric, but they might not be the best choice in terms of storing reference material. (On the other hand, I seldom re-read books, so repurchasing the thousands of books I currently own might not be an issue, and I'll end up only re-buying the books I really, really want or plan on reading again.)

It also leads to an interesting discussion with regard to the future: do we end up with a universal book format, or do we end up with specialized formats (e.g. CBR for comics, PDFs for textbooks, ePub for fiction, etc.)? And of course, this is also steered by the choices of consumers. Again, txt is a horrible format for eBooks, yet it's still circulating today.

4 comments:

martin said...

Charles,

I think it's unlikely that we'll have several different programs; it's much more likely that the one program will just be modified to adapt to several different formats. Ebook formats are software, not hardware, so it's also unlikely that they will become obsolete in the way LaserDisc or other such formats have become obsolete.

Charles said...

In an ideal world, that might be the case. But we're not living in an ideal world.

There is, of course, software that currently reads a combination of ePub, Mobi, and PDF (like Calibre and Stanza), but it's the exception to the norm, and some programs lay out the text very differently (i.e. ePub and PDF have different technical requirements, although iBooks is more than capable of presenting both). Various agendas will come into play here, hence the possible scenario of multiple formats, multiple software, and even multiple specialized hardware.

And software does become obsolete. Few people still use Wordstar for DOS for example.

House said...

I think a distinction needs to be made about executable software versus readable data. Wordstar might not be used any more, but I would wager that there is some way to read its files on a modern computer system, even if it had been a proprietary binary format.

ePub, for all its shortcomings, is supposedly an open, text-based standard. In theory, one can "unzip" an ePub file and see the human-readable contents, which are basically just HTML. From that perspective, even if it is no longer dominant, it should still be possible to maintain a legacy reader without much trouble unless e-reader hardware dramatically changes.
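As a rough illustration of the "unzip" point, here is a minimal Python sketch using only the standard library. It builds a toy archive in memory rather than opening a real book; the file names, the sample text, and the helper names `build_toy_epub` and `read_members` are assumptions for the example. The toy archive is not a complete, valid ePub (it leaves out the container and package files a real one needs) and carries no DRM.

```python
import io
import zipfile


def build_toy_epub():
    """Build a tiny ePub-like zip archive in memory (illustrative, incomplete)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Real ePub files start with a 'mimetype' entry holding this string.
        archive.writestr("mimetype", "application/epub+zip")
        # Hypothetical chapter file; the path and text are placeholders.
        archive.writestr(
            "OEBPS/chapter1.xhtml",
            "<html><body><p>Hello, reader.</p></body></html>",
        )
    return buffer.getvalue()


def read_members(epub_bytes):
    """Return a dict mapping each archive member name to its decoded text."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return {
            name: archive.read(name).decode("utf-8")
            for name in archive.namelist()
        }


for name, text in read_members(build_toy_epub()).items():
    print(name, "->", text)
```

Nothing here depends on an e-reader or a vendor's software, which is the sense in which an unencrypted ePub should stay readable for as long as zip and HTML tools exist.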

The DRM the big vendors attach to ePub is likely to become obsolete before the format itself. While the technologically savvy can break it easily, in the US that is a crime even if you own the e-book, so the major impediments to long-term ePub readability may be legal rather than technical.

I agree that a more versatile format will likely emerge. I think, however, that if we ignore the complications of DRM, ePub could have a lifetime similar to the old txt file. After all, it's really just a jazzed-up version of that, anyway.

Scott Laz said...

The Calibre program can convert between epub, mobi and lit files with little problem. Hopefully it will be possible to convert these old files into newer formats as they arise. My Blu-ray player can play regular DVDs and CDs (but I still need my turntable!).