Ebooks and Media Formats

Nate Hoffelder at the Digital Reader writes up a piece by a writer who considers the natural death cycle of various physical media. Mr Hoffelder finishes with a couple of bizarre conclusions:

  1. ``So in conclusion, I’m going to go against my source and predict an indefinite lifespan for current ebook formats (barring some unpredictable random occurrence).''

  2. ``This is one of those times where it is safe to bet on proprietary over open formats.''

Mr Hoffelder is usually sharper than this. His #1 runs counter to everything we know about the digital era, as does his #2. More on the second notion:

Kinds of ebooks

  • An ebook (the field Mr Hoffelder covers) is a digital file. Ebooks fall along a range from proprietary to open formats.

  • 1. Binary Blobs: These are the most proprietary. They can be opened and read only with software that understands the file format, and that reader software is usually proprietary as well. To read this format 100 years from now, you will need a computer plus the hardware and software to read whatever physical media the file lives on (requirements assumed to be common to all ebook formats). But you will also need the software that can open and display the contents of the binary blob.

  • 2. Compressed and Encrypted: These are compressed (by either proprietary or standard compression algorithms) and encrypted. These files are almost as proprietary as the binary blobs. To read them you must have software that can decompress the file and software that can supply the proper key to decrypt it. Usually a proprietary format such as this comes with a single tool that does both jobs.

  • 3. Compressed but Not Encrypted: These are compressed (again, by either proprietary or standard compression algorithms) but not encrypted. All you need is software that can decompress them.

  • 4. Not Compressed but Encrypted: These are not compressed, but they are encrypted, so you will need software that lets you enter a key to unlock the book inside the encryption wrapper.

  • 5. Neither Compressed nor Encrypted: These are just text, with some degree of markup, of which there are a number of levels:

    • a. Heavy Markup: OpenDocument (in its flat, single-file XML form) and Rich Text Format are both very verbose markup schemes: to find the text content of the book, you have to wade through a sea of markup language, after traversing a continent of header information. Usually, though, you will use software that understands and interprets the markup jungle employed.

    • b. Medium Markup: Medium markup might be HTML, whose header is only an islet, not a continent, and whose markup of the content is mere puddles, easily skipped over. This is the densest, most heavily marked-up format that you could read as raw text after a few hours' study (at most) of the markup language used.

    • c. Light Markup: Light or minimal markup schemes -- the best known and most widely used now seems to be John Gruber's Markdown -- use the least amount of markup to render the text, and are generally designed so that you could read the raw text with little or no knowledge of the markup language. The intention behind such schemes is that the markup is almost self-explanatory, though some notions have evolved from conventions of email that a reader 100 years from now might need help understanding.

    • d. No Markup -- Text Only ... This can comprise basic ASCII text or 8- or 16- or 32-bit encodings. Seven-bit ASCII text will need the least amount of software to decode, and higher-bit encodings will need only a little bit more. Such files might declare their encoding at the start of the file, as HTML files are encouraged to do.
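
To make the point about plain-text encodings concrete, here is a minimal Python sketch of how little software the innermost layer needs. The function name decode_plain_text and its declared parameter are hypothetical, chosen only for illustration. It assumes the file either carries a declared encoding, starts with a byte-order mark, or is ASCII/UTF-8.

```python
import codecs
from typing import Optional

def decode_plain_text(raw: bytes, declared: Optional[str] = None) -> str:
    """Turn the bytes of a text-only file into a string.

    `declared` is a hypothetical stand-in for an encoding stated
    at the start of the file or in accompanying metadata.
    """
    if declared:
        return raw.decode(declared)
    # Check the UTF-32 marks before the UTF-16 ones: the little-endian
    # UTF-32 mark begins with the same two bytes as the UTF-16 one.
    for bom, name in (
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    ):
        if raw.startswith(bom):
            return raw.decode(name)
    # No mark and no declaration: assume UTF-8, which reads
    # seven-bit ASCII unchanged.
    return raw.decode("utf-8")

sample = "Call me Ishmael."
for encoding in ("ascii", "utf-16", "utf-32"):
    print(encoding, decode_plain_text(sample.encode(encoding)) == sample)
```

The whole decoder is a lookup table and one call per case, which is roughly what "only a little bit more" software amounts to for the wider encodings.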

Every ebook that comprises what was called a `book' 60 years ago -- i.e., text -- will contain text. This text is the content and the heart of the ebook onion. Around that text is a wrapper of markup, more or less verbose. And that markup-wrapped text is then either encrypted or compressed, or both, as further wrappings. And this may then be blobbed into a binary file format.

Therefore, in order to read any ebook in a proprietary format, all the succeeding, interior layers of the ebook onion must also be readable.

In other words, to read any proprietary format ebook, you must be able to read text.
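
The onion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real ebook format is built: the names wrap_book, unwrap_book, and TextExtractor are hypothetical, the markup layer is a bare-bones HTML wrapper, and the compression layer is zlib. An encryption layer is left out because the Python standard library ships no cipher; it would sit outside the compression step in the same way.

```python
import html
import zlib
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the markup itself."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def wrap_book(text: str) -> bytes:
    """Text -> markup layer -> compression layer."""
    marked_up = "<html><body><p>" + html.escape(text) + "</p></body></html>"
    return zlib.compress(marked_up.encode("utf-8"))

def unwrap_book(blob: bytes) -> str:
    """Peel the layers in reverse order; the last step is reading text."""
    marked_up = zlib.decompress(blob).decode("utf-8")
    parser = TextExtractor()
    parser.feed(marked_up)
    parser.close()
    return "".join(parser.chunks)

book = "Call me Ishmael."
blob = wrap_book(book)
print(unwrap_book(blob) == book)
```

Every step of unwrap_book depends on the step after it: decompressing yields nothing useful unless the markup can be parsed, and parsing the markup yields nothing useful unless the text itself can be decoded and read.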

Text is the only format that will die only when the physical media the file is etched on, and all methods of reading data off those media, crumble away.



By SWP Pond, Wednesday, August 14, 2013 at 11:35 AM. When in doubt, blog.