Because it's a completely open and free standard.
The .epub is a standard for eBooks created by the International Digital Publishing Forum.
It consists of basic XHTML for the book content, XML for descriptions, and a re-named zip file to hold it all in.
Anyone can make these eBooks, and since they're essentially just XHTML, anyone can read them.
For a review of programs that can read ePub formatted books, click here.
Some books in the IDPF .epub format are available here.
There are now a few tools for automating the process of creating ePub books. Since the main goal of this page is making an ePub book by hand, I'll only briefly review them here.
Adobe InDesign
I've heard that this is what most publishers are using to create ePub books.
If I skipped a few mortgage payments, I could buy a copy and review it.
Feedbooks
Feedbooks is a wonderful site for downloading free books in the ePub format.
Also, if you create an account, you can publish your own books too. The publish feature on the site
works much like a wizard: you enter the title, author, and other information about the book, then
create chapters and add text to them.
Right now, there are a few minor hassles. For example, if you create a chapter and then want to move it up, there's no way to do it.
I've been informed that the process will be greatly improved very shortly. I'm also hoping for a
"quick and dirty" option to convert directly from a single file (right now you can't).
Book Glutton
If your book is in the HTML format, you can use Book Glutton to convert it to ePub. It doesn't support images yet.
Stanza
Stanza is a desktop application for reading eBooks in a variety of formats. It also has the option to save files
to a variety of different formats; one of which is ePub.
Stanza does a pretty good job of exporting to ePub, but it isn't really a creation tool. It saves to an ePub file,
but if you want to edit the title, author, or anything else in the output, you have to do it by hand. Still, if you
just want to read, the output is pretty good.
Calibre
Calibre is an open source book management program that is available for Linux, OS X, and Windows.
It also converts between different eBook formats, and does a good job of it too.
My one gripe is it has somewhat messy code output, which isn't a big deal if you just want to read a book, but
can make further hand edits 'interesting'.
eCub
I haven't used eCub yet; look for a review of it soon.
Azardi
I haven't used Azardi yet; look for a review of it soon. The website says it can read, manage,
and edit ePub files.
If you're interested, I figured out the information in this guide by a combination of reverse engineering the Sherlock Holmes book from Adobe's site, reading through the specs at the IDPF web site, and trial and error until I got a working eBook to load properly in Digital Editions. (I figure it's OK to do that since Holmes is in the public domain now...)
Tools Needed:
- A text editor. Anything that can edit text files, HTML, and XML. Example: Notepad
- A .zip program. Anything that can create .zip files. Example: Windows XP's built-in .zip support
Optional Tools:
You can make ePub files with just the programs that came with your operating system, but here are some
suggestions for tools that can make the process easier.
For Windows:
- Edit Plus (shareware)
- Notepad++
- Info-Zip
For OS X:
- Text Wrangler
I haven't found a free zip program that I love for OS X yet. Feel free to send suggestions.
Tools for cleaning up source documents:
Below are some tools for cleaning up the HTML/XHTML files often used as sources for ePub books.
Cleaner source code will produce a better looking book. Most of the ePub readers right now only support
basic tags and do strange and wonderful things when they see a tag they don't recognize.
Tag Soup
I had a nice Windows program specifically to clean MS Word crap out of HTML pages... but I can't seem to find
what I did with it. I'll keep looking and post a link here if I can find it again...
The Process can be broken down into two parts:
1. Prepare the content
2. Put it in the container.
First, let's go check out the official specs. Yes, it's very boring and hard to follow,
but aren't they all?
These will come in handy later on though. After getting the basic structure of the file set up,
the official specs are handy to reference for tags that aren't used very often, or if you can't
remember what exactly goes in a certain tag.
Don't let them scare you though, we really only have to fiddle with two XML files, the rest is
either straight XHTML, or files that you can copy from the sample file that we'll be looking at later.
Before we start preparing our own eBook, let's look inside a sample file.
Great. Now what is all this stuff?
A .epub file contains, at a bare minimum, the following files/folders:
Let's look at each of these in more detail.
Feel free to extract these files and use them as a template...
One thing to note before we get started: the filenames are case sensitive.
This means that if you have a file named "Chapter1.xhtml" and you refer to it as
"chapter1.xhtml" in the .OPF file or .NCX file, the book will not display properly.
mimetype
This file is just a plain ASCII text file that contains the line:
"application/epub+zip"
The operating system can look at this file to figure out what a .epub
file is instead of using the file extension.
This file must be the first file in the zip file, and must not be compressed.
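Both of those requirements are easy to get wrong with a general-purpose zip tool, so it can be worth checking a finished file. Here's a minimal sketch using Python's standard zipfile module. The function name check_mimetype_entry is made up for this example (it isn't part of any ePub tool), and the exact byte-for-byte comparison of the file's content is my own assumption about how strict to be.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype_entry(epub_file):
    """Return a list of problems with the mimetype entry (empty if none).

    epub_file may be a path or a file-like object.
    """
    with zipfile.ZipFile(epub_file) as zf:
        entries = zf.infolist()
        if not entries or entries[0].filename != "mimetype":
            return ["mimetype is not the first file in the zip"]
        first = entries[0]
        problems = []
        if first.compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        # Strict comparison (an assumption): even a trailing newline is flagged.
        if zf.read(first) != EPUB_MIMETYPE:
            problems.append("mimetype does not contain application/epub+zip")
        return problems


if __name__ == "__main__":
    # Build a tiny in-memory zip to try the check on.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
    buf.seek(0)
    print(check_mimetype_entry(buf))
```

An empty list means the mimetype entry looks fine; anything else tells you which of the two rules was broken.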
META-INF Folder
This contains the container.xml file, which points to the location of the
content.opf file.
This folder is the same for every e-book, so you should be able to recycle the whole
folder from the sample file without making changes.
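If you'd rather generate container.xml than copy it, here's a rough sketch in Python. The helper names make_container_xml and read_opf_path are invented for this example, and the default path OEBPS/content.opf simply assumes the folder layout used on this page.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"


def make_container_xml(opf_path="OEBPS/content.opf"):
    """Return the text of META-INF/container.xml pointing at opf_path."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<container version="1.0" xmlns="{CONTAINER_NS}">\n'
        '  <rootfiles>\n'
        f'    <rootfile full-path={quoteattr(opf_path)}\n'
        '              media-type="application/oebps-package+xml"/>\n'
        '  </rootfiles>\n'
        '</container>\n'
    )


def read_opf_path(container_xml):
    """Read back the path of the .opf file from container.xml text."""
    root = ET.fromstring(container_xml.encode("utf-8"))
    rootfile = root.find(
        f"{{{CONTAINER_NS}}}rootfiles/{{{CONTAINER_NS}}}rootfile"
    )
    return rootfile.get("full-path")


if __name__ == "__main__":
    print(make_container_xml())
```

The full-path value is the only part that would change from book to book, and only if you move or rename the .opf file.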
OEBPS Folder
Notes on the OEBPS folder:
This is the folder where the book content is stored. According to the IDPF spec, you don't have to
put your book content in here, but it is recommended. I've come across at least two readers that won't read
the book properly if the content isn't in this folder. (If you do put your book content somewhere else, make sure
that you update container.xml to point to the correct location of the content.opf file.)
- images Folder
If you have any images for your eBook, they go in here.
- content.opf
This file gives a list of all files in the .epub container, defines the order of files,
and stores meta data (author, genre, publisher, etc.) information.
Note that this file can be named anything you want to call it, as long as the container.xml
file mentioned above points to the correct filename.
Lots of stuff in this file. I'll go through each required tag here. Check the specs to see more information about optional meta data tags.
dc:title - Title of the book
dc:language - Identifies the language used in the book content. The content has to comply
with RFC 3066.
List of language codes.
(I'd just copy the language tag from the sample...)
dc:identifier - This is the book's unique ID. This has to be a unique identifier for every different e-book.
The spec doesn't give any sort of recommendation for what to use, but an ISBN number would be a good bet.
I used the name of my web site and the date and time.
One thing to note: because of how the file interacts with toc.ncx, just modify what's after the "uuid:" on this line.
Next comes the manifest. This is just a listing of the files in the .epub
container, and their file type.
Each item is also assigned an item ID that's used in the spine section of content.opf.
This list does not have to be in any particular order.
The spine section lists the reading order of the contents. The spine doesn't have to list every file in the manifest, just the reading order. For example, if the manifest lists images, they do not have to be listed in the spine.
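To show how the metadata, manifest, and spine fit together, here's a minimal sketch in Python that writes out a bare-bones content.opf. Treat it as an illustration rather than the layout of the sample file: the function name make_content_opf, the id "BookID", and the assumption that the manifest contains an item with the id "ncx" (for toc.ncx) are all choices made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def make_content_opf(title, language, identifier, manifest, spine):
    """Build a minimal content.opf.

    manifest: list of (item_id, href, media_type) tuples, in any order.
    spine: list of item ids in reading order; each must be in the manifest.
    """
    manifest_ids = {item_id for item_id, _href, _type in manifest}
    missing = [idref for idref in spine if idref not in manifest_ids]
    if missing:
        raise ValueError(f"spine refers to ids not in the manifest: {missing}")

    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<package xmlns="{OPF_NS}" version="2.0" unique-identifier="BookID">',
        f'  <metadata xmlns:dc="{DC_NS}">',
        f'    <dc:title>{escape(title)}</dc:title>',
        f'    <dc:language>{escape(language)}</dc:language>',
        f'    <dc:identifier id="BookID">{escape(identifier)}</dc:identifier>',
        '  </metadata>',
        '  <manifest>',
    ]
    for item_id, href, media_type in manifest:
        lines.append(
            f'    <item id={quoteattr(item_id)} href={quoteattr(href)} '
            f'media-type={quoteattr(media_type)}/>'
        )
    lines.append('  </manifest>')
    # Assumption: the manifest item for toc.ncx has the id "ncx".
    lines.append('  <spine toc="ncx">')
    for idref in spine:
        lines.append(f'    <itemref idref={quoteattr(idref)}/>')
    lines.append('  </spine>')
    lines.append('</package>')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_content_opf(
        "My Book", "en", "uuid:example-site-2009-01-01",
        manifest=[
            ("ncx", "toc.ncx", "application/x-dtbncx+xml"),
            ("cover-image", "images/cover.png", "image/png"),
            ("chapter1", "chapter1.xhtml", "application/xhtml+xml"),
        ],
        spine=["chapter1"],
    ))
```

Note that the image is listed in the manifest but not in the spine, and that the function refuses to build a spine that points at an id the manifest doesn't have.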
- toc.ncx
This is the table of contents. This file controls what shows up
in the left Table of Contents pane in Digital Editions.
Things you need to change:
- Make sure the uid matches what you have in content.opf
- docTitle: The text inside the text tag is what will show up as the book's title in the reader software
- The navPoint tag.
Each navPoint is a chapter listing: the text is the chapter name, and the src is the file it links to.
If you copy a navPoint tag set to add chapters, make sure to update the id and playOrder values.
Notes:
According to the spec, the ID can be anything you want, but it's easier to keep track of things if you use the same
ID you used for that file in the .OPF file. Also, some readers won't properly display the Table of Contents if
the ID doesn't match.
Also, the playOrder values have to be in order. (An item with playOrder 1 will be before an item with playOrder
2, etc.) They also have to be listed in order, and can't have any gaps. (You'll get an error if you jump from
1 to 20, etc.)
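Since the play order numbering is the easiest thing to break when copying tags by hand, here's a small Python sketch that generates a flat toc.ncx and numbers the entries automatically. The function name make_toc_ncx is invented for this example, and the fixed head values (depth 1, page counts of 0) are my assumptions for a simple one-level table of contents.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def make_toc_ncx(uid, book_title, chapters):
    """Build a minimal, flat toc.ncx.

    uid: must match the identifier used in content.opf.
    chapters: list of (item_id, chapter_title, src) tuples in reading order.
    playOrder is numbered 1, 2, 3, ... so it is in order with no gaps.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f'<ncx xmlns="{NCX_NS}" version="2005-1">',
        '  <head>',
        f'    <meta name="dtb:uid" content={quoteattr(uid)}/>',
        '    <meta name="dtb:depth" content="1"/>',
        '    <meta name="dtb:totalPageCount" content="0"/>',
        '    <meta name="dtb:maxPageNumber" content="0"/>',
        '  </head>',
        f'  <docTitle><text>{escape(book_title)}</text></docTitle>',
        '  <navMap>',
    ]
    for play_order, (item_id, chapter_title, src) in enumerate(chapters, start=1):
        lines += [
            f'    <navPoint id={quoteattr(item_id)} playOrder="{play_order}">',
            f'      <navLabel><text>{escape(chapter_title)}</text></navLabel>',
            f'      <content src={quoteattr(src)}/>',
            '    </navPoint>',
        ]
    lines += ['  </navMap>', '</ncx>']
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_toc_ncx(
        "uuid:example-site-2009-01-01", "My Book",
        [("chapter1", "Chapter 1", "chapter1.xhtml"),
         ("chapter2", "Chapter 2", "chapter2.xhtml")],
    ))
```

Passing the same ids you used in the manifest keeps the two files in step, which is the safer choice for the readers that care about it.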
- page-template.xpgt
This file isn't part of the IDPF spec, but Adobe Digital Editions uses it for formatting
and setting column settings and whatnot. You don't need this file at all, but your book
will look nicer in Digital Editions if you include it. Other readers should just ignore it.
Note: You can use a .css style sheet file to lay out styles for your book as well.
Just make sure to list it in the manifest section of content.opf.
Also of note here, any styling should be done in a CSS stylesheet, and not in the document.
- Content .xhtml files
Content files should be XHTML 1.1 documents.
If you're not familiar with XHTML, it's basically HTML with closing tags for every element, and several
style tags are not supported.
As for how to arrange the content, you can have it all in one document with bookmarks at each chapter, or
each chapter in a separate .xhtml file. The latter looks nicer in most readers.
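As a rough illustration of what a chapter file looks like, and of the "every element gets a closing tag" rule, here's a Python sketch that writes a bare-bones chapter page and checks whether a page parses as XML. The names make_chapter_xhtml and is_well_formed, and the stylesheet filename stylesheet.css, are assumptions for this example. The check only tests well-formedness; it doesn't validate against the XHTML 1.1 DTD.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

XHTML_NS = "http://www.w3.org/1999/xhtml"


def make_chapter_xhtml(title, paragraphs, stylesheet="stylesheet.css"):
    """Return a minimal chapter page with one <p> per paragraph."""
    body = "\n".join(f"    <p>{escape(text)}</p>" for text in paragraphs)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"\n'
        '  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">\n'
        f'<html xmlns="{XHTML_NS}">\n'
        '  <head>\n'
        f'    <title>{escape(title)}</title>\n'
        f'    <link rel="stylesheet" type="text/css" href={quoteattr(stylesheet)}/>\n'
        '  </head>\n'
        '  <body>\n'
        f'    <h1>{escape(title)}</h1>\n'
        f'{body}\n'
        '  </body>\n'
        '</html>\n'
    )


def is_well_formed(page_text):
    """True if the text parses as XML (every tag closed and nested properly)."""
    try:
        ET.fromstring(page_text.encode("utf-8"))
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    page = make_chapter_xhtml("Chapter 1", ["It was a dark & stormy night."])
    print(is_well_formed(page))
    # Ordinary HTML habits, like an unclosed <br>, fail the check.
    print(is_well_formed("<html><body><p>Line one<br></p></body></html>"))
```

Pages that use named HTML entities such as &nbsp; will also fail this particular check, because the parser here doesn't load the DTD that defines them.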
Now we make the .epub container that all these files go in.
* The specification recommends that the book's files go in an "OEBPS" folder inside the zip file. If you put them in another spot, be sure that container.xml in the META-INF folder points to the correct location of the *.opf file.
The zip file layout should look something like this:
- mimetype
- META-INF
  - container.xml
- OEBPS
  - images
  - content.opf
  - toc.ncx
  - stylesheet.css
  - content.xhtml
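If your zip program won't let you control file order or compression, a few lines of Python can build the container instead. This is only a sketch under some assumptions: build_epub is a name invented for this example, the book's files already sit in one folder laid out as shown above, and the demo at the bottom uses placeholder file contents just to show the call.

```python
import os
import tempfile
import zipfile


def build_epub(source_dir, epub_path):
    """Zip the contents of source_dir into epub_path.

    epub_path should be outside source_dir so the output is not zipped
    into itself.
    """
    with zipfile.ZipFile(epub_path, "w") as zf:
        # mimetype goes in first and uncompressed (ZIP_STORED).
        zf.write(os.path.join(source_dir, "mimetype"), "mimetype",
                 compress_type=zipfile.ZIP_STORED)
        for folder, dirs, files in os.walk(source_dir):
            dirs.sort()  # visit subfolders in a predictable order
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                # Zip entry names always use forward slashes.
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname != "mimetype":
                    zf.write(full_path, arcname,
                             compress_type=zipfile.ZIP_DEFLATED)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        book = os.path.join(work, "book")
        os.makedirs(os.path.join(book, "META-INF"))
        os.makedirs(os.path.join(book, "OEBPS"))
        # Placeholder contents only; a real book needs the real files.
        files = {
            "mimetype": "application/epub+zip",
            "META-INF/container.xml": "<container/>",
            "OEBPS/content.opf": "<package/>",
        }
        for rel_path, text in files.items():
            with open(os.path.join(book, *rel_path.split("/")), "w",
                      encoding="utf-8") as f:
                f.write(text)
        out = os.path.join(work, "book.epub")
        build_epub(book, out)
        with zipfile.ZipFile(out) as zf:
            print(zf.namelist())
```

Everything other than mimetype is compressed normally; only that one file has to be stored as-is.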
You should now be able to open your eBook in Adobe Digital Editions, or any other reader that supports the .epub format.
If you want to cheat, download the file below. It's a zip file that has empty chapter pages,
and the content and toc files prefilled, so all you have to do is copy and paste your content
into the empty files, and modify the OPF and NCX files.
Blank Sample file
So you've made a sample ePub book, and it won't open, or it opens with an error, or looks funky. What now?
epubcheck is a program that will scan your ePub file and display any errors it finds in the book.
You can download it here
or go to threepress's website to have it scanned.