ePub conversion

ePUB to MOBI in 5 Steps on Windows Platform

Here is a simple method for converting your book from ePUB into a MOBI file for Amazon Kindles. The same steps can be used for a conversion from MS Word. If you have a lot of photos and pictures, or unusual or fixed layouts, then this won’t be quite right for you, but if you have a text or fiction book with a few pictures this should be fine.

  1. Download Amazon’s KindleGen program (http://www.amazon.com/gp/feature.html?docId=1000765211). Get the latest version. Save it to the same directory/folder as the ePUB file you want to convert.
  2. Create a new text file and call it runkindlegen.bat. Put the following text into it (change the two names “your html file.epub” and “your book name.mobi” to your own file names) and save the file.

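    @echo off
    rem Use plain straight quotes here; curly quotes pasted from a word processor will not work
    rem -o names the output file, -verbose lists every message on the screen
    "kindlegen.exe" "your html file.epub" -c2 -o "your book name.mobi" -verbose
    pause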

  3. To create the .MOBI file, double-click the runkindlegen.bat file you have created. It will open a small black command window and, as ‘verbose’ is set, it will list any errors, warnings and information messages on the screen.
  4. Go through the messages and look for any with a code that starts Wxxx (warnings) or Exxx (errors). You can ignore the ones that start Ixxxx as these are just for information. If you don’t include a cover file then there will be at least one warning message for the missing cover. For any other message, check what the problem is and whether you can fix it.
  5. Download and install Amazon’s Kindle Previewer (http://www.amazon.com/gp/feature.html?docId=1000765261), and use it to check that the .MOBI file loads and reads OK.
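
If you would rather have the warnings and errors from steps 3 and 4 picked out for you, the run can also be scripted. The following is only a rough Python sketch under some assumptions: kindlegen.exe sits in the current folder, the file names are the placeholders from step 2, and every message carries a code made of W, E or I followed by digits somewhere on its line. The function names are our own and are not part of KindleGen.

```python
import re
import subprocess

# Assumption: message codes look like W, E or I followed by three or more digits.
CODE = re.compile(r"\b([WEI])\d{3,}\b")


def split_messages(output):
    """Group output lines by the first letter of the message code they contain."""
    groups = {"W": [], "E": [], "I": []}
    for line in output.splitlines():
        match = CODE.search(line)
        if match:
            groups[match.group(1)].append(line.strip())
    return groups


def run_kindlegen(epub, mobi, exe="kindlegen.exe"):
    """Run KindleGen with the same options as the batch file and group its messages."""
    result = subprocess.run(
        [exe, epub, "-c2", "-o", mobi, "-verbose"],
        capture_output=True,
        text=True,
    )
    return split_messages(result.stdout)
```

Calling run_kindlegen("your html file.epub", "your book name.mobi") returns a dictionary; the lists under "W" and "E" are the messages to read through, and the list under "I" can be ignored.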

The KindleGen program page (http://www.amazon.com/gp/feature.html?docId=1000765211) has some help options if you want more control over the formatting.

If the ePUB includes a cover and you’re planning to upload the MOBI file to Amazon, remove the cover from the ePUB first; otherwise you will end up with two covers.
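
To see whether an ePUB declares a cover before you upload, you can look inside it, as it is just a zip file. The Python sketch below is one hedged way to do that. It assumes the cover is declared in the common way, with a meta entry named "cover" in the package (.opf) file that points at an item in the manifest; an ePUB that shows its cover some other way will not be detected. The function names are our own.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}


def local_name(tag):
    """Strip any XML namespace from a tag name, e.g. '{ns}meta' becomes 'meta'."""
    return tag.rsplit("}", 1)[-1]


def find_opf_path(epub):
    """Return the path of the package file named in META-INF/container.xml."""
    root = ET.fromstring(epub.read("META-INF/container.xml"))
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return rootfile.get("full-path")


def declared_cover(epub_file):
    """Return the href of the declared cover image, or None if none is declared."""
    with zipfile.ZipFile(epub_file) as epub:
        opf = ET.fromstring(epub.read(find_opf_path(epub)))
    cover_id = None
    for element in opf.iter():
        if local_name(element.tag) == "meta" and element.get("name") == "cover":
            cover_id = element.get("content")
    if cover_id is None:
        return None
    for element in opf.iter():
        if local_name(element.tag) == "item" and element.get("id") == cover_id:
            return element.get("href")
    return None
```

If declared_cover("your book.epub") returns a file name rather than None, that image (and its entries in the package file) is what needs removing before the upload.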

Good luck.


19 October 2014

QuarkXpress 9.5.4 Cascading Style Sheets (CSS) for ePUB

We have just finished creating and checking the ePUB files for our latest book, Inspector Hobbes and the Gold Diggers by Wilkie Martin, and we thought we’d share the formatting we have used for it.

In our earlier post Building an ePUB with QuarkXpress 9.5.4 and Sigil (https://witcherleybooks.com/2014/08/17/building-an-epub/) we described how we format ePUBs in QuarkXpress and how we use the pre-defined Cascading Style Sheets (CSS) style names in the QuarkXpress files to control the format of the ePUB when it is output. In this post we share the contents of the amended style.css and toc.css files that we have used for our books when generating the ePUBs from QuarkXpress. Please note that we also make further amendments to the ePUB file after it is exported, so these files give us the starting position and do the vast majority of the book formatting.

These are used for English fiction books with a simple layout, with one or two photos/logos and some centered text. These books are in reflow format and do not include any fixed format, tables or indexes.

We use the standard QuarkXpress style names for these areas of the books:

 

    – body = for body
    – byline = for author name
    – figure = for figures
    – figure-caption = for some centered text, for the copyright page, for the company name and ISBNs
    – figure-credit = for figure credit
    – indented-para = for the standard chapter text – all indented
    – pullquote = for subtitle text
    – title1 = for the book title
    – title2 = for the advert headers for other books available
    – chapter-name = for the chapter numbers, and section titles
    – headline1 = for the first paragraph in a chapter or section (not indented)
    – headline2 = for the last paragraph in a section (indented and with a space after it to separate the sections)

To these we have added styles for centering and a style id for copyright (see our post Centering for ePUB (https://witcherleybooks.com/2014/09/03/centering-for-an-epub/), which describes why and how we’ve used this).

The QuarkXpress CSS files are found in this folder: QuarkXpress9\XTensions\DigitalPublishing\Templates\css

Here is the replacement style.css file that we have used.

.body {
margin-left: 0.5em;
margin-right: 0.5em;
orphans: 1;
widows: 1;
}
.byline {
font-size: 125%;
text-align: center;
text-indent: 0em;
margin-top: 1em;
margin-bottom: 1em;
}
.figure {
text-align: center;
text-indent: 0em;
}
.figure-caption {
font-size: 75%;
text-align: center;
text-indent: 0em;
margin-bottom: 1em;
}
.figure-credit {
margin-top: 1em;
margin-bottom: 1em;
margin-left: 2em;
text-align: left;
text-indent: 2em;
}
.indented-para {
text-indent: 1.25em;
}
.pullquote {
font-size: 125%;
text-align: center;
text-indent: 0em;
margin-top: 1em;
margin-bottom: 2em;
}
.title1 {
font-size: 150%;
text-align: center;
text-indent: 0em;
line-height: 2em;
}
.title2 {
margin-top: 1em;
text-align: center;
text-indent: 0em;
margin-bottom: 1em;
}
.chapter-name {
font-size: 150%;
text-align: center;
text-indent: 0em;
margin-bottom: 1em;
}
.headline1 {
}
.headline2 {
text-indent: 1.25em;
margin-bottom: 1em;
}
.bold {
font-weight: bold;
}
.italic {
font-style: italic;
}
.underline {
text-decoration: underline;
}
.strikethrough {
text-decoration: line-through;
}
.strikethrough-and-underline {
text-decoration: line-through underline;
}
.superscript {
vertical-align: super;
}
.subscript {
vertical-align: sub;
}
.superior {
vertical-align: super;
}
p {
margin-top: 0;
margin-bottom: 0;
orphans: 1;
widows: 1;
}
p.title1 {
text-align: center;
text-indent: 0em;
}
p.title2 {
text-align: center;
text-indent: 0em;
}
p.pullquote {
text-align: center;
text-indent: 0em;
}
p.byline {
text-align: center;
text-indent: 0em;
}
p.figure-caption {
text-align: center;
text-indent: 0em;
}
p.figure {
text-align: center;
text-indent: 0em;
}
p.chapter-name {
text-align: center;
text-indent: 0em;
}
p.headline1 {
}
p.headline2 {
}
#copyright {
font-size: 75%;
text-align: center;
text-indent: 0em;
}
span.centered {
display: block;
text-align: center;
text-indent: 0em;
}
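
After editing style.css by hand it is easy to lose or misspell one of the standard style names. As a rough check, the Python sketch below lists any of the twelve style names from the list above that no longer have a rule in the file. It is only an illustration: it looks at the text in front of each opening brace and does not understand comments or the rest of CSS, and the function names are our own.

```python
import re

# The standard QuarkXpress style names listed earlier in this post.
STANDARD_STYLES = [
    "body", "byline", "figure", "figure-caption", "figure-credit",
    "indented-para", "pullquote", "title1", "title2", "chapter-name",
    "headline1", "headline2",
]


def class_names(css_text):
    """Return the set of class names used in the selectors of css_text."""
    names = set()
    # A selector is taken to be the text immediately before each "{".
    for selector in re.findall(r"([^{}]+)\{", css_text):
        names.update(re.findall(r"\.([A-Za-z][\w-]*)", selector))
    return names


def missing_styles(css_text, expected=STANDARD_STYLES):
    """List the expected style names that have no rule in css_text."""
    found = class_names(css_text)
    return [name for name in expected if name not in found]
```

Reading the file with open("style.css").read() and passing the text to missing_styles should give an empty list for the file shown above.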

And the replacement toc.css that we have used:

#toc-style {
font-size: 100%;
}
#toc-style li {
list-style-type: none;
}
#toc-style ol {
font-size: 100%;
}
#toc-style ol li {
padding-left: 10px;
list-style-type: lower-alpha;
}
#toc-style ol ol {
font-size: 100%;
}
#toc-style ol ol li {
padding-left: 5px;
list-style-type: lower-roman;
}
.space-fix {
visibility: hidden;
line-height: 0;
}

These are the formats we’ve used so far. It is possible we will find things we need to change in them later. Once we generate the ePUB we use a number of devices to check the layout is OK and matches what we want. We can’t guarantee these will give you what you want; they are just an example of what can be done.

If you notice anything wrong or can suggest a better way of laying out any of the above then please let us know. We’d be happy to hear from you.


Definitions:

centered – text/pictures that appear in the middle of the page (using American spelling)

css – cascading style sheet – a common holding place for the style rules used to control the layout

ePUB – electronic publication, a zipped file containing files for each book chapter and layout files that make up an e-book

toc – table of contents

 

Centering for an ePUB

This is how we center the text for our ePUB files. After creating our ePUB files we found that the centering of text didn’t always work, even on devices where we’d tested the ePUB files during the build. This is because the ePUB that eventually ends up on the ereading device goes through the proprietary load process for that device, and some of these processes render the centering commands differently. We therefore used a combination of approaches to centering, though it is also worth considering whether you want to center at all. Some devices let the user control how the text is shown and what justification to use, and some of the code suggested below will mean that the user’s preferences are ignored, so don’t use this approach too widely.

This is what is needed around the text you want to center within each of the text html files (note the plain straight quotes):
<p class="yourclass"><span class="centered">Your text to center</span></p>
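
Adding the span by hand to every centered paragraph is tedious, so here is a hedged Python sketch of how it could be automated. It assumes simple exported markup in which the paragraph tag has only a class attribute written with straight quotes, and it uses a regular expression rather than a real html parser, so treat it as a starting point and check the result. The function name and the list of classes passed in are our own choices.

```python
import re


def add_centered_span(html, classes):
    """Wrap the content of <p class="..."> paragraphs in <span class="centered">."""
    names = "|".join(re.escape(name) for name in classes)
    pattern = re.compile(r'(<p class="(?:' + names + r')">)(.*?)(</p>)', re.DOTALL)

    def wrap(match):
        opening, content, closing = match.groups()
        if content.startswith('<span class="centered">'):
            return match.group(0)  # already wrapped, so leave it unchanged
        return opening + '<span class="centered">' + content + "</span>" + closing

    return pattern.sub(wrap, html)
```

For example, add_centered_span(text, ["title1", "byline"]) wraps only the paragraphs with those two classes and leaves all other paragraphs alone.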

This is also needed. If you want the whole ‘body’ centered, you can add the ‘id’ centertext to the body tag, but you should still put the spans around each individual paragraph (p):
<body class="xyz" id="centertext">
The main text area that makes up the ‘body’ goes here. This is the main area of the book chapter; it starts with <body> and ends with </body>
</body>

These are the controls for this that we included in the style.css:
#centertext {
text-align: center;
text-indent: 0em;
}
span.centered {
display: block;
text-align: center;
text-indent: 0em;
}

The style.css was linked to each file with the following code in the head section at the top of the html file:
<head>
<link href="…css" rel="stylesheet" type="text/css" />
<title>Your title</title>
</head>


Definitions:

ePUB – format for an electronic book

ereader – device to read an ebook

css – cascading style sheet, set of shared formatting commands to control the appearance

html – hypertext markup language. The formatting words that go around the text to tell the ereader how you want the book to appear (within < and > characters)

0em – here used for text-indent, this gives the size of the indent. It says the indent is made up of no (zero) units of the type called ‘em’, so there is no indent. An em is a unit equal to the font size you are using for that part of the text, traditionally associated with the width of the ‘m’ character

#centertext – the name of this section of the css file; it applies to the html element whose id is centertext

p, class, span, /span, /p, body, /body, head, /head, link, href, rel, type, title – html commands

text-align, text-indent, display: block – css control commands; span.centered – the css name for a span with the class centered


11 September 2014

Building an ePUB with QuarkXpress 9.5.4 and Sigil

There are many ways to get to an ePUB. This post covers just one way of building them, which is the method we’ve used. It may not be the best or most efficient, and it requires a number of software packages. If you are interested in what you are getting for your money from an ePUB conversion site, though, they will be doing something along these lines (not necessarily using the same packages or applying the same checks).

Because ePUBs can be read on many different device types, on many different readers, by apps, on phones, and on computers, our starting position was to make sure we had text that was as clean as possible, so before we started we made sure the book used as few layouts and formats as it could. We’d done this already when building the files for the paperback and Kindle formats. First we made sure the paragraph formats were consistent: that tabs, fonts, indents, spacing etc. used as few different types as possible and were applied in the same way throughout. Then we looked for a tool to build the ePUB. The one we decided to use was QuarkXpress 9 (which we upgraded to 9.5.4.1; it has to be at least 9.2). It was a disappointing tool as it required workarounds, but it did do the job in the end, and with practice it is actually quicker than the number of steps listed below implies. The books being converted by the approach below are English fiction novels with only a couple of images. The aim was to put in as little html as is needed and, as far as possible, to use the default settings in the ereader, so that any settings the ereader lets the user update and control will be reflected in your book.

  1. start with as clean a book text as you can get to
  2. go through the layouts and see how many different paragraph and format types you need. We had the following:
    – heading page – title text
    – heading page – subtitle text
    – heading page – author byline
    – heading page – company title
    – title verso page – text
    – copyright page – text
    – chapters – heading
    – chapters – para one (no indent)
    – chapters – main pages (indent)
    – chapters – end para (indent, plus space after)
    – chapters – special text for a letter
    – figure – for pictures and logos
  3. give each format a name from the standard QuarkXpress layouts allowed in style.css (in the folder \QuarkXpress9\XTensions\DigitalPublishing\Templates\css). There aren’t many of these and they aren’t changeable within QuarkXpress (however, with a workaround they can be changed so that they are used during the export, but not the display). The choices are:
    – body
    – byline
    – figure
    – figure-caption
    – figure-credit
    – indented-para
    – pullquote
    – title1
    – title2
    – chapter-name
    – headline1
    – headline2
  4. select one of each of these standard formats and assign one to each of the formats you want in the document
  5. QuarkXpress style.css also contains text formats which we kept as they were: bold, strikethrough, strikethrough-and-underline, superscript, subscript, superior
  6. amend the QuarkXpress style.css file to give the format you want for each of the layouts, set the indent, spacing, font size, weight etc – this will only be applied when you export the file to an ePUB – when you look at it in QuarkXpress it will show the standard layouts so it won’t look as you want it to until you export it.
  7. load your text into QuarkXPress
  8. split the text into chapters (articles and components), and fix any errors during the load – for instance cutting and pasting your text into the reflow view will lose some formatting like italics so reset those if you have/want them
  9. load your cover (if you want one, some ebook loaders want the cover separate to the epub file)
  10. go through all the text and allocate the format you want (that will link to your new format in style.css) to each of the paragraphs, or text you want to control. This is a manual process that requires going through the whole file.
  11. setup the metadata for the file
  12. enter the table of contents information, or select the data to be used for the table of contents (we use the article names)
  13. export the file to ePUB – this will use your new style.css
  14. check this layout in as many ePUB viewers as you can
  15. get some ePUB reading devices and test it in these too. Calibre (http://calibre-ebook.com/) is a useful tool for loading your manually built ePUB onto many different ereaders
  16. We found a few problems with the file generated that then needed a further step to fix them:
    – if underlined text included non-underlined spaces within the text these didn’t export correctly and needed to be fixed manually afterwards by amending the html,
    – any embedded non-breaking characters didn’t always render correctly depending upon the ereader so we removed them,
    – if there was a change of font (or anything in a span) followed by punctuation, then that punctuation could wrap around to the next line – we resolved this by moving the punctuation to within the span command
    – not all ereaders render centered text the same, so we had to use a combination of centered within the style.css format PLUS use a span set to centered for all the text
  17. the metadata and text weren’t quite how we wanted them so we used Sigil (https://code.google.com/p/sigil/) to finish the job:
    – to add more metadata settings
    – to remove any position set to absolute (in toc.css)
    – add html links to any web addresses (http/https) contained in the text
    – add descriptions to any image files
    – remove numbers in the toc.css style for the toc entries
    – remove ‘style="padding-left:30px;"’ which was added into the html for all formats
    – change the title from h1 to h3
    – remove the additional css files (horizontal.css and vertical.css)
    – add a span format for centered text
    – in the first text file (Flow_2.html) amend body from
    <body class="body">
    to
    <body class="body" id="startpos">
    – in the content.opf file in the ePUB, add a control for the first text file
    <reference href="Text/Flow_2.html" title="1" type="text" />
    – find all centered text and add the span command around your text
  18. depending upon the distributor service you use then it may be necessary to remove the cover image from the file as some like to handle this separately
  19. next validate the ePUB file
    – use Sigil’s option for FlightCrew validation to check the file
    – use Sigil’s options for W3C validation checks
    – use the online ePubValidator (http://validator.idpf.org/) to check your file format is ok
  20. amend anything that comes up in the checks. If necessary go back to the QuarkXpress file and amend it there, re-export it and reapply the changes post export
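
Because the post-export changes in step 17 have to be reapplied after every re-export, some of them are worth scripting. The Python sketch below covers three of them: removing the added padding, changing h1 to h3, and adding the id to the body of the first text file. It is a sketch under assumptions, not a replacement for checking the files in Sigil: it assumes the exported html uses straight quotes and exactly the attribute text shown in step 17, and the function name is our own.

```python
import re


def clean_exported_html(html, first_file=False):
    """Apply three of the manual post-export fixes to the text of one html file."""
    # Remove the inline padding that the export adds to the html.
    html = html.replace(' style="padding-left:30px;"', "")
    # Change the title heading from h1 to h3 (both opening and closing tags).
    html = re.sub(r"<(/?)h1\b", r"<\1h3", html)
    if first_file:
        # Mark the start position in the first text file, e.g. Flow_2.html.
        html = html.replace('<body class="body">', '<body class="body" id="startpos">', 1)
    return html
```

Each text file in the ePUB would be read, passed through clean_exported_html (with first_file=True only for the first text file) and written back, before moving on to the validation in step 19.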

Even though we checked and rechecked the files we still had a final error, which rather annoyingly showed up on one of the ereaders we’d actually used to validate the files, where our checks had shown no problem. This was because the ePUB then goes through the formatter for the ereader delivery/retail site, which may add to or amend the files slightly. This is why the centered text needs more than one approach.

Although we weren’t very impressed with needing to use a workaround to get a useable ePUB file out of QuarkXpress, in the end it was pretty straightforward to do, and with practice relatively quick as well.

You don’t especially need QuarkXpress to generate the ePUB file. We went that route as it is an industry standard package, but we found that the ePUB files generated from the donation-based package Calibre were also pretty clean (with little spurious html that could be rendered incorrectly by ereaders), although the files generated did have much larger css files. In fact Calibre got one thing right that QuarkXpress didn’t: non-underlined spaces embedded next to underlined text were exported correctly by Calibre.

Also, if you only have one or two ePUBs to convert and you do want to use QuarkXpress (or Adobe InDesign), have a look at their 30-day trial offers or at offers on eBay first, as they are expensive packages. Both now also offer their latest packages from the cloud (SaaS), but again this is pretty expensive.

Good luck with your conversions.


Definitions:

absolute position – an html command that gives a specific location for the text/picture

article, component – QuarkXpress terms for subsections of the publication

centered – text/pictures that appear in the middle of the page (using American spelling)

css – cascading style sheet – a common holding place for the style rules used to control the layout

embed – contained within

ePUB – electronic publication, a zipped file containing files for each book chapter and layout files that make up an e-book

html – command language within <> brackets that controls the layout of webpages, and ePUB files

html links – the web page address for the page being referred to

h1, h3 – html commands for the main heading and the (normally) smaller heading no 3

metadata – the collection of values that are used to describe the ePUB file, its title, description, ISBN etc

SaaS – software as a service – software that is held on the cloud

span – an html command that controls the layout for a small set of text within the commands

toc – table of contents


17 August 2014
amended 25 September 2014