Aspose.Words

How-to: Convert a Document to EPUB

EPUB (short for electronic publication) is an HTML-based format commonly used for electronic book distribution. Aspose.Words fully supports exporting to this format, producing electronic books compatible with the majority of reading devices. This article shows how to convert a simple MS Word document to EPUB with a few lines of code. It also demonstrates what a sample document looks like after being converted to EPUB using Aspose.Words.

Converting a Document to EPUB

Example Doc2EpubSave

Converts a document to EPUB using default save options.

[Java]

 

// Open an existing document from disk.
Document doc = new Document(getMyDir() + "Document.EpubConversion.doc");

// Save the document in EPUB format. The format is inferred from the .epub file extension.
doc.save(getMyDir() + "Document.EpubConversion Out.epub");

Specifying Save Options

You can specify a number of options by passing an instance of HtmlSaveOptions to the Document.save method. The code snippet below shows a few of them in action.

Example Doc2EpubSaveWithOptions

Converts a document to EPUB with save options specified.

[Java]

 

// Open an existing document from disk.
Document doc = new Document(getMyDir() + "Document.EpubConversion.doc");

// Create a new instance of HtmlSaveOptions. This object allows us to set options that control
// how the output document is saved.
HtmlSaveOptions saveOptions = new HtmlSaveOptions();

// Specify the desired encoding (Charset is java.nio.charset.Charset).
saveOptions.setEncoding(Charset.forName("UTF-8"));

// Specify the elements at which to split the internal HTML. Each split starts a new HTML part
// within the EPUB, which limits the size of each part. This is useful for readers that cannot
// open HTML files larger than a certain size, e.g. 300 KB.
saveOptions.setDocumentSplitCriteria(DocumentSplitCriteria.HEADING_PARAGRAPH);

// Specify that we want to export document properties.
saveOptions.setExportDocumentProperties(true);

// Specify that we want to save in EPUB format.
saveOptions.setSaveFormat(SaveFormat.EPUB);

// Export the document as an EPUB file.
doc.save(getMyDir() + "Document.EpubConversion Out.epub", saveOptions);
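
The split criteria above are meant to keep each HTML part small enough for readers with a size limit. One way to confirm the effect is to inspect the generated file: an EPUB is a ZIP archive, so the standard java.util.zip classes can list its parts. The following is a minimal sketch that uses only the JDK and is not part of the Aspose.Words API. The class name EpubPartSizes, the method findOversizedParts, and the 300 KB threshold are illustrative assumptions.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Aspose.Words: lists HTML parts above a size limit.
public class EpubPartSizes {

    /**
     * Returns the names of HTML/XHTML entries whose uncompressed size exceeds maxBytes.
     * An empty list means every HTML part fits within the limit.
     */
    public static List<String> findOversizedParts(String epubPath, long maxBytes) throws IOException {
        List<String> oversized = new ArrayList<>();
        // An EPUB is a ZIP archive, so ZipFile can enumerate its entries.
        try (ZipFile zip = new ZipFile(epubPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName().toLowerCase();
                boolean isHtmlPart = name.endsWith(".html")
                        || name.endsWith(".xhtml")
                        || name.endsWith(".htm");
                // getSize() is the uncompressed size, which is what a reader has to load.
                if (isHtmlPart && entry.getSize() > maxBytes) {
                    oversized.add(entry.getName());
                }
            }
        }
        return oversized;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java EpubPartSizes <file.epub>");
            return;
        }
        long limit = 300 * 1024; // 300 KB, the example limit mentioned in the comments above
        for (String part : findOversizedParts(args[0], limit)) {
            System.out.println("Part exceeds limit: " + part);
        }
    }
}
```

If the list is not empty, a finer split criterion may help, though which criterion suits a given document is something to check against your own output.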

 

 

A Sample Conversion

In the next few paragraphs we’ll review the results of a sample document converted to EPUB format. The screenshots below show the key features.

Since EPUB is a publishing format for electronic books, the most important features involve text. At a glance, the text and all key features in the EPUB output look identical to the source document.

The picture below shows the key text formatting features after conversion to EPUB.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\1 Text Variations.png

The wide range of paragraph formatting settings used in the following example renders the same as in the source document.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\2 Paragraph Formatting.png

The following picture shows how well tables are rendered, regardless of their complexity.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\3 Tables.png

Even complex lists from the source document are exported well to EPUB.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\4 Lists.png

Images are essential for most publications and can be aligned differently on the screen.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\5 Images.png

This picture shows a table of contents generated from the source document, exported as inline text with working hyperlinks. The same headings that make up the TOC in the source document are also exported to the navigation pane in the EPUB for easy navigation.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\6 TOC.png

EPUB File Validation

The EPUB documents produced by Aspose.Words pass validation, which means that they adhere to the EPUB standards and contain no errors.

Passing validation doesn’t guarantee that every device or EPUB viewer will display your document in exactly the same way, but it does give the best chance that your document will be displayed as close as possible to what was originally intended.

The picture below shows the report produced by one of the validation services for the document we just converted.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\Epub_Validation.png
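
Before submitting a file to a validation service, a quick local sanity check can catch gross packaging problems. The EPUB container format requires that the first entry in the ZIP archive is named mimetype and contains the text application/epub+zip, and that the archive includes META-INF/container.xml. The sketch below checks only those points using the JDK; it is not a substitute for a full validator and does not verify, for example, that the mimetype entry is stored uncompressed. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Aspose.Words: a partial EPUB container check.
public class EpubContainerCheck {

    /** Returns true if the file passes a few basic EPUB container checks. */
    public static boolean looksLikeEpub(String epubPath) throws IOException {
        // ZipInputStream reads entries in physical order, so the first entry
        // returned is really the first one stored in the archive.
        try (InputStream in = Files.newInputStream(Paths.get(epubPath));
             ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry first = zin.getNextEntry();
            if (first == null || !"mimetype".equals(first.getName())) {
                return false;
            }
            // Read the whole content of the mimetype entry.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[64];
            int n;
            while ((n = zin.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            String mime = new String(buffer.toByteArray(), StandardCharsets.US_ASCII);
            if (!"application/epub+zip".equals(mime)) {
                return false;
            }
        }
        // The container file points readers to the package document.
        try (ZipFile zip = new ZipFile(epubPath)) {
            return zip.getEntry("META-INF/container.xml") != null;
        }
    }
}
```

A call such as EpubContainerCheck.looksLikeEpub("Document.EpubConversion Out.epub") returning false would indicate a packaging problem worth investigating before running a full validation.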

Metadata in EPUB Files

Metadata is additional information, such as author name, title, and comments, that is attached to a file but is not visible in the content of the file itself.

Word document formats have special properties dedicated to such metadata, and these can be exported to EPUB files as well. Distributors and e-book stores often require metadata fields, both as keywords for their search engines and to provide customers with information about books.

The picture below shows the metadata after the conversion.

C:\Aspose\Materials\How to Create Valid EPUB Files from Word Document Formats\EpubMetadata.png
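
The exported metadata can also be read back programmatically. In an EPUB, META-INF/container.xml names the package document (the .opf file), and that file carries Dublin Core elements such as dc:title and dc:creator. The sketch below reads a few of them using only the JDK. The class name, method name, and the choice of fields are illustrative assumptions, and which Word properties end up in which elements is something to verify against your own output.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Aspose.Words: reads Dublin Core metadata from an EPUB.
public class EpubMetadataReader {

    private static final String DC_NS = "http://purl.org/dc/elements/1.1/";
    private static final String CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container";

    // Parses one XML entry of the archive into a namespace-aware DOM document.
    private static Document parse(ZipFile zip, String entryName) throws Exception {
        ZipEntry entry = zip.getEntry(entryName);
        if (entry == null) {
            throw new IOException("Missing entry: " + entryName);
        }
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        try (InputStream in = zip.getInputStream(entry)) {
            return factory.newDocumentBuilder().parse(in);
        }
    }

    /** Returns the Dublin Core fields that are present, keyed by element name. */
    public static Map<String, String> readMetadata(String epubPath) throws Exception {
        Map<String, String> result = new LinkedHashMap<>();
        try (ZipFile zip = new ZipFile(epubPath)) {
            // Step 1: the container file tells us where the package document lives.
            Document container = parse(zip, "META-INF/container.xml");
            NodeList rootfiles = container.getElementsByTagNameNS(CONTAINER_NS, "rootfile");
            if (rootfiles.getLength() == 0) {
                throw new IOException("No rootfile element in container.xml");
            }
            String opfPath = ((Element) rootfiles.item(0)).getAttribute("full-path");

            // Step 2: read selected Dublin Core elements from the package document.
            Document opf = parse(zip, opfPath);
            for (String field : new String[] {"title", "creator", "description", "subject"}) {
                NodeList nodes = opf.getElementsByTagNameNS(DC_NS, field);
                if (nodes.getLength() > 0) {
                    result.put(field, nodes.item(0).getTextContent().trim());
                }
            }
        }
        return result;
    }
}
```

Printing the map returned by EpubMetadataReader.readMetadata for a converted file is a quick way to confirm which fields a distributor or e-book store would see.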