Aspose.Words

Save in the IDPF EPUB Document (.EPUB) Format

Save in the IDPF EPUB Document Format Overview

The following tables provide details about how Aspose.Words saves documents in the EPUB format. All EPUB documents produced by Aspose.Words are designed to pass validation.

Document

Note that not all Microsoft Word document features are available in the HTML format, and some features may be lost or converted to images.

If you are looking for a way to easily store documents in a database, consider the WordML or FlatOPC format. Both formats are fully XML-based, which makes them easy to store in a database, and both are native Word formats, which preserves the full fidelity of Microsoft Word features such as WordArt and text boxes.

Aspose.Words saves any loaded document to valid HTML 4.0 or XHTML 1.0. EPUB documents are exported as EPUB 2.0. There are plans to support the HTML 5 and EPUB 3.0 specifications as well. Numerous save options are also available to control how a document is exported to HTML. Here are some examples of what you can do:

·          Control the CSS style sheet type.

·          Specify the directory or streams where images should be saved to.

·          Specify how the URL for an image is constructed.

·          Split the internal HTML files when saving to HTML or EPUB to restrict HTML part size to less than 300kb. Some eReaders open EPUB files that have HTML files greater than this size slowly or not at all. Therefore it is recommended to export EPUBs using this option to allow all devices to read the file easily and correctly.

·          Export images as embedded Base64.

·          Export font size in relative units (em).

·          Save fonts with the HTML output.
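The 300 KB guideline in the list above can be checked after export with standard-library Python. This is a minimal sketch and not part of Aspose.Words: the function name `oversized_html_parts` and the set of file extensions are assumptions made for illustration, and the code only inspects the ZIP container of an EPUB that has already been saved.

```python
import io
import zipfile

MAX_PART_BYTES = 300 * 1024  # the 300 KB guideline from the list above


def oversized_html_parts(epub_bytes, limit=MAX_PART_BYTES):
    """Return (name, size) for each HTML part whose uncompressed size is >= limit."""
    oversized = []
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        for info in archive.infolist():
            name = info.filename.lower()
            # Assumption: HTML parts can be recognised by their file extension.
            if name.endswith((".html", ".xhtml", ".htm")) and info.file_size >= limit:
                oversized.append((info.filename, info.file_size))
    return oversized
```

An empty result suggests that every HTML part is below the limit; a non-empty one suggests re-exporting with document splitting enabled.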

Some features that are unsupported in HTML are exported as images. The Aspose.Words rendering engine takes care of rendering the feature to an image. In such cases, the level of support for this rendered feature can be found under the "Save to Image Format" supported features section.

You can also choose to create your own HTML writer for your own custom needs by building off the Aspose.Words rich DOM. Using the DocumentVisitor you can visit each node and build the HTML node by node.
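The visitor approach described above can be sketched in a few lines. The classes below (`Run`, `Paragraph`, `HtmlWriter`) are hypothetical stand-ins written for this illustration; they are not the Aspose.Words DOM or its DocumentVisitor API, and they only show the general pattern of visiting each node and emitting HTML node by node.

```python
from html import escape


class Run:
    """Hypothetical text run with a single formatting flag."""

    def __init__(self, text, bold=False):
        self.text = text
        self.bold = bold

    def accept(self, visitor):
        visitor.visit_run(self)


class Paragraph:
    """Hypothetical paragraph node that owns a list of runs."""

    def __init__(self, runs):
        self.runs = runs

    def accept(self, visitor):
        visitor.visit_paragraph_start(self)
        for run in self.runs:
            run.accept(visitor)
        visitor.visit_paragraph_end(self)


class HtmlWriter:
    """Visitor that builds HTML one node at a time."""

    def __init__(self):
        self.parts = []

    def visit_paragraph_start(self, paragraph):
        self.parts.append("<p>")

    def visit_paragraph_end(self, paragraph):
        self.parts.append("</p>")

    def visit_run(self, run):
        text = escape(run.text)  # escape <, > and & in document text
        self.parts.append(f"<b>{text}</b>" if run.bold else text)

    def html(self):
        return "".join(self.parts)
```

A writer is used by passing it to the root node, for example `Paragraph([...]).accept(writer)`, and then reading `writer.html()`.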

Currently most of the special Microsoft "Mso" attributes, which are normally added by Microsoft Word to HTML output to make it round-trip capable back to Word formats, are not written during export to HTML or MHTML. This makes the HTML produced by Aspose.Words much cleaner than the output produced by Microsoft Word which is often bloated with these many round-trip based attributes.

In the future we will add full support for these attributes on import and add an export option to control whether they are written at all.

See the following links in the documentation for further information:

·         Save a Document

·         HtmlSaveOptions

·         HtmlSaveOptions.CssStyleSheetType

·         HtmlSaveOptions.Encoding

General

Feature

Supported

Comment

See Also

Attached Template

N/A

 

 

Built-In Properties

Yes

Built-in properties such as word and character count are updated using Aspose.Words but are not updated automatically on save.

Instead, you need to explicitly update these properties using the appropriate Document member. We will add automatic update of these properties in a future version.

There is a save option that controls whether document properties are exported or not.

Title, Keywords, Description properties are always exported as title and meta tags to HTML or MHTML and as the appropriate Dublin Core tags when saving as EPUB.

Additional built-in properties are exported as custom <o:> tags. In EPUB format properties are also exported as Dublin Core tags.

·         Document.UpdateWordCount

·         HtmlSaveOptions.ExportDocumentProperties

Custom Properties

Yes

Custom properties are exported as custom <o:> tags to HTML.

·         HtmlSaveOptions.ExportDocumentProperties

Custom Payload Part

N/A

 

 

Custom XML Data Storage

N/A

 

 

Digital Signature

N/A

 

 

Embedded Package

N/A

Exported as a plain image.

 

Encryption

N/A

 

 

Font Table

Yes

 

 

Glossary Document/Quick Parts/Auto Text

N/A

 

 

Hyphenation

Planned

Paragraphs are exported as normal.

 

Key Map Customizations

N/A

 

 

Mail Merge Recipient Data

N/A

 

 

Office Math

Planned

It is planned to export Office Math as an image to formats that do not have native support for it.

 

Themes

Yes

Theme formatting is exported as direct formatting to HTML.

Only some theme formatting such as fonts are supported.

 

Toolbar Customizations

N/A

 

 

Variables

N/A

 

 

VBA Project (Macro)

N/A

Macros are not exported to HTML based formats.

 

VBA Project Digital Signature

N/A

 

 

Background

Yes

Only solid background is exported. Exported as style="background:xxx" on each <body> tag.

There are plans to export background shape as style-background.

 

Thumbnail

Yes

You can include a cover image in output EPUB documents either by importing an existing image or by generating a thumbnail of one of the document's pages using Aspose.Words.

·         BuiltInDocumentProperties.Thumbnail

Embedded Fonts

Feature

Supported

Comment

See Also

Embedding Fonts

Yes

There is an option to subset and export font resources to EPUB, MHTML and HTML.

Fonts that are embedded in the original DOCX can be optionally exported.

·         HtmlSaveOptions.ExportFontResources

·         HtmlSaveOptions.FontResourcesSubsettingSizeThreshold

·         HtmlSaveOptions.FontsFolder

·         HtmlSaveOptions.FontSavingCallback

Embed Only Non-Standard Fonts

N/A

 

 

Bibliography

Feature

Supported

Comment

See Also

Bibliography

Yes

Bibliography text is saved to HTML formats as normal text.

 

Sources/Citations

Yes

Bibliography sources are not saved to HTML.

 

Citation Style

N/A

 

 

Protection

Feature

Supported

Comment

See Also

Allow Only Comments

N/A

 

 

Allow Only Form Fields

N/A

 

 

Allow Only Revisions

N/A

 

 

Limit Formatting to Selection of Styles

N/A

 

 

Protection Password (Legacy)

N/A

 

 

Protection Password (OOXML)

N/A

 

 

Protected Sections

N/A

 

 

Protection Ranges

N/A

 

 

Read Only

N/A

 

 

Settings

Only some settings can be exported.

Feature

Supported

Comment

See Also

Asian Typography Settings

N/A

 

 

Compatibility Options

Planned

 

 

Endnote Options

N/A

 

 

Footnote Options

N/A

 

 

Mail Merge Settings

N/A

 

 

Print Settings

N/A

 

 

Show/Hide Settings

N/A

 

 

View Settings

N/A

 

 

Web Settings

N/A

 

 

XML Settings

N/A

 

 

Paragraphs

Exported to HTML as <p>. Paragraphs with built-in heading styles are exported as <h1> - <h6> elements.

See the following link in the documentation for further information:

·         HtmlSaveOptions.EpubNavigationMapLevel

General Formatting

There is a setting to export paragraph styles and formatting as inline CSS (style) only, or as a mix of inline and embedded or linked CSS style sheet (class).

Direct formatting on the paragraph (from ParagraphFormat) is exported as inline CSS (using the style attribute).

Style properties (the style applied in ParagraphFormat.Style) are exported as class styles when the appropriate save option is set and referenced using an embedded or external style sheet (using the class attribute). If inline styles only are exported then all formatting appears on the style attribute.
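The difference between the two export modes described above can be illustrated with a small stand-alone sketch. The function names, the `p.<name>` selector form and the dictionary inputs are assumptions made for this example; the exact markup produced by Aspose.Words may differ.

```python
from html import escape


def css_declarations(props):
    """Join a dict of CSS properties into 'name: value; name: value'."""
    return "; ".join(f"{name}: {value}" for name, value in props.items())


def render_paragraph(text, style_name, style_props, direct_props, use_classes):
    """Return (css_rule_or_None, html) for one paragraph.

    style_props  -- formatting that comes from the paragraph style
    direct_props -- direct formatting applied to the paragraph itself
    """
    body = escape(text)
    if use_classes:
        # Style formatting goes to a style sheet rule, referenced via class.
        rule = f"p.{style_name} {{ {css_declarations(style_props)} }}"
        attrs = f' class="{style_name}"'
        if direct_props:
            # Direct formatting stays inline in both modes.
            attrs += f' style="{css_declarations(direct_props)}"'
        return rule, f"<p{attrs}>{body}</p>"
    # Inline-only mode: everything ends up on the style attribute.
    merged = {**style_props, **direct_props}
    return None, f'<p style="{css_declarations(merged)}">{body}</p>'
```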

Feature

Supported

Comment

See Also

Paragraph Style

Yes

Note that to properly round-trip styles back to a Word document format, an embedded or external style sheet must be used. On HTML import, classes defined in the style sheets are used to create styles. If there is no embedded or linked style sheet, the document is imported with no styles (apart from the default Normal style).

There are plans to provide a save option to save a document to HTML as pure HTML without CSS styles.

·         HtmlSaveOptions.CssStyleSheetType

Alignment

Yes

Exported as "text-align" paragraph style attribute.

There are plans to introduce export of <center> tags as well, along with an option to define which type is used on export.

 

Right to Left Paragraph

Yes

Exported as dir="rtl" attribute on paragraph.

 

Bullets and Numbers

Yes

There is a save option to control how lists are exported to HTML.

 

Outline Level

Planned

 

 

Run Properties for the Paragraph Mark

Planned

Can be implemented with Microsoft Office specific techniques.

 

Suppress Line Numbers

Planned

 

 

Suppress Hyphenation

Planned

 

 

Indents

Feature

Supported

Comment

See Also

Left Indent

Yes

Exported as "margin-left" on style attribute.

·         HtmlSaveOptions.AllowNegativeLeftIndent

Right Indent

Yes

Exported as "margin-right" on style attribute.

 

First Line Indent

Yes

Exported as "text-indent" on style attribute.

 

Hanging Indent

Yes

Exported as a combination of "margin-left" and "text-indent" on style.

 

Mirror Indents

Yes

Exported as a combination of "margin-left" and "text-indent" on style.

 

Automatically Adjust Right Indent

N/A

 

 

Spacing

Feature

Supported

Comment

See Also

Space Before

Yes

Exported as "margin-top" of style attribute.

 

Space After

Yes

Exported as "margin-bottom" of style attribute.

 

Space Auto

Yes

A paragraph with auto spacing is exported as margin-top and margin-bottom with explicit spacing based on document defaults.

 

Line Spacing

Yes

Exported as "line-height" with percent.

 

No Space between Conforming Paragraphs

Planned

 

 

Keeps and Breaks

Feature

Supported

Comment

See Also

Widow/Orphan Control

Yes

This setting is exported as "widows" and "orphans" CSS attributes.

If this setting is enabled, the paragraph is exported without these attributes. Their default value is 2, so widow/orphan control is effectively enabled.

If this setting is disabled then the paragraph is exported with the value "0" for both of these attributes.

 

Keep With Next

Yes

Exported as style attribute with "page-break-after:avoid".

 

Keep Lines Together

Yes

Exported as style attribute with "page-break-inside:avoid".

 

Page Break Before

Yes

Exported as "page-break-before" on style attribute.

 

Text Frames

Text frames are exported as paragraphs surrounded by a border.

Feature

Supported

Comment

See Also

Text Frames

Yes

 

 

Tab Stops

Tab stops are not natively available in HTML. Aspose.Words converts tab stops into a fixed set of non-breaking spaces.

This will be improved later by simulating the correct width. Consider using a borderless table to lay out information instead of tab stops when export to HTML is required.
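Both behaviours mentioned in this section can be sketched in plain Python. This is an illustration only: the number of non-breaking spaces per tab (`spaces_per_tab=4`) is an assumed value, not the one Aspose.Words uses, and `tabbed_lines_to_table` is a hypothetical helper showing the borderless-table alternative.

```python
from html import escape


def tabs_to_nbsp(text, spaces_per_tab=4):
    """Replace every tab with a fixed run of non-breaking space entities."""
    return text.replace("\t", "&nbsp;" * spaces_per_tab)


def tabbed_lines_to_table(lines):
    """Lay out tab-separated lines as a borderless HTML table instead."""
    rows = []
    for line in lines:
        cells = "".join(f"<td>{escape(cell)}</td>" for cell in line.split("\t"))
        rows.append(f"<tr>{cells}</tr>")
    return '<table style="border: none">' + "".join(rows) + "</table>"
```

The table keeps columns aligned regardless of font or viewer, which a fixed number of spaces cannot guarantee.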

Feature

Supported

Comment

See Also

Absolute Position

Planned

 

 

Relative Position

Planned

Can be simulated by calculating the actual position of the tab stop.

 

Alignment: Left, Center, Right, Decimal, Bar

Planned

 

 

Leader

Planned

Leader characters are currently not exported to HTML.

 

Drop Caps

A drop cap is a frame that is exported to HTML as a paragraph with borders.

Visually the drop cap looks correct but the main text is moved to the next paragraph. This will be improved when support of text frames is improved.

Feature

Supported

Comment

See Also

Drop Caps

Yes

 

 

Borders

Borders are exported on the style attribute as border-xxx-style, border-xxx-width etc.

Normally each side is exported as separate attributes even if all borders of the paragraph have the same formatting.

Feature

Supported

Comment

See Also

Border Sides

Yes

 

 

Shadow

Planned

 

 

3D Frame

Planned

 

 

Style

Yes

Only line types that HTML borders support natively are exported as is. Other line types are converted to the closest line type supported by HTML.

 

Color

Yes

 

 

Width

Yes

 

 

Distance from Text

Yes

Exported as "padding-xxx".

 

Shading

Feature

Supported

Comment

See Also

Shading

Yes

 

 

Asian Typography

Feature

Supported

Comment

See Also

Use Asian Rules for Controlling First and Last Characters

Planned

 

 

Allow Latin Text to Wrap in the Middle of a Word

Planned

 

 

Allow Hanging Punctuation

Planned

 

 

Allow Punctuation at Start of a Line to Compress

Planned

 

 

Automatically Adjust Space between Asian and Latin Text

Planned

 

 

Automatically Adjust Space between Asian Text and Numbers

Planned

 

 

Text Vertical Alignment

Planned

 

 

Text

Text in different languages is fully supported and can be rendered to formats such as PDF and image with high fidelity.

Exported to HTML as <span> elements.

Each Run node in the model is exported as a separate span to retain formatting properly. Some documents contain many runs that are unnecessary and could be joined. In the resulting HTML document this can produce many extra span elements.

The Document.JoinRunsWithSameFormatting method avoids this situation. It can be called before exporting to HTML.

See the following links in the documentation for further information:

·         HtmlSaveOptions.Encoding

·         Document.JoinRunsWithSameFormatting
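The idea behind joining runs can be shown with a small stand-alone sketch. It is not the implementation of Document.JoinRunsWithSameFormatting; the `(text, formatting)` tuple representation and the `join_runs` name are assumptions made for this example.

```python
def join_runs(runs):
    """Merge adjacent runs whose formatting is identical.

    runs -- list of (text, formatting) tuples, where formatting is a dict.
    """
    joined = []
    for text, formatting in runs:
        if joined and joined[-1][1] == formatting:
            # Same formatting as the previous run: extend it instead of adding a new one.
            previous_text, _ = joined[-1]
            joined[-1] = (previous_text + text, formatting)
        else:
            joined.append((text, formatting))
    return joined
```

Fewer runs mean fewer span elements in the exported HTML, while the visible formatting stays the same.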

Characters

Feature

Supported

Comment

See Also

Western Languages

Yes

 

 

East European Languages

Yes

 

 

East Asian Languages

Yes

 

 

Right to Left Languages

Yes

Exported as dir="rtl" attribute on span.

 

Carriage Return (not a Paragraph Break)

Yes

Exported as <br> element.

 

Non Breaking Space

Yes

Exported as "&nbsp;" entity code.

 

Non Breaking Hyphen

Yes

Exported to HTML as entity code "&#x2011;".

 

Soft Hyphen

Yes

This type of hyphen is referred to as an "Optional Hyphen" in Microsoft Word documents.

Exported to HTML as the entity code "&#xad;".

 

Symbol

Yes

Symbols are exported as encoded characters.

Depending on the encoding used when saving the document, such symbols may not appear correctly in the output HTML.

 

Tab

Yes

There is no equivalent of a tab in HTML documents. 

During conversion to HTML a tab is exported as a series of non-breaking spaces of constant length. Improvement in length calculation is planned.

 

Breaks

Feature

Supported

Comment

See Also

Line Break

Yes

Exported as <br>

 

Line Break Clear Type

Yes

Clear type "both" is output with this type of break.

 

Page Break

Yes

Exported as a <br style="page-break-before:always; clear:both">

 

Column Break

Yes

Exported as <br style="mso-column-break-before:always; clear:both" >

There are plans to make it optional since it uses a Microsoft Office specific attribute.

 

General Formatting

Feature

Supported

Comment

See Also

Character Style

Yes

There is an option to control how character style is exported as inline CSS (style) only, or a mix of inline and embedded or linked CSS style sheet (class).

Direct formatting on the run (from Font) is exported as inline CSS (using style attribute). Style properties (the style applied in Font.Style) are exported as CSS class styles when the appropriate save option is set and referenced using an embedded or external style sheet (using the class attribute). If inline styles only are exported then all formatting appears on the style attribute.

Note that to properly round-trip styles back to a Word document format, an embedded or external style sheet must be used. On HTML import, classes defined in the style sheets are used to create styles.

If there is no embedded or linked style sheet, the document is imported with no styles (apart from the default Normal style).

There are plans to provide a save option to save a document to HTML as pure HTML without CSS styles.

·         HtmlSaveOptions.CssStyleSheetType

Color

Yes

Exported as color on style attribute.

 

East Asian Typography

Planned

Some research is needed.

 

Highlight Color

Yes

Exported as background-color on span.

 

Language

Yes

Exported as lang attribute on <span>.

·         HtmlSaveOptions.ExportLanguageInformation

Do not Check Spelling or Grammar

Planned

 

 

Border

Yes

Exported as border-style, border-width, border-color on <span>.

Normally each side is exported as separate attributes even though all sides of the border of a run must be the same.

 

Shading

Yes

Only solid fill is supported, both background and foreground. Others are converted to the nearest color.

Exported as background-color on <span>.

 

Font

Bold and italic are exported as font-weight:bold and font-style:italic on the style attribute.

There are plans to add an option to export these as simple <b> and <i> tags.

There is an option to control how font size is exported: in points or in em units. Relative (em) sizes allow browsers and eReaders to resize text when the user increases or decreases the font size.

Currently images are not exported in the same relative way. This means that images will not resize when the "Increase Text/Decrease Text" buttons are pressed. These buttons are common in browsers and eReaders.

This feature will be supported in the future.

See the following link in the documentation for further information:

·         HtmlSaveOptions.ExportRelativeFontSize
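The relationship between point sizes and em units can be sketched as follows. The base size of 12 pt is an assumption chosen for this example; the reference size Aspose.Words uses when exporting relative font sizes is not stated here.

```python
def pt_to_em(size_pt, base_pt=12.0):
    """Express a point size relative to an assumed base size, as a CSS em value."""
    return f"{size_pt / base_pt:g}em"
```

With the assumed base, an 18 pt run becomes `font-size:1.5em`, so it scales together with the reader's chosen text size.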

Feature

Supported

Comment

See Also

Font

Yes

 

 

Underline

Only single line underline type is supported in native HTML. Exported as "text-decoration:underline". Underline color is not exported.

In CSS 3, different underline types are proposed and may be implemented in the future.

There are plans to add an option to export underline as a simple <u> tag.

"Words only" underline type can be simulated by splitting runs and only underlining non-space text.

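The "Words only" simulation mentioned in this section can be sketched by splitting text on whitespace and underlining only the non-space pieces. This is an illustration of the idea, not current Aspose.Words behaviour; the function name is hypothetical.

```python
import re
from html import escape


def underline_words_only(text):
    """Wrap each word in an underlined span and leave whitespace untouched."""
    pieces = re.split(r"(\s+)", text)  # the capture group keeps the whitespace pieces
    out = []
    for piece in pieces:
        if not piece:
            continue
        if piece.isspace():
            out.append(piece)
        else:
            out.append(f'<span style="text-decoration:underline">{escape(piece)}</span>')
    return "".join(out)
```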
Feature

Supported

Comment

See Also

Underline Type

N/A

 

 

Underline Color

Yes

 

 

Text Effects

Feature

Supported

Comment

See Also

Animated Effect

Planned

 

 

Double Strikethrough

Yes

Output as single strikethrough as HTML does not have any analog for double strikethrough.

 

Strikethrough

Yes

Exported as text-decoration:line-through.

 

Subscript/Superscript

Yes

Exported as vertical-align:sub and vertical-align:super.

There are plans to add an option to export these as <sup> and <sub> elements.

 

Shadow

Planned

 

 

Outline

Yes

Output as bold.

 

Emboss

Yes

Output as bold with color. Can be improved since in some cases we get white on white.

 

Imprint (Engrave)

Yes

Output as bold with color. Can be improved since in some cases we get white on white.

 

Small Caps

Yes

Exported as style="font-variant:small-caps".

 

All Caps

Yes

Exported as style="text-transform:uppercase".

 

Hidden Text

Yes

Exported as style="display:none".

 

Special Hidden

Planned

Special hidden and Web hidden can be made aliases of ordinary Hidden.

 

Web Hidden

Planned

 

 

Character Spacing

Feature

Supported

Comment

See Also

Scale

Planned

 

 

Expanded/Compressed

Yes

Output as absolute values in points:

style="letter-spacing:XXXpt"

 

Vertical Position

Yes

Exported as vertical-align:XXXpt.

 

Tables

Table is exported to HTML as <table> and other applicable tags.

There is a save option to control how table and cell widths are exported. You can choose to export absolute and relative values, relative values only or no width. When no width is exported then the viewer must calculate the appropriate widths for the table elements.

See the following link in the documentation for further information:

·         HtmlSaveOptions.TableWidthOutputMode

Table

Feature

Supported

Comment

See Also

Nested Tables

Yes

 

 

Right To Left Tables

Yes

 

 

Table Style

Yes

When a table style is present, it is converted to direct formatting on save so all formatting is still preserved.

There are plans to export table styles as CSS so they can be properly round-tripped.

 

Conditional Formatting Style

Yes

 

 

Table Alignment

Yes

If the table is inline then it's exported as a <table> wrapped inside a <div> formatted with text-align style.

This is done so that the text does not wrap around the table.

 

Table Indent

Yes

Exported as margin-left:XXX on table.

 

Allow AutoFit

Planned

Consider export of  such tables with "table-layout:fixed".

 

Default Cell Margins

Yes

Margins are output on every cell.

 

Default Cell Spacing

Planned

 

 

Preferred Table Width

Yes

Fixed width tables are exported as width=XXXpt on <table>.

A table with relative size (percent) is exported as a percent width, e.g. width=100%.

 

Table Shading

Yes

Only solid fill color is supported.

Exported as background-color style attribute on all cells in the table.

 

Hidden

Planned

A hidden table or row is currently exported as collapsed with no content. This can produce the correct output, except that a row border may still be present.

 

Floating Tables

Floating tables are saved as normal tables.

Left, right and center alignment is supported.

Feature

Supported

Comment

See Also

Horizontal Position

Planned

 

 

Horizontal Position Relative To

Planned

 

 

Vertical Position

Planned

 

 

Vertical Position Relative To

Planned

 

 

Distance from Text

Planned

 

 

Move with Text

Planned

 

 

Allow Overlap Floating Tables

Planned

 

 

Table Borders

Currently borders are output on each cell as the style attributes border-XXX-style, border-XXX-color etc.

Feature

Supported

Comment

See Also

Table Borders

Yes

 

 

Rows

Feature

Supported

Comment

See Also

Allow Break Across Pages

Planned

 

 

Repeat as Header Row

Yes

Tables that have header rows are exported with <thead> and <th> elements. Normal rows are exported with <tbody> and <tr> elements.

Tables without header rows are exported as <tr> elements without <tbody>.

 

Height

Yes

Exported as style attribute height on <tr>

 

Height Rule

Yes

Auto is exported with no height attribute allowing auto-resize.

At least and exact are both exported as height in points.

 

Cells

Feature

Supported

Comment

See Also

Cell Margins

Yes

Exported as padding-XXX on each cell.

 

Borders

Yes

Supported except for diagonal borders. Not all line types are exported.

Exported as <td> style attribute border-XXX-style, border-XXX-width etc.

 

Shading

Yes

Only solid fill is supported.

Exported as "background-color" style attribute on <td>.

 

Wrap Text

Planned

 

 

Fit Text

Planned

 

 

Preferred Width

Yes

Exported as style attribute width on cells as either relative (percent) or fixed (points).

 

Merged Horizontally

Yes

Exported as "colspan" attribute on <td>.

 

Merged Vertically

Yes

Exported as "rowspan" attribute on <td>.

 

Vertical Alignment

Yes

Exported as "vertical-align"  attribute on cell.

 

Text Direction

Yes

Exported as "writing-mode" on style attribute.

 

Custom Markup

When converting to a document format that doesn't support custom markup features, the markup is stripped but content is preserved. The non-Microsoft Word document formats do not support custom markup and only text is exported.

There are plans to export the Custom XML found within the document structure to custom tags surrounding elements in HTML output.

CustomXML

Feature

Supported

Comment

See Also

CustomXML

Planned

 

 

Content Controls (Structured Document Tags)

Feature

Supported

Comment

See Also

Content Controls (Structured Document Tags)

N/A

 

 

Smart Tags

Feature

Supported

Comment

See Also

Smart Tag Properties

N/A

 

 

Sections

When there is more than one section in the document, each section is exported as a separate <div> element that is a child of <body>.

If there is only one section, the document content is exported directly to <body>.

Section-wide formatting is exported as CSS styles on <div>.
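The section rule above can be expressed as a short sketch. The function name and the use of pre-rendered HTML strings as input are assumptions made for this illustration.

```python
def render_body(sections):
    """Wrap rendered section HTML according to the rule described above.

    sections -- list of HTML strings, one per document section.
    """
    if len(sections) == 1:
        # A single section is written directly into <body>.
        return f"<body>{sections[0]}</body>"
    # Several sections: each one gets its own <div> child of <body>.
    return "<body>" + "".join(f"<div>{html}</div>" for html in sections) + "</body>"
```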

Headers and Footers

There is a save option that controls how headers and footers are output. This controls how the primary header is exported in different places in the output document.

By default the header of the first section is exported at the top of the HTML output and the last footer of the last section is output at the end of the HTML output. Any linked headers and footers are taken into account.

When an embedded or external style sheet is used, regular paragraphs in the header or footer are exported with the "Header" or "Footer" style.

See the following link in the documentation for further information:

·         HtmlSaveOptions.ExportHeadersFootersMode

Feature

Supported

Comment

See Also

Different First Page

N/A

 

 

Different Even and Odd Pages

N/A

 

 

Continue from Previous Section

Yes

 

 

Section Break Type

Section breaks are exported as a <br> tag which contains the special Microsoft Office attribute mso-break-type:section-break.

There is an option to skip exporting breaks to HTML at all.

Note that when exporting to EPUB, such break elements are ignored by most EPUB readers. Instead, use the DocumentSplitCriteria option to split the internal HTML of the EPUB at each page break. This will appear correctly in almost all readers.

See the following links in the documentation for further information:

·         Section.SectionStart

·         DocumentBuilder.InsertBreak

·         HtmlSaveOptions.DocumentSplitCriteria
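The break types listed in the table for this section can be written as a simple lookup. The CSS values are taken from that table and from the description above; the dictionary keys, the function name, the order of the two declarations and the self-closing tag form are assumptions made for this sketch.

```python
# CSS declaration per section break type, as listed in this section's table.
SECTION_BREAK_CSS = {
    "continuous": "page-break-before:auto",
    "even_page": "page-break-before:left",
    "odd_page": "page-break-before:right",
    "new_column": "mso-column-break-before:always",
    "new_page": "page-break-before:always",
}


def section_break_tag(kind, export_breaks=True):
    """Return the <br> tag for a section break, or '' when breaks are skipped."""
    if not export_breaks:
        return ""
    css = SECTION_BREAK_CSS[kind]
    return f'<br style="{css}; mso-break-type:section-break" />'
```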

Feature

Supported

Comment

See Also

Continuous

Yes

Exported as <br> with page-break-before:auto.

 

Even Page

Yes

Exported as <br> with page-break-before:left.

 

Odd Page

Yes

Exported as <br> with page-break-before:right.

 

Next Column

Yes

Exported as <br> with mso-column-break-before:always

 

Next Page

Yes

Exported as <br> with page-break-before:always.

 

Text Columns

HTML and EPUB have no native support for text columns.

Support for this feature may be possible in a future version using CSS3 or EPUB 3.0 features.

Feature

Supported

Comment

See Also

Text Columns

N/A

 

 

Page Margins

Page settings are optionally output to HTML through the use of a save option. They are exported either as an embedded or an external style sheet, depending on the save option used.

Section formatting is exported using the "@page" identifier along with margin and size attributes that define the appearance of the section as seen in the source document.

Some features need Microsoft Office specific attributes and are not currently supported.
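A rule of the kind described above can be generated as follows. This is a sketch under stated assumptions: the function name, the use of points as the unit and the single-line layout are illustrative, and the exact rule text written by Aspose.Words may differ.

```python
def page_rule(width_pt, height_pt, margins_pt):
    """Build an @page rule from a page size and (top, right, bottom, left) margins in points."""
    top, right, bottom, left = margins_pt
    return (
        f"@page {{ size: {width_pt:g}pt {height_pt:g}pt; "
        f"margin: {top:g}pt {right:g}pt {bottom:g}pt {left:g}pt }}"
    )
```

For a US Letter page with one-inch margins, `page_rule(612, 792, (72, 72, 72, 72))` describes a 612pt by 792pt page with 72pt margins on every side.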

Feature

Supported

Comment

See Also

Page Margins

Yes

 

 

Page Numbering

Feature

Supported

Comment

See Also

Number Format

N/A

 

 

Starting Number

N/A

 

 

General Formatting

Feature

Supported

Comment

See Also

Right to Left Section

Yes

Supported on HTML export only.

 

Line Numbering

Planned

 

 

Paper Source

Planned

 

·         PageSetup.FirstPageTray

·         PageSetup.OtherPageTray

Paper Size

Yes

 

 

Orientation

Yes

Currently paper size depends on orientation as width and height are switched.

In the future we can also output native CSS 3 attributes.

 

Protection

N/A

 

 

Text Direction

Planned

 

 

Vertical Alignment

N/A

 

 

Asian Document Grid

N/A

 

 

Chapter Numbering

Output as list item with ordinary list numbering.

Feature

Supported

Comment

See Also

Chapter Numbering

Yes

 

 

Page Border

HTML does not have any "page" concept, so no page border is exported.

Feature

Supported

Comment

See Also

Page Border

N/A

 

 

Styles

Style type itself is not exported to CSS but it is implied by the specific attributes exported to that style.

There is a setting to export paragraph and character styles as inline CSS (style), embedded or linked CSS style sheet (class).

Character styles often are saved with .span prefix.

Style Type

Feature

Supported

Comment

See Also

Paragraph Style

Yes

 

 

Character Style

Yes

 

 

List Style

Planned

 

 

Table Style

Planned

 

 

General

Feature

Supported

Comment

See Also

Aliases

Yes

Aliases are exported as ordinary CSS classes.

 

Based On

Planned

 

 

Built-in Styles

Yes

Built-in styles are exported specifically. For instance Normal redirects to general <p> element, Heading 1 to <h1> etc.

 

Custom Styles

Yes

A new style is created with each custom style in the document.

 

Style Name

Yes

 

 

Next Style

N/A

 

 

Paragraph Properties

Yes

 

 

Run Properties

Yes

 

 

Bullets and Numbering

Yes

Output as inline styles.

 

Document Defaults

Feature

Supported

Comment

See Also

Paragraph Properties

N/A

 

 

Run Properties

N/A

 

 

Table Style

Feature

Supported

Comment

See Also

Apply Formatting to

Planned

 

 

Table Properties

Planned

 

 

Banding

Planned

 

 

Paragraph Properties

Planned

 

 

Run Properties

Planned

 

 

Numbering

Single level lists can be output either as native HTML lists or as ordinary paragraphs. This depends on the properties of the list.

There are plans to introduce an option to control if lists are exported as native or paragraph lists.

Numbering Definition

Feature

Supported

Comment

See Also

Single Level

Yes

Single level lists can be output either as native HTML lists or as ordinary paragraphs. This depends on the properties of the list.

There are plans to introduce an option to control if lists are exported as native or paragraph lists.

 

Multi Level

Yes

Multiple level lists are always output as ordinary paragraphs.

 

Name

Planned

Can be exported as List style name.

 

Numbering Level

Feature

Supported

Comment

See Also

Label Alignment

Yes

Ordinary paragraphs are used for alignment control in output HTML.

 

Picture Bullet

Yes

Lists with picture bullets are always output as native lists. Other possibilities will be considered to keep formatting more precisely.

 

Restart Level

Yes

Non-native list items can have a custom label.

 

Bullet Character

Yes

Only some bullets are supported natively. For others, the list item is exported as spans with style formatting.

 

Label/Format String

Yes

Only the label itself is output. If a label cannot be represented by a native HTML list, the whole list is exported as ordinary paragraphs. The label formatting string is not output.

 

Number Format

Yes

Only some number formats are supported natively. For others, the list item is exported as normal spans with direct formatting.

 

Paragraph Properties

Yes

Output as embedded or as inline style attributes.

 

Font Properties

Yes

Output as inline styles.

 

Linked Paragraph Style

Planned

 

 

Starting Value

Yes

Exported as "start" attribute on list item nodes for any lists that do not start at 1.

 

Text After

Planned

 

 

Footnotes/Endnotes

Exported as a hyperlink with the footnote number inline with the text.

Footnote text is exported at the bottom of the document, separated by a horizontal rule. The hyperlink points to this text.
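The layout described above (inline numbered links, then a horizontal rule, then the note text) can be sketched as follows. The id scheme (`fn1`, `fnref1`), the bracketed number format and the function name are assumptions made for this example and are not taken from Aspose.Words output.

```python
from html import escape


def render_with_footnotes(paragraphs):
    """Render paragraphs and collect their footnotes at the end of the output.

    paragraphs -- list of (text, footnote_text_or_None) tuples.
    """
    body, notes = [], []
    for text, note in paragraphs:
        html_text = escape(text)
        if note is not None:
            number = len(notes) + 1
            # Inline hyperlink that points at the note text further down.
            html_text += f'<a href="#fn{number}" id="fnref{number}">[{number}]</a>'
            notes.append(f'<p id="fn{number}">[{number}] {escape(note)}</p>')
        body.append(f"<p>{html_text}</p>")
    if notes:
        body.append("<hr />")  # horizontal rule separating the notes
        body.extend(notes)
    return "".join(body)
```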

Footnotes

Feature

Supported

Comment

See Also

Reference Mark

Yes

 

 

Custom Reference Mark

Planned

 

 

Custom Separator

Planned

 

 

Continuation Separator Mark

N/A

 

 

Document Wide Properties

Planned

 

 

Section Wide Properties

N/A

 

 

Number Format

Yes

Only document wide format is supported.

 

Restart Location

Planned

 

 

Starting Value

Planned

 

 

Placement

Planned

 

 

Endnotes

Feature

Supported

Comment

See Also

Reference Mark

Yes

 

 

Custom Reference Mark

Planned

 

 

Custom Separator

Planned

 

 

Continuation Separator Mark

N/A

 

 

Document Wide Properties

Planned

 

 

Section Wide Properties

N/A

 

 

Number Format

Yes

Only document wide format is supported.

 

Restart Location

Planned

 

 

Starting Value

Planned

 

 

Placement

Planned

 

 

Annotations

Bookmarks

All Word documents and most other formats that Aspose.Words exports to only allow bookmarks with unique names, that is, no two bookmarks may have exactly the same name.

If two bookmarks are given the same name in the model, no error occurs. During export to any format the duplicate bookmarks are removed silently. The first bookmark visited in the model is the one that is retained; any other bookmarks with that name are removed.

A bookmark is represented by an <a> element. Only the bookmark start is output. Nesting and overlapping are not allowed.
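
The first-wins rule described above can be pictured with a short sketch. This is not Aspose.Words code: the function name dedupe_bookmarks and the (name, position) tuple layout are hypothetical, and exact string comparison of names is an assumption taken from the wording "exactly the same name".

```python
from typing import Iterable, List, Tuple

# Hypothetical model: each bookmark is (name, position), listed in visiting order.
Bookmark = Tuple[str, int]


def dedupe_bookmarks(bookmarks: Iterable[Bookmark]) -> List[Bookmark]:
    """Keep the first bookmark seen for each name and drop later duplicates."""
    seen = set()
    kept = []
    for name, position in bookmarks:
        if name in seen:
            continue  # a later bookmark with the same name is dropped silently
        seen.add(name)
        kept.append((name, position))
    return kept


visited = [("intro", 0), ("summary", 40), ("intro", 95)]
unique = dedupe_bookmarks(visited)
```

Here the second "intro" bookmark is discarded, so any link that targets "intro" resolves to the first occurrence.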

Feature

Supported

Comment

See Also

Bookmark Start

Yes

 

 

Bookmark End

Planned

There are plans to support bookmark end. This is good to have for roundtrip.

Currently if bookmark end is required then it is suggested to export two bookmarks instead of one to achieve this.

 

Bookmark Name

Yes

 

 

Bookmark Table Columns

N/A

 

 

Comments

There are plans to export comments to HTML as footnotes.

Feature

Supported

Comment

See Also

Comment

Yes

Currently a comment is exported as a "title" attribute on <span>.

 

Comment Range

Yes

The same comment is exported as title attribute for all spans inside a comment range.

This will be improved in the future.

 

Author

Planned

 

 

Date

Planned

 

 

Initial

Planned

 

 

Tracking Changes

You may need to accept tracked changes before saving to different formats or else the deleted revisions will still show up in the output document.

Exported as <ins> and <del> elements.

See the following link in the documentation for further information:

·         Document.AcceptAllRevisions
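
To check whether exported HTML still carries revision markup, the <ins> and <del> elements mentioned above can be counted. The sketch below uses only Python's standard html.parser module; it does not call Aspose.Words, and the sample markup and the names RevisionCounter and count_revisions are invented for illustration.

```python
from html.parser import HTMLParser


class RevisionCounter(HTMLParser):
    """Count <ins> and <del> start tags in an HTML string."""

    def __init__(self):
        super().__init__()
        self.counts = {"ins": 0, "del": 0}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in self.counts:
            self.counts[tag] += 1


def count_revisions(html: str) -> dict:
    """Return how many insertion and deletion elements the markup contains."""
    parser = RevisionCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Invented sample: one deleted run and one inserted run.
sample = "<p>Kept <del>old</del><ins>new</ins> text.</p>"
revision_counts = count_revisions(sample)
```

A non-zero count suggests that revisions were not accepted before saving.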

Feature

Supported

Comment

See Also

On/Off State

N/A

 

 

Table Cell Deletion

N/A

 

 

Table Cell Insertion

N/A

 

 

Cell Merge or Split

N/A

 

 

Run Deletion

Planned

 

 

Run Insertion

Planned

 

 

Paragraph Deletion

Planned

 

 

Paragraph Insertion

Planned

 

 

Table Row Deletion

N/A

 

 

Table Row Insertion

N/A

 

 

Numbering Insertion

N/A

 

 

Numbering Change

N/A

 

 

Moves

Planned

Will be represented just by pair of deletion and insertion.

 

Paragraph Properties Change

N/A

 

 

Run Properties Change

N/A

 

 

Section Properties Change

N/A

 

 

Table Properties Change

N/A

 

 

Cell Properties Change

N/A

 

 

Row Properties Change

N/A

 

 

RSIDs Session Identifiers

N/A

 

 

Fields

Fields are supported in the model. To check if these fields can be updated check the import section.

Regardless of whether a field can be updated, most fields are exported correctly if Microsoft Word has already brought them up to date, i.e. if you import a document and export it with a field left as it is, the field will appear properly in the output format.

Fields with custom field codes (a field code modified to represent something different than a normal field) are retained as they are when converting to other Word document formats. These fields are lost when exporting to ODT.

When saving to rendered formats such as PDF, XPS or image some fields may be automatically updated as a part of page layout.

Fields are output as plain text in HTML. Only field result is exported.

Field Codes

See the following link in the documentation for further information:

·         Document.UpdateFields

Feature

Supported

Comment

See Also

Field Codes

Yes

 

 

Date and Time

Feature

Supported

Comment

See Also

CreateDate

Yes

 

 

Date

Yes

 

 

EditTime

Yes

 

 

PrintDate

Yes

 

 

SaveDate

Yes

 

 

Time

Yes

 

 

Document Automation

Feature

Supported

Comment

See Also

Compare

Yes

The result of the calculation is exported as static text to HTML.

 

DocVariable

Yes

The variable is exported as static text to HTML.

 

GoToButton

Yes

Only the display name is exported; the field is plain text and won't jump to a part of the document when clicked.

Instead use a hyperlink to a bookmark to preserve such behavior in HTML.

 

If

Yes

Field result is output as plain text.

 

MacroButton

Yes

Field result is output as plain text.

 

Print

N/A

 

 

Document Information

Fields such as FileName or FileSize are not automatically updated on save. However, they can be updated explicitly by calling Document.UpdateFields.

Feature

Supported

Comment

See Also

Author

Yes

 

 

Comments

Yes

 

 

DocProperty

Yes

 

 

FileName

Yes

 

 

FileSize

Yes

 

 

Info

Yes

 

 

Keywords

Yes

 

 

LastSavedBy

Yes

 

 

NumChars

Yes

 

 

NumPages

Yes

 

 

NumWords

Yes

 

 

Subject

Yes

 

 

Template

Yes

 

 

Title

Yes

 

 

Equations and Formulas

Feature

Supported

Comment

See Also

Formula

Yes

 

 

Advance

Planned

Planned to be exported as plain text separator.

 

Eq

Yes

Exported as image.

 

Symbol

Yes

 

 

Form Fields

There is a save option to control whether form fields are exported as elements such as <input> or as plain text.

See the following link in the documentation for further information:

·         HtmlSaveOptions.ExportTextInputFormFieldsAsText
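
As a rough picture of the markup this section describes for form fields exported as elements, the sketch below builds the same element shapes with plain Python strings. It does not use Aspose.Words; the helper names text_input, check_box and drop_down are hypothetical, and the attribute order is an assumption.

```python
from html import escape


def text_input(name, value="", max_length=None):
    """Text form field: <input type="text" name="..." />."""
    attrs = f'type="text" name="{escape(name, quote=True)}"'
    if value:
        # Default value of a text form field.
        attrs += f' value="{escape(value, quote=True)}"'
    if max_length is not None:
        attrs += f' maxlength="{max_length}"'
    return f"<input {attrs} />"


def check_box(name, checked=False):
    """Check box form field, with checked="checked" when ticked."""
    attrs = f'type="checkbox" name="{escape(name, quote=True)}"'
    if checked:
        attrs += ' checked="checked"'
    return f"<input {attrs} />"


def drop_down(name, items, default=None):
    """Drop-down form field: one <option> per item, default marked as selected."""
    options = "".join(
        "<option{}>{}</option>".format(
            ' selected="selected"' if item == default else "", escape(item)
        )
        for item in items
    )
    return f'<select name="{escape(name, quote=True)}">{options}</select>'


city_field = text_input("city", "Oslo", 20)
agree_field = check_box("agree", checked=True)
size_field = drop_down("size", ["S", "M"], default="M")
```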

Feature

Supported

Comment

See Also

TextInput

Yes

Exported as <input type="text" name="XXX" />

 

CheckBox

Yes

Exported as <input type="checkbox" name="XXX" />

 

DropDown

Yes

Exported as <select name="XXX" />. Each item in the list is exported as an <option> element.

 

Calc On Exit

N/A

 

 

Checked

Yes

Exported as checked="checked" attribute on <input>.

 

Default Value

Yes

With text form fields this is exported as value="XXX" attribute on <input> element.

With a drop down list, this is exported with the selected="selected" attribute on the default <option> element.

 

Enabled

Planned

The "disabled" attribute can be used here.

 

Entry and Exit Macro

N/A

 

 

Name

Yes

Exported as "name" attribute on <input> or <select> element.

Name is only exported when exporting as elements and not as plain text.

 

Help Text

Planned

The "alt" attribute can be used.

 

Status Text

Planned

 

 

Max Length

Yes

Exported as maxlength attribute.

 

Check Box Size

Planned

There are plans to use width and height CSS attributes to increase size of checkboxes exported to HTML.

 

Text Input Type

Planned

 

 

Index and Tables

Feature

Supported

Comment

See Also

Index

Yes

 

 

RD

Yes

 

 

TA

N/A

 

 

TC

N/A

 

 

TOA (Table of Authorities)

Yes

 

 

TOC (Table of Contents)

Yes

The \h switch on TOC can be used to export the TOC as hyperlinked to HTML.

There is a save option to disable the export of page numbers on TOC as page numbers are not required in this format.

 

XE

N/A

 

 

Links and References

Feature

Supported

Comment

See Also

AutoText

Yes

 

 

AutoTextList

N/A

 

 

Bibliography

Yes

 

 

Citation

N/A

 

 

Hyperlink

Yes

Output to HTML as a clickable hyperlink.

 

IncludePicture

Yes

If the field is set to embed the image in the document, the image is saved along with the HTML and referenced locally.

If the field is link only then the full address to the source image is written to the image tag as <img src="xxx">.

 

IncludeText

Yes

 

 

Link

Yes

Exported as an image.

 

NoteRef

Yes

Exported as plain text.

 

PageRef

Yes

A hyperlinked PageRef field is exported as plain text and not yet as a clickable hyperlink in the output HTML.

In the meantime there is a simple workaround to export this field as a working hyperlink.

 

Quote

Yes

 

 

Ref

Yes

Exported as plain text and not yet as a clickable hyperlink.

In the meantime there is a simple workaround to export this field as a working hyperlink.

 

StyleRef

Yes

 

 

Mail Merge

Any fields that have a field result are exported to HTML as text.

Feature

Supported

Comment

See Also

AddressBlock

N/A

 

 

Ask

N/A

 

 

Compare

Yes

 

 

Database

N/A

 

 

Fill-in

N/A

 

 

GreetingLine

N/A

 

 

If

N/A

 

 

MergeField

Yes

 

 

MergeRec

Yes

 

 

MergeSeq

N/A

 

 

Next

Yes

 

 

NextIf

N/A

 

 

Set

N/A

 

 

SkipIf

N/A

 

 

Numbering

Feature

Supported

Comment

See Also

AutoNum

Planned

 

 

AutoNumLgl

Planned

 

 

AutoNumOut

Planned

 

 

BarCode

Planned

Note that this only refers to the BarCode field structure. Barcodes in Microsoft Word documents are commonly represented as text in a special barcode font or as an image. These are fully supported during export to all formats.

 

ListNum

Planned

 

 

Page

Planned

 

 

RevNum

Yes

Exported as plain text.

 

Section

Yes

Exported as plain text.

 

SectionPages

Yes

Exported as plain text.

 

Seq

Yes

Exported as plain text.

 

User Information

Feature

Supported

Comment

See Also

UserAddress

Yes

 

 

UserInitials

Yes

 

 

UserName

Yes

 

 

Hyperlinks

Exported as <a> element with text or image being linked as children.

Feature

Supported

Comment

See Also

Text

Yes

 

 

Hyperlinked Shape or Image

Yes

Image is wrapped inside <a> element.

 

Hyperlink across Multiple Paragraphs

Yes

Is exported as one hyperlink for each paragraph.

This is the same as how Microsoft Word exports such hyperlinks.

 

Hyperlink to a Local Bookmark

Yes

Linked to bookmarks, which are exported as <a> elements.

 

Hyperlink to an External Resource

Yes

 

 

Screen Tip

Planned

 

 

Target Frame

Yes

Exported as target="_XXX" attribute on <a>.

 

Formatting Switches

Feature

Supported

Comment

See Also

Date and Time Formatting

Yes

 

 

Numbering Formatting

Yes

 

 

General Formatting

Yes

 

 

Drawing Objects

Most drawing objects are exported using the img element unless HTML supports the feature natively. HTML lacks support for many Word graphics options, so these features are rendered to an image before export. There are options to choose a folder or streams to save images to during export to HTML. There is also an option to embed image data in the HTML as Base64. You can export the document as MHTML to automatically embed all image data.

You can control the name of the generated image file by using the ImageSavingCallback in the HtmlSaveOptions class.

Currently images are only exported as absolute size (points) and not in percent. This means that images will not resize when the "Increase Text/Decrease Text" buttons are pressed. These buttons are common in browsers and eReaders.

This feature will be supported in the future.

Images

All images are exported as the same format that they were originally loaded as.

See the following links in the documentation for further information:

·         HtmlSaveOptions.ImageFolder

·         HtmlSaveOptions.ImageSavingCallback

·         HtmlSaveOptions.ExportImagesAsBase64

·         HtmlSaveOptions.ImageResolution
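
Embedding image data as Base64 generally means writing the bytes into the src attribute as a data URI. The sketch below shows that general technique with Python's standard base64 module; it is not Aspose.Words code, the function name to_data_uri is hypothetical, and the exact markup that Aspose.Words emits is not confirmed here.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a data URI usable in an <img src="..."> attribute."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The eight-byte PNG signature stands in for real image data.
png_signature = b"\x89PNG\r\n\x1a\n"
img_element = '<img src="{}" alt="sample" />'.format(to_data_uri(png_signature))
```

Embedding keeps the HTML self-contained at the cost of a larger file, since Base64 expands the data by roughly one third.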

Feature

Supported

Comment

See Also

PNG

Yes

 

 

JPG

Yes

 

 

WMF

Yes

Metafiles such as WMF, EMF and EMF+ are normally converted to PNG when exporting to HTML. There is an option to export metafiles as vector images.

Note that not all browsers can display metafiles properly.

·         HtmlSaveOptions.ExportMetafileAsRaster

EMF

Yes

 

 

EMF+

Yes

 

 

BMP

Yes

 

 

GIF

Yes

 

 

TIFF

Yes

 

 

Borders

Yes

Some borders can be represented natively by HTML elements and attributes. Others are output in raster form.

Native borders are exported as "border-XXX" on the style attribute, e.g. border-style, border-width.

 

Cropping

Yes

Only rectangular cropping is supported.

Images are cropped permanently and cropping cannot be removed after export.

 

Alternative text

Yes

Exported as "alt=xxxx" on img element.

 

Image Recoloring

Feature

Supported

Comment

See Also

Brightness

Yes

 

 

Contrast

Yes

 

 

Recolor

Planned

Image is currently exported without any recoloring applied.

 

Textboxes

Textboxes are converted to raster images on export.

This provides great fidelity; however, text is not selectable in the output document.

Feature

Supported

Comment

See Also

Text Direction

Yes

 

 

Linked Textboxes

Planned

 

 

Internal Margins

Yes

 

 

Vertical Alignment

Yes

 

 

Resize To Fit Text

Planned

Textbox is currently exported at the size set by the Width and Height properties of Shape.

 

Text in Other Shapes

Yes

 

 

OLE Objects

OLE objects represent embedded content in a Microsoft Word document, such as an embedded Excel or PowerPoint document. The OLE object is dynamic and can be edited or updated through Microsoft Word.

OLE objects are fully preserved when converting within different Word document formats.

OLE objects are saved as images.

There are plans to export using the <object> tag and the type of object which is embedded. This will allow embedded objects such as PDF to be exported as working objects.

Feature

Supported

Comment

See Also

Linked

N/A

 

 

Embedded

N/A

 

 

Draw Aspect

N/A

 

 

Auto Update

N/A

 

 

Lock

N/A

 

 

Ole Object Data

N/A

 

 

Ole Object Picture

Yes

 

 

Source Range

N/A

 

 

ActiveX Controls

Output as an image.

Feature

Supported

Comment

See Also

Persistent Properties Storage

N/A

 

 

Shapes

Shapes are converted to raster images on export.

This provides great fidelity; however, text is not selectable in the output document.

See the following link in the documentation for further information:

·         HtmlSaveOptions.ScaleImageToShapeSize

Feature

Supported

Comment

See Also

Lines

Yes

Vector shapes (lines, arrows, callouts etc.) are output in rasterized form.

 

Basic Shapes

Yes

 

 

Block Arrows

Yes

 

 

Flowcharts

Yes

 

 

Callouts

Yes

 

 

Stars and Banners

Yes

 

 

Group Shape

Yes

All grouped shapes are exported as one image.

 

Drawing Canvas

Yes

 

 

Signature Line

N/A

HTML does not have any signature line feature. A signature line is exported as a regular image.

Alternative text on the image of the signature line is exported as "alt=xxx" attribute.

 

Ink Annotation

Planned

Ink Annotations are exported to all formats as regular images.

 

Clip Art

Yes

 

 

Diagrams (VML)

Yes

VML graphic format is normally used in pre-OOXML formats such as DOC or RTF.

It is planned to also export the VML code along with the image to allow proper round-trip.

 

SmartArt (VML)

Yes

 

 

Charts (VML)

Yes

 

 

Shape Customizations

Yes

Exported as image.

 

Hyperlink on Shape

Yes

Exported as <img> element wrapped in <a> element with appropriate attributes.

 

Watermark

N/A

Watermark shapes will be exported as a regular image to HTML at the same level as normal content. This is because there is no native support for watermark in HTML. To increase fidelity, set the watermark wrapping to "Behind Text" so the watermark will appear behind the main document content when exported to HTML.

Also note that the image will not have the same transparency level as what is applied in Microsoft Word.

Watermarks are only exported if the export of headers and footers to HTML is enabled. Since HTML has no "page" concept, the headers and footers only appear once. There is a save option to control how headers and footers are exported to HTML.

 

DrawingML

DrawingML is only preserved in round-trip back to DOCX format.

There are plans to export DrawingML as regular shape/image to other formats. At the moment DrawingML objects are lost when converting to other formats.

Support for DrawingML is improving with every new release.

Output as an image.

Feature

Supported

Comment

See Also

Images/Shapes

Yes

Simple DrawingML images are converted to regular images upon export to non-OOXML formats.

 

Diagrams

Planned

 

 

SmartArt

Planned

 

 

Charts

Planned

 

 

WordArt

WordArt is exported as an image. You can also set the AltText property so the plain text content of the WordArt can be found in the output HTML.

Feature

Supported

Comment

See Also

Styles

Yes

 

 

Outline

Yes

 

 

Fill

Yes

 

 

3D Properties

Planned

 

 

Text Spacing

Planned

 

 

Vertical Text

Planned

Exported as rotated text.

 

Even Height

Planned

 

 

Align and Justify Text

Planned

 

 

WordArt Shape

Planned

 

 

Horizontal Line Object

Exported as <hr> tag.
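
The style properties that this section lists for a horizontal line (width in percent, height in points, color together with border:none, and text-align) can be assembled as in the sketch below. The function name hr_tag and the property order are hypothetical; this only illustrates the shape of the output and is not Aspose.Words code.

```python
def hr_tag(width_percent, height_pt, color=None, align=None):
    """Build an <hr> element whose style attribute carries the listed properties."""
    styles = [f"width:{width_percent}%", f"height:{height_pt}pt"]
    if color:
        # Color is written together with border:none, as described in this section.
        styles += [f"color:{color}", "border:none"]
    if align:
        styles.append(f"text-align:{align}")
    return '<hr style="{}" />'.format(";".join(styles))


rule = hr_tag(50, 1.5, color="#808080", align="center")
```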

Feature

Supported

Comment

See Also

Width

Yes

Exported as width:XXX% on style attribute.

 

Height

Yes

Exported as height:XXpt on style attribute.

 

Color

Yes

Exported as color:XXX along with border:none on style attribute.

Note that some browsers cannot display this properly (Chrome seems to ignore this).

We will consider using a different attribute in a future version.

 

Alignment

Yes

Exported as "text-align:XXX" on style attribute.

 

Hyperlink

Planned

<hr> tag can be wrapped in an <a> element.

 

Image

Yes

Exported as a regular <img> element instead of <hr>.

There are plans to export a horizontal line with an image as <hr> element with style="background: url(xxx.png)".

 

Position

Feature

Supported

Comment

See Also

Inline

Yes

Exported as a child element of <p>.

 

Floating

Yes

Aspose.Words attempts to export floating content to HTML. Note that the output produced may differ greatly from the Word document source, as a Word document is a vastly different format. We are actively improving export of floating content in our HTML engine.

Floating content is made possible by exporting elements with margin-top and margin-left styles and the position:absolute style.

In the future we will provide tips on how to design Word documents with floating content that are exported to HTML based formats well.

 

Wrap Type

Yes

Text wrapping around images and shapes is supported in all formats.

Wrap type is emulated through the use of "float" on style attribute.

·          Square, Tight and Through are exported as "float:left".

·          Top and Bottom can be emulated by adding <br style="clear:both"> around the image but is currently exported as "float:left".

·          Behind Text can be emulated using z-index:-1, but such content is currently exported on top of text.

·          In Front of Text content is exported with a z-index greater than 0.

 

Wrap Sides

Yes

Wrap sides is exported through use of the float style attribute.

·          Both sides and largest only wrap have no suitable analog in HTML and are exported as float:left.

·          Right side wrap is exported as float:left.

·          Left side wrap is exported as float:right.

 

Distance from Text

Yes

Exported as margin style on exported object.

 

Z-Order

Yes

Exported as "z-index:X" on style attribute.

 

Polygon Wrap Points

N/A

 

 

Rotation

Yes

Image is rotated when rendering shape to raster image.

 

Flip

Yes

Image is flipped when rendering shape to raster image.

 

Horizontal Alignment

Planned

Currently exported as absolute position. This will be supported in a future version.

An alternative to achieve the same effect is to make the image inline in a paragraph and set the paragraph alignment to center.

 

Horizontal Position Relative To

Yes

Always exported as distance from the anchor (paragraph or character).

 

Vertical Alignment

Planned

Currently exported as absolute position. This will be supported in a future version.

 

Vertical Position Relative To

Yes

Always exported as distance from the anchor (paragraph or character).

 

Anchor Lock

N/A

 

 

Allow Overlap

Yes

Floating content can overlap when "Allow Overlap" is enabled.

When disabled the HTML engine will export these as separate content that does not overlap.

 

Layout in Table Cell

N/A

Floating content is always exported as distance from the anchor (paragraph or character) regardless of whether the paragraph is in a table or not.

 

Size

Feature

Supported

Comment

See Also

Width and Height

Yes

Exported as height and width attributes in pixels (no units specified).

 

Scale

Yes

A scaled object is normally exported with scaling applied to the Shape size and the Scale property reset to 100%.

Exported as explicit size using height and width attributes in pixels (no units specified).

 

Relative Size

Yes

Exported as explicit size using height and width attributes in pixels (no units specified).

 

Lock Aspect Ratio

N/A

 

 

Fill

Feature

Supported

Comment

See Also

No Fill

Yes

A shape with no fill is rendered to an image with transparency enabled to achieve this effect.

 

Solid Fill

Yes

All shapes are rasterized to an image in HTML output.

 

Gradient Fill

Yes

Gradient is rendered to raster image.

 

Pattern Fill

Yes

Rendered to raster image.

 

Picture or Texture Fill

Yes

Rendered to raster image.

 

Line Style

Line shapes are rendered to image and embedded in the output HTML.

Feature

Supported

Comment

See Also

Line Color

Yes

 

 

Line Fill

Yes

 

 

Line Width

Yes

 

 

Compound Type

Yes

 

 

Dash Type

Yes

 

 

Cap Type

Yes

 

 

Join Type

Yes

 

 

Arrow Settings

Yes

 

 

Shadow

Shadow is currently lost on shapes on export to HTML.

There are plans to rasterize the shadow properties on the image during export to HTML.

Feature

Supported

Comment

See Also

Shadow

Planned

 

 

3D Properties

3D effects on shape are lost upon conversion to HTML.

There are plans to rasterize the 3D properties on the image during export to HTML.

Feature

Supported

Comment

See Also

3D Properties

Planned