java.lang.Object
LoadOptions
com.aspose.words.HtmlLoadOptions
public class HtmlLoadOptions
Constructor Summary |
---|
HtmlLoadOptions()
Initializes a new instance of this class with default values. |
HtmlLoadOptions(java.lang.String password)
A shortcut to initialize a new instance of this class with the specified password to load an encrypted document. |
HtmlLoadOptions(int loadFormat, java.lang.String password, java.lang.String baseUri)
A shortcut to initialize a new instance of this class with properties set to the specified values. |
Property Getters/Setters Summary | ||
---|---|---|
java.lang.String | getBaseUri() | → inherited from LoadOptions |
void | setBaseUri(java.lang.String value) | |
Gets or sets the string that will be used to resolve relative URIs found in the document into absolute URIs when required. Can be null or empty string. Default is null. | ||
boolean | getConvertShapeToOfficeMath() | → inherited from LoadOptions |
void | setConvertShapeToOfficeMath(boolean value) | |
Gets or sets whether to convert shapes with EquationXML to Office Math objects. | ||
java.nio.charset.Charset | getEncoding() | → inherited from LoadOptions |
void | setEncoding(java.nio.charset.Charset value) | |
Gets or sets the encoding that will be used to load an HTML or TXT document if the encoding is not specified in HTML/TXT. Can be null. Default is null. | ||
FontSettings | getFontSettings() | → inherited from LoadOptions |
void | setFontSettings(FontSettings value) | |
Allows specifying document font settings. | ||
LanguagePreferences | getLanguagePreferences() | → inherited from LoadOptions |
Gets the language preferences that will be used when the document is loading. | ||
int | getLoadFormat() | → inherited from LoadOptions |
void | setLoadFormat(int value) | |
Specifies the format of the document to be loaded.
Default is |
||
int | getMswVersion() | → inherited from LoadOptions |
void | setMswVersion(int value) | |
Specifies that the document loading process should match a specific MS Word version.
Default value is |
||
java.lang.String | getPassword() | → inherited from LoadOptions |
void | setPassword(java.lang.String value) | |
Gets or sets the password for opening an encrypted document. Can be null or empty string. Default is null. | ||
int | getPreferredControlType() | |
void | setPreferredControlType(int value) | |
Gets or sets the preferred type of document nodes that will represent imported <input> and <select> elements.
Default value is |
||
boolean | getPreserveIncludePictureField() | → inherited from LoadOptions |
void | setPreserveIncludePictureField(boolean value) | |
Gets or sets whether to preserve the INCLUDEPICTURE field when reading Microsoft Word formats. The default value is false. | ||
IResourceLoadingCallback | getResourceLoadingCallback() | → inherited from LoadOptions |
void | setResourceLoadingCallback(IResourceLoadingCallback value) | |
Allows control over how external resources (images, style sheets) are loaded when a document is imported from HTML or MHTML. | ||
boolean | getSupportVml() | |
void | setSupportVml(boolean value) | |
Specifies whether the HTML parser parses conditional comments of the form <!--[if gte vml 1]> and skips conditional comments of the form <![if !vml]>. | ||
java.lang.String | getTempFolder() | → inherited from LoadOptions |
void | setTempFolder(java.lang.String value) | |
Allows the use of temporary files when reading a document. By default this property is null and no temporary files are used. | ||
boolean | getUpdateDirtyFields() | → inherited from LoadOptions |
void | setUpdateDirtyFields(boolean value) | |
Specifies whether to update the fields with the dirty attribute. | ||
IWarningCallback | getWarningCallback() | → inherited from LoadOptions |
void | setWarningCallback(IWarningCallback value) | |
Called during a load operation, when an issue is detected that might result in data or formatting fidelity loss. | ||
int | getWebRequestTimeout() | |
void | setWebRequestTimeout(int value) | |
The number of milliseconds to wait before the web request times out. The default value is 100000 milliseconds (100 seconds). |
Constructor Detail |
---|
public HtmlLoadOptions()
Example:
Shows how to parse HTML document with conditional comments like "<!--[if gte vml 1]>" and "<![if !vml]>".

    HtmlLoadOptions loadOptions = new HtmlLoadOptions();

    // If value is true, then we parse "<!--[if gte vml 1]>", else parse "<![if !vml]>"
    loadOptions.setSupportVml(true);

    // Wait for a response, when loading external resources
    loadOptions.setWebRequestTimeout(1000);

    Document doc = new Document(getMyDir() + "Conditional comments.htm", loadOptions);
    doc.save(getArtifactsDir() + "HtmlLoadOptions.SupportVml.docx");

public HtmlLoadOptions(java.lang.String password)
password - The password to open an encrypted document. Can be null or empty string.
Example:
Shows how to encrypt an Html document and then open it using a password.

    // Create and sign an encrypted html document from an encrypted .docx
    CertificateHolder certificateHolder = CertificateHolder.create(getMyDir() + "morzal.pfx", "aw");

    SignOptions signOptions = new SignOptions();
    {
        signOptions.setComments("Comment");
        signOptions.setSignTime(new Date());
        signOptions.setDecryptionPassword("docPassword");
    }

    String inputFileName = getMyDir() + "Encrypted.docx";
    String outputFileName = getArtifactsDir() + "HtmlLoadOptions.EncryptedHtml.html";
    DigitalSignatureUtil.sign(inputFileName, outputFileName, certificateHolder, signOptions);

    // This .html document will need a password to be decrypted, opened and have its contents accessed
    // The password is specified by HtmlLoadOptions.Password
    HtmlLoadOptions loadOptions = new HtmlLoadOptions("docPassword");
    Assert.assertEquals(loadOptions.getPassword(), signOptions.getDecryptionPassword());

    Document doc = new Document(outputFileName, loadOptions);
    Assert.assertEquals(doc.getText().trim(), "Test encrypted document.");

public HtmlLoadOptions(int loadFormat, java.lang.String password, java.lang.String baseUri)
loadFormat - A
password - The password to open an encrypted document. Can be null or empty string.
baseUri - The string that will be used to resolve relative URIs to absolute. Can be null or empty string.
Example:
Shows how to specify a base URI when opening an html document.

    // If we want to load an .html document which contains an image linked by a relative URI
    // while the image is in a different location, we will need to resolve the relative URI into an absolute one
    // by creating an HtmlLoadOptions and providing a base URI
    HtmlLoadOptions loadOptions = new HtmlLoadOptions(LoadFormat.HTML, "", getImageDir());

    Document doc = new Document(getMyDir() + "Missing image.html", loadOptions);

    // While the image was broken in the input .html, it was successfully found in our base URI
    Shape imageShape = (Shape) doc.getChildNodes(NodeType.SHAPE, true).get(0);
    Assert.assertTrue(imageShape.isImage());

    // The image will be displayed correctly by the output document
    doc.save(getArtifactsDir() + "HtmlLoadOptions.BaseUri.docx");

Property Getters/Setters Detail |
---|
getBaseUri/setBaseUri | → inherited from LoadOptions |
public java.lang.String getBaseUri() / public void setBaseUri(java.lang.String value) |
This property is used to resolve relative URIs into absolute in the following cases:
Example:
Shows how to open an HTML document with images from a stream using a base URI.

    // Open the stream
    InputStream stream = new FileInputStream(getMyDir() + "Document.html");
    Document doc;
    try {
        // Pass the URI of the base folder so any images with relative URIs in the HTML document can be found
        // Note the Document constructor detects HTML format automatically
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.setBaseUri(getImageDir());

        doc = new Document(stream, loadOptions);
    } finally {
        if (stream != null) stream.close();
    }

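To make "resolving a relative URI into an absolute one" concrete, the following is a minimal sketch in plain Java using `java.net.URI`. It does not use Aspose.Words and is not a description of how the library resolves URIs internally; the class name `BaseUriSketch`, the method `resolve`, and the example.com addresses are hypothetical and exist only for this illustration.

```java
import java.net.URI;

final class BaseUriSketch {

    /**
     * Resolves a reference against a base URI.
     * Assumption for this sketch: a null or empty base leaves the reference unchanged.
     */
    static String resolve(String baseUri, String reference) {
        if (baseUri == null || baseUri.isEmpty()) {
            return reference;
        }
        // URI.resolve returns absolute references unchanged and
        // normalizes "." and ".." segments of relative ones.
        return URI.create(baseUri).resolve(reference).toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("https://example.com/site/", "logo.png"));
        System.out.println(resolve("https://example.com/site/pages/index.html", "../img/a.png"));
    }
}
```

Note that the trailing slash on the base matters: without it, the last path segment is treated as a file name and is replaced by the reference.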
getConvertShapeToOfficeMath/setConvertShapeToOfficeMath | → inherited from LoadOptions |
public boolean getConvertShapeToOfficeMath() / public void setConvertShapeToOfficeMath(boolean value) |
Example:
Shows how to convert shapes with EquationXML to Office Math objects.

    LoadOptions loadOptions = new LoadOptions();
    // Use 'true/false' values to convert shapes with EquationXML to Office Math objects or not
    loadOptions.setConvertShapeToOfficeMath(isConvertShapeToOfficeMath);

    // Specify load option to convert math shapes to office math objects on loading stage
    Document doc = new Document(getMyDir() + "Math shapes.docx", loadOptions);

getEncoding/setEncoding | → inherited from LoadOptions |
public java.nio.charset.Charset getEncoding() / public void setEncoding(java.nio.charset.Charset value) |
This property is used only when loading HTML or TXT documents.
If encoding is not specified in HTML/TXT and this property is null, then the system will try to automatically detect the encoding.
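The precedence described above (encoding declared in the document, then this property, then automatic detection) can be sketched in plain Java. This is only an illustration of the decision order, not Aspose.Words code: the class `EncodingFallbackSketch` and its methods are hypothetical, and the detection step is a deliberately simple byte-order-mark check with a UTF-8 fallback, which is an assumption and almost certainly simpler than what the library does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodingFallbackSketch {

    /** Picks a charset: document declaration first, then the load option, then detection. */
    static Charset choose(Charset declaredInDocument, Charset fromOptions, byte[] content) {
        if (declaredInDocument != null) {
            return declaredInDocument;
        }
        if (fromOptions != null) {
            return fromOptions;
        }
        return detect(content);
    }

    /** Hypothetical detection: looks only at a byte-order mark, otherwise assumes UTF-8. */
    static Charset detect(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(choose(null, null, utf16le));
        System.out.println(choose(null, StandardCharsets.ISO_8859_1, utf16le));
    }
}
```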
getFontSettings/setFontSettings | → inherited from LoadOptions |
public FontSettings getFontSettings() / public void setFontSettings(FontSettings value) |
When loading some formats, Aspose.Words may need to resolve fonts. For example, when loading HTML documents Aspose.Words may resolve the fonts to perform font fallback.
If set to null, the default static font settings are used.
The default value is null.
Example:
Shows how to resolve fonts before loading HTML and SVG documents.

    FontSettings fontSettings = new FontSettings();
    TableSubstitutionRule substitutionRule = fontSettings.getSubstitutionSettings().getTableSubstitution();
    // If "HaettenSchweiler" is not installed on the local machine,
    // it is still considered available, because it is substituted with "Comic Sans MS"
    substitutionRule.addSubstitutes("HaettenSchweiler", new String[]{"Comic Sans MS"});

    LoadOptions loadOptions = new LoadOptions();
    loadOptions.setFontSettings(fontSettings);
    // The same for SVG document
    Document doc = new Document(getMyDir() + "Document.html", loadOptions);

Example:
Shows how to set font settings and apply them during the loading of a document.

    // Create a FontSettings object that will substitute the "Times New Roman" font with the font "Arvo" from our "MyFonts" folder
    FontSettings fontSettings = new FontSettings();
    fontSettings.setFontsFolder(getFontsDir(), false);
    fontSettings.getSubstitutionSettings().getTableSubstitution().addSubstitutes("Times New Roman", "Arvo");

    // Set that FontSettings object as a member of a newly created LoadOptions object
    LoadOptions loadOptions = new LoadOptions();
    loadOptions.setFontSettings(fontSettings);

    // We can now open a document while also passing the LoadOptions object into the constructor so the font substitution occurs upon loading
    Document doc = new Document(getMyDir() + "Document.docx", loadOptions);

    // The effects of our font settings can be observed after rendering
    doc.save(getArtifactsDir() + "Document.LoadOptionsFontSettings.pdf");

getLanguagePreferences | → inherited from LoadOptions |
public LanguagePreferences getLanguagePreferences() |
Example:
Shows how to set up language preferences that will be used when a document is loading.

    LoadOptions loadOptions = new LoadOptions();
    loadOptions.getLanguagePreferences().addEditingLanguage(EditingLanguage.JAPANESE);

    Document doc = new Document(getMyDir() + "No default editing language.docx", loadOptions);

    int localeIdFarEast = doc.getStyles().getDefaultFont().getLocaleIdFarEast();
    if (localeIdFarEast == EditingLanguage.JAPANESE)
        System.out.println("The document either has no FarEast language set in defaults or it was set to Japanese originally.");
    else
        System.out.println("The document default FarEast language was originally set to a language other than Japanese, so it is not overridden.");

getLoadFormat/setLoadFormat | → inherited from LoadOptions |
public int getLoadFormat() / public void setLoadFormat(int value) |
It is recommended that you specify the
Example:
Shows how to load a document as HTML without automatic file format detection.

    LoadOptions loadOptions = new LoadOptions();
    loadOptions.setLoadFormat(com.aspose.words.LoadFormat.HTML);

    Document doc = new Document(getMyDir() + "Document.html", loadOptions);

getMswVersion/setMswVersion | → inherited from LoadOptions |
public int getMswVersion() / public void setMswVersion(int value) |
Example:
Shows how to emulate the loading procedure of a specific Microsoft Word version during document loading.

    // Create a new LoadOptions object, which will load documents according to MS Word 2019 specification by default
    LoadOptions loadOptions = new LoadOptions();
    Assert.assertEquals(MsWordVersion.WORD_2019, loadOptions.getMswVersion());

    Document doc = new Document(getMyDir() + "Document.docx", loadOptions);
    Assert.assertEquals(12.95, doc.getStyles().getDefaultParagraphFormat().getLineSpacing(), 0.005f);

    // We can change the loading version like this, to Microsoft Word 2007
    loadOptions.setMswVersion(MsWordVersion.WORD_2007);

    // This document is missing the default paragraph format style,
    // so when it is opened with either Microsoft Word or Aspose Words, that default style will be regenerated,
    // and will show up in the Styles collection, with values according to Microsoft Word 2007 specifications
    doc = new Document(getMyDir() + "Document.docx", loadOptions);
    Assert.assertEquals(13.8, doc.getStyles().getDefaultParagraphFormat().getLineSpacing(), 0.005f);

getPassword/setPassword | → inherited from LoadOptions |
public java.lang.String getPassword() / public void setPassword(java.lang.String value) |
You need to know the password to open an encrypted document. If the document is not encrypted, set this to null or empty string.
Example:
Shows how to sign an encrypted document file.

    // Create certificate holder from a file
    CertificateHolder certificateHolder = CertificateHolder.create(getMyDir() + "morzal.pfx", "aw");

    SignOptions signOptions = new SignOptions();
    signOptions.setComments("Comment");
    signOptions.setSignTime(new Date());
    signOptions.setDecryptionPassword("docPassword");

    // Digitally sign the document at the specified path, which is encrypted with "docPassword"
    String inputFileName = getMyDir() + "Encrypted.docx";
    String outputFileName = getArtifactsDir() + "DigitalSignatureUtil.DecryptionPassword.docx";
    DigitalSignatureUtil.sign(inputFileName, outputFileName, certificateHolder, signOptions);

getPreferredControlType/setPreferredControlType | |
public int getPreferredControlType() / public void setPreferredControlType(int value) |
Example:
Shows how to set preferred type of document nodes that will represent imported <input> and <select> elements.

    final String html = "\r\n<html>\r\n<select name='ComboBox' size='1'>\r\n"
            + "<option value='val1'>item1</option>\r\n<option value='val2'></option>\r\n</select>\r\n</html>\r\n";

    HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
    htmlLoadOptions.setPreferredControlType(HtmlControlType.STRUCTURED_DOCUMENT_TAG);

    Document doc = new Document(new ByteArrayInputStream(html.getBytes("UTF-8")), htmlLoadOptions);
    NodeCollection nodes = doc.getChildNodes(NodeType.STRUCTURED_DOCUMENT_TAG, true);

    StructuredDocumentTag tag = (StructuredDocumentTag) nodes.get(0);

getPreserveIncludePictureField/setPreserveIncludePictureField | → inherited from LoadOptions |
public boolean getPreserveIncludePictureField() / public void setPreserveIncludePictureField(boolean value) |
By default, the INCLUDEPICTURE field is converted into a shape object. You can override that if you need the field to be preserved, for example, if you wish to update it programmatically. Note, however, that this approach is not common for Aspose.Words. Use it at your own risk.
One of the possible use cases may be using a MERGEFIELD as a child field to dynamically change the source path of the picture. In this case you need the INCLUDEPICTURE to be preserved in the model.
Example:
Shows a way to update a field ignoring the MERGEFORMAT switch.

    LoadOptions loadOptions = new LoadOptions();
    {
        loadOptions.setPreserveIncludePictureField(true);
    }

    Document doc = new Document(getMyDir() + "Field INCLUDEPICTURE.docx", loadOptions);

    for (Field field : doc.getRange().getFields()) {
        if (field.getType() == FieldType.FIELD_INCLUDE_PICTURE) {
            FieldIncludePicture includePicture = (FieldIncludePicture) field;

            includePicture.setSourceFullName(getImageDir() + "Transparent background logo.png");
            includePicture.update(true);
        }
    }

    doc.updateFields();
    doc.save(getArtifactsDir() + "Field.UpdateFieldIgnoringMergeFormat.docx");

getResourceLoadingCallback/setResourceLoadingCallback | → inherited from LoadOptions |
public IResourceLoadingCallback getResourceLoadingCallback() / public void setResourceLoadingCallback(IResourceLoadingCallback value) |
Example:
Shows how to handle external resources in Html documents during loading.

    public void loadOptionsCallback() throws Exception {
        // Create a new LoadOptions object and set its ResourceLoadingCallback attribute
        // as an instance of our IResourceLoadingCallback implementation
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.setResourceLoadingCallback(new HtmlLinkedResourceLoadingCallback());

        // When we open an Html document, external resources such as references to CSS stylesheet files and external images
        // will be handled in a custom manner by the loading callback as the document is loaded
        Document doc = new Document(getMyDir() + "Images.html", loadOptions);
        doc.save(getArtifactsDir() + "Document.LoadOptionsCallback.pdf");
    }

    /**
     * Resource loading callback that, upon encountering external resources,
     * acknowledges CSS style sheets and replaces all images with a substitute.
     */
    private static class HtmlLinkedResourceLoadingCallback implements IResourceLoadingCallback {
        public int resourceLoading(ResourceLoadingArgs args) throws IOException {
            switch (args.getResourceType()) {
                case ResourceType.CSS_STYLE_SHEET:
                    System.out.println("External CSS Stylesheet found upon loading: " + args.getOriginalUri());
                    return ResourceLoadingAction.DEFAULT;
                case ResourceType.IMAGE:
                    System.out.println("External Image found upon loading: " + args.getOriginalUri());

                    final String NEW_IMAGE_FILENAME = "Logo.jpg";
                    System.out.println("\tImage will be substituted with: " + NEW_IMAGE_FILENAME);

                    byte[] imageBytes = DocumentHelper.getBytesFromStream(new FileInputStream(getImageDir() + NEW_IMAGE_FILENAME));
                    args.setData(imageBytes);

                    return ResourceLoadingAction.USER_PROVIDED;
            }

            return ResourceLoadingAction.DEFAULT;
        }
    }

getSupportVml/setSupportVml | |
public boolean getSupportVml() / public void setSupportVml(boolean value) |
Example:
Shows how to parse HTML document with conditional comments like "<!--[if gte vml 1]>" and "<![if !vml]>".

    HtmlLoadOptions loadOptions = new HtmlLoadOptions();

    // If value is true, then we parse "<!--[if gte vml 1]>", else parse "<![if !vml]>"
    loadOptions.setSupportVml(true);

    // Wait for a response, when loading external resources
    loadOptions.setWebRequestTimeout(1000);

    Document doc = new Document(getMyDir() + "Conditional comments.htm", loadOptions);
    doc.save(getArtifactsDir() + "HtmlLoadOptions.SupportVml.docx");

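To show what the two kinds of conditional comments look like and what choosing one branch over the other means, here is a rough sketch in plain Java. It is a regex-based illustration only and makes no claim about how the Aspose.Words HTML parser works; the class `ConditionalCommentSketch`, the method `select`, and the sample markup are hypothetical.

```java
import java.util.regex.Pattern;

final class ConditionalCommentSketch {

    // Block of the form <!--[if gte vml 1]> ... <![endif]-->
    private static final Pattern VML_BLOCK =
            Pattern.compile("<!--\\[if gte vml 1\\]>(.*?)<!\\[endif\\]-->", Pattern.DOTALL);

    // Block of the form <![if !vml]> ... <![endif]>
    private static final Pattern NON_VML_BLOCK =
            Pattern.compile("<!\\[if !vml\\]>(.*?)<!\\[endif\\]>", Pattern.DOTALL);

    /** Keeps the content of one kind of block and drops the other, depending on the flag. */
    static String select(String html, boolean supportVml) {
        if (supportVml) {
            String vmlKept = VML_BLOCK.matcher(html).replaceAll("$1");
            return NON_VML_BLOCK.matcher(vmlKept).replaceAll("");
        }
        String vmlDropped = VML_BLOCK.matcher(html).replaceAll("");
        return NON_VML_BLOCK.matcher(vmlDropped).replaceAll("$1");
    }

    public static void main(String[] args) {
        String html = "<p><!--[if gte vml 1]><v:shape/><![endif]-->"
                + "<![if !vml]><img src=\"a.png\"><![endif]></p>";
        System.out.println(select(html, true));
        System.out.println(select(html, false));
    }
}
```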
getTempFolder/setTempFolder | → inherited from LoadOptions |
public java.lang.String getTempFolder() / public void setTempFolder(java.lang.String value) |
By default this property is null and no temporary files are used.
The folder must exist and be writable, otherwise an exception will be thrown.
Aspose.Words automatically deletes all temporary files when reading is complete.
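Because the folder must exist and be writable, it can help to check it before passing it to `setTempFolder`. The following is a small sketch in plain Java using `java.nio.file`; the class `TempFolderCheckSketch` and the method `isUsableTempFolder` are hypothetical helper names, not part of Aspose.Words, and the library's own validation may differ.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class TempFolderCheckSketch {

    /** Returns true if the given folder exists, is a directory, and is writable. */
    static boolean isUsableTempFolder(String folder) {
        if (folder == null || folder.isEmpty()) {
            return false;
        }
        Path path = Paths.get(folder);
        return Files.isDirectory(path) && Files.isWritable(path);
    }

    public static void main(String[] args) {
        // The JVM's default temporary directory is used here purely as sample input.
        String candidate = System.getProperty("java.io.tmpdir");
        System.out.println(candidate + " usable: " + isUsableTempFolder(candidate));
    }
}
```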
getUpdateDirtyFields/setUpdateDirtyFields | → inherited from LoadOptions |
public boolean getUpdateDirtyFields() / public void setUpdateDirtyFields(boolean value) |
Specifies whether to update the fields with the dirty attribute.
Example:
Shows how to use special property for updating field result.

    Document doc = new Document();
    DocumentBuilder builder = new DocumentBuilder(doc);

    Field fieldToc = builder.insertTableOfContents("\\o \"1-3\" \\h \\z \\u");
    fieldToc.isDirty(true);

    doc.save(getArtifactsDir() + "Field.insertAndUpdateDirtyField.docx");

    Assert.assertTrue(doc.getRange().getFields().get(0).isDirty());

    LoadOptions loadOptions = new LoadOptions();
    loadOptions.setUpdateDirtyFields(false);

    doc = new Document(getArtifactsDir() + "Field.insertAndUpdateDirtyField.docx", loadOptions);

getWarningCallback/setWarningCallback | → inherited from LoadOptions |
public IWarningCallback getWarningCallback() / public void setWarningCallback(IWarningCallback value) |
Example:
Shows how to print and store warnings that occur during document loading.

    public void loadOptionsWarningCallback() throws Exception {
        // Create a new LoadOptions object and set its WarningCallback attribute as an instance of our IWarningCallback implementation
        LoadOptions loadOptions = new LoadOptions();
        loadOptions.setWarningCallback(new DocumentLoadingWarningCallback());

        // Warnings that occur during loading of the document will now be printed and stored
        Document doc = new Document(getMyDir() + "Document.docx", loadOptions);

        ArrayList<WarningInfo> warnings = ((DocumentLoadingWarningCallback) loadOptions.getWarningCallback()).getWarnings();
        Assert.assertEquals(3, warnings.size());
    }

    /**
     * IWarningCallback that prints warnings and their details as they arise during document loading.
     */
    private static class DocumentLoadingWarningCallback implements IWarningCallback {
        public void warning(WarningInfo info) {
            System.out.println("Warning: " + info.getWarningType());
            System.out.println("\tSource: " + info.getSource());
            System.out.println("\tDescription: " + info.getDescription());
            mWarnings.add(info);
        }

        public ArrayList<WarningInfo> getWarnings() {
            return mWarnings;
        }

        private ArrayList<WarningInfo> mWarnings = new ArrayList<>();
    }

getWebRequestTimeout/setWebRequestTimeout | |
public int getWebRequestTimeout() / public void setWebRequestTimeout(int value) |
Example:
Shows how to parse HTML document with conditional comments like "<!--[if gte vml 1]>" and "<![if !vml]>".

    HtmlLoadOptions loadOptions = new HtmlLoadOptions();

    // If value is true, then we parse "<!--[if gte vml 1]>", else parse "<![if !vml]>"
    loadOptions.setSupportVml(true);

    // Wait for a response, when loading external resources
    loadOptions.setWebRequestTimeout(1000);

    Document doc = new Document(getMyDir() + "Conditional comments.htm", loadOptions);
    doc.save(getArtifactsDir() + "HtmlLoadOptions.SupportVml.docx");

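As a point of comparison for what a millisecond request timeout means, the sketch below sets connect and read timeouts on a standard `java.net.URLConnection`. It is an analogy in plain Java and does not describe how Aspose.Words issues its web requests; the class `WebRequestTimeoutSketch`, the method `openWithTimeout`, and the example.com URL are hypothetical. The constant mirrors the documented default of 100000 milliseconds.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URLConnection;

final class WebRequestTimeoutSketch {

    /** Documented default of the WebRequestTimeout property: 100000 ms (100 seconds). */
    static final int DEFAULT_TIMEOUT_MILLIS = 100_000;

    /**
     * Prepares a connection with the given timeout applied to both connecting and reading.
     * No network traffic happens here; the connection is only opened lazily on first use.
     */
    static URLConnection openWithTimeout(String url, int timeoutMillis) {
        try {
            URLConnection connection = URI.create(url).toURL().openConnection();
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URLConnection connection = openWithTimeout("http://example.com/style.css", 1000);
        System.out.println("connect timeout: " + connection.getConnectTimeout() + " ms");
        System.out.println("read timeout: " + connection.getReadTimeout() + " ms");
    }
}
```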