All images are stored inside Shape nodes in a Document.
To extract all images, or only images of a specific type, from a document, follow these steps:
· Use the Document.GetChildNodes method to select all Shape nodes.
· Iterate through the resulting node collection.
· Check the Shape.HasImage boolean property.
· Extract the image data using the Shape.ImageData property.
· Save the image data to a file.
Example
Shows how to extract images from a document and save them as files.
[Java]
// Requires: import com.aspose.words.*;
// getMyDir() is a helper defined elsewhere in the sample code; it is used here as the directory prefix for input and output files.
public void extractImagesToFiles() throws Exception
{
    Document doc = new Document(getMyDir() + "Image.SampleImages.doc");

    // Collect every Shape node in the document, including nested ones.
    NodeCollection shapes = doc.getChildNodes(NodeType.SHAPE, true);
    int imageIndex = 0;
    for (Shape shape : (Iterable<Shape>) shapes)
    {
        // Skip shapes that do not hold an image.
        if (shape.hasImage())
        {
            // Pick the file extension that matches the stored image type.
            String imageFileName = java.text.MessageFormat.format(
                "Image.ExportImages.{0} Out{1}", imageIndex, FileFormatUtil.imageTypeToExtension(shape.getImageData().getImageType()));
            shape.getImageData().save(getMyDir() + imageFileName);
            imageIndex++;
        }
    }

    // Newer Microsoft Word documents (such as DOCX) may contain a different type of image container called DrawingML.
    // Repeat the process to extract these if they are present in the loaded document.
    NodeCollection dmlShapes = doc.getChildNodes(NodeType.DRAWING_ML, true);
    for (DrawingML dml : (Iterable<DrawingML>) dmlShapes)
    {
        if (dml.hasImage())
        {
            // imageIndex continues from the Shape loop, so file names stay unique.
            String imageFileName = java.text.MessageFormat.format(
                "Image.ExportImages.{0} Out{1}", imageIndex, FileFormatUtil.imageTypeToExtension(dml.getImageData().getImageType()));
            dml.getImageData().save(getMyDir() + imageFileName);
            imageIndex++;
        }
    }
}
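A note on the file-naming step: java.text.MessageFormat formats numeric arguments with the locale's number format, so an index of 1000 may be written with a grouping separator (for example "1,000" in a US locale). The sketch below is not part of the Aspose.Words API. The class name ImageFileNames and both method names are hypothetical, chosen only to illustrate the difference, and the sketch assumes the extension is passed with its leading dot.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper, for illustration only; not part of Aspose.Words.
class ImageFileNames
{
    // Builds the name by plain concatenation, so the index is never
    // written with locale-specific grouping separators.
    static String build(int imageIndex, String extension)
    {
        return "Image.ExportImages." + imageIndex + " Out" + extension;
    }

    // Same pattern as the example above, but with the locale fixed to US
    // so that the grouping behaviour is reproducible.
    static String buildWithMessageFormat(int imageIndex, String extension)
    {
        MessageFormat format = new MessageFormat("Image.ExportImages.{0} Out{1}", Locale.US);
        return format.format(new Object[] { imageIndex, extension });
    }

    public static void main(String[] args)
    {
        System.out.println(build(1000, ".png"));                  // Image.ExportImages.1000 Out.png
        System.out.println(buildWithMessageFormat(1000, ".png")); // Image.ExportImages.1,000 Out.png
    }
}
```

For documents with fewer than a thousand images the two approaches produce identical names, so this only matters for very large documents or when the output names are parsed later.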