Release of XMLmind Word To XML v1.8
September 30, 2019
Submitted by Hussein Shafie, XMLmind Software.
Important enhancements and bug fixes. Recommended upgrade, especially if you use w2x to convert DOCX to DITA. More information below.
What is XMLmind Word To XML?
XMLmind Word To XML can automatically convert DOCX files to:
- Clean, styled, valid HTML (single page or multi-page HTML, Web Help, EPUB) looking very much like the source DOCX file.
- Unstyled, but structured and valid, DITA bookmap, map, topic, DocBook, XHTML (single page or multi-page HTML, Web Help, EPUB) or XML conforming to your custom schema.
Home Page: https://www.xmlmind.com/w2x/
Download: https://www.xmlmind.com/w2x/download.shtml
Free online DOCX conversion services: https://www.xmlmind.com/w2x/online_w2x.html
Change log
Enhancements:
- XMLmind Word To XML is better at choosing user-specified bookmarks (expected to have long and descriptive names like "Edit_a_citation") over bookmarks automatically generated by MS-Word (e.g. "BM3").
This may have important benefits when converting a DOCX file to multiple semantic XML files (e.g. a DITA map and its associated topics) because in such a case, the names of the generated files are generally inferred from user-specified bookmarks.
In order to implement this enhancement, we had to replace parameter edit.ids.automatic-ids with the new parameter convert.automatic-ids. This was done to move the detection of automatic bookmarks to an earlier stage of the conversion process.
- Upgraded XMLmind Web Help Compiler (whc for short) to version 2.3.2.
- XMLmind Word To XML, which passed all non-regression tests, is now officially supported on Java™ 13 platforms.
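To illustrate the bookmark enhancement above, here is a minimal Python sketch of preferring user-specified bookmark names over automatically generated ones. It is not how w2x is implemented. The pattern and the function names is_automatic and pick_bookmark are assumptions made for this example; only the names "BM3" and "Edit_a_citation" come from these release notes.

```python
import re

# Assumption: names that look machine-generated. "BM3" is the example given in
# the release notes; "_Toc", "_Ref", "_Hlk" and "_GoBack" are names MS-Word
# commonly gives to hidden bookmarks. The real w2x detection rules may differ.
AUTOMATIC_NAME = re.compile(r"^(BM\d+|_(Toc|Ref|Hlk)\d+|_GoBack)$")


def is_automatic(name):
    """Return True if the bookmark name looks automatically generated."""
    return AUTOMATIC_NAME.match(name) is not None


def pick_bookmark(names):
    """Pick the first user-specified bookmark; fall back to the first name."""
    for name in names:
        if not is_automatic(name):
            return name
    return names[0] if names else None


# A heading carrying both kinds of bookmark yields the descriptive name,
# which can then serve as the basis for a generated file name.
chosen = pick_bookmark(["BM3", "Edit_a_citation"])
```

With such a rule, a topic would be named after "Edit_a_citation" rather than "BM3".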
Bug fixes:
- Some “complex” fields containing a reference to a bookmark (e.g. field REF (Foo) \h, where the bookmark name "(Foo)" contains characters U+FF08 and U+FF09) were converted to broken links. This bug may have had important and somewhat surprising consequences, for example, on generated DITA maps, so please upgrade to version 1.8.
- The first index entry marking a given page range (e.g. field XE "XML" \r "OpenXMLPageRange") in the DOCX document was correctly converted to a DITA or DocBook index term range (e.g. <indexterm start="OpenXMLPageRange">XML</indexterm> and <indexterm end="OpenXMLPageRange"/>), but subsequent index entries marking the same page range (e.g. field XE "Extensible Markup Language" \r "OpenXMLPageRange") were converted to single-point index terms (e.g. <indexterm>Extensible Markup Language</indexterm>).
Now subsequent index entries marking the same page range are converted to index terms containing a redirection to the first index term referencing this page range (e.g. <indexterm>Extensible Markup Language<index-see>XML</index-see></indexterm>).
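The corrected handling of index entries sharing a page range can be sketched as follows. This is a simplified Python illustration of the behavior described above, not the actual w2x code. The function name convert_range_entries and the (term, bookmark) input format are assumptions made for this example.

```python
from xml.sax.saxutils import escape, quoteattr


def convert_range_entries(entries):
    """Convert (term, range_bookmark) pairs, in document order, to index terms.

    The first entry for a given bookmark becomes a start/end index term range.
    Later entries for the same bookmark become index terms redirecting to the
    first one. For simplicity the start and end elements are emitted together;
    in a real document the end element sits at the end of the page range.
    """
    first_term = {}  # range bookmark -> term of the first entry that used it
    out = []
    for term, bookmark in entries:
        if bookmark not in first_term:
            first_term[bookmark] = term
            out.append(
                f"<indexterm start={quoteattr(bookmark)}>{escape(term)}</indexterm>"
            )
            out.append(f"<indexterm end={quoteattr(bookmark)}/>")
        else:
            see = escape(first_term[bookmark])
            out.append(
                f"<indexterm>{escape(term)}<index-see>{see}</index-see></indexterm>"
            )
    return out


result = convert_range_entries([
    ("XML", "OpenXMLPageRange"),
    ("Extensible Markup Language", "OpenXMLPageRange"),
])
```

Applied to the two XE fields quoted above, the sketch produces the same three elements as the examples in this bug fix entry.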
Incompatibilities:
- XMLmind Word To XML now requires a Java 8+ runtime in order to compile and run.
- Parameter edit.ids.automatic-ids is now ignored after reporting a warning. This parameter has been replaced by new parameter convert.automatic-ids. See above enhancement.
- On Windows, EMF graphics are now converted to PNG using a resolution of at least 300DPI. Therefore the default rule used to perform this conversion is (resolution -300 means: use 300DPI if intrinsic resolution is less than 300DPI):
.emf.png.wmf.png java:com.xmlmind.w2x_ext.emf2png.EMF2PNG resolution -300
Previously this rule was (resolution 0 means: use intrinsic resolution):
.emf.png.wmf.png java:com.xmlmind.w2x_ext.emf2png.EMF2PNG resolution 0
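The meaning of the two resolution values above can be expressed as a small Python function. This is only a sketch of the semantics as stated in these notes, not the EMF2PNG implementation. The function name effective_dpi is hypothetical, and the handling of a positive value (forcing that resolution) is an assumption not confirmed here.

```python
def effective_dpi(setting, intrinsic_dpi):
    """Return the resolution used for the PNG, given the 'resolution' setting."""
    if setting == 0:
        # resolution 0: use the intrinsic resolution of the graphic.
        return intrinsic_dpi
    if setting < 0:
        # resolution -N: use N DPI if the intrinsic resolution is less than N.
        return max(intrinsic_dpi, -setting)
    # Assumption: a positive value forces that resolution.
    return setting


old_default = effective_dpi(0, 96)     # previous rule keeps 96 DPI
new_default = effective_dpi(-300, 96)  # new rule raises it to 300 DPI
```

A graphic whose intrinsic resolution is already above 300DPI is unaffected by the new default.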