Changes from 1.43.0 to 1.44.0
Core
- Added code in Segments to preserve TextPart properties and annotations after segmentation. Also added code to handle the deepened segmentation case so that new segments get proper ids: if the parent segment id is s1 and the parent is segmented further, the child segment ids are s1.1, s1.2, s1.3, etc.
- Can use space + backslash at the end for a line break. It generates a <br /> without ending the paragraph or the list item.
- Recent JDK releases (4/2022) have set XPath operator limits to smaller values to enhance security. We override these defaults to allow ITS-based filters to work without limits.
- Fix to GenericSkeleton to allow deep copy of all parent references (not just “self”).
- Updates to ISegmenter methods to allow preservation of inline Code ids when joining segments in TextUnitMerger.
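
The XPath limit override mentioned above can be illustrated with the JDK's documented system properties. This is a minimal sketch, not the framework's actual code: the class and method names are hypothetical, and it assumes the override is done through system properties set before the XPath factory is created. The three property names are the JDK's own; a value of 0 or less means "no limit".

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
final class XPathLimitsSketch {
    // JDK system properties that cap XPath complexity.
    static final String[] LIMIT_PROPERTIES = {
        "jdk.xml.xpathExprGrpLimit",  // groups per expression
        "jdk.xml.xpathExprOpLimit",   // operators per expression
        "jdk.xml.xpathTotalOpLimit"   // operators across all expressions
    };

    // Set every limit to 0, which the JDK treats as "no limit".
    static void disableXPathLimits() {
        for (String name : LIMIT_PROPERTIES) {
            System.setProperty(name, "0");
        }
    }

    // Build "1 + 1 + ... + 1" with the given number of terms and evaluate it.
    // With more than 100 operators this exceeds the default operator limit
    // on JDKs that enforce one.
    static String evaluateLongSum(int terms) {
        StringBuilder expr = new StringBuilder("1");
        for (int i = 1; i < terms; i++) {
            expr.append(" + 1");
        }
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expr.toString(),
                    new InputSource(new StringReader("<r/>")));
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        disableXPathLimits();
        System.out.println(evaluateLongSum(151));
    }
}
```

Setting the properties before the first factory is created keeps the sketch independent of when a given JDK version reads them.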
Connectors
- GlobalSight
  - Removed
Filters
- PO Filter
  - Fixed an issue that caused bilingual PO files not to merge correctly when a subfilter was applied: PR #605
- IDML Filter
  - Improved: initial support for end notes provided: issue #856, styles handling for nested elements.
  - Improved: custom text variables can be optionally translated: issue #1138
  - Improved: index topics can be optionally translated: issue #1139
  - Improved: Rainbow UI for font mappings provided: issue #1149
- Markdown Filter
  - Added support for Admonition syntax: PR #621
- OpenXML Filter
  - Improved: font mapping for XLSX documents provided: issue #972
  - Improved: revisions automatically accepted in XLSX documents: issue #983
  - Improved: hidden styled text parts extracted as modifiable in PPTX documents: issue #1011
  - Fixed: the handling of cell references in table parts clarified: issue #1143
  - Fixed: differential format reading clarified: issue #1144
  - Improved: Rainbow UI for complex worksheet configurations provided, deprecated column exclusion configurations for XLSX documents removed: issue #1147
  - Improved: Rainbow UI for font mappings provided: issue #1150
  - Fixed: empty referent runs handling clarified: issue #1157
- XLIFF2 Filter
  - Fixed XLIFF2 filter handling of ignorable: auto-create target (copy of source) if needed.
  - XLIFF2 segment and ignorable TextParts are now given auto-generated ids.
  - If the XLIFF2 segment state is not initial, then write target only if there is no content.
Libraries
- Serialization Library
  - Added a new Google Protocol Buffers (protobuf) based library to serialize TextUnits. The library produces a serialized file in three formats: (1) binary protobuf, (2) textual protobuf, (3) standard JSON. The serialized file can be used in place of XLIFF 1.2 to facilitate extraction and merge using OriginalDocumentTextUnitFlatMergerStep.
Steps
- Segmentation Step
  - Added setDoNotSegmentIfHasTarget option (default is false). If true, segmentation is turned off when the TextUnit has a target. This protects against producing misalignments.
- XLIFF Word-Count Splitter Step
  - Fixed: context groups copied on splitting: issue #1156
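
The behavior of the Segmentation Step's setDoNotSegmentIfHasTarget option can be sketched as follows. This is a self-contained illustration under stated assumptions, not the step's real implementation: the Unit class, the segment method, and the naive sentence splitter are hypothetical stand-ins for the framework's TextUnit and segmenter.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model, for illustration only.
final class DoNotSegmentSketch {
    // Stand-in for a text unit: source text plus an optional target.
    static final class Unit {
        final String source;
        final String target; // null or empty means "no target"

        Unit(String source, String target) {
            this.source = source;
            this.target = target;
        }

        boolean hasTarget() {
            return target != null && !target.isEmpty();
        }
    }

    // Returns the source segments. When the option is on and a target exists,
    // the source is left as a single segment so it cannot drift out of
    // alignment with the already-present target.
    static List<String> segment(Unit unit, boolean doNotSegmentIfHasTarget) {
        if (doNotSegmentIfHasTarget && unit.hasTarget()) {
            return Collections.singletonList(unit.source);
        }
        List<String> segments = new ArrayList<>();
        // Naive splitter: break after '.', '!' or '?' followed by whitespace.
        for (String part : unit.source.split("(?<=[.!?])\\s+")) {
            if (!part.isEmpty()) {
                segments.add(part);
            }
        }
        return segments;
    }

    public static void main(String[] args) {
        Unit translated = new Unit("One. Two.", "Un. Deux.");
        System.out.println(segment(translated, true));
        System.out.println(segment(translated, false));
    }
}
```

With the option left at its default of false, a unit is segmented whether or not it has a target.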
Applications
- Tikal
  - Updated Tikal to preserve whitespace in the extracted XLIFF 1.2.