Changes from M28 to M29

Applications

  • Tikal

    • The scoping report option now outputs character counts in addition to word counts by default.

Filters

  • IDML Filter

    • Fixed a concurrency issue that could cause crashes when multiple instances of the filter were used simultaneously.
  • OpenXML Filter

    • The way formatting information is converted to codes has changed. The filter will now attempt to streamline code generation by considering whether the formatting applied to a text run can be considered a “nested” format within the existing formatting. For example, a bold, italic run would be considered “nested” within a bold run. This allows for a more natural code mapping that should be more intuitive for translators, and is also more closely aligned with other tools.
    • Style inheritance is now considered when calculating the formatting in effect for a run of text.
    • Right-to-left (RTL) support has been added for paragraphs, table content in DOCX files and some DrawingML constructs.
    • Fixed issue #486. Simple and complex fields are now represented as a single code for the entire field.
    • Fixed issue #487. Runs that differ only in script specified for non-overlapping codepoint ranges can now be merged. This reduces the number of inline codes produced in some cases.
    • Fixed issue #502. Cells that are in rows and columns that are hidden will no longer be exposed for translation by default. This brings the behavior of the Excel filter into alignment with the behavior of the other OpenXML filters. A new option, “Translate Hidden Rows and Columns”, has been added to the configuration for the Excel portion of the OpenXML filter.
    • The “Clean Tags Aggressively” option will now strip <w:bCs> and <w:szCs> tags from Word documents.
    • Fixed a crash that could occur when parsing files with enormous attribute values.
    • The non-breaking hyphen is now converted to a character, rather than treated as a tag.
  • ITS Filter

    • Added type for text units coming from attributes (value: x-<attribute-name>).
  • Table Filter

    • Fixed issue #511. Empty targets with delimiters are now merged properly.
  • TXML Filter

    • Fixed issue #501, where commented-out segment elements were deleted from the output file.
  • XLIFF Filter

    • Fixed issue #500, where alt-trans proposals with a match-quality score in decimal form (“100.00”) were treated as having a score of 0.
    • Added support for changing the original attribute values of sdlxliff files based on the okf_xliff-sdl filter configuration. The conf and locked attributes are also supported.
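The OpenXML Filter note above on "nested" formatting can be illustrated with a small sketch. This is not the filter's implementation: the function name runs_to_markup, the run representation (text plus a set of format names) and the tag-like output are all assumptions made for illustration. It only shows the idea that a bold, italic run following a bold run can reuse the open bold code and add a single italic code inside it.

```python
def runs_to_markup(runs):
    """Render (text, formats) runs as nested tag-like codes.

    Hypothetical illustration only; `formats` is a set of format names
    such as {"b", "i"}. Not the OpenXML filter's actual algorithm.
    """
    out = []
    open_stack = []  # formats currently open, outermost first
    for text, formats in runs:
        # Close innermost formats until everything still open applies to this run.
        while open_stack and not set(open_stack) <= formats:
            out.append(f"</{open_stack.pop()}>")
        # Open only the formats that are new for this run (sorted for determinism).
        for fmt in sorted(formats - set(open_stack)):
            out.append(f"<{fmt}>")
            open_stack.append(fmt)
        out.append(text)
    # Close whatever is still open at the end of the paragraph.
    while open_stack:
        out.append(f"</{open_stack.pop()}>")
    return "".join(out)


nested = runs_to_markup([("Hello ", {"b"}), ("world", {"b", "i"})])
```

In this sketch the italic code ends up inside the bold code, instead of the bold code being closed and a separate bold-plus-italic code being opened for the second run.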

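The XLIFF Filter fix for issue #500 concerns match-quality values written in decimal form. As a rough sketch of tolerant parsing, assuming the value is a number with an optional trailing percent sign, the hypothetical helper below accepts both "100" and "100.00". The name parse_match_quality, the clamping to 0-100 and the fallback of 0 for unparseable input are illustrative choices, not a description of the filter's code.

```python
import math


def parse_match_quality(raw: str) -> int:
    """Return an integer score in 0..100 from a match-quality string.

    Hypothetical helper for illustration; falls back to 0 when the
    value cannot be read as a finite number.
    """
    text = raw.strip().rstrip("%")
    try:
        score = float(text)  # accepts "100", "100.00", "99.4"
    except ValueError:
        return 0
    if not math.isfinite(score):
        return 0
    # Clamp to the usual percentage range before converting to int.
    return max(0, min(100, int(round(score))))
```

Parsing through a floating-point conversion first is what keeps a value such as "100.00" from being rejected by an integer-only parser and silently scored as 0.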
Libraries

  • XLIFFWriter

    • Added support for state-qualifier output in main <target>.

Connectors

  • Pensieve

    • **IMPORTANT: Code.codesToString() changes.** The Pensieve TM format has changed and is not backwards compatible. You will need to export your TMs and re-import them with M29.

Steps

  • Added character count Steps

    • The Character Count step calculates character counts per the GMX-V 2.0 standard and stores them in a Metrics annotation (like the Word Count step). There are also steps for counting all GMX non-translatable categories (ProtectedCharacterCount, etc.) and Okapi categories (Concordance, FuzzyMatch, MT, etc.).
  • GMX “-Only” word count Steps

    • The AlphanumericOnly, NumericOnly, and MeasurementOnly word count steps now follow the GMX standard in that they only give non-zero counts for TUs that consist solely of tokens of the relevant type. (Previously they merely counted relevant tokens.)
  • Translation Comparison Step

    • Added an option to use the target of the alt-trans element for a given origin value when processing an XLIFF file as the second file. This makes it possible to compare an MT candidate placed in an alt-trans entry with the actual translation in the main target element.
  • Scoping Report Step

    • The Scoping Report step can now report character counts when the relevant annotations are present. Use both the Word Count and Character Count steps to get full detail. The default template has been updated to include character counts for the included categories.
  • Post-segmentation Inline Codes Removal Step

    • Added step that attempts to simplify (trim and merge) as many inline codes as possible by looking at each linguistically distinct segment in a TextUnit.
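The change to the GMX "-Only" word count steps described above can be sketched as follows. This is a simplified illustration, not the steps' implementation: tokens are split on whitespace, the regular expression for numeric tokens is an assumption, and returning the number of tokens for a qualifying text unit is also assumed (the notes only state that the count is non-zero in that case).

```python
import re

# Assumed definition of a numeric token, e.g. "12", "3.5", "1,000".
NUMERIC = re.compile(r"^[+-]?\d+(?:[.,]\d+)*$")


def is_numeric(token: str) -> bool:
    return bool(NUMERIC.match(token))


def count_numeric_tokens(text: str) -> int:
    """Previous behavior: count every numeric token in the text unit."""
    return sum(1 for token in text.split() if is_numeric(token))


def numeric_only_count(text: str) -> int:
    """M29 behavior: non-zero only if every token in the text unit is numeric."""
    tokens = text.split()
    if tokens and all(is_numeric(token) for token in tokens):
        return len(tokens)
    return 0
```

With these definitions, a text unit such as "Chapter 12" contributes one token under the old counting but nothing under the "-Only" rule, while "12 3.5 1,000" qualifies because it contains nothing but numeric tokens.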

Connectors

  • KantanMT Support

    • Added a new connector to support KantanMT.
  • Microsoft Translation Hub

    • Fixed an issue that occurred when using trained engines with certain target languages.