Skip to content

Fix XLIFF 1.2 parser for Symfony compatibility and standards compliance#495

Draft
daviddallet wants to merge 1 commit intomirego:masterfrom
daviddallet:fix/issue-494-xliff-1-2-standard-compliance
Draft

Fix XLIFF 1.2 parser for Symfony compatibility and standards compliance#495
daviddallet wants to merge 1 commit intomirego:masterfrom
daviddallet:fix/issue-494-xliff-1-2-standard-compliance

Conversation

@daviddallet
Copy link
Copy Markdown

DRAFT PULL REQUEST ONLY


Issue #494

Fix XLIFF 1.2 parser for standards compliance (#494)

Replace rigid pattern matching with flexible element-finding functions so the parser handles standard-compliant XLIFF files from tools like Symfony: <xliff> root wrappers, <header> elements, extra attributes and children in <trans-unit>, and reordered <source>/<target>.


Note: the diff is mostly the result of Claude Code (Opus 4.6).

I didn't made enough checks myself to send this pull request with a status different than draft.
I don't have knowledge of this product, I am not used to this tech stack so take this with appropriate precautions.

@daviddallet
Copy link
Copy Markdown
Author

Should stay in draft before updates, not ready yet.

I believed CI failed in both cases (master and this) when trying to run CI through github actions on a cloned repo but it appears that I looked at it too quickly, CI failure causes were different at last attempts.

@daviddallet daviddallet changed the title Fix XLIFF 1.2 parser for Symfony compatibility and standards compliance (#494) Fix XLIFF 1.2 parser for Symfony compatibility and standards compliance Feb 20, 2026
Issue mirego#494 about Symfony related files

Replace rigid pattern matching with flexible element-finding functions
so the parser handles standard-compliant XLIFF files from tools like
Symfony: <xliff> root wrappers, <header> elements, extra attributes
and children in <trans-unit>, and reordered <source>/<target>.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@daviddallet daviddallet force-pushed the fix/issue-494-xliff-1-2-standard-compliance branch from 9333433 to ba3d793 Compare February 20, 2026 21:47
Copy link
Copy Markdown
Author

@daviddallet daviddallet Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opus 4.6 LLM's explanation about this file choices

Analysis of parse_only_test.exs

File: test/langue/xliff_1_2/parse_only_test.exs

What it does

It's the only parse_only_test.exs in the entire repo. It tests 4 expectati
on modules (SymfonyFormat, HeaderAndBody, TransUnitWithNotes, ReorderedSo urceTarget) by running only test_parse -- no test_serialize.

The standard formatter_test.exs next to it tests the Simple expectation with
both test_parse and test_serialize.

Why "parse only"?

These 4 test cases exercise XML structures that the parser can accept but th
e serializer cannot reproduce:

  • SymfonyFormat -- input is wrapped in <xliff> root element. The seriali
    zer always outputs a bare <file> element (see serializer.ex:37), so `parse -

serialize -> compare to input` would fail.

  • HeaderAndBody -- input has a <header> element. The serializer only emi
    ts <body>, never <header>.
  • TransUnitWithNotes -- input has <note> children and xml:space attrib
    utes. The serializer strips those.
  • ReorderedSourceTarget -- input has <target> before <source>. The ser
    ializer always emits <source> first.

In all 4 cases, render -> parse -> entries works correctly, but entries -> se rialize -> render produces different XML than the original input. So a round-tr
ip assertion (assert expected_serialize == result_serialize) would fail. The f
ile exists specifically to test parser tolerance without asserting serialization
fidelity.

Does it match the repo's style?

No. It's an outlier:

  1. Unique naming -- Every other formatter (JSON, Android, Rails, Gettext, et
    c.) has exactly one formatter_test.exs that tests both parse and serialize tog
    ether. No other formatter has a parse_only_test.exs.

  2. Different loop pattern -- The standard formatter_test.exs uses Module. concat(LangueTest.Formatter.XLIFF12.Expectation, test) to resolve modules from
    short atoms like Simple. The parse_only_test.exs uses fully-qualified aliase
    s directly (SymfonyFormat etc.), a slightly different convention.

  3. Conceptual split -- The repo's convention is: one expectation module = on
    e test case that round-trips (parse + serialize). When round-tripping isn't poss
    ible, the standard approach in other formatters is simply to not include that ca
    se. The parse_only_test.exs file invents a new pattern to work around this.

Is it needed for coverage?

Yes, it provides real value. Those 4 cases test important parser robustness:
handling <xliff> wrappers, <header> elements, extra child nodes like `<note

, and reordered elements. Without this file, those parser code paths in extra
ct_bodyandparse_line` would be untested.

Summary

The file serves a legitimate purpose (testing parser tolerance of real-world XLI
FF variations) but introduces a one-off pattern that doesn't exist anywhere else
in the repo. The idiomatic alternative would be to fold these parse-only cases
into formatter_test.exs with a conditional (only assert parse, skip serialize
for certain cases), or to split the loop into two -- but neither pattern exists
in the repo today either. It's a pragmatic addition that gets the job done at
the cost of consistency.

defmodule Simple do
@moduledoc false
use Langue.Expectation.Case
use Case
Copy link
Copy Markdown
Author

@daviddallet daviddallet Feb 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opus 4.6 explanation about this line change. Summary: due to mandatory Styler plugin

In expectation_test.exs, the diff changes the pre-existing Simple module
:

+  alias Langue.Expectation.Case
+
   defmodule Simple do
     @moduledoc false
-    use Langue.Expectation.Case
+    use Case

This change to Simple is not necessary for the fix. The original use Langue.Expectation.Case worked fine. The only reason this changed is because mix format with the Styler plugin auto-refactored it: when we added 4 new modules that also use Langue.Expectation.Case, Styler's AliasUsage rule (alias if used more than N times) kicked in and extracted the alias at the parent module level, then rewrote all 5 use calls to use Case.

Verdict: This is a cosmetic/style change forced by the formatter. It's harmless and consistent with project conventions (the project runs Styler as a mandatory format plugin), but it is not part of the bug fix itself. Reverting it would cause mix format --check-formatted to fail, so it must stay if the new modules stay.

@daviddallet
Copy link
Copy Markdown
Author

I believed CI failed in both cases (master and this) when trying to run CI through github actions on a cloned repo but it appears that I looked at it too quickly, CI failure causes were different at last attempts.

Now github CI on forked repo fails at the same step for both master and the branch of this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants