-
Notifications
You must be signed in to change notification settings - Fork 2
Fix Workflow Step Output Ordering #149
Description
We probably need Option 2 here and we should research if this is an issue with Galaxy workflows - I think probably?
CWL MultipleInputFeatureRequirement: Non-deterministic Output Ordering
Test
test_conformance_v1_2_multiple_input_feature_requirement — passes ~50% of the time.
Workflow has two independent steps (step1→"hello", step2→"world") with outputSource: [step1/out, step2/out]. Expected: ["hello\n", "world\n"].
Previous Fix (1e39176)
Sorted WorkflowInvocation.to_dict() output associations by workflow_output.workflow_step.order_index. This sorts by Galaxy's internal step ordering — not by the CWL outputSource array position.
What the fix does correctly
The sort is on the right code path:
- Outputs ARE HDAs going through
output_datasets - Pydantic serialization preserves list order
- Test framework correctly reads the API response
invocation_to_output()preserves list order from the API
Full chain: to_dict() → WorkflowInvocationResponse Pydantic model → JSON → invocation_response.json() → invocation_to_output() → get_output_as_object() → cwltest.compare.compare().
The problem: order_index is the wrong sort key
The sort key workflow_output.workflow_step.order_index represents the step's position in Galaxy's topological sort, derived from the order cwltool returns the steps. The CWL spec says the outputSource ARRAY order defines the result array order — [step1/out, step2/out] means step1's output MUST be first.
Galaxy doesn't store the outputSource array position anywhere. It only stores which step each WorkflowOutput belongs to. The sort by order_index only works if step definition order matches outputSource array order — a coincidence, not a guarantee.
Root Cause: cwltool intentionally shuffles steps
cwltool/workflow.py:108:
random.shuffle(self.steps)cwltool intentionally randomizes its WorkflowStep list after construction — specifically to surface ordering-dependent bugs in workflow engines. This is exactly the kind of bug it's designed to catch.
Key distinction:
self._workflow.steps→list[WorkflowStep]— shuffled every loadself._workflow.tool["steps"]→ raw CWL document dict — never shuffled, retains document order
Galaxy's step_proxies() iterates self._workflow.steps (the shuffled one):
for i, step in enumerate(self._workflow.steps):
proxies.append(build_step_proxy(self, step, i + num_input_steps))With 2 steps, random.shuffle has exactly 2 permutations:
- Run A: shuffle gives
[step1, step2]→order_indexstep1=0, step2=1 → sort gives[hello, world]→ PASS - Run B: shuffle gives
[step2, step1]→order_indexstep2=0, step1=1 → sort gives[world, hello]→ FAIL
Each test run starts a new Galaxy server process → ~50% pass rate.
How to verify
Add a log.warning in WorkflowProxy.step_proxies() to log the step IDs in iteration order:
def step_proxies(self):
if self._step_proxies is None:
proxies = []
num_input_steps = len(self._workflow.tool["inputs"])
for i, step in enumerate(self._workflow.steps):
log.warning(f"CWL step {i}: {step.id}") # <-- add this
proxies.append(build_step_proxy(self, step, i + num_input_steps))
self._step_proxies = proxies
return self._step_proxiesRun the test several times. If step1 and step2 swap positions between runs, this confirms the theory.
Fix Options
Option 1: Sort steps at Galaxy's import boundary (quick fix, no migration)
Sort self._workflow.steps by a deterministic key in step_proxies():
for i, step in enumerate(sorted(self._workflow.steps, key=lambda s: s.tool["id"])):Or better, sort once at WorkflowProxy.__init__ and store as self._sorted_steps.
Pros: Single-line fix, zero DB changes, neutralizes cwltool's shuffle.
Cons: Alphabetical by step ID may not match outputSource array order if step names don't sort in document order. For the specific conformance test (step1/step2), alphabetical == document order.
Option 2: Store source_index on WorkflowOutput (full fix, needs migration)
Add a source_index column to the workflow_output table during CWL import, capturing each output's position in the outputSource array. Sort by source_index in to_dict().
Pros: Fully correct per CWL spec regardless of step naming.
Cons: Requires Alembic migration, heavier change.
Option 3: Sort multi-output list by outputSource position at to_dict() time
Reconstruct the outputSource order at serialization time. This is fragile because to_dict() doesn't have access to the CWL workflow definition.
Not recommended.
Recommendation
Start with Option 1. It fixes the immediate 50% flakiness with minimal risk. If a CWL workflow is later found where alphabetical step ID order doesn't match outputSource order, upgrade to Option 2.
Also audit other uses of self._workflow.steps — the runnables property (parser.py:775) also iterates it unsorted.
Key code locations
| Location | Role |
|---|---|
cwltool/workflow.py:108 |
random.shuffle(self.steps) — the root cause |
lib/galaxy/tool_util/cwl/parser.py:765 |
step_proxies() — iterates shuffled cwltool steps |
lib/galaxy/tool_util/cwl/parser.py:831 |
WorkflowProxy.to_dict() — builds Galaxy step dicts from CWL |
lib/galaxy/tool_util/cwl/parser.py:734 |
get_outputs_for_label() — iterates tool["outputs"] (stable, not shuffled) |
lib/galaxy/model/__init__.py:9876 |
to_dict() sort of output_datasets by order_index |
lib/galaxy/model/__init__.py:9637 |
add_output() dispatches to output_datasets / output_values / output_collections |
lib/galaxy/workflow/steps.py:15 |
attach_ordered_steps() — topsort + order_index assignment |
lib/galaxy/tool_util/cwl/util.py:661 |
invocation_to_output() — extracts outputs from API response |
lib/galaxy_test/base/populators.py:400 |
CwlRun.get_output_as_object() — converts to CWL JSON for comparison |