Skip to content

Fix Workflow Step Output Ordering #149

@jmchilton

Description

@jmchilton

We probably need Option 2 here and we should research if this is an issue with Galaxy workflows - I think probably?

CWL MultipleInputFeatureRequirement: Non-deterministic Output Ordering

Test

test_conformance_v1_2_multiple_input_feature_requirement — passes ~50% of the time.

Workflow has two independent steps (step1→"hello", step2→"world") with outputSource: [step1/out, step2/out]. Expected: ["hello\n", "world\n"].

Previous Fix (1e39176)

Sorted WorkflowInvocation.to_dict() output associations by workflow_output.workflow_step.order_index. This sorts by Galaxy's internal step ordering — not by the CWL outputSource array position.

What the fix does correctly

The sort is on the right code path:

  • Outputs ARE HDAs going through output_datasets
  • Pydantic serialization preserves list order
  • Test framework correctly reads the API response
  • invocation_to_output() preserves list order from the API

Full chain: to_dict()WorkflowInvocationResponse Pydantic model → JSON → invocation_response.json()invocation_to_output()get_output_as_object()cwltest.compare.compare().

The problem: order_index is the wrong sort key

The sort key workflow_output.workflow_step.order_index represents the step's position in Galaxy's topological sort, derived from the order cwltool returns the steps. The CWL spec says the outputSource ARRAY order defines the result array order — [step1/out, step2/out] means step1's output MUST be first.

Galaxy doesn't store the outputSource array position anywhere. It only stores which step each WorkflowOutput belongs to. The sort by order_index only works if step definition order matches outputSource array order — a coincidence, not a guarantee.

Root Cause: cwltool intentionally shuffles steps

cwltool/workflow.py:108:

random.shuffle(self.steps)

cwltool intentionally randomizes its WorkflowStep list after construction — specifically to surface ordering-dependent bugs in workflow engines. This is exactly the kind of bug it's designed to catch.

Key distinction:

  • self._workflow.stepslist[WorkflowStep]shuffled every load
  • self._workflow.tool["steps"] → raw CWL document dict — never shuffled, retains document order

Galaxy's step_proxies() iterates self._workflow.steps (the shuffled one):

for i, step in enumerate(self._workflow.steps):
    proxies.append(build_step_proxy(self, step, i + num_input_steps))

With 2 steps, random.shuffle has exactly 2 permutations:

  • Run A: shuffle gives [step1, step2]order_index step1=0, step2=1 → sort gives [hello, world]PASS
  • Run B: shuffle gives [step2, step1]order_index step2=0, step1=1 → sort gives [world, hello]FAIL

Each test run starts a new Galaxy server process → ~50% pass rate.

How to verify

Add a log.warning in WorkflowProxy.step_proxies() to log the step IDs in iteration order:

def step_proxies(self):
    if self._step_proxies is None:
        proxies = []
        num_input_steps = len(self._workflow.tool["inputs"])
        for i, step in enumerate(self._workflow.steps):
            log.warning(f"CWL step {i}: {step.id}")  # <-- add this
            proxies.append(build_step_proxy(self, step, i + num_input_steps))
        self._step_proxies = proxies
    return self._step_proxies

Run the test several times. If step1 and step2 swap positions between runs, this confirms the theory.

Fix Options

Option 1: Sort steps at Galaxy's import boundary (quick fix, no migration)

Sort self._workflow.steps by a deterministic key in step_proxies():

for i, step in enumerate(sorted(self._workflow.steps, key=lambda s: s.tool["id"])):

Or better, sort once at WorkflowProxy.__init__ and store as self._sorted_steps.

Pros: Single-line fix, zero DB changes, neutralizes cwltool's shuffle.
Cons: Alphabetical by step ID may not match outputSource array order if step names don't sort in document order. For the specific conformance test (step1/step2), alphabetical == document order.

Option 2: Store source_index on WorkflowOutput (full fix, needs migration)

Add a source_index column to the workflow_output table during CWL import, capturing each output's position in the outputSource array. Sort by source_index in to_dict().

Pros: Fully correct per CWL spec regardless of step naming.
Cons: Requires Alembic migration, heavier change.

Option 3: Sort multi-output list by outputSource position at to_dict() time

Reconstruct the outputSource order at serialization time. This is fragile because to_dict() doesn't have access to the CWL workflow definition.

Not recommended.

Recommendation

Start with Option 1. It fixes the immediate 50% flakiness with minimal risk. If a CWL workflow is later found where alphabetical step ID order doesn't match outputSource order, upgrade to Option 2.

Also audit other uses of self._workflow.steps — the runnables property (parser.py:775) also iterates it unsorted.

Key code locations

Location Role
cwltool/workflow.py:108 random.shuffle(self.steps) — the root cause
lib/galaxy/tool_util/cwl/parser.py:765 step_proxies() — iterates shuffled cwltool steps
lib/galaxy/tool_util/cwl/parser.py:831 WorkflowProxy.to_dict() — builds Galaxy step dicts from CWL
lib/galaxy/tool_util/cwl/parser.py:734 get_outputs_for_label() — iterates tool["outputs"] (stable, not shuffled)
lib/galaxy/model/__init__.py:9876 to_dict() sort of output_datasets by order_index
lib/galaxy/model/__init__.py:9637 add_output() dispatches to output_datasets / output_values / output_collections
lib/galaxy/workflow/steps.py:15 attach_ordered_steps() — topsort + order_index assignment
lib/galaxy/tool_util/cwl/util.py:661 invocation_to_output() — extracts outputs from API response
lib/galaxy_test/base/populators.py:400 CwlRun.get_output_as_object() — converts to CWL JSON for comparison

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions