Minimize e2e brittleness by christianbaroni · Pull Request #7406 · rainbow-me/rainbow

christianbaroni · 2026-04-28T22:01:06Z

What changed (plus any additional context for devs)

Fixes several sources of e2e brittleness that were showing up as intermittent Android and iOS shard failures.

E2E setup

Runs transaction flows against a pinned Foundry/Anvil setup shared across shards
Downloads workflow artifacts through a retrying helper that validates the zip before extraction
Waits for Anvil RPC readiness before Maestro starts a transaction flow
Resets Anvil to ANVIL_FORK_BLOCK_NUMBER before each transaction attempt, with an Anvil restart if reset fails
Handles test wallet import URLs delivered at launch through Linking.getInitialURL()
Transaction setup waits on app-level state exposed through E2EStatusMarker:
- wallet import exposes e2e-wallet-importing, e2e-wallet-ready, and e2e-wallet-error
- Anvil connection exposes e2e-anvil-connected
- Anvil funding exposes e2e-anvil-funded after the provider is reachable and the funding transaction confirms

Maestro flows

Transaction and onboarding flows wait on stable screen/testID anchors instead of rendered balance text
Android PIN entry uses the numpad testID rather than relying on visible-text matching against 1
Swap and wrap iOS command blocks use Maestro's case-sensitive iOS platform value
Send and wrap flows wait for target asset/NFT rows, action buttons, and toast contents before continuing

NFT collection loading

The SendNft recordings showed the Rainbow Pooly collection with the collection row visible, but Rainbow Pooly 16 was missing from the expanded collection.

fetchNftCollection(...) accepts { expectedCount, force }
Non-forced collection fetches retry until the response includes the requested collection and at least the expected number of NFTs
Collection open/mount paths pass the known collection total as expectedCount
Forced refreshes after NFT sends and ENS registration bypass expectedCount, because those refreshes can legitimately remove NFTs from the collection
buildBriefUniqueTokenList(...) normalizes collection totals before building placeholder NFT rows

github-actions · 2026-04-28T22:25:13Z

🧪 Flashlight Performance Report (AWS Device Farm)

🔀 Commit: 9ac8297

📎 View Artifacts

Metric	Current	Δ vs Baseline
Time to Interactive (TTI)	5610 ms	–
Average FPS	56.66	–
Average RAM	398.5 MB	–

github-actions · 2026-04-28T22:47:51Z

Launch in simulator or device for e2cb651

github-actions · 2026-04-28T23:24:20Z

Launch in simulator or device for 3d92e7d

github-actions · 2026-04-29T02:13:06Z

Launch in simulator or device for d3c5a64

This reverts commit bdb4be7.

github-actions · 2026-04-29T03:09:30Z

Launch in simulator or device for 82a89f7

github-actions · 2026-04-29T04:00:42Z

Launch in simulator or device for 61dde53

github-actions · 2026-04-29T04:50:40Z

Launch in simulator or device for 6ebf168

This reverts commit 1be576e.

github-actions · 2026-04-29T05:23:31Z

Launch in simulator or device for 2ca541e

janicduplessis

Nice, I remember seeing random failures related to anvil download. Left a few minor comments, but also fine as is.

janicduplessis · 2026-04-30T05:16:45Z

    try {
-      const currentValue = useConnectedToAnvilStore.getState().connectedToAnvil;
-      setConnectedToAnvil(!currentValue);
+      setConnectedToAnvil(true);


Prob not a big deal, but this means we can't disconnect from anvil if enabled in dev?

janicduplessis · 2026-04-30T05:17:57Z

        return;
      }

+      if (handledUrls.current.has(url)) return;


Is this needed? My idea for commands is it could be used for anything so it would make sense to be able to execute the same one multiple times.

janicduplessis · 2026-04-30T05:21:48Z

        uses: actions/checkout@08eba0b27e820071cde6df949e0beb9ba4906955

+      - name: Cache Foundry binaries
+        if: github.event_name != 'issue_comment' || startsWith(github.event.comment.body, '/e2e')


I think this might not be needed, also iOS doesnt have it.

olerass · 2026-04-30T11:01:25Z

Overall looks good with the specific fixes and foundry centralization, but I wonder if there's a better approach do achieve the same fixes in some of the cases?

re: remoteConfig short-circuit
We're stubbing out initializeRemoteConfig / fetchRemoteConfig entirely under IS_TEST, which means e2e no longer covers the boot-pat fetch at all (e.g init crashes, bad responses, the 3s timeout race against welcome-screen, any flag-gated behavior). Afaics the underlying issue that config fetch can race the welcome-screen render is a real production bug, not a test bug. Acuals users with bad networks would see the same race we're now "hiding" here. So I'd actually argue the e2e suite is the oppsiite of flaky in this regard, it's surfacing a real issue that the app itself is flaky (?)

We may want to consider either making config init non-blocking liek I think has been discussed previpusly wth a proper fallback UI meanwhile, which fixes it for both tests AND users, or we could inject a test confnig client at the boundary or use MSW so the init code still runs against known fixtures. Both are bigger lifts ofc, but stubbing it out doesn't fix the race, it just hides it from this one consumer, and I'm not usre it's the right path tbh. Curious what your thoughts are here @christianbaroni ?

re: the new e2e marker views
they're a bit ugly but tbh I can't find alternatives that are bviously better and afaiu the production cost is zero since they're behind IS_TEST && status !== 'idle' right?

re: the broader direction
IMO we should be generally trying to reduce IS_TEST usage in the app, not grow it. The markers in this regard are a small step the wrong way; the remoteConfig stub is a bigger step the wrong way. Worth being explicit about whether we're ok with that tradeoff for the immediate flake win (I;d argue no, but curious what others think), especially given we'll soon be running the e2e suite outside PRs entirely, so flakes will ofc still be a thing but much less prevalent and blocking.

github-actions · 2026-04-30T19:18:37Z

Launch in simulator or device for ddce447

github-actions · 2026-05-01T04:18:59Z

Launch in simulator or device for 9ac8297

christianbaroni · 2026-05-01T07:11:23Z

Overall looks good with the specific fixes and foundry centralization, but I wonder if there's a better approach do achieve the same fixes in some of the cases?

re: remoteConfig short-circuit

We're stubbing out initializeRemoteConfig / fetchRemoteConfig entirely under IS_TEST, which means e2e no longer covers the boot-pat fetch at all (e.g init crashes, bad responses, the 3s timeout race against welcome-screen, any flag-gated behavior). Afaics the underlying issue that config fetch can race the welcome-screen render is a real production bug, not a test bug. Acuals users with bad networks would see the same race we're now "hiding" here. So I'd actually argue the e2e suite is the oppsiite of flaky in this regard, it's surfacing a real issue that the app itself is flaky (?)

We may want to consider either making config init non-blocking liek I think has been discussed previpusly wth a proper fallback UI meanwhile, which fixes it for both tests AND users, or we could inject a test confnig client at the boundary or use MSW so the init code still runs against known fixtures. Both are bigger lifts ofc, but stubbing it out doesn't fix the race, it just hides it from this one consumer, and I'm not usre it's the right path tbh. Curious what your thoughts are here @christianbaroni ?

re: the new e2e marker views

they're a bit ugly but tbh I can't find alternatives that are bviously better and afaiu the production cost is zero since they're behind IS_TEST && status !== 'idle' right?

re: the broader direction

IMO we should be generally trying to reduce IS_TEST usage in the app, not grow it. The markers in this regard are a small step the wrong way; the remoteConfig stub is a bigger step the wrong way. Worth being explicit about whether we're ok with that tradeoff for the immediate flake win (I;d argue no, but curious what others think), especially given we'll soon be running the e2e suite outside PRs entirely, so flakes will ofc still be a thing but much less prevalent and blocking.

Removed the remote config changes which weren't needed. The current loading design there works pretty well in production — it's only blocking (up to 3s) when the app version changes which is generally the only time it needs to block, and even then it's typically fast to load.

Also cleaned up the marker component code so it's at least not scattered within production code.

Regarding IS_TEST usage, generally agree we should avoid scattering it all over and prefer lower-level control points. The remaining uses are I think justified.

With the recent commits the brittleness seems to be resolved, or at least is far less common, so broadly I think these changes are worthwhile.

There were a couple real NFT bugs that showed up in the run artifact recordings. Might be useful to have those posted somewhere on failure rather than being buried in the CI views.

olerass · 2026-05-04T07:13:17Z

Thanks for simplifying @christianbaroni! The core E2E fixes here look really good!

A few more things worth considering before merge, mostly around minimizing what's actually in this PR vs. what could ship separately or even get dropped.

1.Split out the NFT fixes
These are real bug fixes not really related to the E2E suite right? Just because the e2e suite found them, doesn't mean we should fix it in an E2E PR. Pulling them into their own fix PR will make review more focused, will also attribute this correctly in our changelog generation for releases, and make revert easier (if we need to).

2. platform: ios → platform: iOS casing fix
Afaiu this is not really reducing flakiness but more so activating what was previousluy dead test code that noop before? Couldn't this in theory cause more flakiness (or finding more real issues) since it's now asserting properly?
E.g if iOS shards start failing on those specific assertions you want bisect to land here directly, not folded into a 32-file e2e infra PR.

3. Dormant marker states
Afaics only e2e-anvil-connected and e2e-anvil-funded are used anywhere? The e2e-wallet-* ones look like leftover from the prev PR shape, but not sure about the rest?

What if we make it such that:

TestDeeplinkHandler: drop the commandStatus state machine, E2ECommandStatus type, and marker render. Component returns null again. Keep getInitialURL, param validation etc since those are the real fixes.
RainbowContext: type unions shrink to 'idle' | 'connected' and 'idle' | 'funded'.

Removes a bunch of code that nothing reads.

4. Used markers -> dev button state?
For the 2 actually used markers, the dev buttons that trigger them already exist in the dev/test surface. Could those buttons just encode the operation state in their label/style? Like "Connect to Anvil" → "Connecting…" → "Anvil ✓" / "Anvil ✗"; same for fund.

If we do this maestro will be asserting on the existing button testID (with the new label state), no need for invisible 2x2 marker views anywhere, devs pressing the buttons get visible feedback they don't have today as a bonus even. Alsow ould remove the the IS_TEST code.

Bigger refactor ofc, and fine to defer if you'd rather keep the markers, but if we do this it looks to m elike we can drop e2e markers entirely and solve it simpler.

christianbaroni added 2 commits April 28, 2026 17:53

Harden e2e shard synchronization and setup

39ec324

Fix e2e import readiness checks

fcf5873

Fix WrapTransaction unwrap action selector

b4c1692

christianbaroni changed the title ~~[Test] Attempt to minimize e2e brittleness~~ [Test] Minimize e2e brittleness Apr 28, 2026

[Temporary] Trigger CI run

bdb4be7

Revert "[Temporary] Trigger CI run"

afc1c5a

This reverts commit bdb4be7.

Share Foundry binaries across e2e shards

4a41792

[Temporary] Trigger CI run

1be576e

Revert "[Temporary] Trigger CI run"

47e04ef

This reverts commit 1be576e.

janicduplessis approved these changes Apr 30, 2026

View reviewed changes

christianbaroni added 2 commits April 30, 2026 14:33

Reset Anvil before transaction e2e attempts

a3c961f

Restore remote config in e2e, centralize readiness markers

003306c

Fix NFT bugs causing intermittent failures

a770c73

christianbaroni mentioned this pull request May 1, 2026

CI speedups #7407

Open

christianbaroni changed the title ~~[Test] Minimize e2e brittleness~~ Minimize e2e brittleness May 1, 2026

Conversation

christianbaroni commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What changed (plus any additional context for devs)

E2E setup

Maestro flows

NFT collection loading

Uh oh!

github-actions Bot commented Apr 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 28, 2026

Uh oh!

github-actions Bot commented Apr 28, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

janicduplessis left a comment

Choose a reason for hiding this comment

Uh oh!

janicduplessis Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

janicduplessis Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

janicduplessis Apr 30, 2026

Choose a reason for hiding this comment

Uh oh!

olerass commented Apr 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Apr 30, 2026

Uh oh!

github-actions Bot commented May 1, 2026

Uh oh!

christianbaroni commented May 1, 2026

Uh oh!

olerass commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

christianbaroni commented Apr 28, 2026 •

edited

Loading

github-actions Bot commented Apr 28, 2026 •

edited

Loading

olerass commented Apr 30, 2026 •

edited

Loading

olerass commented May 4, 2026 •

edited

Loading