Skip to content

fix(chain): prefer block producer-matching parent in skip approval resolution#15597

Open
ssavenko-near wants to merge 6 commits intomasterfrom
slavas/fix-fork-parent
Open

fix(chain): prefer block producer-matching parent in skip approval resolution#15597
ssavenko-near wants to merge 6 commits intomasterfrom
slavas/fix-fork-parent

Conversation

@ssavenko-near
Copy link
Copy Markdown
Contributor

@ssavenko-near ssavenko-near commented Apr 17, 2026

When resolving the parent block for a Skip(parent_height) approval, collect_block_approval previously picked an arbitrary block hash from the set of known blocks at that height. Near epoch boundaries, multiple blocks at the same height can belong to different epochs, each mapping to a different block producer for the target height. Picking the wrong one causes the receiver to incorrectly conclude it is not the intended producer and silently drop a valid approval.

Instead, iterate all known blocks at parent_height and prefer one where the current node is the block producer for target_height. Fall back to the first hash if none match, preserving existing behavior for non-producers.

Had to disable the test_skip_approval_prefers_producer_matching_parent test on SPICE.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.54%. Comparing base (50b343f) to head (670cb1e).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #15597      +/-   ##
==========================================
+ Coverage   69.53%   69.54%   +0.01%     
==========================================
  Files         940      940              
  Lines      214389   214412      +23     
  Branches   214389   214412      +23     
==========================================
+ Hits       149065   149121      +56     
+ Misses      59410    59382      -28     
+ Partials     5914     5909       -5     
Flag Coverage Δ
pytests-nightly 1.14% <0.00%> (-0.01%) ⬇️
unittests 68.98% <100.00%> (+0.01%) ⬆️
unittests-nightly 69.09% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ssavenko-near ssavenko-near changed the title fix: prefer block producer-matching parent in skip approval resolution fix(chain): prefer block producer-matching parent in skip approval resolution Apr 17, 2026
@ssavenko-near ssavenko-near changed the title fix(chain): prefer block producer-matching parent in skip approval resolution fix(chain): prefer block producer-matching parent in skip approval resolution Apr 17, 2026
@ssavenko-near ssavenko-near force-pushed the slavas/fix-fork-parent branch from cdb27f0 to c85403f Compare April 17, 2026 14:00
@ssavenko-near ssavenko-near requested a review from Copilot April 17, 2026 14:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a subtle failure mode in Client::collect_block_approval when resolving the parent hash for Skip(parent_height) approvals: instead of picking an arbitrary block at parent_height, it searches the known candidates and prefers the one whose epoch makes the current node the block producer for target_height, avoiding silently dropping valid approvals around epoch-boundary forks. It also adds an integration test covering handling of unresolvable epoch data among candidates.

Changes:

  • Update skip-approval parent resolution to prefer the candidate whose derived epoch assigns this node as block producer at target_height.
  • Add an integration test that simulates multiple BlockPerHeight candidates including an unresolvable hash and verifies the approval still reaches Doomslug when a valid candidate exists.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
chain/client/src/client.rs Adjusts Skip(parent_height) parent hash selection to try all candidates and prefer one matching the node’s producer assignment for target_height.
integration-tests/src/tests/client/doomslug.rs Adds a regression test for skip approvals when some parent-height candidates have unresolvable epochs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread chain/client/src/client.rs Outdated
Comment thread integration-tests/src/tests/client/doomslug.rs Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread integration-tests/src/tests/client/doomslug.rs Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ssavenko-near ssavenko-near force-pushed the slavas/fix-fork-parent branch 2 times, most recently from 6d7c0a4 to 8455fb9 Compare April 17, 2026 15:33
@ssavenko-near ssavenko-near requested a review from Copilot April 17, 2026 15:33
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ssavenko-near ssavenko-near marked this pull request as ready for review April 17, 2026 16:01
@ssavenko-near ssavenko-near requested a review from a team as a code owner April 17, 2026 16:01
Comment thread chain/client/src/client.rs Outdated
Comment thread integration-tests/src/tests/client/doomslug.rs Outdated
@ssavenko-near ssavenko-near force-pushed the slavas/fix-fork-parent branch from 5350e40 to c081627 Compare April 20, 2026 10:24
Comment on lines +208 to +215
tested += 1;
// We repeat across 4 epoch boundaries. Each trial independently
// has ~50% chance of the right parent being first in HashMap iteration
// order, so 4 trials give ~6% false-pass probability.
if tested >= 4 {
return;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: I am not against to simplify the test here and not do retries. In that case, a regression could cause a flaky test rather than failing, but we could have a simpler test.
It also seems fine to keep it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to have it less flaky. False negatives are especially annoying cause some code breaking the test may sneak in and pushing after it would become difficult.

Comment thread chain/client/src/client.rs Outdated
…solution

When resolving the parent block for a Skip(parent_height) approval, collect_block_approval previously picked an arbitrary block hash from the set of known blocks at that height. Near epoch boundaries, multiple blocks at the same height can belong to different epochs, each mapping to a different block producer for the target height. Picking the wrong one causes the receiver to incorrectly conclude it is not the intended producer and silently drop a valid approval.

Instead, iterate all known blocks at parent_height and prefer one where the current node is the block producer for target_height. Fall back to the first hash if none match, preserving existing behavior for non-producers.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 2163 to +2184
/// Collects block approvals.
///
/// We send the approval to doomslug given the epoch of the current tip iff:
/// 1. We are the block producer for the target height in the tip's epoch;
/// 2. The signature matches that of the account;
/// If we are not the block producer, but we also don't know the previous block, we add the
/// approval to `pending_approvals`, since it could be that the approval is from the next epoch.
///
/// # Arguments
/// * `approval` - the approval to be collected
/// * `approval_type` - whether the approval was just produced by us (in which case skip validation,
/// only check whether we are the next block producer and store in Doomslug)
/// Resolve the parent block hash for a skip approval.
///
/// Multiple blocks can exist at `parent_height` when there are forks
/// (e.g. around an epoch boundary). Each fork block may belong to a
/// different epoch with different block-producer assignments for
/// `target_height`. We prefer the parent whose epoch makes us the block
/// producer so that the approval is not silently dropped later.
///
/// Returns `None` when no blocks exist at `parent_height`.
fn resolve_skip_parent(
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc-comment block starting with /// Collects block approvals. is now immediately followed by the new resolve_skip_parent helper, so Rustdoc will attach all of those /// lines to resolve_skip_parent and collect_block_approval ends up undocumented. Suggest splitting the docs: keep the collect_block_approval docs directly above pub fn collect_block_approval(...), and leave only the skip-parent docs above resolve_skip_parent (or convert one of the blocks to // comments).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssavenko-near I guess the copilot is right about the doc comments being merged together.

Comment on lines +2190 to +2208
let hashes = self.chain.chain_store().get_all_block_hashes_by_height(parent_height);
let mut first = None;
let chosen = hashes.values().flatten().find(|hash| {
if first.is_none() {
first = Some(**hash);
}
my_account_id.is_some_and(|id| {
self.epoch_manager
.get_epoch_id_from_prev_block(hash)
.and_then(|epoch_id| {
self.epoch_manager.get_block_producer(&epoch_id, target_height)
})
.ok()
.as_ref()
== Some(id)
})
});
chosen.copied().or(first)
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolve_skip_parent always iterates over all hashes when my_account_id is None because the .find(..) predicate can never succeed. This is a small but unnecessary performance regression vs the previous next() behavior. Consider an early return when my_account_id.is_none() (or restructure to compute first via iter.next() and only do the producer-matching scan when Some).

Copilot uses AI. Check for mistakes.
Comment on lines +28 to +30
#[test]
fn test_skip_approval_prefers_producer_matching_parent() {
init_test_logger();
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR description says test_skip_approval_prefers_producer_matching_parent was disabled on SPICE, but the test currently has no #[cfg_attr(feature = "protocol_feature_spice", ignore)] (or equivalent guard). This will cause the test to still run under --features protocol_feature_spice unlike other SPICE-incompatible tests in this crate. Please add the ignore/guard or update the PR description if it’s no longer needed.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@darioush darioush left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Comment on lines 2163 to +2184
/// Collects block approvals.
///
/// We send the approval to doomslug given the epoch of the current tip iff:
/// 1. We are the block producer for the target height in the tip's epoch;
/// 2. The signature matches that of the account;
/// If we are not the block producer, but we also don't know the previous block, we add the
/// approval to `pending_approvals`, since it could be that the approval is from the next epoch.
///
/// # Arguments
/// * `approval` - the approval to be collected
/// * `approval_type` - whether the approval was just produced by us (in which case skip validation,
/// only check whether we are the next block producer and store in Doomslug)
/// Resolve the parent block hash for a skip approval.
///
/// Multiple blocks can exist at `parent_height` when there are forks
/// (e.g. around an epoch boundary). Each fork block may belong to a
/// different epoch with different block-producer assignments for
/// `target_height`. We prefer the parent whose epoch makes us the block
/// producer so that the approval is not silently dropped later.
///
/// Returns `None` when no blocks exist at `parent_height`.
fn resolve_skip_parent(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ssavenko-near I guess the copilot is right about the doc comments being merged together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants