fix: PositionalEncoding Shape Mismatch on Odd Dimensions by agam263 · Pull Request #609 · gc-os-ai/pyaptamer

agam263 · 2026-05-01T00:59:08Z

Fix: PositionalEncoding Shape Mismatch on Odd Dimensions

📖 Summary

This PR resolves a persistent runtime crash in the PositionalEncoding layer that occurred whenever the model dimension (d_model) was an odd number. The fix ensures that the layer correctly handles arbitrary embedding dimensions, making the AptaTrans architecture more flexible and robust.

🔍 Technical Root Cause

The PositionalEncoding layer generates fixed sinusoids to inject positional information into embeddings. The implementation uses a frequency-divisor vector (div_term) that is shared between sine and cosine operations:

Index Split: Sine waves are assigned to even indices (0, 2, 4...) and cosine waves to odd indices (1, 3, 5...).
The Mismatch:
- The size of div_term is calculated based on torch.arange(0, d_model, 2), which has a length of ceil(d_model / 2).
- If d_model is 128 (even):
  - Even indices: 64 slots
  - Odd indices: 64 slots
  - div_term: 64 values (No error)
- If d_model is 127 (odd):
  - Even indices: 64 slots
  - Odd indices: 63 slots
  - div_term: 64 values
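The Crash: The assignment pe[0, :, 1::2] = torch.cos(position * div_term) fails for odd d_model because the right-hand side has ceil(d_model / 2) columns (64 for d_model = 127), while the target slice pe[0, :, 1::2] has only d_model // 2 columns (63). The two shapes cannot be broadcast together, so PyTorch raises a shape-mismatch error.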
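The slot arithmetic above can be checked without PyTorch. The following is a minimal sketch that uses Python's built-in range in place of torch.arange; the helper name slot_counts is hypothetical and is not part of pyaptamer.

```python
def slot_counts(d_model):
    """Return (even_slots, odd_slots, n_freqs) for a given d_model."""
    even_slots = len(range(0, d_model, 2))  # indices 0, 2, 4, ...
    odd_slots = len(range(1, d_model, 2))   # indices 1, 3, 5, ...
    n_freqs = len(range(0, d_model, 2))     # mirrors torch.arange(0, d_model, 2)
    return even_slots, odd_slots, n_freqs


for d in (128, 127):
    even, odd, freqs = slot_counts(d)
    status = "ok" if odd == freqs else "mismatch"
    print(d, even, odd, freqs, status)
# prints:
# 128 64 64 64 ok
# 127 64 63 64 mismatch
```

For odd d_model the frequency vector always has exactly one more entry than there are odd-indexed slots, which is the surplus value the cosine assignment cannot place.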

🛠️ Proposed Changes

1. Corrected Frequency Mapping

Adjusted the cosine assignment to correctly slice the div_term to match the number of available odd-indexed slots.

Logic: Used div_term[: d_model // 2] to ensure the number of frequencies exactly matches the number of odd positions, regardless of whether d_model is even or odd.
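As an illustration of the slicing logic, here is a dependency-free sketch that builds the same kind of table with plain lists instead of tensors. The function name sinusoidal_table and the 10000-based frequency formula are assumptions (the formula is the commonly used one, but this description does not state it); only the div_term[: d_model // 2] slice is taken from this PR.

```python
import math


def sinusoidal_table(max_len, d_model):
    """Build a max_len x d_model sinusoidal table as nested lists."""
    # Assumed standard frequencies 10000^(-i / d_model) for i = 0, 2, 4, ...
    div_term = [math.exp(-math.log(10000.0) * i / d_model)
                for i in range(0, d_model, 2)]
    # The slice from this PR: keep one frequency per odd-indexed slot.
    cos_term = div_term[: d_model // 2]

    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        row[0::2] = [math.sin(pos * w) for w in div_term]  # even indices
        row[1::2] = [math.cos(pos * w) for w in cos_term]  # odd indices
        table.append(row)
    return table


# Even, odd, and minimal dimensions all produce rows of width d_model.
for d in (1, 2, 127, 128):
    table = sinusoidal_table(max_len=4, d_model=d)
    assert all(len(row) == d for row in table)
```

Python's extended-slice assignment also requires the two sides to have equal length, so replacing cos_term with div_term in the odd-index line raises a ValueError for odd d_model, which parallels the tensor shape error described above.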

2. Dimensionality Regression Tests

Introduced pyaptamer/aptatrans/tests/test_pe_robustness.py to safeguard against future regressions:

Odd/Even Verification: Tests both standard (even) and non-standard (odd) dimensions.
Extreme Edge Cases: Verified stability for minimal dimensions (e.g., d_model=1).

⚠️ Impact & Risks

Architectural Flexibility: PositionalEncoding no longer restricts d_model to even values, so the layer can be constructed with any embedding size without raising a shape error.
Stability: This is a safe change that does not modify the output for existing models where d_model is even.
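The claim that even dimensions are unaffected follows from the slice being a no-op when d_model is even. A small sketch of that check, using frequency indices rather than actual tensor values (the helper name cosine_frequency_indices is hypothetical):

```python
def cosine_frequency_indices(d_model):
    """Return (all frequency indices, indices kept for the cosine slots)."""
    all_indices = list(range(0, d_model, 2))
    kept = all_indices[: d_model // 2]
    return all_indices, kept


# Even d_model: the slice keeps every frequency, so the output is unchanged.
for d in (2, 64, 128, 512):
    all_indices, kept = cosine_frequency_indices(d)
    assert kept == all_indices

# Odd d_model: exactly the last frequency is dropped for the cosine slots.
all_indices, kept = cosine_frequency_indices(127)
assert kept == all_indices[:-1]
```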

✅ Verification Results

Unit Tests: pytest pyaptamer/aptatrans/tests/test_pe_robustness.py -> Passed.
Inference Stability: Verified that the change preserves identical output values for even-dimension embeddings.

direkkakkar319-ops · 2026-05-01T08:38:47Z

Hi @agam263, your PR looks similar to #527.

probably correct as you patched on the older file

fix: PositionalEncoding crash on odd d_model

8949fcb

agam263 wants to merge 1 commit into gc-os-ai:main from agam263:fix-pe-odd-dimension
