- Replace per-block thread spawning with persistent thread pool for merkleization #6344
- Eliminate stack-frame spill in Stack::push for zero-upper-limb values #6390
- Use const-generic big-endian conversion in PUSH opcodes #6390
- Doubly pipelined merkleization with self-coordinating shard workers #6278
- Switch hot EVM and mempool HashMaps to FxHashMap for faster hashing #6303
- SIMD-accelerate trie nibble operations for block execution #6286
- Use FxHashMap in call frame backup #6286
- Use nested storage originals, FxHashMap call frame backup, and sstore-specific storage access helper #6265
- Refactor LEVM opcode handlers to avoid expensive matches #4791
- Speed up snap sync validation with parallelism and deduplication #6191
- Disable balance check for prewarming to avoid early reverts #6259
- Expand fast-path dispatch in LEVM interpreter loop #6245
- Check self before parent in Substate warm/cold lookups #6244
- Add precompile result cache shared between warmer and executor threads #6243
- Optimize storage layer for block execution by reducing lock contention and allocations #6207
- Defer KZG blob proof verification from P2P to mempool insertion #6150
- Cache ECDSA sender recovery in transaction structs #6153
- Reuse cache in prewarm workers #5999
- Optimize `debug_executionWitness` by pre-serializing RPC format at storage time #5956
- Use fastbloom as the bloom filter #5968
- Improve snap sync logging with table format and visual progress bars #5977
- Remove `ethrex-threadpool` crate and move `ThreadPool` to `ethrex-trie` #5925
- Add frame pointers setting to makefiles #5746
- Remove `Mutex<Box<_>>` from `DatabaseLogger::store` to reduce contention #5930
- Reduce state iterated when calculating partial state transitions #5864
- Remove needless allocs in CALLDATACOPY/CODECOPY/EXTCODECOPY #5810
- Inline common opcodes #5761
- Improve ecrecover precompile by removing heap allocs and conversions #5709
- Refactor `ecpairing` using ark #5792
- Remove needless allocs on store api #5709
- Avoid double parsing and extra clones in doc signature formatting #9285
- Make HashSet use fxhash in discv4 peer_table #5688
- Validate tx blobs after checking if it's already in the mempool #5686
- Parallelize storage merkleization #6079
- Avoid unnecessary hashing of init codes and already hashed codes #5397
- Change some calls from `encode_to_vec().len()` to `.length()` when wanting to get the rlp encoded length #5374
- Use our keccak implementation for receipts bloom filter calculation #5454
- Use unchecked swap for stack #5439
- Improve RLP encoding by avoiding extra loops and removing an unneeded array vec; also add an alloc-less length method to the default trait impl #5350
- Parallelize merkleization #5377
- Avoid temporary allocations when decoding and hashing trie nodes #5353
- Use specialized DUP implementation #5324
- Avoid recalculating blob base fee while preparing transactions #5328
- Use BlobDB for account_codes column family #5300
- Only mark individual values as dirty instead of the whole trie #5282
- Separate Account and storage Column families in rocksdb #5055
- Avoid copying while reading account code #5289
- Cache `BLOBBASEFEE` opcode value #5288
- Insert instead of merge for bloom rebuilds #5223
- Replace sha3 keccak with an assembly version using FFI #5247
- Fix `FlatKeyValue` generation on fullsync mode #5274
- Disable RocksDB compression #5223
- Reuse stack pool in LEVM #5179
- Merkleization backpressure and batching #5200
- Pipeline Merkleization and Execution #5084
- Add bloom filters to snapshot layers #5112
- Make trusted setup warmup non blocking #5124
- Run "engine_newPayload" block execution in a dedicated worker thread. #5051
- Reusing FindNode message per lookup loop instead of randomizing the key for each message. #5047
- Move trie updates post block execution to a background thread. #4989.
- Compute the allowlist greedily and store the result, fetching it with the DB, instead of computing the blocklist lazily #4961
- Remove duplicate subgroup check in ecpairing precompile #4960
- Replaces incremental iteration with a one-time precompute method that scans the entire bytecode, building a `BitVec<u8, Msb0>` where bits mark valid `JUMPDEST` positions, skipping `PUSH1..PUSH32` data bytes.
- Updates `is_blacklisted` to O(1) bit lookup.
- Improve get_closest_nodes p2p performance #4838
- Remove explicit cache-related options from RocksDB configuration and revert optimistic transactions to reduce RAM usage #4853
- Remove unnecessary mul in ecpairing #4843
- Improve block headers vec handling in syncer #4771
- Refactor current_step sync metric from a `Mutex<String>` to a simple atomic. #4772
- Change remaining_gas to i64, improving performance in gas cost calculations #4684
- Downloading all slots of big accounts during the initial leaves download step of snap sync #4689
- Downloading and inserting intelligently accounts with the same state root and few (<= slots) #4689
- Improving the performance of state trie through an ordered insertion algorithm #4689
- Remove `OpcodeResult` to improve tight loops of lightweight opcodes #4650
- Avoid dumping empty storage accounts to disk #4590
- Improve instruction fetching, dynamic opcode table based on configured fork, specialized push_zero in stack #4579
- Fix caching mechanism of the latest block's hash #4479
- Add `jemalloc` as an optional global allocator used by default #4301
- Improve time when downloading bytecodes from peers #4487
- Add `RocksDB` as an optional storage engine #4272
- Implement fast partition of `TrieIterator` and use it for quickly responding to `GetAccountRanges` and `GetStorageRanges` #4404
- Refactor substate backup mechanism to avoid expensive clones #4381
- Use x86-64-v2 cpu target on linux by default, dockerfile will use it too. #4252
- Process JUMPDEST gas and pc together with the given JUMP JUMPI opcode, improving performance. #4220
- Improve P2P mempool gossip performance #4205
- Improve precompiles further: modexp, ecrecover #4168
- Improve memory resize performance #4117
- Improve calldatacopy opcode further #4150
- Improve Memory::load_range by returning a Bytes directly, avoiding a vec allocation #4098
- Improve ecpairing (bn128) precompile #4130
- Improve BLS12 precompile #4073
- Improve blobbasefee opcode #4092
- Make precompiles use a constant table #4097
- Improve addmod and mulmod opcode performance #4072
- Improve signextend opcode performance #4071
- Improve performance of calldataload, calldatacopy, extcodecopy, codecopy, returndatacopy #4070
- Use malachite crate to handle big integers in modexp, improving performance #4045
- Cache chain config and latest canonical block header #3878
- Batching of transaction hashes sent in a single NewPooledTransactionHashes message #3912
- Make `JUMPDEST` blacklist lazily generated on-demand #3812
- Rewrite Blake2 AVX2 implementation (avoid gather instructions and better loop handling).
- Add Blake2 NEON implementation.
- Add a secondary index keyed by sender+nonce to the mempool to avoid linear lookups #3865
- Refactor current callframe to avoid handling avoidable errors, improving performance #3816
- Add shortcut to avoid callframe creation on precompile invocations #3802
- Use `rayon` to recover the sender address from transactions #3709
- Migrate EcAdd and EcMul to Arkworks #3719
- Add specialized push1 and pop1 to stack #3705
- Improve precompiles by avoiding 0 value transfers #3715
- Improve BlobHash #3704
  - Added push1 and pop1 to avoid using arrays for single variable operations.
  - Avoid checking for blob hashes length twice.
- Use a lookup table for opcode execution #3669
- Improve CodeCopy performance #3675
- Improve sstore performance further #3657
- Improve levm memory model #3564
- Add sstore bench #3552
- Add AVX256 implementation of BLAKE2 #3590
- Improve sstore opcodes #3555
- Improve blake2f #3503
- Use a stack pool #3386
- Refactor jump opcodes to use a blacklist on invalid targets.
- Improved the performance of shift instructions. 2933
- Refactor Patricia Merkle Trie to avoid rehashing the entire path on every insert 2687
- Add an immutable cache to LEVM that keeps data read from the Database in memory, so that getting an account doesn't need to consult the Database again. 2829
- Reduce account clone overhead when account data is retrieved 2684
- Reduce transaction clone and Vec grow overhead in mempool 2637
- Make TrieDb trait use NodeHash as key 2517
- Avoid calculating state transitions after every block in bulk mode 2519
- Transform the inlined variant of NodeHash to a constant sized array 2516
- Removed some unnecessary clones and made some functions const: 2438
- Asyncify some DB read APIs, as well as their users #2430
- Fix an issue where the table was locked for up to 20 sec when performing a ping: 2368
- Fix a bug where RLP encoding was being done twice: #2353, check the report under `docs/perf_reports` for more information.
- Asyncify DB write APIs, as well as their users #2336
- Faster block import, use a slice instead of copy #2097
- Don't recompute transaction senders when building blocks #2097
- Process blocks in batches when syncing and importing #2174
- Compute tx senders in parallel #2268
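
The `JUMPDEST` precompute entry above describes a single scan over the bytecode that records valid jump targets in a bitmap and skips `PUSH1..PUSH32` data bytes. The following is a minimal sketch of that idea, not the project's code: it assumes a plain `Vec<u8>` bitmap (most-significant bit first) in place of `BitVec<u8, Msb0>`, and the names `JumpdestMap`, `precompute`, and `is_valid_jumpdest` are hypothetical.

```rust
/// Hypothetical bitmap with one bit per bytecode offset.
/// A set bit marks a valid JUMPDEST. Bits are stored MSB-first in each byte.
pub struct JumpdestMap {
    bits: Vec<u8>,
    len: usize,
}

const JUMPDEST: u8 = 0x5b;
const PUSH1: u8 = 0x60;
const PUSH32: u8 = 0x7f;

impl JumpdestMap {
    /// Scans the bytecode once. Immediate data following PUSH1..=PUSH32 is
    /// skipped, so a 0x5b byte inside push data is never marked as valid.
    pub fn precompute(code: &[u8]) -> Self {
        let mut bits = vec![0u8; (code.len() + 7) / 8];
        let mut pc = 0;
        while pc < code.len() {
            let op = code[pc];
            if op == JUMPDEST {
                bits[pc / 8] |= 0x80u8 >> (pc % 8);
            } else if (PUSH1..=PUSH32).contains(&op) {
                // PUSHn carries n bytes of immediate data (n = op - PUSH1 + 1).
                pc += usize::from(op - PUSH1) + 1;
            }
            pc += 1;
        }
        Self { bits, len: code.len() }
    }

    /// O(1) lookup: one bounds check and one bit test.
    pub fn is_valid_jumpdest(&self, pc: usize) -> bool {
        pc < self.len && (self.bits[pc / 8] & (0x80u8 >> (pc % 8))) != 0
    }
}
```

For the bytecode `[0x60, 0x5b, 0x5b, 0x00]`, offset 1 is `PUSH1` data and is not a valid target, while offset 2 is a real `JUMPDEST`. The scan costs O(n) once per bytecode, after which each jump check is constant time.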