Add packfile library design doc #633
Conversation
Design document describing the packfile and intpack packages for storing and reading large collections of immutable, ordinal-indexed items (events, bitmaps, ledgers) with O(1) random access and minimal I/O overhead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
Adds a design document proposing a “packfile” immutable, ordinal-indexed file format and Go library API intended as a building block for Stellar RPC v2 full-history storage (compact storage with O(1) random access and minimal I/O).
Changes:
- Introduces the packfile/intpack concepts, goals/non-goals, and usage examples.
- Specifies the on-disk layout (records, index, trailer) plus integrity/content-hash model.
- Documents proposed Go APIs for writer/reader, concurrency behavior, and error surface.
- Clarify trailer flag descriptions: Bit 0 means not zstd-compressed (not implying CRC is always present), and document the Bit 0 + Bit 2 combination as the Raw format
- Fix ErrIndexRange message from "record index" to "item index" to match the public API contract

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
I got curious, so I had Claude spit out a simpler program (https://github.com/graydon/xdrpack — both Go and Rust, though I only tested the Rust version) that just does a zstd train on an XDR stream's frames to build a shared dictionary, then compresses each frame with that dictionary and indexes the offsets of each compressed frame in a single table, with offsets sized to the minimum needed to span the file. Seems to work, though who knows if it's comparable to the benchmark numbers you're after. I guess I'm just a bit unsure whether the complexity of the structure you've built here buys you much, but perhaps it does -- happy to talk more about it!
> - **O(1) random access by ordinal index.** Every `ReadItem` call maps index to record via arithmetic, then reads and decodes a single record.
> - **Minimal I/O.** The full index loads in one disk read on open (~112KB for 68K records). After that, one disk read per record, exact size, no over-read.
> - **Compact index.** Index size depends on max record size, not file size. A file with 20KB records uses 15-bit deltas whether the file is 500MB or 50GB.
I think this is not true. I think the index size described in this document is the size of an index entry times the number of index entries, which grows without bound. The size of an index entry (as a delta) is small, but the 500MB file mentioned here would have 25,000 index entries whereas the 50GB file would have 2.5 million index entries.
> ## Goals
>
> - **O(1) random access by ordinal index.** Every `ReadItem` call maps index to record via arithmetic, then reads and decodes a single record.
This is .. depending on how you look at it "not achieved" by the design in this doc. The index is delta-encoded which means the entire index needs decoding on file-open (which is O(n) despite having a fairly healthy divisor on n).
Once open and decoded, it's O(1), but that's not guaranteed to be easy or cheap (eg. if you have lots of files you might be constantly reading/decoding/dropping their indexes from memory). I'd recommend just making your index be seekable itself, so you:
- open the file
- readAt a position in the index you calculate from the ordinal, to get the offset of the variable-sized record
- readAt the variable-sized record
> "not achieved" by the design in this doc
I do agree with this in a sense.
As in, the offset index is delta-encoded, so you can't seek to a single entry — you need to decode ALL preceding deltas and cumulative-sum them to reconstruct absolute offsets. That's O(N) on open, not O(1). Granted, N here is only ever 10k, at least in the ledgers case, but still. After open it's an array lookup, sure, but the doc says "O(1) random access" as a generalization, without qualification. Would it be more honest to say "O(1) after O(n) initialization"?
> ## How It Works
>
> Items are grouped into fixed-size **records** (default 128 items per record). Small items like events don't compress well individually, but 128 of them together do. Large items like ledgers are stored one per record since they compress well on their own. Each record is compressed and written as a contiguous block on disk.
I'm unclear on whether the design of bundling items into groups for compression is ideal. I think it's buying you two things?
- Space for zstd to be able to achieve some compression
- A secondary FOR base for the item-FOR-offsets-within-the-record, but exploiting that requires yet another FOR_index appended to each record.
I don't know, the whole thing seems a bit elaborate -- the FOR-and-delta coded primary index, the secondary FOR_index on each record, even the use of zstd.
I think you can get to a similarly-good place with less complexity if you:
- Figure out a way to adequately compress individual items (eg. `zstd --train` for a while and then reuse that dictionary across all items; or perhaps just cook up a fixed-model compression scheme for XDR, I think we've tried this before in the past?)
- FOR-encode the offsets in the index if you like, but:
  - Encode an offset for every item in a single index at the end, not a 2-level index-of-indexes
  - Don't also delta-encode the primary index entries
  - Figure out a single width for all the index entries in the file so you can calculate the index entry to readAt
  - Reset the FOR-base at those same 128-offset fixed intervals so you can pick that out with another readAt
Honestly .. even the FOR-encoding of offsets seems like it might well be overkill. I would just do plain leading-null suppression / picking a per-file offset width.
Like unless your files are truly gargantuan the offsets will probably be like 3-5 bytes most of the time (16MiB - 1TiB) and all the FOR wrangling will probably only get your offsets down to 1-2 bytes. So like the index is half as big, but .. is it worth an extra readAt on the hot path? Probably not. I would just figure out what the max offset you need is for all offsets in a file, write that number at the end -- eg. "this file is <= 16MiB so offsets in this file are each 3 bytes long", say -- and then just readAt(ordinal * 3) from your index, and that's your item offset, don't FOR or delta encode anything.
@graydon this is a great question and I think this gets at the core design decision in packfile.
> Space for zstd to be able to achieve some compression
>
> A secondary FOR base for the item-FOR-offsets-within-the-record, but exploiting that requires yet another FOR_index appended to each record.
That assessment is correct and I will elaborate on the specific benefits.
The constraint driving everything is: the events API returns up to 1,000 events per response, and in the worst case those events are scattered across the entire file. The file of events we've been benchmarking with spans 10,000 ledgers and contains 8.7M events (average 221 bytes each uncompressed) — it's recent and probably representative of current event density.
Compression: early on in our design phase @urvisavla actually tried out per event dictionary compression and she found training the dictionary allowed us to achieve ~2x compression. I also ran your xdrpack tool on the dataset of 8.7M events and confirmed the ~2x compression ratio using that method.
Using zstd on groups of 128 events per record gets us a ~4.4x compression ratio. Individual events just don't give zstd enough context, even with a 64 KB trained dictionary. Aside from a better compression ratio, there are two more benefits:
- faster ingestion times because we don't need to train the zstd dictionary
- no need to read the 64 KB trained dictionary as a prerequisite step before querying individual events from the file
Index size: Without grouping, the index has one entry per event: 8.7M × 3 bytes = 26 MB. With grouping at 128, it's one entry per record: 68K entries. Using leading-null-suppressed offsets as you suggested, that's 68K × 3 bytes = 204 KB — fits in a single EBS IOP. FOR encoding shrinks it further to 112 KB, which gives us room to grow if event density increases or we want each packfile to span wider ledger ranges than 10,000.
Why this matters on EBS: On gp3 (3,000 IOPS baseline, 125 MB/s throughput), each random read costs one IOP regardless of size (up to 256 KB):
- Without grouping: the 26 MB index is too large to preload, so each event lookup needs two IOPS — one index seek, one data read. Worst case is 1,000 scattered events = 2,000 IOPS = 667 ms.
- With grouping: the 112 KB index loads in one IOP, then each event lookup is one data read. 1,001 IOPS = 334 ms. You might worry that grouped records are larger (~6.4 KB each vs a single compressed event), but 1,000 × 6.4 KB = 6.4 MB total, which takes 51 ms at 125 MB/s. Since IOPS and bandwidth are consumed simultaneously — each IOP transfers its bytes — the bottleneck is whichever takes longer: 334 ms for IOPS vs 51 ms for bandwidth. So it's IOPS-bound either way.
For the EBS case, I think we should strive to make the index at the end of the file as compact as possible so that we can load it with as few IOPs as possible. IMO, the FOR implementation (125 LoC) is justified for that but I'm definitely open to other suggestions.
However, it is not necessary for the index contained within each record to be super compact since the index size for 128 events is tiny in comparison to the event payload sizes. I agree that FOR is totally overkill there. I only used it in the packfile design because we're already using it for the primary index at the trailer. A simpler scheme like length-prefixed entries would work just as well but I figured we might as well just implement it as another call to the FOR encode / decode function.
In conclusion, I agree the overall design is more elaborate than a flat per-item approach. But, I think it's justified because we're getting more than 2x better compression, a 230x smaller index (112 KB vs 26 MB), and 2x lower worst case query latency on EBS (334 ms vs 667 ms).
ok! I didn't realize you were going to be reading off EBS -- I'm more used to thinking in terms of local NVMe IO patterns -- and that does change the calculus a bit.
> **Non-blocking Open.** `Open` returns a `*Reader` immediately. A background goroutine performs all I/O: open, stat, speculative read, trailer parse, CRC verification, index decode, app data read. A `sync.OnceValue` drains the result on the first query call. Errors are deferred to query time — `Open` itself never fails. This enables overlapped initialization: start loading an MPHF or opening other files while the goroutine runs.
>
> **Speculative Read.** On open, one pread of the last `min(256KB, fileSize)` bytes. This usually captures the trailer, app data, and index in one IOP. If the tail exceeds 256KB, a single fallback read fetches the rest.
I think this isn't true. The index will be as large as the index is. It's not guaranteed to fit in 256KB or 512KB or anything.
> `w.Finish(nil)` // flushes partial record, writes index + trailer, fsyncs
>
> Items are appended in order. `Finish` flushes any partial record, writes the offset index, optional app data, and a 64-byte trailer, then fsyncs. `Close` after `Finish` is a no-op; `Close` without `Finish` removes the incomplete file.
I read ahead and didn't see any mention of a viable use case / way in which an application can use this, nor can I think of one that relates to either events or ledger usage. Does it make sense to include it in the packfile format?
I'm not sure what you're referring to. Are you talking about the use case of calling Close() / Finish()?
My bad.
I was talking about the section for app data in the packfile.
app data is useful for storing any type of data / metadata which is relevant for that specific packfile.
it is used for events to store a table mapping ledgers to cumulative event counts per ledger. The ledger counts are used so we can filter for events matching the getEvents ledger range
> Packfile stores **record sizes** (deltas between consecutive offsets) instead. Deltas depend on the maximum record size, not total file size. A file with 20KB records uses 15-bit deltas whether the file is 500MB or 50GB.
>
> Deltas are encoded using **Frame of Reference (FOR)** compression in groups of 128. FOR subtracts a per-group minimum from every value, then bit-packs the residuals at the minimum bit width needed. Each group is self-contained:
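To make the FOR step concrete, here is a toy Go bit-packer for one group. This is illustrative only; the real intpack encoding and on-disk layout may differ:

```go
package main

import (
	"fmt"
	"math/bits"
)

// forEncode packs one group Frame-of-Reference style: subtract the group
// minimum (the "base"), then bit-pack each residual at the smallest bit
// width that fits the largest residual.
func forEncode(vals []uint64) (base uint64, width uint, packed []byte) {
	base = vals[0]
	for _, v := range vals {
		if v < base {
			base = v
		}
	}
	var maxRes uint64
	for _, v := range vals {
		if r := v - base; r > maxRes {
			maxRes = r
		}
	}
	width = uint(bits.Len64(maxRes)) // bits per residual
	packed = make([]byte, (width*uint(len(vals))+7)/8)
	for i, v := range vals {
		r := v - base
		for b := uint(0); b < width; b++ {
			if r&(1<<b) != 0 {
				pos := uint(i)*width + b
				packed[pos/8] |= 1 << (pos % 8)
			}
		}
	}
	return base, width, packed
}

// forDecode reverses forEncode for a group of n values.
func forDecode(base uint64, width uint, packed []byte, n int) []uint64 {
	out := make([]uint64, n)
	for i := 0; i < n; i++ {
		var r uint64
		for b := uint(0); b < width; b++ {
			pos := uint(i)*width + b
			if packed[pos/8]&(1<<(pos%8)) != 0 {
				r |= 1 << b
			}
		}
		out[i] = base + r
	}
	return out
}

func main() {
	deltas := []uint64{5000, 5120, 5032, 5064} // record-size deltas, say
	base, width, packed := forEncode(deltas)
	fmt.Println(base, width, len(packed)) // 5000 7 4
	fmt.Println(forDecode(base, width, packed, len(deltas)))
}
```

Four 64-bit values pack into 4 bytes here because the residuals (0, 120, 32, 64) all fit in 7 bits.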
im a bit confused by the 128s in this doc — there's RecordSize, which defaults to 128 items per record, and then there is this line here that says "compression in groups of 128" for the FOR encoding, which is a separate hardcoded constant somewhere, again mentioned around line 280: "The group size (128) is a library constant, independent of RecordSize. If it changes, the format version is bumped."

What's the rationale for the FOR group size being 128? The doc doesn't say — it just appears as a bare number. Is it based on some empirical benchmarking you did for events (I know that's how you landed on the default RecordSize=128)? Can you make this more explicit?

Reading through the doc, I kept thinking they were the same thing, and it's only by accident that I noticed the mention saying they're independent. I think it deserves to be called out upfront — maybe give it a name like IndexGroupSize, explain why 128 was chosen, and make it clear early on that it's a separate constant from the default RecordSize.

Also, if the group size is required to decode the index, it should probably be in the trailer? There's 2 reserved bytes at offset 58 that could hold it? Right now a reader has to hardcode 128, and if that ever changes the only signal is a version bump — but a reader for version 1 would silently decode garbage (right?) instead of knowing it can't handle the file.
Good point. The FOR group size for the index doesn't really matter so much, based on the benchmarking I did; there were diminishing returns after a certain point:
| Group Size | Groups | Index Size | vs Flat (267 KB) |
|---|---|---|---|
| 32 | 2,135 | 118.6 KB | -55.5% |
| 64 | 1,068 | 113.6 KB | -57.4% |
| 128 | 534 | 111.0 KB | -58.4% |
| 256 | 267 | 109.7 KB | -58.9% |
| 512 | 134 | 109.1 KB | -59.1% |
Including the group size in the trailer makes sense, I can make that change.
> Items are grouped into fixed-size **records** (default 128 items per record). Small items like events don't compress well individually, but 128 of them together do. Large items like ledgers are stored one per record since they compress well on their own. Each record is compressed and written as a contiguous block on disk.
>
> An **offset index** at the end of the file maps record numbers to byte offsets. On open, the entire index is decoded into a flat `[]int64` array. Looking up item `i` is arithmetic: `offsets[i / RecordSize]` gives the record's byte offset, then a single disk read + decode extracts the item.
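For illustration, the quoted lookup arithmetic as a tiny Go sketch (the `locate` helper is hypothetical, not the packfile API), showing both the events-style and ledgers-style RecordSize:

```go
package main

import "fmt"

// locate maps an item index to its record's byte offset plus the item's
// position within that record, per the quoted arithmetic. offsets is the
// decoded flat index; recordSize is items per record.
func locate(offsets []int64, recordSize, item int) (recordOffset int64, posInRecord int) {
	return offsets[item/recordSize], item % recordSize
}

func main() {
	// Three records starting at bytes 0, 6400, and 12900.
	offsets := []int64{0, 6400, 12900}

	// Events mode: 128 items per record; item 130 is item 2 of record 1.
	off, pos := locate(offsets, 128, 130)
	fmt.Println(off, pos) // 6400 2

	// Ledgers mode: 1 item per record; item 2 is record 2 by itself.
	off, pos = locate(offsets, 1, 2)
	fmt.Println(off, pos) // 12900 0
}
```

With RecordSize=1 the position within the record is always 0, which is why the per-record item index disappears in the ledgers case.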
Can you include 1-2 examples here? One for events (RecordSize=128) and one for ledgers (RecordSize=1)? Something like: "here are 5 items (an item being an individual event or LCM) of these sizes, here's what the records look like on disk, here's what the offset index looks like, here's the FOR encoding step by step."

I have read through this a few times and I am still struggling to understand the format here. There are like 6-7 concepts/terms here — item, record, offset index, FOR group, delta, W, min — all in abstract terms across different sections. I had to read it three times to piece together how they relate, and I still have some visualization issues 🙈

Especially useful would be showing the two modes side by side: events with RecordSize=128, where you get the per-record FOR_index AND the file-level offset index, VS ledgers with RecordSize=1, where the per-record FOR_index disappears entirely. That difference is non-obvious from the current text.
@tamirms: I wanted to highlight that the doc is slightly hard to follow the way it is currently structured. Also, intpack feels like an afterthought at the very end, but its FOR encoding is fundamental to both the offset index and the per-record item index. It deserves more prominence: maybe a dedicated section in "How It Works" explaining FOR with a small example before diving into the file format spec. And then maybe a glossary of terms at the end?
> ### Content Hash
>
> When `ContentHash: true`, the writer computes a chunked SHA-256 over the logical item stream:
Can we rephrase this, perhaps, to say "each record's contents are hashed, and then all the record digests are hashed together" instead of introducing K and talking about chunk boundaries? K is just RecordSize: a "chunk" of K items is just... a record.
IMO, the section

    chunkDigest_i = SHA-256([4B len][item_{i*K}] ... [4B len][item_{i*K+K-1}])
    finalHash = SHA-256(chunkDigest_0 || ... || chunkDigest_M)
    K = RecordSize

can be replaced with

    RecordSize = 128
    record_0_digest = SHA-256([len][item_0][len][item_1]...[len][item_127])
    record_1_digest = SHA-256([len][item_128]...[len][item_255])
    ...
    finalHash = SHA-256(record_0_digest || record_1_digest || ...)
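A toy Go sketch of that per-record scheme; the 4-byte big-endian length prefix and the `contentHash` helper name are assumptions based on the quoted `[4B len]` notation, not the actual packfile code:

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// contentHash hashes each record's length-prefixed items, then hashes
// the concatenation of the record digests. Changing recordSize moves
// record boundaries and so changes the final hash even for identical
// items — which is why the hash depends on RecordSize.
func contentHash(items [][]byte, recordSize int) [sha256.Size]byte {
	outer := sha256.New()
	for start := 0; start < len(items); start += recordSize {
		end := start + recordSize
		if end > len(items) {
			end = len(items) // final record may be partial
		}
		inner := sha256.New()
		for _, item := range items[start:end] {
			var lenPrefix [4]byte
			binary.BigEndian.PutUint32(lenPrefix[:], uint32(len(item)))
			inner.Write(lenPrefix[:])
			inner.Write(item)
		}
		outer.Write(inner.Sum(nil))
	}
	var h [sha256.Size]byte
	copy(h[:], outer.Sum(nil))
	return h
}

func main() {
	items := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	fmt.Println(contentHash(items, 1) == contentHash(items, 1)) // true
	fmt.Println(contentHash(items, 1) == contentHash(items, 3)) // false
}
```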
> `K = RecordSize`
>
> The hash is independent of compression and format — same items in the same order with the same RecordSize always produce the same hash. Note that changing RecordSize changes the chunk boundaries and therefore the hash.
"Independent of compression and format" reads like the hash depends only on the data itself, but the very next sentence says changing RecordSize changes the hash. Can we rephrase it to something like:

"""
The hash is independent of compression and format, but depends on RecordSize. This means you can't use the content hash to verify "these two packfiles contain the same data" unless they were written with the same RecordSize.
"""
…w fixes
- Add events cold segment as third process_chunk output (PR #635)
- Switch LFS from .data+.index to .pack format (PR #633)
- Add chunk:{C}:events meta store key, atomic 3-flag WriteBatch
- Add events_base to config Optional Sections table
- Add events/ to directory structure
- Add DAG setup pseudocode with explicit BUILD_READY handling
- Replace ASCII dependency diagram with Mermaid flowchart
- Expand LFS, BSB, MPHF acronyms on first use
- Explain 10,000 multiplier in validation rules
- Remove "Future: getEvents" section (events now first-class)
- Remove dead pseudocode branch, hedging language

Major restructuring to improve clarity:
- Reorder doc: Problem → Concepts → Usage → API → File Format → Implementation
- Add Concepts section explaining items, records, offset index, and ItemsPerRecord tradeoffs upfront before any code
- Add terminology table at start of File Format section
- Give FOR encoding its own section before index and record descriptions
- Rename RecordSize to ItemsPerRecord for clarity
- Store FOR group size in trailer (offset 58) instead of hardcoding
- Simplify content hash section with concrete record numbering
- Clarify flag descriptions for Uncompressed/Raw format mapping
- Rename ErrIndexRange to ErrPositionOutOfRange to match API contract
- Add concrete ReadItem walkthrough in Implementation Notes
- Clarify speculative read is not guaranteed to capture full index

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@karthikiyer56 can you take another look at the packfile doc? I have updated it to address the review feedback.
- Add byte-level worked example showing ItemsPerRecord=2 and ItemsPerRecord=1 side by side with FOR encoding math, record layouts, and file layout (addresses review feedback) - Minor wording trims in FOR Encoding and Index Encoding sections Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Note: This PR targets `main` for now but will be retargeted to the full history RPC / v2 branch once we align on that before merging.

Further reading: `packfile`, `intpack`

🤖 Generated with Claude Code