Skip to content

Add persistent_term support#2323

Open
petermm wants to merge 3 commits into
atomvm:release-0.7from
petermm:codex/add-persistentterm-support
Open

Add persistent_term support#2323
petermm wants to merge 3 commits into
atomvm:release-0.7from
petermm:codex/add-persistentterm-support

Conversation

@petermm
Copy link
Copy Markdown
Contributor

@petermm petermm commented May 31, 2026

Note the lack of GC in this implementation for erased/replaced persistent_terms, that is for followup..

01-page-persistent-term-pr

Implement a VM-level persistent_term table with NIFs for get/0,1,2, put/2, put_new/2, erase/1, and info/0.

Persistent terms are copied into VM-owned heaps on write and returned without copying on read. Replaced and erased entries are retained until VM shutdown so previously returned terms remain valid without an OTP-style global GC pass, and info/0 reports retained memory.

Also share the ETS term hashing helper for persistent_term lookup and add coverage for put/get/erase behavior, retained values, info/0, complex keys, and function keys.

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@petermm petermm force-pushed the codex/add-persistentterm-support branch from 97739cb to 51ab669 Compare May 31, 2026 11:16
Comment thread src/libAtomVM/term_hash.c Fixed
@petermm
Copy link
Copy Markdown
Contributor Author

petermm commented Jun 1, 2026

PR Review: persistent_term + term_hash extraction

Scope: last two commits on this branch.

  • 51ab669e — Add persistent_term support
  • 67707b01 — Fix CodeQL warning in 64-bit term hash folding

Verdict

No blocking soundness issues found.

The headline design choice — skipping OTP's global GC pass on
replace/erase — is safe specifically because retired/erased entries
are retained until VM shutdown
. A previously returned raw term can
never become a use-after-free, because nothing ever frees the heap it
lives in until the GlobalContext is destroyed. This trade-off is
documented in libs/estdlib/src/persistent_term.erl.

The CodeQL fix in src/libAtomVM/term_hash.c hash_uint64() is correct:
promoting h to uint64_t before the multiply silences the warning
while preserving identical mod‑2³² results after the cast back to
uint32_t.

Findings below are prioritized.


P1 — Document the retention invariant near retire_entry()

The "no global GC" design is only sound because retired/erased
entries are kept alive until VM shutdown. This invariant lives only in
the Erlang wrapper's edoc today. A future cleanup PR could easily
"optimize" retire_entry() into an immediate entry_destroy() and
silently introduce a UAF: process heaps may still contain raw term
pointers into the retired heap.

Add a comment at the C call site so the constraint is visible to
anyone touching this code.

--- a/src/libAtomVM/persistent_term.c
+++ b/src/libAtomVM/persistent_term.c
@@ -341,6 +341,14 @@ static void entry_destroy(struct PersistentTermEntry *entry, GlobalContext *glob
     free(entry);
 }

+/*
+ * Retire an entry instead of freeing it.
+ *
+ * persistent_term:get/1 returns the value term without copying, so process
+ * heaps may contain raw pointers into entry->heap after a subsequent
+ * put/2 or erase/1. AtomVM does not implement OTP's global GC pass, so
+ * retired entries must remain allocated until globalcontext destruction.
+ */
 static void retire_entry(PersistentTerm *persistent_term, struct PersistentTermEntry *entry)
 {
     entry->next = persistent_term->retired_entries;

Effort: trivial.


P1 — persistent_term:info/0 doesn't check the count atom allocation

src/libAtomVM/nifs.c nif_persistent_term_info() calls
globalcontext_make_atom(...) and feeds the result straight into
term_set_map_assoc(). On atom-table OOM that returns
term_invalid_term() and the resulting map is malformed.

--- a/src/libAtomVM/nifs.c
+++ b/src/libAtomVM/nifs.c
@@ -4636,12 +4636,16 @@ static term nif_persistent_term_info(Context *ctx, int argc, term argv[])
         RAISE_ERROR(OUT_OF_MEMORY_ATOM);
     }

+    term count_atom = globalcontext_make_atom(ctx->global, ATOM_STR("\x5", "count"));
+    if (UNLIKELY(term_is_invalid_term(count_atom))) {
+        RAISE_ERROR(OUT_OF_MEMORY_ATOM);
+    }
+
     term map = term_alloc_map(2, &ctx->heap);
     term_set_map_assoc(
         map,
         0,
-        globalcontext_make_atom(ctx->global, ATOM_STR("\x5", "count")),
+        count_atom,
         term_make_maybe_boxed_int64(count_i, &ctx->heap));
     term_set_map_assoc(
         map,

(The memory key already uses the pre-defined MEMORY_ATOM, so only
count is affected.)

Effort: trivial.


P2 — Retention test should force erlang:garbage_collect()

The retention check in tests/erlang_tests/test_persistent_term.erl
verifies that an old binding survives a subsequent put/2 and
erase/1, but it never forces a process GC between the bindings.
That's the one path most likely to break if someone "improves"
retire_entry(). Forcing GC makes the test exercise the actual
soundness invariant.

--- a/tests/erlang_tests/test_persistent_term.erl
+++ b/tests/erlang_tests/test_persistent_term.erl
@@ -45,11 +45,16 @@ test_retained_old_values() ->
     Key = {retained, key},
     ok = persistent_term:put(Key, {old, value, [1, 2, 3]}),
     OldValue = persistent_term:get(Key),
+    %% Force a GC while OldValue still references the retired heap.
+    erlang:garbage_collect(),
+    {old, value, [1, 2, 3]} = OldValue,

     ok = persistent_term:put(Key, {new, value}),
+    erlang:garbage_collect(),
     {old, value, [1, 2, 3]} = OldValue,
     {new, value} = persistent_term:get(Key),

     true = persistent_term:erase(Key),
+    erlang:garbage_collect(),
     {old, value, [1, 2, 3]} = OldValue,
     ok.

A second, smaller follow-up would be a refc-binary value to cover the
MSO/refcount retention path; defer unless we want it now.

Effort: small.


P2 — info().memory accounting: clarify or switch to allocated capacity

src/libAtomVM/persistent_term.c entry_new() sizes the per-entry
heap from memory_estimate_usage(key) + memory_estimate_usage(value)
but accounts memory using the post-copy heap pointer delta:

size_t size = memory_estimate_usage(key) + memory_estimate_usage(value);
if (UNLIKELY(memory_init_heap(heap, size) != MEMORY_GC_OK)) { ... }

entry->key   = memory_copy_term_tree(heap, key);
entry->value = memory_copy_term_tree(heap, value);
entry->memory = sizeof(struct PersistentTermEntry) + sizeof(Heap)
              + sizeof(HeapFragment)
              + ((size_t) (heap->heap_ptr - heap->heap_start) * sizeof(term));

If memory_estimate_usage() ever overestimates (canonical/shared
sub-terms), info().memory reports terms actually used rather than
heap allocated for them. Two reasonable options:

Option A — document as "logical copied-term usage" in
libs/estdlib/src/persistent_term.erl.

Option B — switch accounting to allocated capacity (closer to OTP's
intent):

--- a/src/libAtomVM/persistent_term.c
+++ b/src/libAtomVM/persistent_term.c
@@ -327,8 +327,9 @@ static struct PersistentTermEntry *entry_new(term key, term value)
     entry->key = memory_copy_term_tree(heap, key);
     entry->value = memory_copy_term_tree(heap, value);
     entry->heap = heap;
-    entry->memory = sizeof(struct PersistentTermEntry) + sizeof(Heap) + sizeof(HeapFragment)
-        + ((size_t) (heap->heap_ptr - heap->heap_start) * sizeof(term));
+    /* Report allocated heap capacity, not just the live copied portion. */
+    entry->memory = sizeof(struct PersistentTermEntry) + sizeof(Heap)
+        + sizeof(HeapFragment) + size * sizeof(term);
     entry->next = NULL;

Either is acceptable; just pick one and make it explicit.

Effort: small.


P3 — Concurrency / put_new/2 test gaps

Current tests cover behavior on a single scheduler. Worth adding under
SMP:

  • concurrent put_new/2 from N processes for the same missing key —
    exactly one should win, all others should observe the winning value;
  • concurrent get/1 interleaved with put/2/erase/1 of the same
    key — readers must never observe a half-replaced or freed value;
  • repeated replace/erase cycles asserting info().count stabilizes
    while info().memory grows monotonically (consistent with the
    retention design).

These don't catch a known bug; they protect the spinlock + retention
invariants from future refactors.

Effort: medium.


Other notes (no action required)

  • get/0 holds the spinlock through GC + key copies
    (persistent_term.c persistent_term_get_all_maybe_gc()).
    Correct, and simple. Only worth changing if get/0 shows up in a
    profile. Any future "snapshot under lock, copy outside lock"
    refactor still relies on the retention invariant above.

  • term_hash extraction. src/libAtomVM/ets_multimap.c call
    sites all switched to the shared term_hash(); float -0.0 is
    normalized to +0.0 to stay consistent with term_compare(). Looks
    good.

  • hash_uint64 CodeQL fix (term_hash.c). The cast through
    uint64_t before truncating back to uint32_t produces the same
    bits as the original uint32_t arithmetic, so existing hashes are
    preserved.

  • Unsupported-term fprintf(stderr, ...) fallback in term_hash.c.
    Confirm all user-reachable term kinds are covered (current set
    looks complete); if any path can reach it from user input it will
    spam stderr.


Recommended minimum follow-up patch

  1. Apply the P1 comment on retire_entry().
  2. Apply the P1 count atom check in nif_persistent_term_info.
  3. Apply the P2 GC-forcing retention test.
  4. Pick a side on info().memory semantics and either doc or
    change.

Concurrency stress and a refc-binary retention case can land as a
follow-up.

Copy link
Copy Markdown
Collaborator

@pguyot pguyot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should ship a r/w persistent term feature that leaks memory. I would suggest a first version with everything except erase and replace.

I want to challenge usage of locking.

I am not sure I understand why we copy keys with get_all. It seems inherited from ets, but maybe I didn't spend enough time yet on this review.

Comment thread src/libAtomVM/persistent_term.c Outdated
Comment thread src/libAtomVM/persistent_term.c Outdated
Comment thread src/libAtomVM/persistent_term.c
Comment thread src/libAtomVM/persistent_term.c Outdated
petermm added 3 commits June 2, 2026 08:05
Implement a VM-level persistent_term table with NIFs for get/0,1,2,
put/2, put_new/2, erase/1, and info/0.

Persistent terms are copied into VM-owned heaps on write and returned without
copying on read. Replaced and erased entries are retained until VM shutdown so
previously returned terms remain valid without an OTP-style global GC pass, and
info/0 reports retained memory.

Also share the ETS term hashing helper for persistent_term lookup and add
coverage for put/get/erase behavior, retained values, info/0, complex keys, and
function keys.

Signed-off-by: Peter M <petermm@gmail.com>
Use an explicit 64-bit intermediate when folding bytes from uint64_t values
into the 32-bit term hash accumulator.

This avoids CodeQL's "multiplication result converted to larger type" warning
while preserving the intended uint32_t wraparound behavior of the hash.

Signed-off-by: Peter M <petermm@gmail.com>
Use the project RWLock API directly for persistent_term instead of wrapping
spinlock calls in local read/write lock macros.

Also reduce the persistent_term critical sections by precomputing the bucket
hash and allocating replacement entries before taking the write lock.

Finally, stop copying keys in persistent_term:get/0. The returned list now
allocates only the outer list and tuple cells on the caller heap, while both
keys and values reference the persistent-term entry heap directly. This matches
get/1 semantics, since retired entries are retained so previously returned
persistent terms remain valid.

Add a regression assertion that a persistent_term:get/0 result remains valid
after the corresponding entries are erased.

Signed-off-by: Peter M <petermm@gmail.com>
@petermm petermm force-pushed the codex/add-persistentterm-support branch from 7af87d4 to fda1420 Compare June 2, 2026 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants