Upgrade to Zod 4 by ebroder · Pull Request #2749 · deathandmayhem/jolly-roger

ebroder · 2026-01-30T18:59:41Z

This is still very much a work in progress, but I wanted to go ahead and push it up so it wasn't living exclusively in my head and on my computer.

Here's what I currently know to be broken on this branch, although at this point the brokenness is likely covering up additional issues:

Zod 4 + zod-to-mongo-schema don't currently support outputting schemas for z.date() or z.instanceof(Uint8Array<ArrayBuffer>) (which we use for BSON binary data). Like I say in the commit message below, we don't have any differentiated requirements around generating a MongoDB-compatible JSON schema from a Zod schema, so this should be a task that we can outsource to existing open-source software. I'm going to try and work with Zod and zod-to-mongo-schema upstreams to figure out how best to support this. It's possible that I will have to give up on my dream and re-implement schema generation in-house, but I don't want to do it just yet.

For the time being, I've worked around this by switching to a fork of zod-to-mongo-schema.
The current approach for Model.autoPopulatedFields() runs into some issues invoking Meteor.userId() in contexts where a user ID isn't available, and calling Meteor.userId() outside of a method or publication throws an exception. (As an easy example, in tests/unit/imports/server/Flags.ts we provide a fake createdBy to try and side-step this, but autoPopulatedFields queries for a value unconditionally.)

I've fixed this by checking DDP._Current{Method,Publication}Invocation to make sure we're in a context where Meteor.userId should work.
We're hitting an assortment of issues with numeric types. Some of this is zod-to-mongo-schema attempting to be more specific about BSON "int" vs. "long" (instead of just using "number", which is mostly what we've done historically). Some of this is a Zod bug (it doesn't correctly deconflict minimum and exclusiveMinimum, or maximum and exclusiveMaximum, when encoding a draft-04 JSON schema).
Zod 4 does not emit a regex pattern in JSON Schema when encoding z.string().url(), which is a regression from our previous behavior. These patterns on Puzzle and Hunt are arguably load-bearing right now, since we don't do any other validation of user-provided URLs (other than ensuring that they're a string). This came up in Reduce noise from logged errors for bad user inputs #2284 and is maybe not ideal behavior, but we should probably tackle those orthogonally.

I've fixed this by re-introducing our regex and attaching it to any z.url() fields.

The new approach to discriminated unions is buggy. The Zod 4 schema ends up with something like:

allOf: [
  { oneOf: [... variants with {name, value} ...], additionalProperties: false },
  { properties: {createdAt, updatedAt, createdBy, updatedBy}, additionalProperties: false },
  { properties: {deleted}, additionalProperties: false },
  { properties: {_id}, additionalProperties: false }
]

which is unsatisfiable, because each entry rejects every other.
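To make the conflict concrete, here is a minimal sketch of just the semantics involved: `allOf` requires every branch to accept the document, and a branch with `additionalProperties: false` rejects any key it does not declare. The `oneOf` branch is simplified to a plain property list, and `Branch`, `branchAccepts`, and `allOfAccepts` are hypothetical names used only for illustration, not part of Zod or MongoDB.

```typescript
// One allOf branch: the keys it declares, with additionalProperties: false implied.
interface Branch {
  properties: string[];
}

// A branch with additionalProperties: false rejects any key it does not declare.
function branchAccepts(branch: Branch, doc: Record<string, unknown>): boolean {
  return Object.keys(doc).every((key) => branch.properties.indexOf(key) !== -1);
}

// allOf requires every branch to accept the document.
function allOfAccepts(branches: Branch[], doc: Record<string, unknown>): boolean {
  return branches.every((branch) => branchAccepts(branch, doc));
}

const branches: Branch[] = [
  { properties: ["name", "value"] }, // stands in for the oneOf variants
  { properties: ["createdAt", "updatedAt", "createdBy", "updatedBy"] },
  { properties: ["deleted"] },
  { properties: ["_id"] },
];

const doc = { _id: "abc", name: "flag", value: true, deleted: false, createdAt: new Date() };

// Every branch sees keys that only some other branch declares, so the document is rejected.
const accepted = allOfAccepts(branches, doc);
console.log(accepted); // false

// Only a document with no keys at all passes every additionalProperties check.
const emptyAccepted = allOfAccepts(branches, {});
```

In this simplified model the empty document is the only one that passes, and it would presumably fail as soon as any branch lists a required field.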

Here's the commit message, which describes the approach:

In truth, this is largely a ground-up reimplementation of our typed model layer that takes advantage of a few years of experience working with the original implementation. Upgrading to Zod 4 is somewhat orthogonal.

My primary goal with this upgrade was to take advantage of Zod 4's native support for JSON Schemas. There already exists a library (zod-to-mongo-schema) which can utilize Zod's native JSON Schema support to generate MongoDB-compatible JSON Schemas. Given that our needs here are fairly undifferentiated, it doesn't feel like there's significant benefit to maintaining this code ourselves. This allows us to clean up the old code to generate JSON Schemas.

As always, the most complex element of our typed model layer is handling fields which should be automatically populated. Our previous approach (documented in #1394) was powerful and flexible, but abusing Zod's input vs. output schemas and lying to the type system with transforms broke down with Zod's native JSON Schemas, which refuses to serialize transforms.

Instead of trying to modernize that approach, this commit uses two systems to track automated fields. Within the type system, we use Zod's branded types to mark fields which should be auto-populated (effectively as a boolean marker), which we can then use to filter auto-populated fields, making them optional at insertion-time. At runtime, we use schema metadata, which captures when a field should be populated (insert, update, or both) and what value should be used (which has been limited to the specific types of values we actually use).

(In bringing in branded types, we do return to our old friend of input/output schemas — auto-populated types are only branded on the input side, which is only used for computing the insert type, since we don't actually want nominal typing for our auto-populated fields.)
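As a rough illustration of the two halves described above, here is a minimal sketch that does not use Zod at all. The metadata shape (`AutoPopulateMeta`), the helper `computeAutoFields`, and the `InsertType` alias are all hypothetical names, and an explicit union of key names stands in for the brand that the real code uses to find auto-populated fields.

```typescript
// Hypothetical metadata shape: when to populate a field and which value to use.
type PopulateWhen = "insert" | "update" | "both";
type PopulateValue = "timestamp" | "userId";

interface AutoPopulateMeta {
  when: PopulateWhen;
  value: PopulateValue;
}

interface PopulateContext {
  now: Date;
  userId?: string; // undefined outside a method or publication
}

// Runtime half: compute the auto-populated values for one operation from metadata alone.
function computeAutoFields(
  meta: Record<string, AutoPopulateMeta>,
  operation: "insert" | "update",
  context: PopulateContext
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of Object.keys(meta)) {
    const entry = meta[field];
    if (entry.when !== "both" && entry.when !== operation) continue;
    const value = entry.value === "timestamp" ? context.now : context.userId;
    if (value !== undefined) result[field] = value;
  }
  return result;
}

// Type-level half: auto-populated keys become optional at insertion time.
// An explicit key union stands in for the brand here.
type InsertType<T, AutoKeys extends keyof T> = Omit<T, AutoKeys> & Partial<Pick<T, AutoKeys>>;

interface FlagDoc {
  name: string;
  createdAt: Date;
  createdBy: string;
  updatedAt: Date;
}
type FlagInsert = InsertType<FlagDoc, "createdAt" | "createdBy" | "updatedAt">;

const flagMeta: Record<string, AutoPopulateMeta> = {
  createdAt: { when: "insert", value: "timestamp" },
  createdBy: { when: "insert", value: "userId" },
  updatedAt: { when: "both", value: "timestamp" },
};

const now = new Date(0);
const toInsert: FlagInsert = { name: "example" }; // auto fields may be omitted
const insertFields = computeAutoFields(flagMeta, "insert", { now, userId: "u1" });
const updateFields = computeAutoFields(flagMeta, "update", { now });
```

Here `insertFields` carries `createdAt`, `createdBy`, and `updatedAt`, while `updateFields` carries only `updatedAt`. No user ID is looked up unless the context supplies one.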

This has the effect of dropping support for arbitrary transformation functions. In practice, we were only using that function to apply the answerify transform to answer fields, but we had previously concluded that answers were already uppercased at input time. In exchange for dropping transforms, we no longer need to analyze the entire update operation to make sure transformations are applied properly, and can instead just rely on MongoDB to enforce that the result matches our schema. This in turn lets us remove a significant amount of code around updates (including the mechanisms around relaxSchema and parseMongoModifierAsync).

In addition to dropping support for transform functions, I also dropped support for the bypassSchema option for inserts, updates, and upserts. This wasn't a great abstraction, since it attempted to otherwise preserve the behavior of Meteor Collections' native methods, but had to do some extra work to do that. Instead, users can just reach for rawCollection. This only comes up in migration code anyway, which generally requires a fair amount of abstraction bypass regardless.

And finally, I pulled the code into imports/lib/typedModel, rather than leaving it strewn about with actual model declarations.

And in terms of review sequence, I'd recommend something like this:

autoPopulate.ts
customTypes.ts
validateSchema.ts
Model.ts
SoftDeletedModel.ts

zarvox · 2026-02-08T06:21:20Z

I read through all of this and it all seemed pretty reasonable!

For your two noted issues:

The main pain point here is that zod-to-mongo-schema only allows overriding the bsonType on z.unknown(), and we want to still benefit from Zod's type hinting for things like z.date() or z.instanceof() without having to do explicit casts all around, right? Ah, I see you filed Adding support for dates and binary data udohjeremiah/zod-to-mongo-schema#11, which explains it pretty well.
Instead of calling Meteor.userId() unconditionally, we can do some very mild reaching into Meteor internals to guard against calling it when we're not in a method invocation or publication:

import { DDP } from "meteor/ddp";
import { Meteor } from "meteor/meteor";

// Returns the current user's ID, or undefined outside a method invocation or publication.
function getCurrentUserIdOrUndefined() {
  const currentInvocation =
    DDP._CurrentMethodInvocation.get() || DDP._CurrentPublicationInvocation.get();
  if (currentInvocation) {
    return Meteor.userId(); // or return `currentInvocation.userId` if you want to lean on internals harder
  }
  return undefined;
}

We already rely on DDP._CurrentInvocation in the API authenticator so it's not even truly new internals surface area, and in general seems unlikely to change.

zarvox · 2026-02-09T09:28:55Z

Cool work! Neat to see your patched version of zod-to-mongo-schema.

I tried running this (I know it's just a draft, I'm just excited), and hit a couple errors in short order:

Servers pid

The insertion into the Servers collection on startup trips over the type of the pid field:

W20260209-00:49:57.555(-8)? (STDERR) Document failed validation: [{"operatorName":"properties","propertiesNotSatisfied":[{"propertyName":"pid","details":[{"operatorName":"bsonType","specifiedAs":{"bsonType":"long"},"reason":"type did not match","consideredValue":39607,"consideredType":"int"}]}]}]

The validator believes that this field is a long, but when we perform the insertion it appears that the field is getting serialized as an int. I worked around this by swapping the field definition from z.number().int() to z.int32().

MonitorConnectAcks portNumber

Attaching the schema for the mediasoup MonitorConnectAcks collection failed:

I20260209-00:57:46.990(-8)? Error: Failed to attach schema to collection jr_mediasoup_monitor_connect_acks: MongoServerError: Parsing of collection validator failed :: caused by :: $jsonSchema keyword 'minimum' must be a present if exclusiveMinimum is present

This is erroring out on the portNumber field. I replaced the .positive() with a .min(1), and that seemed to work.

Puzzles expectedAnswerCount

That got the server to start up. I tried adding a new puzzle, and inserting the new puzzle object failed validation:

I20260209-01:13:49.238(-8)? Exception while invoking method 'Puzzles.methods.create' Document failed validation: [{"operatorName":"properties","propertiesNotSatisfied":[{"propertyName":"expectedAnswerCount","details":[{"operatorName":"bsonType","specifiedAs":{"bsonType":"long"},"reason":"type did not match","consideredValue":0,"consideredType":"int"}]}]}]

Again, the number is specified as a long in the MongoDB schema's bsonType, but the actual value is serialized as an int.

I think the numeric types in particular are giving us some repeated grief?
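A minimal sketch of the mismatch behind both validation failures, under a stated assumption about serialization: integer-valued JavaScript numbers inside the signed 32-bit range are written as BSON "int", and every other plain number as "double". That rule is an assumption for illustration rather than a statement of the driver's exact behavior, and `assumedBsonType` and `satisfiesBsonType` are hypothetical helpers.

```typescript
type BsonNumberType = "int" | "double";

// Assumption: integer-valued numbers inside the signed 32-bit range are written
// as BSON "int"; every other plain JS number is written as "double".
function assumedBsonType(value: number): BsonNumberType {
  const isInteger = isFinite(value) && Math.floor(value) === value;
  const fitsInt32 = value >= -2147483648 && value <= 2147483647;
  return isInteger && fitsInt32 ? "int" : "double";
}

// A validator clause such as { bsonType: "long" } only accepts values stored as that exact type.
function satisfiesBsonType(value: number, expected: string): boolean {
  return assumedBsonType(value) === expected;
}

const pidAgainstLong = satisfiesBsonType(39607, "long"); // false: stored as "int"
const pidAgainstInt = satisfiesBsonType(39607, "int"); // true
const countAgainstLong = satisfiesBsonType(0, "long"); // false: stored as "int"
```

Under that assumption a plain number never lands in the database as "long", so a schema that demands "long" rejects both the pid 39607 and the expectedAnswerCount 0 from the logs above.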

I remain excited for this overall, and to that end I think it'd be wise for our test plan for this change to include diffing the new validation schemas against our existing validation schemas and either trying to get them to match or at least exercising insertions for any collection where this patchset generates a different validation schema.
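For the diffing step, something along these lines might be enough. It is a minimal sketch of a structural diff over two already-dumped JSON Schemas; `diffSchemas` and the `Json` type are hypothetical names, and how the schemas get dumped from each branch is left out. Arrays are compared by their serialized form, so reordered `allOf` or `required` entries would show up as changes.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

function isPlainObject(value: Json): value is { [key: string]: Json } {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Return the paths at which two JSON values differ.
function diffSchemas(before: Json, after: Json, path: string = "$"): string[] {
  if (!isPlainObject(before) || !isPlainObject(after)) {
    // Scalars and arrays: compare serialized forms.
    return JSON.stringify(before) === JSON.stringify(after) ? [] : [path + " (changed)"];
  }
  const left: { [key: string]: Json } = before;
  const right: { [key: string]: Json } = after;
  const keys = Object.keys(left);
  for (const key of Object.keys(right)) {
    if (keys.indexOf(key) === -1) keys.push(key);
  }
  let differences: string[] = [];
  for (const key of keys) {
    const here = path + "." + key;
    if (!(key in left)) differences.push(here + " (added)");
    else if (!(key in right)) differences.push(here + " (removed)");
    else differences = differences.concat(diffSchemas(left[key], right[key], here));
  }
  return differences;
}

// Example inputs modeled on a port-style field with and without its minimum.
const oldPort: Json = { minimum: 0, exclusiveMinimum: true, maximum: 65535, bsonType: "int" };
const newPort: Json = { exclusiveMinimum: true, maximum: 65535, bsonType: "int" };
const portDiff = diffSchemas(oldPort, newPort, "$.properties.port");
console.log(portDiff); // ["$.properties.port.minimum (removed)"]
```

Any collection with a non-empty diff would then be a candidate for an insertion test.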

ebroder · 2026-02-09T16:18:50Z

Yeah I observed those as well. I was going to leave a comment and update the description but was too tired to actually write it up. I think there are at least three things happening here:

Zod has a bug when generating draft-04 JSON schemas where it doesn't correctly deconflict a schema that specifies both a minimum and an exclusiveMinimum (or a maximum and an exclusiveMaximum), so given z.number().int().positive(), it emits minimum: -9007199254740991 (from int) and exclusiveMinimum: true (from positive).
When zod-to-mongo-schema is attempting to infer numeric types, it sees the minimum of -9007199254740991 and cleans it up to try and only include values in the schema that the user provided, but it leaves exclusiveMinimum: true, making the schema invalid. (It's still unclear to me if this behavior actually needs to be fixed, or is just a complication on the first issue)
zod-to-mongo-schema attempts to infer a stricter datatype for numeric types than our old code did. (We would only ever set bsonType to "int" or "number"). I think swapping from .int() to .int32() will fix most of these
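For reference, here is a minimal sketch of one way the two lower-bound checks could be reconciled before encoding. It is not Zod's implementation or the upstream patch, and `LowerBound` and `toDraft04LowerBound` are hypothetical names. The idea is to pick the single tightest bound first, then encode it the draft-04 way, where `minimum` carries the number and `exclusiveMinimum` is only a boolean modifier.

```typescript
// A lower-bound check collected from a schema, e.g. one from .int() and one from .positive().
interface LowerBound {
  value: number;
  exclusive: boolean;
}

interface Draft04LowerBound {
  minimum?: number;
  exclusiveMinimum?: boolean;
}

// Pick the tightest bound (largest value; exclusive wins a tie), then encode it for draft-04.
function toDraft04LowerBound(bounds: LowerBound[]): Draft04LowerBound {
  let tightest: LowerBound | undefined;
  for (const bound of bounds) {
    if (
      tightest === undefined ||
      bound.value > tightest.value ||
      (bound.value === tightest.value && bound.exclusive)
    ) {
      tightest = bound;
    }
  }
  if (tightest === undefined) return {};
  return tightest.exclusive
    ? { minimum: tightest.value, exclusiveMinimum: true }
    : { minimum: tightest.value };
}

const encoded = toDraft04LowerBound([
  { value: -9007199254740991, exclusive: false }, // from .int()
  { value: 0, exclusive: true }, // from .positive()
]);
console.log(encoded); // { minimum: 0, exclusiveMinimum: true }
```

With the bounds merged this way, `exclusiveMinimum: true` always travels with the `minimum` it modifies, so dropping an implicit safe-integer minimum afterwards cannot leave it orphaned.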

So far, I've mostly just been trying to get things to a point that the app would run, so haven't thought about the actual testing regimen yet, but I agree that diffing the schemas and at least being confident in the changes seems like the right thing to do.

I'm working on a patch for the Zod issue to submit upstream. We could theoretically work around it, but Zod is reasonably active so I'm hoping we don't have to.

ebroder · 2026-02-09T17:05:48Z

Created colinhacks/zod#5700 for the first of those issues.

ebroder · 2026-02-09T21:32:37Z

I asked Claude to dump out the schemas and analyze the differences. Here's what it came up with:

Schema Diff Report: Zod 3 (main) vs Zod 4

Systematic Changes (all/most collections, superficial)

These appear across virtually every collection and are safe:

bsonType → type for standard types: "string", "object", "array", "bool"→"boolean" all switch from MongoDB's bsonType to standard JSON Schema type. BSON-specific types ("date", "int", "binData") are preserved. MongoDB accepts both.
deleted field simplified: allOf: [{not: {bsonType: "null"}}, {bsonType: "bool"}] → {type: "boolean"}. Semantically equivalent since boolean already excludes null. Same pattern applies to other defaulted fields (openSignups, mailingLists in jr_hunts).
deleted, updatedAt, tags, mailingLists, openSignups now required: Fields with .default() values are now listed as required. These will always be populated on insertion.
Enum fields gain explicit type: "string": e.g. {enum: ["audio","video"]} → {enum: ["audio","video"], type: "string"}. More explicit but equivalent.
exclusiveMinimum: false / exclusiveMaximum: false removed: These were the default values and had no effect. Affects jr_blobs.size, jr_guesses.confidence/direction, jr_puzzles.expectedAnswerCount, and port fields.
anyOf → oneOf for discriminated unions (jr_documents, jr_settings): More correct for mutually exclusive variants. Equivalent in practice with single-value enum discriminators.
Flatter allOf structure (jr_documents, jr_settings): Zod 3 produced deeply nested allOf with {} placeholder properties; Zod 4 produces a cleaner flat structure. Semantically equivalent.

Concerning Changes

`port` field lost `minimum: 0` (high risk)

Affected: jr_mediasoup_monitor_connect_acks.port, jr_mediasoup_monitor_connect_requests.port

	main	zod-4
Schema	`{minimum: 0, exclusiveMinimum: true, maximum: 65535, bsonType: "int"}`	`{exclusiveMinimum: true, maximum: 65535, bsonType: "int"}`

In JSON Schema draft-04, exclusiveMinimum is a boolean modifier of minimum. Without minimum, exclusiveMinimum: true is meaningless — the port > 0 constraint is lost. This is the Zod z.positive() bug already reported upstream.

`answer`/`guess` fields gained constraints (medium risk)

Affected: jr_puzzles.answers[], jr_guesses.guess, jr_bookmark_notifications.answer

	main	zod-4
Schema	`{bsonType: "string"}`	`{type: "string", minLength: 1, pattern: "^[^a-z]+$"}`

The answer custom type likely uses a .pipe() that Zod 3's JSON Schema converter couldn't serialize but Zod 4's can. This is more correct (answers should be uppercase), but it's a real validation tightening. Existing documents with lowercase answers would fail validation on update.

Email regex changed (medium risk)

Affected: users.emails[].address, jr_folder_perms.googleAccount

	main	zod-4
Pattern	^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$	`^(?!\.)(?!.*\.\.)([A-Za-z0-9_'+\-\.]*)[A-Za-z0-9_+-]@([A-Za-z0-9][A-Za-z0-9\-]*\.)+[A-Za-z]{2,}$`

Zod 4 ships a different z.string().email() regex. The new one disallows leading/consecutive dots, has a narrower local-part character set, and requires a TLD of 2+ chars. Could reject previously-valid email addresses.

URL patterns lost (low risk)

Affected: jr_puzzles.url, jr_hunts.homepageUrl, jr_chatmessages/jr_chatnotifications image child url

Zod 3 emitted a URL-matching regex from z.string().url(). Zod 4 emits no pattern — just {type: "string"}. This is a validation loosening (won't reject valid documents, but won't catch invalid URLs at the MongoDB level). Likely z.toJSONSchema() doesn't serialize URL checks.

UUID regex updated (low risk)

Affected: All UUID fields across mediasoup collections (consumerId, producerId, routerId, transportId)

	main	zod-4
Version digit	`[1-5]`	`[1-8]`
Variant nibble	unconstrained	`[89abAB]`
Max UUID	not allowed	allowed

More permissive on version range (6-8), slightly more restrictive on variant nibble. All standard RFC 4122 UUIDs have variant bits in [89abAB], so existing data should be fine.
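To check stored IDs against both rule sets, a sketch like the following could be used. The two patterns are reconstructed from the table above (version digit, variant nibble, max UUID) and are not copied from either Zod version, so treat them as approximations. `rejectedOnlyByNew` is a hypothetical helper.

```typescript
const HEX = "[0-9a-fA-F]";

// Old rules: version digit 1-5, variant nibble unconstrained, max UUID not allowed.
const oldUuid = new RegExp(
  `^${HEX}{8}-${HEX}{4}-[1-5]${HEX}{3}-${HEX}{4}-${HEX}{12}$`
);

// New rules: version digit 1-8, variant nibble in [89abAB], max UUID allowed.
const newUuid = new RegExp(
  `^(?:${HEX}{8}-${HEX}{4}-[1-8]${HEX}{3}-[89abAB]${HEX}{3}-${HEX}{12}` +
    `|[fF]{8}-[fF]{4}-[fF]{4}-[fF]{4}-[fF]{12})$`
);

// IDs that passed the old pattern but would fail the new one.
function rejectedOnlyByNew(ids: string[]): string[] {
  return ids.filter((id) => oldUuid.test(id) && !newUuid.test(id));
}

const sample = [
  "123e4567-e89b-12d3-a456-426614174000", // v1, variant nibble "a": fine under both
  "123e4567-e89b-42d3-0456-426614174000", // variant nibble "0": old only
  "018f4e2a-7b3c-7d4e-9f1a-2b3c4d5e6f70", // v7: new only
];
const newlyRejected = rejectedOnlyByNew(sample);
console.log(newlyRejected); // ["123e4567-e89b-42d3-0456-426614174000"]
```

Running a check of this kind over the mediasoup collections would show whether any stored ID has a variant nibble outside [89abAB].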

Summary

Change	Risk	Action needed?
`bsonType` → `type`	None	No
`deleted`/`updatedAt`/`tags` etc. now required	None (have defaults)	No
`allOf[not null, bool]` → `boolean`	None	No
Enums gain `type: "string"`	None	No
`anyOf` → `oneOf`	None	No
Flatter `allOf` nesting	None	No
`exclusiveMinimum/Maximum: false` removed	None	No
UUID regex updated	Low	Monitor, likely fine
URL patterns lost	Low	Upstream: `z.toJSONSchema()` / `zod-to-mongo-schema`
Email regex changed	Medium	Verify existing emails pass new regex
`answer`/`guess` gained constraints	Medium	Correct behavior, but verify no lowercase data exists
`port` lost `minimum: 0`	High	Needs fix (Zod bug already reported)

ebroder · 2026-02-09T23:02:45Z

Couple of notes on that:

The URL regex patterns on Puzzle and Hunt are arguably load-bearing right now, since we don't do any other validation of user-provided URLs (other than ensuring that they're a string). This came up in Reduce noise from logged errors for bad user inputs #2284 and is maybe not ideal behavior, but we should probably tackle those orthogonally.

The new approach to discriminated unions is buggy. The Zod 4 schema ends up with something like:

allOf: [
  { oneOf: [... variants with {name, value} ...], additionalProperties: false },
  { properties: {createdAt, updatedAt, createdBy, updatedBy}, additionalProperties: false },
  { properties: {deleted}, additionalProperties: false },
  { properties: {_id}, additionalProperties: false }
]

which is unsatisfiable, because each entry rejects every other.

ebroder · 2026-02-10T01:02:17Z

Opened colinhacks/zod#5702 which I believe will fix the discriminated unions.

In truth, this is largely a ground-up reimplementation of our typed model layer that takes advantage of a few years of experience working with the original implementation. Upgrading to Zod 4 is *somewhat* orthogonal.

My primary goal with this upgrade was to take advantage of Zod 4's native support for JSON Schemas. There already exists a library (zod-to-mongo-schema) which can utilize Zod's native JSON Schema support to generate MongoDB-compatible JSON Schemas. Given that our needs here are fairly undifferentiated, it doesn't feel like there's significant benefit to maintaining this code ourselves. This allows us to clean up the old code to generate JSON Schemas.

As part of this swap, we need to deal with some differences in how zod-to-mongo-schema represents schemas:

* Zod's `z.int()` can technically capture any (53-bit) integer value representable in floating point, so zod-to-mongo-schema represents it as a "long"; we previously used "int". The MongoDB JavaScript driver by default will serialize integer values to the BSON "int" type, so this causes conflicts. Switch to `z.int32()` instead to get the desired "int" type in the generated JSON Schema.

As always, the most complex element of our typed model layer is handling fields which should be automatically populated. Our previous approach (documented in #1394) was powerful and flexible, but abusing Zod's input vs. output schemas and lying to the type system with transforms broke down with Zod's native JSON Schemas, which refuses to serialize transforms.

Instead of trying to modernize that approach, this commit uses two systems to track automated fields. Within the type system, we use Zod's branded types to mark fields which should be auto-populated (effectively as a boolean marker), which we can then use to filter auto-populated fields, making them optional at insertion-time. At runtime, we use schema metadata, which captures when a field should be populated (insert, update, or both) and what value should be used (which has been limited to the specific types of values we actually use).

(In bringing in branded types, we do return to our old friend of input/output schemas: auto-populated types are only branded on the input side, which is only used for computing the insert type, since we don't actually want nominal typing for our auto-populated fields.)

This has the effect of dropping support for arbitrary transformation functions. In practice, we were only using that function to apply the `answerify` transform to answer fields, but we had previously concluded that answers were already uppercased at input time. In exchange for dropping transforms, we no longer need to analyze the entire update operation to make sure transformations are applied properly, and can instead just rely on MongoDB to enforce that the result matches our schema. This in turn lets us remove a significant amount of code around updates (including the mechanisms around `relaxSchema` and `parseMongoModifierAsync`).

In addition to dropping support for transform functions, I also dropped support for the `bypassSchema` option for inserts, updates, and upserts. This wasn't a great abstraction, since it attempted to otherwise preserve the behavior of Meteor Collections' native methods, but had to do some extra work to do that. Instead, users can just reach for rawCollection. This only comes up in migration code anyway, which generally requires a fair amount of abstraction bypass regardless.

And finally, I pulled the code into imports/lib/typedModel, rather than leaving it strewn about with actual model declarations.

ebroder force-pushed the evan/zod-4 branch from 97d01bc to f26aa77 Compare February 9, 2026 06:09

ebroder force-pushed the evan/zod-4 branch from f26aa77 to cf5ad43 Compare February 9, 2026 21:33

ebroder force-pushed the evan/zod-4 branch 2 times, most recently from 86dfe93 to 5040e89 Compare February 21, 2026 06:12

ebroder force-pushed the evan/zod-4 branch from 5040e89 to 0a8f386 Compare March 19, 2026 17:25

ebroder force-pushed the evan/zod-4 branch from 0a8f386 to 9e150f1 Compare May 1, 2026 14:51

ebroder wants to merge 1 commit into main from evan/zod-4
