Commit 609a434

More doc stuff (#965)
1 parent 8307207 commit 609a434

File tree

12 files changed

+236
-43
lines changed

doc/.vitepress/config.ts

Lines changed: 7 additions & 2 deletions
@@ -50,7 +50,11 @@ export default defineConfig({
5050
{
5151
text: "Standards",
5252
link: "/concept/standard/",
53-
items: [{ text: "MoqTransport", link: "/concept/standard/moq-transport" }],
53+
items: [
54+
{ text: "MoqTransport", link: "/concept/standard/moq-transport" },
55+
{ text: "MSF", link: "/concept/standard/msf" },
56+
{ text: "LOC", link: "/concept/standard/loc" },
57+
],
5458
},
5559
{
5660
text: "Use Cases",
@@ -59,7 +63,8 @@ export default defineConfig({
5963
{ text: "Contribution", link: "/concept/use-case/contribution" },
6064
{ text: "Distribution", link: "/concept/use-case/distribution" },
6165
{ text: "Conferencing", link: "/concept/use-case/conferencing" },
62-
{ text: "Exotic", link: "/concept/use-case/exotic" },
66+
{ text: "AI", link: "/concept/use-case/ai" },
67+
{ text: "Other", link: "/concept/use-case/other" },
6368
],
6469
},
6570
],

doc/concept/index.md

Lines changed: 1 addition & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -8,34 +8,8 @@ Welcome to my favorite section.
88
MoQ has been a multi-year journey to solve some very real problems in the industry and now it's time to flex the design.
99

1010
## Layers
11-
12-
The design philosophy of MoQ is to make things simple, composable, and customizable.
11+
MoQ is carefully broken into layers to make it simple, composable, and customizable.
1312
We don't want you to hit a brick wall if you deviate from the standard path (*ahem* WebRTC).
14-
We also want to benefit from economies of scale (like HTTP), utilizing generic libraries and tools whenever possible.
15-
16-
To accomplish this, MoQ is broken into layers:
17-
18-
```text
19-
┌─────────────────┐
20-
│ Application │ 🏢 Your business logic
21-
│ │ - authentication, non-media tracks, etc.
22-
├─────────────────┤
23-
│ Media Format │ 🎬 Media-specific encoding/streaming
24-
│ (hang) │ - codecs, containers, catalog
25-
├─────────────────├
26-
│ MoQ Transport │ 🚌 Generic pub/sub transport
27-
│ (moq-lite) │ - broadcasts, tracks, groups, frames
28-
├─────────────────┤
29-
│ WebTransport │ 🌐 Browser-compatible QUIC
30-
│ │ - HTTP/3 handshake
31-
├─────────────────┤
32-
| QUIC | 🌐 Underlying transport protocol
33-
│ │ - streams, datagrams, prioritization, etc.
34-
└─────────────────┘
35-
```
36-
37-
You get to choose which layers you want to use and which layers you want to replace.
38-
It's like a cake but reusable.
3913

4014
See [Layers](/concept/layer/) for more information.
4115

doc/concept/layer/index.md

Lines changed: 28 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,36 @@
11
---
22
title: Layering
3-
description: It's like a cake but reusable.
3+
description: It's like a cake; you choose if you want frosting.
44
---
55

66
# Layers
7-
You need to have some understanding of the responsibility and purpose of each layer to best utilize MoQ.
8-
Let's dive in, starting at the bottom of the stack.
7+
The design philosophy of MoQ is to make things simple, composable, and customizable.
8+
We don't want you to hit a brick wall if you deviate from the standard path (*ahem* WebRTC).
9+
We also want to benefit from economies of scale (like HTTP), utilizing generic libraries and tools whenever possible.
10+
11+
To accomplish this, MoQ is broken into layers:
12+
13+
```text
14+
┌─────────────────┐
15+
│ Application │ 🏢 Your business logic
16+
│ │ - authentication, non-media tracks, etc.
17+
├─────────────────┤
18+
│ Media Format │ 🎬 Media-specific encoding/streaming
19+
│ (hang) │ - codecs, containers, catalog
20+
├─────────────────┤
21+
│ MoQ Transport │ 🚌 Generic pub/sub transport
22+
│ (moq-lite) │ - broadcasts, tracks, groups, frames
23+
├─────────────────┤
24+
│ WebTransport │ 🌐 Browser-compatible QUIC
25+
│ │ - HTTP/3 handshake
26+
├─────────────────┤
27+
│ QUIC │ 🌐 Underlying transport protocol
28+
│ │ - streams, datagrams, prioritization, etc.
29+
└─────────────────┘
30+
```
31+
32+
You get to choose which layers you want to use and which layers you want to replace.
33+
It's like a cake; you choose if you want frosting.
934

1035
## QUIC
1136
QUIC is the core protocol that powers HTTP/3, designed to fix head-of-line blocking that plagues TCP and thus HTTP/2.

doc/concept/layer/moq-lite.md

Lines changed: 26 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -4,10 +4,29 @@ description: A fraction of the calories with none of the fat.
44
---
55

66
# moq-lite
7-
A subset of the [MoqTransport](/concept/standard/moq-transport) specification.
8-
The useless/optional cruft has been removed so more time can be spent on the core functionality.
7+
[moq-lite](https://www.ietf.org/archive/id/draft-lcurley-moq-lite-02.html) is a subset of the [MoqTransport](/concept/standard/moq-transport) specification.
8+
The goal is to keep the core transport layer simple and focused on practical use-cases.
99

10-
See the draft: [draft-lcurley-moq-lite](https://www.ietf.org/archive/id/draft-lcurley-moq-lite-02.html).
10+
There's too much fringe functionality in the MoqTransport draft that's not practical to implement.
11+
Most of it is specific to Cisco's implementation and bizarre requirements anyway.
12+
13+
## Compatibility
14+
Keep in mind that moq-lite is forward compatible with the IETF draft.
15+
For every moq-lite API, there's a corresponding moq-transport API.
16+
So fortunately, it doesn't matter if I get hit by a bus and moq-lite ceases to exist.
17+
18+
The moq.dev libraries negotiate the `moq-lite` or `moq-transport` version as part of the QUIC/WebTransport handshake (via ALPN).
19+
When `moq-transport` wire format is negotiated, we implement a compatibility layer that enforces the moq-lite API.
20+
For example, if there's a gap in a group (valid in moq-transport), we drop the tail of the group instead of erroring.
21+
22+
|---------------|---------------|-----------|----------------------------------------------------------------------|
23+
| client | relay | supported | |
24+
|---------------|---------------|:---------:|----------------------------------------------------------------------|
25+
| moq-lite | moq-lite | ✅ | |
26+
| moq-lite | moq-transport | ✅ | |
27+
| moq-transport | moq-lite | ⚠️ | Can't use moq-transport specific features. |
28+
| moq-transport | moq-transport | ⚠️ | Depends on the implementation; nobody has implemented every feature. |
29+
|---------------|---------------|-----------|----------------------------------------------------------------------|
1130

1231
## Definitions
1332
- **Broadcast** - A named and discoverable collection of **tracks** from a single publisher.
@@ -22,15 +41,14 @@ It's less ambiguous and closer to media terminology:
2241

2342
## Major Differences
2443
The main goal is to reduce complexity and make the protocol easier to implement.
25-
When a feature has limited use-cases, it's removed (for now).
2644

27-
- **No Request IDs**: A bidirectional stream for each request to avoid HoLB.
45+
- **No Request IDs**: A bidirectional stream for each request to avoid HoLB. (NOTE: likely to be upstreamed into moq-transport)
2846
- **No Push**: A subscriber must explicitly subscribe to each track.
29-
- **No FETCH**: The plan is to use HTTP for VOD instead of reinventing the wheel.
47+
- **No FETCH**: Use HTTP for VOD instead of reinventing the wheel.
3048
- **No Joining Fetch**: Subscriptions start at the latest group, not the latest frame.
3149
- **No sub-groups**: SVC layers should be separate tracks.
32-
- **No gaps**: Makes life easier for a relay.
50+
- **No gaps**: Makes life much easier for the relay and every application.
3351
- **No object properties**: Encode your metadata into the frame payload.
3452
- **No pausing**: Unsubscribe if you don't want a track.
3553
- **No binary names**: Uses UTF-8 strings instead of arrays of byte arrays.
36-
- **No datagrams**: Maybe in the future.
54+
- **No datagrams**: Maybe one day.
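
A minimal sketch, not part of this commit, of the gap handling described under Compatibility: when a group arrives with a missing frame, keep the contiguous prefix and drop the tail instead of erroring. The `Frame` shape and the `dropTailAtGap` name are assumptions made for illustration; they are not the moq.dev API.

```typescript
// Hypothetical frame shape; `sequence` is the frame's index within its group.
interface Frame {
  sequence: number;
  payload: Uint8Array;
}

// Keep the contiguous prefix of a group (0, 1, 2, ...) and drop everything
// from the first gap onward, rather than reporting an error.
function dropTailAtGap(frames: Frame[]): Frame[] {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...frames].sort((a, b) => a.sequence - b.sequence);
  const kept: Frame[] = [];
  for (const frame of sorted) {
    // The next expected sequence number equals the number of frames kept so far.
    if (frame.sequence !== kept.length) break; // first gap: discard the rest
    kept.push(frame);
  }
  return kept;
}
```

With this approach the application only ever sees gap-free groups, which is the property the moq-lite API is said to enforce.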

doc/concept/standard/loc.md

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
---
2+
title: LOC - Low Overhead Container
3+
description: A low-overhead container format for MoQ.
4+
---
5+
6+
# LOC - Low Overhead Container
7+
We originally wanted to use [CMAF](/concept/standard/msf) but there's a lot of overhead.
8+
Like 100 bytes per frame sort of overhead (`moof` + `mdat`), the type of overhead that kills audio-only streams.
9+
10+
LOC is a super simple container format that's designed to be lightweight.
11+
It's similar to the [hang container](../layer/hang) and we'll probably merge them in the future.
12+
13+
[See the draft](https://www.ietf.org/archive/id/draft-ietf-moq-loc-00.html) for the latest details.

doc/concept/standard/msf.md

Lines changed: 14 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,14 @@
1+
---
2+
title: MSF - MoQ Streaming Format
3+
description: A catalog format for MoQ.
4+
---
5+
6+
# MSF - MoQ Streaming Format
7+
HLS/DASH playlists suck.
8+
WebRTC SDP is even worse.
9+
MSF is a replacement for both, utilizing MoQ live streams.
10+
11+
[MSF](https://www.ietf.org/archive/id/draft-ietf-moq-msf-00.html) is a catalog format for MoQ.
12+
It's similar to the [hang catalog](../layer/hang) and we'll probably merge them in the future.
13+
14+
[See the draft](https://www.ietf.org/archive/id/draft-ietf-moq-msf-00.html) for the latest details.

doc/concept/use-case/ai.md

Lines changed: 70 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,70 @@
1+
---
2+
title: AI
3+
description: Welcome to the future, old man.
4+
---
5+
6+
# AI
7+
Hopefully you had this square on your buzzword bingo card.
8+
9+
WebRTC is a great protocol for conferencing, but it's not designed for AI.
10+
That said, I haven't personally worked in this space, so take my suggestions with a grain of salt.
11+
12+
## Latency
13+
Inference is still quite slow and expensive, even for the big players.
14+
If you're going to spend >300ms and literal dollars on expensive inference, you want at least *some* reliability guarantees.
15+
16+
Unfortunately, WebRTC will never try to retransmit audio packets.
17+
A single lost packet will cause noticeable audio distortion.
18+
And if you have the audacity to generate audio/video separately, WebRTC won't synchronize them for you.
19+
Frames are rendered on receipt, so unless you introduce a delay, audio will be out of sync with video.
20+
21+
One of the core tenets of MoQ is adjustable latency.
22+
The viewer (and thus your application) controls how long it's willing to wait for content before it gets skipped/desynced.
23+
The latency budget of the network protocol can match the latency budget of the application.
24+
25+
## On-Demand
26+
MoQ is pull-based, so nothing is transmitted over the network until there's at least one subscriber.
27+
You can further extend this by not generating/encoding content either.
28+
29+
Both of these were mentioned briefly on the [contribution](/concept/use-case/contribution) page if you want to read more.
30+
31+
### Inference
32+
If you want to save compute resources, you can defer inference until it's actually needed.
33+
34+
For example, let's say you're publishing a `captions` track populated by Whisper or something.
35+
If nobody has enabled captions, then nobody will subscribe to the `captions` track.
36+
You can stop generating the track (or use a smaller model) until it's actually requested.
37+
38+
### Simulcast
39+
If you want to save bandwidth, you can publish media in a format expected by the AI model.
40+
41+
For example, let's say you're doing object detection on a bunch of security cameras.
42+
The model inputs video at 360p and 10fps or something like that, so that's what you publish.
43+
But if a human (those still exist) wants to audit the full video, you can separately serve the full resolution video.
44+
Since this is on-demand, you will only encode/transmit the 1080p video when it's actually needed.
45+
46+
## Browser Control
47+
One of the perks of using WebSockets/MoQ instead of WebRTC is that you get full control over the media pipeline.
48+
49+
[WebCodecs](https://developer.mozilla.org/en-US/docs/Web/API/WebCodecs_API) is used to encode/decode media within the browser.
50+
- For video, you use [VideoFrame](https://developer.mozilla.org/en-US/docs/Web/API/VideoFrame) which directly maps to a texture on the GPU. You can use WebGPU to perform inference, encoding, rendering, etc. without ever touching the CPU.
51+
- For audio, you get [AudioData](https://developer.mozilla.org/en-US/docs/Web/API/AudioData) which is (usually) just a float32 array. You control exactly how these are processed, captured, emitted, etc.
52+
53+
It's more work to do this instead of using a `<video>` element of course, but it opens the door to more possibilities.
54+
Run additional inference in the browser, render your media to textures on a model, etc.
55+
56+
And note that all of this is possible with WebRTC and [insertable streams](https://developer.mozilla.org/en-US/docs/Web/API/Insertable_Streams_for_MediaStreamTrack_API).
57+
However, you're really not gaining much by using WebRTC only for networking... just use MoQ instead.
58+
59+
## Non-Media
60+
MoQ is not just for media.
61+
62+
Send your prompts over the same WebTransport connection as the media.
63+
Or send non-media stuff like vertex data for 3D models, separate from the texture data.
64+
It's a versatile protocol with a wide range of use-cases.
65+
66+
## Simplicity
67+
You're working with AI, so you're probably building something new.
68+
69+
If you don't want to deal with SDP, or connections that take 10 RTTs, or unsupported media encodings, or STUN/TURN servers, then give MoQ a try.
70+
It's a lot closer to WebSockets than WebRTC, but with the ability to skip and scale.
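
A minimal sketch, not part of this commit, of the deferred inference idea from the On-Demand section: the expensive work only runs while at least one subscriber exists. `OnDemandTrack`, `tick`, and the `captions` example are hypothetical names, and the caption generator is a stand-in for a real model call.

```typescript
// Hypothetical helper: not part of any moq.dev library.
class OnDemandTrack<T> {
  private subscribers = 0;
  private readonly produce: () => T;

  constructor(produce: () => T) {
    this.produce = produce;
  }

  subscribe(): void {
    this.subscribers += 1;
  }

  unsubscribe(): void {
    // Never go below zero, even if unsubscribe is called too many times.
    this.subscribers = Math.max(0, this.subscribers - 1);
  }

  // Called once per input (ex. per audio chunk); skips the work when idle.
  tick(): T | undefined {
    if (this.subscribers === 0) return undefined;
    return this.produce();
  }
}

let inferenceRuns = 0;
const captions = new OnDemandTrack<string>(() => {
  inferenceRuns += 1; // stand-in for an expensive model call
  return `caption ${inferenceRuns}`;
});
```

Calling `captions.tick()` before anyone subscribes returns `undefined` and never invokes the generator, so the compute is only spent once somebody actually enables captions.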

doc/concept/use-case/contribution.md

Lines changed: 74 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,3 +33,77 @@ That's an over-generalization of course, but it's very interesting to see the di
3333
SRT is built into modern production equipment (hardware) while RTMP is used in consumer software.
3434

3535
Why? IDK.
36+
37+
## Pull vs Push
38+
Existing contribution protocols are push-based.
39+
Even YouTube's weird HLS ingest thing operates via POST requests.
40+
41+
However, MoQ is fundamentally a pull-based protocol.
42+
Technically, MoqTransport supports push too (via PUBLISH), but hear me out for a second.
43+
44+
### The Push Problem
45+
I would say there is one major problem with push: **There's no "optional" content.**
46+
47+
When a publisher creates multiple tracks, like 360p and 1080p, it needs to simultaneously encode and transmit both tracks.
48+
There's no way of knowing if anything downstream *actually* wants the 1080p track; it might go straight to `/dev/null` on the media server.
49+
50+
This doesn't matter for huge events like a concert or sports game.
51+
With enough viewers, we can assume that at least one viewer will want the content.
52+
But it can be a significant cost for long-tail content that nobody watches.
53+
54+
For example, consider a facility with hundreds of security cameras.
55+
We might be able to afford uploading 360p for every camera (recording to disk), but anything more than that would over-saturate the network.
56+
Ideally, we would only stream 1080p from an individual camera when a human wants a closer look...
57+
58+
### The Pull Solution
59+
The first thing a MoQ viewer does is subscribe to the `catalog.json` track for a broadcast.
60+
This lists all of the available tracks and their properties.
61+
62+
If a viewer wants the 1080p track, it subscribes to it.
63+
The subscription makes its way upstream (combining with duplicates) until one subscription reaches the publisher.
64+
When no more viewers want the 1080p track, the subscription is cancelled.
65+
66+
The publisher won't transmit a track until there's an active subscription, saving bandwidth.
67+
The publisher can go the extra mile and not even encode the content without a subscription, saving compute.
68+
This is especially useful for expensive AI models, for example only running Whisper when captions are needed.
69+
70+
Note that media services can also benefit from the same behavior.
71+
If nobody currently wants the 1080p track, then don't transcode it.
72+
The "publisher" in this case is any entity that understands the media format on top of MoQ.
73+
74+
## Multiple Connections
75+
Another issue with push-based protocols is that each connection is expensive.
76+
If every connection needs its own copy of the content, we quickly run out of bandwidth.
77+
Redundant ingest is mostly limited to large events that have bandwidth to spare (active-active).
78+
79+
Once again, MoQ solves this via the pull model.
80+
A publisher can establish multiple connections that *might* be used.
81+
A subscription will only be issued if the connection needs a specific track.
82+
83+
For example, a service can implement primary/secondary ingest via two connections to separate endpoints.
84+
All subscriptions are issued over the primary connection but if it fails, the subscriptions are moved to the secondary connection.
85+
The endpoints don't even have to be part of the same CDN and the MoQ publisher is completely oblivious; it just knows it was told to connect to two URLs.
86+
87+
Another example is P2P streaming.
88+
A client can establish a connection to each peer, transmitting tracks as requested.
89+
If one peer has the video minimized, then it can unsubscribe from the video track and save bandwidth.
90+
Again there's no business logic for this built into MoQ: it's automatic.
91+
92+
But what about clients that don't support P2P?
93+
Each client can also establish a connection to a MoQ CDN as a fallback.
94+
This works because the client discovers all broadcasts available on a connection via the built-in [announce mechanism](/feature/announce).
95+
If two connections can serve the same content, the subscription goes to the "best" connection (ie. P2P > CDN).
96+
97+
## Economies of Scale
98+
A subtle problem with contribution protocols is that they're not used for distribution.
99+
100+
This might sound silly: "of course distribution and contribution are different!"
101+
But when you really sit down and break down the requirements, they're not that different.
102+
One is client-server while the other is server-client, one is 1:1 while the other is 1:N.
103+
104+
By designing a protocol that works for both contribution and distribution, we can share implementations and optimizations.
105+
There are other benefits of supporting 1:N too, as mentioned in the previous section, so it seems like a no-brainer.
106+
107+
The other way we benefit from economies of scale is by using QUIC.
108+
We're not implementing our own UDP-based protocol and rediscovering the rough edges of the internet all over again.
109+
A QUIC library with BBR will out-perform the system TCP stack and likely out-perform any custom UDP thing (ex. SRT).
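
A minimal sketch, not part of this commit, of the subscription flow from "The Pull Solution": duplicate viewer subscriptions are combined so that only one reaches the publisher, and it is cancelled when the last viewer leaves. `SubscriptionFanIn` and its `upstream` log are hypothetical names used for illustration, not how any real relay is implemented.

```typescript
// Hypothetical relay-side bookkeeping; names are illustrative only.
class SubscriptionFanIn {
  // Number of active viewers per track name.
  private readonly viewers = new Map<string, number>();
  // Log of the actions that would be sent toward the publisher.
  readonly upstream: string[] = [];

  subscribe(track: string): void {
    const count = this.viewers.get(track) ?? 0;
    // Only the first viewer triggers an upstream subscription.
    if (count === 0) this.upstream.push(`subscribe ${track}`);
    this.viewers.set(track, count + 1);
  }

  unsubscribe(track: string): void {
    const count = this.viewers.get(track) ?? 0;
    if (count === 0) return; // nothing to cancel
    if (count === 1) {
      // The last viewer left, so cancel the upstream subscription.
      this.viewers.delete(track);
      this.upstream.push(`unsubscribe ${track}`);
    } else {
      this.viewers.set(track, count - 1);
    }
  }
}
```

A publisher driven by the `upstream` log would only encode and transmit the 1080p track between the first `subscribe 1080p` and the matching `unsubscribe 1080p`.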

doc/concept/use-case/index.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,4 +7,5 @@ description: How MoQ should be used in the wild
77
- [Contribution](/concept/use-case/contribution): A publisher (ex. OBS) sends data to a service (ex. Twitch).
88
- [Distribution](/concept/use-case/distribution): A service (ex. Twitch) distributes data to viewers.
99
- [Conferencing](/concept/use-case/conferencing): A service (ex. Zoom) facilitates a conference between multiple participants.
10-
- [Exotic](/concept/use-case/exotic): Some ideas for other use cases that might be viable.
10+
- [AI](/concept/use-case/ai): Generative AI, overlays, voice agents, and more.
11+
- [Other](/concept/use-case/other): Some ideas for other use cases that might be viable.
File renamed without changes.
