feat: REPL with tracing and isolated benchmarking

Add a REPL to:
- manage the main API server and let the user discover and handle worker nodes
- manage local model storage and downloads of restricted models from Hugging Face (HF)
- trace the call stack and collect performance information at different levels (including interpreter-level tracing with `sys.setprofile`)
- trace different subsystems (prefill, compute, prefetch) and specific components (tokenizer, attention, etc.)
- isolate and benchmark different subsystems and components with dummy data
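
As a rough illustration of the interpreter-level tracing and isolated benchmarking described above, here is a minimal sketch built on `sys.setprofile` and `time.perf_counter`. The `CallProfiler` class and the `tokenize`, `attention`, and `prefill` stand-ins are hypothetical names invented for this example; they are not part of the project, and the real REPL may structure its hooks quite differently.

```python
import sys
import time
from collections import defaultdict


class CallProfiler:
    """Hypothetical helper: counts calls and sums wall time per Python function name."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.total = defaultdict(float)
        self._stack = []  # (frame, start_time) for Python calls in flight

    def _hook(self, frame, event, arg):
        # Only Python-level events are handled; c_call/c_return are ignored.
        if event == "call":
            self._stack.append((frame, time.perf_counter()))
        elif event == "return":
            # Skip returns from frames entered before the hook was installed.
            if self._stack and self._stack[-1][0] is frame:
                _, start = self._stack.pop()
                name = frame.f_code.co_name
                self.calls[name] += 1
                self.total[name] += time.perf_counter() - start

    def __enter__(self):
        sys.setprofile(self._hook)
        return self

    def __exit__(self, *exc):
        sys.setprofile(None)
        self._stack.clear()  # drop leftover frame references
        return False


# Dummy stand-ins for real components (assumptions for illustration only).
def tokenize(text):
    return text.split()


def attention(tokens):
    return [len(t) for t in tokens]


def prefill(text):
    return attention(tokenize(text))


# Benchmark the subsystem in isolation with dummy data.
with CallProfiler() as prof:
    for _ in range(3):
        prefill("dummy prompt for benchmarking")

for name in sorted(prof.total, key=prof.total.get, reverse=True):
    print(f"{name:12s} calls={prof.calls[name]} total={prof.total[name] * 1e6:.1f}us")
```

Times reported per function are inclusive of callees, so `prefill` will always show at least the combined time of `tokenize` and `attention`. The profile hook itself adds overhead to every call, so the numbers are best read as relative rather than absolute.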

feat: REPL with tracing and isolated benchmarking #10

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

feat: REPL with tracing and isolated benchmarking #10

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions