OpenCowork

An open-source desktop AI work system for browser automation, reusable task runs, templates, MCP-native tooling, and real local execution.

Why OpenCowork

OpenCowork is built for people who want an agent that does more than chat. It can open websites, operate a headed browser, call CLI tools, run reusable skills, persist task history, and both connect to standard MCP servers and expose its own tools as one.

It is designed for fast iteration on real desktop workflows: research, operations, internal tools, demos, browser automation, and repeatable task execution.

Compared with many "chat-first" agent demos, OpenCowork is moving toward a result-first workflow:

every serious task should produce a reusable run record,
successful work should be reviewable as a result,
useful work should become a template,
repeated work should be schedulable or triggerable from IM.

Current Product Direction

The current work stream is converging around a result-centric task model:

task runs are recorded as reusable TaskRun records,
completed work persists into TaskResult,
history is shifting toward outcomes, artifacts, and rerun links,
templates can be created from successful runs and executed with parameters,
scheduler and IM surfaces now reuse the same task/result semantics.
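
The run, result, and template relationship described above can be sketched roughly as follows. This is an illustration only: the field names, the `{{name}}` placeholder syntax, and the helper functions `templateFromRun` and `renderTemplate` are hypothetical and are not taken from OpenCowork's actual schema.

```typescript
// Hypothetical shapes; OpenCowork's real TaskRun / TaskResult records may differ.
type RunStatus = "running" | "succeeded" | "failed";

interface TaskRun {
  id: string;
  prompt: string;
  status: RunStatus;
  startedAt: string; // ISO timestamp
}

interface TaskResult {
  runId: string; // links the result back to its TaskRun
  summary: string;
  artifacts: string[]; // file paths or URLs produced by the run
}

interface TaskTemplate {
  name: string;
  promptTemplate: string; // uses {{name}} placeholders (assumed syntax)
  parameters: string[];
}

// Turn a successful run into a template by swapping concrete values for placeholders.
function templateFromRun(
  run: TaskRun,
  name: string,
  values: Record<string, string>
): TaskTemplate {
  if (run.status !== "succeeded") {
    throw new Error(`run ${run.id} did not succeed`);
  }
  let promptTemplate = run.prompt;
  for (const [param, value] of Object.entries(values)) {
    if (value === "") throw new Error(`empty value for parameter: ${param}`);
    promptTemplate = promptTemplate.split(value).join(`{{${param}}}`);
  }
  return { name, promptTemplate, parameters: Object.keys(values) };
}

// Fill the placeholders to get a concrete prompt for a new run.
function renderTemplate(template: TaskTemplate, args: Record<string, string>): string {
  return template.promptTemplate.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => {
    const value = args[key];
    if (value === undefined) throw new Error(`missing parameter: ${key}`);
    return value;
  });
}
```

In this sketch a scheduler or IM trigger would only need a template name and a parameter map to start a new run, which is one way the same task/result semantics could be shared across surfaces.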

What's New in v0.12.5

Added a first working Hybrid CUA browser runtime with explicit visual execution support.
Added a dedicated visual_browser agent tool for complex UI tasks that are not stable with DOM selectors alone.
Added approval-aware visual execution with approve-and-continue and takeover flows.
Added a visual debug entry point in the desktop UI.
Added visual trace review in execution steps, result delivery, task run details, and history.
Added regression tests for visual routing, approval continuation, and visual trace rendering.

Highlights in v0.10.10

Standard MCP client support for remote streamable-http endpoints such as LangChain Docs MCP.
Standard MCP server mode with a /mcp endpoint, while keeping legacy /tools compatibility.
A clearer MCP UI split into Clients and Server Mode.
Better follow-up continuity across agent turns using thread reuse.
Safer long-running conversations by preventing screenshot payloads from blowing up model context.
Improved browser search flows with pressEnter support for input actions.
Stronger memory, task history, and restore foundations for real multi-step work.
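
One common way to keep screenshot payloads from inflating model context, as mentioned in the list above, is to keep image data only on the most recent screenshots and replace older ones with a short text placeholder. The sketch below illustrates that general technique; the `AgentMessage` shape and the `pruneScreenshots` helper are hypothetical, and OpenCowork's actual mechanism may work differently.

```typescript
// Hypothetical message shape; not OpenCowork's internal type.
interface AgentMessage {
  role: "user" | "assistant" | "tool";
  text: string;
  screenshotBase64?: string;
}

// Keep image data only on the newest `keepLast` screenshot messages.
// Older screenshots are dropped and noted in the message text instead.
function pruneScreenshots(messages: AgentMessage[], keepLast = 1): AgentMessage[] {
  let remaining = keepLast;
  const result: AgentMessage[] = [];
  // Walk from newest to oldest so the most recent screenshots are kept first.
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    const shot = msg.screenshotBase64;
    if (shot === undefined) {
      result.push(msg);
    } else if (remaining > 0) {
      remaining--;
      result.push(msg);
    } else {
      result.push({
        role: msg.role,
        text: `${msg.text} [screenshot omitted, ${shot.length} base64 chars]`,
      });
    }
  }
  return result.reverse(); // restore chronological order
}
```

The function returns a new array and leaves the input untouched, so the full history can still be persisted while only the pruned copy is sent to the model.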

Core Capabilities

Capability	What it enables
Desktop Agent	Multi-step task execution through a ReAct-style agent
Browser Automation	Navigate, click, type, extract, wait, and capture screenshots
Skills	Install and run reusable capabilities like `ppt-creator`
MCP Client	Connect external MCP tools and use them inside the agent
MCP Server	Expose OpenCowork capabilities to other MCP clients
Task History	Persist task results, steps, and recovery state
Task Templates	Save successful work as reusable, parameterized task flows
IM File Workflow	Send tasks and files through Feishu and receive result files
Vision Analysis	OCR and multimodal understanding for local images
Human-in-the-loop	Pause, resume, interrupt, and take over tasks
International UI	English-first UI with Chinese support

Who This Is For

OpenCowork is a good fit if you are:

building a desktop AI copilot with real browser and local execution,
evaluating MCP-native agent UX beyond CLI-only demos,
automating recurring research, operations, or reporting workflows,
experimenting with reusable agent templates and result-centric history,
contributing to an open-source desktop agent stack that is still moving fast.

Quick Start

Requirements

Node.js 18+
npm 9+
Python 3.8+ for selected skills
A valid LLM API configuration in config/llm.json

Install

git clone https://github.com/LeonGaoHaining/opencowork.git
cd opencowork
npm install

Configure your model

Create config/llm.json:

{
  "provider": "openai",
  "model": "gpt-5.4-mini",
  "apiKey": "your-api-key",
  "baseUrl": "https://api.openai.com/v1",
  "timeout": 60000,
  "maxRetries": 3
}

For image analysis through IM, use a model deployment that supports image input on chat/completions.
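
A small validation step can catch a malformed config/llm.json before the app starts. The following is a minimal sketch, not OpenCowork's actual loader: the field names come from the example config, while the `parseLlmConfig` helper, the validation rules, the default values, and the reading of `timeout` as milliseconds are all assumptions.

```typescript
// Fields mirror the example config; everything else here is assumed.
interface LlmConfig {
  provider: string;
  model: string;
  apiKey: string;
  baseUrl: string;
  timeout: number; // assumed to be milliseconds
  maxRetries: number;
}

function parseLlmConfig(raw: string): LlmConfig {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    throw new Error("config must be a JSON object");
  }
  const obj = data as Record<string, unknown>;

  // Read a required non-empty string field.
  const str = (key: string): string => {
    const v = obj[key];
    if (typeof v !== "string" || v.trim() === "") {
      throw new Error(`"${key}" must be a non-empty string`);
    }
    return v;
  };
  // Read an optional non-negative number field, falling back to a default.
  const num = (key: string, fallback: number): number => {
    const v = obj[key];
    if (v === undefined) return fallback;
    if (typeof v !== "number" || !Number.isFinite(v) || v < 0) {
      throw new Error(`"${key}" must be a non-negative number`);
    }
    return v;
  };

  const config: LlmConfig = {
    provider: str("provider"),
    model: str("model"),
    apiKey: str("apiKey"),
    baseUrl: str("baseUrl"),
    timeout: num("timeout", 60000), // default copied from the example (assumption)
    maxRetries: num("maxRetries", 3), // default copied from the example (assumption)
  };
  if (config.apiKey === "your-api-key") {
    throw new Error("replace the placeholder apiKey with a real key");
  }
  if (!/^https?:\/\//.test(config.baseUrl)) {
    throw new Error('"baseUrl" must start with http:// or https://');
  }
  return config;
}
```

The function takes the file contents as a string, so it can be paired with whatever file-reading call the host environment provides.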

Local config safety

Keep config/ local to your machine.
config/ is git-ignored and should never be committed.
Feishu credentials such as config/feishu.json must not be pushed to GitHub.

Run the desktop app

npm run electron:dev

Example Prompts

Open Baidu, search for a company, and summarize what it does.
Create a company overview PPT from the information on the page.
Connect an MCP tool and use it to fetch LangChain docs examples.
Open the generated PPT file.
Turn the successful task into a reusable template and schedule it weekly.

MCP Support

OpenCowork now supports both sides of MCP:

As an MCP client, it can connect to standard remote MCP servers.
As an MCP server, it can expose tools through a standard /mcp endpoint.

Examples:

Connect to https://docs.langchain.com/mcp from the MCP client panel.
Enable server mode and expose selected OpenCowork tools to external clients.
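
For readers unfamiliar with what a standard /mcp endpoint receives, the sketch below builds (but does not send) the JSON-RPC 2.0 requests an MCP client would POST over the Streamable HTTP transport. It is an illustration of the protocol, not OpenCowork's client code: the helper names, the `example-client` identity, and the choice of protocol version `2025-03-26` are assumptions made for the example.

```typescript
// JSON-RPC 2.0 request envelope used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Plain description of an HTTP request; pass it to any HTTP client to send it.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Wrap a JSON-RPC message as a Streamable HTTP POST.
// The server may answer with plain JSON or an SSE stream, so both are accepted.
function mcpRequest(endpoint: string, rpc: JsonRpcRequest, sessionId?: string): HttpRequest {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    Accept: "application/json, text/event-stream",
  };
  // Servers that use sessions return an Mcp-Session-Id to echo on later requests.
  if (sessionId) headers["Mcp-Session-Id"] = sessionId;
  return { url: endpoint, method: "POST", headers, body: JSON.stringify(rpc) };
}

// First message of every MCP session.
function initializeRequest(id: number): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26", // assumed version for this sketch
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.1" }, // hypothetical
    },
  };
}

// Ask the server which tools it exposes.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}
```

A real client also sends a `notifications/initialized` notification after the `initialize` response and before calling `tools/list`; in practice an MCP SDK handles that handshake.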

Documentation

CHANGELOG.md — release history
USER_GUIDE.md — product usage guide
docs/ARCHITECTURE.md — architecture overview
docs/ROADMAP.md — product direction
CONTRIBUTING.md — contribution workflow
SECURITY.md — security reporting policy

Development

# Main desktop development flow
npm run electron:dev

# Build all targets
npm run build

# Test
npm run test:run

# Lint and format
npm run lint
npm run format

Open Source Status

OpenCowork is moving from an internal fast-iteration agent into a stronger open-source developer product. The current release is best suited for builders who want:

a desktop automation foundation,
an MCP-native local agent shell,
a skill-based extensibility layer,
a result-centric task system with reusable templates,
and a project that is actively shipping core agent infrastructure.

Current Release Notes

v0.12.5 is the current recommended tag.

v0.12.0 introduced the task-result-template workflow convergence.
v0.12.1 fixed the missing overview panel files from that release.
v0.12.2 added follow-up stabilization for result delivery, i18n, run scoping, overview safety, and reusable workflow UX.
v0.12.3 added bidirectional Feishu file workflows and real image analysis for IM-driven tasks.
v0.12.4 fixed Feishu IM reply routing.
v0.12.5 introduces the first working Hybrid CUA feature slice with explicit visual browser execution and persisted visual trace review.

Community

Issues: https://github.com/LeonGaoHaining/opencowork/issues
Discussions: https://github.com/LeonGaoHaining/opencowork/discussions
Website: https://opencowork.me

License

Apache-2.0. See LICENSE.
