An open-source desktop AI work system for browser automation, reusable task runs, templates, MCP-native tooling, and real local execution.
OpenCowork is built for people who want an agent that does more than chat. It can open websites, operate a headed browser, call CLI tools, run reusable skills, persist task history, and now connect to or expose standard MCP servers.
It is designed for fast iteration on real desktop workflows: research, operations, internal tools, demos, browser automation, and repeatable task execution.
Compared with many "chat-first" agent demos, OpenCowork is moving toward a result-first workflow:
- every serious task should produce a reusable run record,
- successful work should be reviewable as a result,
- useful work should become a template,
- repeated work should be schedulable or triggerable from IM.
The current work stream is converging around a result-centric task model:
- task runs are recorded as reusable `TaskRun` records,
- completed work persists into `TaskResult`,
- history is shifting toward outcomes, artifacts, and rerun links,
- templates can be created from successful runs and executed with parameters,
- scheduler and IM surfaces now reuse the same task/result semantics.
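
The relationship between runs, results, and templates can be sketched in TypeScript. This is only an illustration: the field names, the `{{param}}` placeholder syntax, and the helpers `toTemplate` and `instantiate` are assumptions made for this example, not OpenCowork's actual schema or API.

```typescript
// Hypothetical shapes: OpenCowork's real TaskRun / TaskResult schema may differ.
interface TaskRun {
  id: string;
  prompt: string;
  status: "running" | "succeeded" | "failed";
}

interface TaskResult {
  runId: string;
  summary: string;
  artifacts: string[]; // e.g. paths of generated files
}

interface TaskTemplate {
  name: string;
  promptTemplate: string; // uses {{param}} placeholders (assumed syntax)
  params: string[];
}

// Derive a template from a successful run by swapping concrete values for placeholders.
function toTemplate(run: TaskRun, name: string, bindings: Record<string, string>): TaskTemplate {
  if (run.status !== "succeeded") {
    throw new Error(`run ${run.id} did not succeed`);
  }
  let promptTemplate = run.prompt;
  for (const [param, value] of Object.entries(bindings)) {
    promptTemplate = promptTemplate.split(value).join(`{{${param}}}`);
  }
  return { name, promptTemplate, params: Object.keys(bindings) };
}

// Fill in a template's placeholders to get the prompt for a new run.
function instantiate(template: TaskTemplate, values: Record<string, string>): string {
  return template.promptTemplate.replace(/\{\{(\w+)\}\}/g, (_match: string, key: string) => {
    const value = values[key];
    if (value === undefined) {
      throw new Error(`missing parameter: ${key}`);
    }
    return value;
  });
}

const exampleRun: TaskRun = {
  id: "run-1",
  prompt: "Search for Acme and summarize what it does.",
  status: "succeeded",
};
const exampleResult: TaskResult = { runId: exampleRun.id, summary: "Acme overview", artifacts: [] };
const exampleTemplate = toTemplate(exampleRun, "company-summary", { company: "Acme" });
console.log(exampleResult.runId, instantiate(exampleTemplate, { company: "Globex" }));
```

Keeping the result separate from the run, linked by `runId`, is one way to let history, templates, and scheduled reruns all point at the same record.
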
- Added a first working Hybrid CUA browser runtime with explicit visual execution support.
- Added a dedicated `visual_browser` agent tool for complex UI tasks that are not stable with DOM selectors alone.
- Added approval-aware visual execution with approve-and-continue and takeover flows.
- Added a visual debug entry point in the desktop UI.
- Added visual trace review in execution steps, result delivery, task run details, and history.
- Added regression tests for visual routing, approval continuation, and visual trace rendering.
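
The approve-and-continue and takeover flows mentioned in this list can be pictured as a small state machine. The sketch below is a guess at the general shape only: the state names, the `needsApproval` flag, and the functions `beginStep` and `applyDecision` are hypothetical and do not come from the OpenCowork codebase.

```typescript
// Hypothetical states and decisions; the names are illustrative, not OpenCowork's API.
type VisualStepState = "pending" | "awaiting_approval" | "running" | "handed_over";
type ApprovalDecision = "approve" | "takeover";

interface VisualStep {
  description: string;
  needsApproval: boolean; // e.g. a click that submits a form
  state: VisualStepState;
}

// Begin a step: steps flagged as sensitive pause for a human decision first.
function beginStep(step: VisualStep): VisualStep {
  if (step.state !== "pending") {
    throw new Error(`step already started (state: ${step.state})`);
  }
  return { ...step, state: step.needsApproval ? "awaiting_approval" : "running" };
}

// Apply the human decision: "approve" resumes the agent, "takeover" hands control to the user.
function applyDecision(step: VisualStep, decision: ApprovalDecision): VisualStep {
  if (step.state !== "awaiting_approval") {
    throw new Error(`step is not waiting for approval (state: ${step.state})`);
  }
  return { ...step, state: decision === "approve" ? "running" : "handed_over" };
}

const submitStep = beginStep({
  description: "Click the Submit button",
  needsApproval: true,
  state: "pending",
});
console.log(applyDecision(submitStep, "approve").state);
```
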
- Standard MCP client support for remote `streamable-http` endpoints such as LangChain Docs MCP.
- Standard MCP server mode with a `/mcp` endpoint, while keeping legacy `/tools` compatibility.
- A clearer MCP UI split into `Clients` and `Server Mode`.
- Better follow-up continuity across agent turns using thread reuse.
- Safer long-running conversations by preventing screenshot payloads from blowing up model context.
- Improved browser search flows with `pressEnter` support for input actions.
- Stronger memory, task history, and restore foundations for real multi-step work.
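
One common way to keep screenshot payloads from overwhelming the model context is to keep only the most recent screenshots and replace older ones with a short text placeholder. The sketch below shows that general technique; the `AgentMessage` shape and the `pruneScreenshots` helper are assumptions for illustration, and OpenCowork's actual mechanism may work differently.

```typescript
// Hypothetical message shape; not OpenCowork's actual types.
interface AgentMessage {
  role: "user" | "assistant" | "tool";
  text: string;
  screenshotBase64?: string;
}

// Keep the `keep` most recent screenshots; replace older ones with a short placeholder.
function pruneScreenshots(messages: AgentMessage[], keep = 1): AgentMessage[] {
  let remaining = keep;
  const newestFirst: AgentMessage[] = [];
  // Walk from newest to oldest so the latest screenshots are the ones preserved.
  for (const message of [...messages].reverse()) {
    const shot = message.screenshotBase64;
    if (shot === undefined) {
      newestFirst.push(message);
    } else if (remaining > 0) {
      remaining -= 1;
      newestFirst.push(message);
    } else {
      newestFirst.push({
        role: message.role,
        text: `${message.text} [screenshot omitted, ${shot.length} chars]`,
      });
    }
  }
  return newestFirst.reverse(); // restore chronological order
}

const history: AgentMessage[] = [
  { role: "tool", text: "Opened page", screenshotBase64: "AAAA" },
  { role: "assistant", text: "Clicking search" },
  { role: "tool", text: "Search results", screenshotBase64: "BBBBBB" },
];
console.log(pruneScreenshots(history).map((m) => m.text));
```

The input array is left unmodified, so the full history can still be persisted while only the pruned copy is sent to the model.
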
| Capability | What it enables |
|---|---|
| Desktop Agent | Multi-step task execution through a ReAct-style agent |
| Browser Automation | Navigate, click, type, extract, wait, and capture screenshots |
| Skills | Install and run reusable capabilities like `ppt-creator` |
| MCP Client | Connect external MCP tools and use them inside the agent |
| MCP Server | Expose OpenCowork capabilities to other MCP clients |
| Task History | Persist task results, steps, and recovery state |
| Task Templates | Save successful work as reusable, parameterized task flows |
| IM File Workflow | Send tasks and files through Feishu and receive result files |
| Vision Analysis | OCR and multimodal understanding for local images |
| Human-in-the-loop | Pause, resume, interrupt, and take over tasks |
| International UI | English-first UI with Chinese support |
OpenCowork is a good fit if you are:
- building a desktop AI copilot with real browser and local execution,
- evaluating MCP-native agent UX beyond CLI-only demos,
- automating recurring research, operations, or reporting workflows,
- experimenting with reusable agent templates and result-centric history,
- contributing to an open-source desktop agent stack that is still moving fast.
- Node.js 18+
- npm 9+
- Python 3.8+ for selected skills
- A valid LLM API configuration in `config/llm.json`

Clone the repository and install dependencies:

    git clone https://github.com/LeonGaoHaining/opencowork.git
    cd opencowork
    npm install

Create `config/llm.json`:


    {
      "provider": "openai",
      "model": "gpt-5.4-mini",
      "apiKey": "your-api-key",
      "baseUrl": "https://api.openai.com/v1",
      "timeout": 60000,
      "maxRetries": 3
    }

For image analysis through IM, use a model deployment that supports image input on `chat/completions`.
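
To catch a malformed `config/llm.json` early, the file can be validated before the app uses it. The sketch below checks only the six fields shown in the sample above; the `LlmConfig` interface and `parseLlmConfig` function are hypothetical names for this example, and OpenCowork may validate its configuration differently or accept additional fields.

```typescript
// Mirrors the fields in the sample config; the real loader may accept more.
interface LlmConfig {
  provider: string;
  model: string;
  apiKey: string;
  baseUrl: string;
  timeout: number; // presumably milliseconds, given the sample value 60000
  maxRetries: number;
}

// Parse the JSON text of config/llm.json and check each expected field.
function parseLlmConfig(json: string): LlmConfig {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    throw new Error("config must be a JSON object");
  }
  const record = raw as Record<string, unknown>;

  const str = (key: string): string => {
    const value = record[key];
    if (typeof value !== "string" || value.length === 0) {
      throw new Error(`"${key}" must be a non-empty string`);
    }
    return value;
  };
  const num = (key: string): number => {
    const value = record[key];
    if (typeof value !== "number" || !Number.isFinite(value) || value < 0) {
      throw new Error(`"${key}" must be a non-negative number`);
    }
    return value;
  };

  return {
    provider: str("provider"),
    model: str("model"),
    apiKey: str("apiKey"),
    baseUrl: str("baseUrl"),
    timeout: num("timeout"),
    maxRetries: num("maxRetries"),
  };
}

const sampleConfig = parseLlmConfig(
  '{"provider":"openai","model":"gpt-5.4-mini","apiKey":"your-api-key",' +
    '"baseUrl":"https://api.openai.com/v1","timeout":60000,"maxRetries":3}'
);
console.log(sampleConfig.provider, sampleConfig.timeout);
```

In a Node.js process the input string would typically come from reading `config/llm.json` with `fs.readFileSync(path, "utf8")`.
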

- Keep `config/` local to your machine. `config/` is git-ignored and should never be committed.
- Feishu credentials such as `config/feishu.json` must not be pushed to GitHub.

Start the desktop app in development mode:

    npm run electron:dev

Then try prompts such as:

- Open Baidu, search for a company, and summarize what it does.
- Create a company overview PPT from the information on the page.
- Connect an MCP tool and use it to fetch LangChain docs examples.
- Open the generated PPT file.
- Turn the successful task into a reusable template and schedule it weekly.

OpenCowork now supports both sides of MCP:
- As an MCP client, it can connect to standard remote MCP servers.
- As an MCP server, it can expose tools through a standard `/mcp` endpoint.

Examples:

- Connect to `https://docs.langchain.com/mcp` from the MCP client panel.
- Enable server mode and expose selected OpenCowork tools to external clients.
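
For readers curious what a client does when it connects to a `/mcp` endpoint, the sketch below builds the JSON-RPC `initialize` request that an MCP client sends first over the Streamable HTTP transport. It illustrates the protocol, not OpenCowork's client code: `buildInitializeRequest`, the client name, and the protocol version string are assumptions, and real clients would normally rely on an MCP SDK instead of hand-built requests.

```typescript
// Describes an HTTP request without sending it, so the sketch needs no network access.
interface McpHttpRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build the JSON-RPC "initialize" message an MCP client sends before any tool calls.
function buildInitializeRequest(url: string, clientName: string, clientVersion: string): McpHttpRequest {
  const message = {
    jsonrpc: "2.0",
    id: 1,
    method: "initialize",
    params: {
      protocolVersion: "2025-03-26", // assumed; use the revision your server supports
      capabilities: {},
      clientInfo: { name: clientName, version: clientVersion },
    },
  };
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Streamable HTTP servers may answer with plain JSON or an SSE stream.
        Accept: "application/json, text/event-stream",
      },
      body: JSON.stringify(message),
    },
  };
}

const initRequest = buildInitializeRequest("https://docs.langchain.com/mcp", "example-client", "0.0.1");
console.log(initRequest.init.body);
```

On Node.js 18+, the returned object could be passed to the built-in `fetch(initRequest.url, initRequest.init)` to send it.
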

- `CHANGELOG.md`: release history
- `USER_GUIDE.md`: product usage guide
- `docs/ARCHITECTURE.md`: architecture overview
- `docs/ROADMAP.md`: product direction
- `CONTRIBUTING.md`: contribution workflow
- `SECURITY.md`: security reporting policy

Common development commands:

    # Main desktop development flow
    npm run electron:dev

    # Build all targets
    npm run build

    # Test
    npm run test:run

    # Lint and format
    npm run lint
    npm run format

OpenCowork is moving from an internal fast-iteration agent into a stronger open-source developer product. The current release is best suited for builders who want:
- a desktop automation foundation,
- an MCP-native local agent shell,
- a skill-based extensibility layer,
- a result-centric task system with reusable templates,
- and a project that is actively shipping core agent infrastructure.

`v0.12.5` is the current recommended tag.

- `v0.12.0` introduced the task-result-template workflow convergence.
- `v0.12.1` fixed the missing overview panel files from that release.
- `v0.12.2` added follow-up stabilization for result delivery, i18n, run scoping, overview safety, and reusable workflow UX.
- `v0.12.3` added bidirectional Feishu file workflows and real image analysis for IM-driven tasks.
- `v0.12.4` fixed Feishu IM reply routing.
- `v0.12.5` introduced the first working Hybrid CUA feature slice with explicit visual browser execution and persisted visual trace review.

- Issues: https://github.com/LeonGaoHaining/opencowork/issues
- Discussions: https://github.com/LeonGaoHaining/opencowork/discussions
- Website: https://opencowork.me
Apache-2.0. See LICENSE.