screenpipe Computer Agent

An AI agent that controls your computer with OS-level tools. MCP-compatible, works with any model.

We saw a lot of companies building computer-use agents, including well-known ones like Anthropic and OpenAI, as well as Google with Project Mariner, but in our experience all of them work poorly. We were puzzled as to why, so we decided to build our own, and it turned out well. Best of all, we believe it's the fastest agent available, and it's model-agnostic: you can run it with any model, including local, on-premises ones.

We're reviewing inbound requests from medium-sized and large customers who want to join as design partners, with early access and hands-on support.

Features

Application Control

Launch and activate applications using name, bundle ID, or file path.

Use openApplication to seamlessly open and interact with any macOS application.

UI Element Inspection

Deep traversal of application UI elements via the macOS Accessibility API.

Access elements and attributes with traverseAccessibilityTree and filter with onlyVisibleElements.
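
To illustrate what a depth-first traversal with a visibility filter looks like, here is a minimal Python sketch over an in-memory tree. The `Element` class, its fields, and the `traverse` function are hypothetical stand-ins; the actual agent reads elements through the macOS Accessibility API, and only the `onlyVisibleElements` option name comes from the description above. The sketch also assumes that an invisible element is dropped from the output while its children are still visited, which may not match the real behavior.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Element:
    """Hypothetical stand-in for one accessibility element."""
    role: str
    text: str = ""
    visible: bool = True
    children: List["Element"] = field(default_factory=list)


def traverse(root: Element, only_visible_elements: bool = False) -> Iterator[Element]:
    """Yield elements depth-first, optionally skipping invisible ones.

    Assumption: an invisible element is omitted from the output, but its
    children are still visited.
    """
    stack = [root]
    while stack:
        node = stack.pop()
        if node.visible or not only_visible_elements:
            yield node
        # Push children in reverse so they come off the stack left to right.
        stack.extend(reversed(node.children))


# A tiny example tree: a window with a button and a hidden group.
window = Element("AXWindow", "Notes", children=[
    Element("AXButton", "New Note"),
    Element("AXGroup", visible=False, children=[Element("AXStaticText", "Hello")]),
])

visible_roles = [e.role for e in traverse(window, only_visible_elements=True)]
all_roles = [e.role for e in traverse(window)]
```

With the filter on, the hidden `AXGroup` is left out of `visible_roles`, while `all_roles` contains every element in the tree.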

Input Simulation

Execute precise user actions with mouse and keyboard controls.

Simulate clicks, typing, and key combinations with clickMouse, writeText, and pressKey.

UI State Diffing

Track UI changes before and after actions are performed.

Compute TraversalDiff to identify precisely which elements changed after an action.
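
The idea behind diffing two UI snapshots can be sketched with plain set operations. This is an illustration only: the tuple layout `(role, text, x, y)`, the `Diff` type, and `diff_traversals` are assumptions made for the example, and the agent's actual `TraversalDiff` structure may carry different fields.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical element key: (role, text, x, y). The real traversal output may differ.
ElementKey = Tuple[str, str, int, int]


class Diff(NamedTuple):
    added: List[ElementKey]
    removed: List[ElementKey]


def diff_traversals(before: List[ElementKey], after: List[ElementKey]) -> Diff:
    """Compare two snapshots and report elements that appeared or disappeared."""
    before_set, after_set = set(before), set(after)
    return Diff(
        added=sorted(after_set - before_set),
        removed=sorted(before_set - after_set),
    )


# Snapshot taken before an action, and another taken after it.
before = [("AXButton", "Save", 10, 10), ("AXStaticText", "Untitled", 50, 10)]
after = [("AXButton", "Save", 10, 10), ("AXStaticText", "Report", 50, 10)]

diff = diff_traversals(before, after)
```

In this sketch a changed label shows up as one removed element and one added element; a richer diff could pair them up and report the change as a modification instead.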

Visual Feedback

Enhance debugging with visual indicators of system interactions.

Show animations for inputs and highlight UI elements with showAnimation and drawHighlightBoxes.

Error Handling

Robust error reporting for reliable automation.

Receive detailed ActionResult objects with specific error messages for each step of the process.

How It Works

Core Technology

Built natively in Swift, leveraging the macOS Accessibility API (AXUIElement) for standardized access to UI elements across applications.

Input Simulation

Low-level mouse and keyboard events generated via CoreGraphics (CGEvent) for accurate user input simulation.

Architecture

Runs as a stdio server process that speaks the Model Context Protocol (MCP). It can be launched from Claude Desktop, any other MCP client, or any Next.js app, with MacosUseSDK translating commands into system interactions.
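
For a sense of what a client sends to a stdio MCP server, the sketch below builds one `tools/call` request. MCP messages are JSON-RPC 2.0, and the stdio transport sends one JSON message per line. The tool name `openApplication` is borrowed from the feature list above, and the `identifier` argument is a hypothetical parameter name; the server's real tool names and schemas may differ. A real session also starts with an `initialize` handshake, which is omitted here.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize one MCP `tools/call` request as a single line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # json.dumps emits no raw newlines, so appending one frames the message.
    return json.dumps(message) + "\n"


# Hypothetical call: tool name and argument key are assumptions for illustration.
line = build_tool_call(1, "openApplication", {"identifier": "com.apple.Notes"})
```

A client would write `line` to the server process's standard input and read the matching response, identified by the same `id`, from its standard output.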

Use Cases for Engineers

UI Automation Testing

Develop robust end-to-end tests for macOS applications, scripting user flows and asserting application state.

Robotic Process Automation

Automate repetitive, GUI-driven tasks across different applications, even those without scriptable APIs.

Agentic Systems & LLMs

Serve as the execution layer for AI agents, allowing them to perceive and interact with the macOS desktop environment.

Data Extraction & Analysis

Scrape data directly from application UIs and analyze accessibility structures when traditional APIs are unavailable.

As excited as we are?

Dive into the project repository to explore the code, build the agent, and start automating your macOS workflows.