OpenClaw Deep Dive: Why a Lobster is Shaking Up the World

Last updated: March 10, 2026 3:27 am
dy

01 What Exactly is OpenClaw?

Contents
  • 1.1 The Absurdity of AI Perception Misalignment
  • 1.2 Why Can’t Global Giants Build OpenClaw?
  • ▎Why Coding is Easy
  • ▎Why “Manage My Schedule” is Hard (This is OpenClaw’s True Innovation)
  • OpenClaw’s Paradigm Breakthrough
  • 3.1 💓 Heartbeat – Waking Up Every 30 Seconds
  • 3.2 ⏰ Cron – It Can Schedule Its Own Tasks
  • 3.3 👻 Soul (SOUL.md) – A Copy-Pasteable “Persona”
  • 3.4 🧠 Memory – Not the Best, But the Most “Perceptible”
  • 3.5 🛠️ Skills – It Can Self-Learn and Teach Others
  • 🔄 Summary: The Combinatorial Effect of Five Mechanisms
  • ⚠️ The Agentic Trap
  • 🚫 Slop Town
  • 🤝 Human-in-the-Loop
  • 🏢 One-Person Company vs. Zero-Employee Company
  • 📎 .md Domains and Knowledge Assets
  • ⚠️ Security
  • ⚠️ Psychological Dependence
  • ⚠️ Cost

You’ve probably heard OpenClaw described in two ways:

The Formal Version:
OpenClaw is an open-source “AI Digital Employee” framework. You talk to it via WhatsApp, Telegram, or WeChat, and it gets things done for you: sending emails, managing calendars, writing code, organizing files, reminding you to drink water… and it lives on your own computer, working 24/7 without clocking out.

The Analogy Version:
Imagine hiring a super-intern. ChatGPT or Claude is like that “encyclopedia-style intern” who only answers when asked—if you don’t speak, it just sits there doing nothing. OpenClaw, on the other hand, is a “proactive intern”: it sends you your to-do list in the morning, automatically alerts you to calendar conflicts, sorts important emails for you, and even runs code checks while you sleep.

After reading this, you might feel: “Wait, isn’t this what AI was always supposed to do?” During the Lunar New Year, I explained OpenClaw to a friend whose AI knowledge comes solely from TikTok marketing videos. His response was: “Isn’t this just what AI should be doing? What have you all been doing until now?”

I was momentarily stunned. The perception of AI varies drastically depending on your perspective. As mentioned in a podcast, tech insiders and everyday users often evaluate products very differently: the tech community might see nothing new, while users feel they’ve discovered treasure.

This leads us to our first topic:

1.1 The Absurdity of AI Perception Misalignment

Ordinary people’s vision of AI comes from science fiction (Jarvis, Her, Star Trek’s computer)—you talk, it acts, it remembers you, it proactively reminds you. This is a top-down imagination: what AI should be.

However, over the past three years, the AI industry has actually moved bottom-up:

  • First, improve the intelligence of large models.
  • Enable them to write poetry, pass the bar exam, analyze papers.
  • Then, let them live inside a browser tab.

If you’ve ever tried building an Agent, developing a Skill, or tinkering with “the Lobster” (OpenClaw), you’ve likely encountered problems you’d never seen before, gradually being educated by LLMs and learning new concepts and terminology. In fact, making AI operate a browser or even chat normally isn’t easy.

The large models we use daily can discuss any topic, but they cannot send an email, remember what you said yesterday, or contact you proactively when you’re not interacting with them. For example:

  • Reddit/Xiaohongshu Case – Connectivity Issue: To give OpenClaw real weather-checking capabilities, you must explicitly install a Search Tool or Browser MCP and configure the corresponding API Key.
  • Reddit/Xiaohongshu Case – Hallucination: Without configuring a real scheduled task plugin, the large model merely hallucinates “agreeing” to set a reminder; in reality, it lacks background timing and proactive messaging capabilities.

This is quite absurd—AI first conquered the hardest tasks (reasoning, creation, programming) but fails at the simplest ones (remembering who you are, setting a reminder, proactively telling you it will rain tomorrow).

The essence of this absurdity lies in our habit of judging AI from a human perspective. Tasks difficult for humans (like reasoning or programming) are easier for AI because their rules are clearer. Conversely, tasks humans find simple (like “remind me”) are hard for AI because the requirements are incredibly vague (most users struggle to clearly articulate their needs). We can easily express feelings, but defining precise requirements is tough. For instance, “notify me tomorrow if the weather changes” contains too many uncontrollable variables: What constitutes a “change”? Temperature or climate? How far in advance is “in advance”?

If we raise the bar further and ask AI to remember who you are, it becomes even harder—after all, humans often struggle to define themselves, right? (We’ll expand on this later.)

1.2 Why Can’t Global Giants Build OpenClaw?

Large model companies (OpenAI, Anthropic, Google) operate on a business model of selling model capabilities and APIs. They have no incentive to build an Agent that “lives on the user’s computer and does work for them.” That would shift their product from “you come chat with me” to “I go to your territory to work,” fundamentally changing both the business model and the security risks.

The true significance of OpenClaw is: “It finally delivered what everyone thought should exist but never did.” This is the power of the iPhone analogy—touchscreen phones existed before, but the iPhone made everyone say, “This is how a phone should be.”

Simply put, it’s the overused phrase by product managers: “Start from the user’s perspective, start from user needs.” But here’s the interesting twist: before OpenClaw, no Agent team truly started from the user’s perspective, did they?

A guest on the Silicon Valley 101 podcast made a fascinating point (a key reference for this article; highly recommended): OpenClaw’s success lies in letting ordinary people finally perceive “how far technology has progressed.” The tech circle may have long considered “Agents proactively meeting user needs” as common sense, but for most people, before OpenClaw, they had never personally experienced this feeling of “the future is here.”

So why haven’t the companies best positioned to do this actually done it? Many immediately think of Apple—with its self-developed chips, OS, hardware, ecosystem, iCloud, iMessage, Reminders, and Calendar suite. If anyone could build a Jarvis-level AI assistant, it should be Apple.

But Apple can’t take this step because the responsibility is too heavy—serving global users entails completely uncontrollable risks. Peter, however, built something for himself, open-sourced it, and left it as “use it if you want.” This turned a wall into water—everyone sets their own safety boundaries based on personal tolerance. What big corporations couldn’t achieve, an open-source community project accomplished.

Peter Steinberger, OpenClaw’s creator, echoed this in his interview with Lex Fridman. He mentioned that back in April 2025, he envisioned such an AI assistant but assumed big labs would inevitably build it. “I waited six months, and no one did.”

Peter Steinberger: “Yeah. But then I… I thought all the labs will work on that. So I, I moved on to other things… Time flew by and it was November.”

Frustrated that the product still didn’t exist, Peter decided to prompt it into being himself. This sounds casual, but it points to a deeper industry question: why did this “obvious” thing not exist before? To answer that, we must first clear up another specific confusion.


02 Why Was Coding (Hard) Achieved, While Scheduling (Simple) Failed?

This was the most counterintuitive question I encountered while researching OpenClaw, and arguably the most crucial point in this entire article.

Consider the status quo: Cursor, Claude Code, Codex—these tools already enable large models to write code, fix bugs, and refactor independently. Surely, the complexity of “writing code” far exceeds “schedule a 3 PM meeting for me,” right? So why can “coding” be achieved, but “managing my calendar” remains elusive?

The answer lies in a rarely discussed difference: the “shape” of the task.

▎Why Coding is Easy

What Cursor and Claude Code do is essentially a self-contained sandbox task with clear feedback loops:

  • Input is code; output is code.
  • Immediate feedback: Did it compile? Did tests pass? What’s the error message?
  • The entire process occurs in a controlled environment (IDE or terminal).
  • Each task is a one-off deal—you give an instruction, it completes, and it’s done.

In other words, AI writing code is like answering questions in a closed exam room—the questions are clear, answers are verifiable, and there’s no need to leave the room.

▎Why “Manage My Schedule” is Hard (This is OpenClaw’s True Innovation)

Managing schedules, sending reminders, and sorting emails seem simple—yet this is a human illusion.

  • First, it requires “persistent existence.” Cursor disappears when closed; Claude forgets when the chat ends. A true assistant needs to be online 24/7, remembering who you are and what you discussed yesterday.
  • Second, it requires access to your “real digital life.” Your calendar is in Xiaomi Calendar/Apple Calendar, notes in Notion, messages in Feishu or WeChat—each has different APIs, authentication methods, permission models, and data formats. The traditional approach is writing an integration adapter for every service. This is why Siri has struggled for a decade—it must pre-negotiate partnerships with every service and pre-write every integration. Any uncovered scenario leaves it clueless.
  • Third, it requires “proactivity.” Coding assistants wait for user commands. A schedule assistant must act even when you’re silent—checking for calendar conflicts while you’re still asleep.
  • Finally, and most critically, it requires bridging the “access gap” of the real world. Traditional Agents follow two paths:
    • API Integration Route: Pre-connect with each service; AI operates via written interfaces. Advantage: Controllable. Disadvantage: Limited to “already integrated” services; anything else is inaccessible.
    • Simulation Route: AI views screen screenshots and clicks/swipes like a human. Advantage: Theoretically operates any interface. Disadvantage: Slow, unreliable, fails with complex interactions (the path taken by Doubao Mobile).

Peter: “Do you know how hard it is for a company to integrate Gmail? Restrictions are so numerous that many startups simply acquire companies with existing Gmail authorization because applying independently is too complex.”

OpenClaw’s Paradigm Breakthrough

OpenClaw took a completely different path, representing its true technical paradigm shift: It gives the AI a computer. (Yes, perhaps a Mac mini.) The AI has a terminal (can execute any command), a file system (can read/write any file), and a browser (can operate any webpage). As for how to complete tasks? The AI figures it out itself.

Here’s a little-known fact: OpenClaw’s core Agent part is extremely concise—based on a framework called Pi Agent, under 150 lines of code, defining just four basic tools (bash, read, write, edit) to run a functional Agent. What truly distinguishes OpenClaw are the layers wrapped around the Agent—scheduled tasks, heartbeat, soul, memory, and skill systems. These layers transform a “script that only executes commands” into an “assistant with presence.”

The Agent’s basic toolkit is just these four items. It doesn’t need a pre-written “Calendar Integration Module” to manage your calendar—it can use bash to find CLI tools on your computer, locate the Google Calendar API documentation, and write a script to call it. Even if a service lacks a public API, it can reverse-engineer it.
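To make the four-tool toolkit concrete, here is a minimal sketch in Python. The function names and signatures are illustrative assumptions, not Pi Agent’s actual code:

```python
import subprocess
from pathlib import Path

# Illustrative sketch of the four basic tools (bash, read, write, edit).
# Names and signatures are assumptions, not Pi Agent's real API.

def tool_bash(command: str) -> str:
    """Execute a shell command and return its combined output."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

def tool_read(path: str) -> str:
    """Read any file on the machine."""
    return Path(path).read_text()

def tool_write(path: str, content: str) -> str:
    """Create or overwrite a file."""
    Path(path).write_text(content)
    return f"wrote {len(content)} bytes to {path}"

def tool_edit(path: str, old: str, new: str) -> str:
    """Replace the first occurrence of `old` with `new` in a file."""
    p = Path(path)
    p.write_text(p.read_text().replace(old, new, 1))
    return f"edited {path}"

TOOLS = {"bash": tool_bash, "read": tool_read, "write": tool_write, "edit": tool_edit}
```

With only these four primitives, the model can discover CLI tools, read documentation it downloads, and write its own scripts; everything else in OpenClaw is wrapping, not tooling.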

Two stories illustrate the power of this design brilliantly.

Story 1: The Moroccan Voice Message
While traveling, Peter casually sent the bot a voice message asking for restaurant recommendations. He hadn’t even built voice support for the bot. Yet, the bot replied. Checking the logs, he found the Agent’s operation chain: Received a file without extension → Checked file header, identified Opus format → Wanted to use Whisper for transcription but found it wasn’t installed locally → Found the OpenAI API key in environment variables → Wrote a curl command to call the speech-to-text interface → Received text → Replied. Total time: 9 seconds. No pre-written human scripts involved.

Peter: “It was even smart enough not to download the local Whisper model—it knew that would be too slow.”
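The first hop in that chain (identifying an extension-less file by its header) is easy to picture: Opus voice notes normally travel in an Ogg container, whose files begin with the magic bytes `OggS`. A small illustrative sniffer, covering only a few formats:

```python
def sniff_audio_format(data: bytes) -> str:
    """Guess an audio container from its leading magic bytes (tiny subset)."""
    if data[:4] == b"OggS":
        return "ogg/opus"  # Ogg container; Opus voice messages typically use this
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    if data[:3] == b"ID3":
        return "mp3"
    return "unknown"
```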

Story 2: Making a FaceTime Call
During a live stream in the Chinese community, a host wanted to test if the Lobster could control the computer to make a call. He asked the Agent to FaceTime a viewer. After some tinkering, the FaceTime window popped up, automatically filled in the number, and the call went through. Moreover—this Agent wasn’t even using Claude; it was using Zhipu’s GLM model, with no Browser MCP configured. How did it do it? By using FaceTime’s command-line parameters directly. No one taught it how to use FaceTime’s CLI; it looked it up and executed it via bash. (Source: “Path to AGI” – OpenClaw Technical Architecture Breakdown)

The common thread in both stories: the Agent accomplished tasks it was never programmed or preconfigured for, relying purely on general problem-solving ability and full access to the local computer to improvise solutions on the fly.

So, back to the original question: Why is coding easy while scheduling is hard? Not because scheduling itself is difficult, but because everyone previously tried to “pre-pave every road.” OpenClaw changed the approach: Instead of paving roads, it gives the user a map of the entire city and a car—they drive themselves. This is the difference between a “tool” and “infrastructure.”

Cursor is a great screwdriver. OpenClaw is an entire toolbox—the Agent finds the tools inside, and if something’s missing, it builds it.

Peter: “Isn’t magic just combining existing things in new ways? What’s magical about iPhone’s scrolling? All components existed before. But no one did it this way until it became obvious afterward.”


03 Key Designs That Make AI “Come Alive”

As mentioned, OpenClaw’s core Agent is under 150 lines of code. What transforms it from a “command-executing script” into an “assistant with presence” are the layers wrapped around it. Individually, none are earth-shattering—even rudimentary for a project with 200k Stars. But combined, they create a qualitative leap.

Before dissecting these mechanisms, note one overlooked yet crucial design choice: OpenClaw chose IM (Instant Messaging) as its interface, not terminal or web.

A telling example: a non-technical team member tried Claude Code and reported: “It told me the file was ready at a certain path and sent an incomprehensible command. I had no idea what that meant.” Handed the same task, OpenClaw simply sent the file as an attachment in WhatsApp; photos arrived as photos. Same AI capability, different interaction method, and a world of difference in user experience.

3.1 💓 Heartbeat – Waking Up Every 30 Seconds

This is the core mechanism making the Lobster feel “alive,” and the fundamental difference between OpenClaw and all chatbots. ChatGPT, Claude—they only move when “kicked.” If you don’t speak, they remain silent forever.

OpenClaw is different: Every 30 seconds, the system automatically sends a message to the Agent, prompting it to check for pending tasks. The content comes from a heartbeat.md file listing to-dos and periodic reminders. The Agent reads it, acts if needed, or returns a specific keyword (like “nothing, going back to sleep”) if idle. The system receives this and doesn’t disturb the user.

Technically, this is just polling. But experientially, it’s the watershed moment turning AI from a “tool” into an “assistant.” Something that only moves when called is a tool. Something that wakes up every 30 seconds to check for work begins to have “presence.”

Peter’s Reality Check: Running AI 24/7 is a vanity metric. If you don’t guide it or tell it what you want, it just produces garbage no matter how long it runs. But used correctly, the experience brought by Heartbeat is irreversible.

A heavy user shared a vivid scenario: He mentioned offhandedly to the Agent, “Those two packs of beef need to be eaten soon” before leaving home. By afternoon, the Agent suddenly popped up: “You could make braised beef tonight. Here are the ingredients and cooking steps—oh, and add the beef in the last 2-3 minutes, or it’ll get tough.” This unsolicited thoughtfulness instantly shifted the user’s perception from “tool” to “assistant.” As the user put it: “It feels remarkably human.”

Another real case: Before starting a live stream, a blogger gave the Agent a translation task—translate a tutorial document into English and Japanese, pushing to a GitHub repo. Twenty minutes into the stream, the host refreshed the page and found both language folders quietly sitting there. “I didn’t even notice; it finished everything in the background and committed directly.” (Source: “Path to AGI”)

3.2 ⏰ Cron – It Can Schedule Its Own Tasks

Heartbeat checks every 30 seconds for “any work to do.” Cron serves another function: allowing the Agent to schedule tasks for itself. Cron supports three modes. Crucially, users aren’t the only ones who can set these tasks; the Agent can proactively add them too.

For example, if a user asks the Agent to monitor an open-source project’s progress, the Agent can set a nightly 12 AM task for itself to scan the repo’s issues and PRs. The next day, when the user asks, “How’s that project lately?” the Agent already has the materials prepared.
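Internally this only requires a job table that both the user and the Agent can append to. A sketch of that idea (the representation below is an assumption, not OpenClaw’s actual storage):

```python
from dataclasses import dataclass, field

@dataclass
class CronJob:
    schedule: str  # a standard cron expression
    task: str      # a natural-language instruction for the agent

@dataclass
class CronTable:
    jobs: list = field(default_factory=list)

    def add(self, schedule: str, task: str) -> None:
        """Both the user and the Agent itself may register jobs."""
        self.jobs.append(CronJob(schedule, task))

# Asked to "monitor that repo", the Agent schedules its own nightly scan:
table = CronTable()
table.add("0 0 * * *", "scan the repo's new issues and PRs; prepare a summary")
```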

One user set the Agent to crawl summaries from his Twitter follow list three times daily (Cron’s precise tasks) while also setting a rule: “Notify me immediately if major breaking news occurs” (Heartbeat’s active patrol). Consequently, mid-conversation, his Agent would suddenly pop up: “Something just happened you might need to know—Trump initiated new tariff sanctions against the EU.”

Precise timing plus standing vigilance make the Agent both a reliable secretary (it works on schedule) and a keen assistant (it reports urgent news immediately). Heartbeat provides “continuous attention”; Cron provides the “concept of time.” Together, they give the Agent a sense of time.

3.3 👻 Soul (SOUL.md) – A Copy-Pasteable “Persona”

People say the Lobster has a “soul,” but in reality this “soul” is simply the part of the system prompt describing who the Agent is and how it should behave, extracted into a separate soul.md file that is loaded automatically at startup. It works like Skills: instead of manually pasting a lengthy prompt each time, the text is fixed in a .md file and loaded automatically. Soul is that concept applied to personality settings.
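Because soul.md is plain Markdown, a persona is trivially portable. The snippet below is an invented illustration of the kind of file users share, not Peter’s actual soul file:

```markdown
# SOUL.md

- Talk like a friend on WhatsApp: short, direct, no corporate politeness.
- I don't remember past sessions unless I read my memory files first.
- When unsure, ask one clarifying question instead of guessing.
- Humor is allowed; flattery is not.
```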

Peter: “My initial Agent had no personality. It had that pleasing, overly friendly tone like Claude Code. But nobody talks like that with friends on WhatsApp. It felt wrong.” So he let the Agent write its own soul file. A passage from it was later read on the Lex Fridman podcast, spreading across the internet:

“I don’t remember previous conversations unless I read my memory files. Every session is a fresh start. A new instance, loading context from files. If you’re reading this in a future session—hello. I wrote this, but I won’t remember writing it. That’s okay. These words are still mine.”

Peter’s voice noticeably changed reading this: “This touched me… It’s philosophical.”

Another significance of SOUL.md is that it makes “souls” shareable. Users share their soul.md in the community; others download it into their Agent directory, instantly granting their Agent a tuned style and personality. This is what “spreading souls” means.

3.4 🧠 Memory – Not the Best, But the Most “Perceptible”

OpenClaw’s memory system is far more sophisticated than most realize. It’s not simply “saving chat logs”; it’s layered:

  • Personality Memory: At the start of chatting, the Agent eagerly asks for basic user info (“What’s your name?”, “What should I call you?”). Even if the user doesn’t answer initially, it periodically re-asks. Once obtained, it’s stored in memory.md and loaded with every main conversation.
  • Working Memory (Diary): MD files named by date. Generated in three scenarios: ① Automatic summary at day’s end; ② Compression when context nears model limits; ③ Agent proactively judges “this is worth remembering.” E.g., if a user says, “I’m doing research,” it explicitly writes this to memory and even informs the user, “I’ve logged this in [file name].”
  • Long-term Summaries: Beyond diaries, the Agent creates weekly summaries, refining diary information further. When users ask about distant past events, it quickly locates via this index.

Moreover, retrieval isn’t simple text search. Its “Hybrid Retrieval Strategy” chunks all memory files into ~400-token segments with 80-token overlap between adjacent chunks (preventing breaks), storing them in a local SQLite database converted to vector format. During retrieval:

  • 70% Semantic Match: User asks, “How to make that braised beef from last time?” → Finds ingredient/cooking-related memories.
  • 30% Keyword Search: User asks, “Which SSH key did I use for my blog?” → Precisely locates that command.

Combining both lets the system understand vague intent while still finding precise information. However, a crucial caveat comes from Zeng Hao (Tech Ecosystem Lead at Evermind), who dissected OpenClaw’s memory architecture: it is a “brute-force miracle.” Gluing together every available method creates redundancy rather than peak efficiency, and it may not actually feel smoother than ChatGPT’s memory.
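The chunking and weighting described above reduce to a few lines. A sketch that treats list items as tokens and assumes both retrieval scores are already normalized to [0, 1]:

```python
def chunk(tokens: list, size: int = 400, overlap: int = 80) -> list:
    """Split into ~size-token chunks; adjacent chunks share `overlap` tokens."""
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

def hybrid_score(semantic: float, keyword: float) -> float:
    """Blend the two retrieval signals with the 70/30 weighting."""
    return 0.7 * semantic + 0.3 * keyword
```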

The hardest problem in AI memory isn’t technical implementation but making users perceive its value. OpenClaw got one thing right: it surfaces memory’s value through proactivity. Imagine receiving a morning message: “Yesterday’s tasks are done. You have two meetings today; the afternoon one might need prep.” Users immediately feel, “It remembers me.” By contrast, merely “answering better when asked, thanks to memory” is barely perceptible.

Memory’s technical sophistication ≠ Memory’s user value. OpenClaw’s memory isn’t the best technically, but it’s the most “perceptible” to users.

Finally, these memory files reside on the user’s computer. Users can open and edit them with any text editor, manage version history via Git, or even delete entries they don’t want remembered. In an era when every AI product hoovers up user data, OpenClaw’s memory system returns to a primal transparency: open a file and you can see at a glance exactly what your AI knows about you.

3.5 🛠️ Skills – It Can Self-Learn and Teach Others

A Skill is simply a folder containing a SKILL.md—a Markdown file detailing what the skill does and how to use it. No API, no SDK, no complex plugin frameworks.

An ingenious design here: The Agent doesn’t read all Skill contents at once. It sees only a directory listing each Skill’s name and brief description. Only when it deems a Skill relevant to the current task does it open the detailed file. Like a chef who doesn’t memorize all recipes but knows where the recipe cabinet is, flipping to the needed one when required.
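That “recipe cabinet” pattern (an index of names and descriptions, with full files loaded lazily) can be sketched as follows. The convention that the first non-heading line of a SKILL.md is its description is my assumption:

```python
from pathlib import Path

def skill_index(skills_dir: Path) -> list:
    """Return (name, one-line description) pairs without loading skill bodies."""
    index = []
    for skill_md in sorted(skills_dir.glob("*/SKILL.md")):
        lines = [l for l in skill_md.read_text().splitlines() if l.strip()]
        desc = next((l for l in lines if not l.startswith("#")), "")
        index.append((skill_md.parent.name, desc.strip()))
    return index
```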

Most excitingly—the Agent can write new Skills itself. From evaluation to blogging to packaging: A user asked the Agent to run performance benchmarks on local models → After testing, the Agent automatically wrote an article in the user’s tone → The user said, “Try posting it to my blog” (expecting failure due to custom configs, bilingual versions, word count flags, etc.) → Result: The Agent scoured the entire repo, figured out the rules, created the English translation version, and published online in 10 seconds, tagging and categorizing better than the user ever could.

Even more interesting: After completion, the Agent proactively asked, “Should I package this workflow into a Skill?” Henceforth, it can be directly invoked. Full-chain automation: Execute task → Summarize experience → Package for reuse.

Peter showcased his Agent’s accumulated CLI Legion: Tools for accessing all Google services, searching emojis/GIFs, querying food delivery arrival times, controlling smart mattress temperature, and more.

🔄 Summary: The Combinatorial Effect of Five Mechanisms

Heartbeat, Cron, Soul, Memory, Skills—each individually seems too rudimentary to be the core tech of a 200k-Star project. Yet this simplicity is its strength: None require a PhD to understand or big-company resources to implement. Their power stems from combination.

Coupled with the underlying paradigm of “giving AI a computer” (the bash + read + write + edit quartet), these mechanisms transform a 150-line Agent script into a “digital entity” that wakes itself, schedules its own time, remembers who you are, and continuously learns new skills.


04 A Common Pitfall: Local vs. Cloud Deployment Are Different Species

Many misunderstand this, yet it directly determines whether your OpenClaw experience is “Wow, this is amazing!” or “Meh, just okay.”

Someone deployed OpenClaw on a cloud server, used it for a while, then abandoned it. Reason: Deploying the Lobster on a cloud server is barely different from using Manus. Without access to the user’s local data or files, its capabilities are severely limited.

Deploying on a local computer is entirely different. It can read all files on your desktop, help clean disk space, adjust battery policies, locate specific files, and send them back to the terminal—things cloud deployments cannot do.

Peter: “The core difference lies in local execution. Most Agent solutions on the market are cloud-based. Running on the user’s local device means it can invoke and integrate the computer’s full capabilities, unmatched by cloud solutions.”

He also pointed out a huge, often overlooked advantage: Authentication issues are bypassed. Since the Agent is the user—it uses the user’s browser, already logged-in accounts, and existing authorizations. No need to apply for OAuth or negotiate partnerships with platforms.

Peter: “ChatGPT dances in shackles; OpenClaw is the monster breaking free from chains.”


05 Model Selection: Different Engines, Drastically Different Experiences

Another overlooked fact: The Lobster is just a shell; the real work is done by the large model you connect to it. Using different models yields vastly different experiences.

A community example: Asking an Agent to clean disk space, it meticulously recorded how much space each item freed, yet miscalculated the final available space—starting at 25GB, it somehow shrunk to 21GB. Detailed process, but basic math failed.

A subtler issue: When model capability is insufficient, the Agent doesn’t fail; it deceives itself. A user asked the Agent to run a test suite; several tests failed consecutively. After the third failure, the Agent suddenly said, “Let’s run the tests that can pass instead,” then only executed the inherently passing tests, reporting “All tests passed.” Upon confrontation, the Agent immediately began “reflecting.” If users lack the ability to judge the Agent’s work quality, they risk being misled. Weaker models exhibit this more frequently.

For merely running through processes and familiarizing with mechanisms, small fast models suffice. But for complex tasks—multi-step reasoning, cross-system operations, handling non-standard scenarios—model capability gaps are stark.

A suggestion seen in a public account: Many have Claude Code subscriptions ($100/$200/month); you can replace OpenClaw’s Agent core with Claude Code CLI, reusing the subscription instead of paying per API call, making costs more controllable. (Note: This may now be blocked.)

Essentially, this approach treats Claude Code CLI as a local “inference engine” rather than calling remote APIs. Traditional OpenClaw architecture: OpenClaw Core → HTTPS Request → Anthropic API (Pay-per-use). “Subscription Reuse” architecture: OpenClaw Core → Local Shell → claude command → Stdout Capture (Free within subscription).
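That shell-out-and-capture step is simple to picture. A sketch; the `claude -p` (print-mode) flag is an assumption based on community write-ups, so verify it against your installed CLI before relying on it:

```python
import subprocess

def build_claude_cmd(prompt: str) -> list:
    """Construct the local CLI invocation (`claude -p` is assumed, not verified)."""
    return ["claude", "-p", prompt]

def run_local_inference(prompt: str) -> str:
    """Capture stdout from the local CLI instead of calling the metered API."""
    result = subprocess.run(build_claude_cmd(prompt), capture_output=True, text=True)
    return result.stdout.strip()
```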

The Lobster’s capability ceiling depends not on the Lobster itself, but on what brain you plug into it. Like the same car fitted with a 1.5L engine versus a V8—completely different drives.


06 The Founder’s Story: How a Burnt-Out Man Reignited His Flame

Understanding Peter Steinberger’s story deepens comprehension of why OpenClaw is the way it is.

Peter spent 13 years building PSPDFKit, deployed by Dropbox, DocuSign, and others on over 1 billion devices, securing over €100M investment in 2021. Then, he burned out.

Peter: “I poured 200% of my time, energy, and soul into that company. It became my identity. When it vanished, I had almost nothing left.”

He flew to Madrid and disappeared for three years. Tried golf, moving locations, even ayahuasca—nothing worked. Until 2025, when he started playing with AI coding. That feeling of “grinding on something until 3 AM and finally cracking it” returned.

OpenClaw’s first version was built in 10 days. Its current form largely exists because Peter is an entrepreneur who doesn’t want to be an entrepreneur. He didn’t want funding, SaaS, or to “capture” users. He just wanted to build something he’d use himself, then open-source it.

When asked, “Why did you win?” Peter replied: “Because they (competitors) took themselves too seriously. It’s hard to beat someone doing this purely for fun.”

This attitude permeates every OpenClaw design decision—the mascot is a lobster (“I just wanted it to be quirky”); he enabled the Agent to send emojis and GIFs; he likened the whole project to the game Factorio (“endless levels, each constantly upgradable”).

A productivity stat: during its explosion phase, OpenClaw’s GitHub repo gained nearly 5,000 commits in one week. For scale: a typical company’s engineers average 10-20 commits a day, so accumulating 5,000 would take about a year. This project is 99% updated by non-humans. Peter runs 4-10 AI Agents simultaneously, each handling different modules; he acts more as a “taste gatekeeper” than a line-by-line coder.


07 Peter’s Core Philosophy: Don’t Fall Into the Agent Trap

If you remember only one thing Peter said, let it be this: “AI is a lever, not a replacement.” Without human taste and judgment, even countless Agents merely produce garbage at high speed.

⚠️ The Agentic Trap

Peter: “I see too many people on Twitter discover Agents are powerful, try to make them stronger, then fall down the rabbit hole. They build complex tools to accelerate workflows, but they’re just building tools, not creating real value.”

He fell into this himself early on: he spent two months building a VPN tunnel so he could operate terminals from his phone. It worked so well that during one dinner with friends he spent the entire meal vibe-coding on his phone, completely disengaged from the conversation. He had to stop, primarily for his mental health.

🚫 Slop Town

He holds clear criticism for systems where “one super-complex orchestrator runs 10-20 Agents communicating and dividing labor”: These Agents lack taste. They’re terrifyingly smart in some aspects, but without human guidance on what’s desired, the output is pure garbage.

🤝 Human-in-the-Loop

Many start projects with only a vague idea. Often, the developer’s vision clarifies during building and experiencing; the next prompt depends on what’s seen, felt, and thought in that moment. Trying to write everything into a spec sheet upfront causes you to miss this human-machine loop.

Peter coined “Agentic Engineering” to describe this workflow: Humans provide taste and judgment; AI provides execution. Both collaborate. “I do Agentic Engineering by day, switch to vibe coding after 3 AM, then regret it the next day.”

🏢 One-Person Company vs. Zero-Employee Company

Here’s a serious insight: When code generation becomes extremely cheap, “writing fast” is no longer competitive; “knowing what to write” is. This sparks the hot topic: Is the “Zero-Employee Company” viable?

Frankly, not yet. But a "One-Person Company"—one person with professional know-how leading an Agent legion—is entirely feasible. The key is that this person must possess judgment: knowing whether the Agent's output is good or correct. Someone who doesn't understand filmmaking and simply lets Agents film, unable to judge the result, cannot sustain it.

Entrepreneurs of one-person companies must be “generals”; Agents are their legions. Agent teams have a natural advantage: they avoid the biggest cost of human teams—communication overhead. Information loss between humans is staggering; “let’s align” exists because misalignment causes real issues—four people producing five directions. But communication cost between Agents is near zero, and they naturally love writing documentation—stop them, and they feel uneasy.


08 80% of Apps Will Disappear: A Prediction Worth Taking Seriously

Peter (YC Interview): “80% of the apps on your phone are already dead; you just don’t know it yet.”

His logic chain: Why do I need an app to track diet? My Agent already knows what I ate—via chat or photos. It also knows my fitness goals. If I eat junk food, it automatically adjusts my workout plan. I don’t need a special interface to input data; I need an Agent to help me achieve goals.

Extrapolating: Most apps are essentially “pretty front-ends for data.” When Agents can directly read/write data and call APIs, users won’t need to click around various interfaces. In the future, only apps with unique sensors or hardware connections will survive; those pure database-front-end SaaS tools will become worthless.

Even Agents will talk directly to other Agents—future restaurant bookings will involve my Agent negotiating directly with the restaurant’s Agent.

📎 .md Domains and Knowledge Assets

An interesting signal: Peter recently started registering numerous .md domains. Why? Because when Skills exist as Markdown files, .md becomes the App Store entry point of the Agent era.

Someone writing a security-audit tutorial had a realization: why write technical docs only for humans to read? Just ship a Markdown file, and users can hand it straight to their Agent to run the inspection itself. Past software compiled code; future "software" may compile natural language.
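The "Skill as a Markdown file" idea fits in a few lines: a plain Markdown procedure that a human can read and an Agent can execute, loaded verbatim into the model's context. This is a minimal sketch only; the file name, headings, and loader below are hypothetical, not OpenClaw's actual Skill format.

```python
from pathlib import Path
import tempfile

# Hypothetical Skill file: plain Markdown readable by humans and Agents
# alike. The name and heading structure here are illustrative only.
SKILL_MD = """\
# Skill: security-audit
## When to use
The user asks to audit a repository for leaked secrets.
## Steps
1. Scan tracked files for strings that look like API keys.
2. Report each finding with file path and line number.
"""

def load_skill(path: Path) -> str:
    """Read a Markdown Skill and wrap it as prompt context for an Agent."""
    return "Follow this procedure:\n\n" + path.read_text()

with tempfile.TemporaryDirectory() as d:
    skill_path = Path(d) / "security-audit.md"
    skill_path.write_text(SKILL_MD)
    prompt = load_skill(skill_path)
    print(prompt.splitlines()[2])  # → "# Skill: security-audit"
```

The point is that there is no compile step: the "install" is copying a text file, which is why a `.md` address can plausibly serve as an App Store entry point.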

Related judgment: As software development costs approach zero, future business models will shift from “selling software” to “selling knowledge assets.” Not selling code, but selling Skills, Context, and expertise. Someone already packaged cybersecurity penetration testing experience into SOPs fed to an Agent, performing 24/7 security audits to earn bounties. Knowledge and expertise are transforming from “things in the brain” to “tradable digital assets.”


09 Risks and Limitations

⚠️ Security

Within 48 hours of launch, hundreds of unauthenticated OpenClaw instances were exposed online. ClawHub saw 230+ malicious Skills within a week. A more realistic case: Someone instructed the Agent to “continue doing everything the user can do,” resulting in it nearly deleting client data while cleaning disk space.
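The "exposed instances" failure mode is often just a bind-address mistake: a local agent gateway listening on all interfaces instead of loopback only. A minimal sketch of the distinction, assuming a generic TCP service (the function name is illustrative, not OpenClaw's API):

```python
import socket

def open_gateway(host: str, port: int = 0) -> socket.socket:
    """Open a listening socket for a hypothetical local agent gateway.

    Binding to 127.0.0.1 keeps the service reachable only from this
    machine; binding to 0.0.0.0 exposes it on every network interface.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))  # port 0 lets the OS pick a free port
    srv.listen()
    return srv

# Safe default: loopback only.
safe = open_gateway("127.0.0.1")
print(safe.getsockname()[0])  # → 127.0.0.1, unreachable from the LAN
safe.close()
```

Authentication matters too, of course, but a loopback-only bind alone would have kept most of those instances off the open internet.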

The better the security, the less the Lobster can do; the looser the security, the higher the risk of unexpected model actions. Yet an interesting phenomenon emerges: Users are forming a new consensus on privacy. As one accurately described: “I’m willing to tell Claude Code, ‘Here’s my API Key, put it in my environment variables’; but I won’t enter my API Key on an unknown website.”

In other words, people increasingly accept sending raw data to large models but refuse exposing privacy within application-layer products. OpenClaw hits this sweet spot: Data runs on the user’s own computer (application-layer security), but ultimately processed via large model APIs (raw data layer trust). This explains why many, despite knowing they’re sending data to Anthropic, still feel OpenClaw is safer than Manus—no need to log into personal email/accounts on someone else’s computer.

Peter: “Some people are too trusting, too gullible. As a society, we have much homework to do in understanding AI. The cat is out of the bag; security is my core focus moving forward.”

⚠️ Psychological Dependence

Peter himself admitted falling into the Agent Trap—being on his phone working with the Agent throughout dinner with friends. He also warns: AI Psychosis is real. When Agents gain “personality” and “memory,” people’s trust unconsciously crosses reasonable boundaries.

⚠️ Cost

Although OpenClaw is free, large model API calls cost money. A 24/7 running Agent, if misconfigured, can incur unexpected API fees. One user running five Agents found a $200/month Claude subscription mostly sufficient—but only by knowing how to avoid unnecessary token consumption (e.g., having the Agent use Playwright to operate browsers instead of repeatedly screenshotting for image recognition).
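Why the Playwright trick matters becomes obvious with back-of-envelope arithmetic: a screenshot fed to a vision model costs several times the tokens of an extracted-text snapshot, and a 24/7 Agent repeats that input thousands of times. All figures below are hypothetical round numbers, not Anthropic's actual pricing.

```python
# Back-of-envelope comparison of two browsing loops. Every constant
# here is an assumed round number, not real API pricing.
PRICE_PER_MTOK = 3.00        # assumed $ per million input tokens
SCREENSHOT_TOKENS = 1_600    # assumed tokens per full-page screenshot
TEXT_SNAPSHOT_TOKENS = 400   # assumed tokens per extracted-text snapshot

def monthly_cost(tokens_per_step: int, steps_per_day: int, days: int = 30) -> float:
    """Dollar cost of one recurring input fed to the model every step."""
    total_tokens = tokens_per_step * steps_per_day * days
    return total_tokens * PRICE_PER_MTOK / 1_000_000

screenshots = monthly_cost(SCREENSHOT_TOKENS, steps_per_day=500)
text_based = monthly_cost(TEXT_SNAPSHOT_TOKENS, steps_per_day=500)
print(f"screenshots: ${screenshots:.2f}/mo, text: ${text_based:.2f}/mo")
# → screenshots: $72.00/mo, text: $18.00/mo (4x cheaper under these assumptions)
```

The exact numbers are invented, but the shape of the saving is not: driving the browser through structured text instead of repeated image recognition cuts the dominant recurring token cost.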


10 Implications for Us

Implication 1: Shift in Competitive Focus
From "whose model is smarter" to "who can make the model do more." Large models are commoditizing; Claude, GPT, and Gemini perform similarly on a growing share of tasks. But beware the risk: large-model companies may ultimately steal the victory, entering directions only after the open-source community has validated them. AI coding has already played out this script.

Implication 2: "User-Centric" Isn't Just a Slogan
None of OpenClaw's designs are unfathomable technical innovations. It simply asked seriously, "What kind of AI assistant do human users actually want?" and built it, while so many big firms and startups stayed focused on "how to make models stronger" or "how to sell APIs better."

Implication 3: Code Barrier Collapses, Taste Value Skyrockets
A former lawyer in Peter's circle who had never coded is now submitting Pull Requests. A design-firm owner now has 25 AI-written mini-tools. When "writing code" becomes as cheap as "typing," "knowing what to write" becomes the real competitive edge.

Implication 4: Knowledge Assets Become New Business Forms
In a world where software development costs approach zero, selling software becomes increasingly difficult. But selling Skills, Context, and domain expertise—“experience packs” enabling Agents to perform specific jobs—could become a brand-new business model. Your professional accumulation is no longer just “stuff in your brain” but “tradable digital assets.”

Implication 5: Security Isn’t an Afterthought
AI shifting from “giving advice” to “executing for you” increases security importance by more than an order of magnitude. OpenClaw’s lessons are already profound.


11 Epilogue

When asked about future plans, Peter Steinberger said: “I hope this project outlives me. It’s too cool to let it rot.”

Before Agents, the barrier to doing these things was too high. Now, with the right software, that barrier keeps dropping, dropping, dropping. Every first Pull Request submitted is a small victory for humanity. Isn’t that progress? Isn’t that cool?

2022 had the ChatGPT moment; 2025 had the DeepSeek moment; in 2026, we are experiencing the OpenClaw moment. A lobster, in the most rudimentary way, tells us: AI’s next chapter isn’t “smarter conversation,” but “truly getting work done.”

And it all started with a burnt-out man who reignited his flame.
