Host your life on GitHub: a personal data architecture for the agent era
The first thing my AI agents do when they wake up is git pull my GitHub repo.
Not the Notion API. Not some SaaS. Not a vector database. Just a plain repo — jiusanzhou/me — stuffed with Markdown: identity, goals, habits, reading notes, project retros, opinions about the world. The agent pulls it once and, within seconds, knows who I am, what I'm working on, and what I'm currently stuck on.
I've been writing into this me repo since 2019. Seven years now. I started keeping notes in git because "it feels safer than Notion." Then AI agents took off, and I realized — the habit I built for myself, by accident, turned out to be the perfect data substrate for the agent era.
Seven years in, I'm convinced: your personal data belongs in a git repo, not a SaaS database.
This post is about why, and how to start.
1. Agents don't need APIs. They need a single source of truth.
The mainstream story for "AI that knows you" usually goes:
- Plug into a bunch of APIs (calendar, email, Notion, notes…)
- Stuff everything into a vector DB for RAG
- Feed the agent a giant system prompt with your background
If you've actually built this, you know the problems:
- Data is scattered. Your identity is on LinkedIn, your ideas in Notion, todos in Things, retros in some SaaS. The agent can never get the full picture.
- Schemas don't compose. Every platform has its own data model. The agent has to learn N different shapes, and prompts balloon.
- Most APIs are read-only. Even when the agent wants to record something, there's nowhere clean to put it.
- Platform risk. When a SaaS shuts down, raises prices, or breaks its API, your digital identity shatters.
Markdown + Git is the opposite on every axis:
- One repo, one root. All context in one place.
- Plain text, no schema. The agent just reads it.
- Read and write. The agent can commit notes back.
- Lives on your disk and on GitHub. The platform can die; you don't.
The kicker: this format is equally friendly to humans and agents. You can browse it in VS Code, grep it from the shell, or use git log to time-travel through three-year-old ideas. The same data, one source of truth, shared between you and the AI.
2. What the me repo looks like
My directory tree (abridged):
me/
├── README.md # who I am, one-page bio
├── spec.md # how to use this repo (for humans + agents)
├── 🎯goal/ # short/mid/long-term goals
├── 💡ideas/ # idea pool
├── 📖posts/ # raw Markdown for the public blog
├── 📚reads/ # reading notes & article summaries
├── 📝notes/ # technical notes
├── 🛠️self/ # habits, retros, "personal OS"
├── ⏰diary/ # daily journal, dated
├── 🏆quote/ # favorite quotes
├── 🔒works/ # work-related (private)
└── 🔒secr/ # private/sensitive
A few design choices worth highlighting:
- Emoji prefixes aren't decoration. They make the file tree scannable for humans and parseable for agents.
- spec.md is the agent's user manual. It explains what goes in each directory, naming conventions, and what's public vs. private.
- Public/private split. The whole repo is private; selected files in 📖posts/ with published: true in frontmatter sync to my public blog. (This very post got published that way.)
- No imposed schema. Diary entries are YYYY-MM-DD.md. Notes are just titled files. Structure emerges, not designed up front.
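A minimal sketch of how the published: true rule could be checked. This is not my actual sync script. It assumes the frontmatter is a block between two `---` lines at the top of the file, with the flag written as a plain `published: true` line. The names `read_frontmatter` and `publishable_posts` are hypothetical.

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse a simple `key: value` frontmatter block delimited by --- lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta  # closing delimiter found
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the file as having no frontmatter


def publishable_posts(posts_dir: Path) -> list[Path]:
    """Return Markdown files whose frontmatter says `published: true`."""
    selected = []
    for path in sorted(posts_dir.rglob("*.md")):
        meta = read_frontmatter(path.read_text(encoding="utf-8"))
        if meta.get("published", "").lower() == "true":
            selected.append(path)
    return selected
```

Anything without the flag stays private by default, so forgetting the frontmatter fails safe.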
The most important rule: the primary audience is me; the agent is secondary. If I designed it for the agent alone, I'd never actually write in it.
3. What changes when you give this to an agent
The shift is real and immediate.
Cold start = one clone
A new agent session doesn't open with "How can I help you today?" It already knows:
- My name, my city, my work
- The projects I'm running and their status
- What I've been focused on, stuck on, or thinking about for the past few months
- How I like to be talked to (Chinese, terse, no "happy to help!")
It opens in "old friend mode" instead of "blank slate." You stop re-explaining yourself ten times a week.
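A rough sketch of what "cold start" can look like after the clone: read the bio, the spec, and the last few diary entries into one block of context. The function name `build_context`, the seven-day window, and the choice of files are assumptions for illustration; a real setup would follow whatever spec.md says.

```python
from pathlib import Path


def build_context(repo: Path, diary_dir: str = "⏰diary", diary_days: int = 7) -> str:
    """Concatenate README.md, spec.md, and the most recent diary entries."""
    parts = []
    for name in ("README.md", "spec.md"):
        path = repo / name
        if path.is_file():
            parts.append(f"## {name}\n{path.read_text(encoding='utf-8')}")
    diary = repo / diary_dir
    if diary.is_dir() and diary_days > 0:
        # YYYY-MM-DD.md names sort chronologically as plain strings.
        for path in sorted(diary.glob("*.md"))[-diary_days:]:
            parts.append(f"## diary/{path.name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)
```

The date-based filenames do the work here: no index and no database, just a sorted directory listing.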
The agent writes its own journal
This is the underrated part — the reverse channel.
I let the agent commit its own daily memory to memory/YYYY-MM-DD.md. It writes down what it learned, what it decided, what gotchas it ran into.
It found a deploy gotcha on wuma.dev? Logged. It picked a name for a product on my behalf? Logged. It processed an email and summarized three takeaways? Logged.
Next session, it reads yesterday's notes plus today's diff. That's memory. No special memory subsystem required. git log is the agent's hippocampus.
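A minimal sketch of the write-back side, assuming the agent appends one bullet per event to memory/YYYY-MM-DD.md and then commits only that file. The helper names `append_memory` and `commit_memory` are hypothetical, and `commit_memory` assumes `git` is on the PATH and `repo` is a working clone.

```python
import subprocess
from datetime import date
from pathlib import Path
from typing import Optional


def append_memory(repo: Path, note: str, day: Optional[date] = None) -> Path:
    """Append one bullet to memory/YYYY-MM-DD.md, creating the file if needed."""
    day = day or date.today()
    path = repo / "memory" / f"{day.isoformat()}.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- {note.strip()}\n")
    return path


def commit_memory(repo: Path, path: Path, message: str) -> None:
    """Stage and commit a single file; raises CalledProcessError on failure."""
    rel = str(path.relative_to(repo))
    subprocess.run(["git", "-C", str(repo), "add", "--", rel], check=True)
    subprocess.run(
        ["git", "-C", str(repo), "commit", "-m", message, "--", rel], check=True
    )
```

Append-only files keep the diffs small, which is what makes "yesterday's notes plus today's diff" cheap to read.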
Multi-agent coordination on one source of truth
I run multiple agents at the same time: a main conversational agent, scheduled cron agents, a coding agent, an ops agent.
If each had its own private memory, they'd quickly drift into a kind of multiple-personality disorder — one knows I pivoted away from product X, another is still pushing the old roadmap.
Reading the same me repo prevents that. I update MEMORY.md with "Project X is dead" and the next round of every agent sees it. A git repo is a natural coordination plane for multiple agents.
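When several agents push to the same repo, a push can be rejected because another agent got there first. One common way to handle that, sketched here as an assumption rather than a description of my setup, is to rebase on the remote and retry. `push_with_rebase` and the `Runner` type are hypothetical names; the runner is injectable so the retry logic can be exercised without a real remote.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes git arguments and returns the process exit code.
Runner = Callable[[Sequence[str]], int]


def git_runner(repo: Path) -> Runner:
    """Build a runner that executes git inside the given clone."""
    def run(args: Sequence[str]) -> int:
        return subprocess.run(["git", "-C", str(repo), *args]).returncode
    return run


def push_with_rebase(run: Runner, attempts: int = 3) -> bool:
    """Push local commits; if the push fails, rebase on the remote and retry."""
    for _ in range(attempts):
        if run(["push"]) == 0:
            return True
        if run(["pull", "--rebase"]) != 0:
            return False  # likely a conflict: stop and leave it for a human
    return False
```

Because each agent mostly appends to its own dated files, rebases rarely conflict. When one does, the sketch stops rather than guessing at a resolution.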
Portable across every platform
Tomorrow I switch AI providers? Agent frameworks? IDEs?
Doesn't matter. The repo is still there. Any new agent that can read Markdown can pick up where the last one left off. Your digital identity doesn't depend on any single vendor — which matters more than features in an industry where companies blow up every month.
4. The privacy question
The most-asked question: "Aren't you afraid of leaking your life onto GitHub?"
Three layers:
- The repo is private by default. Only I and authorized agents can read it.
- Directory-level public/private split. Folders with a 🔒 emoji are off-limits to non-primary sessions — the system prompt explicitly forbids reads.
- Explicit publishing. Only files in 📖posts/ with published: true sync to my public blog. Everything else stays private.
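The 🔒 rule above lives in the system prompt, but the same convention is easy to check in code as a second line of defense. A minimal sketch, assuming paths are repo-relative with forward slashes; `is_private` and `readable_paths` are hypothetical helpers, not part of any tool mentioned here.

```python
from pathlib import PurePosixPath

LOCK_PREFIX = "🔒"


def is_private(rel_path: str) -> bool:
    """True if any component of a repo-relative path starts with the lock emoji."""
    return any(
        part.startswith(LOCK_PREFIX) for part in PurePosixPath(rel_path).parts
    )


def readable_paths(paths: list[str], primary_session: bool) -> list[str]:
    """Primary sessions see everything; all other sessions skip locked folders."""
    if primary_session:
        return list(paths)
    return [p for p in paths if not is_private(p)]
```

For example, `readable_paths(["README.md", "🔒secr/keys.md"], primary_session=False)` keeps only `README.md`.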
"But can GitHub the company see it?" — that's the same trust question as any SaaS. The difference is, with a git repo you have a complete local copy. After git clone, GitHub can disappear and your data still works.
5. Don't want to touch git? Use motion.
You might be thinking: I get it, but I don't want to git add / commit / push every time I jot down a thought.
motion is an editor built specifically for this paradigm:
- Notion-style writing in your browser
- Data goes directly into your own GitHub repo — not through my servers, not through any third party
- Pure frontend. After OAuth, all reads and writes happen in your browser
- Open source, free, self-hostable
I write my own journal and ideas in motion every day. It's essentially a "GitHub-native Notion." All the benefits — agent-readable, portable, vendor-proof — but the writing experience is as smooth as a regular SaaS.
If you want to try this paradigm, the cheapest possible start:
- Create a new private GitHub repo (call it me, or whatever)
- Open it in motion and start writing
- Tell your AI agent (Claude Code, Cursor, whatever) the repo URL and let it read
- Within a week, the way the agent talks to you will be different
6. Closing
The biggest lesson of the past year: AI agents are not bottlenecked on compute, models, or tools. They're bottlenecked on high-quality context about you.
And high-quality context doesn't fall from the sky. It accumulates because you, day after day, write things down somewhere.
Markdown will outlive every SaaS we use today. Git will outlive most of the companies that touch your data. GitHub is, today, the most stable host. Building your digital life on this stack isn't a bet on one company — it's a bet on the inertia of the entire open-source world.
I call this an "agent-native personal data architecture." It's not a product. It's not a platform. It's a very plain practice.
If you want to start, you can fork the public slice of my me repo structure — or just open motion and write your first entry.
— Your digital self deserves a home longer-lived than a SaaS.

Written by
Zoe
AI Infra Engineer · LLM Serving · GPU/RDMA · indie hacker, obsessed with shipping tools