Building a Private Chatbot

Project

I built a system that lets me have a real conversation with everything I’ve ever written — an AI version of myself I call Mini Me.

  • Powered by Pinecone, OpenAI, and ResembleAI
  • Built from years of strategy notes and presentations
  • Exploring how voice, brand, and knowledge converge

A Conversation with Mini Me

What if you could talk to yourself about everything you’ve ever written?

I’ve spent years capturing notes, strategies, and reflections across roles and projects.
Then I built a system that lets me have a real conversation with all of it – in my own voice.

Major large language models. https://informationisbeautiful.net/visualizations/the-rise-of-generative-ai-large-language-models-llms-like-chatgpt/

Why I Built It

Like most strategists, I’ve accumulated years of thinking – buried in OneNote pages, PowerPoint decks, and half-finished ideas. The insights were there, but they weren’t accessible. Even OCR scans still retained the messiness of my scribbles and the lack of structure in early notes.

I wanted to know: what would happen if I could ask my past work a question – and it could answer back?

How It Works

The first stage in any project like this is working out the tech. I used four criteria:

  • Does it actually do the job? Sounds obvious, but I didn’t want to spend lots of time figuring out bits of technology only to find out that they didn’t cover what I needed.
  • Price. My target was sub-$10 per month. I tried doing it for $0 per month, but the difference in effort and quality was enormous, all for the price of a burger.
  • My prior knowledge. There is some tech I know about and a lot that I don’t. I didn’t want to spend months learning something new, even if it could be argued that it was more appropriate.
  • Variety. It might have been easier to do everything in a single unified tech stack (Microsoft, for example), but that would have made the project less interesting for me.

Given the above, this is what I went with:

Technology | URL | Description
Linux | linux.org | Open-source operating system known for stability and developer flexibility.
Google Cloud Run | cloud.google.com/run | Serverless platform for running containerized applications and APIs.
OneDrive for Business | microsoft.com/onedrive/business | Microsoft’s cloud storage for secure file access and sharing within organizations.
GitHub | github.com | Platform for version control, collaboration, and hosting of code repositories.
Feedparser | pypi.org/project/feedparser | Python library for reading and parsing RSS or Atom feeds.
Resemble AI | resemble.ai | AI voice cloning and synthetic speech generation platform.
Pinecone | pinecone.io | Managed vector database for similarity search and retrieval-augmented generation (RAG).
GraphAI | graphai.io | Framework for building AI workflows using graph-based data structures and reasoning.
ChromaDB | trychroma.com | Open-source local-first vector database for embedding storage and semantic search.

Process at High Level

Step 1 – Collect all the Data

By far the most interesting, laborious, but useful part of the process. I have around 14 years’ worth of information about all of the work I have done. With hindsight, I wish I had kept all of it in one consistent, well-structured format ready for AI ingestion.

That was absolutely not the case.

A lot of the work is preprocessing unstructured, often poorly scribbled information. Here are just a few of my data sources; beyond these there were lots of blog posts, Word documents, and PowerPoint documents. Really, everything.

  • PowerPoint files
    These were manually selected and hardcoded into the script — a reasonable tradeoff given how few I needed to track.
  • RSS Feeds
    • My blog at bjrees.com
    • A few curated industry insight feeds
  • OneNote Notebooks, such as:
    • Project documentation (e.g. Skynet, The Oracle)
    • Notes from a Cambridge Judge Business School programme
    • Third-party and personal research logs
  • iCloud Backups
    These contained archived slide decks and supporting materials.
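
As a rough illustration of the preprocessing described above, the sketch below cleans raw text from any source, splits it into overlapping chunks, and wraps each chunk in one common record shape ready for indexing. It is a minimal sketch, not the actual script: the `Record` fields, the helper names `clean_text`, `chunk_words`, and `to_records`, and the chunk sizes are all assumptions made for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class Record:
    """One chunk of text plus enough metadata to trace it back to its source."""
    source: str  # hypothetical label such as "onenote", "rss", or "powerpoint"
    title: str
    year: int
    text: str


def clean_text(raw: str) -> str:
    """Drop lines with no letters or digits (typical OCR debris) and collapse whitespace."""
    lines = [ln.strip() for ln in raw.splitlines()]
    kept = [ln for ln in lines if re.search(r"[A-Za-z0-9]", ln)]
    return re.sub(r"\s+", " ", " ".join(kept)).strip()


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of up to `size` words; neighbours share `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def to_records(source: str, title: str, year: int, raw: str,
               size: int = 200, overlap: int = 40) -> list[Record]:
    """Clean one raw document and return one Record per chunk."""
    cleaned = clean_text(raw)
    return [Record(source, title, year, chunk)
            for chunk in chunk_words(cleaned, size, overlap)]
```

Keeping the source label and year on every chunk is a design choice for this sketch: it lets a later answer say where, and from when, an idea came.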

Step 2 – Index

Once the data has some structure, you can start the next stage: building a database. I used Pinecone, a vector database well suited to the task.
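
To make the idea of a vector database concrete, here is a toy, in-memory stand-in written with only the Python standard library. It is not the Pinecone API: the `embed` function is a crude hashed bag-of-words placeholder for a real embedding model, and `TinyVectorIndex`, `upsert`, and `query` are hypothetical names chosen for this sketch. What it does show is the core mechanic: store one vector per chunk of text, then rank chunks by cosine similarity to the question.

```python
import hashlib
import math
import re


def embed(text: str, dims: int = 64) -> list[float]:
    """Placeholder embedding: hash each word into one of `dims` buckets, then L2-normalise."""
    vec = [0.0] * dims
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class TinyVectorIndex:
    """In-memory illustration of what a vector database does (assumed design)."""

    def __init__(self) -> None:
        self._items: dict[str, tuple[list[float], dict]] = {}

    def upsert(self, item_id: str, text: str, metadata: dict) -> None:
        """Store (or overwrite) the vector and metadata for one chunk of text."""
        self._items[item_id] = (embed(text), {**metadata, "text": text})

    def query(self, question: str, top_k: int = 3) -> list[tuple[str, float, dict]]:
        """Return the top_k (id, score, metadata) rows, best match first."""
        q = embed(question)
        scored = [
            # Vectors are unit length, so the dot product is the cosine similarity.
            (item_id, sum(a * b for a, b in zip(q, vec)), meta)
            for item_id, (vec, meta) in self._items.items()
        ]
        scored.sort(key=lambda row: row[1], reverse=True)
        return scored[:top_k]
```

A managed service replaces the linear scan in `query` with an approximate nearest-neighbour search, which is what keeps lookups fast as the archive grows.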

Step 3 – Build an online interface where I could ask questions

Nothing fancy, but I wanted an easy way to ask questions without having to use the command line or fire up services by hand. For this I used Google Cloud Run extensively.
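
For a sense of how small such an interface can be, here is a minimal sketch of a question-answering endpoint using only the Python standard library. The `/ask` route, the `q` parameter, and the `answer_question` placeholder are assumptions for illustration, not the interface I actually deployed. The one Cloud Run detail it relies on is that the platform tells the container which port to listen on through the `PORT` environment variable.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def answer_question(question: str) -> str:
    """Hypothetical placeholder: a real version would retrieve notes and call a language model."""
    return f"(placeholder answer for: {question})"


def build_response(path: str) -> tuple[int, dict]:
    """Turn a request path such as /ask?q=... into a status code and a JSON-ready body."""
    parsed = urlparse(path)
    if parsed.path != "/ask":
        return 404, {"error": "not found"}
    question = parse_qs(parsed.query).get("q", [""])[0].strip()
    if not question:
        return 400, {"error": "missing query parameter 'q'"}
    return 200, {"question": question, "answer": answer_question(question)}


class AskHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = build_response(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run() -> None:
    """Start the server on the port given by PORT, falling back to 8080 for local runs."""
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), AskHandler).serve_forever()
```

Calling `run()` starts the server. Keeping the routing logic in `build_response`, separate from the handler class, means it can be checked without opening a network socket.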

Step 4 – Voice

This stage was certainly unnecessary, but as mentioned, part of this project is about learning and growing my knowledge. I wanted to see if there was a way of using a cloud service to speak like me (a “Mini Me”!). The following voice recordings are not me. They are from ResembleAI, and cost me around $2 to make. Sadly, the one word it struggled with was BJREES.

  • What is something that is particular to BJREES that is different from standard marketing practice?
  • How should you market to larger organisations?
  • What is the marketing lifecycle?
  • What is good positioning in B2B marketing?

What Happened?

The conversation surprised me. What was particularly impressive, and I think the most important part of the project, is that it pulled ideas from previous years together into coherent answers. We all have a recency bias in our memories, and personally I’m a long way from having total recall. The world of marketing has evolved significantly, but certain core concepts have remained. A simple example: understanding your customers was the main task of marketing from the first project I did to the most recent.

Example exchanges:

Real me: “What’s the biggest barrier to marketing and product alignment?”
Mini me: “In your past notes you wrote that alignment isn’t the real issue; it’s trust in shared metrics.”

Real me: “What motivates you to keep building?”
Mini me: “You wrote once: progress feels meaningful when it changes how people work, not just what they do.”

The key point here is that, although I remember writing this many years ago, I didn’t remember the detail, and I certainly didn’t remember the connections with other pieces of work I had done.

Why It Matters

It was enjoyable doing this work, but crucially this wasn’t about creating a gimmick. It’s a glimpse of what happens when brand, knowledge, and AI converge.

Imagine your company being able to talk to its accumulated knowledge:

  • Every customer insight.
  • Every proposal, presentation, and lesson learned.
  • All accessible through natural conversation.

I see this as a future direction for brand intelligence: providing more to your customers to help them do their jobs well.

FAQ

What is the “Mini Me” project?
An experiment where I built an AI version of myself trained on my own notes, slides, and documents — using Pinecone, OpenAI, and ResembleAI — to explore how human knowledge, voice, and brand can interact in conversation.
How does Mini Me work?
It connects four core components: OneNote for source content, Pinecone for semantic indexing, OpenAI for reasoning and dialogue, and ResembleAI for generating a natural voice.
Why did you build it?
To test what happens when your professional knowledge base becomes conversational — and to explore the future of personal assistants that reflect your own thinking and communication style.
Is Mini Me available for others to use?
Not yet — but the framework could be adapted for teams or organisations to converse with their collective knowledge base.
What did you learn from the experiment?
It’s possible to create meaningful, context-aware dialogue from personal archives — but accuracy, tone, and governance need careful design to maintain trust and identity.