Rewriting Content Strategy with LLMs and Pinecone


I’ve been experimenting with large language models (LLMs) and vector databases like Pinecone — not just as a research interest, but as a working prototype. My goal was to build a system that could retrieve, structure, and surface my own content in a way that’s useful to both people and machines.

What started as a technical exercise quickly turned into a content strategy rethink. The more I worked with embeddings, retrieval, and prompting, the more obvious it became that most B2B SaaS content — mine included — isn’t really designed to be useful in an LLM-shaped world.

This post is a set of observations from that process. It’s not a how-to, and it’s definitely not marketing advice. It’s just a few things I’ve noticed while trying to make my content more legible — to machines, yes, but also to myself.


1. LLMs don’t skim, they distill

One of the first things I noticed was how differently LLMs process content. They’re not scanning a web page for formatting cues or crawling a hierarchy of headings. They’re vectorising meaning — pulling intent and structure from the text itself.

This rewards clarity over cleverness. Vague intros, overused analogies, and “setting the stage” paragraphs get flattened. What works best is directness: “This is what the user needs to know, and here’s what we know about it.”
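To make that concrete, here is a toy sketch of the comparison a retrieval layer performs. I'm using bag-of-words counts and cosine similarity as a stand-in for real embeddings, and the example sentences are invented, but the effect is the same: the direct sentence lands closer to the query than the scene-setting one.

```python
from collections import Counter
from math import sqrt

def bow_vector(text: str) -> Counter:
    # Toy stand-in for an embedding model: word counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

query = "how do we back up sharepoint sites"
direct = "We back up SharePoint sites nightly and retain copies for 30 days."
vague = "In today's fast-moving digital landscape, data resilience matters more than ever."

# The answer-shaped sentence scores higher than the stage-setting one.
assert cosine(bow_vector(query), bow_vector(direct)) > cosine(bow_vector(query), bow_vector(vague))
```

Real embedding models capture far more than word overlap, but the ranking pressure is the same: text that states the thing plainly sits nearer the query.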


2. Most content is badly stored

I had to dig through slide decks, half-written blog drafts, and internal notes to feed the system anything useful. And even when I did, it wasn’t in a format the LLM could make much sense of.

A lot of our content isn’t unfindable because it’s private — it’s unfindable because it’s scattered, fragmented, and inconsistently written. Structuring information (even just basic metadata and formatting) turned out to be more useful than adding “AI” to anything.
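The fix was less glamorous than it sounds. Something like the sketch below, which splits a document into paragraph chunks and attaches basic metadata, did more for retrieval than any prompt tweak. The field names and helper are my own invention, not any product's schema, though the record shape is roughly what a vector store expects at upsert time.

```python
import hashlib

def chunk_records(title: str, body: str, source: str, max_words: int = 120):
    """Split a document into paragraph chunks with enough metadata
    to make each one findable again. Field names are illustrative."""
    records = []
    for i, para in enumerate(p for p in body.split("\n\n") if p.strip()):
        words = para.split()
        # Long paragraphs are truncated here for simplicity; a real
        # pipeline would split them further instead.
        text = " ".join(words[:max_words])
        chunk_id = hashlib.sha1(f"{source}:{i}".encode()).hexdigest()[:12]
        records.append({
            "id": chunk_id,
            "text": text,
            "metadata": {"title": title, "source": source, "position": i},
        })
    return records

doc = "First paragraph about governance.\n\nSecond paragraph about permissions."
recs = chunk_records("Governance notes", doc, "notes/governance.md")
```

Stable IDs and a `source` field mean you can trace any retrieved chunk back to the deck or draft it came from, which matters more than it sounds once the corpus grows.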


3. Answerability is the new readability

When I tested my system by asking it Syskit-related questions — “What are common governance risks in Microsoft 365?”, for example — it only worked if the source material actually contained answers. Not positioning. Not messaging. Actual sentences that respond to an implied question.

I started to think of this as “answerability”: could this content, in its current form, directly answer a user or AI prompt? If not, it’s probably not useful — not to the system, and not to anyone else either.
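You can approximate answerability with something embarrassingly simple. The heuristic below is my own invention (a real system would compare embeddings), and it just asks what fraction of the question's content words actually appear in the chunk. Even that is enough to separate an answer from positioning copy.

```python
STOPWORDS = {"what", "are", "is", "the", "in", "a", "an", "of", "to", "how", "do", "we"}

def answerability(question: str, chunk: str) -> float:
    """Crude proxy: the fraction of the question's content words
    that appear in the chunk. 1.0 means every content word shows up."""
    q_words = {w.strip("?.,").lower() for w in question.split()} - STOPWORDS
    c_words = {w.strip("?.,").lower() for w in chunk.split()}
    return len(q_words & c_words) / len(q_words) if q_words else 0.0

q = "What are common governance risks in Microsoft 365?"
answer = "Common governance risks in Microsoft 365 include orphaned sites and unmanaged external sharing."
positioning = "Our platform empowers teams to collaborate with confidence."

assert answerability(q, answer) > answerability(q, positioning)
```

The example sentences are invented, but the test they fail or pass is the one that mattered in practice: does the text contain the words of an answer, or only the words of a pitch?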


4. Consistency matters more than tone

LLMs are surprisingly good at detecting contradiction. If one post says we support something and another implies we don’t, the system flags ambiguity. That’s useful — but also a bit exposing.

I used to think consistency was about branding. Now I think it’s about information integrity. If the machine can’t reconcile what you’re saying across multiple assets, it won’t confidently say anything at all.
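Here is a deliberately crude sketch of what I mean by flagging ambiguity. It only catches "support X" versus "do not support X" phrasing, and real contradiction detection would use an NLI model or an LLM judge, but it shows the shape of the check: extract claims per document, then look for the same claim asserted with opposite polarity.

```python
import re

def claims(text: str) -> dict:
    """Extract crude (feature, polarity) claims of the form
    'support X' / 'do not support X'. A toy heuristic only."""
    found = {}
    for neg, feature in re.findall(r"(do not |don't )?supports? ([\w -]+)", text.lower()):
        found[feature.strip()] = not neg  # empty neg group means a positive claim
    return found

def contradictions(docs) -> set:
    merged = {}
    flagged = set()
    for doc in docs:
        for feature, polarity in claims(doc).items():
            if feature in merged and merged[feature] != polarity:
                flagged.add(feature)
            merged[feature] = polarity
    return flagged

posts = [
    "We support automated backups.",
    "We do not support automated backups.",
]
assert contradictions(posts) == {"automated backups"}
```

Even this keyword-level version surfaces the uncomfortable cases: two assets, same feature, opposite claims, and no way for a retrieval system to pick a side.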


5. Structure beats style

There’s nothing wrong with good writing. But good structure — clear subheadings, defined sections, and consistent terminology — outperforms style every time when you’re working with LLMs.

Most of what I had to rewrite wasn’t broken at the sentence level. The problem was that the paragraphs had no job. There was no signal about what a block of text was meant to do: define, explain, compare, warn, resolve.

Once I started thinking about content structurally — almost like documentation or an API — everything started working better.
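Treating content like documentation suggests giving each paragraph an explicit role. The sketch below tags blocks with a job label using cue phrases I made up for illustration; in practice I would ask an LLM to classify, but even a crude label like this becomes useful metadata at retrieval time.

```python
# Cue phrases are invented for illustration; a real classifier
# would be trained or prompted, not keyword-matched.
ROLE_CUES = {
    "define": ("is a", "refers to", "means"),
    "compare": ("unlike", "whereas", "compared to"),
    "warn": ("be careful", "note that", "avoid"),
}

def paragraph_role(paragraph: str) -> str:
    """Assign each block a job: define, compare, warn, or the
    default 'explain' when no cue matches."""
    text = paragraph.lower()
    for role, cues in ROLE_CUES.items():
        if any(cue in text for cue in cues):
            return role
    return "explain"

assert paragraph_role("A tenant is a dedicated instance of Microsoft 365.") == "define"
assert paragraph_role("Avoid granting site-level admin rights broadly.") == "warn"
```

Once every chunk carries a role, you can retrieve "a definition of X" rather than "any paragraph mentioning X", which is most of what structure buys you.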


6. You can’t fake this with ChatGPT

There’s a temptation to take shortcuts: paste your post into ChatGPT, ask for SEO suggestions, then call it LLM-optimised. But when you’re building your own retrieval stack, you realise pretty quickly that what matters isn’t how AI generates content — it’s how it understands it.

Most B2B content isn’t referenceable because it’s too shallow, too scattered, or too brand-filtered. You can’t prompt your way around that. You have to fix the source.


Final thought

Building with LLMs — even in a small way — forced me to re-evaluate how I write, store, and structure information. The tools didn’t just change the output. They changed how I think about the inputs.

That seems worth paying attention to.

