Daily Learnings: Fri, Jun 27, 2025
Knowing others is wisdom, knowing yourself is Enlightenment. — Laozi
AI Coding Agents - More Experimentation
Over the past 3 days I’ve continued experimenting with AI coding agents, sticking with Claude Code and trying out the new Gemini CLI agent, to put them through their paces. I’ve seen them do really well in certain areas and fall flat in others. Here are some general notes so far.
- ✅: I riffed on the infinite UI prototype generation repo that I mentioned a couple of days ago to create some automation in my personal blog repo that generates full site prototypes.
  - Claude Code built some great prototypes to review, with fully fleshed-out UI components and a design system that was consistent and well-organized
  - I did need to iterate slightly on the prompt & spec that generated these prototypes; some learnings on this below
  - I ultimately selected one of these templates to use to totally redesign my site
- 🆗: I asked Claude Code to take a look at the current state of my site, styles, and components, and generate a plan to migrate the site over to a selected design prototype in phases.
  - The plan was well-organized, but potentially too broad (again, referenced below)
  - The plan included 6 phases, which I figured we’d tackle one at a time to limit the scope of what each agent needed to consider
  - This was the beginning of quite the rabbit hole that eventually led me to max out my tokens on the standard $20/mo plan
  - Important note: Even though I hit the limit, I was REALLY impressed by how far I got in a day before reaching it
- 👎: Implementing the plan is where I believe I gave Claude too much leeway.
  - I used a multi-sub-agent approach for this, which did work pretty well. The coordinator agent was able to manage context, and it got into a really nice groove: it queued up a new phase for a subagent, the subagent did the work and reported back, and the coordinator then updated the plan doc.
  - The unfortunate part: The site definitely didn’t look great, and it was broken in many areas.
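The coordinator loop described in that last item can be sketched roughly as follows. This is a minimal Python illustration of the pattern, not how Claude Code implements sub-agents internally: `Phase`, `Plan`, `coordinate`, and `fake_subagent` are all hypothetical names, and the sub-agent call is stubbed out.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Phase:
    name: str
    status: str = "pending"  # becomes "done" or "failed"
    report: str = ""


@dataclass
class Plan:
    phases: list = field(default_factory=list)

    def next_pending(self) -> Optional[Phase]:
        # First phase that has not been attempted yet, or None when finished
        return next((p for p in self.phases if p.status == "pending"), None)


def coordinate(plan: Plan, run_subagent: Callable[[str], tuple]) -> Plan:
    """Hand one phase at a time to a sub-agent and record its report in the plan."""
    while (phase := plan.next_pending()) is not None:
        ok, report = run_subagent(phase.name)
        phase.status = "done" if ok else "failed"
        phase.report = report
        if not ok:
            break  # stop here so a human can review before later phases build on it
    return plan


def fake_subagent(phase_name: str) -> tuple:
    # Hypothetical stand-in for a real sub-agent call
    return True, f"completed {phase_name}"


plan = coordinate(Plan([Phase(f"phase {i}") for i in range(1, 7)]), fake_subagent)
```

A loop like this keeps each sub-agent's scope small, but it only records what the sub-agent reports back. Nothing in it checks that the site actually renders correctly, which is the gap described above.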
Key Learnings
The biggest takeaway here is all about human assumptions, and how those, combined with broadly scoped work, can result in really poor outcomes and generated code. I attempted to use more agents (and manual work) to salvage a lot of what had been done, but I decided not to fall for the sunk cost fallacy. So I abandoned ship, kept the designs, and started fresh with a new trial. 🚢
As I reflected on this exercise, and the state that the site was left in, I realized that, in the translation of the updated design prototype to the actual site, there were a lot of assumptions that I didn’t specifically call out in my prompt, custom slash commands, or in the plan that we created. Some of those include:
- How the actual content in the design prototypes should be ignored, and only the styles should be applied to the existing site’s content and data design
  - This was something that I assumed Claude would know about, layering in only the designs and aesthetic of the prototypes where needed. I was wrong, and the updated site’s overall vibe shifted dramatically
- Code architecture and adhering to guidelines on how to approach potential tradeoffs in where to colocate code vs. componentize it, etc.
  - I have some guidelines in my CLAUDE.MD file to help with this, but when the overall task was so large, Claude ignored it sometimes
  - A good example of this was where I wanted CSS to live: at the page level, vs. component level, vs. at the global styles.css level
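One way to make assumptions like these explicit is to write them down as a short list of constraints and prepend them to every task prompt. Here is a minimal sketch in Python. The constraint wording and the `build_prompt` helper are illustrative assumptions, not my actual prompt or CLAUDE.MD contents.

```python
# Hypothetical constraints, paraphrasing the assumptions listed above
CONSTRAINTS = [
    "Apply only the prototype's styles; keep the existing site's content and data design.",
    "Ignore the content inside the design prototypes.",
    "Follow the CSS placement guidelines in CLAUDE.MD (page, component, or global styles.css).",
]


def build_prompt(task: str, constraints: list = CONSTRAINTS) -> str:
    """Prefix a task with numbered, explicit constraints so nothing is left implied."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(constraints, start=1))
    return f"Constraints (do not violate):\n{numbered}\n\nTask: {task}"


prompt = build_prompt("Migrate the home page to the selected design prototype.")
print(prompt)
```

Repeating the constraints with each small task is a deliberate choice here, since the guidelines were most likely to be ignored when the overall task was large.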
Next Steps
To continue my exploration of this, I’m having Claude go through and update each design prototype HTML page individually, plowing actual site content back into the designs, which has been a really good exercise regardless. It’s helped to address some assumptions that I had about the content of the site, and also helped to open up some interesting and creative ideas for adjusting how I present the underlying content.
Hopefully this series of smaller, more focused tasks will yield better fruit for the eventual redesign.
MCP Servers - Quick Aside
Quick report on the Puppeteer & Zen MCP servers: They work super well with Claude, and have been a major step up in the quality of generated content. In fact, I have Claude talking with Gemini right now via the Zen MCP server on a critical thinking task.
Claude uses Puppeteer well when needed. The only thing it sometimes gets tripped up on is toggling light/dark mode when reviewing current websites / design prototypes.