<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <title>Posts Tagged "software-eng-auto" on Alex Leighton's Blog</title>
  <id>https://alexleighton.com/posts/tags/software-eng-auto-tag-feed.xml</id>
  <link href="https://alexleighton.com/posts/tags/software-eng-auto-tag-feed.xml" rel="self" />
  <link href="https://alexleighton.com/posts/tags/software-eng-auto.html" />
  <updated>2026-04-20T00:41:38.993470907Z</updated>
  <author>
    <name>Alex Leighton</name>
    <uri>https://alexleighton.com/</uri>
  </author>
  <icon>https://alexleighton.com/static/icon-dino.png</icon>
  <logo>https://alexleighton.com/static/icon-dino.png</logo>
  
  <entry>
    <title>Filesystems as Personal Memory</title>
    <id>https://alexleighton.com/posts/2026-03-09-filesystems-as-personal-memory.html</id>
    <link href="https://alexleighton.com/posts/2026-03-09-filesystems-as-personal-memory.html" />
    <published>2026-03-09T13:00:00Z</published>
    <updated>2026-03-09T13:00:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Maybe plain files and git are all you need.</p><p>Published on <span title="2026-03-09T13:00:00Z">2026-03-09</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Maybe plain files and git are all you need.</h3><p>Published on <span title="2026-03-09T13:00:00Z">2026-03-09</span><br>Tags: commentary, git, llms, software-eng-auto</p><blockquote>
<p><a href="https://madalitso.me/notes/why-everyone-is-talking-about-filesystems"><strong>Daniel Phiri</strong> on 2026-02-23</a>:</p><p>Here's my actual take on all of this, the thing I think people are dancing around but not saying directly.</p>
<p>Filesystems can redefine what personal computing means in the age of AI.</p>
<p>Not in the "everything runs locally" sense (but maybe?). In the sense that your data, your context, your preferences, your skills, your memory — lives in a format you own, that any agent can read, that isn't locked inside a specific application.</p></blockquote>
<p>I like this vision of the future: personal data in whatever form is easiest or most convenient, stored as the person chooses, with arbitrary computation enabled by the natural language interface of LLMs. As I read this superb summary of the current state of coding agents intersecting with the filesystem, I had a couple of thoughts.</p><p>...<br><a href="https://alexleighton.com/posts/2026-03-09-filesystems-as-personal-memory.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>KBS: Fast Forward</title>
    <id>https://alexleighton.com/posts/2026-02-28-kbs-fast-forward.html</id>
    <link href="https://alexleighton.com/posts/2026-02-28-kbs-fast-forward.html" />
    <published>2026-03-01T03:45:00Z</published>
    <updated>2026-03-01T03:45:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Code generation on cruise control.</p><p>Published on <span title="2026-03-01T03:45:00Z">2026-03-01</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Code generation on cruise control.</h3><p>Published on <span title="2026-03-01T03:45:00Z">2026-03-01</span><br>Tags: article, llm, ocaml, python, software-eng, software-eng-auto</p><p>On today's episode: a lot of code. The previous work prepared the project codebase to guide agents in generating good quality code, which is what we did.</p>
<h2>Update, Resolve, Archive Commands</h2>
<p>After the <code>Kb_service</code> module was broken apart, the coding agent easily generated full verticals for update, resolve, and archive commands. The process I'm following:</p>
<ul>
<li>Work with the agent to peel off a chunk of functionality from <a href="https://github.com/alexleighton/knowledge-bases/blob/0eb55844acc47bb847f6d8063a9bc1b4c7689e4b/docs/product-requirements.md"><code>docs/product-requirements.md</code></a>.</li>
<li>Then write a <a href="https://github.com/alexleighton/knowledge-bases/blob/0eb55844acc47bb847f6d8063a9bc1b4c7689e4b/prompts/activities/implementation-plan.md"><code>prompts/activities/implementation-plan.md</code></a> for building the functionality.</li>
<li>After the code is generated, both the agent (<a href="https://github.com/alexleighton/knowledge-bases/blob/0eb55844acc47bb847f6d8063a9bc1b4c7689e4b/prompts/activities/code-review.md"><code>prompts/activities/code-review.md</code></a>) and I review it, and changes are applied for all issues.</li>
</ul>
<p>Related commit: <a href="https://github.com/alexleighton/knowledge-bases/commit/0eb55844acc47bb847f6d8063a9bc1b4c7689e4b"><code>0eb5584</code></a> — feat: add update, resolve, and archive commands</p><p>...<br><a href="https://alexleighton.com/posts/2026-02-28-kbs-fast-forward.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>KBS: List and Show</title>
    <id>https://alexleighton.com/posts/2026-02-25-kbs-list-and-show.html</id>
    <link href="https://alexleighton.com/posts/2026-02-25-kbs-list-and-show.html" />
    <published>2026-02-26T05:00:00Z</published>
    <updated>2026-02-26T05:00:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Code generation picks up speed.</p><p>Published on <span title="2026-02-26T05:00:00Z">2026-02-26</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Code generation picks up speed.</h3><p>Published on <span title="2026-02-26T05:00:00Z">2026-02-26</span><br>Tags: article, llm, ocaml, software-eng, software-eng-auto</p><p>Today's log consists mostly of code generation. We've laid enough foundation that new functionality is easily produced.</p>
<h2>List Functionality</h2>
<p>The functionality for <code>bs list</code> is well defined after the work for the <a href="../../../posts/2026-02-24-kbs-nesting-and-ideation.html">previous dev log</a>. I think the experiment of directing the agents from the requirements document rather than my own intuition of what to tackle was successful, though perhaps only at the scale of a project like Knowledge Bases. The coding agent was able to select a reasonable chunk of functionality to peel off and turn into an implementation plan, with a different agent generating the implementation.</p>
<h2>"Flaky" Tests</h2><p>...<br><a href="https://alexleighton.com/posts/2026-02-25-kbs-list-and-show.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>Antirez&#39;s Z80 Experiment</title>
    <id>https://alexleighton.com/posts/2026-02-25-antirezs-z80-experiment.html</id>
    <link href="https://alexleighton.com/posts/2026-02-25-antirezs-z80-experiment.html" />
    <published>2026-02-25T16:30:00Z</published>
    <updated>2026-02-25T16:30:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>More research on automatic software development.</p><p>Published on <span title="2026-02-25T16:30:00Z">2026-02-25</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>More research on automatic software development.</h3><p>Published on <span title="2026-02-25T16:30:00Z">2026-02-25</span><br>Tags: commentary, llm, software-eng, software-eng-auto</p><p>More software engineering research has dropped, this time from Salvatore Sanfilippo of Redis fame, in the vein of the experiments from <a href="../../../posts/2026-02-12-new-software-engineering-modes.html">StrongDM</a> and <a href="../../../posts/2026-02-14-more-software-engineering-research.html">OpenAI</a> (and my own <a href="../../../posts/2026-02-17-kbs-going-automatic.html">incomplete experiment</a>), to build a <a href="https://en.wikipedia.org/wiki/Zilog_Z80">Z80 emulator</a>.</p>
<blockquote>
<p><a href="https://antirez.com/news/160"><strong>Salvatore Sanfilippo</strong> on 2026-02-24</a>:</p><p>I wrote a markdown file with the specification of what I wanted to do. Just English, high level ideas about the scope of the Z80 emulator to implement.</p>
<p>...</p>
<p>This file also included the rules that the agent needed to follow, like:</p>
<ul>
<li>Accessing the internet is prohibited, but you can use the specification and test vectors files I added inside ./z80-specs.</li>
<li>Code should be simple and clean, never over-complicate things.</li>
<li>Each solid progress should be committed in the git repository.</li>
<li>Before committing, you should test that what you produced is high quality and that it works.</li>
<li>Write a detailed test suite as you add more features. The test must be re-executed at every major change.</li>
<li>Code should be very well commented: things must be explained in terms that even people not well versed with certain Z80 or Spectrum internals details should understand.</li>
<li>Never stop for prompting, the user is away from the keyboard.</li>
<li>At the end of this file, create a work in progress log, where you note what you already did, what is missing. Always update this log.</li>
<li>Read this file again after each context compaction.</li>
</ul></blockquote><p>...<br><a href="https://alexleighton.com/posts/2026-02-25-antirezs-z80-experiment.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>KBS: Nesting and Ideation</title>
    <id>https://alexleighton.com/posts/2026-02-24-kbs-nesting-and-ideation.html</id>
    <link href="https://alexleighton.com/posts/2026-02-24-kbs-nesting-and-ideation.html" />
    <published>2026-02-24T14:30:00Z</published>
    <updated>2026-02-24T14:30:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>More scaffolding before future work.</p><p>Published on <span title="2026-02-24T14:30:00Z">2026-02-24</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>More scaffolding before future work.</h3><p>Published on <span title="2026-02-24T14:30:00Z">2026-02-24</span><br>Tags: article, llm, ocaml, software-eng, software-eng-auto</p><p>Today's changes follow on from the previous post's "next steps": nesting and ideation.</p>
<h2>Nesting</h2>
<p>I worked with the agent to find code with deep nesting, decide how to flatten it, and generalize the approach into suggestions for a <a href="https://github.com/alexleighton/knowledge-bases/blob/faf049e4a4fecaf8a04552e505a178ed29c2299f/docs/lib/principles.md#3-low-nesting-depth">nesting principle</a>. We then applied that principle across the codebase, and I think we were successful, as the diff in <a href="https://github.com/alexleighton/knowledge-bases/commit/a05f2ea7eaa69805e88e395082dbfe57d29198d2">the refactor commit</a> is noticeably flatter.</p>
<blockquote>
<p><strong>4. Nesting Depth</strong></p>
<p>Keep expressions nested at most two or three levels deep. Deeply nested code
is hard to follow because the reader must hold every enclosing context in their
head at once. When nesting grows, treat it as a signal that the code should be
restructured.</p>
<p>Approaches for reducing nesting:</p>
<ul>
<li>Use monadic result operators (<code>let*</code>, <code>let+</code>). ...</li>
<li>Extract resource-management wrappers. ...</li>
<li>Consolidate error mapping into named functions. ...</li>
<li>Factor repeated control-flow shapes into helpers. ...</li>
</ul>
</blockquote><p>...<br><a href="https://alexleighton.com/posts/2026-02-24-kbs-nesting-and-ideation.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>KBS: Adding Todos</title>
    <id>https://alexleighton.com/posts/2026-02-19-kbs-adding-todos.html</id>
    <link href="https://alexleighton.com/posts/2026-02-19-kbs-adding-todos.html" />
    <published>2026-02-20T03:45:00Z</published>
    <updated>2026-02-20T03:45:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>More guidance, and thoughts on automatic review.</p><p>Published on <span title="2026-02-20T03:45:00Z">2026-02-20</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>More guidance, and thoughts on automatic review.</h3><p>Published on <span title="2026-02-20T03:45:00Z">2026-02-20</span><br>Tags: article, code-review, llm, ocaml, software-eng, software-eng-auto</p><p>Today's changes build on the previous commit's unpacking of <code>Todo</code>, making a repository for <code>Todo</code>s, and hooking it up to the command line to provide <code>bs add todo ...</code>.</p>
<h2>More Guidance</h2>
<p>The implementation plan forgot to add a task for integration tests, which resulted in a <a href="https://github.com/alexleighton/knowledge-bases/blob/412117ae6940be69a6122a8a1d048f73ba0fa8de/docs/bin/principles.md?plain=1#L17-L40"><code>bin</code> principle</a> saying changes to <code>bin</code>, where we're only keeping command-line parsing and top-level orchestration, should always be accompanied by an integration test. I also had the agent put together an <a href="https://github.com/alexleighton/knowledge-bases/blob/412117ae6940be69a6122a8a1d048f73ba0fa8de/docs/test-integration/architecture.md">architecture document</a> guiding the structure of integration tests and their purpose.</p>
<p>In the hopes of keeping file sizes down, I added a line length principle to <a href="https://github.com/alexleighton/knowledge-bases/blob/412117ae6940be69a6122a8a1d048f73ba0fa8de/docs/bin/principles.md?plain=1#L52-L57"><code>bin</code></a> and <a href="https://github.com/alexleighton/knowledge-bases/blob/412117ae6940be69a6122a8a1d048f73ba0fa8de/docs/lib/principles.md?plain=1#L41-L46"><code>lib</code></a>. I'm not sure about these principles, because I think this can be automated as a lint. I will keep these in mind and improve the linting situation when I get another structural/syntactic principle that can be automated.</p><p>...<br><a href="https://alexleighton.com/posts/2026-02-19-kbs-adding-todos.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>KBS: Unpacking Todo</title>
    <id>https://alexleighton.com/posts/2026-02-18-kbs-unpacking-todo.html</id>
    <link href="https://alexleighton.com/posts/2026-02-18-kbs-unpacking-todo.html" />
    <published>2026-02-19T04:15:00Z</published>
    <updated>2026-02-19T04:15:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Guidance, prompts, and TDD.</p><p>Published on <span title="2026-02-19T04:15:00Z">2026-02-19</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Guidance, prompts, and TDD.</h3><p>Published on <span title="2026-02-19T04:15:00Z">2026-02-19</span><br>Tags: article, code-review, llm, ocaml, software-eng, software-eng-auto</p><p>Today's change to Knowledge Bases is addressing my unhappiness at the beginning of the month: <a href="../../../posts/2026-02-01-storage-for-knowledge-bases.html">"Data modeling"</a>. I've returned to the work of <a href="https://github.com/alexleighton/knowledge-bases/blob/d5fcf06481d710a6ecf35a0b874474e96d016dc5/lib/data/todo.ml#L19-L25">unpacking <code>Note</code> into <code>Todo</code></a>, to make it easier to control the TypeID prefix, and prepare for making a <code>Todo</code> repository.</p>
<h2>Revisiting Guidance Docs</h2>
<p>A handy thing about coding agents: they can make such a wide-ranging change with minimal effort (see the number of changed files in the commit). Additionally, the coding principles guidance doc paid for itself: agent code review noted that the validation logic for title and content was now duplicated in both <code>Note</code> and <code>Todo</code>, and recommended deduplication. I redirected it to encapsulate the validation into new types, a favorite software engineering technique of mine (see <a href="../../../posts/2025-10-24-note-identifiers-and-tests.html">"Aside: Correct Construction"</a>), and <a href="https://github.com/alexleighton/knowledge-bases/blob/d5fcf06481d710a6ecf35a0b874474e96d016dc5/docs/lib/principles.md?plain=1#L15-L26">adjusted the guidance</a> to match.</p><p>...<br><a href="https://alexleighton.com/posts/2026-02-18-kbs-unpacking-todo.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>KBS: Going Automatic</title>
    <id>https://alexleighton.com/posts/2026-02-17-kbs-going-automatic.html</id>
    <link href="https://alexleighton.com/posts/2026-02-17-kbs-going-automatic.html" />
    <published>2026-02-18T05:30:00Z</published>
    <updated>2026-02-18T05:30:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Exploring a hands-off agent-driven workflow.</p><p>Published on <span title="2026-02-18T05:30:00Z">2026-02-18</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Exploring a hands-off agent-driven workflow.</h3><p>Published on <span title="2026-02-18T05:30:00Z">2026-02-18</span><br>Tags: article, git, llm, ocaml, software-eng, software-eng-auto</p><p>I've decided to use this project as a testbed for automatic development, to explore the kinds of techniques described in recent results [<a href="../../../posts/2026-02-12-new-software-engineering-modes.html">1</a>] [<a href="../../../posts/2026-02-14-more-software-engineering-research.html">2</a>], and to speed up development. I'll be making the vast majority of changes to the codebase via coding agents, exploring what controls are needed to stay hands-off yet maintain code quality.</p>
<h2>Short AGENTS.md</h2>
<p>I've played around with longer <code>AGENTS.md</code> files and not seen much value. Agents would ignore pieces, and after a while it felt like simply a waste of context space. I'm keeping <a href="https://github.com/alexleighton/knowledge-bases/blob/223c4ece2550fb1196cec770d97762a66823fcda/AGENTS.md?plain=1"><code>AGENTS.md</code> short</a> this time around, "linking" to where more information can be retrieved, in the hope that brevity keeps what remains important.</p><p>...<br><a href="https://alexleighton.com/posts/2026-02-17-kbs-going-automatic.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>More Software Engineering Research</title>
    <id>https://alexleighton.com/posts/2026-02-14-more-software-engineering-research.html</id>
    <link href="https://alexleighton.com/posts/2026-02-14-more-software-engineering-research.html" />
    <published>2026-02-15T05:00:00Z</published>
    <updated>2026-02-15T05:00:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Strategies for guiding coding agents.</p><p>Published on <span title="2026-02-15T05:00:00Z">2026-02-15</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Strategies for guiding coding agents.</h3><p>Published on <span title="2026-02-15T05:00:00Z">2026-02-15</span><br>Tags: code-review, commentary, llm, software-eng, software-eng-auto</p><p>OpenAI has put out yet another software engineering report on coding agent use, closer to StrongDM's <a href="../../../posts/2026-02-12-new-software-engineering-modes.html">Software Factory</a> than to <a href="../../../posts/2026-01-27-orchestrating-coding-agents-in-2026.html">Gas Town</a>.</p>
<blockquote>
<p><a href="https://openai.com/index/harness-engineering/"><strong>Ryan Lopopolo for OpenAI</strong> on 2026-02-11</a>:</p><p>Over the past five months, our team has been running an experiment: building and shipping an internal beta of a software product with <strong>0 lines of manually-written code</strong>.</p>
<p>The product has internal daily users and external alpha testers. It ships, deploys, breaks, and gets fixed. What’s different is that every line of code—application logic, tests, CI configuration, documentation, observability, and internal tooling—has been written by Codex.</p>
<p><strong>Humans steer. Agents execute.</strong></p></blockquote>
<p>OpenAI provides a number of interesting details here that, I think, complement the practices described by StrongDM. They started by reviewing Codex's commits, and that review load shrank drastically over time as every correction was worked into a set of guiding documents that agents would automatically pick up. It sounds like they did human-to-human reviews of feature and guidance documents. The picture they paint makes a lot of sense — encoding practical software engineering standards into tooling and guidelines documents, and then routinely "garbage collecting" using agents prompted specifically to review and clean up. An interesting thing to me is that they found "rolling forward" via speedy coding agent code generation to be faster and less disruptive to the process than rolling back bugs.</p><p><a href="https://alexleighton.com/posts/2026-02-14-more-software-engineering-research.html">Read the post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>New Software Engineering Modes</title>
    <id>https://alexleighton.com/posts/2026-02-12-new-software-engineering-modes.html</id>
    <link href="https://alexleighton.com/posts/2026-02-12-new-software-engineering-modes.html" />
    <published>2026-02-12T16:30:00Z</published>
    <updated>2026-02-14T22:00:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Adapting to code abundance.</p><p>Published on <span title="2026-02-12T16:30:00Z">2026-02-12</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Adapting to code abundance.</h3><p>Published on <span title="2026-02-12T16:30:00Z">2026-02-12</span><br>Tags: code-review, commentary, llm, software-eng, software-eng-auto</p><p>Complementing <a href="../../../posts/2026-01-27-orchestrating-coding-agents-in-2026.html">Gas Town</a>, a team of engineers at StrongDM have coined the term "<a href="https://factory.strongdm.ai/">Software Factories</a>" (<a href="https://simonwillison.net/2026/Feb/7/software-factory">via</a>) for a different kind of coding agent software engineering methodology. They get straight to the heart of things:</p>
<blockquote>
<p>Code <strong>must not be</strong> written by humans</p>
<p>Code <strong>must not be</strong> reviewed by humans</p>
</blockquote>
<p>Coding agents write code faster than a human can read and understand it. This has the potential to be very valuable: agents quickly produce working programs on their own, and they also increase the amount of work a single engineer can ship. To sustain that speed, the code cannot be reviewed by humans. I think most people who've worked with coding agents have seen that they can tap into the speed, but then end up forced to slow down and stretch their code review skills. What would need to change to make full use of the speed?</p><p>...<br><a href="https://alexleighton.com/posts/2026-02-12-new-software-engineering-modes.html">Read the full post →</a></p>]]></content>
  </entry>
  
  <entry>
    <title>Orchestrating Coding Agents in 2026</title>
    <id>https://alexleighton.com/posts/2026-01-27-orchestrating-coding-agents-in-2026.html</id>
    <link href="https://alexleighton.com/posts/2026-01-27-orchestrating-coding-agents-in-2026.html" />
    <published>2026-01-28T06:00:00Z</published>
    <updated>2026-01-28T06:00:00Z</updated>
    <author><name>Alex Leighton</name></author>
    <summary type="html"><![CDATA[<p>Experimental results in the agent orchestration design space.</p><p>Published on <span title="2026-01-28T06:00:00Z">2026-01-28</span></p>]]></summary>
    <content type="html"><![CDATA[<h3>Experimental results in the agent orchestration design space.</h3><p>Published on <span title="2026-01-28T06:00:00Z">2026-01-28</span><br>Tags: commentary, erlang, git, llm, software-eng, software-eng-auto</p><p>I am gratified to see some of my musings on the direction of coding agents are proving accurate. In <a href="../../../posts/2025-09-01-quote-erlang-supervisors.html">September of last year</a> I speculated that, given the non-deterministic and faulty nature of LLMs, folks might be served by adopting fault-tolerant architectures to orchestrate coding agents:</p>
<blockquote>
<p>From everything I've seen, we're not yet in a situation where it's either practical or economical to execute multiple coding agents in parallel or orchestrated. However I think in a couple years the technology will be cheap enough that we'll start needing to think about how to orchestrate groups of agents, and the idea of leaning on Erlang's learnings intrigues me. Constructing the agents and their execution frameworks as individual actors for concurrent execution, while arraying some as supervisors and others as workers, seems rich for investigation.</p>
</blockquote><p>...<br><a href="https://alexleighton.com/posts/2026-01-27-orchestrating-coding-agents-in-2026.html">Read the full post →</a></p>]]></content>
  </entry>
  
</feed>
