The Future of Software Engineering

Working hypotheses:

Software engineering is the best domain to examine because it is the tip of the spear for AI; impacts and patterns in this domain should ripple into others.
As agentic development moves from human-in-the-loop to autonomous swarms, parallels can be drawn to older fields like management theory.

Red Queen's Race

AI capability, and with it user trust in these systems, has increased dramatically since the initial ChatGPT moment in 2022. The smallest unit of initial adoption was tab autocomplete, a straightforward next-token predictor; typing out “Green Eggs and” would result in a suggestion of “Ham”. The engineer still drove the bulk of the logic and implementation.

Next was the shift from micro-generation to macro-generation, with ChatGPT able to generate entire files and the engineer moving into the role of discerning copy-paster. The AI, however, still had little context on the underlying software being edited. Companies like Cursor then integrated AI into the IDE so it could read and understand the files it was editing. OpenAI introduced function calling, and models could now both gather their own context and debug their own output. Problems shifted from granular semantic errors in the generated code to code that had surface-level beauty but logical issues with the product direction or broader context.

These systems moved from single back-and-forth chats to chained-together tasks that constituted an agentic loop. The software engineer’s workflow became much more collaborative: the human would give an objective, review the proposed strategy, adjust or approve it, and then let the agent think, write, debug, and test its own code. Human intervention, outside of guardrailing the implementation agent, shifted down the SDLC to PR reviews, although specialized AI tools like Greptile or Cursor’s BugBot are increasingly eroding that point of human intervention as well.
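
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and field names (agentic_loop, propose_plan, approve, write_code, run_tests, LoopResult) are hypothetical, and the "model" is a hard-coded stub rather than a call to any real tool or API. It assumes Python 3.9 or later.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopResult:
    plan_approved: bool
    tests_passed: bool
    attempts: int
    code: str
    log: list[str] = field(default_factory=list)


def agentic_loop(
    objective: str,
    propose_plan: Callable[[str], str],           # agent drafts a strategy
    approve: Callable[[str], bool],               # human reviews the strategy
    write_code: Callable[[str, str], str],        # agent writes or repairs code
    run_tests: Callable[[str], tuple[bool, str]], # agent tests its own output
    max_attempts: int = 3,
) -> LoopResult:
    """Hypothetical sketch: human approves a plan, agent iterates on code."""
    plan = propose_plan(objective)
    if not approve(plan):
        return LoopResult(False, False, 0, "", ["plan rejected by human"])

    feedback, code, log = "", "", []
    for attempt in range(1, max_attempts + 1):
        code = write_code(plan, feedback)
        passed, feedback = run_tests(code)
        log.append(f"attempt {attempt}: {'pass' if passed else feedback}")
        if passed:
            return LoopResult(True, True, attempt, code, log)
    return LoopResult(True, False, max_attempts, code, log)


# Stub "agent": writes a buggy function first, fixes it once it sees feedback.
def toy_write_code(plan: str, feedback: str) -> str:
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


# Stub test runner: executes the generated code and checks one case.
def toy_run_tests(code: str) -> tuple[bool, str]:
    namespace: dict = {}
    exec(code, namespace)  # acceptable for a toy; never do this with untrusted code
    got = namespace["add"](2, 3)
    if got == 5:
        return True, ""
    return False, f"add(2, 3) returned {got}, expected 5"


if __name__ == "__main__":
    result = agentic_loop(
        "an add function",
        propose_plan=lambda objective: f"implement {objective}",
        approve=lambda plan: True,  # stands in for the human reviewer
        write_code=toy_write_code,
        run_tests=toy_run_tests,
    )
    print(result.attempts, result.log)
```

The only point where the human appears is the approve callback; everything after it runs without intervention, which is the shift the paragraph above describes.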

As the quality of AI output has risen and people have had more time to become comfortable with these systems (at what point before the water boils is the frog most comfortable?), viral agents like OpenClaw now run with complete autonomy and act on behalf of their human owner. With the human mostly or completely out of the loop, the cognitive load of managing a single agent has dropped far enough that engineers can now manage teams of agents in parallel.

The current industry focus is architecting agentic swarms: teams of agents, potentially with different models and roles, that accomplish meaningful long-running work. A post on harness design from Anthropic’s engineering blog describes a structure built around a generator agent and an evaluator agent, drawing inspiration from Generative Adversarial Networks (GANs), a machine learning technique. Another approach, from Factory’s engineering post on Missions, lays out how different models can fill the roles of orchestrator, implementor, validator, and researcher.
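
A generator and evaluator pairing of this kind can be sketched as a simple feedback loop. The code below is a toy under stated assumptions, not the harness from either post: generate_and_evaluate, toy_generator, toy_evaluator, and the keyword-based scoring are all hypothetical stand-ins for what would be model calls in a real system. It assumes Python 3.9 or later.

```python
from typing import Callable, Optional

# Hypothetical signatures: a generator takes the task plus the latest critique,
# an evaluator returns a score in [0, 1] plus a critique for the next round.
Generator = Callable[[str, Optional[str]], str]
Evaluator = Callable[[str, str], tuple[float, str]]


def generate_and_evaluate(
    task: str,
    generator: Generator,
    evaluator: Evaluator,
    threshold: float = 0.9,
    max_rounds: int = 4,
) -> tuple[str, float, int]:
    """Alternate generation and evaluation until the score clears the threshold."""
    critique: Optional[str] = None
    best_draft, best_score, rounds = "", -1.0, 0
    for rounds in range(1, max_rounds + 1):
        draft = generator(task, critique)
        score, critique = evaluator(task, draft)
        if score > best_score:
            best_draft, best_score = draft, score
        if score >= threshold:
            break
    return best_draft, best_score, rounds


REQUIRED = ("tests", "docs", "rollback")  # toy acceptance criteria


def toy_generator(task: str, critique: Optional[str]) -> str:
    # Stub for a generator model: starts incomplete, then addresses the critique.
    draft = "Release plan: run tests."
    if critique:
        draft += f" Also cover {critique.removeprefix('missing: ')}."
    return draft


def toy_evaluator(task: str, draft: str) -> tuple[float, str]:
    # Stub for an evaluator model: scores by the fraction of required topics covered.
    missing = [word for word in REQUIRED if word not in draft.lower()]
    score = (len(REQUIRED) - len(missing)) / len(REQUIRED)
    critique = "missing: " + ", ".join(missing) if missing else ""
    return score, critique


if __name__ == "__main__":
    draft, score, rounds = generate_and_evaluate(
        "Write a release plan", toy_generator, toy_evaluator
    )
    print(rounds, score, draft)
```

Unlike a GAN, nothing here is trained; the evaluator's critique is just passed back to the generator as context, which is one plausible reading of the analogy rather than a claim about how any production harness works.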

Just as different models are suited to different roles, different agent swarm hierarchies will be better suited to different work products. The open question is how deterministic these swarm structures will be. Will they draw more inspiration from machine learning, with a set number of tweakable core architectures that should be followed for best results? Or will agentic swarms take after the organizational structure of the companies that implement them, as Conway’s Law would suggest?

And what does this hold for the future? Should students stop studying computer science, and software engineers start upskilling? Probably not. Unfortunately, barring large structural changes in society, it seems that everyone just gets busier. There is alpha in AI adoption right now, but that edge shrinks as adoption diffuses across more enterprises. In the 2010s you could find analyst-level workers who would anonymously confess that they had learned an intermediate level of Python and automated their job down to just a few hours a week. If everyone commutes to work by bike and one individual secretly has a car, that individual gains time to either work or relax, depending on their alignment. Once everyone gets a car, though, that individual’s advantage is gone. The expectation is that one keeps up with their peers, and a company keeps up with its competitors.

The paradigmatic example making its way through 𝕏’s AI sphere is from Lewis Carroll’s Through the Looking-Glass:

“Well, in our country,” said Alice, still panting a little, “you’d generally get to somewhere else—if you ran very fast for a long time, as we’ve been doing.”

“A slow sort of country!” said the Queen. “Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!”