#102 - AI Retrieval for Engineering Firms

A view on managing data, building forward, and what to do with decades of project data

May 19, 2026

A note before we start. Flocode has had a real effect on the direction my work has taken over the past year. It has given me a space to think carefully about topics I find fascinating, and I am genuinely grateful for the responses from readers: engineers, AEC leaders, solopreneurs and people building similar things inside their own firms. So, thank you. If anything in this piece resonates or you are working through similar challenges, please get in touch. I would rather have a conversation than deliver a monologue. We're in this together.

It is 2026, and most engineers I know have been using AI tools in some form for a couple of years. Some lightly, some seriously. The conversation has moved past whether the technology is useful and on to a harder question: what should we actually do with it inside a firm that produces engineering work for a living? How should we organize information, process data, and build systems that help us do better work without putting the accuracy of that work at risk?

That question gets thorny quickly when you look at where engineering firms actually keep their knowledge. We all have mountains of project data. Most of it lives in project folders on shared drives, structured by some combination of project number, discipline, and phase. Most firms have document control standards that are followed reasonably well, in patches. Some projects are immaculate. Some are graveyards. Over time, across clients and decades and personnel changes, the archive becomes what it becomes: a record of work done, structured by the people who happened to do it, with no consistent metadata holding the whole thing together. That is not a failure of any particular firm. It is the natural state of decades of professional services work.

So the practical question for 2026 is this. Given what we have, and given the tools that are available, where should we put our effort?

The popular answer, and the one most vendors will sell you, is to focus on the AI architecture. Pick a retrieval-augmented generation pipeline. Wire in a vector database. Add a reranker. The category itself, often called RAG, is broad and the landscape is still moving fast. New approaches arrive every few months: hybrid search, graph-based retrieval, document tree parsing, tabular reasoning models. I have not settled on a strong opinion about which of these will dominate, because the field has not settled either. For now, I am watching, building, and trying to learn faster than the ground moves. (This overview is a good starting point if you want to see the current state of the infrastructure race.)

What I have formed an opinion on is something simpler and more durable than the architecture question. The data you feed these systems matters more than the architecture you wrap around it. A sophisticated retrieval pipeline running over a messy corpus produces sophisticated retrieval of messy information. The semantic search finds the closest match. The closest match may be a superseded version, an outdated assumption, or a piece of working iteration that was never meant to be authoritative. The system has no way to know.

A concrete example. We have been building a retrieval system at Knight Piésold focused on technical engineering documentation: design codes, engineering manuals, dense textbooks from our internal library. The pipeline performs well. But it sits on top of an inherent tension that every structural engineer will recognize, which is that design codes change. ASCE 7-16 and ASCE 7-22 are both in active use, on different projects, for legitimate reasons. When you query the system, you sometimes want one and sometimes want the other. The system has to know which project you are working on, what the contract specifies, and which edition governs. That is not a retrieval failure. It is the same ambiguity that every working engineer has to manage in their head every day. The system inherits the messiness of how engineers actually work, and the right response is not to pretend the messiness does not exist. It is to design the system in a way that makes the version question visible and answerable, rather than hidden inside a confident-sounding response.
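To make the version question concrete, here is a minimal sketch of one way to apply it after retrieval. It is not the pipeline described above. The `Chunk` fields, the `GOVERNING_EDITION` register, and the project IDs are all hypothetical, and the similarity scores are assumed to come from whatever retriever is in use. The point is only that excluded editions are returned alongside the results, so the interface can tell the engineer what was left out and why.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    standard: str  # e.g. "ASCE 7"
    edition: str   # e.g. "7-16" or "7-22"
    score: float   # similarity score from the retriever (assumed)


# Hypothetical project register: which edition governs on each project.
GOVERNING_EDITION = {
    "P-001": {"ASCE 7": "7-16"},
    "P-002": {"ASCE 7": "7-22"},
}


def filter_by_governing_edition(chunks, project_id):
    """Split retrieved chunks into (kept, excluded) for one project.

    A chunk is excluded only when the project records a governing edition
    for its standard and the chunk comes from a different edition.
    """
    governing = GOVERNING_EDITION.get(project_id)
    if governing is None:
        raise KeyError(f"No governing editions recorded for {project_id!r}")

    kept, excluded = [], []
    for chunk in chunks:
        required = governing.get(chunk.standard)
        if required is None or chunk.edition == required:
            kept.append(chunk)
        else:
            excluded.append(chunk)

    # Best match first among the chunks that survive the edition check.
    kept.sort(key=lambda c: c.score, reverse=True)
    return kept, excluded
```

Returning `excluded` rather than silently dropping it is the design choice that matters here: a response can then say that results from another edition exist but do not govern this project.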

That kind of metadata discipline (knowing which document supersedes which, which standard governs which project, which calculation derives from which design basis) is what separates a retrieval system that helps you from one that quietly misleads you. Building it requires a different kind of attention than building the AI pipeline itself. It requires looking at how documents are actually organized before they go into the system, deciding what relationships need to be made explicit, and accepting that this work is unglamorous and slow. It does not generate a demo. It is the difference between a system that performs and one that does not.
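As an illustration of what "knowing which document supersedes which" can look like once it is written down, here is a small sketch. The document IDs and the `SUPERSEDED_BY` mapping are invented for the example, and a real register would more likely live in a database than in a dictionary. The function follows the supersession links until it reaches the document that nothing has replaced.

```python
# Hypothetical register: each key has been superseded by its value.
SUPERSEDED_BY = {
    "CALC-014-RevA": "CALC-014-RevB",
    "CALC-014-RevB": "CALC-014-RevC",
}


def resolve_current(doc_id, superseded_by):
    """Follow supersession links from doc_id to the version currently in force."""
    seen = {doc_id}
    current = doc_id
    while current in superseded_by:
        current = superseded_by[current]
        if current in seen:
            # A loop in the register is a data error, not something to guess around.
            raise ValueError(f"Supersession cycle detected at {current!r}")
        seen.add(current)
    return current


print(resolve_current("CALC-014-RevA", SUPERSEDED_BY))  # CALC-014-RevC
```

A retrieval step could call this on every hit and either swap in the current version or flag the hit as superseded, depending on what the query is for.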

Which brings me to the harder question. Most firms, when they start thinking seriously about AI implementation, look at their existing archive and see a goldmine. Hundreds of projects, thousands of documents, decades of hard-won engineering decisions. Surely all of that is worth cleaning up and feeding into an AI system?

My honest answer is mostly no.

The instinct treats accumulation as value. Some of what is in those folders is genuinely valuable. Most of it is iteration. A typical project folder contains the final issued-for-construction drawing set and also every revision that preceded it, the internal markups, the calculations that were superseded, the correspondence threads that went nowhere, the drafts that were replaced, the meeting minutes from decisions that were later reversed. The useful fraction is real. It is also a minority of the total volume, and it is not labeled. Distinguishing the authoritative artifacts from the surrounding accumulation requires someone who knows the project well. Across a portfolio of fifty projects, that is hundreds of hours of expert engineering time before a single retrieval query runs.

There is a worse problem than the cost. A system that tries to handle both the old archive and the new forward-flowing data ends up being a compromised version of both. The metadata discipline you can enforce going forward (consistent document types, clear supersession relationships, explicit links between calculations and the design basis they derive from) is impossible to retrofit cleanly onto twenty years of inconsistent legacy work. So you build dual logic to handle both worlds, and the dual logic makes the system slower, more fragile, and less trustworthy on the new work where you actually have control. You end up with a slightly better archive of the past and a meaningfully worse foundation for the future.
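Enforcing that discipline going forward can be as simple as refusing to ingest a document whose metadata is incomplete. The sketch below is one possible shape for that check. The `DocumentRecord` fields, the `ALLOWED_TYPES` list, and the rules themselves are assumptions made for illustration, not a description of any firm's actual standard.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed controlled vocabulary of document types.
ALLOWED_TYPES = {"calculation", "design_basis", "drawing", "report"}


@dataclass
class DocumentRecord:
    doc_id: str
    doc_type: str
    project_id: str
    supersedes: Optional[str] = None       # ID of the document this one replaces
    design_basis_id: Optional[str] = None  # required for calculations


def validation_errors(record):
    """Return a list of problems; an empty list means the record can be ingested."""
    errors = []
    if record.doc_type not in ALLOWED_TYPES:
        errors.append(f"unknown document type: {record.doc_type!r}")
    if record.doc_type == "calculation" and not record.design_basis_id:
        errors.append("calculation has no link to a design basis")
    if record.supersedes == record.doc_id:
        errors.append("document cannot supersede itself")
    return errors
```

Rules like these are cheap to apply at the moment a document is issued. Applying them to legacy files means someone has to reconstruct the missing fields by hand, which is the retrofit cost described above.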

The exceptions are real but specific. If there is a class of documents in the archive that represents finalized, verified, stamped engineering output, those are worth extracting. A surgical filter that pulls only the final stamped reports from a project folder, ignoring the hundreds of related files that represent the iterative work behind them, gives you the gold without the noise. Same for specific technical domains where your firm has done unusual work that does not exist in published references: site-specific hydrological models, unusual ground conditions, material performance data from completed installations. Targeted extraction of irreplaceable knowledge is worth the effort. Comprehensive archive cleanup, in my view, is not.
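A surgical filter of that kind might look something like the sketch below. It assumes each archived file already carries a document type, a status, and a stamped flag, which is a large assumption: in a legacy archive those fields usually have to be filled in by someone who knows the project. The `ArchiveFile` fields and the sample paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchiveFile:
    path: str
    doc_type: str  # e.g. "report", "calculation", "markup"
    status: str    # e.g. "draft", "issued", "final"
    stamped: bool  # True if the document carries an engineer's stamp


def select_stamped_finals(files):
    """Keep only final, stamped reports; everything else stays in the archive."""
    return [
        f for f in files
        if f.doc_type == "report" and f.status == "final" and f.stamped
    ]


project_folder = [
    ArchiveFile("reports/geotech_report_final.pdf", "report", "final", True),
    ArchiveFile("reports/geotech_report_draft2.pdf", "report", "draft", False),
    ArchiveFile("calcs/slope_stability_revB.xlsx", "calculation", "issued", False),
    ArchiveFile("markups/drawing_c101_redlines.pdf", "markup", "draft", False),
]

for f in select_stamped_finals(project_folder):
    print(f.path)  # reports/geotech_report_final.pdf
```

The filter itself is trivial. The effort sits in deciding, per project, which files deserve the `final` and `stamped` labels, and that is exactly why the extraction should stay narrow.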

The choice is not “clean the past or ignore it.” It is “clean the past at the cost of the future, or build the future cleanly and mine the past selectively.” The second option is harder to sell internally because it sounds like you are giving up on the archive. You are not. You are choosing to spend the effort where it compounds.

None of this is settled, and I would be misleading you to suggest otherwise. The retrieval architecture question is moving fast, and I expect my views on some of the technical pieces to change in the next twelve months. The corpus question is more stable, but the practical question of how to enforce metadata discipline across a global firm with thousands of staff is its own engineering problem, and we are working through it in real time.

We are also still wrestling with parts of the technical problem that the field has not solved. Vision interpretation of engineering plots is a good example. Moment-curvature curves, culvert sizing charts, anything with multiple axes and overlaid families of lines: these are difficult even for a qualified engineer to read carefully, and current vision models do not handle them reliably. Digitizers exist that can convert a chart image into a numerical table, and we use them where they help. The full problem, where a model can look at a sizing chart and reason about it the way a structural engineer would, is not yet solved. We are getting closer, but the gap is real.

So this is where I am. Confident that the corpus matters more than the architecture, and that forward-focused systems will outperform comprehensive archive cleanup for most firms. Watchful on the retrieval architecture itself, because the field has not converged and I do not want to commit too hard to any one approach yet. Honest about the parts where the technology is still catching up to the work we actually do.

We are building the plane as we fly it. If a better answer arrives next quarter, I will change my mind. Until then, this is the position I am working from.

See you in the next one.

James 🌊
