Notes from a quoting automation rollout.

What eleven weeks inside a mid-sized fabricator taught us about evals, change management, and why the receptionist always knows first.

Published
Mar 04, 2026
Reading time
14 minutes
Category
Postmortem

This is a sanitized account of an eleven-week engagement. Names, sectors, and numbers have been changed enough that the firm cannot be identified, and any client we work with is welcome to read this without finding themselves in it. The lessons are the part that travels.

The wedge was simple. Incoming requests for quotation (RFQs) arrived through three channels: email, a web form, and the phone. Each one was re-keyed into a quoting tool by hand, by one of three people, and answered in twenty-four to seventy-two hours. The CFO wanted turnaround under four hours. We agreed that was achievable. Eleven weeks later, average turnaround was sixteen minutes. Here is what we learned along the way.

01. Week one was not what we expected

We arrived ready to talk about models, retrieval, and the rule book the team used to price jobs. The first useful conversation happened with the receptionist.

She knew, before anyone in operations, which incoming requests were "easy" — repeats from existing customers, standard part numbers, predictable margins — and which were "hard" — new customers, custom specs, anything urgent. The pricing team had never asked her. We did, and she gave us, in a single afternoon, a more accurate triage taxonomy than anything in the existing CRM.

Every operations diagnostic ends in the same place: the person closest to the door knows things the org chart does not. — project journal, week one

This is the first lesson of the rollout, and it is not a technical one. Talk to the people who handle the inputs first. They have been compressing your business into mental models for years. Those mental models are the eval set you are about to need.
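The receptionist's taxonomy is simple enough to write down as a rule. The sketch below is illustrative only: the `Rfq` fields and the `triage` function are hypothetical names, and the real taxonomy also weighed things like margin predictability that a three-flag record does not capture.

```python
from dataclasses import dataclass


@dataclass
class Rfq:
    # Hypothetical fields; a real RFQ record would carry far more.
    existing_customer: bool
    standard_part: bool
    urgent: bool


def triage(rfq: Rfq) -> str:
    """Label an RFQ 'easy' or 'hard' using the receptionist's rule of thumb."""
    # Hard: new customers, custom specs, anything urgent.
    if rfq.urgent or not rfq.existing_customer or not rfq.standard_part:
        return "hard"
    # Easy: repeats from existing customers on standard part numbers.
    return "easy"
```

Writing the rule down this way also gives the eval set its first labels: every historical quote can be tagged easy or hard before any model sees it.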

02. The eval was boring and saved us twice

Before any model touched a real RFQ, we built an eval set of forty historical quotes, selected with the receptionist and the head of pricing, each paired with its actual outcome. The agent had to produce a quote within a set tolerance of the historical answer. Within tolerance, it could ship. Outside tolerance, it could not.
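A tolerance gate of this kind fits in a few lines. This is a minimal sketch, not the harness used on the engagement: the names `EvalCase`, `run_eval`, and `can_ship`, the 5% relative tolerance, and the rule that every case must pass are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class EvalCase:
    rfq_id: str
    historical_price: float  # the price actually quoted on the historical job


def relative_error(agent_price: float, historical_price: float) -> float:
    """Absolute gap between agent and historical price, as a fraction of the latter."""
    return abs(agent_price - historical_price) / historical_price


def run_eval(
    cases: List[EvalCase],
    agent_prices: Dict[str, float],
    tolerance: float = 0.05,  # assumed value; the source does not state one
) -> List[Tuple[str, float]]:
    """Return (rfq_id, error) for every case outside tolerance."""
    failures = []
    for case in cases:
        err = relative_error(agent_prices[case.rfq_id], case.historical_price)
        if err > tolerance:
            failures.append((case.rfq_id, err))
    return failures


def can_ship(failures: List[Tuple[str, float]]) -> bool:
    """Assumed gate: ship only if no case fell outside tolerance."""
    return not failures
```

A symmetric tolerance treats underquoting and overquoting alike. A stricter bound on the underquote side may be worth considering, since that is the direction that costs margin.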

The eval caught two regressions we would otherwise have shipped. The first was a category of stainless-steel jobs where the agent confidently underquoted by an amount that would have, over a quarter, eaten more margin than the project saved. The second was a class of urgent jobs the agent priced at the standard rate, because nothing in the input fields told it a job was urgent.

Without the eval, both would have launched. With the eval, both surfaced in week six and were patched before the rollout.

03. The hardest part was not the model

Once the agent worked (comfortably, by week seven), we hit the part of the project nobody had budgeted for: change management. The pricing team did not initially trust the system. This is rational. They had watched humans get prices wrong for years; now a machine was being asked to do the same job.

We solved this in three steps, none of which were technical:

  • The agent's outputs were drafts, not decisions. Every quote had a "send" button that required a human to press it. For four weeks, that button was the project.
  • We built a one-screen audit trail. Every quote the agent produced was visible alongside the chunks of the rulebook it had used. The pricing lead could spot-check ten a day in fifteen minutes.
  • We let the team override and log the override. Each override became an eval case. The agent improved, fast, on the cases the humans cared most about.
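The three steps above share one data flow: a draft goes nowhere without a named human, every send is recorded next to the rulebook chunks behind it, and every override becomes a new eval case. A minimal sketch under stated assumptions follows; `Draft`, `ReviewLog`, and `send` are hypothetical names, and the real system's storage and interface are not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Draft:
    rfq_id: str
    agent_price: float
    rule_chunks: List[str]  # rulebook excerpts shown alongside the quote


@dataclass
class ReviewLog:
    audit: List[dict] = field(default_factory=list)             # one row per sent quote
    eval_cases: Dict[str, float] = field(default_factory=dict)  # rfq_id -> human price


def send(
    draft: Draft,
    reviewer: str,
    log: ReviewLog,
    override_price: Optional[float] = None,
) -> float:
    """Send a quote. A human must press the button; overrides feed the eval set."""
    if not reviewer:
        raise ValueError("a human reviewer must press send")
    overridden = override_price is not None and override_price != draft.agent_price
    final_price = override_price if overridden else draft.agent_price
    log.audit.append(
        {
            "rfq_id": draft.rfq_id,
            "agent_price": draft.agent_price,
            "final_price": final_price,
            "rule_chunks": draft.rule_chunks,
            "reviewer": reviewer,
            "overridden": overridden,
        }
    )
    if overridden:
        # The human's price becomes the expected answer for future eval runs.
        log.eval_cases[draft.rfq_id] = final_price
    return final_price
```

Keeping the override and the eval case in the same write path is the design choice that matters: nobody has to remember to update the eval set, so it grows fastest where humans disagree with the agent.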

By week nine, the pricing team was sending agent drafts unedited for the easy categories and reviewing only the hard ones. By week eleven, sixteen-minute turnaround was the operating reality.

04. The two things that almost killed it

In a project like this, the obvious risks — model drift, hallucination, integration failure — are not what kill the rollout. The two near-misses we had were:

The CFO's deadline moved up. A board meeting was pulled forward by three weeks, and a "pilot" became a "result we need to report." We pushed back hard, and the CFO held the line. If she hadn't, the project would have shipped without the eval and we would be writing a different postmortem.

The pricing lead nearly left. Four weeks in, she was unsure whether the project was a help or a quiet plan to replace her. Her manager handled this — directly, in a one-on-one. The lesson is not that you should manage your people well. It is that you should budget for the conversation, because it is going to happen, and avoiding it is the failure mode.

Working principle: the people closest to the workflow you are automating need to be told — early, plainly — what changes for them. The conversation is part of the rollout, not a thing you do after.

05. What the number actually meant

Sixteen minutes was the headline. The CFO got her board win. But the numbers we cared more about, internally, were:

  • The receptionist's afternoon. It came back. She stopped re-keying RFQs and started, of her own initiative, flagging customer behavior the CRM did not capture. That is now a separate, smaller wedge.
  • Pricing team headcount. Did not change. The hours they used to spend re-keying went to harder pricing decisions on the jobs that needed judgment.
  • Margin variance on stainless-steel jobs. Tightened by three points. Almost entirely because of the regression caught by the eval in week six.

The headline number ships the project. The shadow numbers are why the project was worth doing.

A short closing

If you are about to run a project like this one, the things we wish we had known are not technical. They are:

  1. Talk to the people closest to the input first. They know more than the org chart says.
  2. Build the eval set with them, not separately.
  3. Treat the change-management conversation as a deliverable, not a side effect.
  4. The shadow numbers — the time given back, the variance tightened — are what compound.

The model was the easiest week of the project. The other ten weeks were people, process, and the unfashionable habits that make a rollout survive.


Filed under: POSTMORTEM · METHOD