Agentic development using existing code as a source of truth
A small regression that wasn't small at all
A customer once told us that our table had become slow enough that they might need to hire another planner just to compensate for it.
At first, that sounded exaggerated. I checked the numbers and saw around a 100ms regression when applying a value to a cell. Not huge on paper. I even thought maybe I wasn't reproducing it correctly. Then I joined a call and watched how they actually used the table.
They were editing cells back-to-back, pressing tab and typing values faster than I could type my (long and only 😅) password. At that speed, 100ms is not small; it's friction. And when you repeat that hundreds or thousands of times a day, it becomes a real cost.
The legacy table problem
We already had a table implementation in our SaaS product. It had been around for years.
It was:
- built on outdated technologies
- poorly structured
- extended over time with more and more features
- missing specs, documentation, and tests
Adding new features was either extremely expensive or practically impossible. Fixing bugs often caused more issues than it solved. And performance was always fragile.
Still, it worked, and users depended on it heavily.
The failed rewrite
Our first attempt was to replace it with our internal component library table. In theory, that sounded like the right move: it was modern, maintained, and reusable. In practice, it didn't work for us.
We had too many features that users relied on, and our API didn't match the table's API. That forced us to build a translation layer between our system and the component.
That introduced several problems:
- extra data conversions added a lot of processing time
- duplicated state (our store + table's internal store) made memory usage spike
- wrapping our custom cell renderers inside yet another layer of the table created too many DOM elements
After implementing most of the features and testing it, the result was clear: it was slower than the old implementation. It became obvious we needed a different approach.
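To give a rough idea, here is a simplified, hypothetical sketch of that translation layer (the types and names are illustrative, not the real component library API). Every update meant converting our cell map into the library's own row model, so the same data ended up living twice: once in our store and once inside the table component.

type Cell = { row: number; col: number; value: string }
type LibraryRow = { id: number; cells: { value: string }[] }

// hypothetical conversion that ran on every change: extra processing time,
// plus a second copy of the data inside the library table's internal store
function toLibraryRows(cells: Cell[]): LibraryRow[] {
  const rows = new Map<number, LibraryRow>()
  for (const cell of cells) {
    const row = rows.get(cell.row) ?? { id: cell.row, cells: [] }
    row.cells[cell.col] = { value: cell.value }
    rows.set(cell.row, row)
  }
  return [...rows.values()]
}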
Rethinking the approach
We paused and decided to take our time exploring alternatives before committing to a next move.
Around that time, I had already been using Claude Code for tasks I wasn't very familiar with, like our Helm charts. It was getting better, outputs were more reliable, and it was clearly having its moment.
So I tried something different: instead of writing specs from scratch, I started feeding Claude our existing table implementation. The code was messy, but it had one important property:
It was deterministic. It did exactly what it was written to do.
I asked Claude to modernize parts of it, restructure it, and match the strongly opinionated design I had in mind. It was a massive project, so I broke it down into smaller pieces and moved forward more carefully. After a few iterations, it started to feel like it might actually work.
Using the old code as the source of truth
When I presented the idea to the team, they supported it. I was tasked with building the first properly structured version, and then we would finish the features together.
What surprised me was how fast things moved after that. Because we already had working code:
- many features worked almost immediately
- behavior didn't need to be redefined
- I could extract implicit "specs" from the code and verify them manually (they weren't perfect, but they gave me a starting point)
Performance improvements
Finally, I made several passes over what re-renders when something happens and cut it down to the bare minimum. The most beneficial improvement came from reducing windowing re-renders; scrolling became super smooth.
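As a minimal sketch of the windowing idea, assuming a MobX-style substore (the names and numbers here are illustrative, not our exact code): because the visible range is a computed value, observers only re-render when the range itself changes, not on every scroll event.

import { makeAutoObservable } from "mobx"

class WindowingStore {
  scrollTop = 0
  viewportHeight = 600
  rowHeight = 32

  constructor() {
    makeAutoObservable(this)
  }

  setScrollTop(value: number) {
    this.scrollTop = value
  }

  // recomputed on every scroll, but observers are only notified
  // when the number actually changes
  get firstVisibleRow() {
    return Math.floor(this.scrollTop / this.rowHeight)
  }

  get lastVisibleRow() {
    return this.firstVisibleRow + Math.ceil(this.viewportHeight / this.rowHeight)
  }
}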
Here are a few things we did to improve performance:
- Minimized DOM elements: each cell renders only a single div
- Reduced event listeners: there is only one editing input, positioned on top of the active cell. The input stays mounted but hidden, and simply moves when a different cell is edited (see the sketch after this list)
- Applied the same pattern to tooltips, extra cell info, right-click menus, dropdowns, etc.: render once, position as needed
- Kept backend data unchanged: it lives in an observable map keyed by cell position, and cells read directly from it
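Here is a rough sketch of that single-editor pattern (simplified; the names are illustrative, not our exact code). The input is always mounted; it stays hidden until a cell becomes active and is then repositioned on top of it.

import type { CSSProperties } from "react"
import { makeAutoObservable } from "mobx"
import { observer } from "mobx-react-lite"

class EditingStore {
  active: { row: number; col: number } | null = null
  value = ""

  constructor() {
    makeAutoObservable(this)
  }

  start(row: number, col: number, initialValue: string) {
    this.active = { row, col }
    this.value = initialValue
  }

  setValue(value: string) {
    this.value = value
  }

  stop() {
    this.active = null
  }
}

type EditorProps = {
  editing: EditingStore
  // returns absolute coordinates of a cell, e.g. from the windowing/scroll substore
  positionOf: (row: number, col: number) => CSSProperties
}

// one input for the whole table: hidden when idle, moved over the active cell
const Editor = observer(({ editing, positionOf }: EditorProps) => (
  <input
    style={
      editing.active
        ? { position: "absolute", ...positionOf(editing.active.row, editing.active.col) }
        : { display: "none" }
    }
    value={editing.value}
    onChange={(e) => editing.setValue(e.target.value)}
  />
))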
Results
The improvements were significant:
- Initial load time of a massive table dropped from ~3 seconds (with multiple frame drops up to 500ms+) to around 1 second
- Rendering lag dropped to a single ~100ms frame drop, or less in many cases
- Cell editing latency became effectively unnoticeable
- Scrolling over thousands of cells became smooth
We added tests, fixed edge cases, and polished the implementation with the team.
The original plan was to release it behind a feature flag, but feedback was good enough that we just shipped it in our upcoming release.
What this showed about LLMs
This experience reinforced something I already had in mind.
If you ask an LLM something like "who is the president of Finland," it might answer correctly, make up a name, or even say something incorrect. But if you give it a reliable source and ask, "According to this document, who is the president of Finland?" you almost always get the correct answer.
The same thing applied to our table rewrite. We already had an implementation; it wasn't great, but it was truthful and deterministic. It did exactly what it was written to do, unlike documentation or vague descriptions like "make it clean and bug-free."
By feeding the model real code and guiding it with a clear, opinionated target, it consistently produced useful results, as long as I stayed involved and guided it step by step.
What I would do differently next time
Next time, I would start by using the existing code to generate reusable tests that capture the actual behavior and specifications. I would spend more time validating that those tests truly cover what's needed and are implementation-agnostic.
Then I'd move on to the new implementation. That would likely allow the agent to progress more independently, with less hand-holding.
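For example, a behavior-level test might look roughly like this (assuming a Vitest-style runner; the applyValue/valueAt API is hypothetical). It describes behavior purely in terms of inputs and observable outputs, so the same test could run against both the old table and the new one.

import { expect, test } from "vitest"
import { TableStore } from "./TableStore"

test("applying a value to a cell makes it readable at that position", () => {
  const store = new TableStore()

  store.applyValue({ row: 2, col: 3 }, "42")

  expect(store.valueAt({ row: 2, col: 3 })).toBe("42")
})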
Design principles behind the new table
If you're interested in the strongly opinionated design behind the new table, it looked like this:
- Absolutely no logic in React components: components remain small and only handle visuals and event bindings
- No prop drilling: components receive the main store (or minimal local data) and act as observers
- React hooks are used only for layout, event management, and user interaction, and kept minimal
- One main store handles business logic, data fetching, and updates
- One UI store (as a substore) manages UI concerns, with smaller substores for specific features like windowing, copy-paste, and edit positioning
- Everything exposed to components is observable or an action
This created a clear separation of concerns, made the code easier to navigate, and made performance tuning much more predictable.
// All logic lives here
class TableStore {
  data = new ObservableMap() // raw backend data, keyed by cell position (e.g. "row:col")
  ui = {
    editing: new EditingStore(),
    selection: new SelectionStore(),
    scroll: new ScrollStore(),
    clipboard: new ClipboardStore(),
  }
}

// Components receive the store and read what they need
const Cell = observer(({ store, row, col }) => (
  <div style={store.ui.scroll.positionOf(row, col)}>
    {store.data.get(`${row}:${col}`)}
  </div>
))

// Overlays render once at the table level, positioned by store state
const Table = observer(({ store }) => (
  <div>
    <Cells store={store} />
    <Editor store={store} />      {/* one, follows the active cell */}
    <ContextMenu store={store} /> {/* one, positioned on demand */}
    <Tooltip store={store} />     {/* one, positioned on demand */}
  </div>
))
Final thoughts
To wrap this up: existing implementations, whether open source or your own, give code agents a clear and reliable guide for both behavior and structure. That leads to higher-quality and more predictable results.