Assembli AI
Building an AI-native pre-construction platform from plan takeoffs to bid management, with a generative UI architecture and agentic development workflow designed around guardrails rather than hope.
- CV-powered plan takeoffs to full bid management: expanded platform scope from a single ML feature to end-to-end pre-construction
- 30–60 widget generative UI system: agent-orchestrated interface with a controlled surface area
- 10x code throughput, 1.5x feature velocity: agentic dev workflow where 85% of output goes to quality, not features
- Client: Assembli AI
- Role: CTO
- Timeline: Sept 2025 – Present
- AI Architecture, Generative UI, Agentic Development, Construction Tech
Six months into a CTO role at an early-stage construction tech company, and I’m building three things simultaneously: a product that helps general contractors stop leaving money on the table during pre-construction, an AI-driven interface architecture that treats widgets as the unit of composition rather than pixels, and a development workflow where a single engineer orchestrates a swarm of AI agents across parallel workstreams without the codebase devolving into chaos.
The Product Problem
Custom home construction runs on bids. A single project might involve 50 subcontractors, each submitting proposals for their trade — framing, electrical, plumbing, HVAC, concrete. That’s easily 100 to 150 bids and related conversations to track, compare, and act on before a shovel hits dirt. When I joined Assembli, the product handled one piece of this: computer vision models that extract material quantities from architectural plans. Useful, but a point solution in a workflow that spans weeks and dozens of relationships.
I’m expanding the platform into the full bid management lifecycle. Generating scopes of work from plans. Managing communication with subs. Flagging gaps or unusual responses in returned bids. Comparing competing proposals for the same trade on the same job. Beyond bid management, I’m building toward what I think of as a “pricing brain” — a system where contractors upload whatever historical bids, prices, and actuals they have, and the platform organizes it into something queryable. Not just analytics dashboards, but conversational access: “How much will 10 extra 2x4x12s cost for this job?” answered from the contractor’s own data. We’re focused on custom single-family builders today, but the bid and cost management tools extend naturally into remodeling, tract homes, multitenant, and light commercial.
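A minimal sketch of the "pricing brain" query path, under stated assumptions: `PriceRecord`, `estimate`, and the field names are all hypothetical, and the real system would sit behind a conversational layer rather than a direct function call. The point is the shape of the answer — a quantity question resolved against the contractor's own historical line items.

```python
from dataclasses import dataclass
from statistics import median

# Hypothetical record of one historical line item from a contractor's
# past bids or actuals. Field names are illustrative.
@dataclass
class PriceRecord:
    item: str        # e.g. "2x4x12 stud"
    unit_cost: float
    source: str      # which job or bid it came from

def estimate(records: list[PriceRecord], item: str, qty: int) -> float:
    """Answer 'how much will N extra <item>s cost?' from the contractor's own data."""
    costs = [r.unit_cost for r in records if r.item == item]
    if not costs:
        raise LookupError(f"no historical pricing for {item!r}")
    # Median resists the occasional outlier bid skewing the answer.
    return qty * median(costs)

history = [
    PriceRecord("2x4x12 stud", 7.10, "Maple St"),
    PriceRecord("2x4x12 stud", 7.90, "Oak Ridge"),
    PriceRecord("2x4x12 stud", 7.40, "Hillcrest"),
]
print(estimate(history, "2x4x12 stud", 10))  # 10 * median(7.10, 7.90, 7.40) = 74.0
```

In practice the retrieval step is the hard part — normalizing "2x4x12", "2x4 12ft", and "stud, 12'" into one item — which is why the data has to be organized into something queryable before the conversational layer is useful.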
Generative UI Without the Chaos
The interface I’m designing centers on a persistent conversation component — similar to Spotlight on macOS but always visible at the bottom of the screen. This gives users access to an orchestration agent built on a LangGraph swarm, where specialized sub-agents handle different workflows and problem domains. The agent can surface widgets onto the screen: a list of bidders for a given trade, bids requiring attention, a side-by-side comparison of competing proposals. These widgets range from a single card to a full-screen dialog depending on context.
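As a rough sketch of the orchestration shape — plain Python standing in for the LangGraph swarm, with invented sub-agent names and a deliberately crude keyword router where the real system would use an LLM classifier:

```python
# Toy router standing in for the orchestration agent: classify the request,
# hand it to a specialized sub-agent, collect any widgets it wants to surface.
# Handler names and widget IDs are illustrative, not the real catalog.

def bid_agent(msg: str) -> dict:
    return {"reply": "3 electrical bids need review", "widgets": ["bid_attention_list"]}

def takeoff_agent(msg: str) -> dict:
    return {"reply": "Takeoff queued for sheet A-201", "widgets": ["takeoff_status_card"]}

SUB_AGENTS = {  # hypothetical domain -> handler map
    "bids": bid_agent,
    "takeoffs": takeoff_agent,
}

def route(msg: str) -> dict:
    """Crude keyword routing; a real orchestrator would classify with an LLM."""
    domain = "bids" if "bid" in msg.lower() else "takeoffs"
    return SUB_AGENTS[domain](msg)

print(route("Which bids need my attention?"))
```

The important property is that every sub-agent returns the same envelope — a reply plus zero or more widget requests — so the conversation component can render any agent's output without knowing which one produced it.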
The critical design decision is what the agent can and cannot control. An agent selects a widget type, a scope of data, a state or condition, and a placement within a small set of layout templates. It does not choose where a button goes or what it does. The widgets themselves handle responsive layout, data binding, and interaction logic through traditional business rules. Widgets can also request other relevant widgets based on context — viewing a bid from Subcontractor A might trigger a sidebar showing their last bid on a different project, competing bids from other subs for this work, and recent emails on file.
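A sketch of what that constrained surface might look like as code — registry contents, field names, and defaults here are assumptions for illustration, not the real widget catalog. The agent emits a small directive; everything outside the allowlist is rejected or defaulted, and everything below this level belongs to the widget's own business rules:

```python
# The full surface an agent controls: widget type, data scope, state, layout slot.
# It cannot reach inside a widget to move a button or change what it does.

WIDGET_REGISTRY = {"bidder_list", "bid_comparison", "attention_queue"}  # illustrative
LAYOUT_SLOTS = {"card", "sidebar", "fullscreen_dialog"}

def validate_directive(d: dict) -> dict:
    """Reject anything outside the allowed surface; fall back to safe defaults."""
    if d.get("widget") not in WIDGET_REGISTRY:
        raise ValueError(f"unknown widget: {d.get('widget')}")
    return {
        "widget": d["widget"],
        "scope": d.get("scope", {}),           # e.g. {"trade": "electrical", "job": 42}
        "state": d.get("state", "default"),
        # Unknown placement degrades to a card rather than failing the render.
        "slot": d["slot"] if d.get("slot") in LAYOUT_SLOTS else "card",
    }

ok = validate_directive({"widget": "bid_comparison",
                         "scope": {"trade": "electrical"},
                         "slot": "fullscreen_dialog"})
print(ok["slot"])
```

The degrade-to-default behavior on `slot` is the guardrail in miniature: a wrong answer from the agent produces a suboptimal placement, not a broken screen.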
I arrived at this architecture after studying Google’s generative UI research, published alongside Gemini 3 in late 2025. Their system generates entire interfaces from prompts — and even with state-of-the-art models, the output is only comparable to human-crafted UI 44% of the time. The other 56% is noticeably worse. The failure mode of fully generative UI is clear: without constraints, the system produces plausible-looking interfaces that don’t actually work for real tasks. My approach limits the agent’s surface area to 30 to 60 purpose-built widgets with strong fallbacks and sensible defaults. The orchestration agent operates both reactively and proactively — answering questions but also suggesting next steps and surfacing relevant information unprompted. The result is an interface that behaves more like a conversation with a knowledgeable partner who can bring things up on screen, not just in text.
This builds on guardrail patterns I’ve been refining since my time at Kno, before Meltwater. The through-line across a decade of AI product work is the same: give the AI enough capability to be genuinely useful in dynamic situations while constraining the failure modes tightly enough that the system doesn’t break when the AI gets something wrong. Emerging standards like MCP, A2A, and the nascent A2UI protocols are finally making this kind of integration practical at the tooling level.
Agentic Development at Scale
The development workflow I’ve built is arguably as experimental as the product itself. The core idea: a single engineer manages a swarm of AI coding agents working across multiple git worktrees in parallel. Each agent operates in isolation. Other agents review their work. Automated audit processes run at multiple steps in the pipeline to catch and fix AI-generated code that doesn’t meet quality standards — the same pattern of agents checking agents’ work that drove the cost reduction at Legacy Logix, applied to software development instead of estate analysis.
The key insight is that increased throughput should not map linearly to increased feature output. If the system produces 10x the code volume, only about 1.5x goes to new features. The remaining 85% is directed at refactoring, hardening, testing, and documentation. This is deliberate. The quality of a codebase maintained by AI agents stays high only if the workflow systematically allocates capacity to identifying and fixing AI slop — not as an afterthought, but as a routine audit process built into every step.
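The 10x/1.5x/85% relationship is simple arithmetic, made explicit here (function name and signature are mine, not part of any tooling):

```python
# If agents produce `throughput_multiple` times the code volume and only
# `feature_multiple` of that is routed to new features, the rest is the
# capacity deliberately reserved for refactoring, hardening, tests, and docs.

def quality_fraction(throughput_multiple: float, feature_multiple: float) -> float:
    """Fraction of total output directed at quality work rather than features."""
    return 1 - feature_multiple / throughput_multiple

print(quality_fraction(10, 1.5))  # 1 - 1.5/10 ≈ 0.85
```

Framing it this way makes the allocation a tunable knob rather than an accident: early in a codebase's life the quality fraction might drop, and in a stabilization phase it might rise above 85%.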
The practical rhythm is: spend a day or two planning and decomposing work into well-defined issues. Spend a day executing rapidly with agents. Spend a day or two auditing results and planning the next cycle. This relies on a backlog of clearly specified, largely automatable tasks — it doesn’t work everywhere. But the results on my own projects have been substantial. The container build system powering this very development environment grew to 80,000 lines of shell with 2,500 unit tests using this approach. A Rust project reached 250,000 lines with 5,000 tests. I’m now bringing these patterns to the Assembli team to see if they scale beyond a single practitioner.
What I’m Learning
The generalist thesis is the thread connecting all three streams. Agentic development has nearly made the language question irrelevant — fundamental computer science skills matter more than syntax fluency when agents handle the translation layer. People who can plan work, design systems, and evaluate output across disciplines will thrive in this model. The most valuable skill isn’t knowing how to write code in a specific language; it’s knowing what to build and being able to judge whether it was built correctly.
This is still early. The generative UI isn’t released yet. The agentic dev patterns are proven on personal projects but unproven at team scale. The pricing brain is a hypothesis. I’m sharing this work in progress because the problems are real, the approaches are grounded in patterns I’ve tested across multiple companies, and the construction industry — an enormous market that’s been largely untouched by modern software — deserves better tools than spreadsheets and phone calls.