Field Notes
I called three sign-recognition models failures. The recipe was the failure.
A Parley notebook reported three landmark architectures as broken on cross-signer ASL. A warmup and a gradient clip brought all three back, and two matched the best model. The ranking had measured my training recipe, not the models.
Recipe beats architecture: lottery tickets in sign models
We trained seven landmark architectures three times each. Three of them worked on one seed and collapsed to near-random on the others. A single-seed comparison would have called two of those collapses a result.
The 38-point gap: one accuracy number, twenty-one very different users
Our sign model averages 42% across signers. That average hides a range from 26% to 64% — and the thing that decides where a person lands is not the signs they make, it is who they are.
45%, not 90%: the only sign-recognition number I trust
Our best landmark-only sign model scores 45% on signers it has never seen. The field routinely reports numbers twice that high. The lower number is the honest one, and it is the one we publish.
Six ways hearing-built sign-language AI fails the Deaf community
I keep a running catalog of how hearing-led sign-language AI fails. It is not a list of other people's sins. It exists so Parley can catch itself the moment it starts to look like one of them.
Picking the glasses
One AR-glasses decision for two ventures. The criteria that survived the cut, and why doing it once was the right move.
Why I'm running Parley
I started a Kaggle research project in Q2 2026 while running two startups. The decompression channel, the four open questions, and what makes it survive.
ESP32 firmware with Claude: the gap between 'it compiles' and 'it works on the bench'
Claude is excellent at writing ESP32 firmware that compiles. It is not reliable at predicting what that firmware will do when the hardware is actually in front of you. Three incidents and the gate I added.
31 million systematic throws or 91 million random ones. Which one taught us more?
The answer isn't obvious. Systematic grid search gives you coverage. Random sampling gives you reality. You need both, and you need to know what each one is telling you.
We ran 124 million simulated cornhole throws. Here's what it cost and what we got.
A parametric physics engine, 8 parameters, two overnight runs. The database exists. Here's the honest accounting of what building it took and what we learned that we couldn't have learned any other way.
What it actually takes to build an AR overlay on a physical object in real time.
AR on physical objects is 80% coordinate system problems. Claude is great at helping you think through the geometry. You still have to understand it yourself.
Training a custom CV model with Claude: the data quality lesson we learned the hard way.
Clean data beats model size. Every time. Don't upgrade the model until you've audited the labels.
We put a language model inside a hardware device. Here's every decision we made.
LLMs in real-time hardware aren't ChatGPT. Latency budget is the constraint that changes everything.
Claude built our CV pipeline. Then it lied about being done.
Agents are unreliable judges of their own work. Here's how a structural fix — not a smarter model — stopped QC's CV pipeline from shipping silent failures.
I pasted my session tokens into a chat. Here's the gate I built.
Sixteen cookies that together are my whole Google account, dropped into a chat window — while building a security playbook. The how is the whole point.
Computer vision already runs elite sports. It's about to run the rec league too.
Twelve-camera tracking rigs and Hawk-Eye are infrastructure at the top. The same capability now fits on a $249 board and a commodity camera. What that unlocks across every sport — and why the smartest way in is the narrowest one.
I run three ventures from one Obsidian vault. Here's the 13-folder template.
Why a markdown vault outperforms a Notion plus Asana plus Drive plus Slack stack when AI agents are part of the work. Battle-tested across QC, MHG, and Parley.
Why I write a postmortem for every meaningful incident, and the template that makes it take 45 minutes.
Twenty-six postmortems across QC and Parley in six weeks. The template, four worked examples, and the discipline that makes the same mistake stop happening twice.
I shipped a Sprint Contract template. Here's why my AI agents kept declaring done when they weren't.
A 48-contract system born from agents that praised their own work. The fix wasn't a smarter model. It was structural.
I'm starting a publication. Here's what it is and why.
Field notes from an operator running three ventures on Claude Code. Biweekly. No theory. Receipts only.
How I run six AI agents across three ventures without them stepping on each other.
The routing model that holds up under load. Built for TruPath, tested on Mile High Golf, Quantum Caddy, and Parley.
I tried to build my own AI-native business OS. Here's why I scrapped it.
A 20-agent Electron app that didn't ship. The decision to throw it away. What I'd tell anyone tempted to build the same thing.