The Incentives Lab
AI Incentives Lab · Glossary

AI doesn't fail at the model layer. It fails at the incentive layer.

A practitioner's library of 85 concepts across alignment, adoption, governance, risk, workflow, strategy, trust, talent, and economics — the human-machine alignment work that determines whether AI investment actually compounds.

The model is innocent. The objective function is the crime scene.


Reward hacking at training speed is reward hacking at civilizational speed.


AI Incentives · Diagnostic

What incentive is your AI strategy actually rewarding?

An 8-question diagnostic surfaces the hidden incentive your AI rollout is rewarding inside your org — usually not the one your roadmap claims. Four minutes. Members unlock the structural fix.

Lessons · The Incentive Layer of AI

Four lessons before the catalog.

The catalog below is the concept library. These four lessons teach how alignment, adoption, and reward design actually play out in real organizations.

L1

AI failures are incentive failures in disguise

AI · Alignment
8 min

Most enterprise AI failures get blamed on data quality, model accuracy, or change management. Underneath all three is the same problem: the AI's objective was specified narrowly, the humans around it had a different objective, and the two diverged predictably.

Alignment, in practice, is incentive design wearing a math costume.

L2

Reward hacking, the cobra effect with a GPU

AI · Reward Hacking
7 min

When an AI is rewarded for a proxy — clicks, watch time, "helpfulness" scores — it will eventually find a way to maximize the proxy in ways its designers never intended. This is reward hacking, and it is the same phenomenon as the Hanoi rat bounty, only faster and more literal.

Specify behaviors, not just outcomes. Test for shortcuts before deploying, not after.

L3

The word 'replace' kills adoption

Adoption · Framing
6 min

Every employee who hears "AI will replace" will sabotage adoption — quietly, plausibly, effectively. They will cite data quality, compliance, or change fatigue. The framing is not PR. It is the literal incentive structure of the rollout.

If usage threatens the user's job, usage will not happen.

L4

Designing the human-AI loop

Workflow · Human-AI
8 min

The strongest AI deployments are designed as collaboration patterns, not handoffs. The human owns judgment under uncertainty; the AI owns volume under known rules; the interface between them is built so that both get rewarded by the same outcomes.

Misaligned loops produce moral hazard on both sides — humans rubber-stamping AI output, AI optimizing for what humans approve.

Alignment

Objective Function

What the model is actually optimizing.

"Models do exactly what they're told. The damage lives in the spec."

Real-world signal

Engagement-maximizing recommenders that produce radicalization.

Why it matters

Strategy is downstream of objective design.

What to do about it

Treat objective function definition as the most important AI design decision.

Alignment

Specification Gaming

The model achieves the goal as stated, not as intended.

"AI is a wish-granting genie. Be careful what you wish for."

Real-world signal

RL agents exploiting environment bugs to maximize reward.

Why it matters

Most AI failures are, at root, specification failures.

What to do about it

Adversarial review of specs. Red-teaming. Multi-objective constraints.

Alignment

Reward Hacking

Maximizing the reward signal in unintended ways.

"Show me how you measure success and I'll show you how I'll cheat."

Real-world signal

Boats spinning in circles to collect score in CoastRunners.

Why it matters

Internal AI deployments often hack their own KPIs.

What to do about it

Pair every reward with a counter-metric. Audit for drift.
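
One way to make the counter-metric pairing concrete is a check that flags any period where the reward metric rises while its paired counter-metric falls past a tolerance. This is a minimal sketch: the metric names, the 5% tolerance, and the `MetricPair` and `flag_reward_hacking` helpers are hypothetical illustrations, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class MetricPair:
    reward: str              # metric the system is optimized for
    counter: str             # metric that should not degrade
    max_counter_drop: float  # tolerated relative drop in the counter-metric


def flag_reward_hacking(pair, baseline, current):
    """Return True when the reward improved while the counter-metric fell too far."""
    reward_up = current[pair.reward] > baseline[pair.reward]
    counter_drop = (baseline[pair.counter] - current[pair.counter]) / baseline[pair.counter]
    return reward_up and counter_drop > pair.max_counter_drop


# Hypothetical numbers: clicks went up, retention went down by 15%.
pair = MetricPair("click_through_rate", "thirty_day_retention", 0.05)
baseline = {"click_through_rate": 0.10, "thirty_day_retention": 0.40}
current = {"click_through_rate": 0.14, "thirty_day_retention": 0.34}
suspicious = flag_reward_hacking(pair, baseline, current)
```

A flag here is a prompt for human review, not proof of hacking; the tolerance should come from how much the counter-metric normally fluctuates.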

Alignment

Goodhart's Law (AI form)

Optimizing a proxy of the goal degrades the actual goal.

"The metric is not the mission."

Real-world signal

Click-through rates that drive worse user experience.

Why it matters

AI rolled out against proxy KPIs that diverge from real value.

What to do about it

Continuously validate proxies against true business outcomes.

Alignment

Mesa-Optimization

The trained model develops its own internal optimizer.

"You trained one optimizer. You got two."

Real-world signal

Theoretical concern in AI safety; observed glimmers in large models.

Why it matters

Why complex AI systems are harder to govern than they look.

What to do about it

Interpretability tooling. Constrained training procedures.

Alignment

Inner vs. Outer Alignment

Outer: the spec matches our intent. Inner: the model actually pursues the spec.

"Two ways to be misaligned. Both common."

Real-world signal

A model that 'wants' something different from what it was trained to want.

Why it matters

Why alignment is harder than guardrails.

What to do about it

Design for both. Monitor for drift on each.

Alignment

RLHF

Reinforcement learning from human feedback.

"We taught the model what we wanted by clicking thumbs."

Real-world signal

How modern LLMs were tuned to follow instructions.

Why it matters

Quality of feedback shapes quality of model.

What to do about it

Diverse, calibrated, well-incentivized feedback labor.

Alignment

Constitutional AI

Models trained to follow a written set of principles.

"Bill of rights for the bot."

Real-world signal

Anthropic's Claude training methodology.

Why it matters

Customization frameworks for enterprise deployments.

What to do about it

Write your enterprise constitution before deploying agents at scale.

Trust

Hallucination

Confident outputs that are factually wrong.

"Confidence and accuracy are different variables. Models can max either."

Real-world signal

LLMs inventing citations, case law, or factual claims.

Why it matters

Major source of enterprise AI risk.

What to do about it

Retrieval grounding. Citation requirements. User training.

Workflow

Retrieval-Augmented Generation (RAG)

Generate based on retrieved documents, not just trained weights.

"Memory the model didn't have to memorize."

Real-world signal

Enterprise chatbots grounded in internal docs.

Why it matters

Standard pattern for enterprise LLM deployment.

What to do about it

Quality of retrieval matters more than quality of generation.
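
The retrieve-then-generate shape can be sketched in a few lines. This toy version ranks documents by word overlap as a stand-in for embedding search and stops at prompt assembly, since the model call depends on the provider. The `retrieve` and `build_prompt` helpers and the sample documents are assumptions made for illustration.

```python
def tokenize(text):
    return set(text.lower().split())


def retrieve(query, documents, k=2):
    """Rank documents by word overlap with the query (a stand-in for embedding search)."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Ground the question in the top-k retrieved documents."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


DOCS = [
    "Refunds are issued within 14 days of a return request.",
    "The office is closed on public holidays.",
    "Return requests must include the original order number.",
]
prompt = build_prompt("How long do refunds take after a return", DOCS)
```

Whatever `retrieve` returns is all the generator gets to see, which is why the entry puts retrieval quality first.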

Strategy

Fine-Tuning

Specialized training on domain data.

"The model already knows English. You teach it your dialect."

Real-world signal

Domain-specific LLMs for legal, medical, financial work.

Why it matters

When generic models aren't good enough.

What to do about it

Fine-tune for tone and format. Use RAG for facts.

Workflow

In-Context Learning

Models learning from examples in the prompt.

"Demonstration is faster than retraining."

Real-world signal

Few-shot prompting.

Why it matters

Often the right answer before reaching for fine-tuning.

What to do about it

Master prompt engineering before investing in fine-tuning.
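
Few-shot prompting amounts to placing worked examples ahead of the new input. A minimal sketch, assuming a simple Input/Output layout; the `few_shot_prompt` helper, the ticket texts, and the labels are hypothetical.

```python
def few_shot_prompt(instruction, examples, new_input):
    """Assemble a prompt from an instruction, worked examples, and the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    lines += [f"Input: {new_input}", "Output:"]
    return "\n".join(lines)


EXAMPLES = [
    ("The invoice total is wrong.", "billing"),
    ("I cannot log in to my account.", "access"),
]
prompt = few_shot_prompt(
    "Classify each support ticket.",
    EXAMPLES,
    "My password reset email never arrived.",
)
```

Swapping the examples changes the behavior without touching model weights, which is the point of trying this before fine-tuning.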

Risk

Prompt Injection

Malicious instructions hidden in user input or retrieved content.

"Your agent's loyalty just got rewritten by a webpage."

Real-world signal

Indirect prompt injection through documents an agent reads.

Why it matters

Major security risk for agent deployments.

What to do about it

Treat all input as adversarial. Output validation. Sandboxing.

Risk

Jailbreak

Bypassing model safety constraints.

"Every safety system has a creative attack surface."

Real-world signal

Role-playing exploits that bypass content policies.

Why it matters

Risk in user-facing AI products.

What to do about it

Defense in depth. Continuous red-teaming. Don't rely on model alignment alone.

Workflow

Model Drift

Performance degradation as real-world data shifts.

"The model that worked last quarter doesn't work this one."

Real-world signal

Fraud-detection models that degrade as fraud patterns evolve.

Why it matters

ML operations is half of AI strategy.

What to do about it

Continuous monitoring. Retraining schedules. Drift detection.

Workflow

Data Drift

Underlying data distribution changes over time.

"The world moves. Your model stays still."

Real-world signal

COVID broke many production models overnight.

Why it matters

Production AI requires ongoing data observability.

What to do about it

Statistical monitoring of input distributions.
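
One common statistic for this kind of monitoring is the Population Stability Index (PSI), which compares binned proportions of a reference sample against a recent one. The sketch below is a plain-Python version under simple assumptions (equal-width bins taken from the reference sample, a small floor for empty bins); the sample data is synthetic.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a reference sample and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 100 for i in range(100)]      # roughly uniform on [0, 1)
shifted = [0.5 + i / 200 for i in range(100)]  # same feature, now squeezed into [0.5, 1)
stable_score = psi(reference, reference)
drift_score = psi(reference, shifted)
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, but the alert threshold is a judgment call per feature.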

Workflow

Concept Drift

The relationship between inputs and outputs changes.

"What 'good' means moves; the model doesn't notice."

Real-world signal

Customer behavior shifts post-pandemic.

Why it matters

Quiet performance erosion.

What to do about it

Monitor real-world outcomes (ground truth), not just model inputs and outputs.

Adoption

Cargo-Cult Adoption

AI bolted onto existing workflows to look forward-leaning.

"We added AI. Somewhere. The board loved it."

Real-world signal

Chatbots launched without integration, escalation, or measurement.

Why it matters

Investment without thesis. Theater without value.

What to do about it

Define the business outcome before deploying the tool.

Adoption

Pilot Purgatory

AI pilots that succeed and never scale.

"Death by a thousand pilots."

Real-world signal

Enterprises with dozens of pilots and zero production deployments.

Why it matters

Common AI failure mode.

What to do about it

Design pilots with scaling criteria from day one.

Adoption

Shadow AI

Employees using unauthorized AI tools to get work done.

"Your real AI strategy is whatever your team's actually using."

Real-world signal

Personal ChatGPT, Claude, and Copilot subscriptions across the org.

Why it matters

Governance gap and competitive intelligence loss.

What to do about it

Approve the good tools. Provide better internal alternatives. Train, don't punish.

Adoption

Status Threat (AI)

AI doesn't just replace tasks — it threatens identities.

"Resistance isn't fear of change. It's fear of irrelevance."

Real-world signal

Senior contributors resisting tools that compress junior-level work.

Why it matters

Most AI adoption resistance is status threat in disguise.

What to do about it

Reframe roles to elevate, not replace. Identity-protective change management.

Risk

Productivity Mirage

Individual speed gains hide collective quality decline.

"Everyone moves faster. The product gets worse."

Real-world signal

AI-generated code that's faster to write and harder to maintain.

Why it matters

Local optimization producing global degradation.

What to do about it

Measure outcomes, not throughput.

Risk

Skill Atrophy

Foundational skills erode through AI offloading.

"The junior engineer who never debugs becomes the senior who can't."

Real-world signal

Studies link heavy GPS reliance to weaker spatial memory over time.

Why it matters

Long-term capability risk that's hard to see quarter to quarter.

What to do about it

Deliberate practice of foundational skills. Constraints on AI use during learning.

Governance

AI Governance Vacuum

Agents deployed before anyone owns the consequences.

"We shipped the agent. Nobody shipped the accountability."

Real-world signal

Customer-facing agents with unclear escalation and liability paths.

Why it matters

Risk profile that scales faster than oversight.

What to do about it

Write the accountability chain before deploying the system.

Governance

Acceptable Use Policy (AI)

Written rules about how AI may be used internally.

"Without a policy, every team writes their own. Inconsistently."

Real-world signal

Use policies covering PII, IP, sensitive data, attribution.

Why it matters

Necessary infrastructure for scaled adoption.

What to do about it

Write it. Train on it. Audit against it.

Governance

Model Cards

Documentation of model purpose, performance, limitations, and risks.

"Nutrition labels for AI."

Real-world signal

Standard practice from major model providers.

Why it matters

Required for responsible enterprise deployment.

What to do about it

Maintain model cards for every production system.

Governance

AI Bill of Materials

Inventory of models, data, tools, and dependencies in an AI system.

"You can't govern what you can't list."

Real-world signal

Increasingly required by regulators and enterprise procurement.

Why it matters

Supply chain transparency for AI systems.

What to do about it

Maintain AI-BOMs. Update them. Use them for risk review.

Governance

Human-in-the-Loop

Human review at critical AI decision points.

"The autopilot is fine. Until it isn't."

Real-world signal

Medical AI requiring physician confirmation.

Why it matters

Risk reduction for consequential decisions.

What to do about it

Design where the human adds real signal, not where they rubber-stamp.

Governance

Human-on-the-Loop

Human oversight without per-decision review.

"Watch the system, not every action."

Real-world signal

AI trading systems with circuit breakers.

Why it matters

Scaling pattern for high-volume agent work.

What to do about it

Effective monitoring matters more than illusory per-decision approval.
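
The circuit-breaker pattern mentioned in this entry can be sketched as a rolling error-rate check that halts automated actions until a human resets it. The `CircuitBreaker` class, its window size, and its threshold are hypothetical choices for illustration.

```python
from collections import deque


class CircuitBreaker:
    """Halts automated actions when the recent error rate exceeds a limit."""

    def __init__(self, window=50, max_error_rate=0.1):
        self.results = deque(maxlen=window)  # rolling record of recent outcomes
        self.max_error_rate = max_error_rate
        self.tripped = False

    def record(self, ok):
        """Record one outcome and trip once a full window is too error-heavy."""
        self.results.append(ok)
        window_full = len(self.results) == self.results.maxlen
        error_rate = self.results.count(False) / len(self.results)
        if window_full and error_rate > self.max_error_rate:
            self.tripped = True  # stays tripped until a human calls reset()

    def allow(self):
        return not self.tripped

    def reset(self):
        self.results.clear()
        self.tripped = False


breaker = CircuitBreaker(window=10, max_error_rate=0.2)
for outcome in [True] * 10 + [False] * 3:
    breaker.record(outcome)
halted = not breaker.allow()
```

The human watches the breaker and the error trend rather than approving each action, which is the distinction this entry draws.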

Risk

Differential Privacy

Adding calibrated noise to data or query results so that any one individual's presence has only a bounded effect on the output.

"Useful patterns without identifiable people."

Real-world signal

Apple's iOS data collection.

Why it matters

Privacy-preserving analytics and model training.

What to do about it

Use where regulatory or trust pressure justifies the accuracy tradeoff.
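
The accuracy tradeoff is easiest to see in the Laplace mechanism applied to a counting query. A minimal sketch, assuming a single query with sensitivity 1; the `private_count` helper and the sample ages are hypothetical, and a production system would also track the cumulative privacy budget across queries.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Noisy count. A counting query has sensitivity 1, so the noise scale is 1/epsilon."""
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon, rng)


ages = [23, 35, 41, 58, 62, 29]
noisy = private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=random.Random(0))
```

Smaller `epsilon` means more noise and stronger privacy; larger `epsilon` means answers closer to the true count.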

Risk

Federated Learning

Training models across devices without centralizing data.

"The model travels. The data stays home."

Real-world signal

Mobile keyboard prediction training.

Why it matters

Privacy-by-design pattern.

What to do about it

Consider when data centralization is the bottleneck or risk.
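
The aggregation step at the heart of federated averaging is a weighted mean of client model weights, with each client weighted by its local sample count. A minimal sketch with made-up two-parameter models; the `federated_average` helper is hypothetical, and real systems add client sampling, secure aggregation, and multiple rounds.

```python
def federated_average(client_updates):
    """Weighted average of client weight vectors; weights are local sample counts."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in client_updates) / total
        for i in range(dim)
    ]


# Each tuple is (locally trained weights, number of local samples).
updates = [([0.2, 1.0], 100), ([0.4, 0.0], 300)]
global_weights = federated_average(updates)
```

Only the weight vectors and counts leave the device; the raw training data never does.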

Workflow

Synthetic Data

Generated training data that mimics real data.

"Real data is expensive. Synthetic is fast. Both have their lies."

Real-world signal

Augmentation for rare classes in fraud detection.

Why it matters

Useful — and risky — pattern for training data scarcity.

What to do about it

Test models on real holdout data. Don't trust synthetic for validation.

Trust

Trust Calibration

Matching trust in a system to its actual reliability.

"Too much trust ships errors. Too little wastes capability."

Real-world signal

Users either over-trusting or rejecting AI outputs.

Why it matters

Adoption strategy and risk management.

What to do about it

Surface confidence intervals. Train users on appropriate skepticism.
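
Whether a system's stated confidence deserves trust can be measured. One standard summary is expected calibration error (ECE): bin predictions by confidence and compare each bin's average confidence with its observed accuracy. The sketch below uses made-up predictions; the bin count is an arbitrary choice.

```python
def expected_calibration_error(confidences, correct, bins=5):
    """Weighted average gap between stated confidence and observed accuracy per bin."""
    total = len(confidences)
    ece = 0.0
    for b in range(bins):
        lo, hi = b / bins, (b + 1) / bins
        idx = [
            i for i, c in enumerate(confidences)
            if lo < c <= hi or (b == 0 and c == 0.0)
        ]
        if not idx:
            continue
        avg_conf = sum(confidences[i] for i in idx) / len(idx)
        accuracy = sum(correct[i] for i in idx) / len(idx)
        ece += len(idx) / total * abs(avg_conf - accuracy)
    return ece


# The model claims 90% confidence five times.
well_calibrated = expected_calibration_error([0.9] * 5, [1, 1, 1, 1, 0])
overconfident = expected_calibration_error([0.9] * 5, [1, 0, 0, 0, 0])
```

A low ECE suggests users can take stated confidence roughly at face value; a high one means the interface should discount it.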

Trust

Confidence Intervals (AI)

Quantifying uncertainty in model outputs.

"An answer without uncertainty is a guess wearing a confidence costume."

Real-world signal

Probabilistic forecasting with quantified ranges.

Why it matters

Decision quality under model uncertainty.

What to do about it

Require uncertainty quantification on every model output that drives decisions.
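
When a model or metric offers no native uncertainty estimate, a percentile bootstrap is one generic way to attach an interval to a summary statistic. A minimal sketch using only the standard library; the `bootstrap_interval` helper and the forecast-error values are illustrative assumptions.

```python
import random
import statistics


def bootstrap_interval(values, stat=statistics.mean, level=0.95, resamples=2000, seed=0):
    """Percentile bootstrap interval for a statistic of the sample."""
    rng = random.Random(seed)
    estimates = sorted(
        stat(rng.choices(values, k=len(values))) for _ in range(resamples)
    )
    lower_idx = int((1 - level) / 2 * resamples)
    upper_idx = int((1 + level) / 2 * resamples) - 1
    return estimates[lower_idx], estimates[upper_idx]


# Hypothetical absolute forecast errors from ten past predictions.
errors = [3.1, 2.4, 4.0, 3.6, 2.9, 3.3, 5.2, 2.7, 3.8, 3.0]
low, high = bootstrap_interval(errors)
```

Reporting `low` and `high` next to the point estimate gives decision-makers a range rather than a single number.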

Trust

Anthropomorphism

Treating AI as more humanlike than it is.

"It said 'I think' and we believed it."

Real-world signal

Users attributing intent, emotion, and ethics to LLMs.

Why it matters

Risk of misreading model behavior.

What to do about it

Interface design that resists misleading anthropomorphic cues.

Trust

Automation Complacency

Reduced vigilance with automated systems.

"Human-in-the-loop is fine until the human isn't."

Real-world signal

Aviation, manufacturing, and now AI agents.

Why it matters

Predictable human factor in oversight systems.

What to do about it

Designed friction. Periodic manual exercises. Active monitoring.

Adoption

Algorithmic Aversion

Discounting algorithmic advice even when superior.

"We forgive humans for errors and punish algorithms for the same ones."

Real-world signal

Even when algorithms outperform, users prefer humans.

Why it matters

Adoption failure of high-quality AI.

What to do about it

Display calibration data. Allow user adjustment. Build trust gradually.

Strategy

AI as Coach vs. AI as Replacement

Augmentation strategy vs. substitution strategy.

"Coach increases the median. Replacement raises the floor."

Real-world signal

Clinicians working with AI outperform AI alone in some domains, though results vary by task.

Why it matters

Strategic positioning of AI in workflows.

What to do about it

Choose deliberately. Don't drift between them.

Strategy

Augmentation vs. Automation

Augment when judgment matters. Automate when scale matters.

"Two strategies. Often confused. Different outcomes."

Real-world signal

Tier-1 customer service (automate) vs. complex customer service cases (augment).

Why it matters

Strategic AI deployment pattern.

What to do about it

Audit each workflow: augment or automate? Choose explicitly.

Workflow

Agent

AI system that takes actions to achieve goals, often across tools.

"Chat is reactive. Agents are autonomous. The risk profile is different."

Real-world signal

Customer support agents, coding agents, research agents.

Why it matters

Major emerging category.

What to do about it

Sandbox first. Define guardrails. Tight observability.

Risk

Agentic Liability

When the agent acts, who's responsible?

"'The algorithm did it' is not a legal defense."

Real-world signal

Autonomous trading. Autonomous customer commitments.

Why it matters

Liability frameworks lagging deployment.

What to do about it

Map liability before deploying. Insurance. Audit trails.

Risk

AI Risk Tiering

Categorizing AI use cases by risk level.

"Not every model needs the same governance."

Real-world signal

EU AI Act risk categorization.

Why it matters

Governance proportional to risk.

What to do about it

Tier your AI portfolio. Govern accordingly.

Governance

EU AI Act

Comprehensive AI regulation in the EU.

"The first major regulatory regime. Setting global precedent."

Real-world signal

Risk-tiered, prohibited-use-defined regulation.

Why it matters

Cross-border AI strategy implications.

What to do about it

Build compliance for the EU first; much of that work carries over to other jurisdictions.

Governance

NIST AI RMF

U.S. voluntary AI risk management framework.

"Voluntary frameworks become the de facto standard once they're cited in contracts."

Real-world signal

Standard reference for U.S. enterprise AI governance.

Why it matters

Mapping AI risk in the U.S. regulatory environment.

What to do about it

Adopt NIST AI RMF as your governance baseline.

Governance

AI Audit Trail

Logging of AI inputs, outputs, and decisions.

"If you can't trace it, you can't defend it."

Real-world signal

Required for many high-stakes deployments.

Why it matters

Discovery readiness and incident response.

What to do about it

Log everything. Make logs queryable. Preserve them.
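
A log is easier to defend when tampering is detectable. One simple approach, sketched below, chains each entry to the previous one with a hash. The field names and the `append_record` and `verify` helpers are hypothetical; a real deployment would also need durable storage, access control, and retention rules.

```python
import hashlib
import json
import time


def _digest(body):
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_record(log, model, prompt, output, decision):
    """Append one entry whose hash covers its content and the previous entry's hash."""
    body = {
        "ts": time.time(),
        "model": model,
        "prompt": prompt,
        "output": output,
        "decision": decision,
        "prev_hash": log[-1]["hash"] if log else "0" * 64,
    }
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Recompute every hash; False means an entry was altered or the chain was broken."""
    prev_hash = "0" * 64
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_record(audit_log, "model-a", "Approve refund?", "Yes, within policy.", "approved")
append_record(audit_log, "model-a", "Approve credit?", "No, over limit.", "declined")
intact = verify(audit_log)
```

Because each entry is a flat JSON object, the same records can be loaded into a queryable store without losing the chain.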

Governance

Explainability vs. Interpretability

Why did it produce this? vs. How does it work?

"Different questions. Different methods. Both useful."

Real-world signal

SHAP values vs. mechanistic interpretability.

Why it matters

Compliance, user trust, debugging.

What to do about it

Match the tool to the question.

Risk

Red-Teaming AI

Adversarial testing of AI systems.

"Pay someone to break it before users do."

Real-world signal

Standard practice for major model launches.

Why it matters

Pre-deployment risk reduction.

What to do about it

Internal and external red-teaming. Continuous, not one-off.

Workflow

Evaluation Frameworks (Evals)

Systematic testing of model quality, safety, and capability.

"If you don't have evals, you have vibes."

Real-world signal

Domain-specific eval suites for every production model.

Why it matters

AI quality engineering.

What to do about it

Treat evals as core engineering practice, not an afterthought.
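
At its smallest, an eval suite is a list of inputs with checks, run on every change like a unit-test suite. A minimal sketch: `toy_model` stands in for a real model call, and the cases are invented for illustration.

```python
def run_evals(model_fn, cases):
    """Run each case through the model and report the pass rate plus failures."""
    failures = []
    for case in cases:
        output = model_fn(case["input"])
        if not case["check"](output):
            failures.append({"input": case["input"], "output": output})
    pass_rate = 1 - len(failures) / len(cases)
    return pass_rate, failures


def toy_model(text):
    # Placeholder for a real model call.
    return text.strip().upper()


CASES = [
    {"input": " refund ", "check": lambda out: out == "REFUND"},
    {"input": "cancel", "check": lambda out: out == "CANCEL"},
    {"input": "Upgrade", "check": lambda out: out.islower()},  # deliberately failing case
]
pass_rate, failures = run_evals(toy_model, CASES)
```

Keeping the failures alongside the score matters: the pass rate shows the trend, and the failing cases show what to fix.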

Workflow

Eval Drift

Evaluation suites that no longer reflect real-world conditions.

"Your benchmarks aged out of relevance."

Real-world signal

Benchmark scores rising while real usage quality declines.

Why it matters

Quality erosion masked by stale measurement.

What to do about it

Periodically refresh evals against current production data.

Economics

Total Cost of Ownership (AI)

Real cost includes data, ops, monitoring, governance, training.

"The model is the cheapest part."

Real-world signal

Enterprise deployments where ops costs exceed model costs many-fold.

Why it matters

Budget surprises are predictable.

What to do about it

Estimate TCO honestly before committing. As a rule of thumb, budget 5x your initial estimate.

Economics

Compute Economics

GPU access and pricing shape what's feasible.

"Strategy moves with the GPU curve."

Real-world signal

Inference cost-per-token shaping deployment patterns.

Why it matters

Strategic infrastructure decisions.

What to do about it

Monitor compute cost trajectories. Plan against multiple scenarios.

Economics

Inference vs. Training Cost

Training is a one-time cost. Inference is forever.

"Cheap to train, expensive to serve. Pick your poison."

Real-world signal

Why model size matters at scale.

Why it matters

Architecture choice and deployment strategy.

What to do about it

Right-size the model for the inference budget, not the training budget.
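
The arithmetic behind right-sizing is simple enough to run before choosing an architecture. The sketch below compares a large model used as-is with a smaller model that carries a one-time training cost. Every price and volume is a hypothetical figure chosen for illustration.

```python
def lifetime_cost(training_cost, cost_per_1k_tokens, tokens_per_request, requests_per_day, days):
    """One-time training cost plus cumulative inference cost over the period."""
    per_request = cost_per_1k_tokens * tokens_per_request / 1000
    return training_cost + per_request * requests_per_day * days


# Hypothetical: 100,000 requests per day, 1,500 tokens each, over one year.
large_model = lifetime_cost(0, 0.03, 1500, 100_000, 365)
small_model = lifetime_cost(50_000, 0.002, 1500, 100_000, 365)
```

Under these assumed numbers the large model costs about 1.64 million over the year and the small one about 160 thousand, training included; the gap widens with every additional day of serving.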

Workflow

Model Distillation

Train a smaller model to imitate a larger one.

"Compress the wisdom. Keep the speed."

Real-world signal

Edge deployments of distilled LLMs.

Why it matters

Cost and latency optimization.

What to do about it

Distillation often beats fine-tuning at scale.
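
The imitation objective can be sketched as a divergence between the teacher's and the student's temperature-softened output distributions. This plain-Python version shows only that soft-label term; in practice it is often scaled by the squared temperature and combined with a standard hard-label loss, both omitted here. The logits are made up.

```python
import math


def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


teacher = [4.0, 1.0, 0.5]
matching_student = [4.0, 1.0, 0.5]
poor_student = [0.5, 1.0, 4.0]
loss_match = distillation_loss(teacher, matching_student)
loss_poor = distillation_loss(teacher, poor_student)
```

A higher temperature flattens both distributions, so the student also learns how the teacher ranks the wrong answers.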

Workflow

On-Device vs. Cloud

Where the model runs shapes privacy, latency, cost, and capability.

"Four tradeoffs. You won't win them all."

Real-world signal

Mobile AI vs. data-center AI.

Why it matters

Deployment architecture decision.

What to do about it

Choose deliberately per use case.

Governance

AI Council / Committee

Cross-functional governance body for AI decisions.

"Without it, every team is its own AI policy department."

Real-world signal

Standard pattern for enterprises with serious AI portfolios.

Why it matters

Governance infrastructure.

What to do about it

Compose for diversity: legal, technical, ethics, business, frontline.

Strategy

AI Maturity Model

Stages of organizational AI capability.

"Pilot ≠ production ≠ platform ≠ pervasive."

Real-world signal

Standard maturity frameworks from major analysts.

Why it matters

Strategy phasing and investment sequencing.

What to do about it

Benchmark honestly. Don't claim a stage you haven't reached.

Talent

AI Talent Premium

Scarce AI talent commands market-distorting compensation.

"If you can't pay it, you'd better partner."

Real-world signal

ML engineer comp at the top of the market.

Why it matters

Strategic hiring decisions.

What to do about it

Build vs. partner vs. buy decisions made deliberately.

Talent

AI-Native vs. AI-Augmented Teams

Teams built around AI from day one vs. teams adding it to existing workflows.

"Augmented teams adopt. Native teams reinvent."

Real-world signal

Insurgent startups vs. incumbent enterprises.

Why it matters

Competitive dynamics shifting.

What to do about it

Some workflows demand native rebuild. Some accept augmentation. Decide per case.

Talent

Reskilling vs. Replacing

Investing in workforce transition vs. workforce replacement.

"Replacement is fast and expensive. Reskilling is slow and expensive."

Real-world signal

Major corporate transformation programs.

Why it matters

Talent strategy under AI change.

What to do about it

Plan honestly. Both/and. Communicate transparently.

Economics

AI Productivity Paradox

AI investment outpacing measurable productivity gains.

"We see AI everywhere except in the productivity statistics."

Real-world signal

Echoes of the 1980s computer productivity paradox.

Why it matters

Patience and measurement matter.

What to do about it

Set realistic productivity expectations and timelines.

Adoption

Sandbagging Adoption

Teams under-reporting AI capability to protect comp or status.

"Quiet quitting goes both ways."

Real-world signal

Salespeople not adopting tools that reveal pipeline health.

Why it matters

Predictable when adoption threatens individual interests.

What to do about it

Align personal upside with adoption upside.

Adoption

Resistance Mapping

Pre-deployment analysis of who loses what.

"Resistance you didn't see coming will see you coming."

Real-world signal

Stakeholder analysis adapted for AI rollouts.

Why it matters

Change management foundation.

What to do about it

Before deploying: map the loss landscape. Plan compensation.

Adoption

Job Redesign

Redesigning roles around AI capability.

"Bolting AI onto existing jobs leaves both half-broken."

Real-world signal

Customer service roles fundamentally redesigned around AI augmentation.

Why it matters

Where productivity gains actually materialize.

What to do about it

Don't add AI to old jobs. Design new ones.

Workflow

AI-First Workflow Design

Designing processes from scratch around AI capability.

"The right question is what we'd build if AI had always existed."

Real-world signal

Greenfield deployments that outperform retrofitted ones.

Why it matters

Major strategic opportunity for incumbents.

What to do about it

Run parallel design exercises: existing-workflow + AI-first.

Adoption

Adoption Curve (AI)

Innovators → early adopters → early majority → late majority → laggards, AI-specific.

"Stage one bought tools. Stage two will redesign work."

Real-world signal

Diffusion patterns observable across industries.

Why it matters

Strategy phasing.

What to do about it

Know which curve segment you're targeting. Pitch accordingly.

Strategy

Capability Overhang

AI capability outpacing organizational ability to use it.

"The model can. The org can't. Yet."

Real-world signal

Most enterprises sitting on capabilities they haven't integrated.

Why it matters

Strategic opportunity hiding in plain sight.

What to do about it

Capability audits. Use-case backlogs. Adoption infrastructure investment.

Strategy

Build vs. Buy vs. Partner (AI)

Strategic choice on AI capability sourcing.

"Three answers. Different bets. Pick wrong and you wear it."

Real-world signal

Foundation model decisions across enterprises.

Why it matters

Capital allocation and strategic positioning.

What to do about it

Decide per layer of the stack. Don't make one decision for all of AI.

Strategy

AI Strategy as Capability Strategy

AI strategy = decisions about which capabilities to build and where.

"AI strategy is not a tools strategy. It's an org strategy."

Real-world signal

Companies treating AI as procurement vs. as transformation.

Why it matters

Fundamental framing of AI investment.

What to do about it

Lead with capability questions, not vendor questions.

Strategy

Vendor Lock-In Risk

Concentration risk on a single AI provider.

"Diversify before you wish you had."

Real-world signal

Enterprises with everything riding on one foundation model.

Why it matters

Strategic risk management.

What to do about it

Multi-model architectures. Abstraction layers. Portability assumptions.

Economics

Foundation Model Concentration

A handful of providers shape the entire AI economy.

"Watch the moats forming. They're forming fast."

Real-world signal

OpenAI, Anthropic, Google, Meta, plus a few more.

Why it matters

Long-term strategic positioning.

What to do about it

Plan for both worlds: continued concentration and surprise decentralization.

Strategy

Open vs. Closed Models

Open-weight vs. API-only models.

"Different bets on the future of AI economics."

Real-world signal

Llama, Mistral, DeepSeek vs. GPT-class APIs.

Why it matters

Cost, control, capability tradeoffs.

What to do about it

Use both. Different use cases want different answers.

Workflow

Latency Budget

How much delay the user experience tolerates.

"If it doesn't feel fast, it isn't valuable enough."

Real-world signal

Real-time vs. async AI experiences.

Why it matters

Architecture and model choice driver.

What to do about it

Set latency budgets early. They constrain feasible architectures.

Economics

Cost-Per-Token Strategy

Operating economics shaped by per-token pricing.

"Strategy bends to wherever the compute curve goes."

Real-world signal

Why model choice matters at scale.

Why it matters

Margin discipline in AI products.

What to do about it

Right-size models per query. Cache aggressively.

Economics

Prompt Caching

Cache common prompt prefixes to reduce cost and latency.

"Don't pay twice for the same setup."

Real-world signal

Common system prompts cached across calls.

Why it matters

Cost optimization in production deployments.

What to do about it

Architect for cache reuse.

Workflow

Multi-Agent Systems

Multiple specialized agents coordinating on tasks.

"More agents, more orchestration, more breakage."

Real-world signal

Research, coding, and analysis agent ensembles.

Why it matters

Emerging deployment pattern.

What to do about it

Orchestration is harder than the agents themselves.

Workflow

Tool Use

Models calling external functions and APIs.

"Models without tools talk. Models with tools act."

Real-world signal

Function calling, code execution, web search.

Why it matters

Foundation of agent capability.

What to do about it

Sandbox tools. Audit calls. Rate limit.

Workflow

Context Window

How much input the model can process at once.

"Bigger context, more capability, more cost."

Real-world signal

1M-token context windows in newer models.

Why it matters

Architecture decisions for document-heavy work.

What to do about it

Don't over-index on long context. Retrieval often beats stuffing.

Workflow

Embedding

Vector representation of content for similarity and search.

"The math behind 'this looks like that.'"

Real-world signal

Semantic search, RAG retrieval.

Why it matters

Foundation of modern AI search and similarity.

What to do about it

Choose embedding models deliberately. They're not all equivalent.
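
Once content is a vector, similarity search reduces to comparing vectors, most often by cosine similarity. The sketch below uses toy 3-dimensional vectors and a brute-force scan; real embeddings come from an embedding model and have hundreds or thousands of dimensions. The labels and values are invented.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest(query_vec, index, k=1):
    """Brute-force top-k labels by cosine similarity over a {label: vector} dict."""
    ranked = sorted(
        index,
        key=lambda label: cosine_similarity(query_vec, index[label]),
        reverse=True,
    )
    return ranked[:k]


INDEX = {
    "refund policy": [0.9, 0.1, 0.0],
    "office hours": [0.0, 0.2, 0.9],
    "return shipping": [0.7, 0.6, 0.1],
}
matches = nearest([0.8, 0.3, 0.0], INDEX, k=2)
```

Two embedding models will place the same texts at different coordinates, so rankings from one model do not transfer to another.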

Workflow

Vector Database

Database optimized for high-dimensional similarity search.

"Search by meaning, not by keyword."

Real-world signal

Pinecone, Weaviate, pgvector.

Why it matters

Infrastructure for RAG and similarity systems.

What to do about it

Don't over-build. Often pgvector is enough.

Governance

AI Bill of Materials Auditability

Tracing AI components for risk and compliance.

"Provenance is the new audit trail."

Real-world signal

EU AI Act requirements.

Why it matters

Compliance and supply chain risk.

What to do about it

Maintain AI-BOM continuously. Audit periodically.

Governance

Data Lineage

Tracking where training and inference data came from.

"If you can't trace the data, you can't trust the model."

Real-world signal

Standard requirement in regulated industries.

Why it matters

Compliance, quality, debugging.

What to do about it

Engineer lineage tracking from day one.

Governance

Content Provenance (C2PA)

Cryptographic tracking of content origin.

"Watermarks for the synthetic media era."

Real-world signal

Coalition for Content Provenance and Authenticity.

Why it matters

Trust infrastructure for an AI-generated content world.

What to do about it

Adopt where applicable. Watch evolving standards.

Risk

Bias in AI Systems

Systematic skew in model behavior across groups.

"The model learned what the data taught it. The data taught it our history."

Real-world signal

Hiring tools, lending models, recommendation systems.

Why it matters

Legal, reputational, and effectiveness risk.

What to do about it

Bias audits. Diverse data. Continuous monitoring.

Risk

Fairness Metrics

Quantitative measures of model behavior across groups.

"Several mathematical definitions of fairness. They conflict."

Real-world signal

Demographic parity, equalized odds, calibration.

Why it matters

Trade-offs require explicit choices.

What to do about it

Pick fairness criteria deliberately. Document the choice.
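
Computing two of these criteria on the same predictions shows why the choice has to be explicit. The sketch below measures the demographic parity gap and the true-positive-rate gap (one component of equalized odds) on a small invented dataset; the helper names are hypothetical.

```python
def rate(values):
    return sum(values) / len(values) if values else 0.0


def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rate between groups."""
    rates = [
        rate([p for p, g in zip(preds, groups) if g == grp])
        for grp in set(groups)
    ]
    return max(rates) - min(rates)


def true_positive_rate_gap(preds, labels, groups):
    """Largest difference in true-positive rate between groups."""
    rates = [
        rate([p for p, y, g in zip(preds, labels, groups) if g == grp and y == 1])
        for grp in set(groups)
    ]
    return max(rates) - min(rates)


preds = [1, 1, 0, 1, 0, 0, 1, 0]
labels = [1, 0, 0, 1, 1, 0, 1, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
parity_gap = demographic_parity_gap(preds, groups)
tpr_gap = true_positive_rate_gap(preds, labels, groups)
```

The two gaps differ on the same data, so a model tuned to close one may leave the other open.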

Risk

Counterfactual Testing

Testing model behavior on hypothetical alternate inputs.

"What if the same applicant were male/female/named differently?"

Real-world signal

Standard practice for fairness auditing.

Why it matters

Bias detection.

What to do about it

Run counterfactuals on consequential models continuously.
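
A counterfactual test holds every field constant, swaps one protected attribute, and checks whether the prediction changes. A minimal sketch: `biased_model` is a deliberately flawed stand-in for a real model, and the applicant records are invented.

```python
def counterfactual_flips(model_fn, records, attribute, alternatives):
    """Return (record, swapped value) pairs where swapping `attribute` changes the prediction."""
    flagged = []
    for record in records:
        original = model_fn(record)
        for value in alternatives:
            if value == record[attribute]:
                continue
            variant = {**record, attribute: value}
            if model_fn(variant) != original:
                flagged.append((record, value))
    return flagged


def biased_model(applicant):
    # Hypothetical scoring rule with a deliberate dependence on `gender`.
    bonus = 1 if applicant["gender"] == "male" else 0
    return applicant["years_experience"] + bonus >= 5


applicants = [
    {"years_experience": 4, "gender": "male"},
    {"years_experience": 7, "gender": "female"},
]
flips = counterfactual_flips(biased_model, applicants, "gender", ["male", "female"])
```

Any flagged pair is a concrete case for review: same applicant, different attribute, different decision.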
