What You Can Learn

Primer isn’t a dashboard you glance at — it’s an intelligence layer that answers specific questions about how your organization uses AI coding tools. This guide explains what you can learn at each level, how to interpret the metrics, and what actions to take.

The Three Levels

Primer provides insights at three levels, each targeting a different audience:

| Level | Audience | Core Question |
| --- | --- | --- |
| Organization | VP Eng, Directors | Is our AI investment working? |
| Team | Team Leads, Managers | How do I help my team get better? |
| Individual | Engineers | How do I improve my own practice? |

Every metric in Primer exists to answer a question that couldn’t be answered before AI tools generated session data.


Organization Level

Adoption & ROI

The question: Are we getting value from our AI tool investment?

Primer computes:

  • Adoption rate — percentage of engineers who used an AI tool in the period. A common pattern: adoption starts at 30-40% and plateaus around 80-90%. If you’re stuck below 60%, look at the maturity page to find which teams lag.
  • Sessions per engineer per day — a proxy for integration depth. Early adopters average 1-2/day; power users hit 4-6. Below 1/day often means the tool isn’t integrated into daily workflow yet.
  • ROI ratio — estimated time saved per dollar spent (configurable via PRIMER_MINUTES_PER_SESSION). This isn’t precise, but directionally useful for justifying budget.
  • Cost per successful outcome — total spend divided by sessions that achieved their goal. This is the number to optimize, not raw cost.
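
As a rough sketch of how these four figures relate, the Python below computes them from a list of session records. The `Session` fields and the `org_metrics` function are hypothetical and do not reflect Primer's actual schema. It also assumes that `PRIMER_MINUTES_PER_SESSION` is the estimated minutes saved per session, and that sessions per engineer per day is averaged over active engineers rather than total headcount.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical record shape, for illustration only.
    engineer: str
    cost_usd: float
    succeeded: bool


def org_metrics(sessions, headcount, days, minutes_per_session=30.0):
    """Compute adoption and ROI figures for one reporting period."""
    active = {s.engineer for s in sessions}
    spend = sum(s.cost_usd for s in sessions)
    successes = sum(1 for s in sessions if s.succeeded)
    return {
        # Share of all engineers who ran at least one session.
        "adoption_rate": len(active) / headcount,
        # Assumption: averaged over active engineers only.
        "sessions_per_engineer_per_day":
            len(sessions) / (len(active) * days) if active else 0.0,
        # ROI expressed as estimated minutes saved per dollar spent.
        "minutes_saved_per_dollar":
            len(sessions) * minutes_per_session / spend if spend else None,
        # Spend divided by sessions that achieved their goal.
        "cost_per_success": spend / successes if successes else None,
    }
```

Note that cost per success rises when either spend grows or the success rate falls, which is why it is a better optimization target than raw cost.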

Track adoption rate over time, not as a snapshot. A team at 60% adoption trending upward is healthier than one at 85% and flat — the first is still learning, the second may have stalled.

Friction Analysis

The question: What systemic issues are making AI tools less effective across the org?

Primer classifies friction into types:

| Friction Type | What It Means | Typical Fix |
| --- | --- | --- |
| permission_denied | Tool tried to access something it couldn’t | Update .claude/settings.json allowed directories |
| context_limit | Session hit token limit | Break tasks into smaller sessions; use project-scoped instructions |
| timeout | Tool or command timed out | Investigate slow test suites or builds |
| edit_conflict | Tool’s edit was rejected or conflicted | Improve CLAUDE.md with project conventions |
| tool_error | An MCP tool or integration failed | Check tool configuration and availability |

The key insight: Primer doesn’t just count friction — it scores impact. A friction type that occurs often but doesn’t affect success rates is noise. A type that occurs less often but correlates with 40% lower success rates is worth fixing immediately.

Look at the Friction Impact chart on the dashboard. The x-axis is frequency, the y-axis is success rate penalty. Items in the upper-right quadrant are your highest-priority fixes.
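
The two chart axes can be approximated as follows. This is a sketch under assumptions, since Primer's actual scoring formula is not documented here: frequency is taken as the share of sessions that hit a friction type, and the success rate penalty as the success rate of sessions without that friction minus the rate of sessions with it. The `friction_impact` function and its input shape are hypothetical.

```python
from collections import defaultdict


def friction_impact(sessions):
    """sessions: list of (succeeded, friction_types) tuples.

    Returns {friction_type: (frequency, success_rate_penalty)}.
    """
    total = len(sessions)
    all_successes = sum(1 for ok, _ in sessions if ok)
    stats = defaultdict(lambda: [0, 0])  # type -> [session count, successes]
    for ok, types in sessions:
        for t in types:
            stats[t][0] += 1
            stats[t][1] += int(ok)

    result = {}
    for t, (n, wins) in stats.items():
        without = total - n
        if without == 0:
            continue  # no comparison group, so the penalty is undefined
        rate_with = wins / n
        rate_without = (all_successes - wins) / without
        result[t] = (n / total, rate_without - rate_with)
    return result
```

A type with high frequency and a penalty near zero is the "noise" case described above; a large penalty marks a fix worth prioritizing even at low frequency.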

Cost Intelligence

The question: Are we spending efficiently?

Beyond raw spend tracking, Primer provides:

  • Cache savings analysis — How much money prompt caching is saving and how much more it could save. If your org-wide cache hit rate is below 40%, engineers may not be using project context files effectively.
  • API vs. subscription modeling — For each engineer, Primer estimates whether they’d be cheaper on API billing, Pro ($20/mo), or Max ($100/mo). The aggregate recommendation shows potential monthly savings.
  • 30-day cost forecast — Linear regression on daily costs with confidence bands. Useful for budget planning and catching trends before they become problems.
  • Budget burn-rate alerts — Set monthly budgets per team. Primer calculates daily burn rate and projects whether you’ll exceed the budget, alerting before it happens.
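
The forecast and burn-rate ideas can be illustrated with a minimal sketch. It fits an ordinary least-squares line to daily costs and extrapolates it, then projects month-end spend from the average daily burn. Confidence bands are omitted, and the function names and inputs are illustrative assumptions rather than Primer's implementation.

```python
def fit_line(daily_costs):
    """Least-squares fit of cost = slope * day + intercept for days 0..n-1."""
    n = len(daily_costs)
    if n < 2:
        raise ValueError("need at least two days of data")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_costs) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_costs))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(daily_costs, horizon=30):
    """Extrapolate the fitted line over the next `horizon` days."""
    slope, intercept = fit_line(daily_costs)
    n = len(daily_costs)
    return [slope * (n + d) + intercept for d in range(horizon)]


def projected_month_spend(spent_so_far, days_elapsed, days_in_month):
    """Project month-end spend from the average daily burn rate so far."""
    return spent_so_far / days_elapsed * days_in_month


def will_exceed_budget(spent_so_far, days_elapsed, days_in_month, budget):
    return projected_month_spend(spent_so_far, days_elapsed, days_in_month) > budget
```

For example, a team that has spent $300 in the first 10 days of a 30-day month projects to $900, which would trigger an alert against an $800 budget.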

Team Level

Engineer Benchmarking

The question: How do my engineers compare, and who needs help?

The Engineers leaderboard ranks by multiple dimensions:

  • Success rate: percentage of sessions that achieve their goal. Team median is typically 55-70%. Engineers consistently below 50% likely need configuration help or training.
  • Cost efficiency: total cost divided by sessions. High cost with high success is fine. High cost with low success means the engineer is spending more but getting less.
  • Leverage score: a 0-100 composite measuring how sophisticated an engineer’s use of AI tools is (tool diversity, orchestration use, cache efficiency). Higher leverage correlates with higher success rates.
  • PR impact: number of PRs linked to AI sessions, and the merge rate of those PRs.

Don’t use these metrics punitively. An engineer with a low success rate but a high session count is trying hard and needs guidance, not criticism. An engineer with zero sessions needs a different conversation, one about why they haven’t adopted the tool at all.

Onboarding & Ramp-Up

The question: How quickly are new hires becoming effective with AI tools?

Primer’s cohort analysis groups engineers by their first session date and tracks how their metrics evolve week-over-week. You can see:

  • Time to team median: how many weeks until a new hire’s success rate matches the team median
  • Ramp-up patterns: whether new hires plateau early or continue improving
  • Configuration gaps: new hires often have default configs; Primer’s config optimization flags when someone’s setup differs significantly from that of high performers
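
As an illustration of the first measure, the sketch below finds the first week in which a new hire's weekly success rate reaches the team median. The function name and the list-of-weekly-rates input are assumptions made for this example. The team median is computed with the standard library's `statistics.median`.

```python
from statistics import median


def weeks_to_team_median(weekly_success_rates, team_rates):
    """Return the 1-based week in which the new hire first reaches the
    team median success rate, or None if they have not reached it yet."""
    target = median(team_rates)
    for week, rate in enumerate(weekly_success_rates, start=1):
        if rate >= target:
            return week
    return None
```

A result of `None` after several weeks is the early-plateau pattern worth investigating.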

Code Quality Impact

The question: Does AI-assisted code meet our quality bar?

Primer links sessions to GitHub PRs and compares:

| Metric | Claude-Assisted | Non-Claude | What It Means |
| --- | --- | --- | --- |
| Merge rate | Higher/lower? | Baseline | Are AI PRs being accepted? |
| Review comments | More/fewer? | Baseline | Do AI PRs need more review? |
| Time to merge | Faster/slower? | Baseline | Does AI speed up delivery? |
| Code volume | Larger/smaller? | Baseline | Are AI sessions more ambitious? |

These comparisons answer the skeptic’s question: “Is the AI actually helping or just creating more work for reviewers?”

Primer also tracks automated review findings from bots like BugBot (cursor[bot]). For each PR, it parses review comments to extract:

  • Severity breakdown — how many high, medium, and low severity issues were flagged
  • Fix rate — percentage of findings resolved before merge
  • Average findings per PR — trend over time to see if AI-assisted code is improving

This gives you a concrete quality signal beyond merge rates: are AI-assisted PRs generating fewer (or more) automated review issues than the baseline?
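
Once findings have been extracted, the three figures above reduce to simple aggregation. The sketch below assumes each finding is a dict with `severity` and `resolved` keys. That shape is hypothetical, and the sketch does not attempt to parse real bot comments, whose format is not described here.

```python
from collections import Counter


def review_summary(findings, pr_count):
    """Aggregate automated review findings across a set of PRs."""
    resolved = sum(1 for f in findings if f["resolved"])
    return {
        # Count of findings per severity label.
        "by_severity": dict(Counter(f["severity"] for f in findings)),
        # Share of findings resolved before merge.
        "fix_rate": resolved / len(findings) if findings else None,
        "avg_findings_per_pr": len(findings) / pr_count if pr_count else None,
    }
```

Running the same aggregation separately for AI-assisted and baseline PRs gives the comparison described above.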


Individual Level

Personal Trajectory

The question: Am I getting better at using AI tools?

Each engineer’s profile shows weekly sparklines for:

  • Success rate trend — are you succeeding more often over time?
  • Session volume — are you using the tool more or less?
  • Cost trend — is your cost per session stable, rising, or falling?
  • Tool diversity — are you expanding your tool usage or stuck on the basics?

The goal is upward trends in success rate and tool diversity, with stable or declining cost per session.

Friction Breakdown

The question: What specific problems keep disrupting my sessions?

Your profile shows your top friction types ranked by frequency and impact. Common patterns:

  • Heavy on context_limit? You’re probably trying to do too much in single sessions. Break work into focused tasks.
  • Lots of permission_denied? Your Claude settings need updating — you’re hitting guardrails that slow you down.
  • Frequent edit_conflict? Your project’s CLAUDE.md may need better conventions about code style and patterns.
  • tool_error spikes? An MCP server or integration is unreliable — check your tooling setup.

AI-Generated Insights

The question: What’s the big picture of my usage patterns?

Primer uses Claude to generate narrative reports about your patterns:

  • At a Glance — What’s working, what’s hindering, quick wins
  • Impressive Things You Did — Highlights from your sessions
  • Where Things Go Wrong — Friction pattern analysis
  • New Usage Patterns — Emerging behaviors worth building on
  • On the Horizon — Suggestions for what to try next

These narratives synthesize hundreds of data points into actionable stories.

MCP Sidecar

The question: How am I doing right now?

Engineers can query their own data mid-session via the MCP sidecar in Claude Code:

  • “What’s my success rate this week?”
  • “What friction am I hitting most?”
  • “How do I compare to my team?”
  • “What recommendations do you have for me?”

This closes the loop: instead of checking a dashboard after the fact, engineers get feedback at the moment they can act on it.


How to Think About These Metrics

Leverage vs. Effectiveness: The Two Dimensions of AI Maturity

Primer measures AI tool usage along two independent dimensions:

Leverage (0-100) measures how sophisticated an engineer’s use of AI tools is. It has three sub-scores:

| Sub-Score | What It Measures | Factors |
| --- | --- | --- |
| Tool Mastery (33%) | Breadth of tool usage | Tool diversity (Shannon entropy) + category spread (5 categories) |
| Orchestration Depth (33%) | Sophistication of delegation | Orchestration+skill ratio + agent team detection (bonus) |
| Efficiency (33%) | Resource intelligence | Cache hit rate + model diversity (cost tier coverage) |

The model diversity factor rewards engineers who choose cost-appropriate models — using Haiku for quick lookups, Sonnet for standard coding, and Opus for complex architecture rather than defaulting to the most powerful model for everything.

The agent team detection factor rewards multi-agent orchestration (TeamCreate, SendMessage, Agent coordination). This is a bonus: engineers who don’t use teams aren’t penalized; those who do get an uplift.
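
To make the tool diversity factor concrete, here is a minimal sketch of Shannon entropy over a tool-usage distribution, normalized to 0-1 by the entropy of uniform usage across all available tools. How Primer normalizes entropy and combines factors within each sub-score is not specified here. The `available_tools` parameter and the equal-weight `leverage_score` combination are assumptions based only on the 33% weights in the table.

```python
import math
from collections import Counter


def normalized_entropy(tool_calls, available_tools):
    """Shannon entropy of tool usage, scaled so that uniform use of
    every available tool scores 1.0 and use of a single tool scores 0.0."""
    counts = Counter(tool_calls)
    total = sum(counts.values())
    if total == 0 or available_tools < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return entropy / math.log2(available_tools)


def leverage_score(tool_mastery, orchestration_depth, efficiency):
    """Combine three sub-scores (each 0..1) with equal weights into 0..100."""
    return 100 * (tool_mastery + orchestration_depth + efficiency) / 3
```

Entropy rewards balanced usage as well as breadth: an engineer who calls ten tools but uses one of them 95% of the time scores lower than one who spreads usage evenly across five.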

Effectiveness (0-100) measures how well that usage is working, that is, the outcomes it produces:

| Factor | Weight | What It Measures |
| --- | --- | --- |
| Success rate | 40% | Percentage of sessions that achieve their goal |
| Cost efficiency | 30% | Cost per success vs. team median (lower = better) |
| Session health | 30% | Composite health score |
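
The weighting can be sketched as a simple weighted sum. The 40/30/30 weights come from the table above; everything else is an assumption for illustration. In particular, the cost efficiency term is modeled here as the ratio of the team median to the engineer's cost per success, capped at 1.0, and session health is taken as an already-computed value between 0 and 1.

```python
def effectiveness_score(success_rate, cost_per_success, team_median_cost, session_health):
    """Weighted 0..100 score from three factors, each normalized to 0..1."""
    if cost_per_success and cost_per_success > 0:
        # Assumed mapping: at or below the team median earns full credit.
        cost_score = min(1.0, team_median_cost / cost_per_success)
    else:
        cost_score = 0.0  # no successes, so cost per success is undefined
    return 100 * (0.40 * success_rate + 0.30 * cost_score + 0.30 * session_health)
```

Under this mapping, an engineer whose cost per success is twice the team median earns half of the cost efficiency credit.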

Together, Leverage and Effectiveness create four diagnostic quadrants:

|  | High Effectiveness | Low Effectiveness |
| --- | --- | --- |
| High Leverage | Mastery: advanced tooling, strong outcomes | Experimenting: sophisticated but outcomes need work |
| Low Leverage | Efficient basics: ships results, could level up | Needs support: both tool usage and outcomes lag |

The AI Maturity page shows both scores per engineer with a quadrant view. Use this to target coaching: high-leverage/low-effectiveness engineers need better problem framing, not more tool training. Low-leverage/high-effectiveness engineers are ready to learn advanced patterns.
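
The quadrant assignment amounts to two threshold checks. The cut-off between "high" and "low" is not stated in this guide, so the sketch below takes it as a parameter with an assumed default of 50. The function name is hypothetical.

```python
def maturity_quadrant(leverage, effectiveness, threshold=50):
    """Map a (leverage, effectiveness) pair of 0..100 scores to a quadrant."""
    high_leverage = leverage >= threshold
    high_effectiveness = effectiveness >= threshold
    if high_leverage and high_effectiveness:
        return "Mastery"
    if high_leverage:
        return "Experimenting"
    if high_effectiveness:
        return "Efficient basics"
    return "Needs support"
```

Grouping a team's engineers by this label gives a quick list of who would benefit from problem-framing coaching ("Experimenting") versus advanced tool training ("Efficient basics").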

Leading vs. Lagging Indicators

| Type | Metrics | Why |
| --- | --- | --- |
| Leading | Adoption rate, leverage score, model diversity, tool diversity | Predict future outcomes |
| Lagging | Effectiveness score, success rate, cost per outcome, PR merge rate | Confirm past outcomes |

Focus on leading indicators for planning and lagging indicators for validation.

Healthy Ranges

These are rough benchmarks from typical engineering organizations:

| Metric | Concerning | Healthy | Excellent |
| --- | --- | --- | --- |
| Adoption rate | < 40% | 60-85% | > 85% |
| Success rate | < 45% | 55-70% | > 75% |
| Sessions/engineer/day | < 0.5 | 1-3 | 3-6 |
| Leverage score | < 25 | 35-55 | > 60 |
| Effectiveness score | < 40 | 50-70 | > 75 |
| Cache hit rate | < 20% | 30-50% | > 50% |
| Model diversity | 1 model | 2-3 models | 3+ across tiers |
| Cost per success | Trending up | Stable | Trending down |

These benchmarks will vary by organization, team size, and project complexity. Use them as starting points, then establish your own baselines.

Actions by Signal

| You See | What It Means | What to Do |
| --- | --- | --- |
| High adoption, low success | Engineers are using the tool but struggling | Invest in CLAUDE.md, shared prompts, and training |
| Low adoption, high success | A few people love it, most haven’t tried | Run pair programming sessions and demos, and reduce setup friction |
| Rising costs, stable success | Usage is growing but efficiency isn’t | Review cache optimization and session scope |
| Friction spike on one team | Something changed in their environment | Check recent infra changes, permission updates, or tooling |
| Leverage score plateau | Engineers stopped exploring new capabilities | Share patterns from high-leverage users, run workshops |
| High leverage, low effectiveness | Engineers know the tools but outcomes lag | Focus on problem framing, CLAUDE.md quality, and session scope |
| Low model diversity | Engineers defaulting to one model for everything | Train on model selection: Haiku for lookups, Sonnet for coding, Opus for architecture |
| New hire ramp-up > 4 weeks | Onboarding for AI tools needs work | Create team-specific getting-started guides |

Where to Find These Insights

Each insight area has a dedicated page in the Primer dashboard:

| Page | What It Shows | Key Questions Answered |
| --- | --- | --- |
| Organization (/dashboard) | KPIs, activity trends, outcomes, alerts, deep-dive navigation | How is our AI investment performing overall? |
| Friction (/friction) | Friction trends, impact scoring, project-level breakdown | What systemic issues hurt AI effectiveness most? |
| Code Quality (/quality) | PR metrics, Claude-vs-non-Claude comparison, automated review findings, engineer rankings | Are AI-assisted PRs better? |
| Sessions (/sessions) | Browse tab + Insights tab with health, satisfaction, goals, cache | How are sessions performing in aggregate? |
| AI Maturity (/maturity) | Leverage + Effectiveness scores, tool categories, Tools tab, model diversity, agent teams | How mature is our AI usage? How effective are engineers? |
| Growth (/growth) | Onboarding cohorts, shared patterns, team skill gaps | How fast are new hires ramping up? What patterns should we share? |
| FinOps (/finops) | Cost tracking, cache analytics, subscription modeling, budgets | Are we spending efficiently? |
| Synthesis (/synthesis) | AI-generated narrative reports at org, team, and engineer scope | What’s the big picture story? |
| Engineers (/engineers) | Leaderboard with success rates, costs, leverage scores | Who needs help? Who’s excelling? |
| Engineer Profile (/engineers/:id) | Weekly trajectory, friction breakdown, strengths, AI-generated insights | How is this individual doing? |

Next Steps