From Log Chaos to Clarity: Debugging at Scale

When users report issues, debugging production systems means sifting through thousands of log lines. We built a slash command that turns log chaos into structured insights in 30 seconds. Here's how.

8 min read. Updated April 15, 2026
Written and edited by Akshay Kumar, Engineering @ ngram.com

When something goes wrong in production, developers used to stare at thousands of log lines trying to piece together what happened. Now it takes 30 seconds.

The Problem: Logs Are a Haystack

Modern systems are complex. A single user request might touch multiple services, invoke various background jobs, and generate logs across different systems. When a user reports "something's not working," the debugging experience typically looks like this:

  1. Find the request ID or user identifier
  2. Open your logging dashboard
  3. Search and filter through entries
  4. Scroll through hundreds of log lines
  5. Manually piece together what happened, in what order
  6. Try to spot the error buried somewhere in the middle

This takes 15-30 minutes per issue. For a team shipping fast, that's unacceptable.

The Solution: A Slash Command That Does the Work

We built a debugging skill for our AI coding assistants (Claude Code and Cursor). One command, one identifier, instant clarity.

/debug-conversation 8b7d7d8d-81f0-43b2-8d19-358c298cbca4

The output isn't raw logs. It's a structured narrative of what happened.

What you see at a glance:

  • All request parameters and configuration used
  • Execution timeline with durations and success/failure status
  • Step-by-step breakdown of what each component did
  • What worked vs. what failed

The difference is night and day. Instead of hunting through logs, you immediately understand the request's story.

How It Works

The skill is defined in a single markdown file that instructs the AI assistant on what to do. Here's the workflow:

1. Fetch Logs from Your Log Provider

We use Vercel's log API to pull all entries matching the request identifier. The time range is configurable (--since 1h, --since 7d, etc.). The same approach works with any log provider that has an API: Datadog, CloudWatch, Papertrail, etc.
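As a provider-agnostic sketch of this step, the snippet below converts a --since value into a cutoff and keeps only the entries for one identifier. It assumes the entries have already been fetched from the provider. The field names `timestamp` and `conversationId` and both helper functions are assumptions made for illustration; they are not Vercel's API or the contents of our skill file.

```python
import re
from datetime import datetime, timedelta, timezone

# Supported suffixes for a window such as "30m", "1h", or "7d".
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_since(value: str) -> timedelta:
    """Convert a window such as '1h' or '7d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", value.strip())
    if match is None:
        raise ValueError(f"unsupported --since value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def filter_entries(entries, identifier, since, now=None):
    """Keep entries for one identifier inside the window, oldest first.

    Timestamps are assumed to be ISO 8601 strings with an explicit
    UTC offset, for example '2026-04-15T11:30:00+00:00'.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - parse_since(since)
    kept = [
        entry for entry in entries
        if entry.get("conversationId") == identifier
        and datetime.fromisoformat(entry["timestamp"]) >= cutoff
    ]
    return sorted(kept, key=lambda e: datetime.fromisoformat(e["timestamp"]))
```

Passing `now` explicitly keeps the function deterministic, which makes it easy to check against a saved log dump.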

2. Parse and Extract Structure

The raw logs contain JSON messages with timestamps, operation names, parameters, and results. The skill instructs the AI to:

  • Extract request metadata and configuration
  • Build a chronological execution timeline
  • Calculate durations between start/end events
  • Identify errors and warnings
  • Parse structured output data if present
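In the skill, the assistant does this parsing itself, guided by the instructions. To make the steps concrete, here is a minimal sketch of the same logic as plain code. It assumes each log line is a JSON object with hypothetical fields `timestamp`, `operation`, `event` (one of "start", "end", "error"), and an optional `error` message; your log schema will differ.

```python
import json
from datetime import datetime


def build_timeline(raw_lines):
    """Pair start/end events per operation and report durations and errors."""
    events = sorted(
        (json.loads(line) for line in raw_lines),
        key=lambda e: datetime.fromisoformat(e["timestamp"]),
    )
    open_ops = {}  # operation name -> time its start event was logged
    timeline = []
    for event in events:
        ts = datetime.fromisoformat(event["timestamp"])
        op = event["operation"]
        if event["event"] == "start":
            open_ops[op] = ts
        elif event["event"] in ("end", "error") and op in open_ops:
            started = open_ops.pop(op)
            timeline.append({
                "operation": op,
                "duration_s": (ts - started).total_seconds(),
                "ok": event["event"] == "end",
                "error": event.get("error"),
            })
    # Operations that started but never finished are reported as incomplete.
    for op in open_ops:
        timeline.append({
            "operation": op,
            "duration_s": None,
            "ok": False,
            "error": "no end event found",
        })
    return timeline
```

The sketch assumes an operation name is not running twice at the same time; concurrent runs of the same operation would need an additional span identifier.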

3. Generate a Dynamic Summary

The output format adapts to what's in the logs. Different request types produce different summaries based on what's relevant.

The skill file tells the assistant to be descriptive and contextual: don't just show raw data, explain what happened.

So instead of "processData: 75000ms", you get: "Data processing took 75 seconds and handled 3 input sources."
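The assistant writes these sentences on its own, so no code is involved in our skill. Purely to illustrate the transformation, a hand-written equivalent might look like the sketch below; the function name, the label, and the detail string are all hypothetical.

```python
def describe(label: str, duration_ms: int, detail: str = "") -> str:
    """Render one raw timing entry as a readable sentence fragment."""
    seconds = duration_ms / 1000
    if seconds >= 1:
        unit = "second" if seconds == 1 else "seconds"
        took = f"{seconds:g} {unit}"
    else:
        # Sub-second durations read better in milliseconds.
        took = f"{duration_ms} ms"
    sentence = f"{label} took {took}"
    return f"{sentence} and {detail}" if detail else sentence


print(describe("Data processing", 75000, "handled 3 input sources"))
```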

The Skill File

The entire debugging capability lives in one markdown file: .claude/commands/debug-conversation.md

It defines:

  • Input arguments: request ID, time range, optional file path
  • Setup instructions: one-time API token configuration
  • Step-by-step workflow: fetch → parse → analyze → summarize
  • Output format: tables for parameters, timeline for operations, narrative for context
  • Error handling: what to show when logs are missing or empty

Why This Pattern Matters

This isn't just about debugging. It's about building skills for repetitive developer tasks.

Every engineering team has workflows that look like:

  1. Get some context (logs, metrics, code)
  2. Apply domain knowledge to interpret it
  3. Produce structured output

These workflows live in people's heads. When someone leaves, the knowledge goes with them. When a new person joins, they learn by watching.

By encoding workflows as AI skills, you get:

  • Consistency: Every debug session follows the same thorough process
  • Speed: about 30 seconds instead of 15-30 minutes
  • Onboarding: New engineers can debug from day one
  • Evolution: Update the skill file, everyone gets the improvement

Building Your Own Debugging Skills

If you're running complex backend systems, consider building similar skills for your team.

Key Ingredients

  1. Structured logging: Your system needs to log operations, parameters, and results in a parseable format. We log JSON with consistent fields.
  2. Queryable log storage: You need API access to filter logs by identifiers. Vercel, Datadog, CloudWatch, or whatever you use.
  3. A skill file: A markdown document that tells the AI assistant exactly how to fetch, parse, and present the data.
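For the first ingredient, here is a minimal sketch of structured logging using Python's standard `logging` module. The field names (`conversationId`, `operation`, `event`) and the `make_logger` helper are assumptions for illustration, not our production schema; what matters is that every line is one JSON object with the same keys.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Emit each record as one JSON object with a fixed set of fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # The fields below arrive through `extra=`; names are illustrative.
            "conversationId": getattr(record, "conversationId", None),
            "operation": getattr(record, "operation", None),
            "event": getattr(record, "event", None),
        }
        return json.dumps(payload)


def make_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines from the root logger
    return logger
```

A call such as `log.info("image generation started", extra={"conversationId": cid, "operation": "imageGen", "event": "start"})` then produces a line that can be filtered by identifier later.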

The Broader Vision: Developer Tooling as Skills

At ngram, we're building a library of these skills:

  • Debug user sessions: The one described here
  • Trace request flow: Follow a request from start to finish
  • Analyze resource usage: Understand cost and performance breakdown

Each skill encodes expertise that would otherwise require senior engineers to investigate.

The pattern extends beyond debugging. Any repetitive task that requires fetching data and applying judgment can become a skill:

  • Code review checklists that actually check the code
  • Migration validators that verify data integrity
  • Deployment analyzers that explain what changed

Try It Yourself

If you're using Claude Code or Cursor, you can create skills in your own repositories.

  1. Create a .claude/commands/ directory
  2. Add a markdown file describing the workflow
  3. Use /your-skill-name to invoke it

The AI assistant reads your instructions and executes them with the tools available (shell commands, file reading, API calls).
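The three steps above can be sketched as a small scaffolding script. The template text is a hypothetical outline based on the sections described earlier, not our actual skill file. `$ARGUMENTS` is the placeholder Claude Code replaces with whatever follows the command name; check your assistant's documentation for its equivalent.

```python
from pathlib import Path

# Hypothetical starting point for a skill file; adapt every section.
SKILL_TEMPLATE = """\
# Debug conversation

Arguments: $ARGUMENTS (request ID, optional --since window, optional --file path)

## Setup
Confirm the log provider API token is configured (one-time step).

## Workflow
1. Fetch all log entries matching the request ID.
2. Parse the JSON messages and build a chronological timeline.
3. Calculate durations and flag errors and warnings.
4. Summarize what happened in plain language.

## Output format
- Table of request parameters
- Timeline of operations with duration and status
- Short narrative explaining the result

## Error handling
If no logs are found, say so and suggest a wider --since window.
"""


def scaffold_skill(repo_root: Path, name: str = "debug-conversation") -> Path:
    """Create .claude/commands/<name>.md unless it already exists."""
    commands_dir = repo_root / ".claude" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    skill_path = commands_dir / f"{name}.md"
    if not skill_path.exists():  # never overwrite an existing skill
        skill_path.write_text(SKILL_TEMPLATE, encoding="utf-8")
    return skill_path


if __name__ == "__main__":
    print(f"Skill file ready at {scaffold_skill(Path.cwd())}")
```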

For debugging specifically, the investment is small: a few hours to write the skill, a one-time token setup. The return is every future debug session taking seconds instead of minutes.

Real Example: Debugging a Broken Chat Session

Here's what this looks like in practice.

Our product has an agentic chat interface where users interact with an AI assistant that orchestrates multiple services: web research, storyboard generation, image creation, and voice synthesis. When a user reports that their session is stuck or produced unexpected results, we need to figure out which part of the pipeline broke.

Before this skill, that meant manually searching logs by conversation ID, scrolling through hundreds of entries, and mentally reconstructing the execution flow. Now:

/debug-conversation 8b7d7d8d-81f0-43b2-8d19-358c298cbca4

The skill takes the conversation ID, fetches all matching logs from our logging service, automatically detects the time range from the first and last entries, and parses every tool invocation along the way. Thirty seconds later, you see something like:

  • Researcher tool - scraped 3 URLs, extracted 12 key points (4.2s) ✓
  • Storyboard generator - created 6 scenes with transitions (8.1s) ✓
  • Image generation - timed out on scene 3 after 45s ✗
  • Voice synthesis - never started (blocked by upstream failure) ✗

Immediately clear: the image generation service timed out, which cascaded and blocked everything downstream. No scrolling, no guesswork.
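The reasoning behind that conclusion is simple enough to sketch. Assuming a strictly linear pipeline and a hypothetical list of (step, status) pairs, the first failed step is the root cause and every step that never started is collateral:

```python
def explain_failure(steps):
    """Summarize a linear pipeline run.

    `steps` is an ordered list of (name, status) pairs, where status is
    one of 'ok', 'failed', or 'not_started' (labels are illustrative).
    """
    root = next((name for name, status in steps if status == "failed"), None)
    if root is None:
        return "All steps completed."
    blocked = [name for name, status in steps if status == "not_started"]
    message = f"Root cause: {root} failed."
    if blocked:
        message += f" Blocked downstream: {', '.join(blocked)}."
    return message


print(explain_failure([
    ("Researcher tool", "ok"),
    ("Storyboard generator", "ok"),
    ("Image generation", "failed"),
    ("Voice synthesis", "not_started"),
]))
```

Pipelines with parallel branches would need a dependency graph rather than a flat list.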

The skill also supports different input modes depending on where your logs live:

  • /debug-conversation <id> - fetches from the remote logging service
  • /debug-conversation <id> --since 7d - searches a wider time window
  • /debug-conversation --file logs/session-dump.json - analyzes a local log file instead
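In the skill itself, the assistant interprets these arguments from the instructions in the markdown file. If you wanted a helper script to accept the same three forms, a sketch with `argparse` could look like this. The function names and the returned dictionary shape are assumptions, and no default time window is set because the skill can detect it from the log entries.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="debug-conversation")
    parser.add_argument("identifier", nargs="?",
                        help="conversation or request ID")
    parser.add_argument("--since", default=None,
                        help="time window to search, e.g. 1h or 7d")
    parser.add_argument("--file", default=None,
                        help="analyze a local log file instead of fetching")
    return parser


def resolve_mode(argv):
    """Decide whether to read a local file or query the remote log service."""
    args = build_parser().parse_args(argv)
    if args.file:
        return {"mode": "file", "path": args.file}
    if args.identifier:
        return {"mode": "remote", "identifier": args.identifier,
                "since": args.since}
    raise SystemExit("provide an identifier or --file")
```

For example, `resolve_mode(["--file", "logs/session-dump.json"])` selects the local-file mode without needing an identifier.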

This is just one skill. The same approach works for any repetitive investigation your team does. The point isn't the specific command; it's that encoding the workflow in a file means every engineer gets the same thorough analysis, every time.

What's Next

We're continuing to build skills that make our engineering team faster. The goal is simple: encode expertise, share it instantly, improve it continuously.

If your team is running complex production systems, consider this approach. The tooling that helps you debug today becomes the documentation that onboards tomorrow's engineers.

At ngram, we build tools that help teams move faster. If you're interested, check out ngram.com.
