Thinking Like a Detective: Solving Cloud Infrastructure Mysteries

code red

When requests mysteriously fail in cloud environments, even experienced engineers can struggle to identify the culprit. With components spread across multiple layers and services, tracking down these elusive bugs requires both methodical investigation and specialized knowledge.

Modern cloud architectures involve numerous components: CDNs, WAFs, load balancers, API gateways, and more, each capable of silently causing failures. Engineers often focus exclusively on their application code, overlooking the complex request path where many issues actually occur. Effective troubleshooting requires understanding the full journey of a request and knowing how to interpret the evidence left behind.

In this presentation, you'll learn to solve cloud infrastructure mysteries by thinking like a detective, with particular focus on:

  • Systematically investigating cloud components as software written by other developers
  • Creating comprehensive documentation of your entire request path, including often-overlooked upstream components
  • Organizing runbooks by user-reported symptoms rather than root causes for faster problem resolution
  • Interpreting HTTP status codes, response headers, and error pages as evidence to identify the true culprit
  • Building an investigation toolkit that enables proactive monitoring and prevents recurring issues
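To make the "evidence" idea concrete, here is a minimal sketch of reading a response's status code and headers for clues about which layer produced it. The function name `collect_evidence` and the `LAYER_HINTS` table are hypothetical and written for illustration only; the headers listed are commonly set by CDNs and proxies, but their presence depends entirely on your own stack, so treat the output as hints rather than conclusions.

```python
from typing import List, Mapping

# Hypothetical lookup table: response headers that intermediaries commonly add.
# These are typical defaults, not guarantees; adjust for your own request path.
LAYER_HINTS = {
    "cf-ray": "a CDN (Cloudflare)",
    "x-amz-cf-id": "a CDN (Amazon CloudFront)",
    "x-cache": "a CDN or caching proxy",
    "via": "an intermediate proxy",
}


def collect_evidence(status: int, headers: Mapping[str, str]) -> List[str]:
    """Return human-readable clues about which layer may have produced a response."""
    # Header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in headers.items()}
    clues = []
    for name, layer in LAYER_HINTS.items():
        if name in lowered:
            clues.append(f"{name} present: the response passed through {layer}")
    if status in (502, 503, 504):
        # Gateway-class errors frequently originate upstream of the application.
        clues.append(
            f"status {status}: often generated by a gateway or load balancer "
            "rather than by application code"
        )
    if "server" in lowered:
        clues.append(f"server header reports: {lowered['server']}")
    return clues


if __name__ == "__main__":
    for clue in collect_evidence(504, {"Via": "1.1 example-proxy", "Server": "nginx"}):
        print(clue)
```

A helper like this could be a starting point for a symptom-oriented runbook entry: given a failing response, list the clues first, then decide which component to investigate.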

Speaker

Brendan McLoughlin

Frontend Architect @CarGurus | Former Ember Data Maintainer | Previously Open Web Consultant @Bocoup

Brendan McLoughlin is a Senior Principal Engineer at CarGurus, where he reduces friction between technical capabilities and customer needs. With over a decade of experience in web technologies, he combines deep systems thinking with hands-on problem solving to help teams navigate and debug distributed systems at scale. As part of CarGurus' Technical Leadership Team, he helps define engineering standards across a 350+ engineer organization while staying close to the code. Previously at Bocoup, he contributed to major open source projects including Ember Data. His expertise spans cloud infrastructure, web security, and complex system design. Brendan is passionate about knowledge sharing and regularly authors technical content to help engineers understand complex systems.
