

The Downtime Project

Tom and Jamie talk through the postmortems of outages that have affected high-profile sites.


Popular episodes



7 Lessons From 10 Outages

After 10 post-mortems in their first season, Tom and Jamie reflect on the common issues they’ve seen. Click through for details! Summing Up Downtime: We’re just about through our inaugural season of The Downtime Project podcast, and to celebrate, we’re reflecting back on recurring themes we’ve noticed in many of the ten outages we’ve poured […]

46mins

22 Jun 2021

Rank #1


Salesforce Publishes a Controversial Postmortem (and breaks their DNS)

On May 11, 2021, Salesforce had a multi-hour outage that affected numerous services. Their public writeup was somewhat controversial — it’s the first one we’ve done on this show that called out the actions of a single individual in a negative light. The latest SRE Weekly has a good list of some different articles […]

40mins

31 May 2021

Rank #2


Kinesis Hits the Thread Limit

During a routine addition of some servers to the Kinesis front-end cluster in US-East-1 in November 2020, AWS ran into an OS limit on the max number of threads. That resulted in a multi-hour outage that affected a number of other AWS services, including ECS, EKS, Cognito, and CloudWatch (the kind of OS limit involved is sketched after this entry). We probably won’t do […]

44mins

25 May 2021

Rank #3
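The episode turns on an OS-level ceiling on thread creation. As a rough illustration only (not taken from the AWS writeup), here is a minimal Python sketch of the standard Linux knobs that cap how many threads a host or process can create:

```python
# Sketch: inspect the Linux limits that cap thread creation.
# Assumes a Linux host; these are standard procfs/rlimit values, nothing AWS-specific.
import resource

def thread_limits():
    with open("/proc/sys/kernel/threads-max") as f:
        threads_max = int(f.read())  # system-wide maximum number of threads
    with open("/proc/sys/kernel/pid_max") as f:
        pid_max = int(f.read())      # maximum PIDs; each thread consumes one
    # Per-user cap on processes/threads (ulimit -u).
    nproc_soft, nproc_hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return threads_max, pid_max, nproc_soft, nproc_hard

if __name__ == "__main__":
    tm, pm, soft, hard = thread_limits()
    print(f"threads-max={tm} pid_max={pm} RLIMIT_NPROC soft={soft} hard={hard}")
```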


How Coinbase Unleashed a Thundering Herd

In November 2020, Coinbase had a problem while rotating their internal TLS certificates and accidentally unleashed a huge amount of traffic on some internal services (a common mitigation for this pattern is sketched after this entry). This was a refreshingly non-database-related incident that led to an interesting discussion about the future of infrastructure as code, the limits of human code review, and how many load […]

38mins

17 May 2021

Rank #4
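A thundering herd typically means many clients retrying or reconnecting in lockstep and overwhelming whatever they all depend on. A common general-purpose mitigation is exponential backoff with full jitter; the Python sketch below illustrates the pattern and is not a claim about what Coinbase actually changed:

```python
# Sketch: retry with exponential backoff and full jitter so that many clients
# spread their retries out instead of arriving at the same instant.
# `call` is a hypothetical stand-in for any request to a flaky downstream service.
import random
import time

def call_with_backoff(call, max_attempts=6, base=0.1, cap=10.0):
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # Full jitter: sleep a random amount up to the exponential ceiling.
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```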



Auth0’s Seriously Congested Database

Just one day after we released Episode 5 about Auth0’s 2018 outage, Auth0 suffered a 4-hour, 20-minute outage that was caused by a combination of several large queries and a series of database cache misses. This was a very serious outage, as many users were unable to log in to sites across the […]

1min

10 May 2021

Rank #5


Talkin’ Testing with Sujay Jayakar

Tom was feeling under the weather after joining Team Pfizer last week, so today we have a special guest episode with Sujay Jayakar, Jamie’s co-founder and engineer extraordinaire. While it’s great to respond well to an outage, it’s even better to design and test systems in such a way that outages don’t happen. As we […]

29mins

3 May 2021

Rank #6


GitHub’s 43 Second Network Partition

In 2018, after 43 seconds of connectivity issues between their East and West coast datacenters and a rapid promotion of a new primary, GitHub ended up with unique data written to two different databases. As detailed in the postmortem, this resulted in 24 hours of degraded service. This episode spends a lot of time on […]

53mins

26 Apr 2021

Rank #7


Auth0 Silently Loses Some Indexes

Auth0 experienced multiple hours of degraded performance and increased error rates in November of 2018 after several unexpected events, including a migration that dropped some indexes from their database. The published post-mortem has a full timeline and a great list of action items, though it is curiously missing a few details, like exactly what database […]

47mins

19 Apr 2021

Rank #8


One Subtle Regex Takes Down Cloudflare

On July 2, 2019, a subtle issue in a regular expression took down Cloudflare (and with it, a large portion of the internet) for 30 minutes. A toy example of the failure class is sketched after this entry.

54mins

12 Apr 2021

Rank #9
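The failure class here is catastrophic regex backtracking: a pattern with nested quantifiers can take exponential time on inputs that almost match, pinning the CPU. The Python sketch below uses a classic textbook pattern to show the effect; it is not Cloudflare's actual WAF rule:

```python
# Sketch: catastrophic backtracking in a backtracking regex engine.
# The nested quantifier in ^(a+)+$ forces the engine to try exponentially many
# ways to split the input before concluding there is no match.
import re
import time

pattern = re.compile(r"^(a+)+$")
payload = "a" * 26 + "!"   # almost matches, so the engine backtracks heavily

start = time.monotonic()
pattern.match(payload)      # returns None, but only after ~2**26 attempts
print(f"{len(payload)}-char input took {time.monotonic() - start:.1f}s; "
      "each extra 'a' roughly doubles it")
```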


Monzo’s 2019 Cassandra Outage

Monzo experienced some issues while adding servers to their Cassandra cluster on July 29th, 2019. Thanks to some good practices, the team recovered quickly and no data was permanently lost.

43mins

5 Apr 2021

Rank #10