How Netflix Built Self-Healing System to Survive Concurrency Bug

#602 – November 24, 2024

CPUs were dying, the bug was temporarily un-fixable, and they had no viable path forward

How Netflix Built Self-Healing System to Survive Concurrency Bug
8 minutes by Matthew Hawthorne

This article describes a production incident at Netflix in which a concurrency bug in an internal library gradually consumed CPU across a cluster, steadily reducing server capacity. Rather than rebooting servers by hand or working through the weekend, the team automated a stopgap: they pinned the cluster size to maximum capacity and randomly terminated and replaced instances every 15 minutes.
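As a rough illustration of the stopgap described above, here is a minimal Python sketch of one replacement cycle. Everything in it is an assumption made for illustration: the function names, the `terminate` callback, and the choice of one instance per cycle are hypothetical, and this is not Netflix's actual tooling. The sketch relies on the cluster size being pinned, so that the platform is assumed to launch a fresh instance whenever one is terminated.

```python
import random

# Interval taken from the summary above; a scheduler would call run_cycle this often.
CYCLE_SECONDS = 15 * 60


def pick_instance_to_replace(instance_ids, rng=random):
    """Choose one instance uniformly at random, or None if the cluster is empty."""
    if not instance_ids:
        return None
    return rng.choice(list(instance_ids))


def run_cycle(instance_ids, terminate, rng=random):
    """Run one cycle: pick a random instance and hand it to `terminate`.

    `terminate` is a hypothetical callback standing in for whatever API the
    platform exposes. Replacement is not done here; it is assumed to happen
    automatically because the cluster size is pinned.
    """
    victim = pick_instance_to_replace(instance_ids, rng)
    if victim is not None:
        terminate(victim)
    return victim
```

Random selection needs no knowledge of which instances are unhealthy; over enough cycles every instance gets recycled, which is the property the stopgap depends on.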

Universal-2 Speech-to-Text: Tackle Complex Conversations
sponsored by AssemblyAI

Universal-2 tackles complex challenges in conversational data: automatically identifying speakers, extracting key moments, and transcribing technical phrases like “Q4 revenue target $3M.” Easily integrate it into your apps with our simple API for meeting summaries, CRM updates, sales tools, and more—delivering accurate, scalable insights from millions of minutes of audio data.

Binary vector embeddings are so cool
5 minutes by Evan Schwartz

Vector embeddings by themselves are pretty neat. Binary quantized vector embeddings are extra impressive. In short, they can retain 95+% retrieval accuracy with 32x compression 🤯.
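To make the 32x figure concrete: if each dimension of an embedding is stored as a 32-bit float and binary quantization keeps a single bit per dimension, storage shrinks by a factor of 32. The Python sketch below shows one common way to do this, keeping only the sign of each dimension and comparing vectors by Hamming distance. Thresholding at zero and the function names are assumptions for illustration and may not match the article's exact method.

```python
FLOAT32_BITS = 32   # bits per dimension before quantization
BINARY_BITS = 1     # bits per dimension after quantization
COMPRESSION_RATIO = FLOAT32_BITS // BINARY_BITS


def binarize(vector):
    """Pack the sign of each dimension into an int, one bit per dimension."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions where two packed vectors differ."""
    return bin(a ^ b).count("1")


query = binarize([0.5, -0.2, 0.1, -0.9])
doc = binarize([0.4, 0.3, -0.1, -0.7])
distance = hamming_distance(query, doc)
```

A smaller Hamming distance means the two vectors agree in sign on more dimensions, which serves as a cheap stand-in for similarity between the original floating-point vectors.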

Improving search relevance with word proximity
2 minutes by James G.

My website search engine uses text search to identify documents relevant to a given term. Up until recently, the search engine treated every word in a term independently.
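One way to stop treating query words independently is to reward documents in which those words appear close together. The Python sketch below computes the smallest gap between two query words in a document and turns it into a score boost. The tokenization, the function names, and the `1 / gap` formula are all assumptions for illustration, not the author's implementation.

```python
def min_word_gap(text, word_a, word_b):
    """Return the smallest distance in tokens between the two words, or None."""
    tokens = text.lower().split()
    last_a = last_b = None
    best = None
    for index, token in enumerate(tokens):
        if token == word_a:
            last_a = index
        if token == word_b:
            last_b = index
        if last_a is not None and last_b is not None:
            gap = abs(last_a - last_b)
            if best is None or gap < best:
                best = gap
    return best


def proximity_boost(text, word_a, word_b):
    """Score in (0, 1]: adjacent words get 1.0, distant words get less."""
    gap = min_word_gap(text, word_a, word_b)
    if gap is None or gap == 0:
        # A word is missing, or both query words are the same token.
        return 0.0
    return 1.0 / gap
```

With this scoring, a document containing "new york" as adjacent words would rank above one where "new" and "york" are several words apart, even though both contain each word once.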

The Prequel to SQL is SEQUEL
6 minutes by Justin Jaffray

A look back at the 1974 paper "SEQUEL: A STRUCTURED ENGLISH QUERY LANGUAGE" by Chamberlin and Boyce reveals the origins of SQL, highlighting how this groundbreaking work laid the foundation for modern database querying. While the paper was ahead of its time in understanding declarative programming and data abstraction, it had some interesting quirks, including a problematic BNF grammar and limited understanding of joins.

Good software development habits
5 minutes by Zarar

Zarar shares 10 practical habits that help maintain high-quality and efficient software development. Key points include keeping commits small, continuous refactoring, frequent deployment, smart testing strategies, and managing technical debt effectively.

🗓️ The Virtual Code AI Summit awaits (Dec 12)
sponsored by Sourcegraph

Featuring leaders from Google, Netflix, Anthropic, Netlify, and more, this event is all about the real-world progress being made with AI in complex enterprise codebases. Join for fantastic sessions and live Q&A, and connect with hundreds of other dev leaders. You can join us virtually or in person at our live viewing parties. Can't attend live? Register anyway and we'll send you the recordings.
