Building a notification system may seem trivial, but what about building one that could reach millions of users within a few seconds? What about doing that right after your advertisement airs?
Event-based notification systems are not uncommon anymore, but there’s rarely a cost-effective example of an on-demand, highly parallel notification system. The complexity of building such a system comes from the intersection of system design, site reliability, and cloud resource management, all while under pressure from the demands of an unhinged marketing campaign over TV and the Web.
In this presentation, we will focus on:
- How we built and tested a robust on-demand notification system.
- What it takes to manage cloud resources and site reliability at the same time.
- How to mitigate reliability issues with “zombie mode” and other relevant internal tooling we created.
Interview:
What's the focus of your work these days?
My current projects involve cost optimization and performance improvement for backend services as well as edge devices.
What technical aspects of your role are most important?
I focus on backend service reliability and robustness. While product teams are busy shipping features, my team and I provide guardrails and solutions to strengthen our infrastructure.
How does your InfoQ Dev Summit Boston session address current challenges or trends in the industry?
I think it provides a great perspective on how engineering teams could work with challenging and changing requirements, as well as mitigate scaling issues for big events.
How do you see the concepts discussed in your InfoQ Dev Summit Boston session shaping the future of the industry?
I don't see my session as a "how-to" guide, but I will try my best to inspire others to collaborate on and execute high-quality engineering projects that they first defined as "impossible."
Speaker
Zhen Zhou
Software Engineer at @Duolingo, Previous Theoretical Computer Science Enthusiast @CMU
Zhen is a senior software engineer at Duolingo. He is a core member of the Growth Infrastructure team, which is responsible for backend microservices that fuel Duolingo’s vibrant social features and retention strategies.
He joined Duolingo in 2020 after graduating from Carnegie Mellon University. He is deeply passionate about designing and implementing robust backend infrastructure that directly impacts millions of users globally. He embraces challenges that come with evolving system designs and adapts services to meet the needs of a growing and diverse global audience.