Awesome Scalability
An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users.
== If your system goes slow
- Understand your problem first: is it a scalability problem (fast for a single user but slow under heavy load) or a performance problem (slow even for a single user)? Review some design principles - (principle), then check how scalability - (scalability) and performance - (performance) problems are solved at tech companies. The intelligence - (intelligence) section is for those who work with data and machine learning at big-data and deep-learning scale.
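The distinction above can be turned into a rough first-pass check. The following is only a minimal sketch: the names `LatencySample` and `diagnose` are hypothetical, and the 200 ms latency target is an assumed value that you would replace with your own service-level target.

```python
from dataclasses import dataclass


@dataclass
class LatencySample:
    """Median latencies measured for the same endpoint (hypothetical structure)."""
    single_user_ms: float  # latency with exactly one client
    under_load_ms: float   # latency at the expected peak concurrency


def diagnose(sample: LatencySample, target_ms: float = 200.0) -> str:
    """Classify a slowdown using the definitions from the list above.

    target_ms is an assumed latency budget, not a recommended value.
    """
    if sample.single_user_ms > target_ms:
        # Slow even for a single user: a performance problem.
        return "performance"
    if sample.under_load_ms > target_ms:
        # Fast for one user but slow under heavy load: a scalability problem.
        return "scalability"
    return "ok"


if __name__ == "__main__":
    print(diagnose(LatencySample(single_user_ms=50, under_load_ms=900)))
    print(diagnose(LatencySample(single_user_ms=800, under_load_ms=850)))
```

A system that fails the single-user check may also scale poorly, but it is usually worth fixing the single-user latency first, because load tests on an already slow path are hard to interpret.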
== If your system goes down
- “Even if you lose all one day, you can build all over again if you retain your calm!” - Thuan Pham, former CTO of Uber. So keep calm and mind the availability - (availability) and stability - (stability) matters!
== If you are having a system design interview
- Look at some interview notes - (interview) and real-world architectures with completed diagrams - (architecture) to get a comprehensive view before designing your system on a whiteboard. You can check some talks - (talk) by engineers from tech giants to learn how they build, scale, and optimize their systems. There are also some selected books - (book) for you (most of them are free). Good luck!
== If you are building your dream team
- The goal of scaling a team is not to grow its size but to increase its output and value. You can find out how tech companies reach that goal through hiring, management, organization, culture, and communication in the organization - (organization) section.
== Content
- Principle - (principle)
- Scalability - (scalability)
- Availability - (availability)
- Stability - (stability)
- Performance - (performance)
- Intelligence - (intelligence)
- Architecture - (architecture)
- Interview - (interview)
- Organization - (organization)
- Talk - (talk)
- Book - (book)
== Principle
== Scalability
- Distributed Tracking, Tracing, and Measuring - (https://www.oreilly.com/ideas/understanding-the-value-of-distributed-tracing)
- Distributed Monitoring and Alerting - (https://www.oreilly.com/ideas/monitoring-distributed-systems)
- Periskop - Exception Monitoring Service at SoundCloud - (https://developers.soundcloud.com/blog/periskop-exception-monitoring-service)
- Alerting on Service-Level Objectives (SLOs) at SoundCloud - (https://developers.soundcloud.com/blog/alerting-on-slos)
- Monitoring and Alert System using Graphite and Cabot at HackerEarth - (http://engineering.hackerearth.com/2017/03/21/monitoring-and-alert-system-using-graphite-and-cabot/)
- Distributed Repositories, Dependencies, and Configurations Management - (https://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/)
- Scaling Continuous Integration and Continuous Delivery - (https://www.synopsys.com/blogs/software-security/agile-cicd-devops-glossary/)
== Availability
- Resilience Engineering - Learning to Embrace Failure - (https://queue.acm.org/detail.cfm?id=2371297)
- Autoscaling - (https://medium.com/@BotmetricHQ/top-11-hard-won-lessons-learned-about-aws-auto-scaling-5bfe56da755f)
== Stability
- Timeouts - (https://www.javaworld.com/article/2824163/application-performance/stability-patterns-applied-in-a-restful-architecture.html)
== Performance
- Performance Optimization on OS, Storage, Database, Network - (https://stackify.com/application-performance-metrics/)
- Performance Optimization by Tuning Garbage Collection - (https://confluence.atlassian.com/enterprise/garbage-collection-gc-tuning-guide-461504616.html)
- Performance Optimization by Brotli Compression - (https://blogs.akamai.com/2016/02/understanding-brotlis-potential.html)
== Intelligence
- Big Data - (https://insights.sei.cmu.edu/sei_blog/2017/05/reference-architectures-for-big-data-systems.html)
== Architecture
- Architectures of Finance and Banking Systems - (https://www.sesameindia.com/images/core-banking-system-architecture)
== Interview
- Designing Large-Scale Systems - (https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/)
- Explaining Low-Level Systems (OS, Network/Protocol, Database, Storage) - (https://www.palantir.com/how-to-ace-a-systems-design-interview/)
- "What Happens When... and How" Questions - (https://www.glassdoor.com/Interview/What-happens-when-you-type-www-google-com-in-your-browser-QTN_56396.htm)
== Organization
== Talk
== Book