awesome_scalability [https://RustRocket.rs -- Cloud Monk Losang Jinpa, Ph.D., MCSE / MCT, https://Rustaceans.rs, Full Stack Cloud Native Rust DevOps Engineer - Rocket Framework Rust Microservices on Kubernetes-AWS-Azure-GCP]

Awesome Scalability

An updated and organized reading list for illustrating the patterns of scalable, reliable, and performant large-scale systems. Concepts are explained in the articles of prominent engineers and credible references. Case studies are taken from battle-tested systems that serve millions to billions of users.

== If your system goes slow

First identify which problem you have: a scalability problem (fast for a single user but slow under heavy load) or a performance problem (slow for a single user). Review some design principles - (principle), then check how scalability - (scalability) and performance - (performance) problems are solved at tech companies. The intelligence - (intelligence) section is for those who work with data and machine learning at big data and deep learning scale.
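The distinction between the two kinds of problem can be sketched as a small check. This is only an illustration, assuming you have already measured latency for a single user and under heavy load; the function name `classify_slowness`, its parameters, and the 200 ms target are hypothetical and do not come from any of the linked articles.

```python
def classify_slowness(single_user_ms: float, loaded_ms: float, target_ms: float) -> str:
    """Classify a latency complaint using the two definitions above.

    All three arguments are hypothetical measurements supplied by the caller:
    latency with one user, latency under heavy load, and the latency target.
    """
    if single_user_ms > target_ms:
        # Slow even with no contention: a performance problem.
        return "performance"
    if loaded_ms > target_ms:
        # Fast for one user but slow under heavy load: a scalability problem.
        return "scalability"
    return "ok"


# Example measurements (made up for illustration).
print(classify_slowness(single_user_ms=40, loaded_ms=900, target_ms=200))
print(classify_slowness(single_user_ms=450, loaded_ms=950, target_ms=200))
```

The single-user check comes first because a system that is already slow for one user will also be slow under load, so the performance problem should be addressed before the scalability one.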

== If your system goes down

“Even if you lose all one day, you can build all over again if you retain your calm!” - Thuan Pham, former CTO of Uber. So keep calm and mind the availability - (availability) and stability - (stability) matters!

== If you are having a system design interview

Look at some interview notes - (interview) and real-world architectures with completed diagrams - (architecture) to get a comprehensive view before designing your system on a whiteboard. You can check some talks - (talk) by engineers from tech giants to learn how they build, scale, and optimize their systems. There are some selected books - (book) for you (most of them are free)! Good luck!

== If you are building your dream team

The goal of scaling a team is not to grow team size but to increase team output and value. You can find out how tech companies reach that goal in various aspects (hiring, management, organization, culture, and communication) in the organization - (organization) section.

== Content

Principle - (principle)
Scalability - (scalability)
Availability - (availability)
Stability - (stability)
Performance - (performance)
Intelligence - (intelligence)
Architecture - (architecture)
Interview - (interview)
Organization - (organization)
Talk - (talk)
Book - (book)

== Principle

== Scalability

Microservices and Orchestration - (https://martinfowler.com/microservices/)

Distributed Caching - (https://www.wix.engineering/post/scaling-to-100m-to-cache-or-not-to-cache)

Distributed Locking - (https://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html)

Distributed Tracking, Tracing, and Measuring - (https://www.oreilly.com/ideas/understanding-the-value-of-distributed-tracing)

Distributed Scheduling - (https://www.csee.umbc.edu/courses/graduate/CMSC621/fall02/lectures/ch11.pdf)
- Distributed Task Scheduling (3 parts) at PagerDuty - (https://www.pagerduty.com/eng/distributed-task-scheduling-3/)

Distributed Monitoring and Alerting - (https://www.oreilly.com/ideas/monitoring-distributed-systems)

Distributed Security - (https://msdn.microsoft.com/en-us/library/cc767123.aspx)

Distributed Messaging, Queuing, and Event Streaming - (https://arxiv.org/pdf/1704.00411.pdf)

Distributed Logging - (https://blog.codinghorror.com/the-problem-with-logging/)

Distributed Searching - (http://nwds.cs.washington.edu/files/nwds/pdf/Distributed-WR.pdf)

Distributed Storage - (http://highscalability.com/blog/2011/11/1/finding-the-right-data-solution-for-your-application-in-the.html)
- In-memory Storage - (https://medium.com/@denisanikin/what-an-in-memory-database-is-and-how-it-persists-data-efficiently-f43868cff4c1)
- Object Storage - (http://www.datacenterknowledge.com/archives/2013/10/04/object-storage-the-future-of-scale-out)

Relational Databases - (https://www.mysql.com/products/cluster/scalability.html)

NoSQL Databases - (https://www.thoughtworks.com/insights/blog/nosql-databases-overview)

Time Series Databases - (https://www.influxdata.com/time-series-database/)

Distributed Repositories, Dependencies, and Configurations Management - (https://betterexplained.com/articles/intro-to-distributed-version-control-illustrated/)

Scaling Continuous Integration and Continuous Delivery - (https://www.synopsys.com/blogs/software-security/agile-cicd-devops-glossary/)

== Availability

Resilience Engineering - Learning to Embrace Failure - (https://queue.acm.org/detail.cfm?id=2371297)

Failover - (http://cloudpatterns.org/mechanisms/failover_system)

Load Balancing - (https://blog.vivekpanyam.com/scaling-a-web-service-load-balancing/)

Rate Limiting - (https://www.keycdn.com/support/rate-limiting/)

Autoscaling - (https://medium.com/@BotmetricHQ/top-11-hard-won-lessons-learned-about-aws-auto-scaling-5bfe56da755f)

== Stability

Circuit Breaker - (https://martinfowler.com/bliki/CircuitBreaker.html)

Timeouts - (https://www.javaworld.com/article/2824163/application-performance/stability-patterns-applied-in-a-restful-architecture.html)

== Performance

Performance Optimization on OS, Storage, Database, Network - (https://stackify.com/application-performance-metrics/)

Performance Optimization by Tuning Garbage Collection - (https://confluence.atlassian.com/enterprise/garbage-collection-gc-tuning-guide-461504616.html)

Performance Optimization on Image, Video, Page Load - (https://developers.google.com/web/fundamentals/performance/why-performance-matters/)

Performance Optimization by Brotli Compression - (https://blogs.akamai.com/2016/02/understanding-brotlis-potential.html)

Performance Optimization on Languages and Frameworks - (https://www.techempower.com/benchmarks/)

== Intelligence

Big Data - (https://insights.sei.cmu.edu/sei_blog/2017/05/reference-architectures-for-big-data-systems.html)

Distributed Machine Learning - (https://www.csie.ntu.edu.tw/~cjlin/talks/bigdata-bilbao.pdf)

== Architecture

== Interview

Designing Large-Scale Systems - (https://www.somethingsimilar.com/2013/01/14/notes-on-distributed-systems-for-young-bloods/)

Explaining Low-Level Systems (OS, Network/Protocol, Database, Storage) - (https://www.palantir.com/how-to-ace-a-systems-design-interview/)

"What Happens When... and How" Questions - (https://www.glassdoor.com/Interview/What-happens-when-you-type-www-google-com-in-your-browser-QTN_56396.htm)

== Organization

== Talk

== Book

Fair Use Sources:

- https://github.com/azurecloudmonk/awesome-scalability
- https://github.com/binhnguyennus/awesome-scalability

© 1994 - 2024 Cloud Monk Losang Jinpa or Fair Use. Disclaimers

awesome_scalability.txt · Last modified: 2024/05/01 04:29 by 127.0.0.1