
Scalable Architecture
Engineering stories, infrastructure insights and everything you need to know about building scalable architecture, straight from the Hacker News community.

The thread examines the technical hurdles of scaling GitHub's infrastructure, specifically noting the complexity of managing on-disk Git repositories alongside Postgres and object storage. Actionable insight: Scaling Git-based platforms requires balancing local filesystem performance with distributed cloud storage architectures.
The thread explores the architectural simplicity of Craigslist, noting that high cacheability and data segmentation (text vs. images) allow for efficient scaling. A key takeaway is the historical reliance on external security teams (e.g., eBay/PayPal) for legacy platforms that lack in-house security infrastructure.
The discussion explores architectural strategies for scaling a 4TB PostgreSQL database facing heavy write traffic. Key actionable insights include: 1) Consider read replicas first to scale read workloads before implementing sharding, as replicas are simpler to manage for consistency. 2) Sharding requires application-level routing and introduces complexity regarding cross-shard transactions. 3) While Two-Phase Commit (2PC) can facilitate cross-shard transactions, it should be reserved for metadata-style tables, whereas high-throughput writes should be directed straight to individual shards to avoid consistency and performance bottlenecks.
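The application-level routing described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the approach taken in the thread: the `Shard` and `ShardRouter` names, the hash-modulo placement, and the round-robin replica selection are all hypothetical choices made for the example, and the connection strings are placeholders.

```python
import hashlib
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Shard:
    """One shard: a primary for writes plus optional read replicas (hypothetical model)."""
    primary: str
    replicas: list[str] = field(default_factory=list)

    def __post_init__(self):
        # Round-robin over replicas; None means reads fall back to the primary.
        self._replica_cycle = cycle(self.replicas) if self.replicas else None

    def read_target(self) -> str:
        return next(self._replica_cycle) if self._replica_cycle else self.primary


class ShardRouter:
    """Maps a sharding key to a shard, then to a primary or a replica."""

    def __init__(self, shards: list[Shard]):
        if not shards:
            raise ValueError("at least one shard is required")
        self.shards = shards

    def shard_for(self, key: str) -> Shard:
        # Assumption: stable hash modulo shard count. Simple, but changing
        # the number of shards remaps most keys.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def route(self, key: str, write: bool) -> str:
        shard = self.shard_for(key)
        # Writes always go straight to the owning shard's primary;
        # reads are spread across that shard's replicas when it has any.
        return shard.primary if write else shard.read_target()


router = ShardRouter([
    Shard("postgres://shard0-primary", ["postgres://shard0-replica-a",
                                        "postgres://shard0-replica-b"]),
    Shard("postgres://shard1-primary"),
])
write_dsn = router.route("user:42", write=True)
read_dsn = router.route("user:42", write=False)
```

Every write for a given key lands on exactly one primary, so no cross-shard coordination is needed on the hot path. A transaction spanning keys that hash to different shards would still need something like 2PC, which is the reason the summary suggests confining that to low-volume metadata tables.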
The thread warns that importing FAANG-style hierarchies and complex management systems into startups leads to declining product quality and mass attrition of senior talent. To mitigate this, companies should maintain an engineering-centric culture and keep management politics from overriding technical merit as they scale.
The thread identifies monorepo usage and strict single-language enforcement as key strategies for managing and migrating massive-scale codebases (50M+ lines).
The thread is a brief debate comparing advanced container orchestration (Kubernetes/Kata) against the practical resource scalability of managed EC2 instances, highlighting the tension between technical complexity and cost-effectiveness.