
Scalable Architecture
Engineering stories, infrastructure insights and everything you need to know about building scalable architecture, straight from the Hacker News community.

The discussion centers on the scaling challenges startups face. The original poster argues that very few projects fail for technical reasons before reaching 50k concurrent users; the main barrier is instead a lack of product-market fit. 'OpenClaw' is cited as an example. The actionable insight is that finding product-market fit and attracting users matters more than solving technical scalability problems early.
The discussion highlights a recurring problem: Google's infrastructure tools are often tailored to the company's internal needs, which creates cognitive overhead and usability problems for external users. Commenters suggest that many abstractions introduced by Google may be renamed or removed after feedback from real users, which points to a need for more user-centric design and adaptability in infrastructure projects.
The original poster shares their experience building a data-driven product that relies on official APIs, along with the challenges they faced in scaling and user engagement. The follow-up comment offers a different perspective (creator vs. consumer) and expresses interest in seeing the existing product. Key insights include the importance of considering data-sharing permissions, the difficulty of scaling API-dependent platforms, and the need to understand what incentive users have to adopt the product.
The discussion attributes AWS's reliability primarily to the fact that AWS uses its own platform internally, which results in a practical and stable environment. Commenters also emphasize that support is available when issues arise. The thread contrasts AWS's cell-based architecture with the global services of other providers: global services have a larger outage blast radius, though they can be appealing from a scalability perspective. These insights suggest that building on AWS offers operational confidence, and that organizations should weigh architectural trade-offs when choosing a cloud provider.
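The blast-radius contrast can be illustrated with a small sketch. It is not a description of how AWS or any other provider actually routes traffic; the hash-based `cell_for` rule, the cell count, and the customer IDs are all hypothetical, chosen only to show why a failure confined to one cell affects fewer customers than a failure in a single global service.

```python
import hashlib


def cell_for(customer_id: str, num_cells: int) -> int:
    """Map a customer to a cell with a stable hash (hypothetical routing rule)."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_cells


def affected_fraction(customers, num_cells, failed_cells):
    """Return the fraction of customers whose cell is in the failed set."""
    if not customers:
        return 0.0
    hit = sum(1 for c in customers if cell_for(c, num_cells) in failed_cells)
    return hit / len(customers)


# Hypothetical customer population.
customers = [f"customer-{i}" for i in range(10_000)]

# One of ten cells fails: only customers routed to that cell are affected.
cell_outage = affected_fraction(customers, num_cells=10, failed_cells={3})

# A single global service fails: modeled here as one cell holding everyone.
global_outage = affected_fraction(customers, num_cells=1, failed_cells={0})

print(f"cell-based outage affects {cell_outage:.1%} of customers")
print(f"global outage affects {global_outage:.1%} of customers")
```

With ten cells, the single-cell failure touches roughly a tenth of the customers, while the global failure touches all of them. The cost is that each cell must be provisioned and operated separately, which is the scalability trade-off the thread alludes to.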
The thread features a detailed experience report from a former Azure Compute Fabric Controller engineer who describes intense workloads, chronic understaffing, and the challenge of balancing system reliability against VM downtime. The report highlights how difficult the repair policies were to get right, the high stress and long hours that led to burnout, and the impact of hiring practices on team quality. Another contributor succinctly criticizes the 60+ hour work weeks as a sign of poor engineering culture. The actionable insight is to acknowledge the human cost of understaffing and overwork in critical infrastructure teams: adequate staffing, realistic workload distribution, and psychological safety are essential to a sustainable engineering culture.
The conversation critiques an argument about the scalability and distribution of Slack Apps in relation to Claude Code. It points out that most web apps are basic CRUD and that the majority of companies have fewer than 50k users, which calls into question how relevant scale concerns are for them. A follow-up comment stresses the need to distinguish between total users and paying users, a key consideration in any discussion of app scalability.