A couple of days ago, I had an interesting conversation with a friend of mine - a civil engineer - about how they plan and build things. One thing that grabbed my attention was the designs and the considerations they keep in mind before starting the real work; I noticed that they, too, build things that actually scale.
What is scalability?
In short, it's doing what you are doing in a bigger way: allowing more users to use your web application while maintaining its performance under the increasing load.
Many people confuse scalability with performance - the speed at which a single request executes - and focus on how to increase it, or which framework or protocol to use. There are two ways to scale:
- Vertical - adding resources to a single unit; an example would be adding a CPU to an existing server, or adding a hard drive to existing storage.
- Horizontal - adding multiple units and making them work as one; examples would be clustering, distributed file systems, or load balancers.
A scale factor is a number which scales, or multiplies, some quantity: in the equation y = Cx, C is the scale factor for x (it is also the coefficient of x, per Wikipedia). So if you lose 10% of processor power every time you add a CPU, your scalability factor is 90%, which means you will only be able to use 90% of the added resources.
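To make that concrete, here is a tiny sketch of the arithmetic; the numbers (8 CPUs, a 0.9 factor) are illustrative, not measured:

```python
# Sketch: usable capacity under a constant scalability factor.
# A factor of 0.9 means each added CPU contributes only 90% of its
# nominal power to the application.

def effective_capacity(units: int, scale_factor: float) -> float:
    """Usable capacity when each unit delivers scale_factor of its power."""
    return units * scale_factor

print(effective_capacity(8, 0.9))  # 8 CPUs at 90% -> 7.2 "effective" CPUs
```

In other words, the hardware bill grows linearly with the unit count, but the capacity you can actually use grows more slowly.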
When scaling hardware, vertical scaling may look like a quick win in the beginning, but as you scale more and more, your overhead cost becomes very high. Horizontal scaling, on the other hand, is not that costly since it relies on commodity server hardware; however, it requires you to build your application to run on multiple servers as one application.
Scalability is not just about CPU (processing power) and storage. For a successful scalable web application, all layers have to be scalable: the database layer, the application layer (Memcached, ScaleOut, Terracotta), the web layer, the load balancer, and the firewall.
Scaling a database can also be thought of in two ways: vertically (running the database server on a machine with more than one CPU, dividing tasks between the CPUs on the same machine), and horizontally, which may involve data partitioning and distributed transactions. Databases in general rely on three basic components: CPU, memory, and disk.
So when scaling a database, these three components determine the maximum performance: you cannot add unlimited CPUs and expect an increase in performance without also improving memory capacity and disk I/O. Common approaches to database scaling include:
- Master/Slave - a single master server handles all write operations (create, update, and delete), and one or more additional slave servers provide read-only access; however, this solution has side implications, such as replication lag between master and slaves.
- Table partitioning - data in a single large table is split across multiple disks for improved disk I/O utilization and reduced bottlenecks, though it can make joins a little slower.
- Cluster computing - many servers operating as a group with shared messaging between the nodes in the cluster. Most often this method relies on a Storage Area Network, and each node in the cluster runs a single instance of the database server. The partitioning can be done horizontally or vertically, which reduces the I/O bottleneck for a given table.
- Database sharding (shared nothing) - taking a large database and breaking it into smaller databases across servers, which results in easier-to-manage and faster databases. However, this approach has its own challenges to keep in mind, such as ensuring a fault-tolerant and reliable service, handling distributed query results, auto-increment key management, etc.
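The core of the sharding idea above can be sketched in a few lines; the shard names and the hash-based routing here are my own illustrative choices, not a prescribed scheme:

```python
import hashlib

# Sketch: route each record to one of several smaller databases (shards)
# by hashing its key. Shard names are made up for illustration.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]

def shard_for(key: str) -> str:
    """Pick a shard deterministically from the key's hash."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The same user key always lands on the same shard:
print(shard_for("user:42") == shard_for("user:42"))  # True
```

Note that this simple modulo scheme makes adding or removing shards painful, since most keys get remapped; consistent hashing is one common way to mitigate that.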
The application layer can be cached using many solutions and approaches. The most common is Memcached - a hash table that can be distributed across multiple machines, used to speed up dynamic database-driven web applications by caching data and objects in RAM to reduce the number of reads against external sources. So what should you cache? Commonly, database query results, objects that require heavy computation, or anything that takes a while to generate - but remember to cache only when you need to.
Memcached is pretty simple to configure with your web application; however, cache invalidation can get tricky, so it is usually best to set a TTL (time to live) on cached entries.
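The cache-aside-with-TTL pattern looks roughly like this; a plain dict stands in for a real Memcached client, and `load_user_from_db` is a hypothetical placeholder for a slow query:

```python
import time

cache = {}  # stand-in for a Memcached client: key -> (expires_at, value)
TTL_SECONDS = 60

def load_user_from_db(user_id):
    # Hypothetical slow database query.
    return {"id": user_id, "name": "user-%d" % user_id}

def get_user(user_id):
    """Cache-aside: return the cached value if still fresh, otherwise
    query the database and cache the result with a TTL."""
    key = "user:%d" % user_id
    entry = cache.get(key)
    if entry is not None and entry[0] > time.time():
        return entry[1]                          # cache hit, within TTL
    value = load_user_from_db(user_id)           # cache miss or expired
    cache[key] = (time.time() + TTL_SECONDS, value)
    return value
```

With a real Memcached client the dict operations become network calls (`get`/`set` with an expiry), but the hit/miss logic is the same.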
Load balancers fall into three categories:
- Application level (Layer 7) load balancers - which are slow compared to load balancers operating at lower levels
- Layer 3 load balancers - operating at the IP level and faster than application level load balancers
- Layer 4 load balancers - operating at the TCP level and extremely fast
While this is still a very basic overview, it does cover the main considerations for thinking about your scalable web application. One more thing I would like to touch on: scalability can come down to small details in your application. For example, if your website relies on an external API, accounting for that API's limitations - rate limits, latency, downtime - and how to deal with them also falls within your planned architecture for a more scalable web application.
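Handling an external API's rate limits, for instance, usually means retrying with exponential backoff. A minimal sketch, where `RateLimited` and the API call itself are hypothetical placeholders:

```python
import time

class RateLimited(Exception):
    """Raised by the (hypothetical) API client when its limit is hit."""

def call_with_backoff(api_call, max_retries=5, base_delay=0.5):
    """Retry a rate-limited call, doubling the wait after each failure."""
    for attempt in range(max_retries):
        try:
            return api_call()
        except RateLimited:
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise RuntimeError("API still rate-limited after %d retries" % max_retries)
```

Planning for this from the start - rather than bolting it on after the API starts rejecting you under load - is part of designing for scale.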