A Cloud Service Architecture

Tagged as: cloud

This post tries to present a bird's-eye view of a cloud service. If you leave out a lot of the specifics, it looks quite simple (and possibly generic)!

A Generic Cloud Service Architecture

Services need to serve both dynamic and static content. Static content belongs on CDNs (Content Delivery Networks), which distribute the content across geographies so you are always served from a nearby location (lower latency). The market leader is Akamai; AWS CloudFront is popular for its self-serve, pay-as-you-go model. There are many others, some with geographical advantages, some with attractive price points, and so on.

Now, when you look at the dynamic part, there are mainly two kinds of responses: dynamic pages (say, a page of your latest tweets) and APIs, which return structured information for programs (not humans) to consume. In both cases, there is a typical architecture:

  1. A cluster of servers sits behind a load balancer. Servers are added (automatically or manually) depending on load, and the balancer sends each request to the server with the least load.

  2. Then you have a database which stores the persistent information. You read from it to serve requests.

  3. Database reads are typically costly, so you put a cache between the servers and the database. Every time an item is read, it is added to the cache (with an expiry time), and further reads are served from the cache. When a write happens, the cached entry is invalidated. This makes reads fast. Another consideration is whether the cache is shared by all the servers; a shared cache makes invalidation easier and serving faster. If the database is distributed, this also scales the system, since read/write throughput is no longer throttled by a single database server.

  4. Writes — you would have noticed these go directly to the database. We don't have any other option here (the cache obviously doesn't help). To speed writes up, you can partition the database and keep different data in different databases. That makes things more complex, but the good news is that most applications are read-heavy. On a distributed, fully managed database like AWS DynamoDB, you can provision more capacity for writes if your application is write-heavy. That is one way to handle it.

  5. Latency — another consideration is how long a user has to wait for dynamic content if the server clusters are geographically far away. This is when you consider having multiple clusters of application servers, one per geography, and route traffic to the nearest one using latency-based DNS routing (AWS Route 53 supports this).
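The load balancing in step 1 can be sketched in a few lines of Python. This is a minimal toy, not a real balancer: the server names are made up, and "load" here is simply the count of in-flight requests per server.

```python
class LoadBalancer:
    """Toy least-load balancer: routes each request to the server
    currently handling the fewest in-flight requests."""

    def __init__(self, servers):
        self.in_flight = {s: 0 for s in servers}

    def route(self):
        # Pick the server with the fewest in-flight requests.
        server = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[server] += 1
        return server

    def done(self, server):
        # Called when a server finishes handling a request.
        self.in_flight[server] -= 1


lb = LoadBalancer(["app-1", "app-2"])
a = lb.route()   # "app-1" (both idle; ties go to the first server)
b = lb.route()   # "app-2"
lb.done(a)       # app-1 frees up
c = lb.route()   # "app-1" again, since it is now idle
```

Real balancers (ELB, HAProxy, nginx) offer several strategies beyond least-load, such as round-robin and IP hashing, plus health checks to drop dead servers from the pool.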
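The caching in step 3 is the classic cache-aside pattern: read through the cache with an expiry time, write to the database and invalidate. A minimal sketch, where a plain dict stands in for the real database and cache:

```python
import time

class CacheAside:
    """Cache-aside: reads populate the cache with a TTL;
    writes go to the database and invalidate the cached entry."""

    def __init__(self, db, ttl_seconds=60):
        self.db = db              # stand-in for the real database
        self.ttl = ttl_seconds
        self.cache = {}           # key -> (value, expires_at)

    def read(self, key):
        entry = self.cache.get(key)
        if entry and entry[1] > time.time():
            return entry[0]       # cache hit: no database read
        value = self.db[key]      # cache miss: costly database read
        self.cache[key] = (value, time.time() + self.ttl)
        return value

    def write(self, key, value):
        self.db[key] = value              # write goes to the database
        self.cache.pop(key, None)         # invalidate the stale entry
```

In production the `cache` dict would be a shared store like Redis or Memcached, so that an invalidation by one server is seen by all of them.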
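The database partitioning mentioned in step 4 usually means hashing each key to one of N databases, so writes spread across them. A sketch of that idea (using MD5 rather than Python's built-in `hash()`, which is not stable across processes):

```python
import hashlib

def partition_for(key, n_partitions):
    """Map a key to one of n_partitions databases using a stable hash,
    so the same key always lands on the same partition."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n_partitions

# Every write (and read) for "user:42" goes to the same partition:
p = partition_for("user:42", 4)
```

One caveat this sketch ignores: with plain modulo, changing the number of partitions remaps almost every key, which is why real systems use consistent hashing instead.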
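And the latency-based routing in step 5 boils down to: given the user's measured latency to each regional cluster, answer the DNS query with the closest one. A sketch with hypothetical region names and latencies:

```python
def nearest_cluster(latencies_ms):
    """Return the cluster with the lowest measured latency,
    roughly what a latency-based DNS service like Route 53 does."""
    return min(latencies_ms, key=latencies_ms.get)

nearest_cluster({"us-east-1": 180, "eu-west-1": 35, "ap-south-1": 220})
# -> "eu-west-1"
```

The hard part in practice is not this selection but maintaining the latency measurements per user network, which is exactly what the DNS provider handles for you.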

I have simplified things a lot, but I hope this serves as a good starting point for many!
