Microsoft Azure App-Fabric Caching Service Explained!

MIX always comes with a mix of feelings…excitement at the prospect of trying out the new releases and the heartache that comes with trying to understand  “in depth” the new technologies being released…and so starts the “googling..oops binging”,,,blogs,,,videos etc…What does it mean?? How does it impact me??

One such very important release at MIX 2011 is the AppFabric Caching ServiceAt Cennest we do a lot of Azure development and Migration work and this feature caught our immediate attention as something which will have a high impact on the architecture, cost and performance of new applications and Migrations .

So we collated information from various sources (references below ) and here is an attempt is simplify the explanation for you!

What is caching?

The Caching service is a distributed, in-memory, application cache service that accelerates the performance of Windows Azure and SQL Azure applications by allowing you to keep data in-memory and saving you the need to retrieve that data from storage or database.(Implicit Cost Benefit? Well depends on the costing of Cache service…yet to be released..)

Basically it’s a layer that sits between the Database and the application and which can be used to “store” data prevent frequent trips to the database thereby reducing latency and improving performance


How does this work?

Think of the Caching service as Microsoft running a large set of cache clusters for you, heavily optimized for performance, uptime, resiliency and scale out and just exposed as a simple network service with an endpoint for you to call. The Caching service is a highly available multitenant service with no management overhead for its users

As a user, what you get is a secure Windows Communication Foundation (WCF) endpoint to talk to and the amount of usable memory you need for your application and APIs for the cache client to call in to store and retrieve data.


The Caching service does the job of pooling in memory from the distributed cluster of machines it’s running and managing to provide the amount of usable memory you need. As a result, it also automatically provides the flexibility to scale up or down based on your cache needs with a simple change in the configuration.

Are there any variations in the types of Cache’s available?

Yes, apart from using the cache on the Caching service there is also the ability to cache a subset of the data that resides in the distributed cache servers, directly on the client—the Web server running your website. This feature is popularly referred to as the local cache, and it’s enabled with a simple configuration setting that allows you to specify the number of objects you wish to store and the timeout settings to invalidate the cache.


What can I cache?

You can pretty much keep any object in the cache: text, data, blobs, CLR objects and so on. There’s no restriction on the size of the object, either. Hence, whether you’re storing explicit objects in cache or storing session state, the object size is not a consideration to choose whether you can use the Caching service in your application.

However, the cache is not a database! —a SQL database is optimized for a different set of patterns than the cache tier is designed for. In most cases, both are needed and can be paired to provide the best performance and access patterns while keeping the costs low.

How can I use it?

  • For explicit programming against the cache APIs, include the cache client assembly in your application from the SDK and you can start making GET/PUT calls to store and retrieve data from the cache.
  • For higher-level scenarios that in turn use the cache, you need to include the ASP.NET session state provider for the Caching service and interact with the session state APIs instead of interacting with the caching APIs. The session state provider does the heavy lifting of calling the appropriate caching APIs to maintain the session state in the cache tier. This is a good way for you to store information like user preferences, shopping cart, game-browsing history and so on in the session state without writing a single line of cache code.


When should I use it?

A common problem that application developers and architects have to deal with is the lack of guarantee that a client will always be routed to the same server that served the previous request.

When these sessions can’t be sticky, you’ll need to decide what to store in session state and how to bounce requests between servers to work around the lack of sticky sessions. The cache offers a compelling alternative to storing any shared state across multiple compute nodes. (These nodes would be Web servers in this example, but the same issues apply to any shared compute tier scenario.) The shared state is consistently maintained automatically by the cache tier for access by all clients, and at the same time there’s no overhead or latency of having to write it to a disk (database or files).

How long does the cache store content?

Both the Azure and the Windows Server AppFabric Caching Service use various techniques to remove data from the cache automatically: expiration and eviction. A cache has a default timeout associated with it after which an item expires and is removed automatically from the cache.

This default timeout may be overridden when items are added to the cache. The local cache similarly has an expiration timeout. 

Eviction refers to the process of removing items because the cache is running out of memory. A least-recently used algorithm is used to remove items when cache memory comes under pressure – this eviction is independent of timeout.

What does it mean to me as a Developer?

One thing to note about the Caching service is that it’s an explicit cache that you write to and have full control over. It’s not a transparent cache layer on top of your database or storage. This has the benefit of providing full control over what data gets stored and managed in the cache, but also means you have to program against the cache as a separate data store using the cache APIs.

This pattern is typically referred to as the cache-aside, where you first load data into the cache and then check if it exists there for retrieving and, only when it’s not available there, you explicitly read the data from the data tier. So, as a developer, you need to learn the cache programming model, the APIs, and common tips and tricks to make your usage of cache efficient.

What does it mean to me as an Architect?

What data should you put in the cache? The answer varies significantly with the overall design of your application. When we talk about data for caching scenarios, usually we break it into the data types and access patterns

  • Reference Data( Shared Read Data):-Reference data is a great candidate for keeping in the local cache or co-located with the client


  • Activity Data( Exclusive Write):- Data relevant to the current session between the user and the application.

Take for example a shopping cart!During the buying session, the shopping cart is cached and updated with selected products. The shopping cart is visible and available only to the buying transaction. Upon checkout, as soon as the payment is applied, the shopping cart is retired from the cache to a data source application for additional processing.

Such an collection of data would be best stored in the Cache Server providing access to all the distributed servers which can send updates to the shopping cart . If this cache were stored at the local cache then it would get lost.



  • Shared Data(Multiple Read and Write):-There is also data that is shared, concurrently read and written into, and accessed by lots of transactions. Such data is known as resource data.

Depending upon the situation, Caching shared data on a single computer can provide some performance improvements but for large-scale auctions, a single cache cannot provide the required scale or availability. For this purpose, some types of data can be partitioned and replicated in multiple caches across the distributed cacheimage

Be sure to spend enough time in capacity planning for your cache. Number of objects, size of each object, frequency of access of each object and pattern for accessing these objects are all critical in not only determining how much cache you need for your application, but also on which layers to optimize for (local cache, network, cache tier, using regions and tags, and so on).

If you have a large number of small objects, and you don’t optimize for how frequently and how many objects you fetch, you can easily get your app to be network-bound.

Also Microsoft will soon release the pricing for using the caching service so obviously you need to ensure usage of the Caching service is “Optimized” and when it comes to the cloud “Optimized= Performance +Cost”!!

Hope this helps you understand this new term better wrt Azure.

Until Next Time

Cennest!                                                                                                                                                                                           We can help you move to the cloud!”


Azure Scalability:- Use “Queues” as your bridges..

I feel the importance of Queues as a communication medium between Web and worker roles is quite a bit under-glorified…apart from acting as reliable messengers , if implemented properly Queues pretty much hold the key to your application being Extensible and Scalable

Windows Azure Queue allows decoupling of different parts of a cloud application, enabling cloud applications to be easily built with different technologies and easily scale with traffic needs.

This above definition is full of some very important keywords..decoupling.…different technologies…. scale with traffic needs…all very important to make a scalable and extensible application..

Lets see how Queues help in all these aspects.

1.  Scalability:-

  • Observing the queue length and tuning the number of backend nodes accordingly, applications can effectively scale smoothly accordingly to the amount of traffic. A growing queue length or an almost always zero queue length is an indication of how loaded your worker roles are and you might want to scale them up or down accordingly
  • The use of queues decouples different parts of the application, making it easier to scale different parts of the application independently. This allows the number of web and worker roles to be adjusted independently without affecting the application logic.
  • Separate queues can be used for work items of different priorities and/or different weights, and separate pools of backend servers can process these different queues. In this way, the application can allocate appropriate resources (e.g. in term of the number of servers) in each pool, thereby efficiently use the available resources to meet the traffic needs of different characteristics

  2.     Extensibility:-

  • Different technologies and programming language can be used to implement different parts of the system with maximum flexibility. For example, the component on one side of the queue can be written in .NET framework , and the other component can be written in Python.

3.   Decoupling:-

  • Changes within a component are transparent to the rest of the system. For example, a component can be re-written using a totally different technology or programming language, and the system still works seamlessly without changing the other components, since the components are decoupled using queues.  The old implementation and the new implement can run on different servers at the same time, and process work items off the same queue. All these are transparent to the other components in the application

But to be able to achieve all these benefits you need to make sure your queues are implemented in the most optimal way . There are a lot of best practices around Queue implementation such as Invisibility Time, Delete queue messages, Worker role instance manipulation etc…more on that some other time…meanwhile if you want more inputs on “Queueing your way to scalability” drop us a a note at

Until then!


Programming Table Storage in Azure…Choosing your Keys!!

Azure holds a lot of promise for applications with

1. Spikes in usage

2. Requirement for low set up cost

3.Huge storage requirements

Out of these 3 major benefits, most applications will exploit the 1st and the 3rd potential of Azure!..While Spikes in Usage is more related to how the application is architected etc(we’ll talk about it in another post), huge storage requirements is a pretty interesting topic…the potential here is enormous..

You want to learn more on Silverlight 4, won’t it be cool if you could just see what articles people have read and which ones they found useful and why?? I am sure it will help you make more out of your time by helping you weed out the “not so useful” articles….but who would take the onus of storing such HUGE data?? Well something like Table Storage in Azure could be the answer to is a scalable, extensible and inexpensive storage medium to store just this type of data…

While just having fun with Azure, we made this small app which allows a user add a weblink onto the Azure table storage…


he can then also search the entire table for weblinks of his interest based on categories..


One can take this forward to search by submitter, or Category+ rating etc etc..

The entire application is uploaded here…its a rough sample (so no commenting etc etc )but it will give you a idea..if you have any issues with it…its!!!

There are lots of samples out there so i don’t want to focus on “how to program” this….i am more interested in the most important aspect of using the Table Storage in Azure….How to select you Keys!!(Partition + Row Key)

Basically your partition key should be the most commonly used filtering criteria for your application….so basically you need to realize the main purpose of the application and in what way it is going to be used, what kind of queries are going to be most frequent…one tip here is that the you should be able to use your partition key in the where clause of almost all your queries..

Lets see what kind of querying makes sense for this simple application

1. Get me all weblinks in CategoryX

2. Get me all weblinks in CategoryX with Rating Y

3. Get me all weblinks submitted by User T

4. Get me all weblinks submitted by User T in Category X

etc, etc

If you notice most querying is going to involve the “Category”, so we should be able to use Category in the where clause!!….so the partition key becomes category!!

You must use Category as a condition in most of your queries


Now for the Row Keythe rule here says that if there are 2 frequently used keys, use the most freq as the partition key and the other one as the Row Key…so if you see the querying, you will be tempted to use the “SubmittedBy” as the row key…but there is a problem…PartitionKey+RowKey should be unique for every row in the table…so if you use SubmittedBy as the Row key, a user will be able to submit a weblink for a category only once…makes no sense!!

For applications such as this…it makes sense to use Guid.NewGuid as the row key to allow multiple entries and still maintain uniqueness…

Like i said, let me know if you want the source code…will be happy to load it somewhere for you!!…Azure table storage has huge potential as long as the keys are selected properly!!!



Bulb Flash:- Few tips to Reduce Cost and Optimize performance in Azure!

Applications deployed on azure are meant to 1. Perform better at 2. Lesser cost.

As a software developer its inherently your job to ensure this holds true..

Here are a few pointers to keep in mind when deploying applications on Azure to make sure your clients are happy:-)

Application Optimizations

  • Cloud is stateless , so if migrating an existing ASP.NET site and you are expecting to have multiple web roles up there then you  cannot use the default in memory Session state..You basically have 2 options now
    • Use ViewState instead . Using ASP.NET view state is an excellent solution so long as the amount of data involved is small. But if the data is huge you are increasing your data traffic , not only affecting the performance but also accruing cost… remember..inword traffic is $.10/GB and outgoing $0.15/GB.
    • Second option is to persist the state to server side storage that is accessible from multiple web role instances. In Windows Azure, the server side storage could be either SQL Azure, or table and blob storage. SQL Azure storage is relatively expensive as compared to table and blob storage. An optimum solution uses  the session state  sample provider that you can download from The only change required for the application to use a different session state provider is in the Web.config file.
    • Remove expired sessions from storage to reduce your costs. You could add a task to one of your worker roles to do this .
  • Don’t go overboard creating multiple worker roles which do small tasks. Remember you pay for each role deployed so try to utilize each node’s compute power to its maximum. You may even decide to go for a bigger node instead of creating a new worker role and have the same role do multiple tasks.

Storage Optimization

  • Choosing the right Partition and Row key for your tables is crucial.Any process which needs to tablescan across all your partitions will be show. basically if you are not being able to use your partition key in the “Where” part of your LINQ queries then you went wrong somewhere..

Configuration Changes

  • See if you can make the following system.configuration changes..
    • expect100Continue:- The first change switches off the ‘Expect 100-continue’ feature. If this feature is enabled, when the application sends a PUT or POST request, it can delay sending the payload by sending an ‘Expect 100-continue’ header. When the server receives this message, it uses the available information in the header to check whether it could make the call, and if it can, it sends back a status code 100 to the client. The client then sends the remainder of the payload. This means that the client can check for many common errors without sending the payload. If you have tested the client well enough to ensure that it is not sending any bad requests, you can turn off the ‘Expect 100-continue’ feature and reduce the number of round-trips to the server. This is especially useful when the client sends many messages with small payloads, for example, when the client is using the table or queue service.
    • maxconnection:-The second configuration change increases the maximum number of connections that the web server will maintain from its default value of 2. If this value is set too low, the problem manifests itself through "Underlying connection was closed" messages.

WCF Data Optimizations(When using Azure Storage)

  • If you are not making any changes to the entities that WCF Data Services retrieve set the MergeOption to “NoTracking”
  • Implement paging. Use the ContinuationToken and the ResultSegment class to implement paging and reduce in/out traffic.

These are just a few of the aspects we at Cennest keep in mind which deciding the best deployment model for a project..



Azure:- What Azure Offers and what exactly does “Compute” mean?

An architect always has the tough job of convincing the client that the solution deployed is “optimum” under the given constraints!!.

Especially when it comes to Azure where there is a constant Cost vs Compute fight!!.

Got this excellent diagram from the Windows Architecture Guidance which illustrates what you get when you create an Azure account..


Basically when you select Azure as an offering in your subscriptions you get

  • One Azure Project(Portal).
  • Within this portal you can host upto 6 Services(Websites)
  • Upto 5 Azure Storage Accounts(each with 100TB of data).
  • Interesting Limitation:- Each service can have only upto 5 roles(Web+ worker in any combination)
  • Very Interesting Limitation:- The entire project can consume only 20 compute instances!!

The last point is extremely important, but to understand its significance we need to understand some basics in Azure.

You create your service as a combination of web + worker roles, each role you configure runs on a Virtual Machine(VM) 

For each role you can define the size of VM you prefer, based on the memory and disk space requirements expected or that role.

Here are the configurations for the VMs


Compute Power is equal to the number of CPU cores being consumed by your VM!!

Suppose you had 2 role configured for 3 instances, each on Large VMs then you consumed 2*3*4=24  instances of the CPU core and are actually out of space!!.

The  calculation is Role*instances*No of CPU Cores in VM.

A very important point to remember when you define the tasks being accomplished by your roles and the kind of VM they are being hosted on!!