Let's get to the example of a simple counter for which the frequency of updates is too high for a single entity (such a counter could be used for counting the number of views for a YouTube video):
"""Shards for the counter"""
count = models.IntegerField(default=0)
name = models.CharField(primary_key=True,
NUM_SHARDS = 20
Retrieve the value for a given sharded counter.
total = 0
for counter in SimpleCounterShard.objects.all():
total += counter.count
Increment the value for a given sharded counter.
index = random.randint(0,
SimpleCounterShard.NUM_SHARDS - 1)
shard_name = 'shard' + str(index)
counter = SimpleCounterShard.objects.get_or_create(
counter.count += 1
The basic idea is to divide the counter into N sub-counters and to compute the counter's real value (
get_count) by summing over all values of the N sub-counters. While trying to increment the counter's real value (
increment_count) we select one of the N sub-counters at random. As a result we avoid contention. By increasing the number of shards you can increase the maximum throughput.
There exists an App Engine specific article about sharding counters from Joe Gregorio. If you want to learn more about sharding counters take a look at that article.
I ported the code (including GeneralCounterShard) from that article to native Django such that it can theoretically be used for any database supporting optimistic transactions. The code is available here and you can test the live demo.
Django's advantageNow we can get to an exciting example of how Django's ORM can get us out of the stone age of non-relational database development practice. Let's see an example using F() objects:
YouTubeVideo.objects.filter(pk=keyname).update(The database backend can detect such "counting updates" and use sharding for App Engine or other techniques like updates via background tasks for other databases automatically in order to perform such an update. Thus Django gives us the advantage of an additional layer of abstraction. Moreover, we can formulate "counting updates" independently of the database used, so we can switch to a different database without having to change the code.
Keep in mind that sharding is a technique which can be used for much more than just counters. This is our first App Engine app ported to native Django. If you plan to port some other App Engine app to Django you can use the sharding counter port to get an idea of how to do it. And please let us know about it.