Template Caching in TurboStan
Filed under: turbostan
Having solved the inheritance problem, my next goal is to implement some caching features in TurboStan. There are many issues around caching dynamic content, and here are some of my preliminary thoughts on caching in TurboStan:
- Stan is basically a Python expression, so caching .pyc files is an easy option that will give at least minor improvements.
- Caching and serving static HTML is going to be the fastest.
- Most pages are a combination of static and dynamic content, so the programmer should be able to specify whether a fragment can be considered static. This way even a dynamic page can benefit from having its static bits cached as HTML.
- Even dynamic fragments are quite often temporarily static. That is, they change infrequently, so these should be able to be cached as HTML given there is a trigger to indicate when the cache should be rebuilt.
- Session-based caching might be handy as well. Content may be dynamic across users but static per user. Having a mechanism to support this could be beneficial, but cache storage might explode, so this would need to be carefully considered.
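To make the storage worry in the last bullet concrete, here is a minimal sketch of a per-session fragment store with a hard size limit. Nothing here is TurboStan code: the class name BoundedFragmentStore, the key layout (template, fragment, session id) and the max_entries limit are all assumptions for illustration. It evicts the least recently used entry once the limit is reached, which is one simple way to stop per-user caching from growing without bound.

```python
from collections import OrderedDict


class BoundedFragmentStore:
    """Hypothetical LRU store for rendered fragments (not part of TurboStan)."""

    def __init__(self, max_entries=1000):
        self._entries = OrderedDict()  # key -> cached HTML, oldest first
        self._max = max_entries

    @staticmethod
    def key(template, fragment, session_id=None):
        # session_id=None means the fragment is shared by all users.
        return (template, fragment, session_id)

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as recently used
        return self._entries[key]

    def put(self, key, html):
        self._entries[key] = html
        self._entries.move_to_end(key)
        # Drop least recently used entries until we are back under the limit.
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)

    def __len__(self):
        return len(self._entries)
```

A fragment that is static per user would be stored under key('index', 'sidebar', session_id), while a globally static one would leave session_id as None.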
Most of this is pretty standard, I think. The question becomes how to specify these concepts within the constraints of TurboGears. Since templates are specified via a decorator, that seems one logical place to pass this information. However, because inheritance may bypass this decorator, that doesn't seem likely to work. Also, I'm not certain that would give the fine-grained cache control I'm looking for.
The solution I'm leaning toward is to specify the cache directives directly within the template. While this will add some clutter to the template, it will also allow the programmer/designer to have cache control tied directly to the item being cached. Also, because this puts cache control directly within TurboStan, it will help insulate the caching system and reduce worries about some future caching system in TurboGears conflicting with whatever scheme I come up with.
# proposed caching directives within a fragment
inherits ( 'index' ) [
    override ( 'title' ) [
        cache [ # static cache, expires only on template change
            xml ( '''
                This never changes
            ''' )
        ]
    ],
    override ( 'content' ) [
        cache ( expires = vars.cache_expired ) [
            # expires if the cache_expired callback returns True
            vars.content
        ],
        cache ( expires = vars.time_expired ) [
            # expires based on static time, e.g. at midnight
            vars.remote_rss_feeds
        ]
    ]
]
What I'm thinking here is that the expires keyword will take a callback function that returns a boolean indicating whether the cache should expire. TurboStan can then keep an internal table of these callbacks, one per fragment, and call the relevant one before rendering the fragment. If it returns False, TurboStan simply returns the static HTML from its cache instead of rendering. If no expires keyword is given, the lookup in that table fails and is treated as False by default (i.e. the content never expires).
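As a rough sketch of that callback table, assuming each fragment already has some unique id (how to get one is a separate problem): the names FragmentCache, register, render and daily_expiry below are made up for illustration and are not TurboStan API. The daily_expiry helper shows one possible shape for a time-based callback such as "expire at midnight".

```python
import datetime


class FragmentCache:
    """Hypothetical table of expiry callbacks plus cached HTML per fragment."""

    def __init__(self):
        self._expires = {}  # fragment id -> callback returning a bool
        self._html = {}     # fragment id -> cached HTML

    def register(self, fragment_id, expires=None):
        # Fragments without an expires callback are simply not in the table.
        if expires is not None:
            self._expires[fragment_id] = expires

    def is_expired(self, fragment_id):
        callback = self._expires.get(fragment_id)
        # A failed lookup means "never expires".
        return bool(callback()) if callback is not None else False

    def render(self, fragment_id, render_fragment):
        if fragment_id in self._html and not self.is_expired(fragment_id):
            return self._html[fragment_id]   # serve the static HTML
        html = render_fragment()             # (re)build the fragment
        self._html[fragment_id] = html
        return html


def daily_expiry(today=datetime.date.today):
    """Build a callback that returns True once each time the date changes."""
    last_seen = [today()]

    def expired():
        now = today()
        if now != last_seen[0]:
            last_seen[0] = now
            return True
        return False

    return expired
```

One design question this raises: the callback is consulted on every request, so it needs to be cheap, otherwise checking the cache costs as much as rendering the fragment.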
I've never implemented a caching system prior to this, so I'm sure there's plenty I'm overlooking. Regardless, I suspect implementation will reveal most of it.
Another idea that occurs to me is to simply provide a communication method with an external cache (e.g. Squid), much like Zope does. By this I mean, don't implement caching of actual static content within TurboStan, but rather simply do cache control, notifying the external cache when the cache is dirty. The downside is that this requires an external cache, which isn't always a pleasant option for small sites, so perhaps the ultimate solution would be to provide a simple caching hook within TurboStan that lets the developer choose between internal and external caching.
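Such a hook might be as small as a shared interface with two implementations. This is only a sketch under my own assumptions: the names InternalBackend, ExternalBackend and render_cached are hypothetical, and the notify callable stands in for whatever mechanism would actually tell an external cache that a key is dirty.

```python
class InternalBackend:
    """Keeps rendered HTML in process memory."""

    def __init__(self):
        self._html = {}

    def fetch(self, key):
        return self._html.get(key)

    def store(self, key, html):
        self._html[key] = html

    def invalidate(self, key):
        self._html.pop(key, None)


class ExternalBackend:
    """Stores nothing itself; only reports dirty keys to an external cache."""

    def __init__(self, notify):
        self._notify = notify  # hypothetical callable taking the dirty key

    def fetch(self, key):
        return None  # the external cache answers hits before we are reached

    def store(self, key, html):
        pass  # the external cache stores the response on its way out

    def invalidate(self, key):
        self._notify(key)


def render_cached(backend, key, render_fragment, expired):
    """Render through whichever backend the developer picked."""
    if expired:
        backend.invalidate(key)
    else:
        html = backend.fetch(key)
        if html is not None:
            return html
    html = render_fragment()
    backend.store(key, html)
    return html
```

With this shape the choice between internal and external caching is a single constructor call, and the cache-control logic (deciding when something is dirty) stays the same either way.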
Sounds like fun.







So I went ahead and added the simplest form of caching, which in this case is caching the compiled Stan template (the Python bytecode) and did some quick tests.
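For illustration, bytecode caching of this kind can be sketched in a few lines of plain Python. This is not the actual TurboStan code: the function names and the in-memory, mtime-keyed cache are my own assumptions, and it relies only on a template being a single Python expression.

```python
import os

# Hypothetical in-memory cache: template path -> (mtime, code object).
_code_cache = {}


def load_template_code(path):
    """Return the compiled code object for the template expression at path."""
    mtime = os.path.getmtime(path)
    cached = _code_cache.get(path)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # template unchanged on disk: reuse the bytecode
    with open(path) as f:
        source = f.read()
    # The template is one expression, so compile it in 'eval' mode.
    code = compile(source.strip(), path, "eval")
    _code_cache[path] = (mtime, code)
    return code


def render(path, namespace):
    """Evaluate the (possibly cached) code object in the given namespace."""
    return eval(load_template_code(path), dict(namespace))
```

The parse and compile step is skipped on every request after the first, until the file's modification time changes.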
Rendering a moderately sized TurboStan template without bytecode caching:
and with caching:
Pretty good improvement for little work. I did this part first since it was
The reason for the last item in the list is that in order to maintain a cache, there needs to be a way to uniquely identify cache fragments. When the templates are re-evaluated on every execution, all previous information about them is lost. The only unique identifier is the template path, which is insufficient for tracking fragments within the template. By pre-compiling the template, my plan is to inject constants into the compiled output that uniquely identify cached fragments. Whether this is actually doable remains to be seen, but at least I have a basic plan of attack.
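One way the constant injection could work, sketched with the modern Python ast module rather than whatever mechanism ends up in TurboStan: walk the parsed template, find every cache[...] or cache(...)[...] subscript, and add a numbered keyword argument before compiling. The keyword name fragment_id and both helper names are assumptions, and the sketch assumes cache accepts that keyword.

```python
import ast


class TagCacheFragments(ast.NodeTransformer):
    """Rewrite cache[...] and cache(...)[...] to carry a unique fragment_id."""

    def __init__(self):
        self.count = 0

    def visit_Subscript(self, node):
        self.generic_visit(node)  # tag nested cache fragments first
        target = node.value
        if isinstance(target, ast.Name) and target.id == "cache":
            # Bare cache[...] becomes cache(fragment_id=N)[...]
            target = ast.Call(func=target, args=[], keywords=[])
        elif not (isinstance(target, ast.Call)
                  and isinstance(target.func, ast.Name)
                  and target.func.id == "cache"):
            return node  # some other subscript: leave untouched
        target.keywords.append(
            ast.keyword(arg="fragment_id", value=ast.Constant(self.count)))
        self.count += 1
        node.value = target
        return node


def compile_with_fragment_ids(source, path):
    """Compile a template expression with ids injected into cache directives."""
    tree = ast.parse(source.strip(), path, mode="eval")
    tree = ast.fix_missing_locations(TagCacheFragments().visit(tree))
    return compile(tree, path, "eval")
```

Combined with the template path, the injected number gives each fragment a key that survives across requests. The numbering is only stable for a given version of the template source, which fits the idea that a static cache expires on template change anyway.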
Incidentally, this test isn't really reflective of how long it takes Stan to generate 10KB of HTML. The page in question is a real page from a project I'm working on and includes fetches from a database, among other things. Nevertheless, I'm mostly interested in relative times, so provided all else is equal, the time differences are all that matter. Plus, I think it's better to test against real-world data for this sort of thing, so while I'll certainly be doing some artificial benchmarks against Kid later on, for the time being this is the type of testing I'm interested in.