... indistinguishable from magic
effing the ineffable since 1977
Programming and Performance

My approach to performance while coding is to keep it in the back of my mind at all times. That means not ignoring it, but also resisting the temptation to focus on it too much during the design and initial implementation of a feature, and planning to revisit it if performance problems become apparent later.

This philosophy has both strengths and weaknesses, and recent events have showcased both.

cmScribe uses a complex and flexible fine-grained permissioning mechanism where permissions can be granted to all kinds of actions on all kinds of objects. Having certain permissions can cause others to be granted implicitly, and the rules for this kind of implication can be any arbitrary C# code. Since the permissions are so fine-grained, any given page hit can require a large number of permissions to be evaluated. Furthermore, the implication rules mean that evaluating one permission may require a number of others to be evaluated as well.
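To make the shape of the problem concrete, here is a minimal sketch of a permission check with implication rules. It is written in Python rather than cmScribe's C#, and every name in it (`PermissionChecker`, the `(action, object_id)` tuples, the two example rules) is made up for illustration; it is not cmScribe's actual design. The point is only to show how one check can fan out into several.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical shape: a permission is an (action, object_id) pair.
Permission = Tuple[str, str]
# A rule receives the user, the permission being asked about, and a callback
# for checking other permissions. It returns True if it grants the permission.
Rule = Callable[[str, Permission, Callable[[str, Permission], bool]], bool]


class PermissionChecker:
    def __init__(self, explicit_grants: Dict[str, Set[Permission]], rules: List[Rule]):
        self.explicit_grants = explicit_grants
        self.rules = rules
        self.evaluations = 0  # counts every permission evaluated, including nested ones

    def has_permission(self, user: str, perm: Permission) -> bool:
        self.evaluations += 1
        if perm in self.explicit_grants.get(user, set()):
            return True
        # Rules are arbitrary code, so each one may ask about further permissions.
        return any(rule(user, perm, self.has_permission) for rule in self.rules)


def edit_implies_view(user, perm, check):
    """Hypothetical rule: whoever can edit an object can also view it."""
    action, obj = perm
    return action == "view" and check(user, ("edit", obj))


def admin_implies_edit(user, perm, check):
    """Hypothetical rule: a site admin can edit any object."""
    action, obj = perm
    return action == "edit" and check(user, ("admin", "site"))


checker = PermissionChecker(
    explicit_grants={"alice": {("admin", "site")}},
    rules=[edit_implies_view, admin_implies_edit],
)
```

With these two rules, asking whether `alice` can view `page1` takes three evaluations (view, then edit, then admin) before the answer comes back true. A user with no grants at all costs the same three evaluations to get a false.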

The system is so complex, in fact, that I struggled quite a lot during the initial design process to come up with a way of meeting all the requirements at all. (Is it over-engineered? I don't know. I do know that after using it for a year there's only one feature I'd have cut, and that one has never been used and doesn't add any complexity.) The first and biggest advantage of keeping performance issues on the back burner is that if I'd had to juggle performance along with all the other constraints I was trying to meet, I don't know whether I'd have been able to produce a working system in the first place. In this case, deferring performance for later may have made the difference between impossible and possible.

Since then I've had to revisit this code for performance reasons on two or three separate occasions. You could look at this as a disadvantage of the approach I took: surely if performance had been designed in from the very beginning then I wouldn't have had to repeatedly fix performance problems later. But you can also look at it as a strength: the code worked adequately to start with, without my spending the time on performance. Later, as more demanding scenarios came up, it was possible to get it performing adequately again without too much trouble. That took a combination of caching frequently-used information in memory, tweaking the order of operations so the common cases take fewer steps, and micro-optimizing the individual steps to eliminate avoidable database hits and other expensive operations. I'm in the middle of an iteration of that process right now, and I'm entirely confident that I can have it performing adequately again shortly.
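As an illustration of the first kind of tweak, here is a minimal sketch of caching a user's grants in memory so that repeated checks during a page hit don't each go back to the database. It is Python rather than C#, and `GrantStore`, `CachedGrants` and the `(action, object_id)` tuples are hypothetical stand-ins, not cmScribe code.

```python
from typing import Dict, FrozenSet, Tuple

Permission = Tuple[str, str]  # (action, object_id), a hypothetical shape


class GrantStore:
    """Stand-in for the database; counts how often it is queried."""

    def __init__(self, rows: Dict[str, FrozenSet[Permission]]):
        self._rows = rows
        self.hits = 0

    def load_grants(self, user: str) -> FrozenSet[Permission]:
        self.hits += 1  # pretend this is an expensive round trip
        return self._rows.get(user, frozenset())


class CachedGrants:
    """Keeps each user's grants in memory after the first lookup."""

    def __init__(self, store: GrantStore):
        self._store = store
        self._cache: Dict[str, FrozenSet[Permission]] = {}

    def grants_for(self, user: str) -> FrozenSet[Permission]:
        if user not in self._cache:
            self._cache[user] = self._store.load_grants(user)
        return self._cache[user]

    def invalidate(self, user: str) -> None:
        # Has to be called whenever the user's grants change,
        # otherwise the cached copy goes stale.
        self._cache.pop(user, None)


store = GrantStore({"alice": frozenset({("edit", "page1")})})
grants = CachedGrants(store)

for _ in range(50):  # fifty checks during one page hit
    assert ("edit", "page1") in grants.grants_for("alice")

print(store.hits)  # 1
```

The usual price of a cache like this is invalidation: anything that changes a user's grants has to remember to call `invalidate`, which is exactly the kind of detail that is easier to bolt on once the uncached version is known to be correct.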

The weakness of the approach, however, is that an architecture designed without considering performance (since I was struggling so much with all the other issues, performance was probably further back in my mind even than usual) has turned out to have some performance bottlenecks that simply can't be removed without changing the architecture itself. There are situations where it's possible to know based on fixed information that there's no way a user could possibly have a particular permission, but that fixed information isn't available within the architecture, so the code will still chase down a number of dead ends before it arrives at the answer. And there's no way to make that information available with little caching tweaks and micro-optimizations. It needs a whole new structure.
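For a rough idea of what "fixed information" could look like, here is a sketch under a big assumption: that implication rules are declared as data (which action can imply which) instead of being arbitrary code. That is a guess at one possible structure, not a description of cmScribe or of its eventual replacement, and the names `IMPLIED_BY`, `possible_sources` and `could_have` are invented. It is in Python rather than C# to keep it short.

```python
from typing import Dict, Set

# Hypothetical declarative form: each action lists the actions that can imply it.
IMPLIED_BY: Dict[str, Set[str]] = {
    "view": {"edit"},
    "edit": {"admin"},
    "publish": {"admin"},
    "admin": set(),
}


def possible_sources(action: str) -> Set[str]:
    """Every action that could grant `action`, directly or through a chain of rules."""
    seen: Set[str] = set()
    stack = [action]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also protects against cyclic rules
        seen.add(current)
        stack.extend(IMPLIED_BY.get(current, set()))
    return seen


def could_have(user_actions: Set[str], action: str) -> bool:
    """Cheap pre-check: False means no chain of rules can possibly succeed."""
    return not possible_sources(action).isdisjoint(user_actions)
```

Here `could_have({"publish"}, "view")` is false without evaluating a single rule, because nothing the user holds appears anywhere in the chain that leads to "view". When rules are opaque code, there is no table like `IMPLIED_BY` to consult, so the only way to find out is to run them all.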

For now, I can continue to tweak the heck out of the existing architecture, and I'm confident it will perform adequately for quite some time. Which leads to the final advantage: even when you do reach the point where there's nothing to be done but throw out the whole thing and start over with performance at the very front of your mind, the experience gained from the first attempt will be invaluable in designing the system the right way. Every tweak I make to the existing code will be designed into the next version from day one.

Sounds a lot better than being stuck a year ago unable to write the thing at all because I couldn't get my head around how to make it fast, doesn't it? :)