22 Dec 2012
Software development happens at all kinds of scales, right from little applications, hacks and tools that help us do one thing really well, up to monolithic operating systems, virtual machines and distributed systems managed by large teams of engineers. Having worked in most parts of this spectrum, whenever I have to pick a new language or framework, I ask myself at what scale the project I intend to take on will operate in the near future.
By now, most people tuned in to the lean start-up bandwagon already know about the importance of hitting the market first, iterating quickly on a product and so on. If you’re building something just to test the waters, it really does not matter if your tech stack cannot scale beyond 100 concurrent users. It also does not matter if you don’t test-drive your application, or if you borrow a free theme or use Twitter Bootstrap (still beats enterprise software by miles). Scaling problems might eventually plague you, but from experience, that’s often a happier place to be in than trying to convince folks to fork out their cash for something they don’t need.
Whenever someone writes a rant about a technology that is giving them pains (hello MongoDB), I pause to ask myself whether the author is trying to use a knife to chop a tree. More often than not, rants expect something to be fast, simple, elegant, horizontally and vertically scalable, and to also help them paint rainbows in the Sahara desert. In reality, that doesn’t work, and a lot of people do know it. Yet, somewhere along the way, we get so invested in our tools that we expect them to dynamically grow with our application’s needs.
28 Oct 2012
I have been tackling some incredibly hard problems at work over the past few months. While I can’t quite yet talk about all of it, I wanted to share a meta observation that I gleaned along the way.
It’s hard to dispute that repeatedly failing at something tends to increase our chances of succeeding at it. This is the basic premise of Gladwell’s 10,000 hour rule of success as well: over time, we learn from our mistakes and get better.
There is, however, one more aspect of repeated failure that’s a lot more subtle, and worth mentioning here: repeated failure helps us understand what it takes to succeed, and in the process even redefine the notion of success. Success is a fairly arbitrary term, and in a complicated field of work, it’s usually not easy to define in an isolated, cut-and-dried fashion. We also tend to have multiple goals, often in conflict with each other. For instance, you might fail to solve a simple problem, but in the process partially solve a different one with far greater impact on your other goals. Or you might realize that, for various reasons, you did not in fact want the success you were chasing. Both have happened to me in the past few months.
18 Mar 2012
I just finished watching this talk by Jack Diederich (Python core developer) – somewhat flamebait-ishly named “Stop Writing Classes”. In his talk, Jack shows, through various examples, how introducing a class just for the sake of it actually makes the code harder to read and maintain.
While reflecting on the talk, I realized that the actual problem here is that people take OO too far. I don’t know how OO is taught elsewhere in the world, but from my own experience, a large part of this abuse can be attributed to the way OO is preached in various CS courses. A lot of emphasis is placed on grilling into young heads the virtues of OO and its place in the big enterprise world, without actually explaining why OO works when developing large applications. And do courses on OO ever highlight when it does not work?
In interviews, when asked to solve a tricky problem, I find it disturbing that people immediately start off with skeletons of classes. It has become some kind of protocol that one should be modeling everything in classes and objects to come off as a competent software developer. Jack shows a succinct solution to Conway’s game of life problem: a solution that highlights the programmer’s knowledge and elegant use of Python’s yield, but which will probably be rejected as “poor procedural code” in a lot of places.
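For flavor, here’s a sketch in the same spirit (this is not Jack’s exact code from the talk, just an illustration): the whole game state is a plain set of live cells, and a generator computes neighbors.

```python
from itertools import chain

def neighbors(cell):
    # yield the eight cells surrounding (x, y)
    x, y = cell
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx or dy:
                yield x + dx, y + dy

def advance(board):
    # board is just a set of live (x, y) cells -- no Board class in sight
    candidates = board | set(chain.from_iterable(map(neighbors, board)))
    newboard = set()
    for cell in candidates:
        count = sum(n in board for n in neighbors(cell))
        if count == 3 or (count == 2 and cell in board):
            newboard.add(cell)
    return newboard
```

A blinker, {(0, 0), (1, 0), (2, 0)}, dutifully oscillates between a row and a column under repeated advance calls.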
In many ways, OO has become a safety belt. By having a few classes, at least no one can fire you for writing procedural code, right? If poorly written procedural code is death by repetition, poorly written OO code is death by multiple levels of insane abstractions.
One technique that has worked for me when solving a problem from scratch is to first actually solve it using functions that perform highly specific operations. As the solution evolves, I start identifying fragments of code that can either be pulled into a separate class for better abstraction or moved from one class to another. This way, my code moves towards object orientation based on actual need, rather than because of a hypothetical high-level “modeling” of the problem. And I also don’t end up with a class like this:
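Something like this entirely made-up specimen:

```python
class GreetingFactory:
    """One method, no state worth keeping around: a function in a costume."""

    def __init__(self, greeting):
        self.greeting = greeting

    def greet(self, name):
        return "%s, %s!" % (self.greeting, name)

# GreetingFactory("Hello").greet("world") is spelled more honestly as:
def greet(greeting, name):
    return "%s, %s!" % (greeting, name)
```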
Simply because I (hopefully) would have felt stupid refactoring something, anything to that.
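To make that workflow concrete, here’s a tiny, made-up example (the names are illustrative, not from any real project): start with specific functions, and only introduce a class once several of them clearly share the same data.

```python
# Step 1: solve the problem with small, highly specific functions.
def parse_record(line):
    name, score = line.split(",")
    return name.strip(), int(score)

def top_scorer(lines):
    records = [parse_record(line) for line in lines]
    return max(records, key=lambda r: r[1])[0]

# Step 2 (only when the need shows up): several functions keep passing
# the same parsed records around, so a class finally earns its keep.
class Scoreboard:
    def __init__(self, lines):
        self.records = [parse_record(line) for line in lines]

    def top_scorer(self):
        return max(self.records, key=lambda r: r[1])[0]
```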
16 Mar 2012
A client wanted to prevent paid users of their product from sending messages with email addresses to the free users. The client felt that allowing such exchanges to happen would make the free users less inclined to upgrade to a paid account. Anyhow, we went ahead and implemented a robust email masking “feature” which blanked out any fragment of text that appeared to be an email address. We felt pretty smug about it because it could even catch smart users who pulled tricks like john at example dot com. Heck, we had automated tests to cover all those edge cases and hairy scenarios!
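The core of it was pattern matching. A much-simplified sketch of the idea (the real implementation had far more patterns and edge cases than this):

```python
import re

# Matches plain addresses ("john@example.com") as well as spelled-out
# ones ("john at example dot com"). Deliberately overzealous.
EMAIL_PATTERN = re.compile(
    r"\b[\w.+-]+\s*(?:@|\bat\b)\s*[\w-]+\s*(?:\.|\bdot\b)\s*[a-z]{2,}\b",
    re.IGNORECASE,
)

def mask_emails(text):
    # blank out anything that looks like an email address
    return EMAIL_PATTERN.sub("[removed]", text)
```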
The users defeated the system in the following ways:
You can contact me off here – jack sp 1967 at g mail (all in one address).
Take the first letter from each of the following words: please don’t count rabbits because they increase everyone’s expectations at great mayhem and internal lost dot clouds over mountains.
When we were implementing the email masking functionality, at one point, I was wondering whether we were going overboard in coming up with all kinds of ways to break the system. In fact, I’m sure I even thought, “Huh, these are probably non-technical folks, so we don’t have to go to really convoluted extents”. Boy, was I wrong.
My favorite exploit involved sending the email address split across NINE separate messages.
I’m sure even if we had spent two more weeks on the masking feature, we wouldn’t have been able to catch that one!
NOTE: the above messages do not, of course, contain the actual email addresses of the users.
2 Mar 2012
In Git, if you find yourself constantly typing git push origin branchname to push your local commits to the remote branch, here’s a tip: make your local branch automatically track your remote branch.
When creating a new remote branch, here’s how you can make your local branch track its remote branch:
# creates a new local branch
git branch foobranch
# creates a new remote branch, and makes local branch track that (-u)
git push -u origin foobranch
If you want to set-up tracking for an existing branch, you can do so by:
# set up tracking for an existing branch (Git 1.8.0 and newer)
git branch --set-upstream-to=origin/foobranch foobranch
# on older versions of Git (deprecated syntax):
# git branch --set-upstream foobranch origin/foobranch
If you are checking out an already existing remote branch, you can create the local branch with tracking set up in a single command:
git checkout --track origin/foobranch
# or, if you want a different local branch name
git checkout -b foobranch origin/foobranch
# or, to create the tracking branch without switching to it
git branch --track foobranch origin/foobranch
Setting up tracking offers two advantages. Firstly, it reduces the number of characters you have to type to push your changes to a remote branch. More importantly, it prevents you from accidentally typing a bare git push, which, under the old push.default = matching behavior (the default before Git 2.0), ends up pushing out the local commits in all your other matching branches as well (which you might not be ready to push yet!). With tracking in place and push.default set to upstream, a bare git push only touches the current branch.
Of course, you can also set up tracking by directly modifying your repository’s .git/config file.
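Under the hood, tracking is just configuration. For the foobranch examples above, the relevant entry in .git/config looks like this:

```ini
[branch "foobranch"]
	remote = origin
	merge = refs/heads/foobranch
```

Git writes and updates these entries for you when you use the commands above, so hand-editing is rarely necessary.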