Interesting use of Computer Vision Algorithms
I recently came across an application called Sikuli. After digging in to learn more, I was intrigued to find that it relies in part on computer vision (CV) algorithms to do its job.
CV has some amazing applications, but this is one of the more practical ones, and one that applies directly to my daily life. Essentially, Sikuli is an automation environment for anything graphical. You could use it to automatically configure your IP address in OS X, or to test a web application. CV comes in through the way you write your automation. Sikuli provides a high-level scripting environment where you tie actions to the user interface by taking screenshots of the elements you want to interact with. It's pretty ingenious, really: the CV algorithms compare your screenshots to what is currently on the screen and decide where to move the cursor accordingly.
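To make the idea concrete, here is a toy sketch of template matching, one common way to locate a small image inside a larger one. This is not Sikuli's actual code, and I'm not claiming it is the technique Sikuli uses; the function names `find_template` and `click_target` are made up for illustration, and the "images" are just small grids of grayscale numbers. It slides the screenshot of a button over the screen, scores every position by the sum of absolute pixel differences, and reports where the cursor would need to go.

```python
def find_template(screen, template):
    """Return ((row, col), score) for the best match of template in screen.

    Both arguments are 2D lists of grayscale pixel values (lists of rows).
    The score is the sum of absolute differences; 0 is a pixel-perfect match.
    """
    screen_h, screen_w = len(screen), len(screen[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best_score, best_pos = None, None
    # Try every position where the template fits entirely on the screen.
    for row in range(screen_h - tmpl_h + 1):
        for col in range(screen_w - tmpl_w + 1):
            score = sum(
                abs(screen[row + dr][col + dc] - template[dr][dc])
                for dr in range(tmpl_h)
                for dc in range(tmpl_w)
            )
            if best_score is None or score < best_score:
                best_score, best_pos = score, (row, col)
    return best_pos, best_score


def click_target(screen, template):
    """Return the (x, y) centre of the best match, i.e. where to move the cursor."""
    (row, col), _ = find_template(screen, template)
    return (col + len(template[0]) // 2, row + len(template) // 2)


# A tiny 5x4 "screen" with a 2x2 "button" embedded in it.
screen = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
button = [
    [9, 8],
    [7, 6],
]

print(find_template(screen, button))
print(click_target(screen, button))
```

A real tool has to cope with things this sketch ignores, such as colour, slight rendering differences, and speed on full-size screens, which is where a proper CV library earns its keep.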
I’ve used a variety of GUI automation frameworks in the past for both work and personal automation, but in terms of how little effort it asks of the user and how much power it gives back, Sikuli definitely takes the cake. I created a few small automations to take it for a spin firsthand, and to say I was impressed is an understatement. It’s a great project, it’s cross-platform, and it’s so good that I’m already trying to figure out all the ways I can use it at home and at work. If you need to automate a GUI for any reason, start with Sikuli.