Saturday, January 15, 2011

The next big thing in digital: Visual Search and Image Recognition

The likes of Google and Facebook are going beyond the hypertext form of indexing into a visual realm. Say hello to the hyperimage.

Image recognition already powers online shopping on Google's Boutiques.com website, and it will soon reach all Facebook users through a 'facial recognition' photo tagging feature that suggests your name to friends when a photo looks like you.

But the potential of visual search goes far beyond a merely enhanced shopping or stalking experience. It could be another step closer to the real Memex.

With the sea of data that traverses the internet every day, the web now faces the same challenges that traditional media, librarians and scholars have struggled with for years. How do we sort all that information out and present it so that it makes sense? And how does one find quality information that's relevant?

NYU Professor Clay Shirky highlighted this issue during his presentation at the Web 2.0 Expo in New York:
"All of those other media types have the same economics. Whether it's a printing press or a TV tower: 'It cost me a lot of money to get started, and so I had to filter for quality'.
So here's what the internet did: it introduced for the first time post-Gutenberg economics. The cost of producing anything by anyone has fallen through the floor, and as a result, there's no economic logic that says you have to filter for quality before you publish."


It's Not Information Overload. It's Filter Failure.

In his pioneering article 'As We May Think', published in the July 1945 issue of The Atlantic Monthly, Dr. Vannevar Bush vents his frustration with the way scientific research is organised and retrieved. The system design issues he observed are quite similar to what's happening on the web today:

"Our ineptitude in getting at the record is largely caused by the artificiality of systems of indexing. When data of any sort are placed in storage, they are filed alphabetically or numerically, and information is found (when it is) by tracing it down from subclass to subclass. It can be in only one place, unless duplicates are used; one has to have rules as to which path will locate it, and the rules are cumbersome. Having found one item, moreover, one has to emerge from the system and re-enter on a new path.

The human mind does not work that way. It operates by association. With one item in its grasp, it snaps instantly to the next that is suggested by the association of thoughts, in accordance with some intricate web of trails carried by the cells of the brain. It has other characteristics, of course; trails that are not frequently followed are prone to fade, items are not fully permanent, memory is transitory. Yet the speed of action, the intricacy of trails, the detail of mental pictures, is awe-inspiring beyond all else in nature.

Man cannot hope fully to duplicate this mental process artificially, but he certainly ought to be able to learn from it. In minor ways he may even improve, for his records have relative permanency. The first idea, however, to be drawn from the analogy concerns selection. Selection by association, rather than indexing, may yet be mechanized. One cannot hope thus to equal the speed and flexibility with which the mind follows an associative trail, but it should be possible to beat the mind decisively in regard to the permanence and clarity of the items resurrected from storage.

Consider a future device for individual use, which is a sort of mechanized private file and library. It needs a name, and, to coin one at random, "memex" will do. A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory."

Perhaps we could all be using visual 'filters' to trawl the web in the not too distant future. As image recognition matures, it could very well become an alternative to, or a welcome addition to, keyword search. So here's hoping for Bush's "selection by association".