Saturday 2 January 2010

Durer’s Drunk Bunny Must Feel Inverted Too (Session 8)

I tried Boolean search terms in Bing. With 'all of these terms' selected, the query "arts & crafts movement" AND "architecture" AND "best" produced great results, in fact even better than when I did not specify 'all of these terms' in the search box. When I replaced AND with OR in the exact same query, it gave me results I had no interest in (most were about cooking). No coherent answer came back when I asked "Who is the best architect in the arts and crafts movement?" That is not surprising, as the answer is widely debated.
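
To see why swapping AND for OR changed my results so much, here is a minimal Python sketch. The three page descriptions are invented for illustration, and the matching is plain substring matching, which is only an assumption and not how Bing actually works.

```python
# Hypothetical pages, invented for illustration only.
pages = {
    "page_a": "the best architecture of the arts & crafts movement",
    "page_b": "best recipes for weeknight cooking",
    "page_c": "modern architecture in glass and steel",
}

# The three quoted terms from my query.
terms = ["arts & crafts movement", "architecture", "best"]


def match_all(text, terms):
    """AND: every term must appear in the text."""
    return all(term in text for term in terms)


def match_any(text, terms):
    """OR: at least one term must appear in the text."""
    return any(term in text for term in terms)


and_hits = sorted(name for name, text in pages.items() if match_all(text, terms))
or_hits = sorted(name for name, text in pages.items() if match_any(text, terms))

print("AND:", and_hits)
print("OR:", or_hits)
```

In this toy setup AND keeps only the page that has all three terms, while OR also lets in the cooking page, because that page contains "best" on its own.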

To find a photo of the Durer statue of the Drunken Bunny in Nuremberg, I created two inverted files:
DOC 1: Durer and the Nuremberg rabbit statue
DOC 2: Gothic sculpture photos and archived in a ‘collection’

Inverted files allow for full-text searches of the terms in documents. They are the central data structure that lets typical search engines operate. A forward index lists the words in each document, so searching it would mean going through every document’s word list, which would take significant power and time. An inverted index instead lists, for each word, the documents that contain it, like, for example, ‘Durer’. So instead of searching a list of words for each document that exists on the WWW, Google can look up ‘Durer’ once and get back the documents that contain it. (Lecture materials) Inverted files don’t work as well for images, because an image has no words to index unless its creator or a user tags it. This is why I find Flickr better than Google for photos: users throughout the WWW tag photos with descriptive terms on Flickr. On Web 1.0 sites, photos cannot be tagged to make them more searchable.
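
Here is a small Python sketch of an inverted file built from my two documents above. The function names (tokenize, build_inverted_index, lookup) are my own, and the lowercase word splitting is a simplifying assumption; a real search engine does far more than this.

```python
import re
from collections import defaultdict

# The two documents from this post.
docs = {
    "DOC 1": "Durer and the Nuremberg rabbit statue",
    "DOC 2": "Gothic sculpture photos and archived in a ‘collection’",
}


def tokenize(text):
    """Lowercase the text and split it into plain words (assumed rule)."""
    return re.findall(r"[a-z]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


index = build_inverted_index(docs)


def lookup(word):
    """Return the documents containing the word, without scanning any text."""
    return sorted(index.get(word.lower(), set()))


print(lookup("Durer"))  # ['DOC 1']
print(lookup("and"))    # ['DOC 1', 'DOC 2']
print(lookup("bunny"))  # []
```

The last lookup comes back empty because neither document uses the word "bunny", which is the same problem an untagged photo has: if the word was never attached to it, the index cannot find it.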

