Grrrr – I hate not having the time to write the post I want to write on this, but here goes…

Tags are labels attached to things. This procedure is absolutely orthogonal to whether professionals or amateurs are doing the tagging.

Professionals often think tags are coextensive with folksonomies because their minds have been poisoned by the false dream of ontology, but also because tagging looks too easy (in the same way the Web looked too easy to theoreticians of hypertext). Not only are tags amenable to being used as controlled vocabularies, it’s happening today: groups are agreeing about how to tag things so as to produce streams of, e.g., business research.

More importantly, tags are not the same as flat name spaces. The LiveJournal interests list, the first large-scale folksonomy I became aware of (though before the label existed), is flat. The interest list has one meaning: Person X has Interest Y, included as part of List L. All of L is attached to X, and all Y’s are equivalent in L.

Tags don’t work that way at all. Tags are multi-dimensional, and only look flat, in the way Venn diagrams look flat. When I tag something ‘socialsoftware drupal’, I enable searches of the form “socialsoftware & drupal”, “socialsoftware &! (and not) drupal”, “drupal &! socialsoftware”, and so on.
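Here is a minimal sketch of those searches as plain set operations, in Python. The item names and the `tagged` helper are made up for illustration; no real tagging service is assumed to work this way.

```python
# Hypothetical data: each item maps to the set of tags attached to it.
items = {
    "post-1": {"socialsoftware", "drupal"},
    "post-2": {"socialsoftware"},
    "post-3": {"drupal"},
    "post-4": {"folksonomy"},
}


def tagged(tag):
    """Return the set of item names carrying the given tag."""
    return {name for name, tags in items.items() if tag in tags}


social = tagged("socialsoftware")
drupal = tagged("drupal")

both = social & drupal               # socialsoftware & drupal
social_not_drupal = social - drupal  # socialsoftware &! drupal
drupal_not_social = drupal - social  # drupal &! socialsoftware
```

Nothing in the data says how the two tags relate; the overlap only appears when you ask for it.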

Hierarchy is a degenerate case of tags. If hierarchy floats your boat, by all means tag hierarchically. If I tag so that A &! B returns no results, and a search on A alone returns the same items as A & B, then A is a subset of B at the moment.

This last point is key — the number one fucked up thing about ontology (in its AI-flavored form – don’t get me started, the suckiness of ontology is going to be my ETech talk this year…), but, as I say, the number one thing, out of a rich list of such things, is the need to declare today what contains what as a prediction about the future. Let’s say I have a bunch of books on art and creativity, and no other books on creativity. Books about creativity are, for the moment, a subset of art books, which are a subset of all books.

Then I get a book about creativity in engineering. Ruh roh. I either break my ontology, or I have to separate the books on creativity, because when I did the earlier nesting, I didn’t know there would be books on creativity in engineering. A system that requires you to predict the future up front is guaranteed to get worse over time.
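The bookshelf example can be sketched with tags instead, assuming the same "A &! B returns nothing" test for containment. The book names and the `is_subset_for_now` helper are hypothetical, and this is only one way to express it.

```python
def tagged(items, tag):
    """Return the set of item names carrying the given tag."""
    return {name for name, tags in items.items() if tag in tags}


def is_subset_for_now(items, a, b):
    """True when every item tagged `a` is also tagged `b` (A &! B is empty)."""
    return not (tagged(items, a) - tagged(items, b))


# Hypothetical shelf: every creativity book is also an art book.
books = {
    "book-1": {"art", "creativity"},
    "book-2": {"art", "creativity"},
    "book-3": {"art"},
}
before = is_subset_for_now(books, "creativity", "art")

# The book on creativity in engineering arrives. Nothing gets re-filed;
# it simply carries its own tags.
books["book-4"] = {"engineering", "creativity"}
after = is_subset_for_now(books, "creativity", "art")
```

Here `before` is true and `after` is false, and the only change to the collection was adding one entry. The containment was an observation about the tags at that moment, not a declaration anyone had to revise.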

And the reason ontology has been even a moderately good idea for the last few hundred years is that the physical fact of books forces you to predict the future. You have to put a book somewhere when you get it, and as you get more books, you can neither reshelve constantly, nor buy enough copies of any given book to file it on all dimensions you might want to search for it on later.

Ontology is a good way to organize objects, in other words, but it is a terrible way to organize ideas, and in the period between the invention of the printing press and the invention of the symlink, we were forced to optimize for the storage and retrieval of objects, not ideas. Now, though, we can scrap the stupid hack of modeling our worldview on the dictates of shelf space. One day the concept of creativity can be a subset of a larger category, and the next day it can become a slice that cuts across several categories. In hierarchy land, this is a crisis; in tag land, it’s an operation so simple it hardly merits comment.

The move here is from graph theory (arrange everything in a tree graph, so that graph traversal becomes the organizing principle) to set theory (sets have members, and the overlap or non-overlap of those memberships becomes the organizing principle). This is analogous to the change in how we handle digital data. The file system started out as a tree graph. Then we added symlinks (aliases, shortcuts), which said “You can organize things differently than you store them, and you can provide more than one mode of access.”

The URI goes all the way in that direction. The URI says “Not only does it not matter where something is stored, it doesn’t matter whether it’s stored. A URI that generates the results on the fly is as valid as one that points to a disk.” And once something is no longer dependent on tree graph traversals to find it, you can dispense with hierarchical assumptions about categorizing it too.
