In What Do Tags Mean, Tim Bray says “There is no cheap metadata” (quoting himself from the earlier On Search). He’s right, of course, both in the mathematical sense (metadata, like all entropy-fighting moves, requires energy) and in the human sense: in On Search, he talks about the difficulties of getting users to enter metadata.

And yet I keep having this feeling that folksonomy, and particularly amateur tagging, is profound in a way that the ‘no cheap metadata’ dictum doesn’t cover.

Imagine a world where there was really no cheap metadata. In that world, let’s say you head on down to the local Winn-Dixie to do your weekly grocery accrual. In that world, once you pilot your cart abreast of the checkout clerk, the bargaining begins.

You tell her what you think a 28 oz. bottle of Heinz ketchup should cost. She tells you there’s a premium for the squeezable bottle, and if you’re penny-pinching, you should get the Del Monte. You counter by saying you could shop elsewhere. And so on, until you arrive at a price for the ketchup. Next out of your cart, the Mrs. Paul’s fish sticks…

Meanwhile, back in the real world, you don’t have to do anything of the kind. When you get to the store, you find that, mirabile dictu, the metadata you need is already there, attached to the shelves in advance of your arrival!

Consider what goes into pricing a bottle of Heinz: the profit margin of the tomato grower, the price of a barrel of oil, local commercial rents, average disposable incomes in your area, and the cost of providing soap in the employee bathrooms. Yet all those inputs have already been calculated, and the resulting price then listed on handy little stickers right there on the shelves. And you didn’t have to do any work to produce that metadata.

Except, of course, you did. Every time you pick between the Heinz and the Del Monte, it’s like clicking a link, the simplest possible informative transaction. Your choice says “The Heinz, at $2.25 per 28 oz., is a better buy than the Del Monte at $1.89.” This is so simple it doesn’t seem like you’re producing metadata at all; you’re just getting ketchup for your fish sticks. But in aggregate, those choices tell Del Monte and Heinz how to capture the business of the price-sensitive and premium-tropic, respectively.

That looks like cheap metadata to me. And the secret is that this metadata is created through aggregate interaction. We know how much more Heinz ketchup should cost than Del Monte because Heinz Inc. has watched what customers do when it raises or lowers its prices, and those millions of tiny, self-interested transactions have created the metadata that you take for granted. And when you buy ketchup, you add your little bit of preference data to the mix.

So this is my Get Out of Jail Free card for Tim’s conundrum. Cheap metadata is metadata made by someone else, or rather by many someone elses. Or, put another way, the most important ingredient in folksonomy is people.

I think cheap metadata has (at least) these characteristics:

1. It’s made by someone else
2. Its creation requires very few learned rules
3. It’s produced out of self-interest (Corollary: it is guilt-free)
4. Its value grows with aggregation
5. It does not break when there is incomplete or degenerate data

And this is what’s special about tagging. Lots of people tag links, so I get lots of other people’s metadata for free. There is no long list of rules for tagging things ‘well,’ so there are few deflecting effects from transaction cost. People tag things for themselves, so there are no motivation issues. The more tags the better, because with more tags, I can better see both communal judgment and the full range of opinion. And no one cares, for example, that when I tag things ‘loc’ I mean the Library of Congress; the system doesn’t break with tags that are opaque to other users.
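To make the aggregation point concrete, here is a minimal sketch in Python of how a pile of self-interested tags might be rolled up into something useful for a stranger. Everything in it is an assumption made up for illustration: the users, the URL, the tags, the function names `aggregate_tags` and `consensus_tags`, and the `min_users` threshold. It is not a description of how any real tagging service works.

```python
from collections import Counter, defaultdict

# Hypothetical bookmarks as (user, url, tags). All values are invented.
URL = "http://example.com/catalog"
bookmarks = [
    ("alice", URL, ["library", "books"]),
    ("bob", URL, ["library", "reference"]),
    ("carol", URL, ["loc"]),            # private shorthand, opaque to others
    ("dave", URL, ["library", "books"]),
    ("erin", URL, []),                  # incomplete data: bookmarked, never tagged
]

def aggregate_tags(bookmarks):
    """Count how many users applied each tag to each URL."""
    counts = defaultdict(Counter)
    for _user, url, tags in bookmarks:
        # set() so a user who repeats a tag is still counted once
        counts[url].update(set(tags))
    return counts

def consensus_tags(counts, url, min_users=2):
    """Return tags applied by at least min_users people, most common first."""
    per_url = counts.get(url, Counter())
    return [tag for tag, n in per_url.most_common() if n >= min_users]

counts = aggregate_tags(bookmarks)
print(consensus_tags(counts, URL))  # ['library', 'books']
```

Nobody in this toy data set followed any rules or tagged for anyone else's benefit, and the opaque ‘loc’ tag and the untagged bookmark do no harm: they simply fall below the threshold. The communal judgment shows up only in the counts.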

This is what’s missing in the “Users don’t tag their own blog posts!” hand-wringing: they’re not supposed to. Tagging is done by other people. As Cory has pointed out, people are not good at producing metadata about their own stuff, for a variety of reasons.

But other people will tag your posts if they need to group them, find them later, or classify them for any other reason. And out of this welter of tiny transactions comes something useful for someone else. And because the added value from the aggregate tags is simply the product of self-interest + ease of use + processor time, the resulting metadata is cheap. It’s not free, of course, but it is cheap.

