11 July 2008 / erikduval

Snowflake number

A theme I’ve been mulling over lately is the question of how unique we really are… More specifically, what set of characteristics makes us really unique?

A simple example: my name is not unique, or at least I don’t think so, though the first five pages of a Google search on my name seem to bring up nobody else. Anyway, in general, the first and last name of a person are not unique.

A more advanced example: consider all the playlists of songs – the iTunes Music Store lists 1,421,247 public iMixes at this moment, and almost 7 million votes on those playlists. I suspect that there are no two identical iMixes. (If you know how we could check this, then please do let me know in the comments or by email!) I suspect that my playlists (public as iMixes or private) are unique to me.

Let’s make this a bit more complicated. Take my listening history: if you only consider the last song I listened to, then that is probably not unique. There must be other people for whom this is the last song they listened to. Take the last two songs: there may still be other people for whom these are the last two songs they listened to. But there must be a number n that makes me unique: there would be no other person who listened to the same last n songs.

I’ve started to refer to such a number as the snowflake number: it is the lowest number of items to consider that makes me unique, just like every snowflake in a snowstorm is unique – it is where the snowflake effect starts to play.
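To make the definition concrete, here is a minimal sketch of how the snowflake number could be computed for a small data set. It assumes each person's history is available as a list with the oldest item first; the function name `snowflake_number`, the names of the people, and the song labels are all made up for illustration.

```python
def snowflake_number(histories, person):
    """Return the smallest n such that nobody else shares this person's
    last n items (in the same order), or None if no such n exists."""
    mine = histories[person]
    for n in range(1, len(mine) + 1):
        my_tail = mine[-n:]
        shared = any(
            len(other) >= n and other[-n:] == my_tail
            for name, other in histories.items()
            if name != person
        )
        if not shared:
            return n
    # Someone else matches the entire history: not unique at any n.
    return None


# Hypothetical listening histories, oldest song first.
histories = {
    "erik": ["song_a", "song_b", "song_c"],
    "ann": ["song_x", "song_b", "song_c"],
    "bob": ["song_a", "song_d", "song_c"],
}

for name in histories:
    print(name, snowflake_number(histories, name))
```

In this toy data, everybody listened to the same last song, so nobody is unique at n = 1. A real data set would need something smarter than comparing every pair of histories, but the definition itself stays the same.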

Once you start to think about things this way, a whole set of variations comes to mind:

  • what is the snowflake number for grocery shopping, i.e. what number of items in my shopping cart do you need to take into account to identify it as uniquely my shopping cart?
  • what is the snowflake number for travel, i.e. what number of my last trips do you need to consider before you find nobody else with the same sequence of last trips? (Does anyone else have BRU-VIE-BRU as the last trip? Probably. And BRU-LHR-MIA-LHR-BRU before? Probably down to a small number of people now? And BRU-LGW-PSA-LGW-LHR-BRU before Miami? Guess I’m unique now?)
  • what is the snowflake number for books? Anyone else who just finished “Exit Ghost” – seems like I’m not alone. Anyone else read “Everything is miscellaneous” before that one? I may be unique having that as the last two books I read? In that case, my snowflake number for books is 2. (Do let me know if you share this sequence for the last two books you read!)
  • etc.

As a smart reader, you will have noticed some nice characteristics of snowflake numbers:

  • you can either consider the ordered or unordered list of items: if nobody else read those two books in that order, then my ordered snowflake number is 2, but if some have read the same books in another order, then my unordered snowflake number is 3 or higher;
  • you can either consider all the items or only the most recent ones, working backwards from the last: if you consider all the items, then the question becomes how many of the books I’ve ever read you need to consider before you can no longer find anyone else who has read the same books. I suspect that may be more than 10, but doubt that it is more than 30…
  • etc.
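The variations in this list can be sketched in a few lines of code. This is only an illustration under assumptions: reading histories are lists with the oldest book first, the function names `last_n_snowflake` and `all_items_snowflake` are invented here, and apart from the two titles mentioned above, the readers and books are hypothetical. The "all items" version tries every combination of books, so it is only practical for small collections.

```python
from itertools import combinations


def last_n_snowflake(histories, person, ordered=True):
    """Smallest n such that nobody else shares this person's last n items.
    With ordered=False, the order within those n items is ignored."""
    key = tuple if ordered else (lambda items: tuple(sorted(items)))
    mine = histories[person]
    for n in range(1, len(mine) + 1):
        my_tail = key(mine[-n:])
        shared = any(
            len(other) >= n and key(other[-n:]) == my_tail
            for name, other in histories.items()
            if name != person
        )
        if not shared:
            return n
    return None


def all_items_snowflake(histories, person):
    """Smallest k such that some k of this person's items, taken from the
    whole history, were not all read by any single other person."""
    mine = sorted(set(histories[person]))
    others = [set(h) for name, h in histories.items() if name != person]
    for k in range(1, len(mine) + 1):
        for combo in combinations(mine, k):
            if not any(set(combo) <= other for other in others):
                return k
    return None


# Hypothetical reading histories, oldest book first.
books = {
    "erik": ["Hypothetical Book A", "Everything is Miscellaneous", "Exit Ghost"],
    "reader2": ["Hypothetical Book B", "Exit Ghost", "Everything is Miscellaneous"],
    "reader3": ["Hypothetical Book C", "Exit Ghost"],
}

print(last_n_snowflake(books, "erik", ordered=True))
print(last_n_snowflake(books, "erik", ordered=False))
print(all_items_snowflake(books, "erik"))
```

In this made-up data, another reader finished the same two books in the opposite order, so the ordered number is 2 while the unordered number is 3. The all-items number is 1, because only one person read "Hypothetical Book A".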

The more eccentric your taste or habits are, the lower your snowflake number will be…

This snowflake number is something I’d really like to explore more. Do you have any other suggestions for contexts in which it would make sense? Do you know of any research on this topic? Do you know how we could compute this for interesting data sets? Do let me know…

8 Comments

  1. erikduval / Jul 13 2008 3:31 pm

    Martin Wolpers (http://www.cs.kuleuven.be/~martin/) sent me some good comments by email – which he doesn’t mind me sharing here with you:

    I was looking at your snowflake number post when I remembered something where they try to avoid (!) such a thing. k-Anonymity is such an approach where they try to have duplicates in any sequence of data so that the data sequence cannot point to one single person. Therefore, I think that data security and privacy is a good field to look for answers to the questions you pose in the blog. You might also want to talk to the people of TAS3: http://www.tas3.eu/

    Another area that might be interesting is searching history. There is much work ongoing there to find duplicates in sequences as well…

  2. Daan / Jul 13 2008 5:42 pm

    A subject on this in social networking could perhaps be the last friends you made (on Facebook, for example). There will be a snowflake number of last friends that you made, beyond which you can’t find anyone else who made the same last friends.
    I think they already use things like recently made friends for their ‘friends you might know’ tool and suchlike.

  3. Tom / Jul 23 2008 12:33 pm

    Does this number have actual use, or is it mere play?

  4. Thomas / Sep 11 2008 10:07 am

    I read an article about a professor at an American university who researches the social interconnections on Facebook. I could imagine that it would be possible to use that data to find out more about the “snowflake number”. Unfortunately I can’t give more details. I already browsed through my notes but I can’t find the article or his name any more.

  5. Thomas / Sep 11 2008 10:50 am

    I found the article (I scanned at least the first page): it is Nicholas Christakis at Harvard University. I think it is not exactly what you are looking for, but maybe the data is helpful if interpreted in a different way.

  6. erikduval / Sep 13 2008 11:27 am

    MANY thanks for all the pointers to related work. This is something I will work on quite a bit more, so any additional references, comments, pointers, etc. are much appreciated!

