Quantitative analysis of user-generated content on the Web
My Very Smart Friend and PhD student Xavier Ochoa is at the WWW2008 conference in Beijing, China. Today, at the Web Science workshop WebEvolve, Xavier presented our paper, “Quantitative analysis of user-generated content on the Web”.
The abstract says:
User-generated content (UGC) is becoming the most popular and valuable information available on the WWW. However, little serious research has been conducted to measure the properties of its production process. This paper presents an in-depth quantitative analysis of 9 popular websites that are based on different UGC types. The Information Production Process is used as a framework for the analysis. The findings provide for first time strong scientific evidence for previously anecdotic knowledge: UGC production follows “long-tail” distributions and it is marked with a strong “participation inequality”. Also, the analysis arrived to unexpected findings: not all the UGC types follow the inverse power-law distribution, and large content collections could be dominated by the presence of ultraproductive users. The analysis results also have implications for the administration of UGC-based websites.
Seems like there is quite a bit of interest now in “fat tails”… I am VERY excited about this work and how we can actually measure things rather than just fantasize about them: for instance, it seems like the often-cited “reusability paradox” doesn’t exist – but that is a topic for a future post 🙂
I am also quite happy to see the Web Research Initiative make some progress with a more fundamental understanding of how the web changes everything.
And, most of all, it is a privilege to work with Bright Minds like Xavier on this topic…