Wikipedia:WikiXRay


This is an old revision of this page, as edited by GlimmerPhoenix (talk | contribs) at 13:29, 21 January 2007. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


Main Goal

The main goal of this project is to develop a robust and extensible software tool for in-depth quantitative analysis of the whole Wikipedia project. The project is currently developed by José Felipe Ortega (User:GlimmerPhoenix) at the Libresoft Group, Universidad Rey Juan Carlos (Rey Juan Carlos University).

Currently, this tool includes a set of Python and GNU R scripts to obtain statistics, graphics and quantitative results for any Wikipedia language version. Current functionality includes:

  • Downloading the 7zip database dump of the target language version.
  • Decompressing the database dump and building the database on local storage media.
  • Creating additional database tables with useful statistics and quantitative information.
  • Generating graphics and data files with quantitative results, adequately organized in a per-language directory substructure.

Most of these capabilities still require manual configuration in a centralized config file, although most of them then run automatically.
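As a rough illustration of how a single configuration object could drive the steps listed above, the following Python sketch derives a dump location and a per-language directory substructure from one place. It is not taken from the WikiXRay source: the `Config` class, its field names, the `dump`/`data`/`graphics` directory names and the dump URL pattern are all assumptions made for this example.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict


@dataclass(frozen=True)
class Config:
    """Hypothetical central configuration (names are illustrative only)."""
    language: str   # Wikipedia language code, e.g. "es"
    base_dir: Path  # root directory for all downloaded and generated files
    # Assumed URL pattern for the 7zip dump; the real location may differ.
    dump_url_template: str = (
        "https://dumps.wikimedia.org/{lang}wiki/latest/"
        "{lang}wiki-latest-pages-meta-history.xml.7z"
    )

    @property
    def dump_url(self) -> str:
        """Fill the language code into the assumed URL pattern."""
        return self.dump_url_template.format(lang=self.language)


def language_dirs(cfg: Config) -> Dict[str, Path]:
    """Map each kind of output to its directory under <base_dir>/<language>/."""
    root = cfg.base_dir / cfg.language
    return {name: root / name for name in ("dump", "data", "graphics")}


def prepare(cfg: Config) -> Dict[str, Path]:
    """Create the per-language directories if missing and return them."""
    dirs = language_dirs(cfg)
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    return dirs
```

With this layout, calling `prepare(Config(language="es", base_dir=Path("wikixray_out")))` would create `wikixray_out/es/dump`, `wikixray_out/es/data` and `wikixray_out/es/graphics`, so that the download, database and graphics steps each have a predictable place to read from and write to.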

The source code is publicly available under the GNU GPL license and can be found on the official WikiXRay project page at BerliOS. The home page is still under development, so please allow a few moments for the project summary page to load.

Please note that this software is still at a very early development stage (pre-alpha level). Useful contributions are of course welcome (please contact the project admin first).

Below, we summarize some of the most relevant results obtained so far. These results come from a quantitative analysis focused on the top 10 Wikipedias, ranked by total number of articles.