User:Kokiri/WQA

From Wikipedia, the free encyclopedia

There are quantitative data about Wikipedia (statistics), but this article is an attempt to test the quality of Wikipedia. I sampled 50 random pages using the Random Page function on 23 November 2003. I did not count the one disambiguation page that came up. Please note that the sample (50) is relatively small, but I do hope it helps to highlight some characteristics of Wikipedia. I don't think this test is very good, but it's a start.

Update

I have now added a further column to the data, showing how many articles link to each of the articles tested. The data in this column are as of 22 February 2004. Kokiri

Key

| Entry/Criteria | Comments |
| --- | --- |
| Article | Link to the article. |
| Type | The kind of entry found. The following categories were used: stub for stubs; bot for bot-generated entries on US places; gaps for entries that are essentially fragments; article for real articles; list for lists of entries; disam for disambiguation pages [not counted]. |
| Length | The length of the article. This criterion, whilst applied in a consistent manner, is rather arbitrary. The following categories were used: stub for stubs; short for short articles (up to 1 screen); medium (1 to 2 screens); long (over 2 screens). For further research I suggest a less arbitrary system, such as word counts. |
| Media | The presence of any media files (e.g. pictures), with their number if any. |
| Britannica 2002 | A comparison with the Britannica 2002 DVD edition. If there is an article of the same title in Britannica, this is indicated, along with the length of the Britannica article. The same length criterion was used as for the Wikipedia articles. Entries in brackets () give the name of a Britannica article that essentially covers the same ground. |
| Comments | Any comments on the entry. |
| Links | The number of articles that link to the one in question (What links here). |
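The Length entry suggests word counts as a less arbitrary measure. A minimal sketch of how that could look is given here. The function names and both thresholds (300 and 800 words) are hypothetical and were not used in this assessment; they would need calibrating against the screen-based categories.

```python
def count_words(text):
    """Count whitespace-separated words in the plain text of an entry."""
    return len(text.split())


def length_category(word_count, short_max=300, medium_max=800):
    """Map a word count to a length category.

    short_max and medium_max are hypothetical thresholds, not values
    taken from the assessment.
    """
    if word_count < 0:
        raise ValueError("word_count must not be negative")
    if word_count == 0:
        return "empty"
    if word_count <= short_max:
        return "short"
    if word_count <= medium_max:
        return "medium"
    return "long"


if __name__ == "__main__":
    sample_text = "Tsu is a city in Japan."
    print(length_category(count_words(sample_text)))
```

Whether an entry is a stub would still be a judgement about its content rather than its length alone, so the sketch leaves that category out.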

Results

| Category | Number | Details |
| --- | --- | --- |
| Articles | 19 | 8 long; 6 medium; 5 short |
| Bot Entries | 12 | |
| Gaps | 4 | |
| Lists | 3 | |
| Stubs | 12 | 9 of which not marked as stubs |
  • Please note the small size of the sample (50) and the methodology used.
  • Out of the 50 entries tested, a mere 19 (38%) were articles. However, this is more than the number of entries that do not even look like a proper article (i.e. stubs and fragmented gaps; 32%).
  • Bot entries make up a large share of the entries (24%), which might be considered disturbing. At first glance they look like perfectly finished articles, but in fact no further work has been done on these entries. They are mere collections of statistics. A fine detail is that the bot entries do not even mention that the entry is about a place in the USA (they mention only the US state).
  • At 24% of the entries, stubs make up a significant share (almost a quarter) of the total. Out of the 12 stubs a staggering 9 (75% of stubs) were not marked as stubs (i.e. did not have a stub alert attached). This means that, unless the user changed her or his preferences, all links to the stubs look as if there were a proper article behind them. Not being marked as stubs, they do not appear on the list of stubs. Also, an outsider who lands on one may assume that the rest of Wikipedia is no good either and disregard the many good articles there are on Wikipedia.
  • Comparing the stubs with Britannica is interesting, as it helps to identify the quality of the stubs. A stub which has an equivalent entry in Britannica is bound to develop; one without may be on an abstruse topic, a geek subject or simply on something that does not belong in an encyclopedia. Interestingly, three of the stubs (25% of stubs) do have entries in Britannica: the stubs on two Japanese cities and a glacier. Most other stubs have equivalent entries in Britannica, but as part of a larger article. This suggests that there might not be enough substance to the entry to justify an individual article. Only four of the 12 stubs (33% of stubs) have no equivalent entry in Britannica, one of which is a simple dictionary definition (Wikipedia is not a dictionary).
  • The assessment may be interpreted as supporting Kill the Stubs. This is so because most of the stubs (75% of stubs) do not seem to have the potential to grow, since there might simply not be enough to the entry. Many stubs cannot justify their existence without a supporting article, into which they should probably be incorporated.
  • The story of the short, fragmented gaps is similar to that of stubs. About half of them have no equivalent in Britannica (2 entries), and an equal number appear only as part of a larger article (2 entries). This again suggests that there may be no justification for an entry on its own. One short article has an equivalent short entry in Britannica.
  • None of the bot articles in the test had an entry in Britannica. This suggests that these places are not of great significance other than to their inhabitants. The existence of these bot articles contributes to the US bias in Wikipedia.
  • It is striking that there were only three media items in the sample (6% of entries had media): two pictures (4%) and one map (2%).
  • Lists do not appear in Britannica to a great extent, and where they do, the equivalents in Wikipedia tend to be more complete (yet generally no less US biased). (Of the three lists in the sample one was empty, one had a significantly shorter entry equivalent in Britannica and one had an equivalent article in Britannica.)
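The shares of the sample quoted above can be re-derived from the tallies in the Results table. The sketch below does that and, to illustrate the caveat about the small sample, adds a rough 95% margin of error. The margin uses the normal approximation for a proportion, which is an assumption added here and was not part of the original assessment; the function names are likewise illustrative.

```python
import math

# Tallies copied from the Results table (sample of 50 entries).
counts = {"articles": 19, "bot entries": 12, "gaps": 4, "lists": 3, "stubs": 12}
total = sum(counts.values())


def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to a whole percent."""
    return round(100 * part / whole)


def margin_95(part, whole):
    """Rough 95% margin of error in percentage points.

    Assumption: normal approximation for a proportion (not in the original).
    """
    p = part / whole
    return 100 * 1.96 * math.sqrt(p * (1 - p) / whole)


if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: {n}/{total} = {share(n, total)}% "
              f"(+/- {margin_95(n, total):.0f} points)")
    # Entries that do not look like a proper article: stubs plus gaps.
    print(f"stubs and gaps: {share(counts['stubs'] + counts['gaps'], total)}%")
    # 9 of the 12 stubs were not marked as stubs.
    print(f"unmarked stubs: {share(9, counts['stubs'])}% of stubs")
```

With only 50 entries the margins are wide, so the percentages are best read as rough indications rather than precise estimates.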

The Data

Here is the data that was collected for the assessment.

| Article | Type | Length | Media | Britannica 2002 | Comment | Links |
| --- | --- | --- | --- | --- | --- | --- |
| Thetford Township, Michigan | bot | medium | none | no | - | 1 |
| Tsu | stub | stub | none | yes (short) | stub not marked | 6 |
| Downtown Houston | article | long | none | (Houston) | partly list of (empty) links | 12 |
| Rhythm | article | medium | none | yes (long) | - | 100+ |
| Firehose | stub | stub | none | no | stub not marked | 5 |
| Direct access storage device | stub | stub | none | (computer science) | stub not marked | 2 |
| Callaway Township, Minnesota | bot | medium | none | no | - | 1 |
| Ante-Nicene Fathers | article | long | none | (patristic period) | partly list of (empty) links | 10 |
| Thermoplasticity | stub | stub | none | (industrial polymers, chemistry of) | stub not marked | 3 |
| Corporal | stub | stub | none | (private) | stub not marked | 17 |
| Long Creek, Oregon | bot | medium | none | no | - | 2 |
| MXF | gaps | short | none | no | - | 4 |
| Cartago | disam | | | | | 2 |
| Millis, Massachusetts | bot | medium | none | no | - | 3 |
| Bridgeport Charter Township, Michigan | bot | medium | none | no | - | 1 |
| List of criminal justice notables | list | short | none | (criminal law) | - | 4 |
| Spurius Cassius Vecellinus | article | short | none | yes (short) | - | 4 |
| Diarmuid Ua Duibhne | stub | stub | none | no | stub not marked | 4 |
| Mike Watt | article | medium | none | no | - | 4 |
| St. Louis Post-Dispatch | article | medium | none | yes (medium) | - | 11 |
| Herbsaint | article | short | none | no | - | 3 |
| Chobits | article | medium | none | no | - | 12 |
| Emperor of Japan | article | long | 1 picture | (Japan) | - | 300+ |
| Word sense disambiguation | article | long | none | no | - | 2 |
| European Parliament | article | long | 1 picture | yes (medium) | includes a table | 150+ |
| Roodmas | article | short | none | no | - | 3 |
| Simon Magus | article | long | none | yes (long) | - | 15 |
| Perpetual check | gaps | short | none | no ? | - | 3 |
| Burnstown Township, Minnesota | bot | medium | none | no | - | 1 |
| 158 BC | list | empty | none | no | only framework, no specific links | 2 |
| Doe Maar | article | long | none | no | - | 1 |
| Network engineering | gaps | short | none | (engineering) ? | - | 1 |
| Kitab-i-Iqan | stub | stub | none | no | - | 1 |
| Farmington, New York | bot | medium | none | no | - | 2 |
| Villa Ridge, Missouri | bot | medium | none | no | - | 3 |
| South Browning, Montana | bot | medium | none | no | - | 1 |
| Ammi | stub | stub | none | (biblical literature) ? | stub not marked | 2 |
| Information Commissioner | article | medium | none | no | - | 11 |
| Army Tactical Missile System | stub | stub | none | (rocket and missile system) | stub not marked | 1 |
| Dogmatic definition | stub | stub | none | no | stub not marked | 11 |
| Crater Lake | article | short | none | yes (medium) | - | 12 |
| Thomas Walker | article | long | none | yes (long) | - | 0 |
| Osseo, Minnesota | bot | medium | none | no | - | 2 |
| List of national anthems | list | long | none | yes (incomplete) | - | 450+ |
| Aletsch Glacier | stub | stub | none | yes (short) | stub not marked; not wikified | 3 |
| Richard Pankhurst | stub | stub | none | no | - | 1 |
| Rough Rock, Arizona | bot | medium | none | no | - | 1 |
| Tokorozawa | gaps | short | none | yes (short) | - | 4 |
| Alcona County, Michigan | bot | medium | 1 map | no | includes list of cities | 17 |
| Susana Gimenez | article | medium | none | no | includes list | 7 |
| Peter III of Portugal | article | short | none | yes (short) | includes a table | 9 |
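For anyone wishing to re-tally the Britannica 2002 column, the following is a minimal sketch of one way to read its values, following the Key: "yes" means an article of the same title, brackets mean coverage within a larger article, and "no" means no equivalent. The function name, the status labels and the handling of blank cells are assumptions made for illustration; the rows are a small excerpt of the data, not the full table.

```python
from collections import Counter


def britannica_status(cell):
    """Classify a Britannica 2002 cell according to the Key.

    The labels returned here are illustrative, not terms from the assessment.
    A trailing "?" (an uncertain judgement in the data) is ignored.
    """
    cell = cell.strip()
    if cell.startswith("yes"):
        return "own entry"
    if cell.startswith("("):
        return "part of larger article"
    if cell.startswith("no"):
        return "no entry"
    return "not assessed"  # e.g. the disambiguation page, left blank


# Excerpt of the data: (article, type, Britannica 2002).
rows = [
    ("Tsu", "stub", "yes (short)"),
    ("Firehose", "stub", "no"),
    ("Direct access storage device", "stub", "(computer science)"),
    ("Thetford Township, Michigan", "bot", "no"),
    ("Perpetual check", "gaps", "no ?"),
    ("Cartago", "disam", ""),
]

# Count entries per (type, Britannica status) pair.
tally = Counter((kind, britannica_status(cell)) for _, kind, cell in rows)

if __name__ == "__main__":
    for (kind, status), n in sorted(tally.items()):
        print(f"{kind}: {status}: {n}")
```

Extending `rows` to the full table would allow the counts quoted in the Results section to be checked against the data.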