User talk:Evercat/PP

Nice detective work, although, for comparison, I suggest you choose someone else with a comparable number of edits to PP who we know is CERTAINLY not Lir, and compare THEIR edits to Lir's, just to see how many pop up by chance alone. --Dante Alighieri 17:34, 29 Jul 2003 (UTC)

Yes, I was wondering about that. Would need to ask Daniel, he did that bit... Evercat 17:45, 29 Jul 2003 (UTC)

Regarding the edit correlation stats: I think this is misleading. I can almost guarantee that you could find another user who has made contributions in a similar statistical manner to Lir. It may provide reason to be suspicious (which you have every right to be), but it doesn't provide any real proof that Lir == PP.

Indeed. But it's the combination of all the different reasons to believe Lir is PP. No one piece of evidence makes the case totally concrete, but together the evidence is very strong, I think. Evercat 19:11, 29 Jul 2003 (UTC)

This info could be presented in a more convincing manner. Given that Lir was one of our most prolific editors (10th highest number of edits), you could make similar lists with just about anyone. What would be more interesting is the correlation in their editing habits, though that's a statistically tricky calculation. Martin 19:23, 29 Jul 2003 (UTC)

Agreed. At the very least, someone can use this information to look at specific edits and look for similarities. The statistically tricky calculation is not too hard (using a Bayesian classifier to compare edits of other users to Lir, you would need a corpus of text deleted by Lir and another of text added by Lir), but I'm not getting paid to do this and it's not really that interesting of a problem. Most drastic outcome is that someone decides that Lir is PP and PP is banned in which case, PP just registers a new account and the cycle begins again. Daniel Quinlan 20:24, Jul 29, 2003 (UTC)

I wasn't thinking on a per-word basis - just the existing info presented differently, as I describe below. As you say, you're not being paid, and I find this more interesting than you... ;-) Martin

Disregarding the idea that this evidence be presented at the level of a court document or a thesis project on Lir -- it's good to see this done as a substantive, professional and to-the-point piece of work. It sets a precedent for future problem users, and the ways to deal with them. -戴眩sv 20:45, Jul 29, 2003 (UTC)

(to Martin) Thing is, the original PP page gave you some idea of ratios, i.e. what's the ratio of pages edited by Lir and Vera Cruz (say) compared to the ratio of pages edited by Lir and PP (say).... Evercat 23:31, 1 Aug 2003 (UTC)

All irrelevant now, but...

That's not a statistically useful ratio. The important ratio would be the number of pages edited by Lir and Vera Cruz but NOT Pizza Puzzle, compared to the number of pages edited by Lir and Vera Cruz, AND Pizza Puzzle. I hope you can see why this is a better reflection of the degree to which Lir and Pizza Puzzle cross-edited.

What I wanted to do (and may yet do) was to get two percentages: what proportion of edits by Pizza Puzzle are to pages that have been previously edited by Lir, and what proportion of edits by Lir are to pages that were subsequently edited by Pizza Puzzle. I think those two figures would be more convincing than the entire list you have, if they were high. If low, they would, to me, make the entire list unconvincing. Martin 14:17, 2 Aug 2003 (UTC)
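The two percentages described above could be computed along the following lines. This is only a sketch under assumptions: the edit log format (a list of `(timestamp, user, page)` tuples) and the function name `overlap_percentages` are made up for illustration and do not correspond to any existing Wikipedia tool or data dump.

```python
def overlap_percentages(edits, earlier="Lir", later="Pizza Puzzle"):
    """Return (pct_later, pct_earlier) for a list of (timestamp, user, page).

    pct_later:   % of edits by `later` made to pages that `earlier`
                 had already edited before that edit.
    pct_earlier: % of edits by `earlier` made to pages that `later`
                 edited at some point afterwards.
    The log layout is an assumption made for this sketch.
    """
    first_earlier = {}  # page -> earliest edit time by `earlier`
    last_later = {}     # page -> latest edit time by `later`
    for ts, user, page in edits:
        if user == earlier:
            first_earlier[page] = min(ts, first_earlier.get(page, ts))
        elif user == later:
            last_later[page] = max(ts, last_later.get(page, ts))

    later_total = later_hits = earlier_total = earlier_hits = 0
    for ts, user, page in edits:
        if user == later:
            later_total += 1
            if page in first_earlier and first_earlier[page] < ts:
                later_hits += 1
        elif user == earlier:
            earlier_total += 1
            if page in last_later and last_later[page] > ts:
                earlier_hits += 1

    def pct(hits, total):
        # Avoid dividing by zero when a user has no edits in the log.
        return 100.0 * hits / total if total else 0.0

    return pct(later_hits, later_total), pct(earlier_hits, earlier_total)
```

Running the same function with a control user in place of Pizza Puzzle would give the chance baseline that was asked for at the top of this page.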

Numbers and common sense

Just to inject some common sense into this discussion: I was the first person to point out that PP is identical to Lir, based primarily on his editing behavior. It was simply common sense -- it was unlikely that there was another user with this exact combination of editing traits. It is difficult, however, to translate this common sense into numbers, because these numbers are hard to come by:

1) How many users make a similar number of small edits per article?
2) How many users respond in the same short sentence style as Lir does?
3) How many users put lots of headlines on pages, often one headline per paragraph?
4) How many users have a good grasp of all the functionality of Wikipedia immediately after signing up?
5) How many users have a similar anarchist/anti-American mindset?

Now, based on our Wikipedia experience, we could all give some guesstimate answers to these questions. I would answer question 1) with "very few" (yes, there are many users who make several edits per page, but very few who make as many as Lir), question 2) with "very few", question 3) with "very few", question 4) with "few" and question 5) with "some". Now, the last question is of course, "Is it likely that many users exhibit all these traits together -- in combination?" Since each of these traits is fairly unlikely, and they have little relationship to each other, the common sense answer to that question would be: not fucking likely.
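The reasoning above can be put into rough arithmetic. If the traits really are close to independent, the share of users showing all of them is the product of the individual shares. The sketch below assumes exactly that; the numeric values standing in for "very few", "few" and "some", and the user count, are invented placeholders for illustration, not measurements.

```python
import math

def expected_matches(trait_fractions, n_users):
    """Return (joint_fraction, expected_user_count).

    Assumes the traits are statistically independent, so the joint
    fraction is simply the product of the per-trait fractions.
    """
    joint = math.prod(trait_fractions)
    return joint, joint * n_users

# Placeholder guesses, NOT real data:
# "very few" = 2%, "few" = 5%, "some" = 20% of users.
guesses = [0.02, 0.02, 0.02, 0.05, 0.20]
joint, expected = expected_matches(guesses, n_users=10_000)
print(f"joint fraction: {joint:.2e}, expected users: {expected:.4f}")
```

With guesses of this size, far fewer than one user would be expected to show the whole combination by chance. The conclusion is only as good as the guesses and the independence assumption behind it.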

Don't get me wrong -- I appreciate any and all attempts to collect hard data. Hard data on the above questions is useful. Hard data like the Lir/Vera Cruz etc. comparison is especially useful. But again, the way this hard data has been questioned shows a substantial lack of applied common sense. Sure, a control group would be nice -- but looking at some of these pages, again, you can ask questions like: Is it likely that two users get in an edit war, after one another, on the same obscure page (New Imperialism), using the same arguments, and the exact same editing style (as above)? The answer is, of course: not fucking likely.

Again, once you start using common sense, it's easier to see methods to verify such probability statements. For example, from the above we can conclude that it is not only important which pages have been edited by the same users, but also within which timeframe these edits have been made, and how many different users have edited the page in total. For example, while many active users will show up on pages like Wikipedia:Vandalism in progress, how likely is it that two people edit many of the same, rarely edited pages (William Herschel, Diurnal motion, Charles Moose etc.) in the same style, after one another? Again: not fucking likely.
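One possible way to act on the point about rarely edited pages is to weight each shared page by how few people have edited it. The sketch below is one hypothetical scoring choice, not an established method: the function name `rarity_weighted_overlap` and the `(timestamp, user, page)` log format are assumptions, and it covers only the editor-count part of the argument, leaving the timeframe check out.

```python
from collections import defaultdict

def rarity_weighted_overlap(edits, user_a, user_b):
    """Score the pages edited by both users, weighting rare pages higher.

    Each shared page contributes 1 / (number of distinct editors), so a
    page touched by only the two users counts 0.5, while a busy page
    with fifty editors counts 0.02.
    """
    editors = defaultdict(set)  # page -> set of users who edited it
    for _ts, user, page in edits:
        editors[page].add(user)

    score = 0.0
    for users in editors.values():
        if user_a in users and user_b in users:
            score += 1.0 / len(users)
    return score
```

Comparing this score for Lir and PP against the score for Lir and a few control users would show whether the overlap on obscure pages stands out.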

However, how much support is really necessary? If common sense answers to many of the above questions are clear, then you should just proceed, give the user a last chance to explain themselves, and then re-ban them. If you set the standard of evidence too high, any intelligent user can easily circumvent the banning process. If you set it too low, the wrong people will get banned. It's a tricky balance -- but right now, we're way off the scale. The current strategy seems to be to kill this particular troll by overfeeding him. --Eloquence