Words by c.z.robertson

P2P porn

2001-07-28 01:00:00 UTC

Two US congressmen, Henry Waxman and Steve Largent, have commissioned a report on pornography in P2P systems, which states that:

On these [Gnutella-based] programs, six of the top ten searches on a recent day were for "porn," "sex," "xxx," and other terms intended to elicit pornography.

While this may effectively demonstrate that Gnutella users search for porn in significant quantities, the place of these terms in the top ten actually tells us very little about their relative importance to Gnutella users and absolutely nothing about the actual numbers involved. This leads me to believe that the compilers of the report are either incompetent statisticians or are acting disingenuously.

When I search for music on Gnutella my actual query will never include the terms "music" or "mp3". It will typically only consist of the name of a specific band I am interested in.

If I were to search for pornography on Gnutella, I would probably have a different approach. It's not obvious to me how I'd search for porn without using high-level terms such as "porn", "sex", "anal" and so on.

I can think of many more band names than I can high-level porn terms. Therefore, even if the vast majority of searches were for music, they'd be spread over such a wide range of query terms that those terms could easily be less likely to reach the top ten than the few porn terms which turned up in every porn search.
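
To put some numbers on that argument, here is a minimal sketch in Python. Every figure in it is invented for illustration and none of it comes from the report: I assume 10,800 music queries spread evenly over 1,200 band names and 1,200 porn queries spread evenly over six high-level terms. The names top_terms, band_N and porn_term_N are likewise hypothetical.

```python
from collections import Counter


def top_terms(music_queries, porn_queries, band_count, porn_term_count, n=10):
    """Return the n most common query terms, assuming an even spread.

    All inputs are hypothetical; real query logs would be far less uniform.
    """
    counts = Counter()
    # Music searches are split across many distinct band names.
    for i in range(band_count):
        counts[f"band_{i}"] = music_queries // band_count
    # Porn searches are concentrated on a handful of generic terms.
    for i in range(porn_term_count):
        counts[f"porn_term_{i}"] = porn_queries // porn_term_count
    return counts.most_common(n)


MUSIC_QUERIES = 10_800  # assumed, not from the report
PORN_QUERIES = 1_200    # assumed, not from the report

top = top_terms(MUSIC_QUERIES, PORN_QUERIES, band_count=1_200, porn_term_count=6)
porn_in_top = sum(1 for term, _ in top if term.startswith("porn_term_"))
porn_share = PORN_QUERIES / (MUSIC_QUERIES + PORN_QUERIES)

print(f"{porn_in_top} of the top 10 terms are porn terms")
print(f"porn queries are {porn_share:.0%} of all queries")
```

Under these assumptions each porn term is searched 200 times and each band name only 9 times, so porn terms take six of the top ten places while accounting for a tenth of all queries. The real distribution could look quite different, which is exactly why the ranking alone tells us so little.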

It is also worth noting that while an appendix lists the top thirty search queries on one day, the authors have not listed the number of times those queries occurred, nor how many queries there were overall. Also, which day was this? How was this day chosen? Did they sample every day and only use the result that was most in their favour?

It is work like this that gives both statistics and politicians a bad name.

