Tuesday, December 23, 2008

Internet content filtering

Loose thoughts by Bjorn Landfeldt

There was an article in the Sydney Morning Herald this morning regarding a report I was involved in preparing for the Australian government late last year. The study concerned the feasibility of filtering content on the WWW at the ISP level. My own contribution to the report was not the major part; I mainly carried out a limited technical study.

I feel it is necessary to clarify a few things concerning the newspaper article.

First, I don't think the study was very secret. In fact, it involved wide consultation with the Australian ISP industry, content providers and other organisations and stakeholders. Knowledge of the study has been widespread even though, as far as I understand from the article, the findings have not yet been widely released. It is not my place to comment on when the government releases its reports, even though I see no real reason not to release this specific one.

The issues raised in the report, at least in the sections I provided input to, have largely been covered in preceding reports. Even though they are very important issues to consider, I don't think they are damning, since they are well known.

My position on this issue is that the scale of investigation needs to increase before any such scheme is made mandatory. If the scheme is voluntary, many of the difficult issues become moot, or at least manageable. The following opinion reflects only publicly available information and is not limited to the report; anyone can find this information in their local library or on the World Wide Web.

So, what is the big issue as I see it? A blacklist requires manual effort to determine what should be included. The Internet is a network of networked computers that carries information in many forms and realms, one of which is the World Wide Web. Even if we restrict ourselves to the WWW, we have a global network with billions of pages of information, written in all the different languages of the world and incredibly diverse. Some information has a very high profile and some has very limited visibility. Since a blacklist would rely on user reporting, it is questionable how efficient it would be at locating unwanted content in the first place. Second, every case would have to be tried to see whether it breaches Australian law and falls within the categories specified for the filtering list. Doing this for content in the grey zone, in all the different languages, will be very difficult. If the point is to stop child pornography, determining whether a model is 19 or 25 in content from a country with a different jurisdiction is not an easy task and would be quite labour intensive. The next question is who is responsible when legal material is blocked because the wrong judgement was made?
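To make the limitation concrete, a static blacklist amounts to little more than a lookup against a manually maintained list. This is only an illustrative sketch, not any real filter's implementation; the domain names and list entries are invented for the example.

```python
# Hypothetical sketch of a static URL blacklist, as discussed above.
# All URLs here are invented examples.

BLACKLIST = {
    "badsite.example/gallery",
    "other-bad-site.example/page1",
}

def is_blocked(url: str) -> bool:
    """Return True if the requested URL appears on the static list."""
    # Strip the scheme so "http://badsite.example/gallery" matches the entry.
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
    return url in BLACKLIST

# An exact-match list only blocks what someone has already reported and
# a reviewer has already assessed; the same content at a new, unreported
# address passes straight through.
print(is_blocked("http://badsite.example/gallery"))   # True
print(is_blocked("http://badsite.example/gallery2"))  # False: not yet reported
```

The point of the sketch is that every entry presupposes the report-and-review labour described above, which is exactly where the scheme struggles to scale.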

The only way to identify such material quickly and significantly limit the risk of accidental access is some form of dynamic content filtering. However, the state of the art of such technologies is very limited in accuracy, and using them carries a consequential performance impact on system response times, or at least an increased cost for the service provider. Current filters are rather good at detecting certain patterns of information: a combination of many images and certain keywords usually means a porn site. However, there are at least three additional dimensions to consider. First, current filters only look at such patterns; they do not try to analyse the actual content in any meaningful way. It is therefore difficult to distinguish between different types of content that look similar, for example a site about sex education versus erotic content. Second, more and more content is moving to other forms of multimedia, where filtering and detecting the nature of content is much harder. Analysing a video and detecting that it contains adult content is not a lightweight computational task, and separating sex education from porn is harder still. Third, if there were widespread filtering of content, providers would see a need to obfuscate their content to fool the filters, and once we step into that realm it becomes very difficult for any filter to keep up.
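The pattern-matching problem above can be sketched in a few lines. This is a deliberately naive, hypothetical example; the keywords and thresholds are invented and real filters are more elaborate, but the structural weakness is the same: the filter scores surface features and has no notion of meaning.

```python
# Naive pattern-based filter sketch: score a page by image count and
# keyword hits, as described above. Keywords and thresholds are
# invented for illustration only.

KEYWORDS = {"sex", "adult", "xxx"}
IMAGE_THRESHOLD = 10    # "many images"
KEYWORD_THRESHOLD = 3   # "certain keywords"

def classify(page_text: str, image_count: int) -> str:
    """Block a page that combines many images with many keyword hits."""
    words = page_text.lower().split()
    hits = sum(1 for w in words if w.strip(".,:;") in KEYWORDS)
    if image_count >= IMAGE_THRESHOLD and hits >= KEYWORD_THRESHOLD:
        return "blocked"
    return "allowed"

# A sex-education page trips the same surface patterns as a porn site,
# which is exactly the over-blocking problem discussed above.
education = "Sex education for teenagers: safe sex, adult supervision advised, sex myths"
print(classify(education, image_count=12))  # "blocked", despite being legitimate
```

Nothing in such a filter distinguishes intent or context, which is why the accuracy limits and obfuscation arms race described above follow directly from the design.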

This discussion also applies to addressing realms other than the WWW, such as P2P networks and social networking applications, which shows that the level of difficulty is very high indeed. There is also a strong movement to anonymise users on the Internet to counteract information logging. Simple tools such as a VPN crossing the Australian border would also enable circumvention of any centralised filtering scheme.

I believe it is a shame that this became a political issue in the latest election campaign. The question of protecting us from certain content is important, and there should be a healthy, public debate about it. I have no knowledge of what other steps the government is currently taking to investigate the matter, but I hope the scope of investigation is much broader than a performance study of blacklist filtering alone.

A while ago I tried to download one of the free NetAlert filters for the computer my oldest daughter uses, but the provider's web site seems to be down. It is a shame, because there are many things out there I don't want her to see at her age, but I am making that choice and I accept responsibility for the over-blocking.

Bjorn Landfeldt

5 comments:

Anonymous said...

Dear Prof. Landfeldt:

Even if only for security's sake (in this case citizen security), these matters should not be taken so lightly.

You do not need protection from "content", unless that content is part of a specific crime against yourself. But, as a citizen, you DO need protection from government abuse, bad law, and any kind of criminal abuse of 'safety' and 'security' measures.

Even if you do not agree with freedom of speech - including, by definition, whatever any of us may find repulsive, offensive or false - once filtering and surveillance mechanisms are in place they WILL be (ab)used in "unforeseen" ways you may not applaud.

Those defending, with the best of intentions, some concept of "illegal content" will end up understanding (hopefully not too late) how the belief in the possibility of "good" censorship - as some matter of technical fine-tuning and 'democratic oversight' - is both naive and dangerous.

Yes, some "content" may be part of a crime against someone (for instance, violating someone's privacy). The way to deal with that is good, classical police work, which in some ways is easier in this internet age, even without any increased surveillance or privacy and speech restrictions.

Unfortunately, for politicians in the usual rush of "doing something now", investment in filtering and surveillance seems more glamorous than investing in police training and appropriate police resources.

There is also a growing tendency to consider a vague "right not to be offended" (not personally, but in some generic way, as part of some group) a fundamental right, more "fundamental" than free speech, privacy and others. In this fashionable, politically correct view (which in itself would deserve a long discussion) lies much of the danger of, and motivation for, the current spread of censorship.

Once filtering mechanisms are in place, can we be "protected" from the slope?

Unknown said...

I like the point you raised about a black list being driven by a complaints system. This is a major flaw, even with the government's use of international blacklists.

There is no way a complaints system will be effective when the providers of such content are so capable of staying hidden. They do not need to advertise, so a complaints system lacks any logical basis for implementation, especially in the case of the world wide web or the internet at large.

The government, if it were to pursue such a filter, would eventually be pressured to implement a dynamic filtering system once this became clear. I use one of the government's filters on my own PC and can tell you that the over-blocking is horrendous. Half the blog posts on the issue are blocked despite containing no porn whatsoever.

This is not a 'slippery slope' argument. The fact remains that a blacklist is laughably inappropriate for online content and that calls for dynamic filtering will be made by the Family First Party.

Bjorn Landfeldt said...

Dear J M Cerqueira Esteves.

You are absolutely correct that there are many more dimensions to this issue, and that is one of my points in suggesting that the current trials cover only a small subset of the matters that need thorough investigation.

When I was contacted by media I only commented on the role I played in preparing the report and not my personal views, and the same goes for this blog post.

Freedom of speech is fundamental to our society and the issue should be dealt with openly and very carefully.

Censorship of traditional media is quite different from censorship of an information network in many ways. I believe a better comparison for Internet content filtering would be airlines having to go through the bags of every passenger arriving from overseas and remove books and videos that are banned in Australia. The issue is closer to that than to censoring commercial products for mass release.

Bjorn Landfeldt said...

Dear Websinthe

This issue is quite interesting. You are correct: the complaints-based system is not well investigated. It is evident that a blacklist with some thousands of entries will be able to stop access to potentially very offensive or even harmful material, but this is not the issue. We have no idea how important it is to target this material, how many children actually stumble across it, and what effects it has. We also don't know how much material would not be on such a list, especially if content providers were to change the way they do business.

We also have no idea of the lag involved in detecting information and testing its legality in a manner suitable to the Australian legal system. We are not talking ACMA classifications of MA or PG here; we are talking about whether something is illegal or not. There should be a proper legal procedure to deal with this, as we have a court system for all other legal matters. If not, we are moving into a grey zone indeed.

Bjorn

Start losing weight said...

Well, there is a lot of filtering on the internet, because you can practically speak with everyone, so I decided to do my own filtering.