How big a problem is Click Fraud?

According to the latest Click Fraud Index, 15.8% of clicks on pay-per-click ads are fraudulent, rising to 25.6% if you look solely at content networks, such as Google AdSense.

This is a can of worms. Almost any question you can ask is hard to answer with confidence.

What is the real level of click fraud? Google’s Shuman Ghosemajumder says that estimates such as the Click Fraud Index are overstated. In addition, a proportion of the fraudulent clicks are detected by Google and not charged for. Which leads to another question –

How many fraudulent clicks are actually charged to advertisers? Google says very few – but they would. Intriguingly, it also apparently discounts content network ads on the grounds that there is more fraud there:

“We know there is a more direct incentive for fraud on the content network and we do much more to protect advertisers, ban bad publishers, and improve ROI through SmartPricing discounts. As a result, average ROI on our content network is nearly the same as on Google.com. Yes, you read that right. ROI is the same on average – and not by accident, but because we automatically provide discounts to advertisers to make it so.”

Is that not a tacit admission that advertisers are paying for some fraudulent clicks? It sounds like that to me – Google discounts all the clicks to compensate for the fact that some are bogus.

How is click fraud detected? Some of this is obvious – rapidly repeated clicks from the same IP address, for example, and clicks from known botnets. As I understand it, the Click Fraud Network goes further by attempting to analyze the activity after the click has landed. However, nobody is going to publish the exact algorithms they use, for fear of helping the fraudsters (and giving away secrets to competitors).
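To make the "obvious" case concrete, here is a minimal sketch of flagging rapidly repeated clicks from one IP address with a sliding time window. It is an illustration only, not any ad network's actual method: the function name, the 10-second window, the threshold of 3 clicks and the sample data are all assumptions of mine.

```python
from collections import defaultdict, deque


def flag_rapid_clicks(clicks, window_seconds=10, max_clicks=3):
    """Return the set of IPs with more than max_clicks clicks inside the window.

    clicks: iterable of (timestamp_seconds, ip) tuples, sorted by timestamp.
    The window and threshold are arbitrary illustrative values.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in clicks:
        timestamps = recent[ip]
        timestamps.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamps and ts - timestamps[0] > window_seconds:
            timestamps.popleft()
        if len(timestamps) > max_clicks:
            flagged.add(ip)
    return flagged


# Hypothetical sample data, using documentation-range IP addresses.
sample = [
    (0, "192.0.2.1"), (1, "192.0.2.1"), (2, "192.0.2.1"), (3, "192.0.2.1"),
    (5, "198.51.100.7"), (400, "198.51.100.7"),
]
print(flag_rapid_clicks(sample))
```

A check this simple is easy to evade by spreading clicks across many addresses or slowing them down, which is why it only covers the easy cases.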

What about clicks from unknown botnets? What about click fraud programmed skillfully enough to replicate likely human behaviour? I’m sure the fraud detection is imperfect; the question is, how imperfect?

Probably the best way to figure this out would be to penetrate the criminal underworld that executes the fraudulent clicks and discover their techniques. Even this would not catch the casual fraud – for example, someone who goes into an Internet cafe from time to time and clicks on a few ads on their sites.

How can advertisers audit their bills? Difficult, because as I understand it the agencies (such as Google) do not supply enough detailed information. You would need the exact time and IP address of every billed click, which you could then match to your log records, analyze behaviour, and check out against known compromised IP addresses. There is a general lack of transparency.
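If that per-click data were available, the matching step might look something like the sketch below. It assumes a hypothetical billing export of (timestamp, IP) pairs and a web server log reduced to the same shape; the function name and the 5-second tolerance are my own assumptions, not anything an ad network provides.

```python
def unmatched_billed_clicks(billed, logged, tolerance_seconds=5):
    """Return billed clicks with no log entry from the same IP within the tolerance.

    billed, logged: lists of (timestamp_seconds, ip) tuples.
    """
    log_times_by_ip = {}
    for ts, ip in logged:
        log_times_by_ip.setdefault(ip, []).append(ts)

    missing = []
    for ts, ip in billed:
        candidates = log_times_by_ip.get(ip, [])
        if not any(abs(ts - t) <= tolerance_seconds for t in candidates):
            missing.append((ts, ip))
    return missing


# Hypothetical sample data, using documentation-range IP addresses.
billed = [(100, "192.0.2.1"), (200, "203.0.113.9")]
logged = [(102, "192.0.2.1"), (500, "203.0.113.9")]
print(unmatched_billed_clicks(billed, logged))
```

This is deliberately simplified: one log entry can satisfy several billed clicks, and a billed click that does match the log may still be fraudulent. It only finds charges with no corresponding visit at all.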

About the only thing you can measure is results. In other words, attempt to correlate sales against ad expenditure. Google says the same – watch your ROI. The reason advertisers meekly accept a certain (unknown) level of fraud is that they feel it is worth it. After all, there is waste in every other form of advertising – TV viewers who skip ads, print ads that are never opened, etc.
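As a rough illustration of the arithmetic, the sketch below computes ROI and the effective cost of each genuine click when some fraction of billed clicks is bogus. The function names and the sales figures are hypothetical; the 15.8% rate is the Click Fraud Index figure quoted above, used here on the unproven assumption that every fraudulent click is actually charged.

```python
def roi(revenue, ad_spend):
    """Return on investment as a fraction: 0.5 means a 50% return."""
    if ad_spend <= 0:
        raise ValueError("ad_spend must be positive")
    return (revenue - ad_spend) / ad_spend


def effective_cost_per_genuine_click(cost_per_click, fraud_rate):
    """Cost of each genuine click if fraud_rate of billed clicks are bogus."""
    if not 0 <= fraud_rate < 1:
        raise ValueError("fraud_rate must be in [0, 1)")
    return cost_per_click / (1 - fraud_rate)


# Hypothetical campaign: 1000 spent, 1500 in attributable sales.
print(roi(1500, 1000))
# A nominal 1.00 per click at the 15.8% index figure.
print(effective_cost_per_genuine_click(1.00, 0.158))
```

The point is that fraud shows up as a higher real price per click, so an advertiser who only watches ROI cannot tell it apart from any other kind of waste.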

That’s all very well, but the main difference with click fraud is that it is not just waste, it is also fraud. How many people are getting prosecuted? Another question that I can’t answer, but I suspect, nowhere near enough.

UK Mix07 announced

Microsoft is holding a mini-Mix in London, September 11-12. It appears to be focused on Silverlight, Expression and Windows Live, and speakers include Scott Guthrie of ASP.NET fame (but now with wider responsibilities) and Danny Thorpe of Delphi fame (but now at Microsoft and working on Windows Live APIs).

Windows Vista and Office 2007 may have under-delivered, but this stuff is pretty interesting; worth a look if you can make it. Of course I’m particularly interested in the Day 2 session called “Sneek peeks”.

It’s a while after Mix in Vegas earlier this year, but Microsoft’s Daniel Moth says there will be new content.


LINQ: “Customers are massively confused”

I’ve just completed an article for Hardcopy magazine on database APIs – it’s for a forthcoming issue so the piece is not online yet. I interviewed Mark Troester, senior manager of product marketing at DataDirect, and he gave me some interesting comments on LINQ (Language Integrated Query), Microsoft’s new database extensions for the .NET Framework.

“There’s a lot of confusion out there because there is LINQ to SQL and LINQ to the entity model, and the LINQ to SQL stuff that has just got released in beta is specific to SQL Server, so Microsoft needs to do some work in terms of getting things better organized. I think their argument would be that they need a lot of flexibility, but from what we see when we talk to customers, people are just massively confused.”

As I continued my research I could see what he meant. What I think is the official home page has plenty of resources, but it’s a bit of a mess. If you click the link to the FAQ – actually a forum post – you find it dates from March 2006 and talks about DLINQ, which is now an obsolete term (I think). The project overview is pretty good, but gets you deep into Lambda Expressions and Extension Methods without mentioning important practical issues like the fact that LINQ to SQL seems to be exclusive to SQL Server. This unfortunate fact is confirmed by Microsoft here:

Q: It was mentioned that there is currently no LINQ to SQL support for Oracle or MySQL databases.  Will it be possible for developers to implement their own SqlProvider implementations for these database engines?
A: We don’t really have a provider model or provider writers’ SDK in this release (Orcas). So it is possible to build a LINQ provider similar to LINQ to SQL; but unfortunately, in this release, we won’t be able to offer much to make that as simple as we would like.

Of course you could still use LINQ to DataSet with Oracle or MySQL data … see why it gets confusing?

By contrast, I like Granville Barnett’s Introducing LINQ series (though not the pop-up ads), which takes a hands-on approach, as does Scott Guthrie’s LINQ to SQL series (the link is to part 5, but it references the other articles).

I also highly recommend Matt Warren’s blog post on The Origin of LINQ to SQL and the interesting comments.

LINQ looks compelling; I hope Microsoft manages to improve the clarity of its LINQ information soon.

Common misconceptions about Rich Internet Applications

Ryan Stewart blogs about Why do tech journalists get Rich Internet Applications so wrong.

I don’t agree with everything he says, especially this one:

“AIR is a difficult thing to grasp because running web apps on the desktop hasn’t been done before.”

I suppose there might be a way to define “web apps” that excludes everything prior to AIR, but it would be difficult. I’m composing this blog post in Windows Live Writer – I consider this to be a desktop web app, especially the latest version which synchronizes the local copy of a post with what is on the web. Many of the widgets and gadgets in Vista or Yahoo! Widgets or the Mac Dashboard are web apps. Apple’s iTunes is a hybrid web/desktop application; it includes an online store that runs outside the browser. Any Java or .NET application which retrieves data from the internet via web services is a web app. Even running HTML applications on the desktop has been done before, for example with Microsoft’s HTML Application model. If you exclude anything that is not cross-platform then the list is shorter – but you still have Java and Mono; iTunes is cross-platform; and even AIR won’t do Linux in its first release.

That said, I agree that there is a fair amount of confusion out there. Half the problem is that the terms are overloaded. As I understand it, Adobe’s use of the term Rich Internet Applications includes almost any web application beyond HTML, whether or not it is running in the browser – though usually what Adobe really means is “any application that uses the Flash runtime”.

I often see Silverlight and AIR treated as close competitors, yet Silverlight is for browser applications, and AIR is for desktop applications.

Some assume that Silverlight is Windows-only, but it is not: it runs on Intel Macs, and Mono is doing amazing work with Moonlight – Silverlight for Linux.

Another wrong assumption is that because AIR applications run on the desktop, they can do anything a native code desktop application can do. In reality they have no access to native code libraries.

Adobe had us all thinking that it was somehow adopting Google’s Gears – I was fooled by this – but it is not at all; it just happens to have the same open source database engine in its own project.

So the other reason tech journalists “get it wrong” is that the whole “beyond HTML” story is complex and hard to put across in a few words. That is what we are meant to be good at, but there is always a danger of over-simplifying to the point of inaccuracy.
