Thursday, March 31, 2011

Web Analytics Wednesday Toronto is on a Thursday

The title is funny and accurate.

Web Analytics Wednesday Toronto is on a Thursday in March and April.

I hear that people like technical presentations and business presentations. So, the March 31 edition is at the Charlotte room and will feature presentations by Simon Colyer and Robin Ward. I will be presenting a recent study on Increasing Campaign Effectiveness from the business side.

The eMetrics Toronto edition happens on April 28th and will have Jim Sterne, Jim Novo and Eric T. Peterson.

Why should I come out?

The Toronto analytics community is exciting and growing.

According to Indeed, there are six published open recs for web analysts in Toronto. There are 806 open recs for analytics in Toronto. Publicly posted recs are only partly representative of what's really going on out there in the information economy. The unpublished market is much deeper.

Canadian analytics practitioners are taking on the world and winning.

Web Analytics Wednesday Toronto is a gathering of people who really practice analytics.

If you're in analytics, work with analytics, or want to get into analytics - WAWTO is the place to come out and meet those who practice while asking and answering questions.

Tuesday, March 29, 2011

Quantifying Creativity

Creativity is measurable.

A long time ago, two scientists, Yang and Smith, demonstrated how creativity can be quantified and linked through to marketing performance.

How can you tell if a message or ad is creative? On the dependent variable side, they enumerate attention to the ad, motivation to process the information, depth of the processing, ad attitude, brand attitude, and purchase intention.

What causes something to be creative? They identified divergence, relevance, and production quality. This gets broken down again into originality, flexibility, synthesis, elaboration, artistic value, relevance of the ad to consumer, and relevance of the brand to consumer. And then, if you break it down further, you have very specific criteria, accumulated from multiple false starts on the topic: 'The ad was out of the ordinary', 'The ad was unique', 'The ad connected objects that are usually unrelated', 'The ad was uncommon', 'The ad was useful to me'.

At the root of all the onion layers, there's a list of criteria that a human analyst, or, better yet, a group of three human analysts, can use to assess the creativity of a piece.
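Here is a minimal sketch of how three analysts' ratings against those statements might be pooled. The 1 to 7 agreement scale, the scores themselves, and the `summarize` helper are all my own assumptions for illustration; none of them come from the Yang and Smith paper.

```python
from statistics import mean

# Hypothetical scores: three analysts rate each statement
# from 1 (strongly disagree) to 7 (strongly agree).
ratings = {
    "The ad was out of the ordinary": [6, 5, 6],
    "The ad was unique": [5, 5, 4],
    "The ad connected objects that are usually unrelated": [3, 4, 2],
    "The ad was uncommon": [6, 6, 5],
    "The ad was useful to me": [2, 3, 3],
}

def summarize(ratings):
    """Return the mean score and the spread (max - min) for each statement."""
    return {
        statement: {
            "mean": round(mean(scores), 2),
            "spread": max(scores) - min(scores),
        }
        for statement, scores in ratings.items()
    }

summary = summarize(ratings)
for statement, stats in summary.items():
    print(f"{stats['mean']:>5}  (spread {stats['spread']})  {statement}")
```

A large spread on a statement is a hint that the analysts read the ad differently and should talk before anyone averages anything.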

The model turns out to be valid. That is to say, the model and method they put forward generates results that are accurately predictive of marketing performance.

I used the Yang and Smith model for a guerrilla paper during a hiatus in 2009. It was incredibly fun to try and apply.

It's important to note that the method is not a weapon to go beat creatives with. It's a tool one can use for tricky situations, and it will give an analyst a good basis for engaging with creatives in language they can understand. It's also particularly important for analysts not to say anything negative about ideas or otherwise jam the creative process with the word 'but'.

The model is useful in advancing our understanding of marketing performance, and in making it better.

Friday, March 18, 2011

Embedding Cause and Effect into Analytics Communication

I encounter a lot of artifacts of analytics communication: dashboards, PowerPoint decks, and Excel files.

You can tell a lot about an organization from such artifacts. You can see sandbagging. You can see staff transfers riddled throughout some of them, and you can sense the ghosts of analysts promoted or churned. You can definitely see the ghosts of EVPs long gone. You can sometimes make out the intended audience, the originally intended audience, and how incredibly diluted something became over time.

An analytics report is akin to sand on the beach. Sometimes the tide comes in and scrubs away the footprints. Much more frequently those footprints add up, muddle the situation, and then fossilize.

Why it happens and a possible solution follow.

Different people have different conceptual models. Instead of making a choice about which factors matter, and which do not, people ask for data that supports their own mental model. For instance, if you became a Director of eCommerce through the paid media path, you have a very specific way of thinking about profit. If you became that Director by way of CRM, you think extremely differently. You can see how the report would differ between the two. If a company has two directors of eCommerce, you're going to get data inflation. And a third person, looking in, won't have any idea what's going on.

The Leviathan has their own mental model. And they will ask for numbers that reflect that. The troops under the Leviathan will always attempt to insert their belief structure into it.

What's an analyst to do?

I'll put forward an idea. How about the analyst produces a piece of communication of which they are the sponsor?

The conceptual model report would walk through how I'd want to understand the effectiveness of my company at the highest level. I've taken a non-creative stab at how I think about the factors.


I constructed it by working backwards. The ultimate goal is profit. Profit is the difference between revenue and cost. There are COGS, which I cannot control in this context, and paid spend/non-working dollars, which I could. Then I work backwards again: revenue is average price times number of units sold. I know these figures from the number of checkouts, which is where the second split comes from. Customers place orders. How many customers do I have, how many are new, how many are repeat, how many were lost (from the at-risk category) and how many were saved? A major driver of all that is the affinity score. And so on.
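The top of that chain is simple enough to write down as arithmetic. This is a sketch only: the function names, the `save_rate` definition (saved divided by saved plus lost among at-risk customers), and every figure are hypothetical, made up to show how the boxes feed one another.

```python
def revenue(avg_price, units_sold):
    """Revenue is average price times number of units sold."""
    return avg_price * units_sold

def profit(avg_price, units_sold, cogs, paid_spend):
    """Profit is revenue minus cost, where cost is COGS plus paid spend."""
    return revenue(avg_price, units_sold) - (cogs + paid_spend)

def save_rate(lost, saved):
    """Share of at-risk customers who were saved (my own assumed definition)."""
    return saved / (lost + saved)

# Hypothetical figures for one reporting period.
base = profit(avg_price=40.0, units_sold=1000, cogs=22000.0, paid_spend=6000.0)

# Scenario: the same period with 10% more units sold, costs held flat.
scenario = profit(avg_price=40.0, units_sold=1100, cogs=22000.0, paid_spend=6000.0)

print(base, scenario, save_rate(lost=30, saved=20))
```

Holding costs flat in the scenario is a deliberate simplification. In practice COGS would move with units, which is exactly the kind of arrow the diagram should make someone argue about.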

You might say, 'well what about ____ ?', isn't that an important factor? And what about the checkout completion rate? What about my great OOH campaign? And so on. Indeed, there are dozens of factors that I believe are relevant and useful. They're just not relevant enough to be seen all the time.

There is a lot to be gained by editorializing analytics through selection of what is important, and putting them in context of one another. I use a few ugly arrows to demonstrate causality. Please don't derp over correlation. I'm pretty certain that an increase in average price and an increase in units sold causes revenue to increase.

The best way I know how to communicate 'this causes that' is through an arrow. The market generally doesn't like it. In PowerPoint, I deliberately lay out charts to imply cause-effect. Four quadrants on a sheet assist in that process. It's more accepted, but it's more confusing.

I think it assists the analyst in writing relevant and strong analysis and recommendations. It lends itself especially well to statistical modeling. It also lends itself particularly well to forecasting and scenario analysis.

Reports in this format are less susceptible to moar boar additions. For instance, any new factor to the model will have to be significant enough to matter, and actually have a cause.

The bad news is that there is no trend line. Each of these boxes could be replaced with a box with a sparkline in it. But you can see how it would devolve into a nine-page extravaganza. It's on this point, where regression starts to come in, replete with lag factors, that the real science begins and bull reportage ends.
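For a taste of what a lag factor looks like, here is a rough sketch of a one-variable least-squares fit where this period's units are explained by last period's paid spend. The `fit_lagged` helper and the series are invented for illustration, and the data is rigged to be perfectly linear, so treat it as a toy and not as a finding.

```python
def fit_lagged(x, y, lag=1):
    """Ordinary least squares of y[t] on x[t - lag]; returns (slope, intercept)."""
    xs = x[:-lag]   # the driver, shifted back by `lag` periods
    ys = y[lag:]    # the outcome, aligned with the lagged driver
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical weekly series: paid spend and units sold.
spend = [10, 20, 30, 40, 50, 60]
units = [30, 25, 45, 65, 85, 105]  # built so units[t] = 2 * spend[t-1] + 5

slope, intercept = fit_lagged(spend, units, lag=1)
print(slope, intercept)
```

Real data would need more than one lag tested, more than one driver, and some honesty about the error term.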

So how would you build out a report? If you sponsored a report, and it was yours, not the Leviathan's, what would you put in it? What would you focus on? More importantly, if it was your company, what would you focus on? How would you demonstrate that your course of action was correct? How would you inject cause and effect into analytics communication, and keep the moar boar at bay?

Wednesday, March 2, 2011

Intelligence requires selective ignorance.

Intelligence means selective ignorance.

Imagine how intelligent and ignorant we used to be as a people, just 120 years ago.

Some of the first uses of sampling techniques in quantitative methods centered around the use of alcohol in society. They really didn't have very much machine-readable data back then (the first use was for the 1890 US Census), so the practices of data mining weren't possible. Sampling, and sample statistics, existed precisely because no machines could be used to quantify the entire population against some policy question.

You try calculating chi-square on a very large dataset without a calculator or a spreadsheet!

Indeed, sampling continues to be used to this day as a cost-effective way of asking the world intelligent questions.
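For a sense of the arithmetic involved, here is a small sketch of Pearson's chi-square statistic for a contingency table, done the long way with no statistics library. The counts are made up, and the row and column labels in the comments are hypothetical.

```python
def chi_square(table):
    """Pearson's chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 sample: rows are two kinds of household,
# columns are two outcome categories.
observed = [[30, 20],
            [20, 30]]

stat = chi_square(observed)
print(stat)
```

With one degree of freedom, the usual 5% critical value is about 3.84, so this invented sample would just clear the bar. Now picture doing each of those expected counts by hand for a few thousand respondents.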

Abolitionists, or 'Smashers' as they were called in my home province, were looking for evidence to prove their point of view. They were a bunch of convenient reasoners. For instance, they sought to prove that a child's height and weight were inversely correlated with household alcohol consumption. They sought to use quantitative evidence to support their point of view. (Inb4 correlation isn't causality.)

Yet, fundamentally, the idea of making public policy decisions based on evidence, to produce better outcomes for all of society, is a good one. Upon seeing the awesome power of applied statistics to do tremendous evil (second-year university), I imagined the tremendous power for good. What if we didn't make policy decisions based on gut? What if we didn't leave things to chance? What then?

Indeed, the abolitionists were right about a few things. After all, we later deduced fetal alcohol syndrome using evidence-based medicine. Public policy officials then took remedial action to exhort the population as a whole not to encourage pregnant women to drink. To be sure, some women continue to drink and do harm to their children, yet the risks have been mitigated for so many.

Ordered, machine-readable information used to be rare. It was very difficult for Ontario proponents of abolition to rally evidence to support their position. It's different now.

Today, there is a database containing the medical records of 10.6 million people (people die and do leave the province...). It's easier now to understand, with a very high degree of certainty, the impact of alcohol. And of many diseases, for that matter.

There are pitfalls to be sure. In one of my favourite studies, Peter Austin showed just how easy it is to be fooled by false positives when using evidence.

We are drowning in data. And it's easy to get really fooled by randomness because of it.
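To see how quickly that happens, here is a rough simulation sketch. It compares pairs of groups drawn from the same distribution, so every 'significant' difference it finds is a false positive by construction. The function names, the group size, the number of tests, and the 1.96 cutoff (a normal approximation to a 5% two-sided test) are my own choices for illustration.

```python
import random
from statistics import mean, stdev

def false_positive_count(n_tests=200, n=100, threshold=1.96, seed=0):
    """Count 'significant' differences between groups that truly do not differ."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_tests):
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        # Standard error of the difference in means.
        se = (stdev(a) ** 2 / n + stdev(b) ** 2 / n) ** 0.5
        z = (mean(a) - mean(b)) / se
        if abs(z) > threshold:
            hits += 1
    return hits

def chance_of_any_false_positive(n_tests, alpha=0.05):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

hits = false_positive_count()
print(hits, "of 200 comparisons looked significant by chance alone")
print(round(chance_of_any_false_positive(20), 3))
```

Run twenty independent tests at the 5% level and the odds of at least one false alarm are already about 64%. Ask a big enough database enough questions and something will always look interesting.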

The Smashers, in spite of their ill-guided attempts to restrict the range of freedom within society, knew what they were trying to prove. They knew what they were trying to build, and knew what sort of evidence was needed. They weren't all evil people bent on telling others how to live their own lives. Some of them were genuinely good people who wanted a better outcome for all.

They had a focus.

The danger of so much cheap data is that it's very easy to lose our own focus.

What do you think? Is it more important to quantify as much as possible, to mitigate the risk of not being able to know something? Or is it more important to focus on an interesting (profitable) problem and examine additional questions deliberately, mitigating the risk of getting fooled by false positives?

Why know so little about so much when you can know so much about what likely matters?