Wednesday, November 23, 2011

Thoughts on "The Closed, Unfriendly World of Wikipedia"

Danny Sullivan wrote a pretty good blog post about an article getting deleted. You can read it here. I'm not so interested or outraged about it.


This spawned a Hacker News thread. You can read the whole thing here.

The comment I want to draw attention to comes to us from Phil Welch. It's so good that I'm quoting it below.


"Turns out if you throw together a few thousand neckbeards and convince them to play status games around building an encyclopedia, you get an encyclopedia.


You also get a whole lot of stupid politics, wasted energy, process wanking, flamewars, and acronym-laden cryptic discourses where words like "arbitration" have strange, Orwellian connotations. ("Arbitration" is Wikipedia's name for the process governing, among other things, removing administrator privileges and banning contributors for long periods of time.) 

Contributing to Wikipedia is a usability disaster because of all the red tape, process, policy, and other crap the core group of contributors has constructed.

But the project is big and popular enough, and the work is easy enough, that people still throw themselves into it. You start by just making a few small edits to something, or adding some sources and information to an article. Then it goes on a bit, and you learn some of the terminology and process, and you start feeling personally invested in it. 

I mean, it's Wikipedia! 

It has lots of information about everything! 

It's free! 

The human race needs something like this! 

Before you know it, you're in a "community" of core contributors. There are people with impressive titles like "arbitrator", "administrator", "bureaucrat". You start playing status games. You worry about your edit count. You try to get an article up to "featured article" status. You network. You try not to make enemies. You try to sound wise on "Articles for Deletion", or the "Administrator's Noticeboard", or a million different talk pages. 

Eventually someone nominates you for administrator. You answer a few questions and other contributors vote for you. 

Let's assume you played politics well enough leading up to this point. You win! Now in addition to all the crap you were doing before, you get to play with your administrator tools as well. 

Fun, eh? Well, for some people it is. But at that level, Wikipedia is 90% politics. 

There's an awesome encyclopedia there for sure, but man, you don't want to see how the sausage is made."


-------------------


Phil Welch's narrative is enlightening and refreshing.

What's really amazing is how generalizable Phil's narrative is. The same could be said of a number of the groups within Toronto's community of communities.


Wikipedia has power because it's so heavily trafficked. Quantcast lists it as the seventh most visited website in the United States. The edit wars on some pages are incredible. Wikipedia is widely thought to be 'collective truth'. It isn't really. It's just a good starting point.



Many communities organize themselves around systems. The power of communities is that they can modify those systems. Political scientists recognize this as path dependence. It's real - just ask Thelen.


Path dependence is measurable and predictable.

What would you use it for? Well, that's up to you.

Monday, November 7, 2011

DJ Patil on the traits of a good data scientist

Friday, November 4, 2011

How to think about the attribution problem

Joe Stanhope wrote a good piece for Forrester. If you have a subscription to Forrester, read it. It summarizes the state we're in, and has a few very good points on the last page.

In that piece, web analysts themselves list 'attribution' as a major challenge.

This is a wicked problem. All the energy you put into untying that knot only causes it to become tighter. But let's try this again, together.

If you haven't seen this previous post, it's new to you. I drew out a conceptual model report, in part to demonstrate how cause-effect can be embedded into a report.





Alright - so that's a conceptual model. I believe that paid spend causes paid visits. I believe that affinity score is a predictor of returning and lost customers. And I believe that non-working dollars should be part of your profit figure. I also don't believe in returns.

A lot of the math here is pretty tame. It should be pretty obvious. And it served its purpose.

So, if you're staring at that diagram and you want to think about which levers you could pull to make that profit number increase, what would you do?

What's the relative strength of each lever?

I don't think of this problem as one of 'giving credit'. I have no dog in this fight - with the paid media person fighting for budget against an insurgent customer affinity group. I'm as objective and pragmatic as possible in setting up the model.

There are methods to execute attribution modeling - all technical. But crafting the solution means having a clear model that links cause and effect. I'd argue that we should be thinking in terms of a system - and that it's not couched in language around 'credit', but rather, how to make the entire system more effective.


Tuesday, November 1, 2011

Systems Thinking and Marketing

The complexity in measurement ramps with the complexity of the channel. In this post, I'll write a bit about an interpretation of systems thinking, and how I apply it to marketing and marketing modeling.

We all seek to minimize complexity and maximize predictability. We want to minimize risk and maximize empowerment. We want to synthesize a huge amount of information and boil it down to a handful of levers. Levers cause empowerment and they enable people to make really good decisions.

Systems thinking is an actual thing now.

Some organizations already have models in place, and those models are fairly standardized. Not every organization has them. Understanding them is pretty important.

This is my approach:

I write a load of variables out onto cards. I talk to a lot of people to get all the variables onto the table.

And then you're confronted with complexity.

I arrange all the variables in a way that cause and effect makes sense. You want to put the outcome you really want to achieve on your far right, and you want to put the variables you have the most control over on the far left. And then you start piecing things together.

It's very natural to believe you have far more levers at your disposal than you really do. Areas on the left will start to migrate towards the center.

It's not entirely about your gut. Certain relationships, like the one between age and income, are very well known. The relationship between spend and reach is usually well known. Other relationships, like the one between recall and creativity, are known directionally but not with certainty.

You've now created a model.

Classically trained statisticians are taught to avoid reinforcing variables in their models, as they lead to a very particular error in regressions. I gravitate towards reinforcing variables. I can deal with the regression later on - that rule is truly an artificial barrier at this stage. When you're laying things out, you want to pay special attention to such variables and dynamics. This is where you can get some of the best efficiency and/or effectiveness. A little difference there can go a long way elsewhere.

It's also where a lot of the most optimistic thinking comes in. It might be attractive to think all reinforcing variables go on reinforcing indefinitely. But in this house, we obey the laws of thermodynamics.



(Obligatory Simpsons Reference)

This is also the root of the 'virality' argument: that somehow two people will each tell two more, who will each tell two more, and before you know it, the entire planet knows of something. If virality really worked that way, we'd all be aware of everything, ever. But we're not.

Reinforcing effects typically have a time limit. Even halos dissipate. Even virality dissipates.
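To make the point about virality concrete, here's a rough sketch in Python. Everything in it is an assumption for illustration: the function name, the idea that the pass-along rate shrinks by a fixed factor each round, and the numbers. It isn't a measured model of anything. It just contrasts a reinforcing loop that never weakens with one that dissipates.

```python
def total_reach(initial, tells_per_person, decay, rounds, population):
    """Cumulative people reached when the pass-along rate decays each round.

    All parameters are hypothetical. decay=1.0 is the naive 'each person
    tells 2, forever' story; decay<1.0 lets the reinforcing effect fade.
    """
    reached = float(initial)       # everyone who has heard so far
    newly = float(initial)         # people who heard in the latest round
    rate = float(tells_per_person)
    for _ in range(rounds):
        # New listeners are capped by whoever is left in the population.
        newly = min(newly * rate, population - reached)
        reached += newly
        rate *= decay              # the reinforcing effect weakens over time
    return reached


naive = total_reach(1, 2, 1.0, 40, 7_000_000_000)
fading = total_reach(1, 2, 0.7, 40, 7_000_000_000)
print(f"no decay: {naive:,.0f} reached")
print(f"30% decay per round: {fading:,.1f} reached")
```

With no decay, the whole (assumed) population is reached within 40 rounds. With the pass-along rate fading each round, the total levels off at a tiny number. The interesting question in practice is how fast the rate decays, not whether it does.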

Once it's laid out, you can run a few simulations on it to get a sense of the range and impact of the variables. Then, for the purpose of communicating a model with the fewest number of words, delete the least predictive variables and levers, and communicate a simplified version of the model.
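Here's a minimal sketch of what running a few simulations and then pruning might look like. The lever names, the coefficients in toy_outcome, and the noise level are all made up for illustration. A real model would come from the cards and the conversations, not from this function. The ranking uses a plain Pearson correlation as one simple stand-in for 'predictive'; other measures would do just as well.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def toy_outcome(levers):
    # Hypothetical relationships; the coefficients are invented.
    return 3.0 * levers["spend"] + 1.0 * levers["affinity"] + 0.05 * levers["creative"]


def rank_levers(n_runs=5000, seed=0):
    """Draw random lever settings, simulate the outcome, rank levers by |r|."""
    rng = random.Random(seed)
    names = ["spend", "affinity", "creative"]
    draws = {name: [] for name in names}
    outcomes = []
    for _ in range(n_runs):
        levers = {name: rng.uniform(0.0, 1.0) for name in names}
        for name in names:
            draws[name].append(levers[name])
        # Add noise so the outcome isn't a perfect function of the levers.
        outcomes.append(toy_outcome(levers) + rng.gauss(0.0, 0.2))
    strength = {name: abs(pearson(draws[name], outcomes)) for name in names}
    return sorted(strength.items(), key=lambda item: item[1], reverse=True)


ranked = rank_levers()
simplified = ranked[:-1]  # drop the least predictive lever before communicating
for name, r in ranked:
    print(f"{name}: |r| = {r:.2f}")
```

The simplified list is the version you'd put in front of people. The full ranking stays in your back pocket for when someone asks why a lever was left out.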

You do the usual 'what-if' scenarios and encounter assumptions about the world that don't quite make sense. You have to aggressively inquire about why people think the way they do. Assume the best in people. Be aware that sometimes people forget that targets are a logical consequence of building a business case. Not everybody thinks ahead, and, in some organizations, there is truly no linkage between targets and business cases.

Remember that the model you generate is one in trillions. Your goal is to generate the most predictively accurate model you can based on what you know. It is all but certain that more accurate models exist. We just literally don't know what we don't know at this point. It's not a question of 'wrong', but a question of 'better'.

Good enough is defined as good enough to make a confident assertion about a set of causes that have likely effects. You're not aiming for perfect because perfect isn't possible for another few hundred years (or certainly some other time scale that exceeds your life).

Once you have a model, you have a system. You can use that system to write powerful recommendations that link actions the firm can take with outcomes that are likely. A system ultimately leads to a general understanding, and we move into a process of normal science.

Evidence accumulates that certain relationships are growing weaker, for instance, between TV Spend/GRP and Sales. New channels emerge and fragment customer attention and behavior. In short, systems evolve and much of what was true in 1990 isn't true in 2011. Much of what is true in 2011 won't be true in 2020. It's not that anybody was wrong. It's only bad if we fail to update our present view of a system with new knowledge.

That's my take on systems thinking and marketing. Useful?