Saturday, November 14, 2009

The Schism in Analytics, A response to Carrabis, Part II

The central scar, the central schism, as I view it, is in the disconnect about what analytics should be and what it actually is.

There are those who look to the past. It is perfectly possible to do very thorough analysis about why what happened in the past, happened. There's a large amount of valuable competitive advantage to be had that way.

There are those who look to the past only to find evidence to confirm what they remember having thought. These are proof-seekers or justifiers. No further analysis over and above the baseline amount of proof is required. And, if the proof is unsatisfactory - then the data must be inaccurate. Frequently, all that is required is a simple, static report listing a few numbers.

There are those who look to the future. It is perfectly possible to do very thorough analysis about what could happen in the future and optimize against those scenarios. There's a large amount of valuable competitive advantage to be had that way.

There are those who look to the future only to find evidence to justify what they want to do next. These are validation-seekers. No further analysis over and above the baseline amount of evidence is required. And, if the proof is unsatisfactory - then the data must be faulty.

I think there's a huge market for justification. In fact, I think this is why so many vendors go where the market is. They just respond to the market, don't they? And for most people, it's purely about justification. The tools aren't set up to explain anything in the past because that's not where the market is at. The result is predictable. Tens of thousands of reports generated daily, going unread and ignored: all a function of market demand.

To pin the blame solely on vendors is like blaming obesity on fast food companies. They're only giving the market what it wants, negative externalities be damned.

(Who is anybody to resist market forces?)

I think there's a smaller market for validation, largely because most analysts don't really play that game. One of the reasons why web analysts are infrequently invited to the table is because they kill creative ideas and sometimes, are completely disconnected from how managers really make decisions (this is directly to Hillstrom's point). Because web analysts are generally not very good validators, they're condemned to the bowels of the company, reporting on the past, and infrequently asked about the future.

That's the schism: between Scientist-Practitioners and Everybody Else.

And Everybody Else is kicking our ass.

There's reason for hope though!

I think there's a growing market for actual learning and competitive advantage - driven by science. This isn't justification seeking or validation seeking behavior though.

This is driven by upper management - people like Alan Wurtzel of NBC Universal (September 2009) who literally tired of drowning in data. If anything, they're looking for a consolidation of data by way of scientific methods.

There's a generation of HBR people too - who actually know what innovation really means. They know you have to get there through science.

Therein lies the gap.

Can anybody say, really honestly, that Web Analytics, as it is practiced by 80% of practitioners, is Scientific? Of course not. And we find fault with everybody else: tools, skillsets, and market demand.

Look, the tools have been there all along: SPSS, GGOBI, R. The courses have been available online for a long time. So, alright, SPSS is expensive and budgets are tight, alright, fine. GGOBI is free and so is R. But if that isn't enough, there's more help too:

One of the things that Google is doing to make it easy, part of their grand plan, is to introduce statistical functionality into the baseline tools. Google is creating a world of abstraction - where a web analyst won't need a degree in quantitative methods to be able to operate the scientific method. Truth be told, an analyst doesn't need to be a statistician to use Google Website Optimizer. No, an analyst only needs to have the political skill set of an alien ambassador on Babylon 5 to get the tags actually put in.

(Hint to those threatened: That entire political world - about getting things done through a large org chart - is where a whole world of value-add can be found.)

I'm under frieNDA about what is going on elsewhere. They're a huge part of the solution too.

And, I'd like to believe, on sunnier days, that I'm also part of the solution.

Look, you don't need to understand it to use it.

And, to touch briefly on the whole 'Cult of the Amateur' Easy/Hard Towards/Away debate that Carrabis touches upon in his "hard" versus "easy" passage -

Web Analytics didn't exist when I was growing up. The new economy giveth.

Web Analytics, as it is presently practiced, won't exist someday. The new economy giveth away.

Sundry reportage - the generation of justification and validation - will probably take at least twenty years to be destroyed because of a stubbornly long S curve. And you know what: good enough is good enough for huge swaths of the economy. I understand that there are companies out there that turn trees into toilet paper. I salute them and believe that there are analytical products that are perfect for them. I won't dare call those products "science". (And when they're ready for real science someday, I'll be there too.)

I think there's a solution in the schism: honesty and retitling.

If a web analyst has the drive and desire to actually be a real scientist-practitioner, and their company isn't going to go there, then they have the duty to get out or STFU.

If a web analyst doesn't have the drive and desire, then I'd argue that we should retitle that segment of the industry as 'web reporting'. It's not worthy of the term 'analytics' at this point.

I think that vendors who are clearly in the business of web reporting need to come out and say, "we do web reporting", and that vendors who do analytics need to come out and say, "we do analytics, scientifically". That said, we need people who are honest enough and loud enough to call bullshit when a vendor is just that. If it gets nasty, so be it.

I'm convinced that there's a large market for real science - for real strategic value - for real actual learning. It's pent up and generally angry with the web analysts fighting each other.

Whether or not the same people who are in the industry now will be with me in five years is the next cause for debate.

(Monday Morning EDIT: Eric Peterson wrote "Are you ready for the coming revolution". Some of his sentiment is echoed in this post. Check out his post and white paper.)

Thursday, November 12, 2009

The Schism in Analytics, A response to Carrabis, Part I

I applaud Joseph Carrabis for writing "The Unfulfilled Promise of Online Analytics, Part 1".

You need to read it if you want in on the debate.

There's been a fundamental schism in analytics since the 1930's - between 'advertising' and 'marketing', really, as far as I can tell, since Hopkins died and was forgotten.

So when Joseph holds up a mirror to the web analytics industry of course it's going to be ugly.

Of course you're going to see a massive, gaping, pus-filled wound running diagonally across the face.

I'm not going to shoot the person holding the mirror. Neither should you.

And I'm not going to personalize the debate, either. I think we might be tempted to boil this down to a difference between two wildly successful authors.

It's more than two authors. They're just the latest incarnations of that schism.

How do we stitch the face back together?

Sunday, November 8, 2009

South Park and implications for Social Marketing

An episode of South Park called "The F-word" aired on Friday night in the US, and Saturday night in Canada.

Matt Stone and Trey Parker aired what many of us in social analytics knew already: the re-appropriation of the F-Word.

The word has a lot of history attached to it. I don't like the hateful connotation of the word myself.

I'm not using it in that connotation. Far from it. I can get past that history to discuss an important phenomenon and the implications.

So, if you're uncomfortable with the implications of the term - stop reading and move along. I'm stating, very clearly, that if you don't like the word - stop reading.

.
.
.
.
.
.
.

I'll start by bringing everybody onto the same page, and then I'll write, at length, about the implications for advertisers (Harley in particular) and the implications for social marketing.

The episode starts with a pretty huge insight: the damage that bikers do to communities. In effect, bikers produce a negative externality, the production of a massive amount of noise, to the detriment of all others. (Ever been on a patio in Toronto? Yup - South Park covers that too.)

You can watch the full episode at the link below and follow along. If you don't have the time, a summary follows.

If you want to watch on the Comedy Network, you're looking for South Park, Season 13, Episode 12.

http://www.southparktv.info/season-13/season-13-episode-12

In the beginning, the boys are enjoying a wonderful day outside.

Then they hear that loud noise of the motorcycle.

The motivations of bikers for making such noise are laid bare by Parker and Stone in the subsequent scene. It's all about the need of bikers for attention.

What then follows is the appropriation of the term 'fag' to bikers at 2:28 by Eric Cartman.

The bikers end up being even louder, which prompts the boys to hatch a plan. At 5:43 into the episode, Butters does a nice speech empathizing with the bikers. He's summarily dismissed and the boys plan to take dumps on the bikers' seats and to write "go away fags" in a number of places about town.

Predictably, at 8:20 into the episode, Big Gay Al and Mister Slave, two recurring and tolerant characters on the show, are walking down the street and see these huge words written everywhere.

Outrage ensues and a town meeting is held. The mayor is pissed.

Stan and Kyle take total credit for it.

At 9:20, they remark that the term 'fag' has nothing to do with 'gay people'. In effect, they don't see the relation at all between the term 'fag' and 'gay'.

At 9:40, the bikers retreat to the library and to the dictionary. Stone and Parker actually recite the long history of the word fag, and are clearly setting up the next huge act of the show.

At 11:40, the boys are pulled in front of a board, where the adults proceed to try to understand the term. Fag becomes known as "an inconsiderate douchebag".

The boys ask: "Don't you people keep up with today's lingo at all"? (I lolled.)

At 13:00 Big Gay Al goes into an organization to rally behind the boys, supporting the re-definition.

At 14:00 the mayor signs an ordinance making the term permissible and re-defined.

At 15:15, the mayor freaks out and calls the boys into her office, now. Predictably, the progressives are freaked out about the actual term within the dictionary as being pejorative to homosexuals.

At 16:00, the boys' solution is that they have to change the dictionary definition so they can be free to call Harley drivers 'fags'.

Predictably, at 16:50, the bikers decide to resort to violence.

At 17:10, South Park welcomes the head dictionary editor in an effort to convince them to change the term.

This sets up the climax.

In a delicious piece of esoteric satire, Emmanuel Lewis, as the dictionary editor ("What Choo Talkin' 'bout Lewis?"), comes to town. The bikers show up to beat the crap out of everybody, including Emmanuel Lewis.

At 21:30, Stone and Parker make a direct appeal to the audience, breaking the fourth wall, calling upon all children to call all bikers faggots across America. At 22:00, the editor of the dictionary declares the definition.

Kyle walks forward, breaking the fourth wall again, and says, "this day, we've made history".

The new definition of the word 'fag' appears on the screen:

"1. An extremely annoying, inconsiderate person most commonly associated with a Harley rider. 2. A person who owns or frequently rides a Harley"

The episode cuts to the familiar credits.


Implications for the Harley Davidson Brand

This is a complete, unmitigated disaster, for Harley Davidson.

South Park, while on the decline with the general public since 1999, has a very strong influence on Millennial through to late Gen X males. It has one of the biggest groups on Facebook (nearly 3 million fans) at the time of writing.

(At the time of writing, Harley has a share price of $25.73).

South Park has a major impact on language. Terms such as "derp" and "three fiddy" continue to be used by this self-referential market segment - years and years after their introduction.

While Harley's current demo is very much geared towards older males, the tainting of the Harley customer is total and complete. Because Parker and Stone started off with a massive insight, that the behavior of Harley drivers is inconsiderate to the extreme, the truth resonates throughout the episode. In fact, the lingering fear that people will be beaten up by bikers is directly referenced in the episode itself. The smearing is made complete with Butters becoming the voice of empathy. (A regular watcher of the show will understand why that, unto itself, is devastating.)

The second implication is that Parker and Stone are calling on children to actively socially denigrate bikers. Having watched 11 year olds shout "giggity" at the top of their lungs, I can say with some degree of confidence that this could really happen.

Parker and Stone are actively trying to introduce a negative externality into the experience of riding a Harley. They seek to return negative attention and return the favor. This might have their desired impact of making the experience of riding a Harley socially unacceptable, and so allowing all of us to enjoy the peace.

This is a deliberate social experiment directed at the heart of the Harley Brand.

I have good reason to believe that it will be effective in some quarters.


Implications for Language

Marketers really like tag lines that stick out in the head (it helps message recall).

"Where's the Beef" is one example. "I don't always drink beer, but when I do, I prefer Dos Equis" is another.

If you're going to be a social marketer, you have to understand language and context.

What's interesting here is that it's the first time, at least to my knowledge, that the producers of a show have actually tried to force the re-definition of a word.

It's this transposition of a hate word from one group to another that is particularly interesting.

This is an important case study.

Yes - it's just a show. Yes - it's such a hateful word. Yes - there's a lot wrong with it.

I'm incredibly interested in how this experiment goes down. What is the impact of a carefully seeded message, hammered away over a period of 22 minutes, with some 2 to 3 million people watching?

Will the definition actually shift, or will it go down as a failure?

And if so, with such high penetration amongst such a dedicated group of people, what are the implications for social marketing?

The choice of subject matter is unfortunate. I didn't write the experiment and I don't necessarily support it.

Serious social analysts need to sit up and take note.

Monday, November 2, 2009

Analytics as a source of competitive advantage and the Medium

What should the outputs of an analytics program be?

I'll argue the outputs should include: profit, evidence (historical), clarified-concrete-measurable goals, decision support (scenario analysis), forecasts, customer intelligence, and competitive intelligence: all resulting in an expanding base of knowledge and competitive advantage.

I didn't mention mediums in that description.

The medium is both the message and the problem preventing most programs from becoming a source of competitive advantage for their organizations.

We've got some pretty big problems with the mediums.

There's a huge amount of work that's going into moving data from multiple systems into a single system: a process called aggregation.

What's the most common aggregation point?

A spreadsheet. It's Excel.

It's not a very good solution. Usually it isn't semi-automated or effectively QA'd.

I've pushed Excel beyond its natural limit. Even when it's made very pretty and functional it is at best a stop-gap solution.

Excel should be to an analytics practitioner as Visio is to an information architect.

It can be a useful tool to express the model, view, and controller to a tech: but it shouldn't be the platform. It shouldn't be the principal medium through which a practitioner communicates with a huge audience.

I'm expanding on that word: communicates.

Have you ever watched somebody open up an Excel spreadsheet? Have you ever watched them consume the data on the page?

Just watch them.

Chances are you see a heavy sigh and a whole lot of squinting. You'll also note that the time spent with the page will vary from just a few seconds to a few minutes (at most).

I've watched others take the spreadsheet and run sums and functions on the data. They're effectively torturing it themselves to make the spreadsheet talk. They're trying to learn something from the data.

Spreadsheets don't teach on their own.

I've done it myself. Once I'm satisfied with understanding what is going on, I immediately jump to finding out why it is going on and how I can improve it.

In figuring out that why, my first instinct is to go to the go-to people and start asking questions. Frequently an hour's worth of talking can save a week's worth of digging. Sometimes I need more data - but I know how to phrase a query.

I can report that phrasing a query can be very hard to do and it's seldom done really well.

The memegenerator below demonstrates what happens next.


Humans have a hard time guessing how long it takes to put something together. If it only took somebody a minute to read something - then it must have only taken the analyst ten minutes to put together. (right?)

Excel sheets within any organization proliferate (how secure are all those sheets, anyway?). The result is the perception that they're cheap. And if they're cheap, the demand for MOAR comes fast and furious.

But that's the wrong medium. They might scream for more dashboards - but a dashboard can't possibly answer a complex query or tackle a complex problem.

A dashboard doesn't tell anybody why something is happening. It tells them what is going on.

It offers a very small incremental competitive advantage. Sure, they might know MOAR WHAT is going on, but better decisions are made using causality, not knowledge of past state alone.

Analytics becomes a source of spreadsheets instead of a source of competitive advantage.

If Excel is not the right Medium - what is?

I'm arguing that the medium ought to match the objective.

If you are replying to a complex query, a presentation - or dare I even suggest it - an animation/video through visualization ought to be the right medium (GGOBI is free). A complex question typically results in a simple answer that needs a long explanation to have face validity.

For instance - if the CFO were to ask the web analyst "what are the traits of our most valuable customers?" - the answer might be simple: "People who buy often and say good things about our products". The story explaining why is just that: a story.

You don't use a spreadsheet for that.

If you are asked for ongoing data so that a manager knows what is happening with their section of the website - then it ought to be a tight dashboard based on clear business goals: tied as closely to how that manager is bonused as possible. That dashboard ought to be in a web based format that is designed for dashboarding. Ideally, the dashboard would have a function that enables self-exploration. (Enter the world of visualization and democratic access.)

There's another reason too:

Inquiries for WHY something is going on naturally lead to demands for MOAR metrics to be added. This means that what is born as a small dashboard of 10 metrics grows into a 111 metric disgrace of a report.

The medium of an Excel spreadsheet is simply incapable of keeping up with the wave of human curiosity that is aroused from seeing the surface of the data.

If you believe that the principal output of an analytics program is just data: then the MOAR cycle of metrics is not a problem for you. That's just fine and we have nothing to talk about.

If you believe that the principal output of an analytics program is competitive advantage for a firm: the mediums we use as practitioners must shift. If you agree with me - then we have a lot to talk about.

Tuesday, October 27, 2009

How Communities Learn

Most communities have jargon. Buried within that jargon are all the biases, beliefs and worldview that are held by that community. (This can be referred to as a paradigm.)

The web analytics community is no exception.

The terms 'analytics', 'optimization', 'engagement', 'unique', 'pageviews', 'funnel', 'A/B Test Split Test', 'personalization', and 'filter' all have their own baggage that anybody outside the community might not fully understand.

Sometimes people get into disagreements over definitions in an effort to gain specificity. These activities are really quite important. An outsider might be mystified by why such disagreements become so heated. That's because sometimes the real fight is over the paradigm or some feature of the paradigm.

(For instance, the fighting over the term 'unique' was much more about the tension between accuracy and understandability.)

These shifts are indicative of learning. I'm finally backed up on this whole hypothesis that language and learning are inextricably linked by Bickerton, 1995.

For instance: Google and Web Analytics. Let's set aside the disruptiveness of FREE for now (besides, it isn't free, because time has a cost), and turn to some of the new words.

"Filter" is one of them. It's about three years old now and is really just a proxy for the word 'dimension', which is a data warehousing / data modeling term. Now we use the term 'custom segment' instead.

Two new terms: "possible causal factor" and "statistically significant", have been recently introduced. I welcome the additions to the Google Analytics product.

Usually I need to export a large amount of data out of tools like Google, reformat them, and then load them into SPSS to look for 'possible causal factors' that are 'statistically significant'.

Now Google promises to democratize that process for all those who don't have SPSS, or know how to use such tools.

There's going to be a learning cycle where somebody will have to explain the difference between sampling error and Type I and Type II errors, and between a confidence interval and a confidence level. If we're indeed merging with the Business Intelligence and Data Mining communities - we'll need to learn a harmonized language. It might as well be the right language.
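
To make the Type I error idea concrete, here is a minimal Python sketch. It is not what Google Analytics or SPSS actually runs: it assumes a pooled two-proportion z-test and made-up traffic numbers, and the function names `two_proportion_p_value` and `false_positive_rate` are hypothetical. It simulates A/A tests, where both pages are identical, and counts how often the result still looks 'statistically significant'.

```python
import math
import random
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(true_rate=0.10, visits=500, trials=1000, alpha=0.05, seed=1):
    """Share of simulated A/A tests that come out 'significant' anyway."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Both arms share the same true conversion rate (assumed numbers).
        conv_a = sum(rng.random() < true_rate for _ in range(visits))
        conv_b = sum(rng.random() < true_rate for _ in range(visits))
        if two_proportion_p_value(conv_a, visits, conv_b, visits) < alpha:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print("A/A tests flagged as significant:", false_positive_rate())
```

At a 95% confidence level (alpha of 0.05), roughly one A/A test in twenty should be flagged by chance alone. That share is the Type I error rate, and it is a different thing from the sampling error around any single estimate.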

The process of how this community will learn will be contentious and heavily based in definitions.

We can be fairly certain that some vendor will try to relabel a word like 'confidence' in an effort to get first mover status. Somebody will mislabel 'correlation' as 'causality'. And I'm fairly certain that we're going to spend two or three years undoing the damage.

This is basically how communities learn. Through jargon and discussion about what the underlying terms mean.

Enjoy!

Thursday, October 22, 2009

Last Day At Critical Mass

Friday will be my last day at Critical Mass.

There are implications for what is written in this space.

For one, there was a large body of material that I simply self-censored. This will change.

For two, I'm anticipating that post volume will go up, at least in the short-run until I'm gradually consumed by this next role.

Much doesn't change.

You can expect the same length of posts and the occasional rants and use of images.

The subject matter will probably continue to focus on the meta and larger social issues around analytics.

Most of the relentless plugs for the Web Analytics Association, Web Analytics Wednesday, and TDMF will persist.

As for the next challenge:






I'm pretty excited.

My twitter is of course @cjpberry and if you need to get in touch, you know how to at me.

Thursday, October 15, 2009

Survey Methods and On-Line and Off-Line Thinking

I'm on the final chapter of what has been a very difficult read: "Language and Human Behavior" by Bickerton.

He tackles some very difficult concepts in a clear cut way, with frequent deep dives into certain pockets of goodness. It's a hard read because it's very dense, and perhaps I'm not horribly familiar with the subject matter.

The material in there about consciousness and the notions of On-Line thinking and Off-Line thinking are driving this post. I haven't figured out a way of expressing the differences in one paragraph or less without Bickerton finding out and reaming me out for getting it not quite right.

Into the meat of the post:

I frequently draw the line between observed behavior and reported behavior. One of the reasons for my caution with online satisfaction surveys is because it's reported by the user and frequently involves some form of prospection.

In an obscure reference, the Canadian Election Study, if taken at face value, would predict voter turnout several percentage points higher than it actually is at the ballot box.

That is to say, the survey predicts, based on the questions "Will you vote" and the post-survey "Did you vote" - a much higher rate of turnout than what really happened.

So, is the opt-in sample skewed (a person who is likely to fill out a massive survey about politics is naturally more inclined to vote anyway) or are people just very bad at prospection? (I told what I believed was the truth: I will vote. But the odds of me actually going to vote on voting day will be low.)

Or - did the survey actually raise some form of awareness in the person and make it more likely that they would actually vote, so that the self-reported voting rate happens to agree with what actually happened at the ballot box? (i.e. they're telling the truth about their turnout.)

I've frequently argued, quite unsuccessfully I might add, that a survey is unto itself a form of user experience that impacts perception. An on-site survey is one of the few ways that people can actually communicate with a company. After years of combing through comments and applying longtail analysis it becomes readily clear that a comments box is some sort of a cross between a help-desk box and an invitation to engage in 4Chan anonymous behavior.

Customers frequently see companies as being monolithic. Why wouldn't they? And why shouldn't they expect a survey to be some form of vital communication instead of a research tool to make things better? Customers don't care. And I happen to agree with them.

It's for this reason that 'voice of the customer' online survey software is to be treated as a proxy for the truth and not as gospel. It has uses, to be sure, but it should be handled with care. The feedback contained within the survey is valid, and if the survey is constant over time - it can be used as a KPI. It has 'internal validity', but I'd become really uncomfortable about taking a sample size of 1000 and asking them "will you buy this product" and applying that rate against all visitation to the website. At least you're not guessing. (And we don't guess). But it is very dirty.

It's something.

I wouldn't bet the farm on a survey though.
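
To put a number on how dirty that extrapolation is, here is a minimal Python sketch. It assumes a simple random sample and a normal-approximation interval, all the figures are made up, and the function name `projected_buyers` is hypothetical. It takes the "will you buy this product" rate from a 1,000-person survey and applies it, with its confidence interval, to total site visits.

```python
import math
from statistics import NormalDist


def projected_buyers(yes_answers, sample_size, total_visits, confidence_level=0.95):
    """Return (low, point, high) projected buyers from a survey proportion.

    The confidence level is how often intervals built this way would cover
    the true rate; the confidence interval is the resulting low-to-high range.
    """
    rate = yes_answers / sample_size
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * math.sqrt(rate * (1 - rate) / sample_size)
    low = max(0.0, rate - margin)
    high = min(1.0, rate + margin)
    return low * total_visits, rate * total_visits, high * total_visits


if __name__ == "__main__":
    # Assumed numbers: 120 of 1,000 respondents say yes; 50,000 visits.
    low, point, high = projected_buyers(120, 1000, 50_000)
    print(f"Projected buyers: {point:.0f} (between {low:.0f} and {high:.0f})")
```

The interval only covers sampling error. It says nothing about who chose to answer the survey or how good people are at predicting their own behavior, and that is the dirty part.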

The best feedback is observed. If you want to know what people really think and how they really feel - you should focus on watching them.

So - to tie this on back to Bickerton:

I prefer recording observed behavior because the user remains in a state of On-Line thinking. To borrow from physics: I'm not changing the position of an electron by measuring its speed.

Surveys have their place, to be sure, but they're inferior compared to other methodologies.

Tuesday, October 6, 2009

Creativity and Web Analytics

There's a review up on the Web Analytics Association's website on modeling the determinants of creativity in advertising. I think Smith and MacKenzie et al did a good job on the paper.

The term 'creative' is completely loaded. After all, isn't it all subjective?

In our defense, even as web analysts, we try to quantify the subjective all the time. The feeling thermometer and the probability map are two ways that we've tried to quantify feelings and prospection. Even the concept of satisfaction, when operationalized through a survey methodology, is subjective.

Just because a concept is subjective doesn't mean that we throw up our hands and walk away. Rather, we should be always trying to improve how we ask and derive methods for linking perception with observed behavior. The denial (or ignorance) of this link between reported and observed behavior continues to generally plague the #measure community. (There's a bully in the community that I won't call out. Yet.)

The impact of creative on conversion is typically only spoken about in the context of an A/B test, and very frequently, only within a kind of spitting criticism. The oft-verbalized criticism of Google and their "testing of 140 shades of blue" methodology is one example.

Isn't creative more than just the color of a button or text though?

I think so. And so do Smith and MacKenzie. They break creativity out into 'divergence' and 'relevance'.

Let's tackle 'divergence' first.

Divergence means 'standing out'. What makes an ad stand out from the clutter?

I'd argue that it's the same thing that causes a punchline of a joke to be funny. Something unexpected. It's something that is at least two, maybe three standard deviations from the mean. Something that stands out from the crowd is spiky in nature. I'd argue that making something spiky is a creative process.
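
One rough way to picture 'spiky' in numbers is a z-score. The Python sketch below is only an illustration, not Smith and MacKenzie's measure of divergence: the scores are made up, the function name `spiky` is hypothetical, and the cut-off of two standard deviations is simply the figure mentioned above.

```python
from statistics import mean, pstdev


def spiky(scores, threshold=2.0):
    """Return the scores at least `threshold` standard deviations from the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return []  # everything is identical, so nothing stands out
    return [s for s in scores if abs(s - mu) / sigma >= threshold]


if __name__ == "__main__":
    # Assumed attention scores for ten ads; only the last one is unusual.
    ad_scores = [1, 1, 1, 1, 1, 1, 1, 1, 1, 10]
    print(spiky(ad_scores))
```

Nine ads that look alike pull the mean toward themselves, so the one that differs sits far out in the tail. That is the clutter and the spike.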

Relevance means 'of interest to me at this point in time'. It's another way of saying "right message to the right customer at the right time". We might well have successful delivery of said message, but if there's no divergence, that message will totally get lost in the clutter.

I'm arguing that there's an opportunity here to use web analytics as a force for good in the creative world. It wouldn't be a pursuit of sucking the 'fun' and 'creative license' out of the creative process. Quite to the contrary.

Rather, maybe we could incorporate creativity into our predictive and explanatory models - or at least consider and properly value proper creative.

Friday, October 2, 2009

NeuroCognitivePsychoLingualAnthropology

I read the first 120 pages of Joseph Carrabis’ new book “Reading Virtual Minds Volume 1” last night and polished it off this morning while sitting at the airport.

The book certainly forced me to think about being really aware of being aware of how hard I was thinking. I was engaged the whole way through, and in the end, I asked “holy shit, what just happened there?”

I spent the better part of the night dreaming about it (always a sign that something upstairs is getting restructured).

I’ll write about the experience without spoiling it for you.

Joseph tells the story about how NeuroCognitivePsychoLingualAnthropology came to be. In spite of how long that word is, the book is very accessible, readable, useful, and intensely personal. The love leaps off many pages. (And one page where the middle finger literally leaps off the page. It’s not directed at the reader and it’s refreshingly honest.)

I’m taking away more than a few things that’ll become part of my every day speech.

The first is how NextStage’s machine runs. Joseph explains the principles of how it works specifically and uses accessible metaphors to expand. Those with an appreciation for collective intelligence and algorithm design will want to pay attention to how he explains it: it’s superior.

The second relates to political science and some of the social ills (suppressed political participation) that a good colleague has been trying to understand for the better part of a decade. There are applications of the technology that could explain what we think we’re seeing in the Canadian Election Study (CES). While I hope that Elections Canada and SSHRC continue to fund the CES, NextStage offers a method of predicting a breakout election and perhaps a compelling explanation for turnout suppression. I haven’t been more inspired since reading “How Institutions Evolve”.

The third goes to marketing. It’s generally accepted that people think differently. But how differently? And do those differences matter? And if so in which contexts? The book gives a concrete example of how much and how it matters to marketers. The notion of intensity channels is a useful and accessible schema for quantifying those differences and acting upon them.

The fourth goes to changes to how we define experience design. On this point, you really need to read the book for yourself.

The next three takeaways are far more personal.

The first deals with a preference of mediums. One NextStage dimension is ‘visual’, and it explains a lot about me. I’d sooner go over to somebody’s desk and talk before writing an email before sending a text message before picking up the phone. In that preference order. And this includes literally hunting somebody down in a large office to find them in person. If the person is remote, I’d much rather use email. I’m that visual. So even if that means looking at a digital signal, composed entirely of words with no tone: at least I can see the shapes of the words and the patterns. Thankfully the world is coming around with video chat.

The second deals with being intuitive and filters. Thankfully, Joseph uses as much common vocabulary as possible. We all know what we’re talking about when it comes to filters. There’s a reason why it’s acceptable to fart in certain social situations and it’s utterly unacceptable to so much as speak a run-on sentence in another: even though they're both forms of passing gas. There’s a certain degree of self-awareness that goes with it: that a big part of understanding how others are reacting also involves the kinds of signals that you’re giving off.

The third will be the subject of future blog posts.

Just go get the book. It's a very good read and most of the people I know who read this space will find it valuable.

Sunday, September 27, 2009

Firm Created Word Of Mouth

Jim Novo wrote a Web Analytics Association Review on Firm Created Word Of Mouth.

I strongly recommend the read.

Although the paper was published in the most recent edition of Marketing Science, it was based on findings that span four decades.

The first finding reaffirms the 'strength of weak links' hypothesis. Let me explain:

Like people tend to clump, alike.

Among my friends, more than half own iPhones with occupations centering on technology and the Internet and most have roles that are heavily steeped in data. Three quarters would be classified by Forrester as being Tech Optimists and Creators. A majority live in the inner city.

Not everybody in my circle is uniformly this way: I used the words 'more than half' for a reason, but you get the picture: like people clump alike.

I'm linked to a number of other communities by way of acquaintances. Lots and lots of acquaintances! Everybody is.

Very few in my circle of friends would ask me if I would recommend Google Analytics over Omniture. They already know enough to have their own opinion.

However: an acquaintance who runs a company might ask me how they would know if their money is well spent - and my response: that recommendation - one that travels along a weak link, would carry more weight.

A weak link is one between acquaintances that spans two heterogeneous communities. A strong link is one that spans between friends in a homogeneous community. A recommendation made over a weak link is stronger than one carried over a strong link.

We use the words 'forward to a friend' all the time in social. In all reality, forward to a friend isn't what you want to drive product adoption. To drive frequency of purchase - yes. But in most cases (there are exceptions!) not product adoption. The 'forward to an acquaintance' action is the one you really want to happen.

In spite of this, we as marketers continue to use the term 'forward to a friend' because that's the base call to action. This entire weak-ties finding is treated as a footnote.

In all, Jim Novo's review is a good read. It's valuable.

Sunday, September 20, 2009

Conference Ecosystem

I had the good fortune to talk with Stephane Hamel, a director of the Web Analytics Association, and Andrea Hadley of eMetrics last Friday while at IMC in Vancouver.

As usual with any conference - the really interesting conversations happen in the lobby during the day.

Andrea, being the super-networker she is, got me into talking with Stephane about the Research Committee, and fast tracking was to be had. We also talked about the diverse audiences involved in any industry, and how to try to serve each group really well at a conference.

There are experts, newcomers, and vendors/consultants. Vendors want to sell, newcomers want to learn, and experts want to talk to each other and recruit talent. eMetrics is experimenting with different formats to serve all three groups. In general, vendors don't really care if the experts screw off to the lobby. They know Vendor X really well and they're not the target. Newcomers are the target. Most newcomers are looking for solutions to the new problems they've just been handed (typically a web analytics login and password!).

In the end, I think all three audiences need to be served by the same set of conferences. There should always be a vendor component and a welcoming environment for the newcomers. eMetrics might want to consider inviting cross-over experts in other fields to share practices and try to get the water as brackish as possible.

Wednesday, September 9, 2009

Fraught with Skepticism

I'll confess that one of my favourite lolcats is Skeptical Cat is Fraught with Skepticism.


Look deep into that expression. The cat really does look skeptical, doesn't he? He's not believing a single word you're thinking right now.

There's also something about that orange background that makes the expression and the entire image that much funnier. I don't know what it is about it. But I'm aware of the effect. I think anthropologists have a term for the tendency of humans to superimpose human emotions onto animals - which there is no evidence that an animal actually feels. I can't remember the term, but it's funny as hell that we all do it.

The reason I bring this all up is that I deal with skepticism all the time in my line of work as a web analyst and marketing scientist. Sometimes the data disproves long held assumptions about aggregate user behavior and attitudes - which causes skepticism. This rejection of your own hypothesis is something a scientist isn't troubled about. For non-scientists it can cause all sorts of personal problems. Yet, even with empirical proof, skepticism can persist. They don't say it out loud, but you can see it in their faces. They're a skeptical cat. And they're fraught. With skepticism.

I'm very lucky in my career so far. I manage to manage a team of people and multiple programs while still being a practitioner. I still use statistics to make sense of data. When I'm talking to the consumers of the insights and findings that come out of this work, I don't talk about regressions and the origins - I translate the results. I make the results actionable and the insights relevant to their interests.

The consumers of these insights don't need to understand the nitty gritty of sampling error or SPSS syntax. Could you imagine trying to explain all that? "It all started off with a gambler in the 1600's.... and then Laplace....and then just before prohibition they wanted to put statistics on the table and scientific advertising....". Ha!

So it goes with this NextStage skepticism.

Some of the greatest inventions in the world are discovered by combining two unusual technologies and deleting anything that is unnecessary. Invention typically requires strong lateral thinking.

Two years ago I was deeply troubled by the "optimization cycle problem" of web analytics, and reckoned that if physical technology got us into this, it could get us out. It led down a number of wonderful research rat holes and inquiries.

I got introduced to Joseph Carrabis by way of June Li at eMetrics Toronto in March 2008. What prompted the introduction? I was inquiring about behavioral economics (to be precise, a sub-field called "evolutionary programming" or "collective intelligence") as applied to website morphing. (At the time I was referring to a vague Dutch test of the technology on a medical website. The test failed because they were morphing the nav bar. Website Morphing was updated by Hauser et al. the next year.) I was trying to learn how to combine these two technologies. June Li figured she had to introduce me to Joseph.

Joseph announced his patent the next day.

Our second conversation pretty much extinguished any lingering skepticism I had. (And the third conversation was probably more unpleasant for everybody involved as I expressed concern about the usage of the technology.) So what happened between the first conversation, the second, and the third?

First, a herd of intelligent people were not skeptical. He has social validity.

Second, if he couldn't think laterally - odds were that he was a phony. He was a strong lateral thinker and smart to boot. He has personal validity.

Third, he knew things about EP that nobody who was faking it could have known off the cuff. Only somebody who had experienced the data and thought very hard about the problems could have known about it. He answered these needling questions in a manner that had very, very strong face validity.

Fourth, the patent. This is very strong external validity.

NextStage technology incorporates multiple areas and technologies while deleting the extraneous bits. There's neuroscience, anthropology, statistics, language psychology, behavioral economics (EP/CI) and computer science (specifically: computability). It blends together nicely because it is informed by a central theory that is explanatory and powerful.

Since then, much of what Joseph has demonstrated over the past 18 months meshes with what hasn't been published (yet - papers have been presented) in the field of decision neuroscience - an emerging sub-discipline in Marketing Science. I have no reason to challenge the veracity of NextStage's claims because every esoteric challenge has always been met. Their claims have been independently verified and I consider the matter closed. They really invented a new technology. Yes, it can happen.

Back to skepticism.

I don't believe that 90% of the market researchers and 70% of the data miners I've met have ever read or understood Laplace - and yet that doesn't keep any of those people from using statistics to make judgments about vast numbers of people. Most practitioners don't have degrees in mathematical science. Are any of these people who practice linear regression on a daily basis ever skeptical of the science that underlies it?

There's a baseline amount of verification that people need to do on their own. A smell check or a sniff test. Everybody will need their own source of validity. I have mine.

In sum:

You don't have to understand how the cow turns grass into milk to drink the milk.

You don't have to understand how the cat makes that purring sound to enjoy it.

You don't have to understand the mechanism in your mind that makes skeptical cat look skeptical to laugh at the joke or to understand NextStage. Skeptical cat looks skeptical and NextStage technology works. It's as simple as that.

Tuesday, September 1, 2009

You don't need to understand it to use it

At the Marketing Science Conference earlier in the summer, Shaina and I took in the Neuromarketing session. The session was very good, with 3 really great presenters out of the 4.

I learned several important reasons why people make the choices they do. For instance, I saw it empirically proven that self-control is like a muscle: you can hold a certain pose for so long, and then that muscle gets fatigued and weak. Then you can't hold it any longer and you break that pose. It's a very attractive causal variable for periodic consumption and lapses in self-control. Prospection - the ability to think of the future - when combined with anchor-and-adjust tendencies, causes discount-rate curves to deviate from what classical economists would predict. This has important implications for trial-bonus marketing.

So many breakthroughs in behavioral economics and marketing science are originating from neuroscience. It's exciting.

Meanwhile, the great people over at Next Stage Evolution are making advancements in making neuromarketing accessible. You don't need a degree in neuroscience to use their technology to make things better and get benefits.

Take, for instance, web analytics. Right now, in a majority of Fortune 500 and Fortune 5,000,000 companies, managers and web analysts are making decisions on how to improve their site based on numbers most of them assume they really understand. In fact, it's a relatively recent development that the Web Analytics Association has worked with vendors to really define what most of those numbers really mean. The most misunderstood include "unique visitor", "time spent on site", and the actual definition of what a "bounce rate" really is. It doesn't matter. They don't have to understand to know that more unique visitors is 'good', time spent on site is 'interesting', and a high 'bounce rate' is generally 'bad' - so long as there's some baseline education and comfort.

You don't have to understand how an airplane flies to be a passenger on one, any more than you need to know how a light bulb really works to derive benefit from it.

There's always going to be skepticism and fear whenever a new technology comes about and it starts to be adopted. Electricity and soap were once feared. Flying was too. We're starting to see some of that around web analytics this year.

That's not to say that people who aren't curious about how neuromarketing is done shouldn't explore and ask. The curious should.

So, when I assert that you should offer visitors with browsing pattern X an offer of $50 paid out in 3 months if they sign up now and visitors with browsing pattern Y an offer of $20 instantly if they sign up now, you might challenge that. Good. Then I'll explain for 15 minutes about the hyperbolic discount rate curve and prospection tendencies of different browse paths. Then, if I've done my job correctly, you'll trust the science just as you would trust a pilot or trust a light bulb. Abstraction can be a wonderful thing. Getting there is a longer process.
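To make that 15-minute explanation a little more concrete, here is a minimal sketch of the arithmetic behind the two offers. It assumes the common one-parameter hyperbolic form V = A / (1 + kD); the function names and the example values of k are my own illustrative assumptions, and how a browsing pattern would actually map to a value of k is not something this post specifies.

```python
def hyperbolic_value(amount, delay_months, k):
    """Perceived present value under hyperbolic discounting: V = A / (1 + k * D).

    k is a per-month discount parameter (hypothetical values used below).
    """
    return amount / (1.0 + k * delay_months)


def preferred_offer(k):
    """Compare $50 paid in 3 months against $20 paid instantly for a given k."""
    later = hyperbolic_value(50.0, 3, k)
    now = hyperbolic_value(20.0, 0, k)
    return "50_in_3_months" if later > now else "20_now"


# A patient visitor (small k) values the delayed $50 more highly;
# an impatient visitor (large k) prefers the instant $20.
for k in (0.1, 1.0):
    print(k, round(hyperbolic_value(50.0, 3, k), 2), preferred_offer(k))
```

Under this form the two offers are worth the same when 50 / (1 + 3k) = 20, which is k = 0.5 per month. Visitors whose behavior suggests a k below that point would get the $50 offer, and those above it the $20 offer.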

You don't need to understand it all to use it.

Wednesday, August 26, 2009

Attention

Considerable effort is going into quantifying the degree to which people are paying attention to a medium.

This is a big deal.

Consider how many screens your average Gen Y'er is engaged with, simultaneously, on a Monday night. They could be watching videos on YouTube while watching MTV while tweeting their friends on their iPhone.

There are reinforcing mechanisms here. For instance, getting hit with Stella Artois The Life Legere commercials both on the Comedy Network during a commercial break while getting hit with it on a pre-roll from the Onion News Network. Lately those commercials have been appearing on the fourth screen - the movie theater - during the pre-roll.

Seniors are up to it too: reading the newspaper while listening to the radio while having Wolf Blitzer turned up loudly in the other room. No, seriously, it happens.

Attention is precious. In any given minute, there's only 30 million minutes worth of attention to be had in Canada: and only 4/5th to 3/4 of the time too (we do sleep). There's only 290 million or so minutes worth in North America. (I'm discounting infants here...)

What was special during the 1890's, before mass live media, was that people could rip out a coupon and take action on an offer at a later point. I think something strange happened with measurement during the era of the movie (post 1914?), the radio, and ultimately the TV. For a while there, during the era of 3 TV networks, advertisers could get the attention of 80% of the people for a full minute. Attribution may have been easier when attention was so concentrated.

Attention is more fragmented not only because of the Internet, but because of the growing number of places where the Internet is available. At the same time, the Internet medium is more instant than a newspaper or a TV.

It's the 1890's on meth.

It's harder than ever for advertisers to get your attention. And it's more important than ever to measure whether or not you're even breaking through the noise in the first place.

Attention is useful as a predictor for the only dependent variable that ever really matters: profit.

It's well worth the effort in trying to quantify and optimize against it.

Wednesday, August 12, 2009

The Common Data Set

If you've been attending the Web Analytics Association Research Committee calls, you'll know that I've been troubled by this question of a 'common data set'.

As it is right now, data that is common, clean, and relevant to web analytics is rare. To be sure, there are heaps of open source log files (I believe the Wiki Foundation made 5 terabytes available for download awhile back), but in terms of there being some manageable CSV file out there - it's pretty rare.

Such a dataset is pretty useful from a few perspectives.

For one, it would enable researchers within our community to use a verifiable data source when making assertions about the importance of different metrics. I'm dissatisfied with what I can demonstrate here: only 'theory and definitions'. I'm certain that several other people are too. A common data set which people could go at and demonstrate their ideas would be invaluable from a community perspective.

I think it would also reduce the bullshit quotient quite a bit too.

For two, it would give directors and managers of analytics talent a great way to evaluate prospective talent using real data. Practitioners can't share a lot of their work. Almost always it is subject to a number of NDA's. (And with good reason).

For three, it would offer a verifiable way to test general claims. That's the root of real community science. Verifying results and allowing other people to confirm those claims.

What's the problem? Why isn't there such a data source now?

First, no company would want to publish current, relevant data. It would allow enterprising competition to take advantage of them. Secondly, the question of format comes to mind: most analytics is hosted on third party systems, and free logfile readers that are 'good' are vanishingly rare.

There are solution spaces I can think of.

Somebody might want to buy the books and databases of a bankrupt company. Somebody might want to leave the site hosted and make all the (non-personally identifiable) information freely available so the community of web analysts could conduct an autopsy on it. We'd have to make the web analytics vendor logins all available (which would be a stretch, but not insurmountable). Though, truth be told, the thought of it being my bankrupt company - out there all exposed - makes my blood go cold. But you never know. Somebody might be into that.

There's also the great guys at Quantcast. While they don't have clickstream data, they do make available an awful lot of data on their website for free. Problem is, of course, that it's not all nicely ETL'd into a format that is convenient and distributable within the community.

I wouldn't blog about it if I didn't think that it's an opportunity for us all to do some good and advance the practice. And it's a practice that is worth advancing.

Monday, August 3, 2009

IMC Vancouver: Social Media for Business

I'm looking forward to paneling at IMC Vancouver on September 18th. The topic is social media for business. Naturally, I'll be talking about social media analytics.

The intersection of business strategy, quantitative methods, and online word of mouth (social media) has the capacity to be really powerful in the hands of somebody who understands that it's actually a medium. There's a message, there's a response, there's measurement of that response, there's an opportunity to improve upon the next message. It's also like any other medium too. You got to pay to play. It isn't free.

It isn't free for your customers either. Social media still takes time and attention: and that's still a cost. But the difference is that it's never been more convenient for customers to rave or to rant about your company and your brand.

In 1995, a customer needed to know HTML or BBS software if they wanted to rant and rave online. It didn't keep them from using good old fashioned open-your-yap and tell your friends word of mouth marketing. It wasn't convenient to do so online is all. In 2009, you can tell your friends on Twitter in a scant 140 characters. You can get your Twitter hooked up to your Facebook and annoy hundreds more people with those updates. It's becoming almost too convenient.

Every marketing medium is somewhat measurable in some way. I'll talk about practical ways of measuring this 'new' medium.

Tuesday, July 28, 2009

Path Analysis and Journey Analysis

Jose reviewed an interesting journal article: "Path Data in Marketing". You can read it at that link, at the Web Analytics Association's Research Committee's Peer Review Journal Project.

And, it got me into thinking more about path analysis.

To a web analyst, traditionally, a path analysis is examining a sequence of pages that were viewed. Back in the nineties we used to call such analysis 'threading': and we always chose to examine pages and the sequence.

Threading was computationally expensive during the nineties, when volume was low, and it continues to be very computationally heavy for vendors to this day - even with improved algorithms: the volume hurts.

How much does Page Path Analysis (Maddeningly: even when we use the word "click path analysis", we don't really mean 'click', we mean 'pageload'.) really tell us these days anyway?

On highly interactive sites: not much. One of the most common behaviors on websites is the single page visit. On this point, we treat all 'bounces' like a fail, when in reality, I think there needs to be a differentiation between a 'bounce rate' and a 'reject rate'.

It's entirely possible for a user to visit a website, engage with the copy, and leave to research more (a single page visit). To the web analyst, that single visit is a fail. Granted, a more seasoned web analyst would have a filter to differentiate return visitor conversion from first visitor conversion. If they could get that filter or that dimension cut. (Not to trivialize it: but it IS hard in some organizations).

Alright, so this, naturally, goes back to what's wrong with the 'time on site' metric too. A single page visit, with 30 minutes worth of active engagement, would still be treated by most web analytics tools as being a bounce. If you believe what you read in some books, you're told that a bounce is a fail all the time. It is not a fail all the time.

If you're a blogger, for instance, you want to know that people came, they read, they screwed off. I'm not hurt if people don't want to check out my other posts. My other posts are not relevant to everybody's interests. I'm successful if people are still skimming by the last paragraph. At present, I can't see that. Some bloggers want you to click on their ads, because that's how they get paid, and so, that's a different form of path, isn't it?

When we only hit the web analytics server with a pageview, then all we're measuring is pages. And there's more to the path than pages.

There are specific actions that a user does that could be very indicative of interest and/or success. For instance, if a user lands on a page, and leaves before a 7 second load flag has been fired by a piece of JavaScript, then we can treat that as a rejection. If a user lands on a page and scrolls slowly, the odds are good that they're reading. If the mouse activity involves heavy highlighting, or clicks on images - then that's indicative of engagement. (For instance, we've trained users over the years to expect that when an image is clicked upon, it becomes bigger - for more detail). If a user begins filling out a form, or engages with a piece of jQuery or an adverarea - then that's another form of success. If the user has scrolled down as far as they can scroll: that too is success to some people on some pages.
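Here is a rough sketch of how those signals could separate a 'reject' from a 'bounce' on a single page visit. The `PageVisit` record, its field names, and the rule that any one engagement signal is enough are all assumptions for illustration; no analytics tool I know of exposes exactly this structure, and the right rules would differ by page.

```python
from dataclasses import dataclass


@dataclass
class PageVisit:
    """Hypothetical record of what happened during a single page visit."""
    load_flag_fired: bool        # did the 7 second JavaScript flag fire?
    max_scroll_fraction: float   # 0.0 (top) to 1.0 (scrolled as far as possible)
    image_clicks: int
    text_highlights: int
    form_started: bool


def classify_single_page_visit(visit):
    """Label a single page visit as 'reject', 'engaged', or 'bounce'."""
    # Left before the load flag fired: treat as a rejection.
    if not visit.load_flag_fired:
        return "reject"
    # Any one of these actions is taken as a sign of interest.
    engaged = (
        visit.form_started
        or visit.image_clicks > 0
        or visit.text_highlights > 0
        or visit.max_scroll_fraction >= 1.0
    )
    return "engaged" if engaged else "bounce"


left_early = PageVisit(False, 0.0, 0, 0, False)
read_to_end = PageVisit(True, 1.0, 0, 2, False)
print(classify_single_page_visit(left_early))   # reject
print(classify_single_page_visit(read_to_end))  # engaged
```

With a label like this on each visit, a 'reject rate' and a 'bounce rate' become two separate ratios rather than one.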

In fact, the very idea of the 'microgoal', or 'goal of a page' is something that should come on back. The notion that some pages are routers, some are converters, some are completers, some are decision-support, and others are branding - I think is important. You WANT people to spend a lot of time with branding pages - but you don't want people leaving at a router page. Sometimes you want people to take a very specific action on that specific visit - and you want them to take another action at a different time.

The broader challenge of path analysis is journey analysis. Stringing individual paths together to understand how multiple visits, over time, add up to a buying or a desired action. This, of course, is totally hard because people frequently begin and end their journeys on different devices at different times. That is not to say that we can't measure the broader patterns and do the legwork to unify each part of the path.

That said, it would be far easier to take on journey analysis if we had a more robust path analysis available.

Thursday, July 23, 2009

What meme-tracking can teach us about text visualization

"Meme-tracking and the Dynamics of the News Cycle" is an excellent paper put out by Leskovec, Backstrom and Kleinberg.

In it, they track memes against the news cycle. The empirical findings of their study, which focuses on Palin, are really cool. How they chose to visualize what was going on is quite new.

Traditionally, we tend to graph social networks using graph theory: each person is a node, and you draw links. Sometimes we color in the nodes and represent the strength between nodes by thickness of the lines. This kind of social network visualization is something very cool, but the type of math that's required to derive a real business strategy out of it is not. People have a hard enough time deciding what to do to reduce "bounce rate", let alone "eigenvector centrality". But there's competitive advantage in making something hard to do - easy.

All social networks can be expressed as a mathematical matrix - where if two people know each other, we populate a '1' at the intersection. You can also populate it with an integer or a float, variously to represent the strength or the nature of the relationship between two people.

What I like about what was done here is that the authors populated each node with a pointer to more information. There's something about this transmission of text throughout a social network - how it evolves, twists, mutates and spins - that is really quite special. Hopefully the graph below is clickable and you see it better.
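As a small illustration of that matrix idea, here is a sketch in plain Python. The names, the tie strengths, and the choice to treat every relationship as symmetric are made up for the example; they do not come from the paper.

```python
def build_adjacency(people, ties):
    """Build a symmetric matrix where cell [i][j] holds the tie strength
    between person i and person j (0 means they do not know each other)."""
    index = {name: i for i, name in enumerate(people)}
    n = len(people)
    matrix = [[0.0] * n for _ in range(n)]
    for a, b, strength in ties:
        i, j = index[a], index[b]
        matrix[i][j] = strength
        matrix[j][i] = strength  # assumes "knowing" is mutual
    return matrix


# Hypothetical network: a close friendship and a weaker acquaintance.
people = ["Ann", "Bob", "Cat"]
ties = [("Ann", "Bob", 1.0), ("Bob", "Cat", 0.3)]

for name, row in zip(people, build_adjacency(people, ties)):
    print(name, row)
```

A plain '1' in a cell only records that two people know each other; a float like the 0.3 above is one way to mark a weaker tie.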


Think of it from a marketer's point of view.

Your message, even if you do a good job of seeding it to the right people, will not be spread in quite the same way that you want it to be spread. Expect it to shift and to evolve. Sometimes you'll like the evolution - because it'll drive more people to your site and sales. Sometimes you won't - because people won't like you or what they hear.

For certain companies, at certain levels of social media spend - it's a worthy part of their social analytics programme. It's a lot nicer to see the evolution of what people are saying as a message is getting transmitted in conversation than to look at a tag cloud that doesn't really tell you all that much. There's something more human, and something vastly more conversational about visualizing conversations over long periods of time in this way.

I applaud Leskovec, Backstrom and Kleinberg. Great work. Thank you.

Friday, July 17, 2009

Marketing to an older generation

Paco Underhill had an interesting interview on the NewsHour with Jim Lehrer that was re-aired this week.

It struck a chord. I'm still thinking about it 72 hours after the fact. You can read the transcript here.

While there is a lot of good content in there, it was this part that really caught me [my emphasis]:

Marketing to a younger generation


PAUL SOLMAN: Well, let's say there are some people in our audience who would like to re-inflate the bubble. How do you get consumers to start buying again at this point?

PACO UNDERHILL: Nobody's going to go back to the old ways. And what we're seeing here is a time in which our retail world is probably going to contract.

It is going to contract, and that's because we are over-stored, meaning that most retail entities would be eminently healthier if they were smaller. Sixty percent of discretionary income in North America is held in the hands of people who are 55 and over.

PAUL SOLMAN: And we don't need stuff?

PACO UNDERHILL: Paul, you and I could live the rest of our lives on fruit, vegetables, pasta, wine, olive oil, and yearly doses of socks and underwear.

I think the other thing that is interesting is that our basic marketing engines are in the hands of people who are 30-something. And they like selling to themselves, and they like selling to a younger generation. They're not that comfortable selling to gray, bearded, bald, paunchy research wonks like you and I.


Wow.
Ouch. And how do we make this better?

Internet Marketing really is dominated by twenty and thirty somethings. It's an intensely young industry. There are exceptions of course, and I really do respect the older people in our field. I like to think that they make us wiser and that we make them younger. To Paco's point: he's right.

We've been building a lot of one-size-fits-all experiences for 15 years now. I'd argue that many experiences are built for the market segments / personas that we think are the customers.

I've been lucky though. I've worked on over fifteen projects aimed at those 55 and older. I know of some people who have gone their entire career without ever marketing to those people. For the majority of us though, he's right.

I think it's time to use data to make experiences better for the older generations. For instance, if a user tells us through their behavior or through their preferences that they don't like drop down menus, then there shouldn't be drop down menus. (And that's just the tip of the iceberg of what is doable! See the posts on Website Morphing to get an idea.).

Web Analysts have a tremendous opportunity here over the next five years to really contribute to the solution.

Either that - or start applying for jobs at wine and pasta concerns.

:P

Monday, July 13, 2009

Social Media Analytics: the hidden cost of 'free'

As with any early industry, there are quite a few 'free' measurement applications out there. Social Media is no exception.

Many of them are quite good at doing one or two things very well. For instance, Twitalyzer is very good at measuring influence, and in particular, the first derivative of influence. The advanced search functions on Google are very good at tracking, at least at a monthly cadence, the number of mentions and backlinks. Very useful. And they're FREE*!

There's a large component of social media analytics that can be done with Google Analytics for 'free' too.

Of course, 'free' has a hidden cost. In some instances, unless you're opting out and getting mutual NDAs, you're giving up some privacy. Of course, privacy has little value to many people anyway - it might as well be free. (You'd be amazed how many people will trade their SSN for as little as a five dollar gift certificate to Burger King.)

There's also the hidden cost of aggregation and interpretation. It all depends, of course, on what you're actually going to do with it.

In certain situations, when you and you alone are the decider, you don't need to put a lot of effort into collecting and aggregating the information into a single neat place so that a team can see what's going on.

In most instances though, there are many people who want to see what's going on, and they don't want to log into 9 different tools just to see what is being said about them online.

Then there's the interpretation and processing of information. Try coding 100 comments into 'positive', 'neutral' and 'negative' by hand to see what I'm talking about....
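To make the point concrete, here's a minimal sketch of the crudest possible machine version of that hand-coding exercise. The word lists are made up for illustration, and the function name `code_comment` is mine, not anything from a real tool. It counts positive and negative words and calls the result.

```python
import re

# Hypothetical, deliberately tiny word lists - for illustration only.
POSITIVE = {"love", "great", "good", "awesome", "helpful"}
NEGATIVE = {"hate", "terrible", "bad", "awful", "broken"}

def code_comment(text):
    """Code one comment as 'positive', 'neutral' or 'negative' by word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for comment in ["I love this tool, it's great",
                "Terrible, broken interface",
                "Oh great, it's broken again"]:
    print(comment, "->", code_comment(comment))
```

Note what happens to the sarcastic third comment: one positive word and one negative word cancel out, and it gets coded as neutral. That gap between counting words and understanding them is exactly where the interpretation cost comes from.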

All told, 'free' really isn't free.

That's not to say "don't go out and experiment". By all means, please, go out and experiment. Check them out. They're great tools. They're great interfaces.

When you're setting expectations on a social media strategy, just don't assume that you can suppress the costs of measurement and optimization down to zero or to 'free'. It's just not true, and you're setting yourself up for some real misery down the road.

Not to kill your contact high. It's still exciting.

Thursday, July 9, 2009

The effect of the Internet on Prices

One of the neater aspects of the Internet is its impact on prices.

Before the Internet, researching the best price for something was relatively hard. Or, I imagine it to have been hard. The price you got for a consumable, like a car, a house, an airline ticket or hotel room largely depended on who you knew or how many people you called and asked.

One of the impacts of the Internet has been relative deflation in prices as a result of the ability of customers to compare prices easily. This decrease in the cost of becoming un-ignorant has eaten directly into the margin. I don't think I can argue that the barriers to entry have been significantly lowered as a result of the Internet: a hotel still requires capital to build rooms, and an airline still requires airplanes and people to fly them. But I can see that certain barriers have come down. It's possible for certain companies to compete entirely on price and cost-efficiency models. Customer service is increasingly becoming a relevant differentiator too. Refreshingly.

You also have a whole bunch of aggregator companies these days who derive their income by bringing down the cost of research to the consumer (in terms of time, effort, and ease). In a way, this represents a refreshing triumph of ingenuity and innovation: how revenue management analytics, web analytics, creative, information architecture, and strategic IT can come together to create revenue streams that didn't exist before.

We could see still more deflation on the margins. I'm more interested to see how companies discover new values that "consumers didn't know they would want to pay for".

Saturday, July 4, 2009

Is "Private Browsing" really "Private"?

Web analysts I speak with are expecting a crash in observed traffic with the launch of Firefox 3.5 and the newer features within IE.

I question if it's really all that private at all.

Some commentators, like Preston Gralla, call private browsing the "porn mode". Gralla goes on to write:

When you browse the Web using it, nothing about the session is stored -- no history, no cookies, no temp files, no forms information, no search information, nothing that can show where you've browsed or what you've done. To turn the Las Vegas tag line on its ear: What happens in Firefox doesn't stay in Firefox.
Well, alright, nothing in-the-browser is really stored. Of course, some memory of the porn run is recorded on the hard-drive and by any sort of spyware that a spouse, parent, partner, or roommate has installed on the computer.

A record, though not linked to anything personally identifiable on its own, is kept in the porn sites' log files (and, if the site requires JavaScript, I don't really see how private browsing can keep all the web analytics script from firing if it's embedded right into the experience).

The concern of many web analysts is that just as a portion of the population are habitual cookie deleters, there are going to be people who turn on private browsing and leave it on.

What do cookies really do for us in terms of measurability?

Well, assuming that nobody ever deleted their cookies, they would let me know, with a high degree of accuracy, which browsers on a specific computer were returning to the website, and at what recency and frequency. Such information is important to a blogger because you can judge, roughly, just how much audience you're retaining - and make decisions on frequency and content. Of course, cookies don't measure people. Multiple people use the same computer, and a single person uses multiple computers over the course of a day. It's a proxy measure of return visitors though, and it's still useful to an analyst so long as they know the real definition.
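As a rough illustration of what that looks like in practice, here's a minimal sketch that takes a list of (cookie ID, visit date) pairs and works out frequency and recency per cookie. The data shape and the name `recency_frequency` are assumptions of mine for the example, not how any particular analytics package does it.

```python
from collections import defaultdict
from datetime import date

def recency_frequency(visits, today):
    """visits: iterable of (cookie_id, date) pairs.

    Returns, per cookie ID, the number of distinct days visited (frequency)
    and the number of days since the last visit (recency).
    """
    days_seen = defaultdict(set)
    for cookie_id, day in visits:
        days_seen[cookie_id].add(day)
    return {
        cookie_id: {
            "frequency": len(days),
            "recency_days": (today - max(days)).days,
        }
        for cookie_id, days in days_seen.items()
    }

# Made-up sample data: cookie "a" came back, cookie "b" did not.
visits = [
    ("a", date(2009, 7, 1)),
    ("a", date(2009, 7, 3)),
    ("b", date(2009, 7, 2)),
    ("a", date(2009, 7, 3)),  # second hit on the same day counts once
]
print(recency_frequency(visits, today=date(2009, 7, 4)))
```

Remember what the key is: a cookie ID, which is a browser on a computer, not a person. A visitor who deletes cookies or leaves private browsing on shows up as a brand new ID every time, and looks like an audience that never returns.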

Another big use is for campaign attribution. Certain websites get paid on performance - meaning that they only get money if you click through on an ad and buy something - typically within 30 to 45 days. Without a cookie - they don't get paid. It makes me wonder if certain websites that get paid for performance might just as well throw up the rule "Want to visit this website? Turn off Private Browsing" as a way to deal with it. I know many developers have been having similar fantasies about denying access to those who continue to use IE 6. In the end, I think that it's the pay-for-performance business model that might end up suffering the most. (And I can't imagine what kind of damage this is going to do to the porn affiliate programs.)
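Here's a minimal sketch of that pay-for-performance rule, assuming the cookie simply stores the time of the ad click. The function name `is_attributable` and the default 30-day window are my own choices for the example; real affiliate programs set their own terms.

```python
from datetime import datetime, timedelta

def is_attributable(click_time, purchase_time, window_days=30):
    """True if the purchase falls inside the attribution window after the click.

    click_time is None when no cookie survived (deleted, or private browsing),
    in which case the referring site gets no credit.
    """
    if click_time is None:
        return False
    elapsed = purchase_time - click_time
    return timedelta(0) <= elapsed <= timedelta(days=window_days)

click = datetime(2009, 6, 1, 12, 0)
print(is_attributable(click, datetime(2009, 6, 20, 12, 0)))   # 19 days later
print(is_attributable(click, datetime(2009, 7, 10, 12, 0)))   # 39 days later
print(is_attributable(None, datetime(2009, 6, 20, 12, 0)))    # cookie is gone
```

The last case is the one that hurts: the purchase happened, the referral happened, and the referring site still earns nothing because the record of the click didn't survive.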

A third big use for analysts is personalization and making your site experience better: like remembering your username or associating a certain behavior pattern with being a good customer and getting perks that encourage more intense behavior.

Here's a big takeaway:

Your browsing behavior can be recorded by software installed on the client side. The search engines are most certainly recording which queries you're entering. It most certainly is being recorded on the ISP side. Who the hell knows who else is looking at that data from the ISP side after the splitter? Then it's being recorded, at some level, by third party ad servers and analytics companies. And, at the source of where the information is being housed, by server logs. If I were a rational policy analyst or a citizen concerned about surveillance, I'd be far more worried about those touchpoints where the information is personally identifiable.

Let me explain.

The web analyst sitting in front of the ad server report, the server logs, or the nicely formatted web analytics data has a hard enough time interpreting that data at the aggregate. 99.999% of the time, we have no idea of the identity of the person we're looking at. In that 0.001% instance, it's usually a filter I've set up specifically to track my own behavior. (What sort of analyst would I be if I didn't understand how my own behavior is reflected on the tracking software that I use?)

The ISPs, and the people behind the splitter: they know the billing addresses and are far closer to your identity. The person in your household who has installed some software on one of your computers: they know exactly who you are and what you're doing.

Privacy on the Internet is such a delusion.

Private browsing really isn't private.

Tuesday, June 30, 2009

Bing

The good folks at Bing Canada were kind enough to invite me to their Bing.ca launch last week.

They're good people over there. Very friendly and genuinely warm.

It's taken me a week to really formulate coherent thoughts that I could write here.

From a search perspective, Bing has very good percentage margins. Of course, I don't bank percentages, I bank dollars - and therein lies the problem for Bing: getting the volumes while maintaining the clickthrough and conversion rates. In the end, of course, search engines dominate through relevancy, and it's through relevancy that Bing will win volume.

It was observed at our table of quantitative search folks that Bing was more consumer oriented. It wasn't necessarily designed for engineers. It has visually nice pictures on the home page. It's usable and feels uncluttered. It's optimized for the top 3 most relevant results to be there. There is a preview mechanism so you can get a flavor of the site before you visit. The results, we speculated, were to be optimized based on consumer relevancy.

It reminds me of the 1990's in a way. I used to use Yahoo! for Entertainment and Alta Vista for business search. Different search engines just seemed to be better at different things. Along came Google, which worked really well for both - and Google Scholar: where they house all their academic searches - and that was pretty much it for Yahoo! and Alta Vista.

If the Bing play is really to 'search and decide' (or, put in more web-analyst language: "search and convert"), then it really has positioned itself to be a consumer search engine. So, much of the booing and buuu-urnsing you hear from engineers, developers, academics, doctors - and yea, even web analysts - might very well be predictable. Perhaps Bing simply isn't optimized for that category of people. Maybe people out there - consumers - really are feeling "overwhelmed by search" (a statement that, I must admit, I still don't feel applies to me). Perhaps it's not about me after all.


It's by being relevant to the consumer that we might see Bing make inroads. It's kind of reassuring (from a creative destruction perspective) that Bing will, like Google, live and die on its algorithm.

And let the best algorithm for the best market win.

Friday, June 26, 2009

Has anyone really been far even as decided to use even go want to look more like?

You read that right.

The intersection of text analytics, social analytics, and neuroanalytics is incredibly interesting and useful.

It's an old meme, and you can read about its origin here.

"Has anyone really been far even as decided to use even go want to look more like?" is a 4Chan-ism. I suspect that it was written by a linguist. Awesome troll is awesome. It can be roughly translated to:

"Has anyone really decided as to even go that far in wanting to do to look more like so?"

Subject - Anyone
Verb - decided (modified by "really" adverb)
Direct object to "decided" - "that" (pronoun modified by "far")
Noun clause that clarifies "that" - "wanting to do" (gerund phrase)
Do what? - "to look more like so"

Meaning:

"Has any video game company really taken such measure to make a game so realistic?"


This is the kind of stuff that if a human has a hard time interpreting, a machine is going to have an especially hard time codifying and returning some sort of valid output.

This all goes beyond just identifying words in a stream of text and trying to assign some sort of value to them. To be sure, volumetric measurement of mentions is an important first step. Yet, buried in words is what a person is like, how they're feeling, and what they intend the reader to feel.

Copywriters know how to write for a Grade 5 reading level, a Grade 10 reading level, and a university reading level based on a relatively simple algorithm. It follows that since words are machine readable, they can be treated very similarly to numerical input.
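The post doesn't name the algorithm, so as an assumption on my part, here's a minimal sketch using one well-known formula of this kind, the Flesch-Kincaid grade level: 0.39 x (words per sentence) + 11.8 x (syllables per word) - 15.59. The syllable counter is a crude vowel-group heuristic of my own, so treat the scores as rough.

```python
import re

def count_syllables(word):
    # Crude heuristic (an assumption, not a dictionary lookup):
    # count runs of consecutive vowels, with a minimum of one.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def fk_grade(text):
    """Approximate Flesch-Kincaid grade level of a piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

simple = "The cat sat. The dog ran."
dense = "Organizational analytics necessitates considerable methodological sophistication."
print(round(fk_grade(simple), 2), round(fk_grade(dense), 2))
```

Short sentences built from short words score low, and one long sentence stuffed with long words scores high. That is all the formula measures, which is the point: the words have been turned into numbers.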

Take, for instance, the sequence of words:

"Butterfly violet breeze fizzy"

and:

"Papilio #800080 easy sparkling"

They each individually mean the same thing. Of course, they don't emotionally mean the same thing. The words have different shapes. The speaker of the former would be a normal person, maybe trying to write some poetry. The latter would be some sort of biologist programmer.

Words could be broadly categorized into different buckets, with great analytical effect. But it goes beyond just words. Verbs are where it starts to really get tricky.

If you want to really torture yourself, try reading "Investigations in Universal Grammar" and "The Stuff of Thought" in the same week. Take this quote from page 66 of "The Stuff of Thought":

"Some intransitive verbs resist the intrusions of a causal agent:

The baby is crying.
The thunder is crying the baby.

The frogs perished.
Olga perished the frogs.

My son came home early.
I came my son home early.

And some transitive verbs resist the attempt to strip their causal agents away:

We've created a monster!
A monster has created!

She thumped the log.
The log thumped.

He wrecked the car.
The car wrecked."

It's fairly hard to teach a machine how to interpret things that humans can hardly interpret themselves. Or grammatical rules that only seem to make real sense to a native speaker's ear.

It's worth figuring out and applying.

Take landing page copy:

The purpose of a landing page, to a direct marketer at least, is to get the person to convert: to take a desired action. Good copywriters know how to use words and tone to compel people to continue reading down the page, like a slide. The theory is that if their head starts nodding at the top, they'll slide down the page and continue saying 'yes' right into a sale. The copy, ideally, should resonate with the intended customer.

I'm fairly certain that certain classes of words are better and convert more than other classes of words. Beyond that though, certain classes of verbs and tones are better at converting than others, in different contexts. The answers could mean the difference between 5% conversion and 20% conversion.

This is one of the thrusts of sentiment analysis. Useful and relevant.

Monday, June 22, 2009

The Culture of Analytics

Jim Novo has really stirred up the hornet's nest now. His post on "Analyze, Not Justify", is a great read.

Down in the comments Jim links to another post "Fear of Analytics". It's another good read.

It all goes to the culture of analytics.

There are people who are fail tolerant and people who are fail avoiding. I don't see how people can survive without a healthy balance of failure and success. Repeated success is required for confidence building and repeated failure is required for learning. Like everything, there's a downside too. Repeated success can lead to arrogance. Repeated failure doesn't guarantee that somebody will learn, either. The fail avoiding behavior, if it persists too long, results in stagnation and, ultimately, long term fail.

An organizational culture that tolerates repeated failure without learning is destined to collapse. An organizational culture that doesn't tolerate failure, even if it means learning, is destined to stagnate.

Experimenting with various First (tweaks) and Second Order (major) changes to products and experiences over time, if there is an accumulation of knowledge, ultimately leads to short term and medium run commercial success. The losses from small failures are offset by the gains from an accumulation of knowledge. Some believe that multiple Second Order changes lead to a paradigm shift and to commercial renewal over the long run. (We know that certain countries from a public policy perspective can do this. This idea of 'innovation drift' is kind of interesting.) Admittedly, there are several decisions that somebody in the C-suite only gets to make once.

A culture of analytics is a culture that is failure-tolerant and actively manages the risk of innovation.