Guerrilla Analytics is pretty much what it sounds like - it's about going out, without permission and without sanction, and conducting analytics on publicly available information, purely for the purpose of curiosity, case study, or for the common advancement of the discipline or technology.
In many ways, I admire the work that has been done by the dev community. jQuery is an example of a developer-led open-source technology, a common library that many front-end devs dip into. It's just awesome because it saves them so much time and effort. Many developers are truly technologists. And the really awesome ones go out and experiment. They actually really push the science, and frequently, when it comes to many of these projects, they do so with little expectation of pay or salvation. They do it because they love it. And it's so thoroughly undermined other platforms (including Flash), that you'd have to call it Guerrilla.
Some of the best web analysts I know run their own sites and trick out their sites with analytics. By and large though, because of the nature of our work, we tend to work only on walled off, corporate data. We don't always, necessarily, go around talking about our awesome insights. For most web analysts it's a solitary analytical existence.
In another way, too, the Canadian Election Study offered a common basis of experience for most quantitative political scientists growing up. Everybody who spends extensive periods of time with the study knows that dataset inside and out. It's both a collective and solitary existence.
Web analytics, almost by definition, lacks that common dataset. It's something I brought up on the WAA Research Committee yesterday, in the context of training. Granted, the WAA once invited all analysts to engage in a contest to see who could derive the best analysis out of their analytics tool, but admittedly, access to the actual numbers was not universal, so it was impossible to really judge the validity of the work.
The production of easily assembled datasets is well upon us.
We're seeing some Guerrilla Analytics being done by Peterson, based on the publicly available information on Twitter. We're seeing some of it happen in social media, though I don't know if I'd judge some of it to be 'analytics' quite yet. A few people certainly are trying, though.
Going out, assembling data sets, and starting to solve problems for the sake of solving problems - is that technologist spirit. There's a huge advantage in stepping outside the narrow possibilities of proprietary data and experimenting - then bringing back a lot of that knowledge, for the benefit of everybody. Is there any Guerrilla in you?
Thursday, April 30, 2009
Sunday, April 26, 2009
Three Large Skillsets in Web Analytics
Patrick @glinskiii once identified three large buckets of skills in his "it takes an orchestra" argument for web analytics programs. It feels like years ago (it's probably only been about a year), and it has since evolved. It goes like this:
There are three large skillsets in web analytics.
T's, or Technical Analysts, specialize in the technical side of web analytics. They're the people who can tell you where to put single quotes versus double quotes in the S.Campaign variable of Omniture.
S's, or Strategic Analysts, specialize in the strategic side of web analytics. They're the people who can tell you the social process necessary to take an insight and translate it into action.
A's, or Analytical Analysts, specialize in extracting insight out of web analytics. They're people who know what a social graph is, and how to read it, run statistics, and consult on infometrics.
Not all the boxes are exclusive. For instance, an 'A' ought to know how the S.Campaign variable works. An 'A' also ought to know how to talk to somebody from creative. T's ought to know infometrics if they're doing any front end setup.
In general though, it is exceedingly hard to be good at the skillsets within T, those within S, and those within A. For instance, to be a very good A, you need to practice statistics regularly. It takes years of trial and error to become a very good S. The S skillset demands extroversion, which is sometimes at odds with A's. To be a good T, you ultimately need to know how to program. T's are ultimately technologists.
That's not to say that good TAS's don't exist though. They do. They're just incredibly rare. It's far more common to find people who have done T, S, or A at different points in their career, who can, indeed, return to a different skillset with some ease.
Finding people with 5+ years in web analytics, and with exposure to all three, is hard. And it'll probably get much harder post-recession.
Tuesday, April 21, 2009
Web Analytics Planning
Joseph Carrabis wrote something very relevant to our interests, especially when it comes to planning Web Analytics projects.
It's worth the read. Go check it out. I'll wait.
What's easily missed on the first scan is the passage:
"The purpose of these rules is to tend towards 0 the likelihood that a mistake will be made."
And the two rules, which are the meaty bits, are:
"Rule #1 - Eliminate Variables"
And
"Rule #2 - Remove Ambiguities"
Rule 1 is important. I categorize knowledge into three broad buckets:
- What I know that I know.
- What I know that I don't know.
- What I don't know that I don't know.
It's the third category that's the scariest of all. When I move information from What I know I don't know into the What I know that I know bucket, I frequently uncover pieces of things that I didn't know I didn't know. The effort is important, and it does help minimize your chances of making mistakes. This makes the Discovery portion of any analytics project really important.
Which twists me into the "Remove Ambiguities" line.
So it goes with analytics planning. You can spend a disproportionate amount of time planning contingency plans, or you can set a goal and plan on a course of action that you have the greatest chance of controlling. Mastery of your own destiny sort of thing.
There's a lot of inside pool to Joseph's blog posting - specifically the passages around gender - and there are reasons for that. In the coming months, I suspect that we're going to have more discussions centered around accuracy, ambiguity, and standard deviation.
Sunday, April 19, 2009
The Accuracy of the Unique Visitor Count
An excellent blog post on Estimating the Effects of Cookie-Deletion is timely and welcome, given the relative degree of contention around the Unique Visitor (UV) definition.
The chart above is not gospel, and you should not be running around saying that all websites have 100% human-visitor inflation. That isn't what Angie is saying. Angie has offered up something valuable: a pretty simple model for estimating UV inflation.
What Angie is arguing here is that the effect of cookie deletion on your unique visitor to human estimate will depend severely on the use of your website and the inherent habits of its audience.
Let's assume that there's a fanatical group of humans that visits your website. Let's also assume that within that fanatical group, there is a set of behaviors and attributes that tend to cluster. For instance, it just might so happen that this cluster is really into your content or service: say, toddler nutrition. Now, let's assume that the content is delivered with a unique spin that's appealing - let's say, constant references to failed, esoteric memes. Let's also assume, then, that the primary audience is foodies and fathers concerned about their toddlers' health. Given actuarial tables and education (perhaps derived from Quantcast), we might go so far as saying that these guys are aged 33 to 39, and have online tenures that date to around 1998 - which happened to be during one of the cookie deletion hysteria eras.

As a result, a higher proportion of these guys tend to delete cookies, which might lead the poor sap who runs the site to conclude that his monthly unique visitor figure of 1000 means '1000 people', when in reality, he has a following of 200 to 400 actual flesh and bone humans.
I've slipped a link in there.
Online tenure is predictive of many online behaviors - but buried within that number is the year that somebody 'came online'. What that year says about them and their underlying habits is important, and it can be an extension of the model.
There are predictors of cookie-deletion and cookie-suppression. They're not 100% accurate. But taking the principle of customer centricity to heart, and thinking about how likely your audience is to delete, they are welcome grains of salt to the UV metric.
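To make the arithmetic in the toddler-nutrition example concrete, here's a rough sketch in Python. To be clear, this is not Angie's model. It's a toy version that assumes every non-deleter shows up as exactly one cookie per month and every deleter shows up as a fixed number of cookies. The function name, the parameters, and the input numbers are all made up for illustration.

```python
def estimate_humans(unique_visitors, deleter_share, cookies_per_deleter):
    """Back out a rough human count from a cookie-based unique visitor count.

    Assumption (hypothetical, not from the original post): a non-deleter
    counts as one cookie per reporting period, and each deleter counts as
    `cookies_per_deleter` cookies.
    """
    if not 0.0 <= deleter_share <= 1.0:
        raise ValueError("deleter_share must be between 0 and 1")
    if cookies_per_deleter < 1:
        raise ValueError("cookies_per_deleter must be at least 1")
    # Average number of cookies a single human leaves behind in the period.
    cookies_per_human = (1.0 - deleter_share) + deleter_share * cookies_per_deleter
    return unique_visitors / cookies_per_human


if __name__ == "__main__":
    # Made-up scenarios: (share of audience that deletes, cookies per deleter).
    for share, cookies in [(0.1, 2), (0.5, 5), (0.6, 7)]:
        humans = estimate_humans(1000, share, cookies)
        print(f"{share:.0%} deleters, {cookies} cookies each: ~{humans:.0f} humans")
```

Under those assumptions, a site where half the audience deletes and each deleter burns through five cookies a month turns 1000 unique visitors into roughly 333 people. Change the two inputs to match what you believe about your own audience and the answer moves a lot, which is the whole point.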
Thursday, April 16, 2009
Rails
I'm smitten with Rails.
Rails conforms to my world view in two ways.
DRY stands for 'Don't Repeat Yourself'. It's a great principle, especially when writing difficult SPSS code.
MVC stands for Model, View, Controller - and it's the dominant way that I organize, present, and modify data.
There are other biases that are built into Rails that I like, but mostly, it's those two principles.
I'm looking at Rails as an important way of solving a number of lingering problems in Web Analytics, and once I learn enough to actually start experimenting and solving them, I'll share them.
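For anybody who hasn't run into those two principles, here's a minimal sketch of them applied to page visit counts. It's in Python rather than Ruby, and it is not Rails' actual API. The names VisitModel, render_report and VisitController are hypothetical, picked only for illustration.

```python
class VisitModel:
    """Model: owns the data and the rules for changing it."""

    def __init__(self):
        self._visits = {}

    def record(self, page):
        # One place, and only one, knows how a visit gets counted (DRY).
        self._visits[page] = self._visits.get(page, 0) + 1

    def totals(self):
        return dict(self._visits)


def render_report(totals):
    """View: turns data into text and knows nothing about storage."""
    lines = [f"{page}: {count}" for page, count in sorted(totals.items())]
    return "\n".join(lines)


class VisitController:
    """Controller: takes requests, updates the model, picks the view."""

    def __init__(self, model):
        self.model = model

    def hit(self, page):
        self.model.record(page)

    def report(self):
        return render_report(self.model.totals())


if __name__ == "__main__":
    controller = VisitController(VisitModel())
    for page in ["/home", "/about", "/home"]:
        controller.hit(page)
    print(controller.report())
```

The counting rule lives only in the model and the formatting lives only in the view, so changing either one means touching one spot.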
Sunday, April 12, 2009
The Intersection of Four Threads and Problem Orientation
This post briefly summarizes four threads of thought and a conclusion around problem orientation.
I read "Evaluation of Internet Advertising Research" by Juran Kim and Sally J. McMillan. It's effectively a social graph exercise. The findings themselves are interesting (and you can read about that through the Web Analytics Association once I publish the review), but this reference to "invisible colleges" was especially fascinating - just coming off of the SLAB Karen Stevenson talk at OCAD. Kim and McMillan make the point that visualizing bibliographic graphs (a social graph) is useful for uncovering these colleges.
The second thread has to do with "The Market Valuation of Internet Channel Additions", by Geyskens, Gielens and Dekimpe. In it, they construct a model to judge whether or not a company should create an Internet channel - and use newspapers as the case study. Given all the thought that has circulated around content and monetization, it got me really thinking about how the same flattening forces that make it possible to do great things also have the power to destroy content. Flattening can be creatively destructive too. Out of negative externalities come problems needing to be solved.
The third thread has to do with Jim Novo and the Drilling Down Project. The thesis that RFM can be done with no more than a spreadsheet, and that it's scalable downwards for small businesses, is compelling. It's taking marketing analytics away from the data mining domain and down to a micro-level. The technology has certainly flattened. And this opens up an entire world that is so exciting analytically - with applications for marketing science, revenue projection, and, when combined with the principles of Mason, lumped with web analytics - you get Novo-Mason WARFM. There's incredible power in there.
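To show how little machinery the spreadsheet-sized version of RFM (recency, frequency, monetary value) needs, here's a rough sketch in plain Python. It is not Jim Novo's method and it says nothing about the Mason side. It's just the generic per-customer summary, and the function names and sample transactions are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_table(transactions, as_of):
    """Summarize (customer, date, amount) rows into per-customer R, F, M values."""
    last_seen = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, when, amount in transactions:
        # Recency is driven by the most recent transaction date.
        if customer not in last_seen or when > last_seen[customer]:
            last_seen[customer] = when
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        customer: {
            "recency_days": (as_of - last_seen[customer]).days,
            "frequency": frequency[customer],
            "monetary": monetary[customer],
        }
        for customer in last_seen
    }


def rank_by_recency_then_frequency(table):
    """Order customers: most recent first, ties broken by higher frequency."""
    return sorted(
        table,
        key=lambda c: (table[c]["recency_days"], -table[c]["frequency"]),
    )


# Made-up sample data, for illustration only.
SAMPLE = [
    ("ann", date(2009, 4, 1), 40.0),
    ("ann", date(2009, 4, 10), 25.0),
    ("bob", date(2009, 1, 15), 300.0),
    ("cat", date(2009, 4, 10), 10.0),
]

if __name__ == "__main__":
    table = rfm_table(SAMPLE, as_of=date(2009, 4, 12))
    for customer in rank_by_recency_then_frequency(table):
        print(customer, table[customer])
```

Everything here is a sort and a couple of running totals, which is why a spreadsheet can do the same job for a small business.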
The fourth thread has to do with extreme customer centricity and some of the material that Maciek Adwent has been emailing my way. A good synthesis of much of that material is: "from small things grows big things", and a derivative of the Matt Milan life/work balance thesis. This thread actually matches well with the whole "Crossing the Chasm" book, which is an original root artifact of extreme customer centricity. There's a lot in there, and much of it is transparent with some of the cross-blog talk.
If we're going to take the customer centricity thesis to its extreme, then we have to consider their networks. If web analysts and marketing scientists can't communicate what they do, and the problems they solve, quickly and understandably - in a nugget, in a sound bite - then how will potential customers ever know what we do? Moreover, why would they ever want to talk about us? How can you use the power of networks and word of mouth if you don't have the bare minimum requirement: repeatable duckspeak words?
It kills me to say it, but I think we have to be more problem-oriented in our communications than solutions-oriented. We need to be incredibly specific about the problem that we're solving.
It's far easier to talk about a problem that you're wanting to solve, especially in an economy that is generating so many new problems to solve.
I think a solutions-orientation is essential. The mark of a great analyst when confronted with a problem is to see several solutions (at minimum) and, at maximum, to recognize when they're looking at a wicked problem. But I think we need to keep more of the solutions-orientation to ourselves when we're communicating what we do.
Taken from this point of view - you wouldn't use the term 'Novo RFM' to describe what it is that you do. You determine the one problem that RFM can solve really well. The bigger the problem, the more important it is, and the more marketable the solution. There are also other interesting problem spaces to be explored. For instance, the problem of online content monetization is a screaming problem - and I think that Novo-Mason WARFM might actually offer an incredible solution. If the term Novo-Mason WARFM puts you to sleep - you're not an analyst (and welcome to my blog!). It's a solution set that has been screaming for a really big, juicy problem to solve. It makes more sense to talk to people about the nature of the problem and briefly mention the solution. And let curiosity run its course.
That's where I'm at now. I'm at customer-centered problem sets.
Thursday, April 2, 2009
Of Networks and Hierarchies
It's rare that somebody forces me to really look at something differently - but Karen Stevenson in an SLAB lecture at OCAD did.
Karen pointed out that the three human variables that matter are: transactions, authority, and trust. Transactions among people are easily handled by technology. They have long been standardized, and in fact, we're making incremental improvements in that all the time. Where there's ambiguity in transactions, you need authority to make decisions. What was really left unsaid, but what I'm concluding, is that since humans are very creative, they always manage to get themselves into non-standardized problems, and as such, they will always need authority. (Look no further than Judge Judy for daily evidence of that.)
As Karen rightly points out, the factor that matters the most in human networks is trust. It's this trust that gives networks their power to undermine hierarchies. Hierarchies, however, are far more durable than networks over the long run, which, thankfully for me, doesn't challenge what I already know from Kathleen Thelen in "How Institutions Evolve". Hierarchies last a long time without their creators knowing that they will (all the time).
Whether or not genuine trust can be built through electronic social networks is the biggest question. We're seeing some evidence of it happening through eBay.
The implications of her work are certainly relevant to our field of social analytics - a sub-domain of web analytics. It's a field I really don't talk much about in this space, but it's relevant to our interests.
Thanks to S-LAB and to Karen Stevenson for coming up to Toronto, talking to us, and changing the way I think about hierarchies.