The Net Takeaway: Page 30


Class Action Suit filed against SPSS · 05/17/2004 12:10 PM, Analysis

I guess this should surprise nobody. Whenever a stock has trouble for a nanosecond, lawyers dive in “on behalf of their clients”. Funny enough, it’s rare that a class rises out of the masses and finds a lawyer; it always seems to go the other way…

Geller Rudman Announces Class Action Lawsuit Against SPSS, Inc. on Behalf of Investors

Obviously, SPSS is still in a transition stage and is still trading as SPSSE. More at Yahoo if you want current tickers… And more on my take in a previous entry.

In my “I’m not a stock analyst and my meager 401k shows it” mode, I suggest we just continue watching events. SPSS isn’t going out of business anytime soon, but we may see the beginning of a reduction in services (longer wait times for tech support, etc.) and other cost-saving measures while resources are reallocated to deal with all these distractions.

Disclaimers are listed in my previous entry, and they all still apply.


* * *


Judoscript Logging · 05/14/2004 03:24 PM, Tech

Check here to find a logging component for Judoscript. No more print statements!


* * *


Analysis in PHP · 05/12/2004 04:01 PM, Analysis

There are many times when we want to use our modeling tactics or analysis techniques outside of the tool environment. There are also new techniques which are not well represented in the mainstream commercial analytic systems of SPSS and SAS.

So, I’m very excited to point to some fantastic work which shows how to code up a variety of modern analytic techniques in PHP. Yes, PHP of LAMP fame.

The main code is at, where Paul Meagher (CEO and founder of Datavore) has written post after post on how to code analytic functions, including vector and matrix math, cosine similarity between two vectors, web mining functions (Markov approach), etc. He has packaged many of these into a PEAR module, which is like the CPAN modules of Perl: an easy way to add functionality to a PHP installation.
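
To give a flavor of what one of these functions involves, here is a minimal sketch of cosine similarity between two vectors. It is written in Python rather than PHP and is not taken from Mr. Meagher's code; the function name `cosine_similarity` is just an illustrative choice.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Dot product of the two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    # Euclidean length (norm) of each vector.
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Vectors pointing the same way score close to 1, and vectors at right angles score 0, which is why the measure is handy for comparing things like term counts across documents.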

But that’s not all! The prolific fellow has written a slew of articles and tutorials on the IBM developerWorks site, including:

Each of these has an excessively well documented Resources section, including both online sources and offline books and articles.

These were all a joy to read, both for bringing these techniques out into a new domain which doesn’t require the installation of new software (no R libraries, no runtime modules, etc.) and for the cogent explanations of the theory to an audience which rarely gets treated to this level of respect regarding why and how we analyze data.

And yes, even if you think you know a lot about this stuff, Mr. Meagher probably found a link you hadn’t seen yet. All highly recommended.


* * *


Web Metrics vs. Web Analytics · 05/10/2004 09:03 PM, Analysis

Jim Sterne of is another one of those guys always worth listening to. While I don’t agree with everything he says here, it’s in the right direction (free registration may be required to read it).

Basically, he posits “web metrics” as the descriptives around the internet as a whole: popular sites, broadband penetration, etc. He considers this the bastion of Keynote and Comscore’s MediaMetrix. The down and dirty of analyzing your own web site is what he calls web analytics.

This sort of reflects a marketer’s slant, and ignores the researcher/analyst way of looking at indicators. In the database world, these concepts fall out in a smoother way, and the terms are already in wide use. The difference between metrics and analysis becomes easier if you read “reporting” for “metrics”.

Mr. Sterne is reinventing terms that researchers have already defined. For example, what he calls web metrics is what many of us would refer to as a census. That is, the review of the population of web sites, how they act as a group, and the behaviors of the denizens within the Internet: these all make up the population our interactive marketing lives within. That’s not “web metrics”, that’s “describing the Internet”... or a census.

My definitions are much more workable, and they force the tools to put up or shut up. Web metrics are the counts of behaviors and actions on your site. They can be script executions, page views, or combinations of pages making up an action. They can also be counts of user-agents, browser types, and return visits. In some cases, they can be counts of “new registrants”, and, of course, purchases.

Web Analytics is breaking apart the web metrics, often in an interactive fashion, to understand patterns and estimate causes of behaviors (and changes in behavior) in those metrics. It can also involve linking “web metrics” to “other metrics” including retail sales for the same time period, etc. Can one do such a data examination with reports (“web metrics”)? Sure, up to a point. If you know that you have different sources of visitors, you can pre-configure the reporting to split on that variable to show different patterns of behavior. And if there are no differences one day? Your reporting may not be able to tell you why, so you shift to ad-hoc reporting which is the first step to analysis.
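
The difference can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the `hits` log, its source labels, and both function names are hypothetical, and no real tool's output looks exactly like this.

```python
from collections import Counter

# Hypothetical page-view log: (visitor source, page requested) pairs.
hits = [
    ("search", "/home"),
    ("search", "/product"),
    ("email", "/home"),
    ("email", "/product"),
    ("email", "/checkout"),
    ("banner", "/home"),
]


def page_views(log):
    """A web metric: one plain count for the whole site."""
    return len(log)


def page_views_by_source(log):
    """First step toward analysis: the same count split on the source variable."""
    return Counter(source for source, _page in log)


print(page_views(hits))
print(page_views_by_source(hits))
```

The first function is the kind of number a canned report gives you. The second is the pre-configured split described above; real analysis starts when you can choose the splitting variable on the fly instead of baking it into the report.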

So, the cheap and open source tools do a nice job at making web metrics or web reporting. Very few of the tools (except at the high end) allow the ad-hoc and analytic approach.

One of the few shows focused on web analysis is Mr. Sterne’s Emetrics summits. Anyone in this space is well advised to attend, either in the US or the UK. More info here. Mr. Sterne’s books are also recommended reading. Whether you like my definitions or Mr. Sterne’s, note that we are both headed in the same direction and I would gladly follow any path he chooses to examine.


* * *


BBClone Counter and Textpattern · 05/09/2004 02:24 PM, MetaBlog

Apfelsoft has graciously offered a plugin to provide BBClone tracking on Textpattern.


* * *


Clementine 8.1 in Infoworld · 05/09/2004 01:38 AM, Analysis

Very positive review of Clementine 8.1 in Infoworld. I am still on 7.something, so sounds like it’s time to upgrade…

Read all about it here


* * *


Java Classpath issues... · 05/06/2004 07:15 PM, Tech

One of the most confusing aspects of Java (both running and programming) is the annoying ClassPath.

This is like the executable search path in DOS/Win32 cmd.exe, but in Java, you have to add each .JAR file to the list!

The Java runtime command (and the javac compiler) allows the -classpath option to define a specific path, but that often means accidentally excluding any classpath you may have already defined. BTW, you can have multiple entries in the classpath, but look out for Windows vs. Unix differences:
Unix: java -classpath dir1:dir2:dir3 …
Windows: java -classpath dir1;dir2;dir3 …

As K-Zone points out, “The reason for the difference is that Windows uses the colon (:) character as part of a filename, so it can’t be used as a filename separator. Naturally the directory separator character is different as well: forward slash (/) for Unix and backslash (\) for Windows.”

Ok, so how to “append” to the classpath for this run? Use the environment variable:
Unix: java -classpath $CLASSPATH:dir1:dir2 …
Windows: java -classpath %CLASSPATH%;dir1;dir2 …
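
If you launch Java from a script, you can let the script pick the separator for you. Here is a minimal sketch in Python (not Java or Judoscript); the helper name `build_classpath` is made up for illustration. It relies on Python's `os.pathsep`, which is `:` on Unix and `;` on Windows, the same separators Java expects.

```python
import os


def build_classpath(entries, existing=None):
    """Join classpath entries with this platform's path separator.

    existing: an already-defined classpath string (such as the CLASSPATH
    environment variable) to keep in front of the new entries, or None.
    """
    parts = [existing] if existing else []
    parts.extend(entries)
    return os.pathsep.join(parts)


# Append dir1 and dir2 to whatever CLASSPATH is already set, if any.
classpath = build_classpath(["dir1", "dir2"], os.environ.get("CLASSPATH"))
print(classpath)
```

The resulting string is what you would hand to the -classpath option.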

A quickie is to use the “current dir” or “.”:

BTW, to see the classpath in Judoscript:
usage { desc = 'Prints the classpath nicely.';}
for x in systemProperty('java.class.path').csv('${:}' )
{ println x;}


* * *


Flying... on a jet plane... · 05/06/2004 05:14 PM, Personal

For the first time, I flew in a “personal” or “private” jet. We needed to visit a client in Warrendale, PA for an early morning meeting. After adding up the hotel rooms for the night before and the rapacious USAir pricing into and out of Pittsburgh, it was clear that we had to find another solution.

Enter “Boston Air Charter”. We flew on a Citation S/II with room for 7 people. The plane was based at Norwood Airport, a civil airport south of Boston, and an easy drive on the 128/95 highway around Boston. The pilots were very professional, and had colas, coffee, and donuts for the early morning flight, and boxed lunches for the flight back. They also had stocked the plane with the day’s papers and that week’s newsmagazines.

We literally parked right on the runway, walked to the plane, and flew. No painful parking at Logan airport, no tolls, no security lines, no angry people treating us like cattle.

Upon takeoff, we flew low (6000 feet) over the suburbs south of Boston and saw some unbelievable mansions hidden away down there. We then gradually headed to 32000 feet with dips up and down as required by the flight plan. Though the plane was smaller than a commercial jet, and turns felt very “tilty”, on the whole, it was no more uncomfortable than a big liner.

There is a small bathroom… well, a seat with a lid and a partition curtain; you wouldn’t want to have an “emergency” there. And yes, it is difficult to stand up, and if you have long legs, the plane is a bit small… but overall, not much more annoying than a full 727 which is the norm these days for short flights.

This chartered plane was really a nice way to travel. Obviously, we can’t afford it all the time… but with the pains of modern commercial air traffic, these smaller carriers can really change the way to get from place to place. I’m keeping my eye on companies like LinearAir as the wave of the future.


* * *


Jamaica Bytecode Assembler · 05/03/2004 12:27 PM, Tech

This is hard-core Java stuff.

James Huang, creator of Judoscript, has published an article at JavaWorld where he describes his latest creation: Jamaica, the JVM macro assembler language.

While I will never directly use this thing, it’s important because it’s a huge step on the way to getting Judoscript to compile down to bytecode for speed.


* * *


More on "cookie killing"... · 05/03/2004 10:53 AM, Analysis

Eric Peterson wrote the book Web Analytics Demystified: A Marketer’s Guide to Understanding How Your Web Site Affects Your Business. He used to be with WebSideStory, and had earlier worked with Webtrends and Coremetrics. From reading his blog and parts of his book, he’s a pretty sharp character (and he hangs out with a good crowd).

He had the same reaction to the recent reports of “perhaps 40% of users kill their cookies” that I did.

He questions the size of the numbers, and the lack of data around these broad claims. I agree that the numbers are probably wiggly, but I don’t really care: it doesn’t matter what the broad percentage is across the web, what matters is the specific group of people who visit your site.

I would point out that it’s a classic sampling problem. There are certain kinds of people who kill their cookies (or have friends or children who set up cookie killers on their machine for them). They are also more likely to be heavy users of the web. Doesn’t it stand to reason, therefore, that any broad “percentage” that ignores what segment the person resides in is useless? If I have 100% of my “heavy users” who kill their cookies, and 3% of my light users, then saying “40%” kill their cookie is rather silly.

Instead, measure it on your site. Do the same thing Redeye did: Link any identification system to the cookie placement, and segment your users by actions (initially). You will find that people who don’t get far past the home page may delete their cookie 100% of the time, but people who customize your site and register never do. NOW you can start talking about %ages… per groups of people with similar behaviors.
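
To make the heavy-versus-light arithmetic concrete, here is a small Python sketch. The visitor counts are invented purely so that the per-segment rates come out to 100% and 3% while the blended rate comes out to 40%; the segment names and function names are hypothetical too.

```python
# Hypothetical counts, chosen only to reproduce the 100% / 3% / 40% figures.
segments = {
    "heavy": {"visitors": 185, "deleted_cookie": 185},
    "light": {"visitors": 300, "deleted_cookie": 9},
}


def rate_by_segment(segs):
    """Cookie deletion rate within each segment."""
    return {name: s["deleted_cookie"] / s["visitors"] for name, s in segs.items()}


def blended_rate(segs):
    """One overall rate that ignores which segment a visitor belongs to."""
    total_visitors = sum(s["visitors"] for s in segs.values())
    total_deleted = sum(s["deleted_cookie"] for s in segs.values())
    return total_deleted / total_visitors


print(rate_by_segment(segments))
print(blended_rate(segments))
```

With these counts, 194 of 485 visitors delete their cookie, which is the headline 40%, yet that figure describes neither group. Change the mix of heavy and light users and the blended number moves even though nobody's behavior changed.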

So, my post provided some potential strategies to work around cookie deleting, but again, the point is not that “people everywhere are rampantly killing cookies”. The questions to ask should be “I understand that some people do… but how bad is the problem on my site?” and “How important is it to me and my business to track them via cookies, or give them a reason to keep my cookie?”

Again, I don’t care what some broad research says about how the fruit salad of users kills their cookies. I want to treat my apples differently from my oranges.

(Ok, ps: An interesting approach would be to use the MediaMetrix/NielsenNetRatings panel, see how they deal with cookies, and segment them. Due to their weighting, you can start to tell how the “grey cloud of users” behaves in this regard. You can even match the segments up by how they behave on your site, if your site shows up in the behavioral panel tracking. But seriously, can we get past this?)


* * *


