The Net Takeaway: Page 31


Upgraded to Gamma 1.18a... and Plugins · 05/01/2004 10:51 PM, MetaBlog

New version of Textpattern installed here. Please let me know if things are not working correctly…

Also, from Nathan Pitman, we have some more sources of interesting plugins:

BTW, the best (read: only) documentation on Textpattern plugins at publication time is Dean’s post here.

In addition, a somewhat clearer explanation of tags is present here.


Savarese on Groovy · 04/30/2004 02:05 PM, Tech

If you do any Java, you quickly run across Daniel Savarese. This fellow made one of the first regex engines for Java, and continually pumps out interesting and useful info.

He writes here in JavaPro about the Groovy scripting language. (It was nice that he gave my personal fave Judoscript a mention.)

One of the reasons I enjoy Daniel’s work is that he is a straight shooter. So many articles about Groovy are just glowing; Daniel points out that the learning curve is a bit tougher than you might think when learning Groovy. Some of its strengths (loose typing, “convenience” features), as he says, “offer many pitfalls.” Now, this is true for practically any scripting language, though I’ve found that Judoscript seems to be more logical to me in what it offers compared to the others.

One thing he points out is the potential value of the “SwingBuilder” construct for making Swing easier, and I can definitely use anything which simplifies the monster known as Swing.

He ends on a positive note that Groovy is a language growing and changing, and I’m sure I’ll love it when I finally understand it… and p/jython… and LISP.

BTW, found this on the “Groovy News” Tiki site. Like most Tikis, it is chock full of potential and slow as molasses. I still love Tiki, but until it gets performance up and bugs out, it’s hard to use it for much of anything which might scale.


Genetic Analytics... · 04/27/2004 01:37 PM, Analysis

A friend suggested I look at Genalytics. Based out of Newburyport, MA, these folks have been quietly creating business analytic software based on genetic algorithms.

They provide services using their software, but their pride and joy is the Genalytics Predictive Suite. It includes some good software to make the suite an all-encompassing analysis tool.

The Modeling portion provides the usual visualization tools, etc. and they aren’t anything special. Their claim to fame is really the creation of one of the first publicly available genetic data mining tools for “general” use.

This class of analysis is also referred to as evolutionary algorithms, because it models analysis after Darwinian evolutionary theory. In general, you have rules which predict things, and those rules interact, with the “strongest” surviving to provide a final result. The work centers on two main activities: crossover and mutation. Also, their fit evaluation usually takes in all variables at once instead of the partial views most rule induction methods tend to use.

Now, I don’t know how Genalytics does this, but there is a “Michigan Approach” whereby you start with a group of rules, your “chromosomes”: half are random, half are based on most frequent values. The chromosomes make up a rule. You cycle through an evaluation (do these rules fit?), select the strongest, and then do a “reproduction” phase, where you allow some rules to cross (swap antecedents and consequents), some rules to pass through, and some to mutate (new rule based on old rule and a change).
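To make that cycle concrete, here is a minimal Python sketch of the evaluate, select, reproduce loop. It is an illustration only and not how Genalytics (or any particular package) does it. The toy data, the single-condition rule format `(attribute, value, outcome)`, the fitness measure, and the 60/30/10 split between crossing, mutating, and passing through are all assumptions made up for this example.

```python
import random
from collections import Counter

# Toy data, invented for illustration: (attributes, outcome) pairs.
ROWS = [
    ({"region": "east", "plan": "gold"}, "buy"),
    ({"region": "east", "plan": "gold"}, "buy"),
    ({"region": "east", "plan": "basic"}, "buy"),
    ({"region": "east", "plan": "gold"}, "buy"),
    ({"region": "west", "plan": "gold"}, "skip"),
    ({"region": "west", "plan": "basic"}, "skip"),
    ({"region": "east", "plan": "basic"}, "skip"),
    ({"region": "west", "plan": "gold"}, "buy"),
]

def fitness(rule, rows):
    """Share of all rows that the rule both covers and predicts correctly."""
    attr, value, outcome = rule
    hits = sum(1 for attrs, label in rows
               if attrs.get(attr) == value and label == outcome)
    return hits / len(rows)

def random_rule(rows, rng):
    """Rule with a random attribute, value, and outcome drawn from the data."""
    attr = rng.choice(sorted(rows[0][0]))
    value = rng.choice(sorted({attrs[attr] for attrs, _ in rows}))
    outcome = rng.choice(sorted({label for _, label in rows}))
    return (attr, value, outcome)

def frequent_rule(rows, rng):
    """Rule seeded from the most frequent value and most frequent outcome."""
    attr = rng.choice(sorted(rows[0][0]))
    value = Counter(attrs[attr] for attrs, _ in rows).most_common(1)[0][0]
    outcome = Counter(label for _, label in rows).most_common(1)[0][0]
    return (attr, value, outcome)

def crossover(a, b):
    """Keep each antecedent, trade the consequents."""
    return (a[0], a[1], b[2]), (b[0], b[1], a[2])

def mutate(rule, rows, rng):
    """New rule based on the old one with a re-drawn antecedent value."""
    attr, _, outcome = rule
    value = rng.choice(sorted({attrs[attr] for attrs, _ in rows}))
    return (attr, value, outcome)

def evolve(rows, pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    half = pop_size // 2
    # Initial population: half random rules, half frequency-based rules.
    pop = [random_rule(rows, rng) for _ in range(half)]
    pop += [frequent_rule(rows, rng) for _ in range(pop_size - half)]
    for _ in range(generations):
        # Evaluation and selection: the strongest half survives unchanged.
        pop.sort(key=lambda r: fitness(r, rows), reverse=True)
        survivors = pop[:half]
        # Reproduction: cross, mutate, or pass through.
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            roll = rng.random()
            if roll < 0.6:
                children.extend(crossover(a, b))
            elif roll < 0.9:
                children.append(mutate(a, rows, rng))
            else:
                children.append(a)
        pop = (survivors + children)[:pop_size]
    return max(pop, key=lambda r: fitness(r, rows))

if __name__ == "__main__":
    best = evolve(ROWS)
    print(best, fitness(best, ROWS))
```

Real rule sets would have multi-condition antecedents and a fitness measure that balances accuracy against coverage; the point here is only the shape of the loop.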

Now, this is a bit of work to set up, and not something which you just set up in SPSS or Clementine. Instead, you have to code it. R has some packages at CRAN which start down this path, but most of the work is still high end researchy stuff.

So, Genalytics has productized it. Doug Newell, their CEO, has been tireless in trying to get people to recognize the power of the platform. It’s not cheap; $50k can get you an initial integration and, say, a model (according to a Forrester report posted on the Genalytics site), but to really use the product, you recognize quickly that this is an “enterprise” priced solution.

I think they need to make a Clementine wrapper around it and price it on a per use basis, but that’s just me.

As a harbinger of cool things to come, I look forward to more penetration of new techniques such as genetic algorithms into our analytic toolkit. I also encourage folks who want a leg up on the competition to consider a Genalytics analysis.


SPSSE: SPSS and Nasdaq · 04/25/2004 07:26 PM, Analysis

Ok, it would be remiss of me not to comment just a moment on the SPSS potential delisting issue currently facing us (and them).

Why do we care? Well, a liquid equity market is an important part of a company’s ability to get financing for big projects. If the stock is delisted from a major exchange, it becomes a “pink sheet” which is harder to trade. This makes it harder for SPSS to get credit or otherwise expand without extra “penalties” such as early payback requirements, higher interest rates, etc. (They did have $37 million in cash last year, so they aren’t desperate, but still…)

Ok, so what’s going on? Well, a million years ago (in Internet years), SPSS decided to join the Internet boom. In October of 2001, SPSS made a deal to become the “survey engine” for AOL (more here). This seemed great at the time: it was a big win for the SPSS MR Division, proving that they had both a scalable solution and access to a large panel of respondents.

Jump forward to today. AOL has had its ups and downs, and like everything else during the boom, prices were way inflated. So, SPSS decided to review how it had accounted for the deal over the last few years, and reworked the deal with AOL. This prompted their first request: they asked for a 15-day extension on their 10-K (a required filing with the SEC for all public companies).

However, the review turned up a new issue, an accounting error whereby SPSS overstated revenues by $3-6 million (note that on an annual basis, this represents no more than 2.8% of their annual revenues). Because of this error, lots more had to be checked… and so they missed the filing deadline. Unlike most of our deadlines, this one really is sort of important.

Because that missed deadline is grounds for removal from the exchange, SPSS now trades as SPSSE. Obviously, they are appealing the issue, and if all works out, this will just be another chapter in the long history of SPSS.

But I worry:

But mostly, I just think this points out how difficult today’s financial world is. It’s tough to run a large company (SPSS has a $270 million market cap, and reported $210 million in revenue last year), and every little mistake nowadays impacts, well, everything.

I wish SPSS lots of luck, and I am sure everything will work out. But let’s all keep a close eye on this, ok?

Disclosure: I own no SPSS stock, but I do own mutual funds, and they may have some shares, I don’t really know. I don’t work for the company, and I didn’t ask anyone there about this stuff. I pulled it all from Google News and links in a Google search. I am not a lawyer (IANAL) nor a stock analyst, so take all of the above to your personal broker before buying or selling anything you own based on my potentially incorrect interpretation of the circumstances.


Python and Stats modules... · 04/22/2004 02:11 PM, Analysis

Simson Garfinkel notes that Python has some pretty good stats modules, and that Salstat is written in Python. He also points to a simple stats module in Python for the quick and dirty stuff.

I might add that Perl has the impressive PDL: Perl Data Language for matrix calculations, as well as some other stats modules at CPAN.


Optimal Impressions · 04/21/2004 07:04 PM, Analysis

Nice study by Atlas entitled Optimal Frequency— The Impact of Frequency on Conversion Rates (PDF file). They are one of the first to properly examine the cumulative effect of 1 and only 1, 2 and only 2, etc. impressions instead of the bass ackward way this is usually done (group all the 6 impression people together, even if they converted on the 3rd impression, etc.).
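To see why the grouping matters, here is a small Python sketch contrasting the two ways of counting. It is one reading of the idea and not Atlas's actual method. The input format (total impressions served, plus the impression number at which the user converted, or `None`) and the sample data are invented for illustration.

```python
from collections import defaultdict

def rate_by_total_impressions(users):
    """The usual way: bucket users by how many impressions they ended up with."""
    seen = defaultdict(int)
    converted = defaultdict(int)
    for total, converted_at in users:
        seen[total] += 1
        if converted_at is not None:
            converted[total] += 1
    return {n: converted[n] / seen[n] for n in sorted(seen)}

def rate_by_impression_number(users):
    """Credit each conversion to the impression at which it happened.

    A user counts as exposed at impression n only if they had not already
    converted before it.
    """
    exposed = defaultdict(int)
    converted = defaultdict(int)
    for total, converted_at in users:
        last = converted_at if converted_at is not None else total
        for n in range(1, last + 1):
            exposed[n] += 1
        if converted_at is not None:
            converted[converted_at] += 1
    return {n: converted[n] / exposed[n] for n in sorted(exposed)}

if __name__ == "__main__":
    # (total impressions, impression number at conversion or None)
    sample = [(6, 3), (6, None), (3, 3), (1, 1), (2, None)]
    print(rate_by_total_impressions(sample))
    print(rate_by_impression_number(sample))
```

In the sample, the user who saw six impressions but converted on the third gets credited to the "6" bucket by the first function and to impression 3 by the second, which is exactly the distortion described above.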

More details in the paper. It represents some pretty good thinking, and is worth a read. In fact, anytime Young-Bean Song has something to say, it’s worth listening to.


"Uniques" problem · 04/21/2004 05:57 PM, Analysis

We’ve struggled with the unique user problem in web analysis since we first started. And it ain’t over yet. An article in Editor & Publisher points out some recent research which, yet again, shows that uniques via a log file don’t reflect self-reported behavior. Other research shows that cookie uniques do not match data from login counts (didn’t get much play here in the US, but read all about it here).

This is nothing new; at Strategic Interactive Group (now Digitas), we were calculating correction factors for this problem in 1995.

I guess I get frustrated that
a) we keep talking about this problem, and
b) we don’t just solve it.

I am amazed at the continual fear of cookies, for no real reason. That being said, as we continue to find ways to add value to a user’s experience in exchange for a cookie, we may yet get past this.

Of course I said that almost ten years ago, but hope springs eternal.

How might we solve it? Besides requiring logins, we can be good marketers (as above) and provide value for cookies. We can estimate a correction factor based on cookied individuals. We can compare cookies to IP over a time period to get a feel for transitions (DHCP doesn’t expire as fast as everyone thinks, and of course, AOL is always a pain) and understand whether cookies are really being deleted or not. Remember, just because it happens to their site doesn’t necessarily mean it’s happening to yours; it’s an empirical question.

And most importantly, our research implies that most people, most consumers, on average, just aren’t deleting them… and those who are may not be the best prospects for what you are selling. So consider the specifics of your situation before jumping to conclusions.
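As a rough illustration of the correction-factor idea, here is a Python sketch that assumes you have some slice of traffic where both a login and a cookie are known. The function names, the input format, and the assumption that logged-in visitors churn cookies at the same rate as everyone else are all hypothetical; check that last one against your own data before trusting the number.

```python
def cookie_correction_factor(login_sessions):
    """Ratio of distinct people to distinct cookies among logged-in visits.

    login_sessions: iterable of (user_id, cookie_id) pairs.
    A factor below 1.0 means cookies overcount people.
    """
    pairs = list(login_sessions)
    users = {user_id for user_id, _ in pairs}
    cookies = {cookie_id for _, cookie_id in pairs}
    return len(users) / len(cookies)

def estimated_uniques(total_cookie_count, factor):
    """Scale the site-wide cookie count by the observed people-per-cookie ratio."""
    return total_cookie_count * factor

if __name__ == "__main__":
    # Made-up sample: ann shows up under two cookies, cat twice under one.
    sessions = [("ann", "c1"), ("ann", "c2"), ("bob", "c3"),
                ("cat", "c4"), ("cat", "c4")]
    factor = cookie_correction_factor(sessions)
    print(factor, estimated_uniques(1000, factor))
```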

As far as the “value for identification” transaction, I like the “bait and switch” techniques of Orbitz. The first search for airfare is “free”, no login required. But as soon as you see one you want more info on, or you want to edit your search… you have to log in. By this time, you are committed, and you will log in. There is a value to doing so: getting the fare info you are looking for.

So, instead of cookieing wildly and wondering why users are deleting, let’s make the cookie overt, explain why we set it, and give a really good value reason.

Or, we can just keep bringing up the uniques vs. visits issue every other week or so, as an easy conversational fallback. Your call.


Where is Statistica these days? · 04/21/2004 05:20 PM, Analysis

CSS:Statistica, by Statsoft, was one of the best deals going in stats software. Fantastic graphics, very MS-Windowsy (meaning it took good advantage of the MS Windows platform), scriptable in a VBScript-style language (like MS Excel, etc.), and included tons of procedures in the package. It was also very reasonably priced, meaning a small company could afford to buy it with all the options and have a very nice analysis environment.

It did have “windowsitis”, meaning it spawned a window for every piece of output or options window, which got crowded very quickly… But it was lots of fun to use, and a great deal.

However, prices went up quickly, it stopped being such a bargain, and now, the site doesn’t even mention price, which is usually a sign that I can’t afford it (“If ya have to ask…”). I didn’t hear it mentioned much in conversation, folks looking for advice didn’t even mention it in their short list, and discussions which at one point were “SPSS vs. Statistica in a cage, which wins?” became “SPSS vs. SAS: which one is David again?”

However, they still live. They are popular in Europe, and they are still making new products (with a somewhat recent emphasis on QC offerings and some data mining stuff).

And now, their site has been redone. Check it out. And don’t worry, the wonderful Statsoft Electronic Textbook is still there!


A little light reading... · 04/21/2004 03:14 PM, Analysis

SPSS has always been pretty good about giving out some meaty white papers and newsletters. More recently, they’ve gotten a bit skimpier, but still worth a look.

Check them out here.

They are a mix of fluff and stuff, so you may have to hunt a bit to find the gems. In addition, some of the fluffy titles have good stuff, so grab one a day for a quick update about applied stats in today’s market.


Tom Hespos agrees... · 04/20/2004 03:25 PM, Marketing

Tom came to the same conclusion I did about the Subservient Chicken at OnlineSpin from MediaPost.

Just remember, you heard it here first!

Update: So does Jim Meskauskas


