The Net Takeaway: Page 29


Danny Flamberg's Blog
Danny has been marketing for a while, and his articles and work reflect great understanding of data driven marketing.

Eric Peterson the Demystifier
Eric gets metrics, analytics, interactive, and the real world. His advice is worth taking...

Geeking with Greg
Greg Linden created Amazon's recommendation system, so imagine what he can write about...

Ned Batchelder's Blog
Ned just finds and writes interesting things. I don't know how he does it.

R at LoyaltyMatrix
Jim Porzak tells of his real-life use of R for marketing analysis.






Mark Rittman on Oracle Data Mining in 10g · 06/09/2004 02:56 PM, Analysis Database

We are an Oracle house here, and though my heart lies with MS SQL Server, my brain has learned Oracle. And there are some great things there. The indefatigable Mark Rittman has recently blogged about how impressive the built-in data mining is in Oracle 10g. Based on their purchase of the Darwin product from Thinking Machines (one of the first real commercial data mining tools), Oracle has been building more and more data mining into the core db… but unlike MS, they’ve made it nightmarish to get to. Basically, like their OLAP, Oracle provides no simple client tools; instead, you have to write Java code to get to it. This has made it a non-starter for me, especially compared to MS’s GUIs and using Excel as an OLAP client tool.

Luckily, Oracle has now taken a step in the right direction with 10g. Mark points out the following on his entry:

Oracle 9i and 10g have a Data Mining Option for the Enterprise Edition of the database which embeds a number of data mining routines in the database engine. Access to these routines was initially provided in Oracle 9i through a Java data mining API, and when Oracle 10g was released this access was broadened through the introduction of DMBS_DATA_MINING, a PL/SQL API for these routines. Oracle position the data mining option as an embedded data mining engine with the emphasis on real-time scoring and classification of data, the idea being that you build your mining model using DM4J (or any tool such as SPSS that can output the mining model using PMML) and then load it into the database ready to carry out scoring.

(Note that SPSS (using SmartScore) and Clementine can both work with PMML XML model descriptions.) And:

If you want to have a play around with Oracle data mining, it’s installed alongside the OLAP Option when you choose the ‘Data Warehousing’ DBCA template. There’s a useful ‘Oracle By Example’ tutorial available on OTN entitled Using Oracle Data Mining to Predict Data Behavior which walks through the creation of two data mining models, and the DM4J project page has a number of data mining tutorials that demonstrate the use of DM4J’s data mining wizards.

But the best, the very best, is all the detail he goes into here about the PL/SQL access to the API. Not only does it give access to generate models and score data (as mentioned above), but it also has a transform API to create categorical bins (finally, a SQL-based histogram tool!). Now, if only it had a “transpose” function that didn’t require PL/SQL coding, our lives would be much easier…
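To make the binning idea concrete, here is a minimal Python sketch of equal-width binning, the kind of transformation those transform routines perform on a numeric column. The function name and the bin counts are illustrative, not Oracle's API:

```python
# Equal-width binning: split a numeric column into n categorical bins.
# This is a toy illustration of what a SQL-based histogram/binning
# transform does; names and numbers here are made up for the example.

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width buckets (0-indexed)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1  # guard against a zero-width range
    labels = []
    for v in values:
        # Clamp the maximum value into the top bin.
        b = min(int((v - lo) / width), n_bins - 1)
        labels.append(b)
    return labels

ages = [18, 22, 25, 31, 40, 44, 58, 63]
print(equal_width_bins(ages, 3))  # → [0, 0, 0, 0, 1, 1, 2, 2]
```

The same effect can be had in plain SQL with a CASE expression or a width-bucket style function; the point is that binning turns a continuous field into categories you can GROUP BY.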

(BTW, those looking for a transposer (and who isn’t?) for Oracle might enjoy this article from OraMag. Others can be found on Google.)
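For readers who haven't met the transpose problem before, here is a tiny Python sketch of the operation: turning "long" rows (one row per entity per metric) into one record per entity with metrics as fields. The data is invented for the example; in SQL this is typically done with CASE/DECODE aggregates:

```python
# Toy pivot ("transpose") of long-format rows into per-key records.
from collections import defaultdict

rows = [  # (customer, metric, value) in "long" form; made-up data
    ("c1", "visits", 5), ("c1", "orders", 2),
    ("c2", "visits", 3), ("c2", "orders", 1),
]

pivoted = defaultdict(dict)
for key, metric, value in rows:
    pivoted[key][metric] = value  # one record per customer

print(dict(pivoted))
# → {'c1': {'visits': 5, 'orders': 2}, 'c2': {'visits': 3, 'orders': 1}}
```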

Comments? [1]

* * *


More Textpattern Plugins · 06/08/2004 11:29 PM, MetaBlog

Mark Norton, AKA Remillard on the TP forums, was kind enough to create a catalog of Textpattern plugins.

Many have complained that TP does not have a Calendar feature like MT. Well, there is a plugin to help with that. Conditionals are here, and lots of things you probably didn’t even know you needed.

Lots more to like there; worth checking back a couple of times in the future!

Also, in other news, a port of the vastly superior “Textile 2” is now available in PHP, and some folks have shoehorned this into Textpattern here. I look forward to trying this one out soon…


* * *


Textpattern 1.19 released · 06/07/2004 06:48 PM, MetaBlog

I will wait a bit for the dust to settle before this site upgrades, but as Dean announced here, the newest version of TP, Gamma 1.19, has been released.

Note that Dean doesn’t link to old versions, so keep your 1.18 handy in case you need to downgrade. Also, make sure you back up your data and original install!

Get it here.

And, as mentioned previously, this one is GPL.

Comments? [1]

* * *


Should it be this hard to find a good research analyst? · 06/07/2004 12:18 PM, Analysis

We have a position open at my company for a senior database marketing research analyst. At first, I was looking for someone with a well rounded mix of skills. Now, I am starting to wonder if I should lower the requirements to “can count using either hand”.

I started off looking for someone who had experience with both quantitative and qualitative research. That was quickly squashed, as number crunchers tended to shy away from the “softer arts” and the qualitative folks tended to be weak on advanced analysis (though some understood conjoint and scaling/clustering). (While I still believe in the merging of the two, I know it’s a pipe dream. But if you have the time, make sure to sit in on a focus group or two, and take a survey-writing course someday… it’s harder than you think.)

Ok, so I retrench, and create a basic list of qualifications for success. But I ask you, am I looking for too much? Everyone who has impressed me across multiple industries has this mix of skills, and I think that success in the modern marketing analysis world will require all of them. But 70 phone calls and 10 interviews later, I am still looking…

Now, I am not alone in expecting these things. Pick up any issue of Intelligent Enterprise or DM Review and read how these columnists and authors are begging for data warehouse designers, developers and planners to work with the business heads to make sure what comes out of a multi-million dollar data project actually fits the needs of the business. And if someone is a novice but they understand that they need this, well, I can help. But a senior level analyst coming in with either “I did that stuff back when, but nowadays I don’t need to know how to do it, just how to use it” or “I just analyze it, what they do with it is their job” will have a shorter career than they might hope for.

So, I encourage you, if you work in this space, to take stock of your own mix of skills.

Can someone succeed with just part of the story? Sure… but not as an analyst. You can do great things in finance, in selling CRM tools, in becoming a DBA, in working with tabs from survey houses, or even in the client services wing managing the various projects that analysts are working on. But you won’t be able to rise up as an analyst, and you certainly won’t be CxO of a large company (if that’s what you want) without more of the above. (Well, you could start your own company, make it a success, and then laugh at me and my silly claims… but that’s a different story.)

Meanwhile, if this is all easy stuff to you, and you are in the Boston area, perhaps you can use the Mail Me link here to drop me a line.


* * *


Segment Stability · 06/04/2004 12:41 PM, Analysis

We are often asked to examine the performance of groups of people by segment. This seems easy: each person was assigned to a segment at day 1, so just group up their performance.

Only, it’s never that easy.

Often, segments are updated based on time or behavior. RFM groupings change after a purchase, or simply because more time has passed. Preferences may change based on a shift in buying pattern.
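The RFM drift is easy to see in a toy example: the same customer's recency score changes as the clock runs, with no new behavior at all. The thresholds below are invented for illustration:

```python
# Toy illustration of segment drift: a recency score that decays
# purely with the passage of time. The 30/90-day cutoffs are made up.
from datetime import date

def recency_score(last_purchase, today):
    """Score 3 (hot) down to 1 (cold) based on days since last purchase."""
    days = (today - last_purchase).days
    if days <= 30:
        return 3
    if days <= 90:
        return 2
    return 1

last = date(2004, 1, 15)
print(recency_score(last, date(2004, 2, 1)))  # → 3: a recent buyer
print(recency_score(last, date(2004, 6, 1)))  # → 1: same person, months later
```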

There is also the issue that if you are targeting segments with specific offers, it’s not how the segment did, it’s how the segment responded to your choice of specific targeted offer/content…

The analyst is then faced with how to structure the data:

  1. You can double count the person in all the groups in which they were a member.
    • This basically says that you don’t care about the person per se, simply how the amorphous group of “non-buyers” responds in general to various offers.
  2. You can weight down the person; in effect, distribute their impact based on the length of time they spent in each segment.
    • Much more processing is required here, of course.
  3. Take only their most recent or first assignment (if you logged it. You did track segment shifts, right?) and group them by that.
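The weighting approach (the second option) can be sketched in a few lines: allocate each person's results to segments in proportion to the time they spent in each. All the names and numbers below are hypothetical:

```python
# Sketch of time-weighted segment attribution: distribute a person's
# revenue across segments by days of membership. Data is made up.

memberships = {  # person -> [(segment, days_in_segment)]
    "alice": [("new", 30), ("active", 60)],
    "bob":   [("active", 90)],
}
revenue = {"alice": 120.0, "bob": 90.0}

segment_revenue = {}
for person, spells in memberships.items():
    total_days = sum(days for _, days in spells)
    for segment, days in spells:
        # Each segment gets revenue proportional to time spent in it.
        share = revenue[person] * days / total_days
        segment_revenue[segment] = segment_revenue.get(segment, 0.0) + share

print(segment_revenue)  # → {'new': 40.0, 'active': 170.0}
```

Note the extra bookkeeping this demands: you need the full membership history per person, which is exactly the "did you log segment shifts?" problem.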

None of these are optimal. Instead, I’ve found myself thinking more about the Marketing Mix Modeling work going on, which emphasizes a time-series approach based on econometrics. Perhaps it’s a “time by segment at that time” model to better understand how people behave when they are in a specific segment space.

But, yes, given time constraints, like so many others, I often fall back to first providing option 1 and then going back later and doing the better analysis… yet another case of “Good Enough” defeating “Doing it Right”.

(BTW, notice how that cliché, “good enough is the enemy of best”, is coming back into play? John Patterson (founder of NCR) said this back in 1900, and people keep bringing it up as if they invented it, most recently Jim Collins in Good to Great. One of my bosses used to go the other way: “The Perfect is the Enemy of the Good, the Accepted, and the Paid For”.)

More about Marketing Mix Modeling:
An article at
Marketing Management Analytics specializes in this type of analysis


* * *


Scripting in Clem vs. SPSS · 05/31/2004 01:41 AM, Analysis

I am somewhat surprised and disappointed every time I go searching for help on Clementine scripting. Search for SAS help and you get pages; but Clem? I hear the sound of crickets...

Clem's scripting doesn't follow SPSS Base's model, nor does it follow the SaxBasic/VBScript model. Instead it shows its roots as a POPLOG tool, even though its front end is an SWT Java app.

Just as SPSS has bolted SaxBasic onto SPSS Base to program around its scripting and macro limitations, I hope Clementine adopts a BSF framework to allow any BSF-compatible scripting language to be used, including Judoscript (or Groovy or any of the other hot Java scripting languages).

Why script Clementine at all? Isn't it a data-stream GUI? Sure, but there are lots of little annoying things that you have to script around. For example, the set-to-flag node likes to append the original variable name to the new "flagged" fields. This cannot be turned off in 7.x... but a script can strip those out. Also, there are times when you need to branch or loop depending on aspects of the data. No node can handle this, but you can script and adjust the functioning of a node. Finally, think of this: scripts can generate your own custom nodes and streams; the old "programs-writing-programs" approach. Think of it as a macro language: what you use when the basic stream needs some help.

The included docs do not really give much of a tutorial on how to script or why.... but SPSS sells what we used to call "included manuals". The "training" area has these manuals at $100 apiece, give or take... but to be honest, most of them are pretty darn good. For the programming subjects, they describe useful techniques and when to use them; for the stats, they often review how to interpret output and how to decide between the variety of ways to analyze, say, a bunch of categorical data with a few continuous factors.

In the meantime, this link will give you a good starting place to learn more about the tricks of scripting Clementine. It’s a search for all the scripting articles in the SPSS Support db. Yes, it’s behind the support wall, but like all good hosts, they are usually kind to guests.

In addition, the SPSS FTP Site has some streams in zip files worth reviewing (especially, how to transpose and aggregate, worth its weight in gold).

Finally, there are great examples in the vertical streams (the "Clementine Application Templates") included with some versions of Clementine. Those are relatively well documented in the accompanying PDF, but do require some effort to understand how to duplicate some of their effects.

So, it’s not easy, but Clementine scripting can make life much easier. Now if only someone besides me would write about it... (well, someone besides me and Tim Manns.)

(PS: Yes, I know this is an extended whine on top of my previous whine here. Perhaps if SPSS just created some user groups (with cash and prizes) or got the old ones (one?) to be more active... or did whatever SAS did to get people excited about the tool... or did whatever made the people on SPSS-L so active... or maybe just got the SPSSE business cleaned up first.)


* * *


Nice BI Summary in Intelligent Enterprise · 05/27/2004 09:41 AM, Analysis

There are few magazines that I can recommend to the modern analyst as “must reads”. One of them is Intelligent Enterprise. While it varies in quality from issue to issue, every one has something worth reviewing. It swings from down-and-dirty query modifications and modeling approaches (rarely) to more CIO-type product-choosing guides and administration issues.

A recent set of articles is worth a good look: The “BI Scorecard” series by Cindi Howson.

Depending on when you read this, there may be more.

The series reviews features for a collection of OLAP and BI tools. Oracle OLAP is noticeably absent, but by knowing what the others do well, you can compare them against your own knowledge of Oracle’s solution to see which offering is better. This series also helps in understanding what types of features might be necessary, desirable, and just fluff in your decision process.

What’s missing from all this? The affordable “small analyst team” OLAP solution. All of these assume that you will be making a “corporate information factory” or other huge portalized front for millions of employees. The pricing is set up so that a small team of 3 analysts would pay as much as a 100-person small business.

Sure, SQL Server’s solution is pretty darn good, but that’s still a) a database you would be buying just for the OLAP and b) not as cheap as a stand alone solution should be. Same for Oracle OLAP. In the past, ESSBASE (now part of Hyperion) was marketed as the desktop or analyst solution… but now, I guess the market is different.

It’s not that I don’t think everyone deserves to see and manipulate the data… but that in many cases, an existing reporting system meets much of the needs for many of the employees, and to be fair, OLAP does not replace reporting. In this case, when only a few people have the time to play with the data to create the reports and analyses for others, there really isn’t an affordable solution to my knowledge (comments box is open if you want to recommend one).

Oh, and note that pricing is not really reviewed at all… but I’ve found that BI prices tend to be… negotiable.


* * *


Textpattern news... · 05/24/2004 09:23 AM, MetaBlog

Ok, some great news from the Textpattern front…

Lots of action on Textpattern and Wordpress recently thanks to the MovableType “I never promised you a rose garden or free software forever” move. I am also a huge fan of Drupal, and used that for a time… but it needs more wiki-ness. Wordpress was pretty cool, but took too long to set up. TP was up and running in minutes, and the design (what little there is of it here) took another few days (it was my first attempt at using CSS).

TP isn’t perfect; Textile (the TP formatting language) still has some bugs, and there isn’t an easy way to say “don’t format this!” as I’ve documented on the discussion boards. But in general, I’ve had a great time with it.


* * *


Best Map Ever... · 05/20/2004 11:17 AM, Trivial

Let’s face it, driving in Boston sucks. We all know it, but we keep doing it anyway. I’ve found driving in Manhattan to be easier. Even walking in Boston is frightening, come to think of it.

But I’ve also found something which has made all of it much more palatable. Beyond the big yellow “book o’ maps”, beyond Mapquest and Mappoint (or even Maporama or Map24, both of which show much more detail and useful info than the US-based map companies), when you are down in the city and need a map, I’ve found the one to have.

Berndtson & Berndtson have been making maps of Europe for years, and also reselling them under various names in the US. Well, their map of Boston is amazing, and costs only $7 or so. Look for yourself:

I cannot recommend this map enough. I have bought multiple copies over the years, some from Amazon, and the most recent few from Barnes & Noble (not online, only in the stores, I’m afraid). The Boston Globe store used to carry them as well, but the one in Downtown Crossing has closed.

I can’t vouch for how well their other city maps work for you, but look at the great images at their site for each city of interest and see if they help.

There are lots of other interesting maps out there (the Access series by the amazing Richard Saul Wurman comes to mind) but for best utility and clarity for the buck, let Berndtson and Berndtson be your guide.


* * *


There Are No Industry Averages! Get Over It! · 05/19/2004 11:04 AM, Analysis

The one question I get asked more than any other is “What is the industry average click rate for fill in the blank”? This can be about banners, popups, emails, whatever. Most recently, thanks to some press releases, our clients are asking again “Are we beating the market average?”

It’s a symptom of the “one number to rule them all” approach, whereby it’s really hard for people to balance multiple factors, so they want just a single number to judge whether they are doing well or not.

Bigfoot and Doubleclick, two large emailing companies, exacerbate the situation by releasing their “benchmark” studies. Bigfoot’s is here and recent Doubleclick ones are here.

There are so many problems with these reports, I don’t even know where to begin. But here are some:

So, there are so many “variables” around these numbers that, like the average “2.4 children”, they become useless. Do you really want to measure your business against a meaningless number?

I also hear “Oh, it’s just a guideline” or “it’s just directional”, which is a way of saying “I get that it’s meaningless, but it’s a number, so it must have some meaning by definition”. Sorry, it’s just not so. We can divide your height by the temperature and call it your average IQ, but that’s really all meaningless. Yes, it’s a number, and yes, I call it something… but that doesn’t make it truth.

Direct mail still has the mythical “2%” conversion rate, though no one can yet show me a master citation for it. But direct mail people have figured out over the years that they should break out different forms of communications. The vast majority, of course, is sales mail, but they tend to split out bill enclosures, renewals vs. acquisition subscription mails, etc. And when you do all that, the 2% quote doesn’t really make much sense anymore.

(As an aside, another irksome approach is the “market share” question, which implies that there is a zero-sum game of spend. For some B2B products, that may be so, but in many cases, market share is not the right approach. Some of these brands complain that they only have x% of the market… but for all we know, that x% is stable revenue while the rest of the market is fighting over fickle, non-brand-loyal consumers with high acquisition costs… so I’d rather have the x%, all things considered… but I digress.)

Instead, don’t worry about how others are doing. You have no idea what they did to get that high click rate (perhaps they gave away coupons or margin, perhaps they got great clicks but horrible conversion, etc., etc.). You should be measuring yourself by your own benchmarks.

What does a generic, non-customized mail do for you? What if you include an offer? Fine, you now have a benchmark, including cost and ROI. Now, start manipulating things: Vary the offer. Include different types of messages, some more content focused, some more brand. Vary the frequency. Utilize all that data you’ve been collecting to custom publish. Track the costs for each of these changes, and compare that to the revenue/profit. Are you improving? Good. Are you not? Let’s keep testing.
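The benchmark-yourself loop above can be sketched in a few lines: compute ROI per test cell and compare each cell to your own baseline, not to an industry number. All figures below are invented:

```python
# Sketch of self-benchmarking: measure each test cell's ROI against
# your own baseline cell rather than an "industry average".
# All costs and revenues here are hypothetical.

cells = {  # cell name -> (cost, revenue)
    "baseline":     (1000.0, 1500.0),
    "with_offer":   (1400.0, 2300.0),
    "personalized": (1600.0, 2800.0),
}

def roi(cost, revenue):
    """Simple return on investment: profit per dollar spent."""
    return (revenue - cost) / cost

base = roi(*cells["baseline"])
for name, (cost, revenue) in cells.items():
    lift = roi(cost, revenue) - base
    print(f"{name}: ROI {roi(cost, revenue):.2f} (lift vs baseline {lift:+.2f})")
```

The comparison that matters is the lift column: are your own changes improving your own numbers, cost included?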

Note that I don’t care if you are beating some industry average. Since there is none that anyone can trust, why waste your time comparing to it? We compare the stock market to the Nasdaq or S&P 500 in part because they are well defined indexes. But when it all comes down to it, do you care that your stock beat the index, or that it went up?

This is the same claim I make when optimizing web analysis. Folks like to say, “I heard at a show that my direct competitor has a 10% visitor conversion ratio, so why can’t we have that?” Well, perhaps you can. How is your product return ratio? How is the credit quality of the folks you get? How much are you willing to spend on customized marketing via email? Does that ratio count return visitors, or all visitors? Perhaps that 10% is a great number, and your competitor is willing to accept a high rate of returns, low margins, and many fraudulent credit cards.

Again, you don’t know all the facts around a single number, so trying to beat it is a waste of time. Focus instead on fixing your own house. Learn from the admirable and clever strategies and tactics that your esteemed competitors use… but make sure your own house is in order before worrying about how to clean out theirs.

(Disclaimer, sort of: I work for an email marketing company, though it’s not one of the above. I think we do a pretty good job, but so do the above. If you like lots of service and help with extremely complex campaigns, my company is probably a better bet. If you like more control and have the time to manage your own campaigns, the above companies, and many others, will do a fine job.)


* * *



powered by Textpattern 4.0.4 (r1956)