The Net Takeaway: Page 8


Danny Flamberg's Blog
Danny has been marketing for a while, and his articles and work reflect great understanding of data driven marketing.

Eric Peterson the Demystifier
Eric gets metrics, analytics, interactive, and the real world. His advice is worth taking...

Geeking with Greg
Greg Linden created Amazon's recommendation system, so imagine what he can write about...

Ned Batchelder's Blog
Ned just finds and writes interesting things. I don't know how he does it.

R at LoyaltyMatrix
Jim Porzak tells of his real-life use of R for marketing analysis.







The SPSS Book you need to have... · 03/19/2008 01:46 PM, Analysis

And it's free.

UPDATE 03/19/2008:
Things got moved around a bit. The “description” page is now here at Programming and Data Management for SPSS 16.0: A Guide for SPSS and SAS® Users and the direct link to the PDF is here:

The new version features some additional help around the Python interface, as well as more on how to link to R. It also includes more command magic for more powerful fixing of output issues.

Original Post:

SPSS Programming and Data Management, 4th Edition
A Guide for SPSS and SAS® Users
by Raynald Levesque and SPSS Inc.

Now, you guys know my complaints for how SPSS treats its users (with disdain in the best of times). They continue to ignore many of the needs of power users, for years. And then, almost by accident, good stuff comes out.

You should recognize the name Raynald Levesque from the Raynald’s SPSS Tools site, which is your go-to starting point for all tricks with syntax, scripting, etc. Well, he’s taken lots of that knowledge and, together with some folks at SPSS, has created a fantastic guide to actually getting stuff done with SPSS. It’s also a great “translation” for SAS programmers who are transitioning to SPSS (I don’t think that’s an overwhelming number, but they’re out there) or for SAS solutions that you want to duplicate in SPSS. Finally, it includes some useful chapters on Python scripting, which is slowly replacing the “Sax Basic” scripting that previous versions emphasized.

(BTW, Sax Basic is no longer sold by Sax (though WinWrap is still doing a VB-alike scripting drop in). I suspect that is part of the reason for the push away from Basic: when your vendor drops support, you need to look for other options.)

The fact that this is a free download is amazing, and one of the shining stars from SPSS. They do sell a print version if you are not a screen reader or don’t want to print all 520+ pages.

So, with folks like Raynald and Jon Peck, there continues to be hope that SPSS will start to recognize the needs of its power users and do some of the things we’ve asked for over all these years. (BTW, look for my “Things SPSS is STILL missing” post soon in case you need some workarounds).

Download the book. Thank Raynald and visit his site. And keep pushing SPSS to expand its support of the power users.


* * *


MS Experimentation team makes another winner · 03/10/2008 12:32 AM, Analysis

The Microsoft team creating their own online experimentation platform continue to impress me with their willingness to share both their learning and their expertise.

It’s one thing to pump out “best practices” which are watered down samples of simple a/b findings. These guys not only talk about the complexities of multivariate testing, but give concrete examples and the statistics behind them. Offermatica, Optimost, even Memetrics had good stats behind their systems, but they rarely talked about the meaty stuff. The MS team has really been a great resource to help those past the simple A/B.

You don’t need to be a super stats-head to get the basic issues here, but if you don’t even take the time to understand the basics, you will waste lots of time and money. Well worth the read.

As seen in the ever fabulous Web Analytics Discussion Group hosted on Yahoo Groups, but moderated by the Web Analytics Association, and originally founded by and still moderated by the inimitable Eric Peterson.

BTW, if you aren’t a member of this group and you enjoy my blog, you most definitely need to subscribe to this group.


* * *


Fedora 8 in Virtual PC · 03/09/2008 08:20 PM, Tech

Following on from the Ubuntu issues, here is a nice post on how to get Fedora 8 running in Virtual PC. You’ll find that you need most of the same tricks you tried with Ubuntu… but Fedora makes them harder.

Fedora 8 (werewolf) on Virtual PC 2007 at the Sean Blog

For those looking for my Ubuntu links, see
Ubuntu 7.10 and No Mouse on Virtual PC
Ubuntu 7.04 and No Mouse on Virtual PC


* * *


Where tech goes to die... heaven. · 02/20/2008 11:27 PM, Tech Personal

I recently saw someone mention the Weird Stuff Warehouse in California. On first click (go ahead, try it), it just looks like the usual tech parts store.

But then I saw some pics of the store itself... and it looks like it had some, well, unusual things. Not necessarily Weird (shrunken heads), but definitely not your usual store. (More pics below via the Flickr links if you can’t wait to see more.)

And I saw where it was: just down the street from Yahoo! in Sunnyvale. So, yesterday, I took a few minutes and drove over there.

My jaw dropped when I went in. Every piece of technology detritus, from every type of tape cartridge to the defunct networking dongle, it’s all there.

It’s in the Silly Valley, so it has very little personal computer stuff (no Atari or Commodore 64 stuff)... but it has floor to ceiling racks of telco switches, of serial cables, of various size floppy drives, of PC software from the 80s, Irix software from SGI, early Sun gear. Stacks of old lap-bricks, piles of typewriters and adding machines. Boxes of odd fasteners and screws, of PCI graphics cards, of modems and oscilloscopes, of hard drives of every size and interface. Lined up like squat skyscrapers were printers, from laser to dot matrix, along with ribbons and cartridges and toner from every era, right by the fax machines, and endcapped by the microfiche machine.

Signs were all markers and tape, “as is” and “$1 ea.”. A test bench was set up with some power sockets and some basic tools to test your discovery.

I found myself laughing over and over again as I walked between the shelves. It was like Costco or Sam’s Club… but full of all those pieces of tech you see in the IT closet at a company which has been around a while, or in the box you find by the dumpster when an old office building is being torn down. Sure, some of that laughter was the “Who the hell is going to buy this piece of junk?”... but a lot of it was just sheer delight at the scale and scope of this collection, all for sale, all reflecting the various waves of technology which, over the last 30 years, have swept the Silicon Valley and the world.

I have to admit, I know a lot of old gear… and I was stumped by some of what I saw.

But lots of people knew exactly what they wanted. They were carrying out old cables, old fasteners, SCSI connectors, old serial cards, a broken LCD screen. People were buying small screws and old network gear. One guy was looking at the old software for an old version of PC DOS.

Look, this wasn’t some boutique with fancy watches or Vertu phones, nor was it the wonderland of new gear which makes up Fry’s Electronics. And you had to be pretty nerdy to enjoy this place, full of dusty old computer gear. And yes, it all looked forlorn and abandoned: none of this gear was going to be seen as “antique” or collectible (though some of the SGI and Sun gear was much cooler than I expected).

But think of the first time you played wiffle ball, or the first time you went to a candy store. Hearing those sounds, smelling those smells, seeing logos and symbols from a bygone age reflecting the memes of the time, they all evoke that feeling of when these things were new to you… the joy of discovery, of that first time. Like Proust’s madeleine, the sights of this old gear reminded me of when I first started to play with this stuff, when I first grabbed time on the school’s minicomputer or when I first hung out at the local “business technology” store (ComputerLand) by riding my bike.

So, if you love old tech and are in the SF or Silicon Valley area, highly recommended. Check out the photos linked below, and feel free to let me know in the comments if there are other places like this around the US.

(All pictures from Flickr are owned by their authors, used without permission. I hope they don’t get mad.)


* * *


Still at Y! · 02/15/2008 06:26 PM, Personal Trivial

Just in case you saw some news about layoffs, including the departure of some really great people… well, they didn’t ask me to leave, and I’m going to ride out the storm as best I can. So, still here at Y!, and there are some amazing things we are working on.

If you love data like I love data, you really do want to keep your eye on Yahoo!, and not get distracted by the news (or news corp) rumours you keep seeing.

Sure, there are problems here. There are problems everywhere. If you haven’t noticed, the US is approaching a recession, even Google’s growth is slowing, and we are hearing the biggest credit crunch since the invention of Rice Krispies. So, don’t assume that Yahoo! is completely dysfunctional just because Valleywag says so. Or AlleyInsider. Or stock analysts. Or… well, anyway, better stop there.


* * *


WASP · 02/06/2008 04:21 PM, Analysis

If you aren’t using it yet, you should be.

WASP, the Web Analytics Solution Profiler.

This is one of those “why didn’t someone make it earlier” tools for investigating web sites. Basically, it tells you what web analytic pixel trackers are on each page you visit. It has a huge list of trackers it already detects (68 of them!), and the author is great about adding others. It breaks out the cookies, query strings, the works!

Oh, and it’s free!

It’s a Firefox add-in, and highly recommended. Every web analyst should have this one installed.


* * *


Comments closed for the moment · 02/06/2008 12:42 PM,

Sorry, guys. Got a comment spam storm going on. I’ll either add a blocker or some other trick in a bit, and re-enable them soon.

Thanks for your patience,


Update: well, that was quick. 3 days later, things seem to be back to normal. Comments are back on. Critique and snark away!

* * *


Yawn. Superbowl 2008 Ads. · 02/04/2008 12:05 PM, Marketing

Another boring football game. And this time, the ads were terrible.

See them all at Yahoo Video or AOL Sports.

Some observations on just how bad they were:

Sobe: lizards dance better than Naomi Campbell. That’s scary.

The car ads all sucked except for the Bridgestone night driving ad (BTW, the daytime one joined the sucky crowd with the ripped off “screaming squirrel”). Why? Story. Audi tried, but mafia/godfather jokes are just old. The Sopranos are over. Just showing the car with a voice-over is a waste of millions of dollars, Cadillac. Toyota’s badgers was a commercial by committee, but at least they tried something.

Careerbuilder: Strange. Um. Not much to say. Better than Monster’s monkeys from last year, but just unsettling enough to be buzzkills. Congrats, Careerbuilder: you managed to spoil parties around the country.

Beer ads: I expected better. Wow. A “Rocky” rip off with Clydesdales. A caveman set of jokes so old that I remember making the same ones in 3rd grade. Funny, when Fedex did the caveman jokes, they found new ones. Geico… well, let’s just leave it at that. Why did Bud (or Bud Light or whatever) just go to town with the dumb one?

Coke Balloon. So dumb. The balloons never changed emotion. That could have been a really clever ad, but it was just shot all wrong.

Doritos user ad: Mouse trap. Weak. Last year’s was OK, with the American Pie-like guy having an American Pie like experience. But this one? Go back to the professionals.

Tide: Welcome to the game. Weak ad, but not bad for a first try. After all, it’s only been 40 years or so of mass market audiences that you’ve missed out on.

What can you say about an ad that keeps you watching, but when you get to the end, you feel that you were totally ripped off? Is that a good ad? No, Pepsi with Justin Timberlake, it isn’t.

And Pepsi, why didn’t you run the deaf ad that got you so much press? Sure, it’s not a great joke, but at least it shows how a punch line works:

Garmin Napoleon: Plastic Bertrand soundtrack aside, terrible ad. And I love your products. Now is your time to shine: GPS is hot, and your products are all becoming indistinguishable. This is the best you can do?

Fedex Carrier Pigeons could have been better… but compared to the rest, it was great. No cavemen, but still.

GoDaddy’s “call to action”: Terrible ad, and when you get to the site, the “pulled” ad is even worse. But at least, more than any other ad, it had the guts to make a clear and obvious call to action: go to our site. Simple and trackable.

Funny enough, the best ads were the movies and tie ins (Will Ferrell and Bud Light). No others made me laugh.

So, yawn. Millions of dollars down the drain. Is this really the best the collected minds of marketing’s most expensive creative have to offer?


* * *


MSFT makes offer for YHOO · 02/01/2008 08:13 PM, Trivial

So… the rumours are true. What to say?

I have no inside information, nor do I hang out with the board or the senior execs. This is just my little ol’ opinion.

Going back to Cali. Cali. Cali. Courtesy LL Cool J, Yahoo Lyrics.
I used to work at MSN, doing similar work to what I do now. I had an office (that was nice) and worked with a great crew there. It would be weird going back (though I really wouldn’t be going back; I would stay in NY).

MS and Y! have very different ways of doing things. The obvious stuff, like MS reliance on MS software vs. Y! reliance on open source is one major piece. Y! likes to believe that they are a part of the Internet culture, while MS likes to believe they own the software powering the Internet. I actually didn’t mind some of the ways MS approached the world: they had a very Googlish overconfidence that served them well for a while. But I can see how hard the integration with Y! culture would be: some very smart people would leave no matter how much money MS threw at them. And don’t take this as a hint as to what I might do. Or is it?

It’s never enough, never enough Courtesy The Cure, Yahoo Lyrics
It’s actually a crummy offer. 6 months ago, this would have been laughed off. But unfortunately, given where the stock is now (18 and change the day before the offer), suddenly, this is a “great deal”. Too bad. Y! is really worth more. If you could see what I see, from the people to the technology, from the willingness to keep fighting when everyone says you are a has-been to the recognition that Y! is the first internet company many people ever trusted and that they actually love your organization… It’s all pretty impressive. Sure, Y! has its flaws, and sure, we have had some dirty laundry spread across the pavement… but there’s still a lot to like.

High School never ends Courtesy Bowling for Soup, Yahoo Lyrics. This is just the start. Everyone from Google, eBay, AT&T, Comcast, Time Warner, NBC, etc. all the way down to conglomerates in Abu Dhabi or VC/Hedge firms could all make an offer. This was an unsolicited bid, so others can make them too. And this will be a long process… even if Jerry and the board said “Yes” today, MS said it would be 2nd half before the deal closed. As this stuff keeps getting dragged out, so will the eventual end.

This process could actually get very painful, akin to slowly pulling off a band-aid. I don’t think Yang wants to sell. I do think the board does. I wonder if the senior leaders are soliciting other offers, to either outbid MS, or to at least get the dollars up. It could drag on for a while, and there will be hurt feelings in the end.

I’m alright, Nobody worry ‘bout me Courtesy Kenny Loggins, some crummy lyrics site
I’m not worried. I don’t think I’ll be affected by this round of layoffs, the stock got an un-asked for boost which won’t last long, and I think MS will want to have me back. After all, they are spending 44.6 billion dollars just to re-hire me.

I’ll update as more stuff happens… just another exciting day on the Internet.


* * *


Full Outer Joins in SPSS · 01/31/2008 12:27 PM, Analysis

In SQL, there is the concept of linking tables by shared keys. For the most part, think of the classic case of a customer ID linking customer address to purchase data. In SQL, it’s very easy to create these links as you select, and even though they can have unexpected side effects (either expanding or filtering what rows you get back), it’s one of the foundations of “Relational” databases and a very powerful tool.

I’m going to talk about what these joins are (using Oracle and SQL for examples), then show how you get this same helpful effect in SPSS (way at the bottom). If you know what these are, or don’t have patience, jump down there… but you’ll miss links to Oracle v8 and v9+ tricks, as well as a link to how to do full outer joins in MS Access, so skip at your own risk! Ok, back to our main story.

As part of the “Relational Algebra” (or “Relational Calculus”, depending on how advanced you are), there are different types of joins. The basic one is key to key, keeping only the matches. This is called an “equijoin”.

(I like old style SQL, so I don’t use ANSI joins. Sorry)

Select t1.customer, t2.address
from customerlist t1, addresslist t2
where t1.customerID=t2.customerID

This returns only customers who have given their address, leaving out customers with no address.

Now, let’s say you want all the customers, and you don’t mind if there are Nulls in the address column for those who haven’t given them. This is a Left Outer Join (left is arbitrary, think of the long customer list as the “left side”). Oracle v8 had a weird way of doing this; v9 made it (sort of) cleaner:

v8: The (+) goes on the side that may have no match (the address table here), so all the customer IDs are kept.

Select t1.customer, t2.address
from customerlist t1, addresslist t2
where t1.customerID=t2.customerID (+) 

v9: ANSI new syntax (doesn’t like joins in the Where clause)

Select t1.customer, t2.address
from customerlist t1 left outer join addresslist t2
on t1.customerID=t2.customerID
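
To see the difference between the equijoin and the left outer join on actual rows, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The table and column names come from the queries above; the three customers and three addresses are made-up sample data, and SQLite only understands the ANSI form of the outer join, not the Oracle (+) operator.

```python
import sqlite3

# In-memory database with hypothetical sample rows (not from the original post).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerlist (customerID INTEGER, customer TEXT);
    CREATE TABLE addresslist  (customerID INTEGER, address  TEXT);
    INSERT INTO customerlist VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO addresslist  VALUES (1, '1 Main St'), (3, '9 Elm Rd'), (4, '5 Oak Ave');
""")

# Equijoin: only customers who have a matching address come back.
inner_rows = conn.execute("""
    SELECT t1.customer, t2.address
    FROM customerlist t1, addresslist t2
    WHERE t1.customerID = t2.customerID
    ORDER BY t1.customerID
""").fetchall()

# Left outer join: every customer comes back; a missing address is NULL (None).
left_rows = conn.execute("""
    SELECT t1.customer, t2.address
    FROM customerlist t1 LEFT OUTER JOIN addresslist t2
      ON t1.customerID = t2.customerID
    ORDER BY t1.customerID
""").fetchall()

print(inner_rows)  # Bob is missing: he has no address
print(left_rows)   # Bob is present, with None in the address column
```

Note that the address with customerID 4 shows up in neither result, which is the gap the full outer join is meant to close.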

Now, what if you wanted a list of customers who had addresses, customers without addresses, and addresses without customers? Why this last? You may want to do an append via some 3rd party to find those names and have a completed list, so you need everything from both tables, but joined where appropriate.

This is called a Full Outer Join. In Oracle up to v8, you couldn’t do this directly and had to use syntax trickery. Full Outer Joins in Oracle9i shows both the workarounds in v8, and the new syntax in v9 and above.

The workaround involves two outer join queries combined by a UNION operator. Basically, you say “give me customers with and without addresses (match ‘em where I’ve got ‘em) on one hand, give me addresses with and without customers (match ‘em where I’ve got ‘em) on the other, and then plop them on top of each other to make one long list”. The extra trick is the UNION instead of the UNION ALL. Why? Well, your match ‘ems will be in both joins, so you will have dupes unless you let SQL take care of them with the UNION.
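
The UNION workaround can be tried out with a small sketch, again using Python's sqlite3 module as a stand-in for Oracle v8 and made-up sample rows. It builds the two outer joins, glues them together with either UNION or UNION ALL, and counts the rows so the duplicate problem is visible.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the queries in the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customerlist (customerID INTEGER, customer TEXT);
    CREATE TABLE addresslist  (customerID INTEGER, address  TEXT);
    INSERT INTO customerlist VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO addresslist  VALUES (1, '1 Main St'), (3, '9 Elm Rd'), (4, '5 Oak Ave');
""")

# Customers with and without addresses.
customers_side = """
    SELECT t1.customer, t2.address
    FROM customerlist t1 LEFT OUTER JOIN addresslist t2
      ON t1.customerID = t2.customerID
"""

# Addresses with and without customers (same join, sides swapped).
addresses_side = """
    SELECT t1.customer, t2.address
    FROM addresslist t2 LEFT OUTER JOIN customerlist t1
      ON t1.customerID = t2.customerID
"""

def stacked(combiner):
    """Stack the two outer joins using UNION or UNION ALL."""
    return conn.execute(f"{customers_side} {combiner} {addresses_side}").fetchall()

union_rows = stacked("UNION")          # matched pairs appear once
union_all_rows = stacked("UNION ALL")  # matched pairs appear twice

print(len(union_rows), len(union_all_rows))
```

With this sample data the matched customers (Ann and Cy) land in both halves, so UNION ALL returns two extra rows that UNION removes.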

MS Access also needs a simulated full outer join which is described here as two UNION ALLs (could probably be implemented as 1 union with some editing).

But how about SPSS? Unlike SAS, SPSS has no “Proc SQL”, so there is no direct way to write the query. Instead, we have to fake it with Match Files. BTW, this is a topic worth a Knowledge Base article, but the SPSS Knowledge Base has only one (1!) article on “outer join”, which doesn’t really even address the issue: Can SPSS perform an anti-join merge of 2 or more files as in Clementine? Searching on Merge is more helpful, but only slightly. Here is one on “Cartesian Join”, which is really just every row to every row (not really a join at all, but technically correct): Cartesian File Merge

Here’s how you do it.

Here’s an example problem: I have a list of IDs, and I want to know who did something on 3 different months, say visited a web site. There are 2 ways this problem can be presented:

1) Open each file, add a flag column (MayFlag, JuneFlag, whatever). If this is based on the filename, you can even script it. Sort it by the ID before saving.
2) Then, take your master list, Match Files one after another.

or, the harder one…

At the end, we want a list of IDs who visited at least once (ATLO) and a flag for each month they visited. In this version, we won’t know anything about those who didn’t visit. Because I want all the IDs, I have to do full outer joins for each new file or month I include.

This syntax will do the first 2 files for us. Just keep repeating for each additional file.

* Create fake data.
DATA LIST LIST /id(A8) JuneFlag(F1).
BEGIN DATA
One 1
Five 1
Six 1
Seven 1
END DATA.

* MATCH FILES needs every file sorted by the key.
SORT CASES BY id.
SAVE OUTFILE='C:\Junk\test2.sav'   /COMPRESSED.

DATA LIST LIST /id(A8) MayFlag(F1).
BEGIN DATA
One 1
Two 1
Three 1
Four 1
END DATA.

SORT CASES BY id.
SAVE OUTFILE='C:\Junk\test1.sav'   /COMPRESSED.

* Ok, just like we talked about.
* Do 2 left outer joins, then the "Union" which is really
*    just another match files and some dedupe syntax.
* BTW, this assumes that your visit files are already aggregated so
* each ID is once per file; if you have more than one, aggregate
* the files first before the joins.

* 1st Left Outer Join.
* I have "test1" open already, so it's my "Left" in the Left outer Join.
* Test2, my key file, will "lose" some of its records in this join.
MATCH FILES /FILE=*
 /TABLE='C:\Junk\test2.sav'
 /by id.

SAVE OUTFILE='C:\Junk\step1.sav'   /COMPRESSED.

* 2nd Left Outer Join: Now, take 2nd table, and make it the left side, and repeat...
GET File='C:\Junk\test2.sav'.

MATCH FILES /FILE=*
 /TABLE='C:\Junk\test1.sav'
 /by id.

SAVE OUTFILE='C:\Junk\step2.sav'   /COMPRESSED.

* But wait... ID One is in both sets!  Horrors!.
* Have to "Union" them.
* Merge the two files (add cases), and then dedupe.
* I still have "step2.sav" open, so I'll just add step1 back in...
ADD FILES /FILE=*
 /FILE='C:\Junk\step1.sav'.
EXECUTE.

* not needed, but always good to save your work!.
SAVE OUTFILE='C:\Junk\step3.sav'   /COMPRESSED.

* Identify Duplicate Cases; standard SPSS generated syntax.
SORT CASES BY id(A).
MATCH FILES /FILE=*
  /BY id
  /LAST=PrimaryLast.
VARIABLE LABELS  PrimaryLast 'Indicator of each last matching case as Primary'.
VALUE LABELS  PrimaryLast 0 'Duplicate Case' 1 'Primary Case'.
EXECUTE.

Select If PrimaryLast=1.
EXECUTE.

* Done.

Wasn’t that fun?
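
For anyone who wants to sanity-check the logic outside SPSS, the same sequence (two left outer joins, stack the results, keep one row per ID) can be sketched in plain Python. This is only an illustration of the steps described in the syntax comments, not a replacement for them: the two dicts stand in for test1.sav and test2.sav, and the helper name left_outer is made up for this example.

```python
# Hypothetical stand-ins for the two saved files: id -> flag.
may = {"One": 1, "Two": 1, "Three": 1, "Four": 1}
june = {"One": 1, "Five": 1, "Six": 1, "Seven": 1}

def left_outer(left, left_name, right, right_name):
    """Keep every id in `left`; look up its flag in `right` (None if absent)."""
    return [
        {"id": key, left_name: flag, right_name: right.get(key)}
        for key, flag in left.items()
    ]

# 1st left outer join: May on the left, June as the lookup table.
step1 = left_outer(may, "MayFlag", june, "JuneFlag")

# 2nd left outer join: June on the left, May as the lookup table.
step2 = left_outer(june, "JuneFlag", may, "MayFlag")

# "Union": stack both results (add cases), sort by id, then dedupe by
# keeping the last row seen for each id.
step3 = step2 + step1
deduped = {}
for row in sorted(step3, key=lambda r: r["id"]):
    deduped[row["id"]] = row  # a later duplicate overwrites an earlier one

result = list(deduped.values())
for row in result:
    print(row)
```

ID One appears in both joins and survives only once, with both flags set; every other ID keeps one flag and gets None (the stand-in for a system-missing value) for the other month.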

Sure would be nice if SPSS added a SQL type access to our files to allow us to treat it like a database. It’s easy to dislike SQL for many things, but when it works, it’s like magic. SPSS is making us jump through lots of hoops, including multiple time-consuming sorts, to get the same result. I hope that, as SPSS adds more multi-threading, they include a multi-threaded SORT in the future.

BTW, to turn off logging?

SET Printback=Off TFit=Both TLook=None.


* * *

