The Net Takeaway: SPSS and Python


SPSS and Python · 11/21/2006 05:46 PM, Analysis

Much more recent UPDATE (6/19/2008): I’ve finally gotten some Python written. Much more useful info about my experiences (and how to avoid some pain) is at SPSS and Python, Take 2 and SPSS and Python: Passing Parameters to Scripts.

=Back to the original post=======

UPDATED 11/21/2006: After Jon Peck pointed out that everything had moved under DevCentral, I understood why I had so much trouble finding the below stuff. I ignored DevCentral because I am not a developer; I analyze data.

Instead, SPSS has now decided that if you are going to write complex syntax, you are a developer, preferably an expert programmer. Why did they do this? Because they don’t really want more people using SPSS directly; there’s some money there… but lots more if you get an enterprise developer to “embed” or build a system around SPSS; these license fees could multiply their revenue by 2 or 3 times, since they can charge for everyone using the system instead of just those running SPSS on the desktop. So, DevCentral assumes you know a lot about programming vs. knowing a lot about data and stats.

Are the “missing” things there? Yes, a Download Center has a poorly laid out list of things to download, with no sortable columns (how do you see the newest uploads if you can’t sort by date? They saw that too and added a “new items” option to the filter, which doesn’t really solve the problem, but is a fair hack). The listings are complex mixtures of Python and syntax, so that’s interesting to read. But where are syntax samples, SaxBasic samples, etc.? Not there…

Are the forums there? Sure, but now they are called the DevCentral Forums and yes, they focus mostly on developers. There is one interesting forum focusing on how hard it is to make quality graphs, something I’ve pointed out as a huge flaw in SPSS to date. Nothing there around helping SPSS Users…

So, I spoke too soon when I said they killed off things, though it’s not hard to put redirects on old links to point them to the right place, so that was just sloppy on SPSS’s part, and mine.

But the bigger question: Is this really going in the right direction? Personally, I think it’s going too far to woo developers, and not far enough to empower the SPSS user. This is nicely expressed in the bottom of this thread in the new forums. A developer struggles with the requirements to solve a problem with as little Python as necessary, since non-Python folks (i.e., users) will also have to use the code.

So corrections inline, and Gold Star to Jon Peck for pointing me towards the new homes…

UPDATE: 10/16/2006. I went back to look at some of the links below, and I was shocked to see that they’ve moved all the forums with no forwarding address. Why, why does SPSS want to keep any community from growing up around them? Why do they continue to think that if they just pretend to be the “cheaper SAS”, they will get a SAS-like community around them?

It’s disappointing. Every time I think they are going in the right direction, about a year later they find a way to get it wrong. Not only did they retcon the start they made below, they’ve done NOTHING to address any of the other concerns I raise below.

Jeez, they make it hard to like them sometimes.

Anyway, they still have some info at which is where all this stuff wound up.

Original Post, circa 10/10/2005.

SPSS’s programming language (aka SPSS Syntax) has always stunk. Besides the lack of any IDE or debugging features, it has no data structures, no multi-dimensionality, poor looping and branching… it’s kind of a joke (and don’t get me started on that Notepad Lite syntax window). It’s like they decided that SPSS is so powerful, it doesn’t need a real language… for 13 versions / 30 years.

The macro language isn’t much better, and feels like a hack (as does SAS’s, to be fair; both feel like summer intern projects). SaxBasic always feels like an add-on (which it is), but at least it has an IDE and debugger, data structures, and easy access to the outside world via COM objects.

But things are starting to change. v14 is out (actually, many people are now on v15… time flies…) and its features include access (finally) to an outside programming language… and they chose Python.

For those who don’t know, Python is an interesting mix of object-oriented scripting with many “Perlish” features. It uses indents to define loops and other code blocks (which takes some getting used to) and has lots of features, though not as many as Perl and its CPAN library. Python fans would give up their firstborn for the language, so it’s clear that it has something good. As part of my learning, I purchased Beginning Python by Magnus Lie Hetland, and it’s been really helpful. In addition, a project called Jython is attempting to dupe the language (well, a slightly out-of-date version) in Java to run on a JVM (aka, run anywhere).
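For anyone who hasn’t seen the indent thing in action, here is a minimal sketch in plain Python (nothing SPSS-specific). The names and numbers are made up for illustration; the point is only that the indentation itself marks where each loop starts and stops, and that real data structures like dictionaries and lists come for free.

```python
# Hypothetical survey scores; the names and values are made up for illustration.
scores = {"alice": [3, 4, 5], "bob": [2, 2, 4], "carol": [5, 5, 5]}


def summarize(data):
    """Return a dict mapping each name to the mean of its scores."""
    means = {}
    for name, values in data.items():
        # Everything indented under the for line is the loop body.
        total = 0
        for value in values:
            # A deeper indent marks the body of the nested loop.
            total += value
        # Dedenting back one level ends the inner loop.
        means[name] = total / len(values)
    # Dedenting again ends the outer loop; no END keyword or braces needed.
    return means


means = summarize(scores)
for name in sorted(means):
    print(name, means[name])
```

Shift a line left or right by one level and it belongs to a different loop, which is the part that takes some getting used to.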

So, new things to look at:

This implies a couple of things:

So, SPSS now has a chance to change lots of things. Leveraging an open-source language can be a good step towards opening up lots of other aspects of how SPSS processes data. Opening up the system to allow a real scripting language at all is a big first step. But there is always room for improvement:

So, I’m finally learning Python (don’t worry, Judoscript fans, I’ll still fall back to my old favorite) and I’ll blog more about v14 when I get a copy (focusing on things like multiple datasets open at once, etc.).

* * *


  1. So, are you a SPSS guy now? I always thought that you were a SAS bigot…
    andrew    Oct 11, 02:11 PM    #

  2. Python rocks! Let me know if I can help you move toward the light!
    Ned Batchelder    Oct 16, 08:16 PM    #

  3. For Andrew: SAS has its strengths and faults, but its biggest fault is that it ignores growing companies. SPSS is much more willing to work out a fair pricing deal. But yes, I have to say, if I had the money, I would probably run more SAS (and its multithreaded data engine) than SPSS for large data. But SPSS sure is trying to move up… it’s fun to watch, isn’t it?

    And Ned: For those who don’t know, Ned writes code which creates more code: I definitely know who to ask for help when I get going! Thanks!
    Michael Wexler    Oct 16, 08:30 PM    #

  4. Want to see a new stat language that’s different from both SAS and SPSS?
    Check out Vilno, the new statistical programming language at :

    Robert    Nov 1, 12:11 AM    #

  5. Please take another look. SPSS has NOT killed off the forums. In fact, the forums are more active and better organized now.

    Just follow the DevCentral Forums link from the main page.

    Also, there are now eight pages of listings of downloadable stuff from SPSS and user contributions.

    These offer statistical routines, data management, and other useful stuff.

    Jon Peck    Nov 21, 05:01 PM    #

  6. I see why I had trouble. Everything has been moved under “DevCentral”. I’ve updated the article to reflect my thoughts on this not totally helpful change.

    Michael Wexler    Nov 21, 05:31 PM    #
