The Net Takeaway: Page 6

OTHER PLACES OF INTEREST

Danny Flamberg's Blog
Danny has been marketing for a while, and his articles and work reflect great understanding of data driven marketing.

Eric Peterson the Demystifier
Eric gets metrics, analytics, interactive, and the real world. His advice is worth taking...

Geeking with Greg
Greg Linden created Amazon's recommendation system, so imagine what he can write about...

Ned Batchelder's Blog
Ned just finds and writes interesting things. I don't know how he does it.

R at LoyaltyMatrix
Jim Porzak tells of his real-life use of R for marketing analysis.

 


 

 

 

Panorama and Google Apps · 07/10/2008 04:00 PM, Analysis

Panorama is a great BI tool which is most associated with Microsoft, as they developed the version of OLAP which is now built into SQL Server. They market a suite called the “NovaView” toolset, which includes the usual reporting, dashboard, etc. stuff, but they’ve been around long enough to forge their own approach.

They’ve continued to create BI and visualization tools, and they’ve now created an iGoogle Widget which brings some basic BI (Pivot Tables and Pivot Graphs) to the Google Spreadsheet.

http://www.panorama.com/google/index.html

This is a clever idea to provide your services to the tail market. It also reminds us that, besides the constant upgrade of features that Google builds into the apps, others are also working to create the future of the “office” online experience.

PS:
While you are there, check out their SaaS OLAP engine, PowerApps, which is the back end of the Google widget, as well as their link for integrating Google with your in-house SQL Server OLAP/Analysis Services system.

PPS: If you are really interested in more clever ways to use data visualization in Google Apps, see these:

List of gadgets:
http://code.google.com/apis/visualization/documentation/gadgetgallery.html

Quick summary of Gadgets
http://googlesystem.blogspot.com/2008/03/google-spreadsheets-adds-gadgets.html

Links to some Data Visualization gadgets.
http://googlesystem.blogspot.com/2008/02/data-visualization-google-gadgets.html

Links to Visualization Gallery
http://code.google.com/apis/visualization/documentation/gallery.html

Comments?

* * *

 

SPSS and Python: Passing Parameters to Scripts · 06/24/2008 11:48 AM, Analysis

UPDATE 6/24/2008: An anonymous source wrote in to let me know that v17 should provide a way to pass simple parameters ala the old way with the Script command.

Meanwhile, the unstoppable Jon Peck of SPSS saw my post and suggested an alternative that is more complicated, but demonstrates some powerful stuff. It also allows for the passing of more sophisticated objects, in case you need to pass a list or something. The nice thing about this approach is that it doesn’t need to read or write files or any other platform dependency, so it can work on all systems.

It has a few pieces, and I’ll show you how they work. Basically, we use an SPSS program to set up the parameters we want to pass, then we call a “helper” program to pass them. The real script we want to run is run, in effect, by the helper. You don’t need to worry about any of that if you don’t want to; just use my sample and you can ignore the details. If you want to know more about how Python works, however, dig into Jon’s script.

The first piece is scriptwparams.py, which should also soon show up on the SPSS DevCentral Downloads area (annoying free login required). I give the code below, but treat it as a black box for the moment. You would put it in your favorite Script hangout so it can be “imported” in a Python program.

So, here’s a hypothetical SPSS syntax file. In it, we run a quick freq on a sample dataset, we import this handy “utility script” for passing params I mentioned earlier, then we tell it which script to run and which parameters to pass (here, a Python dictionary).

Finally, we tell it to run the script by calling a function in the “helper” program, scriptwparams.runscript().

The result of all this is to put the parameters in a “shared memory” area, and then call the real script.


get file="c:/spss16/samples/cars.sav".

FREQ origin year.
begin program.
import spss, spssaux
import scriptwparams
scripttorun = "c:/python25/lib/site-packages/misc/tests/parampasserscript.py"
parms = {"outlinetitle":"Frequency Table", "replacement": "Table of Frequencies"}
scriptwparams.runscript(scripttorun, params=parms)
end program.

Ok, so what about the receiver? Remember, the whole point was to pass some parameters TO our script. Well, our script, using the same “helper file”, can now read from that shared memory area and then do what we need by parsing the parameters.

In this case, our simple little “parampasserscript.py” reads the parameters, starts reading the output tree, and if the outline title matches what we passed in, it changes it to the replacement we passed in (see the SPSS syntax above). Simple, but the point is that we passed in 2 pieces of info: what to search for, and what to replace with.

parampasserscript.py


#script with parameters
#rename any output item whose description matches the 'outlinetitle' parameter to the 'replacement' parameter
import SpssClient, scriptwparams
SpssClient.StartClient()
parms = scriptwparams.getscriptparams()
doc = SpssClient.GetDesignatedOutputDoc()
items = doc.GetOutputItems()
for index in range(items.Size()):
    item = items.GetItemAt(index)
    if item.GetDescription() == parms['outlinetitle']:
        item.SetDescription(parms['replacement'])
SpssClient.StopClient()

Here is Jon’s “helper” script. As I mentioned, I suspect it will wind up on DevCentral, but if you can’t wait, here it is.

scriptwparams.py


#post parameters for script
import mmap, pickle, os, tempfile, sys
import spss, spssaux
def runscript(scriptname, params={}):
    fn = tempfile.gettempdir() + os.sep + "__SCRIPT__"
    f = file(fn, "w+")
    shmem = mmap.mmap(f.fileno(), 4096, access=mmap.ACCESS_WRITE)
    shmem.write(pickle.dumps(params))
    f.close()
    spss.Submit("SCRIPT " + spssaux._smartquote(scriptname))
    try:
        os.remove(fn)  # remove by file name; passing the file object f always raised and was swallowed
    except:
        pass
def getscriptparams():
    fn = tempfile.gettempdir() + os.sep + "__SCRIPT__"
    try:
        f = file(fn, "r")
        shmem = mmap.mmap(f.fileno(), 4096, access=mmap.ACCESS_READ)
        ps = shmem.read(4096)
        try:
            f.close()
            os.remove(fn)
        except:
            pass
    except:
        return {}
    if ord(ps[0]) == 0:
        return {}
    d =pickle.loads(ps)
    shmem.close()
    return d
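
If you want to poke at the idea outside of SPSS, here is a minimal standalone sketch of the same round trip: pickle a dictionary into a fixed-size, memory-mapped temp file, then read it back and clean up. It is not Jon's code. It is written for a current Python 3 rather than the Python 2.5 that ships with v16, the names post_params and read_params are made up for illustration, and it pre-sizes the file before mapping it (an assumption I added so the mapping works on any OS).

```python
import mmap
import os
import pickle
import tempfile

SIZE = 4096  # same fixed buffer size the helper script uses


def post_params(path, params):
    """Pickle params into a fixed-size, memory-mapped file at path."""
    payload = pickle.dumps(params)
    if len(payload) > SIZE:
        raise ValueError("parameters do not fit in the buffer")
    # Pre-size the file with zero bytes so it can be mapped at full length.
    with open(path, "wb") as f:
        f.write(b"\0" * SIZE)
    with open(path, "r+b") as f:
        shmem = mmap.mmap(f.fileno(), SIZE, access=mmap.ACCESS_WRITE)
        try:
            shmem.write(payload)
            shmem.flush()
        finally:
            shmem.close()


def read_params(path):
    """Read the parameters back, delete the file, return {} if nothing is there."""
    try:
        with open(path, "rb") as f:
            shmem = mmap.mmap(f.fileno(), SIZE, access=mmap.ACCESS_READ)
            try:
                data = shmem.read(SIZE)
            finally:
                shmem.close()
    except (OSError, ValueError):
        return {}
    try:
        os.remove(path)
    except OSError:
        pass
    if data[:1] == b"\0":  # buffer was never written to
        return {}
    # pickle stops at the end of the pickled object, so the zero padding is ignored.
    return pickle.loads(data)
```

The sender calls post_params() just before launching the script, and the script calls read_params() as its first step. Only unpickle data you wrote yourself; pickle will happily execute whatever a hostile file tells it to.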

Ok, so now, this post shows 2 ways to pass parameters in v16 with a bit of work (this “shared memory” and the simpler temp file creation) along with 2 ways that don’t work but are worth playing with to better understand what’s happening when SPSS calls out to Python.

======Original Post ======

As you may recall, with old Sax Basic, you could do things like:
SCRIPT file="c:\my documents\ChangeLabelTitle.sbs" ("Freq of Men only").
And the script could look at strParam = objSpssApp.ScriptParameter(0) to see it.

But in v16, the Python plugin for SPSSClient Scripts doesn’t seem to pass parameters, which sucks. This is for v16.0.2, perhaps it will get fixed in a later version.

So, using my “change the Label and Title and Header” script, I tried to pass in what I wanted the changed label to be.

I tried 3 workarounds. 1 finally worked, 2 didn’t, but I put them all here to save you the pain of wasting your time with these “what ifs”.

1) Run it directly. I tried just calling the Python script directly.
This ran and printed out the debug print statements I included, but it spawned a whole new SPSS unit, so the viewer tree didn’t get touched.


BEGIN PROGRAM PYTHON.
import os, spss
psout = os.popen("C:/Python25/Lib/site-packages/spss160/spss/changetitle.py")
results = psout.read().split('\n')
for line in results:
   print line
END PROGRAM.

2) Ok, fine. Can we make an “environment variable” and pass that down to the spawned process?

Python has things like os.environ to play with the environment, but you guessed it, it doesn’t get carried over to other processes (same for getenv and putenv, for you picky ones). If I really wanted it, I could put it in the registry, but that seems overdoing it.

So, the below ran, but my test program could never see the “changelabel” environment variable.


BEGIN PROGRAM PYTHON.
import os, spss
os.environ['CHANGELABEL'] = 'python env label test'
END PROGRAM.

SCRIPT file="C:\Python25\Lib\site-packages\spss160\spss\test.py" ("Freq of Men only").
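
For what it's worth, here is a small sketch of what os.environ actually does, run outside of SPSS on a current Python 3. A change made in one process is inherited by child processes that this same process starts afterwards; it never travels up to the parent or sideways to processes that are already running. My guess (an assumption, not something the SPSS docs confirm) is that the script launched by the SCRIPT command is not a child of the BEGIN PROGRAM interpreter, which would explain why it never saw the variable. The helper name child_sees is made up for the demo.

```python
import os
import subprocess
import sys


def child_sees(name):
    """Start a child Python process and report what it finds for this variable."""
    code = "import os; print(os.environ.get(%r, 'not set'))" % name
    result = subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )
    return result.stdout.strip()


# Set the variable in this process only...
os.environ["CHANGELABEL"] = "python env label test"

# ...and a child started from here inherits it.
inherited = child_sees("CHANGELABEL")
```

So the variable is visible to anything this interpreter spawns, and to nothing else.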

3) Sigh. Make a file.

The oldest way, make a file and put what you need into it. SPSS even has an environment variable which points to its temp file, so at least I can hide the file and not litter.

The below code does work, and is the current preferred solution. You’ll need both parts for each use, the file writer and the python script.


BEGIN PROGRAM PYTHON.
import os
tmpfile = open(os.environ['SPSSTMPDIR'] + 'changelabeltemp.txt', 'w')
tmpfile.write('test another new label for python')
tmpfile.flush()
tmpfile.close()
END PROGRAM.

SCRIPT file="C:\Python25\Lib\site-packages\spss160\spss\changetitle.py".

then, in external script:


import SpssClient, os
SpssClient.StartClient()
# First, open the file to find out what to change...
thelabel=file(os.environ['SPSSTMPDIR'] + 'changelabeltemp.txt').read()
print thelabel
SpssClient.LogToViewer(thelabel)
# want to change Title, then Heading
objOutputDoc = SpssClient.GetDesignatedOutputDoc()
objOutputItems = objOutputDoc.GetOutputItems()
for index in range(objOutputItems.Size()):
    objOutputItem = objOutputItems.GetItemAt(index)
    if objOutputItem.GetType() == SpssClient.OutputItemType.TITLE:
        # Fix the Title first
        objTitleItem = objOutputItem.GetSpecificType()
        SpssClient.LogToViewer(objTitleItem.GetTextContents())
        objTitleItem.SetTextContents(thelabel)
        objOutputItem.SetDescription(thelabel)
        index = index-1 # Back one for the header...
        objOutputItem = objOutputItems.GetItemAt(index)
        objHeaderItem = objOutputItem.GetSpecificType()
        objOutputItem.SetDescription(thelabel)
print "done!"
# SpssClient.Exit()
SpssClient.StopClient()

Not as easy as the Sax version, but basically delivers the same results.
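
The file trick boils down to a write/read pair, which can be sketched standalone like this (current Python 3, nothing SPSS-specific). The function names and the fallback to the system temp directory when SPSSTMPDIR is not set are my assumptions for illustration; os.path.join is used so it does not matter whether the directory ends in a slash.

```python
import os
import tempfile


def param_path(name="changelabeltemp.txt"):
    """Build the parameter file path, preferring SPSSTMPDIR when it is set."""
    tmpdir = os.environ.get("SPSSTMPDIR", tempfile.gettempdir())
    return os.path.join(tmpdir, name)


def write_label(label, path=None):
    """Syntax side: store the label where the script can find it."""
    path = path or param_path()
    with open(path, "w") as f:  # 'with' closes the file even if the write fails
        f.write(label)
    return path


def read_label(path=None, default=""):
    """Script side: fetch the label, falling back to a default if the file is missing."""
    path = path or param_path()
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return default
```

Using a default on the reading side means the script still runs (with a placeholder label) if someone calls it without writing the file first.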

Well, not quite. For some reason, it keeps showing HTML tags and screwing with the font, making the output look ugly. I’ll keep playing and see if I can solve it.



PS: Note to self: you keep forgetting where these directories are, so here they are.


SPSS160_SCRIPTING_HOME 
C:\Program Files\SPSSInc\SPSS16 
SPSSTMPDIR 
C:\DOCUME~1\username\LOCALS~1\Temp\ 

so don’t forget!

Comments?

* * *

 

SPSS and Python, Take 2 · 06/19/2008 03:43 PM, Analysis

I’ve posted about SPSS’s jump into Python before, and I am still not loving the experience. I’m really trying to shift my stuff from Sax Basic (aka Winwrap Basic) to Python, and it’s just a pain. SPSS has provided tons of programmer style docs, but very little in the way of helping you understand the best way to approach the problem with Python.

Here’s what I have learned, with the help of Raynald Levesque’s fantastic book Programming and Data Management for SPSS 16.0: A Guide for SPSS and SAS Users (direct link to the book). You’ll also want to keep the SPSS-Python Integration package.pdf and SPSS Scripting Guide.pdf (docs for the SpssClient) files open and handy, both in C:\Program Files\SPSSInc\SPSS16\help\programmability (perhaps only after installing the plugin, but that’s where they SHOULD be.)

(BTW, Raynald wrote the first few versions of this book, but to be fair, it’s becoming a work with input from a collection of SPSS staff as well, so kudos to them all. When you see me refer to Raynald’s book, mentally thank the rest of the SPSS gang who keep improving it…)

Don’t wait til the last minute to do this stuff. You’ll come out a better person on the other end, but getting there will create scars. I’ll try to post things that tripped me up, but start early getting used to this new world.

This is a pain. You need to install Python first. SPSS doesn’t say if the ActiveState distro will work; they ship 2.5 by default. The ActiveState distro includes a much nicer IDE (PythonWin) and lots of helpful preinstalled modules for general programming; the SPSS distro includes NumPy and SciPy, which are handy for numerical programming… but includes no real IDE, which is kind of in keeping with the SPSS spirit of minimal programmer support. When this finishes, you install the Python Plugin. This is all on your CD, or you can download the plugin from the SPSS DevCentral Python Plugin page. Yes, you may need a free account to get to this, sorry. At that point, you are ready to go; if you want, you can go to the DevCentral downloads area and update some included scripts/programs, but it might be useful to get through this post first.

You probably know that you run “syntax” commands in the Syntax window; these commands tell SPSS to open datasets, process the data, etc. If you need to loop, there is a relatively little known extension called “Macro Language” which basically uses !variables and allows you to loop a command. While handy, it’s very limited.

Beyond macros, you could use Sax Basic Script to do more advanced things, including more sophisticated branching and looping, as well as modification of the Output Viewer tree, etc.

So, syntax is still syntax, the macro language is still there, and Sax Basic is still hanging on by a thread… but SPSS is hoping to replace the last 2 with Python. Macro stuff becomes a Program, and Sax Basic stuff becomes Scripts.

Huh?

There are actually 2 types of things you can do with Python inside of SPSS: Python Programs and Python Scripts.

What is the difference? Well, a Python Program can be inline with your SPSS syntax, just like a macro. You can do the following (cribbed liberally from Raynald’s book):

Sounds like a lot, right? Well, not quite. You don’t have full access to the output tree, so massive reshuffling of output is not available. Since that’s about half my time with SPSS, it is disappointing that this is not exposed. But honestly, you will probably never write another macro again. The only thing you need macro variables for is to pass information out of the Python portion back into regular syntax (Raynald describes this technique).

These programs use the phrase “import spss” as one of their first lines, which is the library SPSS coded up to expose their functionality to Python.

Ok, so what about the Python Script? These use a different library, the “import SpssClient” library. These focus on the stuff left out above, specifically:

Scripts are very akin to SaxBasic, which lived in a separate window (File | New | Script) and was run only via the SCRIPT command (if you are on Windows). Well, same limitations here. According to the docs (PDFs and help file), these Python Scripts cannot be used inside a Syntax file ala Begin Program / End Program (we’ll talk more about this below): “Python scripts can be run from Utilities>Run Script or from the Python editor launched from SPSS (accessed from File>Open>Script).” They don’t mention the SCRIPT command, which is kind of a huge omission. I understand that it’s Windows only, but that is a huge part of the SPSS userbase.

(Note: Though it is undocumented here, some comments from SPSS folks imply that the SCRIPT command will be improved in future versions, is an acceptable way to call Python, and may even allow parameter passing in later versions)

So, why the two? If you think about it, these Scripts are the “interface/windowsy” side of the system, while the Programs are more about actual data processing. Or, you could say that the Programs are focused on the “back end”, while the Scripts modify the “front end”, the “client”. This analogy falls apart if you push it too hard, but for most cases, it works. See more at SPSS Scripting Facility > Scripting with the Python Programming Language in the SPSS Help system.

So, in short: Programs are Python in your Syntax file, ala macros. Scripts are Python that run outside of your Syntax file, either by manual calls (File Run) or SCRIPT commands. I suspect there is no real reason for this split other than the way SPSS is programmed. I can only hope they eliminate this arbitrary distinction and confusion at some point in the future. In the meantime, Programs are what you will do most of the time, and Scripts will be the way to make things pretty.

By the way, Python is a full language on its own. So, you could write all your analytic stuff in Python, using it to read and process a file, call SPSS to read the resulting file, call SPSS to analyze some stuff, call SPSS to save the output, and then finish it off in Python. SPSS would show up in the background here and there, but you would never see it. This causes no end of confusion to authors who feel they need to cover both the “SPSS calling Python scripts/programs” and “Python scripts/programs calling SPSS” situations. Here’s my simple advice on it: try everything in SPSS first, and when you are an expert in running things from the SPSS environment, then become batch-master. Why the authors of these docs don’t write their articles this way is beyond me; lots of confusion could be avoided. (Probably the same editor who forgot to mention the use of the SCRIPT command to call external Python Scripts)

Raynald gives some clever examples of using Python to create dialog boxes to let users select variables, etc. In effect you could build a mini, constrained front end on top of SPSS to run just a single analysis for students or for clients who need analytics but want the complexities hidden away.

If you do any searching, you’ll see people on the SPSS-X list whipping out some cool Python with the “import viewer” command at the top. Sorry, that won’t work with V16 as of the writing of this article; the viewer “library” (aka module) has not been updated for v16. Some of it can be rewritten to work with the “import SpssClient”, but it’s an adventure. The SPSS Developer Central Downloads section shows some other modules to play with, but not all of them are ready for V16 (like the “tables” module), so be prepared to experiment. With my V16, I had spss, SpssClient, spssdata, and spssaux modules pre-installed.

I’ll try to give some hints on how to deal with the Output Viewer down below.

It winds up looking like this… just type the below into a syntax window and fire away.
BEGIN PROGRAM PYTHON.
import spss
print "Welcome to Python in SPSS!"
END PROGRAM.

The “PYTHON” is optional, but it’s good form. All the output shows up in Log sections in the viewer. Between the Begin and End, you are using Python, which means no periods at the end of lines, upper/lower case matters, and spacing/indenting is how you make sections/blocks of code.
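
To make the “indenting makes the block” point concrete, here is a tiny sketch of the kind of loop you would otherwise write a macro for: it builds one FREQUENCIES command per variable. It runs in plain Python with no SPSS installed, because it only builds the strings; inside a BEGIN PROGRAM block you would hand each one to spss.Submit() instead of just collecting them. The function name build_frequencies and the variable names are made up for illustration.

```python
def build_frequencies(variables):
    """Return one FREQUENCIES syntax command per variable name."""
    commands = []
    for name in variables:
        # Everything indented under the 'for' is the loop body.
        commands.append("FREQUENCIES VARIABLES=%s /ORDER=ANALYSIS." % name)
    # Back at the function's indent level: the loop is over.
    return commands


commands = build_frequencies(["SEX", "origin", "year"])
```

Note that the period ends the SPSS command inside the string; the Python lines themselves have none.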

I refer you to Raynald’s great book starting on page 219 for how to start making SPSS dance from Python. The best book on Python, in my opinion, is Hetland’s Beginning Python but there are a couple of pretty good ones, as well as free tutorials online. But a book is pretty handy.

For the SPSS portion, besides Raynald’s book, look for SPSS-Python Integration package.pdf and SPSS Scripting Guide.pdf (docs for the SpssClient), both in C:\Program Files\SPSSInc\SPSS16\help\programmability.

As I said, accessing output from these Python programs is a mixed bag. You can actually turn output into data (like making a dataset out of a frequencies output, something that SAS has done for years) by “walking the XML tree” of output, but it’s confusing. This can also be done with “OMS” syntax, but I still don’t understand that stuff, and it’s been around for years. Simplification here would be VERY APPRECIATED. Also, note that you can get to the output, but you can’t really reformat or shuffle it. If it’s a pivot table, you can do some stuff to it, but if it’s not, well, the program access is limited. (Yes, I know it can be as simple as http://support.spss.com/Tech/Troubleshooting/ResSearchDetail.asp?ID=40945 but somehow, mine always seem to be more complicated).
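
As a rough illustration of what “walking the XML tree” means, here is a sketch using Python's standard xml.etree.ElementTree on a toy fragment. The XML below is invented for the demo (element names, attributes, and numbers are all made up) and is NOT the schema SPSS actually writes, so treat it as the general shape of the technique, not as working OMS code.

```python
import xml.etree.ElementTree as ET

# Toy XML standing in for a frequency table; not the real SPSS output schema.
SAMPLE = """
<table name="Sex">
  <row label="Female"><cell name="Frequency">3</cell><cell name="Percent">30.0</cell></row>
  <row label="Male"><cell name="Frequency">7</cell><cell name="Percent">70.0</cell></row>
</table>
"""


def table_to_rows(xml_text):
    """Flatten each <row> into a dict: one record per table row."""
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.findall("row"):
        record = {"label": row.get("label")}
        for cell in row.findall("cell"):
            record[cell.get("name")] = float(cell.text)
        rows.append(record)
    return rows


rows = table_to_rows(SAMPLE)
```

Once the output is a list of records like this, turning it into a dataset is the easy part; the hard part in real life is finding your way around the actual output XML.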

Once you get it working, there are some cool things you can do, including using Python to make a GUI for a custom experience, or using “SPSS Extension Commands” to make new commands in syntax which call Python: In effect, you never have to deal with the Python junk, you just use your new command in syntax just like usual.

(BTW, if you run the SpssClient externally, remember to use SpssClient.Exit() before you do the .StopClient())

One of my most popular scripts is something I put on Raynald’s SPSSTools.net site, called Change The Label and Title of Last Run Procedure. It lets you change the labels and titles of pieces of output, so you don’t have a list of 20 “Frequencies”, but instead can label them “Freq of Gender filtering high income” or whatever, making the viewer tree much more usable. This is in SaxBasic, so I decided to make it work in Python.

This took some doing. The basics are the same, but I struggled with some Pythonic pieces.

The biggest problem: The script assumes you will pass it the new label as part of the script call. This works fine in Sax Basic in V16, but SPSS didn’t include a way to pass parameters into external Python scripts. I will post a workaround to this in a bit.

How does it work? Well, there are objects your script can play with. There is the SpssOutputDoc, made up of SpssOutputItems. SpssOutputItems include Pivot Tables, Headers, Charts, Text Items, Title Items, and Log Items. Each analysis creates a package of output items in the tree. I intend to walk back up the tree from the bottom and find the most recent title, and change it.

Say we run FREQUENCIES VARIABLES=SEX /ORDER=ANALYSIS. on a blank Output Viewer.

We get back, on the Tree Pane:
Frequencies
—Title
—Notes
—Active Dataset
—Statistics (which is a pivottable)
—Sex (which is a pivottable)

If we were to walk the tree with something like SCRIPT file="C:\Python25\Lib\site-packages\spss160\spss\titleheaderwalker.py". in my Syntax window, with the script below plopped in that directory, we get output which I’ve put in a table after the script.


# titleheaderwalker.py
import SpssClient
SpssClient.StartClient()
objOutputDoc = SpssClient.GetDesignatedOutputDoc()
objOutputItems = objOutputDoc.GetOutputItems()
for index in range(objOutputItems.Size()):
    objOutputItem = objOutputItems.GetItemAt(index)
    print "=================================================="
    print "Index = "
    print index
    print "Description = "
    print objOutputItem.GetDescription()
    print "GetType = " 
    print objOutputItem.GetType() 
    print "GetTypeString = " 
    print objOutputItem.GetTypeString() 
    print "SpecificType = "
    print objOutputItem.GetSpecificType() 
    print "SubType = "
    print objOutputItem.GetSubType()
    print "TreeLevel = "
    print objOutputItem.GetTreeLevel()
print "=================================================="
print "done!"
# SpssClient.Exit()
SpssClient.StopClient()



Index  Description     GetType  GetTypeString  SpecificType    SubType      TreeLevel
0      Output          ROOT     Blank          SpssHeaderItem  Blank        0
1      Log             LOG      Log            SpssLogItem     Blank        1
2      Frequencies     HEAD     Blank          SpssHeaderItem  Blank        1
3      Title           TITLE    Title          SpssTextItem    Blank        2
4      Notes           NOTE     Notes          SpssPivotTable  Notes        2
5      Active Dataset  TEXT     Text           SpssTextItem    Blank        2
6      Statistics      PIVOT    Table          SpssPivotTable  Statistics   2
7      Sex             PIVOT    Table          SpssPivotTable  Frequencies  2




So, using my previous code as a guide, we get the following:


import SpssClient, os
SpssClient.StartClient()
thelabel='New Label for Output!'
# want to change Title, then Heading
objOutputDoc = SpssClient.GetDesignatedOutputDoc()
objOutputItems = objOutputDoc.GetOutputItems()
for index in range(objOutputItems.Size()):
    objOutputItem = objOutputItems.GetItemAt(index)
    if objOutputItem.GetType() == SpssClient.OutputItemType.TITLE:
        # Fix the Title first
        objTitleItem = objOutputItem.GetSpecificType()
        objTitleItem.SetTextContents(thelabel)
        objOutputItem.SetDescription(thelabel)
        index = index-1       # Back up one for the header...
        objOutputItem = objOutputItems.GetItemAt(index)
        objHeaderItem = objOutputItem.GetSpecificType()
        objOutputItem.SetDescription(thelabel)
print "done!"
# SpssClient.Exit()
SpssClient.StopClient()

Call this with the Script command from your syntax right after running a procedure, and it will change the title and header to whatever is in that line next to “thelabel”. Useful, but only so much.
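
One caveat: as written, the loop runs top to bottom and relabels every Title it meets, not just the newest one. The “walk back up the tree from the bottom” idea can be sketched in plain Python like this, with a list of (description, type) pairs standing in for the real output items (taken from the table above). The function name find_last_title is made up, and the assumption that the header sits one slot before its title comes from the same table.

```python
def find_last_title(items):
    """Scan from the bottom up; return (title_index, header_index) or None."""
    for index in range(len(items) - 1, -1, -1):
        description, item_type = items[index]
        if item_type == "TITLE":
            # The header for a procedure sits just before its title.
            return index, index - 1
    return None


# Stand-in for the viewer tree after one FREQUENCIES run.
items = [
    ("Output", "ROOT"),
    ("Log", "LOG"),
    ("Frequencies", "HEAD"),
    ("Title", "TITLE"),
    ("Notes", "NOTE"),
    ("Active Dataset", "TEXT"),
    ("Statistics", "PIVOT"),
    ("Sex", "PIVOT"),
]

last_title = find_last_title(items)
```

In the real script the same reverse range would go over objOutputItems.Size(), with GetType() compared against SpssClient.OutputItemType.TITLE, stopping at the first hit.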

Why? You shouldn’t have to edit the python file every time you need to label something! That’s silly.

I’ll show in a later post how I worked around that problem. Anyway, compare the SaxBasic version to how I did it here, and you’ll see that they are pretty similar, give or take.

In the SPSS Help file, search for “Script Editor for the Python Programming Language”




PS: I keep giving Raynald props, but Jon Peck of SPSS is tireless, an absolute robot, when it comes to helping people on the forums and mailing lists with these kinds of problems. That man is a living, breathing SPSS processor and is a daily lifesaver. He should be knighted.

Comments? [2]

* * *

 

CABbing with SPSS · 06/12/2008 02:19 PM, Analysis

I had the honor of being invited to Chicago to provide feedback on SPSS product directions as part of their Customer Advisory Board, or CAB.

I will not be able to talk about the details of what I saw, but I wanted to talk a little bit about some of our conversations to help folks understand some things about where SPSS is heading.

Big focus on enterprise integration. In the past, they really saw the products as silos, and did little to get the products integrated into the enterprise workflow. Now, they are offering products to combine the different product capabilities into a more cohesive whole. This is at the macro level (Predictive Enterprise Server) as well as on the products (unifying codebases to allow cross platform and cross-product consistencies). Clementine will continue to be the high-end offering, however, with more non-statistical stuff happening there first, as well as better handling of big data. I didn’t get to spend as much time with the Clem guys as I would have liked, so not much to talk about here.

Heard our complaints. I was pleased to have a unanimous “we agree” from other CAB participants when we complained about the lack of “proc SQL” from SAS, or the ability to run SQL-like queries against an SPSS dataset for manipulation, merges, and counts. Clearly, this is a pain the userbase has felt that SPSS just hasn’t really recognized til now. We also all complained about desktop speed, and SPSS suggested that they might have some things in mind to speed up certain parts, sort of like the multithreading they started releasing in v16. We also all mentioned the old fashioned syntax window, and they suggested that there may indeed be a pot at the end of the rainbow. They also showed us some of v17 and v18 plans, and there are some great things ahead. More modern models for the stats folks, but also an increased emphasis on making it easier for junior analysts to do basic analytic tasks. What was clever was the range of things they are helping solve, from tactical “help me do this step” (like the data deduper) to the more strategic “help walk me through how to do this multistep process to get this output”. And for some stuff we saw, we were astounded (or at least, we used very graphic language about what we saw). Yes, there are hints of what’s to come in this paragraph.

Increasing emphasis on open. While it was nowhere near what I asked for (Release an R-only front end that looks like SPSS! Allow all SPSS features to be API callable from Python!), they were certainly considering expanding the integration of Python and R with SPSS. I think this is a great approach, since SAS defines “open” as “if you do it in SAS, it’s open to other things you do in SAS”. When competing against a giant, flanking is very powerful, and I think SPSS might be on to something here. And time to start learning Python; SAX Basic will be supported for a little while longer but I think its time is drawing to a close. Besides SPSS Developer Central, the SPSS-L and SPSS newsgroups, the SPSSTools.net site by Raynald Levesque, and anything by SPSS’s Jon Peck on the web will help you get up to speed on advanced uses of Python and SPSS.

You can do a lot with the basics. One guy was reviewing all the analyses he did, from store location analysis to sales forecasting to account-rep assignments and sales force management. We started asking how he used the geo-mapping features, or how he handled seasonality as part of the neural nets, when he raised his hand to stop us. “I do all this with regression and ARIMA.” Sometimes, we love our new toys so much that we forget about the tried and true. While this guy said that he would look at these other ideas we’ve suggested, we were all thinking “perhaps we should just try some regression instead of jumping onto some pattern detection fancy thing”.

Lots still in there, stuff that neither you nor I have even looked at. Did you know that you can output to XML from SPSS script? That you can copy dictionaries from one file to another? That variables can now have “custom attributes” like tags? There are lots of new convenience features in each version that don’t really get lots of play, you sort of have to find them. If you have a ton of script to work around something annoying, might be useful to look at the latest PDFs to see if SPSS added something to solve the issue.

Shoutouts to Karl Rexer of Rexer Analytics who was great to catch up with, and Bob Muenchen, author of the best book on using R, R for SAS and SPSS users. BTW, this 80 page early version is now a 550 page book published by Springer-Verlag, pre-order at Amazon.

I hope SPSS will give me permission to talk more about some of the things we spoke about, and keep your eyes peeled to learn more about the future of SPSS.

Comments?

* * *

 

What a... · 05/29/2008 11:34 AM, Trivial

Thanks to my brother Darren for loading me up with tons of goodies, including this stellar clip:

Comments? [1]

* * *

 

Yamipod, Floola same program? · 05/20/2008 11:52 PM, Tech Personal

I don’t like iTunes… but I’m forced to use it. Over the years, however, folks have figured out ways to reverse engineer the database structures Apple uses on the iPod and provide some useful features that iTunes lacks, the most important being the ability to take songs off of the iPod.

There were always a few leaders in the windows world:

Yamipod
Floola
vPod (kind of bare bones, but an early entry)
EphPod (first one I used, but not updated in a while)
and more recently, Sharepod

(BTW, if you want a different metaphor, Anapod Explorer (commercial with a trial) was a cool idea of integrating iPod with the windows explorer instead of requiring a separate app. It never delivered enough integration for me to keep using it, but clever idea.)

These also have the advantage of running off of your iPod in disk mode, so you have the ability to mount the iPod to any PC via USB, run these programs, and either play tunes, add tunes, or copy them off to your new location.

The strange thing was that Yamipod and Floola always had releases near each other, looked like each other, and had the same bugs and error messages. Yet the sites were slightly different, and the forums for each never mentioned the other.

Floola:

Yamipod:

I wondered: was one guy ripping off the other? I mean, ripping off the source code in such a way that the error messages in both have the same misspellings and typos? Reverse engineering one into the other? But people kill for less online, and neither guy was complaining about the other in forums or blogs.

So I did some digging. Turns out both sites are linked to the same guy, Tomas Camin (P.IVA: 06020870967). Both programs were written in RealBasic. And Floola appears to have become the winner, though Yamipod was much more popular. Yamipod hasn’t been updated in a while, but Floola is keeping up with iTunes releases. So, basically, yes, they are the same program… or were, until Tomas started giving all the love to Floola.

On my iPod, I have Floola and Sharepod, and have removed the rest. Neither is perfect, so don’t trust your only copy of a song to them. But for quick and dirty syncing, or sharing files (completely legal ones only, of course) with friends, these programs are lifesavers.

For more fun, read Lifehacker on iTunes software replacements and 10 alternatives to iTunes

Comments?

* * *

 

Processing in JavaScript · 05/16/2008 06:16 PM, Tech

There has been a slew of recent programming languages aimed at “graphical designers” instead of programmers. Flash is one that has grown powerful, but Sun is pushing JavaFX and the MIT Media Lab created Processing. Both of the latter require the JVM.

These languages are not fully baked scripting languages, but are instead focused on making it easy to draw imagery and visualize basic data. Animation, 3D motion, and other graphical tasks are reduced to script commands to make it easy to create images.

In a labor of love, John Resig has converted the entire Processing Language to run in JavaScript in a browser.

Suddenly, easy and powerful graphics are available to any web page running in a modern browser without the need to dip into Flash, and of course, since it’s all JS, other JS libraries can work with these capabilities.

Just to whet your appetite, look at the molten bar chart.

Very cool.

Comments?

* * *

 

Ubuntu 8.04 Hardy Heron and Virtual PC · 05/13/2008 11:51 PM, Tech

Update
According to a comment below (thanks, Jerry!), someone in the comments on the Arcane Code site wrote:

Hi! I was getting the same error message trying both booting into live environment and install mode. Adding boot parameters noapic nolapic and vga=791 solved the problem for me.

In addition, a comment from BimmerM3 links to a post where a guy reveals that noreplace-paravirt is the magic thing to add to get past the errors.

I’ll try that next but wanted to put it here in case you guys have time to try it before I do.
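For anyone who wants to try the two suggestions above, here is a rough Python sketch of what "adding boot parameters" amounts to: tacking extra words onto the kernel boot line. The boot line in the example is a made-up stand-in (use whatever your own boot menu shows), the helper name is hypothetical, and putting the new parameters ahead of a trailing "--" is my assumption, not something the commenters specified.

```python
def add_boot_params(boot_line, extra):
    """Add kernel parameters to a boot line, skipping ones already present."""
    tokens = boot_line.split()
    new = [p for p in extra if p not in tokens]
    if "--" in tokens:
        # Assumption: keep the new parameters ahead of the "--" separator.
        i = tokens.index("--")
        tokens[i:i] = new
    else:
        tokens.extend(new)
    return " ".join(tokens)


# Hypothetical boot line for illustration only; yours may differ.
boot_line = "boot=casper quiet splash --"

# Suggestion 1, from the Arcane Code comment quoted above.
fix_a = add_boot_params(boot_line, ["noapic", "nolapic", "vga=791"])

# Suggestion 2, from the post BimmerM3 linked to.
fix_b = add_boot_params(boot_line, ["noreplace-paravirt"])

print(fix_a)
print(fix_b)
```

Either resulting line is what you would type in at the boot menu. I have not confirmed which of the two (or both together) is actually needed on Virtual PC.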

Original Post: ================

I’ve written before about my fun with Virtual PC and Ubuntu.

Ubuntu 7.10 and No Mouse on Virtual PC
Ubuntu 7.04 and No Mouse on Virtual PC
All Nettakeaway articles with Ubuntu

Now, with the latest release of the 8.04 Hardy Heron ISO, I can’t even get the LiveCD to boot. I get to the Select Language screen, select English, then choose “Try Ubuntu without any change to your computer”. It loads the kernel quickly, then I get “An unrecoverable processor error has been encountered. The virtual machine will reset now.”

There’s probably some parameter here or there that I need to set, but yet again, the most popular Linux config out there somehow failed to test on the free virtual environment owned by the largest software company in the world. I know Linux guys’ disdain for MS, but come on.

So, I’m sure I can install it to a drive and all will work. But right now, I’ve got nothing to say about it, and that’s not good.

I will say that Innotek’s VirtualBox, now part of Sun, loaded the LiveCD flawlessly with really great performance. An amazing open source product, I highly recommend it. Download it here.

BTW, I have no idea if the mouse will work or not… but I would check those links above and be prepared.

Comments? [11]

* * *

 

Great Tools for Video Conversion · 05/06/2008 06:30 PM, Tech Personal

Was in a jam to convert some files, and pulled down these guys. Very handy. Both for Windows XP, btw; may work on Vista but I haven’t tried them yet.

SUPER © is somewhat complex, but can convert pretty much any audio and video format to any other format. And it’s Free.

Combined Community Codec Pack is a nice collection of Codecs and Filters to play almost any file in your usual players. Really helpful.

BTW, if you run something like Sherlock, you might see an error similar to “Warning: The following codecs were found broken: ff_vfw.dll”. According to http://www.cccp-project.net/smf/index.php?topic=2459.0, this is just an error with Sherlock, and the codec should work fine.

Comments?

* * *

 

Competitor's Tools, or just Tools? · 04/28/2008 01:02 PM, Analysis Marketing

I was working with a friend in Sunnyvale and we were trying to solve a problem. After the 4th spreadsheet mailed back and forth and 20 more minutes of “Ok, go to cell C137, and change that formula to, ready, =A27/$Q$5 and copy it into every 3rd column and you’ll have what I have”...

I suggested just going to Google Docs and setting up a quick shared spreadsheet.

Silence on the phone. “But.. but they are a competitor”.

“In what way? Do we sell a spreadsheet?” I wanted to hear her real concern, and this is the standard knee-jerk reaction.

“No, but this is confidential information and we can’t let it out of the enterprise. We have a policy about this, don’t we?” she returned.

“Ok. But I saw some people sending spreadsheets around via Yahoo! IM, which is inherently non-secure. People email conversion reports to client accounts hosted via Gmail; email is non-secure and Gmail is, well, Google. So, sometimes, we do let things out of the enterprise.”

I continued. “And for this sheet, you have Group A and Group B all over the place with no actual client or group names. In fact, other than the headers, this could be data from a chemistry experiment. And yes, we do have a policy, but I’m not giving the data to others or taking it home; I am using a tool hosted outside of our corpnet to process unidentified numbers.”

She jumped in: “But they can see this data! This would reveal confidential information about this advertiser’s spend if it ever got out, or Google could use it against us in their strategy. Or they could simply leak that we use their office suite. Imagine the press: Yahoo! uses Google to run their business.”

“Of course they can see this data if we use their system. They see millions of bytes of non-identified data per second. I don’t think they will be able to figure out what Group A really is, since I get confused and I do know… And given recent news, aren’t we testing using Google to run part of our business anyway? It’s not like we are getting rid of Excel or OpenOffice (which rocks, by the way), but for interactive spreadsheeting, they have a great solution. By the way, Yahoo! buys search ads on Google to drive traffic to our properties. And guess what: Google buys ads on Yahoo! to drive traffic to their offerings (like Gmail and Docs). So, I guess we are already using each other to run parts of our business.”

She was adamant. “No. We won’t put any of this on Google’s spreadsheets. You are insane to suggest it.”

I was a bit bleary at this point. “Ok, what if we didn’t use any column headers other than pure codes? Group A and B stay, but we change Impressions to S, Clicks to K, and CPM to Q. It’s completely contextless, only you and I know what these are, and we can get this done in minutes. And think of how smart this is: They are paying programmers to improve this product so that we, their competition in search, can benefit by being more efficient. If we want to acquire or license SharePoint for Excel or EditGrid or ThinkFree or Zoho or any of the others then I’ll use them. But til then, why not leverage the efforts of our foe to conquer them? It’s like Judo!”

More silence.

“I’m emailing you the sheet again. Please let me know when you get it”. You could hear her gritted teeth.

So, we continued sending the 600k sheet back and forth, and solved the problem. We are both Excel wizards and knew all the tricks, so while there were hassles, it was comfortable. It also led to lots of copies of this sheet in my inbox and sent mail, and the same for her; our mail server sent extra junk, and of course, it will all get backed up.
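BTW, the "pure codes" idea I pitched would have taken a minute to set up. Here is a minimal Python sketch of it, assuming the sheet is exported as CSV first. The code map (Impressions to S, Clicks to K, CPM to Q) comes from the conversation above; the function name and the sample rows are made up purely for illustration.

```python
import csv
import io

# Code map from the conversation; only the two people sharing the
# sheet would know what each letter stands for.
HEADER_CODES = {"Impressions": "S", "Clicks": "K", "CPM": "Q"}


def anonymize_headers(csv_text, codes=HEADER_CODES):
    """Return csv_text with its header row recoded; data rows are untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Headers without a code (e.g. "Group") pass through unchanged.
    rows[0] = [codes.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up sample data, not real numbers from anywhere.
sample = "Group,Impressions,Clicks,CPM\nA,1000,25,2.50\nB,1200,18,2.10\n"
coded = anonymize_headers(sample)
print(coded)
```

Only the header row changes, so formulas that point at cell positions keep working. Whether contextless numbers are "safe enough" to put on someone else's server is, of course, exactly the question we argued about.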

So, I ask you, my audience. You work at big and small companies, publicly traded and not. Think of this as a business school case study. Comment away, on the below or on anything else.

Comments?

* * *

 
