You know, it’s pretty hard to get the table underlying a histogram in SPSS. Frequencies are easy to run, but if you want to bin the data and see how the buckets size up, you are sort of out of luck.

SPSS will gladly make the histogram graph for you (from many different places, including FREQUENCIES, GRAPH, IGRAPH, EXAMINE, etc.), but if you want to use the bins or pop the data into Excel for better charting and easier formatting, tough.

SPSS doesn’t want to make it easy. First off, they don’t want to tell you how they calc the bins. The following is from the SPSS Tech Notes, circa Sep 29, 2004.

http://support.spss.com/tech/troubleshooting/ressearchdetail.asp?ID=49426

Q: How are histograms binned in SPSS Base for Windows? I’d like to know the algorithm.

A: The algorithm is part of our intellectual property so we’re unable to provide too much detail. If you specify either the bar width or number of bars, that determines the interval directly. Otherwise the number of bars is calculated by an algorithm that uses statistical theory to suggest a number of bars that is optimal for a data set of the size provided, under an assumption of normally-distributed values. This optimal value may be overridden if the algorithm detects granularity in the data (i.e. values distributed at discrete locations). This granularity will be used to calculate interval widths when the number of bins suggested is not much larger than the value derived from the other algorithm.

OK, great, but how do I then get the bins myself? There appears to be no way for me to call that process directly to create bins, so I have to rely on eyeballing the histogram graph to figure out the breaks. (BTW, edit the chart, change the X axis to count of 1, and then the eyeballing is easier. Still insanely manual, but easier.)
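Since SPSS won't share its algorithm, one way to get reproducible breaks is to compute them yourself outside SPSS. The sketch below uses Scott's normal-reference rule, a published textbook rule that also assumes roughly normal data. To be clear, this is a stand-in and not SPSS's algorithm, so the breaks will not necessarily match the SPSS chart. The function name `scott_bin_edges` is made up for illustration.

```python
import math
import statistics


def scott_bin_edges(values):
    """Return histogram bin edges using Scott's normal-reference rule.

    Assumption: this is a published textbook rule, NOT the SPSS algorithm.
    Bin width h = 3.49 * s * n ** (-1/3), with s the sample standard deviation.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    lo, hi = min(values), max(values)
    s = statistics.stdev(values)
    if s == 0:
        return [lo, hi]  # all values identical: a single bin
    width = 3.49 * s * n ** (-1 / 3)
    # Round the bin count up, then spread the bins evenly over the observed range
    n_bins = max(1, math.ceil((hi - lo) / width))
    step = (hi - lo) / n_bins
    return [lo + i * step for i in range(n_bins + 1)]
```

Calling `scott_bin_edges(my_values)` on a list of numbers exported from SPSS returns the list of break points, which could then be typed into the Visual Bander or a RECODE by hand.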

So, my current silly solution is a multistep and annoying process. First, hope you have v12 or above. Use the Visual Bander to make some bins… but assume you will have to fix them. Then, either IGRAPH or GRAPH them… but of course, you will have to apply a chart template or risk getting either the fire-engine red or khaki-on-grey color scheme, neither of which is client-ready.

If you don’t have the Visual Bander, the syntax winds up looking like this:

```
*Visual Bander.
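*mailcnt_mean.
* RECODE uses the first range that matches, so each LO THRU below starts just above the previous cutoff.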
RECODE mailcnt_mean
( MISSING = COPY )
( LO THRU 0 =1 )
( LO THRU 10 =2 )
( LO THRU 20 =3 )
( LO THRU 30 =4 )
( LO THRU 40 =5 )
( LO THRU 50 =6 )
( LO THRU HI = 7 )
( ELSE = SYSMIS ) INTO mailcnt_bnd.
VARIABLE LABELS mailcnt_bnd 'mailcnt_mean (Banded)'.
FORMAT mailcnt_bnd (F5.0).
VALUE LABELS mailcnt_bnd
1 '<= 0'
2 '1 - 10'
3 '11 - 20'
4 '21 - 30'
5 '31 - 40'
6 '41 - 50'
7 '51+'.
MISSING VALUES mailcnt_bnd ( ).
VARIABLE LEVEL mailcnt_bnd ( ORDINAL ).
EXECUTE.
```

For some reason, SPSS made the labels 11.00 vs. 11, so I hand-edited them for my purposes. And yes, I wasn’t really n-tiling; I wanted things which made sense for my viewers, hence the even breaks. RANK and a few of the other procedures can help with n-tiles, but even that is a pain.
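For comparison, here is roughly what the "histogram table" I'm after looks like when built outside SPSS. This is only a sketch in plain Python, assuming the values have been exported to a list. The cutoffs and labels are copied from the RECODE above; the function names (`band`, `histogram_table`, `to_tsv`) are made up for illustration.

```python
import bisect
from collections import Counter

# Inclusive upper bounds and labels copied from the RECODE above;
# anything above the last bound falls into the final '51+' band.
UPPER_BOUNDS = [0, 10, 20, 30, 40, 50]
LABELS = ["<= 0", "1 - 10", "11 - 20", "21 - 30", "31 - 40", "41 - 50", "51+"]


def band(value):
    """Return the 1-based band number, mirroring the first-match RECODE ranges."""
    # bisect_left finds the first bound >= value, i.e. the first LO THRU that matches
    return bisect.bisect_left(UPPER_BOUNDS, value) + 1


def histogram_table(values):
    """Return (label, count) rows for every band, including empty ones."""
    # None stands in for a missing value and is left out of the counts
    counts = Counter(band(v) for v in values if v is not None)
    return [(LABELS[i], counts.get(i + 1, 0)) for i in range(len(LABELS))]


def to_tsv(rows):
    """Format the rows as tab-separated text for pasting into a spreadsheet."""
    lines = ["band\tcount"] + [f"{label}\t{count}" for label, count in rows]
    return "\n".join(lines)
```

Something like `print(to_tsv(histogram_table(values)))` then gives one row per band. Note that a fractional value such as 10.5 lands in the band labeled '11 - 20', exactly as it does in the RECODE, so the labels only read literally for whole-number data.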

This was really tons of work. I think SPSS should allow the histogram (no matter what procedure made it, including graph or igraph) to also generate a table.

Ah well, another thing to dream of getting.

BTW: to change the variable “level” from Scale to Nominal or Ordinal, try

```
Variable Level
var1 var2 (Scale)
/var3 var4 (Nominal)
/var5 (Ordinal) .
```

* * *