The Net Takeaway: Clementine Scripting Explained

Clementine Scripting Explained · 12/03/2004 11:53 AM, Analysis

Scripting in SPSS’s Clementine… You too can do it!

Yes, the entry you’ve all been waiting for. How does scripting in Clementine work? I tell you, it took some digging. The documentation for this is written in a strange way: it talks all about various node commands and such without ever really documenting how the language works until the end of an appendix section. BTW, I am working from Clementine 7.1; I understand that 9 is about to be released, and I hope to get that up and running sooner or later.

1) How to run a script? The easiest way is from Tools | Standalone script. You can also attach a script to a stream which means that when the stream is run, so is the script. SuperNodes can also have scripts.

2) Basics of the language:

Note that there are no syntax highlighters or other “coding” features in Clementine. No debug print or other debugging functions are available, so there is no stepping and there are no breakpoints either. You can open a file for output and log to it (see below), but that isn’t the most elegant of techniques. In general, try to find a way to do what you want with the GUI, because coding in this environment is about as painful as coding in SPSS syntax (yes, there is a footnote on that: see note 1 at the end of this post).

Assignment
Assignment is handled with set and a single = sign. Variables can be integers, strings, or “objects” which are really just Clementine things (nodes, streams, etc.). You cannot make your own objects or data structures, and there are no arrays or other indexed or grouped data structures.

Assignment is pretty generic… that is, from the simple statement description set PARAMETER = EXPRESSION, we have many things we can set:

PARAMETER can be:

EXPRESSION can be:

But basically, thinking of it as name=value gets you most of where you need to be.

Quoting and Special Syntax
String literals (including filenames) need to be double quoted: "druglearn.str". If necessary, you can use single quotes around strings (but filenames always need double quotes).

Also, CLEM expressions need to be double quoted (that’s rather annoying, isn’t it?). If you use quotation marks within a CLEM expression, make sure that each quotation mark is preceded by a backslash (\), for example:

  set :node.parameter=" BP=\"HIGH\""

Parameter references should be preceded by a ^ symbol, as in ^mystream.

Comments are somewhat traditional: # is a single line, /* */ is for multilines.

If you need to continue a statement on the next line, you HAVE to end the line with a /. This is similar to VB, which also requires an explicit continuation character (an underscore there).

  set :fixedfilenode.fields = [{"Age" 1 3}/
  {"Sex" 5 7} {"BP" 9 10} {"Cholesterol" 12 22}/
  {"Na" 24 25} {"K" 27 27} {"Drug" 29 32}]

Flow Control

Looping and iteration are very limited.
You can iterate across the fields in a node (i.e., in your data) with for f in_fields_at type. For loops are closed with endfor.

“For” has a few other versions:

Branching is via if-then-else-endif. Unlike assignment, logical equality testing requires 2 (two) = signs: if first == 1 then
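
Here is a minimal sketch of the full if-then-else-endif shape, assuming a stream that contains a sample node. The variable name first comes from the test above; the max_size values are made up for illustration.

```text
var first
set first = 1

# Branch on the variable: note == for the test, = for assignment
if first == 1 then
  set :samplenode.max_size = 200
else
  set :samplenode.max_size = 1000
endif
```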

As part of manipulation, the “with” construct is available. If you have multiple streams available, you can specify which one to interact with for a series of commands:

  with stream STREAM
    for I from 1 to 5
      set :selectnode.expression = 'field > ' >< (I * 10)
      execute
    endfor
  endwith

Most of the Clementine functions are available in the scripting language (and have to be double quoted when used!). String concatenation is really oddball here: you use >< (yes, the greater-than and less-than signs).

Manipulating Clementine

Nodes have some special issues: you refer to nodes either by their name (a good reason to name each node), or by name:type if you have different types of nodes with the same name. You can leave off the name to refer to all nodes of a certain type (:neuralnetnode). Things get a bit trickier with indirection: set n = "Drug1", and then you can refer to it as ^n. (This is handy for looping.)

Basically, each node is an object with properties. Almost every property in the GUI for a node is also exposed as a script property (btw, these node properties are also called “slot parameters” by Clementine folks). As mentioned above, you set them with set Name=Value.

If you want to set multiple properties at once, use braces ({}).

  set :samplenode {
    max_size = 200
    mode = "Include"
    sample_type = "First"
  }

(Note that in this example, the script would affect EVERY samplenode in the stream.)
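
For comparison, the brace block above should be equivalent to three separate set statements, one per property. This is a sketch built from the set :node.property = value form used elsewhere in this post, not something lifted from the manual:

```text
# Same three properties, set one statement at a time
set :samplenode.max_size = 200
set :samplenode.mode = "Include"
set :samplenode.sample_type = "First"
```

The brace form mostly saves you from retyping the node reference on every line.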

Besides setting properties, you can perform many actions with the Clementine collection of objects:

Like Perl, Clementine scripting has “special variables”: reserved words that refer to the current object. They are:
* node—the current node
* stream—the current stream
* model—the current model
* generated palette—the generated models palette on the Models tab of the managers window
* output—the current output
* project—the current project
So, for example:

  save stream as "C:/My Streams/Churn.str"

Not much can be done with output. SPSS has been gradually adding to its OMS (Output Management System), mirroring SAS’s additions and capabilities that Statsoft’s Statistica has had since day one. However, little of this has migrated to Clementine yet, so there are very few ways to manipulate results. Basically, each terminal node includes a read-only parameter called output that can be used to access the most recently generated object.

For Tables, you can get access to a few attributes and values in the data that was generated. For example:

  set num_rows = :tablenode.output.row_count
  set num_cols = :tablenode.output.column_count

The values within the data set underlying a particular generated object are accessible using the value command:

  set table_data = :tablenode.output
  set last_value = value table_data at num_rows num_cols

Indexing is from 1.

Creating Files
Open (create) a new file with open MODE FILENAME, where MODE is either create (creates the file if it doesn’t exist, or overwrites it if it does) or append (appends to an existing file, and generates an error if the file does not exist). This returns the file handle for the opened file, so it is best to open the file as part of an assignment statement.

write|writeln FILE TEXT_EXPRESSION works as expected. close FILE is needed to flush any output caching.

So:

  set file = open create 'C:/Data/script.out'
  for I from 1 to 3
    write file 'Stream ' >< I
  endfor
  close file
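
Putting the table output and file pieces together, here is a rough sketch of the logging workaround: it dumps the first column of a table to a text file, one row per line. It assumes the stream has a single table node, that execute :tablenode runs it, and that a variable is accepted as the upper bound of the for loop. The file name and the variable names (out, cell, table_data, num_rows) are made up. Treat it as an untested starting point rather than a recipe.

```text
var table_data
var num_rows
var cell
var out

# Run the table node so that its output object exists (assumed syntax)
execute :tablenode
set table_data = :tablenode.output
set num_rows = :tablenode.output.row_count

set out = open create 'C:/Data/first_column.out'
for I from 1 to num_rows
  # Indexing starts at 1: row I, column 1
  set cell = value table_data at I 1
  writeln out 'Row ' >< I >< ': ' >< cell
endfor
# close flushes any cached output
close out
```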

CLEM in Scripts
Make sure to examine the Parameters section as well.
Pretty much every CLEM expression is available except any @ functions, date/time functions, and bitwise operations. Also, CLEM expressions have to be in double quotes (and if you have quotes in the expression, they need to be escaped with a backslash, \).

Examples of CLEM expressions used in scripting are:

  set :balancenode.directives = [{1.3 "Age > 60"}]
  set :fillernode.condition = "(Age > 60) and (BP = \"High\")"
  set :derivenode.formula_expr = "substring(5, 1, Drug)"
  set Flag:derivenode.flag_expr = "Drug = X"
  set :selectnode.condition = "Age >= '$P-cutoff'"
  set :derivenode.formula_expr = "Age - GLOBAL_MEAN(Age)"

Parameters and Misc
The scripting language often uses parameters to refer to variables in the current script or at a variety of levels within Clementine.
* Local parameters refer to variables set for the current script using the var command.
* Global parameters refer to Clementine parameters set for streams, SuperNodes, and sessions.

Local Parameters are just variables. They need to be predefined with the var command. If you use them to point to nodes, then you need the ^ indirect syntax:

  var my_node
  set my_node = create distributionnode
  rename ^my_node as "Distribution of Flag"

Global Parameters are single quoted: '$P-Maxvalue'. These are well covered as part of the CLEM function explanation in the manual.

The exit command comes in four forms: exit current, exit current CODE, exit Clementine, and exit Clementine CODE. They let you exit the script or the program, with an optional return code for batch use.

Okay, that should be enough to get you started. For further information, here are some good places to look:

Soon, I’ll post some code samples for you to learn from…

Finally, one last thing. What’s missing from all this stuff? External control of Clem. That is, other than calling command lines on scripts, I can’t externally use any of the modules. SPSS is a bit better, in that I can control modules using SAXBasic, but it is still problematic for external calls. I don’t want to embed anything, but since Clem is all in Java, one thought is that they could expose the API and class structure and let us call the jars from other languages. Imagine using Jython or Judoscript to control Clementine… But that’s also for another time, I guess. It’ll probably have a web services interface or a “cook dinner” node sooner than having SPSS realize that its customers are also partners and open the kimono, but we can dream…

(Yes, I know that some folks have embedded SPSS and Clementine in tools… but note that almost every one of those is actually sold by SPSS… And it’s not like SAS is opening its world either… but looking at Eclipse, and how it’s both a tool and a toolset, I think the world is moving that way for the best tools. And, btw: Statistica by Statsoft does document the heck out of their APIs, so some are starting to play this way. And so does R 8-). )

++++++++++++++++++++++++

1 I guarantee you that the people coding the SPSS programs are not using notepad; they are using Eclipse and Visual Studio and other programming tools with the features of modern development environments. So, why can’t they extend that to those of us coding with their products? Clem and SPSS both have programming environments not much more advanced than notepad (or pico/nano for you unix folks). I say, force the SPSS programming team to code the SPSS products with notepad and a command-line compiler for a week as punishment… and I bet we would see some amazing programming enhancements in the next release of the SPSS product line.

* * *

 

  1. Hey, I love your website - I am trying to learn some Clem scripting at present and can resonate with some of what you said. Because of your programming expertise I wanted to mention that SPSS 14 will have:

    Python (http://www.python.org/), an open-source programming language, is now embedded within the SPSS syntax language. Users can write and execute Python code in-line with SPSS syntax commands that are sent to the SPSS processor.

    Among many other improvements that will, I hope, open the kimono for the power users out there! Thanks again for your succinct website.

    Best,

    James


    James E. Parry    Jun 7, 05:21 PM    #


  2. Hi, I’m in my fourth year at a business and management school. I read your article on Clementine; it is really interesting and I appreciate it a lot.

    The field of data mining fascinates me, above all because it is a living field that evolves with technology. I have had an introduction to this field, and I am looking for exercises (TDs) to round out my knowledge and put into practice what I have learned over these three years. With that in mind, I would like you to help me by sending exercises or any documentation you consider useful.

    I would really like to learn from you, from your experience and your expertise. I assure you that I am only trying to grow in this field, relying on self-study and, of course, on your help.

    I am very motivated and very passionate, and I know that this world is full of good people who help others with whatever they have: know-how, ideas, knowledge…

    So, I wish you all the best and continued success in this field.


    mimi    Jun 8, 07:03 PM    #


  3. Need urgent comment on how to incorporate processing of a transpose node in a Clementine script.
    Can we customize a script to essentially “read the values” for “new field names” within the transpose node. If this is possible, the stream can truly be made automated. Looking forward to your expert comment. Thank you.


    Sunpreet Singh Khanuja    Dec 13, 07:25 PM    #

