The Net Takeaway: R Packages









R Packages · 02/27/2007 02:37 PM, Analysis

(First off, if you found this page via a web search or bookmark, you may be much happier in the R Section of this site to see the multiple articles about R, including this one, but also about Packages, Data manipulation, etc.)

Packages are bundles of additional functionality. They can be analyses, datasets, or just tools. On Unix, they come as source code and get compiled on your system. For Windows, the R team has pre-compiled many of them, but sometimes they don’t work. (All together now: It’s open source. Get over it.)

Just as CPAN is the home of all add-ins for Perl, CRAN is the home of all add-ins (packages) for R. While a few packages here and there are not mirrored on CRAN, assume CRAN is the best place to start.

What’s on CRAN? Check out

library() lists what’s available on your current box
search() lists what’s loaded
library(packagename) loads it in
detach("package:packagename") unloads it
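Here is a minimal sketch of that load/unload cycle. It uses stats4 purely as a stand-in, since it ships with R itself; any installed package name would do.

```r
# What is installed on this box (library() returns a table we can index)
available <- library()$results[, "Package"]
head(available)

# Load a package, confirm it is on the search path, then unload it
library(stats4)
loaded_after_library <- "package:stats4" %in% search()

detach("package:stats4")
loaded_after_detach <- "package:stats4" %in% search()

print(c(loaded = loaded_after_library, still_loaded = loaded_after_detach))
```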

Adding a package? First you have to get it, using install.packages(name); note the plural. Then, to activate it, use JGR’s package manager, or type the commands below; the path below is your default library dumping ground, but feel free to substitute your favorite path. Remember, the names are case-sensitive, so Hmisc must be spelled exactly that way.
install.packages(c("RODBC"),"C:/Program Files/R/rw2001/library");.refreshHelpFiles()

Don’t forget to run .refreshHelpFiles() to refresh the help files and indexes; most packages include some these days.

You can set your nearest CRAN to be the default for your session (or put it in a startup file):
options(CRAN = "")
Then simply say
install.packages("foo")
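If you want to script this, one possible shape is an "install only if missing" helper. This is a sketch, not the one true way: ensure_package is a made-up name, the mirror URL is just an assumed default, and note that current R releases pick the mirror through the repos option rather than options(CRAN = ...).

```r
# Hypothetical helper: install a package only when it is not already there.
# The default mirror is an assumption; substitute your nearest CRAN mirror.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # TRUE if the package can now be loaded, FALSE otherwise
  requireNamespace(pkg, quietly = TRUE)
}

# stats4 ships with R, so this call never touches the network
ok <- ensure_package("stats4")
print(ok)
```

Remember the case sensitivity here too: ensure_package("hmisc") would go looking for a package that does not exist.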

If you’ve already downloaded the zip with the binary package for Windows, the pkgs argument can also be a character vector of zip file names, provided CRAN=NULL. The zip files are then unpacked directly.
install.packages(c("C:/Downloads/Downlods/R System/"),CRAN=NULL)

Packages can be removed in a number of ways. You can simply delete the package directory, or, from within R, call
remove.packages(c("pkg1", "pkg2"), lib = file.path("path", "to", "library"))

as in: remove.packages(c("DBI"))

Or just delete the directory. I have no idea if the help files are properly removed as well; perhaps run the refresh commands mentioned above to remove the un-needed help files.

Updating packages?
summary(packageStatus()) lets you see what is new and not.
update.packages() walks through each new one to let you upgrade it.

x <- packageStatus(); print(x[["inst"]]["Status"]) shows the status of each installed package.

Default Packages:

boot = Bootstrap functions, including some sample data
class = Classification, very handy, including k-nearest-neighbor and SOMs
cluster = Cluster analysis including plots, Clara/Diana/Agnes large data techniques
datasets = Tons of datasets for sample analyses
foreign = translators for Minitab, SPSS, S3, SAS, DBF, etc.
graphics = all the basic plots and some clever ones; lattice has more advanced ones
grDevices = Control over graphic display devices
grid = Low level graphics control, underlies lattice
KernSmooth = Kernel Smoothing algorithms (Kernel Density Estimate, etc)
lattice = Powerful visualization package, similar to the Trellis package from S-Plus; requires the grid package
MASS = Venables and Ripley’s MASS, including datasets, analyses, and examples linked to their book. Lots of good “utility” analyses here.
methods = Package to deal with R internals and programming
mgcv = Generalized Additive Models (GAMs) with GCV smoothness estimation, and GAMMs by REML/PQL
nlme = Linear and nonlinear mixed effects models
nnet = Feed-forward Neural Networks and Multinomial Log-Linear Models, handy for categorical data analysis
rpart = Recursive Partitioning and Tree building. Handy for categorical analysis.
spatial = Kriging and Point Pattern Analysis. I have no idea what this does, so worth investigating. I assume it’s a geo-spatial analysis approach
splines = Regression Spline Functions and Classes
stats = All the stats you ever wanted, from Anovas to weighted means, and lots of stuff in between.
stats4 = Statistical functions using S4 classes. Looks like wrappers around the more advanced stat calculations
survival = Survival analysis (Cox model, etc.), including penalised likelihood. Useful for decay analyses. Includes some sample data
tcltk = Tcl/Tk Interface, a GUI toolkit popular on Unix but less accessible on Windows (hence the drive towards JGR and other “more cross platformy” approaches)
tools = a mixture of random stuff, more useful for R programmers than users
utils = a mixture of random stuff, but actually handy things. Worth reviewing the list of things here for quick saves.
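To check which of these actually came with your own install, something like the sketch below should work. It relies on the Priority field that R records for each package ("base" for the core set, "recommended" for the extras bundled with most distributions); what shows up will depend on your installation.

```r
# "high" asks for both base and recommended packages
pkgs <- installed.packages(priority = "high")

# Group the package names by their recorded priority
by_priority <- split(rownames(pkgs), pkgs[, "Priority"])
print(by_priority)

# Packages R attaches automatically at startup
print(getOption("defaultPackages"))
```

Note the lower-case names in the output; library(Lattice) will fail where library(lattice) works.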

vcd, “Visualizing Categorical Data”, has been mentioned as a great package for data viz. has the JGR packages

Finally, for the ever popular clustering of binary data:

?dist (method="binary")
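As a rough illustration of what that looks like in practice, here is a toy example. The data are invented, and hclust with two clusters is just one arbitrary choice of clustering method on top of the binary distance.

```r
# Made-up binary data: three items scored on four yes/no attributes
x <- rbind(a = c(1, 1, 0, 0),
           b = c(1, 0, 0, 0),
           c = c(0, 0, 1, 1))

# Binary distance: among columns where at least one item is "on",
# the proportion in which only one of the two is "on"
d <- dist(x, method = "binary")
print(d)

# Hierarchical clustering on that distance, cut into two groups
groups <- cutree(hclust(d), k = 2)
print(groups)
```

Items a and b share an "on" attribute and end up together; c shares nothing with either and lands in its own group.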
For distance based clustering methods see

Recent Discovery:
sqldf is an R package for performing SQL select statements on R data frames, optimized for convenience.

It consists of a thin layer over the R packages RSQLite and RMySQL. (The code for accessing RSQLite has been tested but the code for accessing RMySQL has only been partly tested and only in the development version of sqldf). More information can be found from within R by installing and loading the sqldf package and then entering ?sqldf. A number of examples are at the end of this page and more examples are accessible from within R in the examples section of the ?sqldf help page.

So, for those times when you know exactly how the transform should go in SQL, but you don’t know all the R tricks to get it there… sqldf.
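For a flavor of it, here is a small sketch with an invented data frame called sales. It assumes sqldf is installed and quietly does nothing if it is not.

```r
# Invented example data
sales <- data.frame(customer = c("ann", "bob", "ann", "cat"),
                    amount   = c(10, 25, 5, 40))

# Assumes the sqldf package is installed; skipped otherwise
if (requireNamespace("sqldf", quietly = TRUE)) {
  # The data frame name is used directly as the table name in the SQL
  totals <- sqldf::sqldf(
    "select customer, sum(amount) as total
       from sales
      group by customer
      order by customer")
  print(totals)
}
```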

Another good one: sqlitedf. Basically, this replaces your in-memory data frame with a SQLite-backed version, allowing much larger data. As G. Grothendieck, the author of sqldf, pointed out in a comment, this doesn’t give you access to SQL itself, but can help you deal with larger datasets while staying in an R context and syntax.

Update: 2/6/2008: ff is a very exciting package that got its first big show at the 2007 useR! conference.
The ff package: Handling Large Data Sets in R with Memory Mapped Pages of Binary Flat Files. What’s great about it is that it appears to work without changing lots of R’s insides.

Andy Edmonds on the Web Analytics Group suggested highlighting the ODBC and SQLite connectors. Getting data in and out of databases and other tools is pretty important. Did you know you can control Excel through ODBC? And SQLite is a very small database that you can use when you just gotta do something in SQL that you can’t do easily in R (multi-dataframe joins, etc.). RODBC and RSQLite are good places to start.
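A sketch of the SQLite route for a multi-dataframe join follows. The table and column names are invented, it goes through the DBI interface that RSQLite implements, and it assumes both packages are installed (it is skipped otherwise).

```r
# Invented example data: two data frames to be joined in SQL
customers <- data.frame(id = 1:3, name = c("ann", "bob", "cat"))
orders    <- data.frame(customer_id = c(1, 1, 3), amount = c(10, 5, 40))

# Assumes DBI and RSQLite are installed; skipped otherwise
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # A throwaway in-memory database, gone once we disconnect
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "customers", customers)
  DBI::dbWriteTable(con, "orders", orders)

  joined <- DBI::dbGetQuery(con, "
    SELECT c.name, SUM(o.amount) AS total
      FROM customers c
      JOIN orders o ON o.customer_id = c.id
     GROUP BY c.name
     ORDER BY c.name")

  DBI::dbDisconnect(con)
  print(joined)
}
```

Bob has no orders, so the inner join drops him; a LEFT JOIN would keep him with a missing total.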

* * *


  1. i need any example of application about neural network.

    Where is possible to find this?

    Claudio    Apr 20, 08:26 AM    #
