The Net Takeaway: Infobright: The MySQL DataWarehouse


Infobright: The MySQL DataWarehouse · 09/30/2008 12:04 PM, Analysis

Most of the data warehouses out there that aren’t built around Oracle or one of the other biggies tend to start with PostgreSQL. There are a variety of reasons, from the more complete SQL standard support in its query engine to its early handling of fully ACID transactions. Most of the “DW appliances” have been built around heavily modified Postgres, including Netezza and DATAllegro, among many others. Some have called Postgres the “open source Oracle”, and in fact, EnterpriseDB has modified Postgres to run Oracle applications directly.

But someone mentioned that I should look at Infobright, an open source data warehouse built around MySQL (now owned by Sun). MySQL is by far the most popular database system around these days, offered by nearly every hosting service and showing up in all sorts of places. Many of the differences between PostgreSQL and MySQL have been ironed out with improvements in MySQL storage engines and query handling.

Infobright’s Community Edition (ICE) is chock full of code and documentation on how their system works. Like the rest, it includes the usual suspects: columnar data store, compression, gridding, etc. Note that it runs only on Linux (currently a slew of 64-bit Linuxes, with 32-bit Ubuntu coming soon) and requires 16GB or more of RAM. (UPDATE: From the comments below, 16GB is recommended, but it can run with less.)
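
For anyone wondering why “columnar” and “compression” always show up together in these products, here is a toy sketch in Python. It is not Infobright’s actual storage format or algorithm; the table, the column names, and the helper functions are all made up for illustration, and run-length encoding is simply one easy-to-read compression scheme. The point is only that values within a single column tend to repeat, so they squeeze down well, and that an aggregate query only has to touch the columns it needs.

```python
from itertools import groupby

# Hypothetical toy fact table, stored row by row: (sale_date, country, amount).
rows = [
    ("2008-09-29", "US", 10),
    ("2008-09-29", "US", 12),
    ("2008-09-29", "CA", 7),
    ("2008-09-30", "US", 11),
    ("2008-09-30", "US", 9),
    ("2008-09-30", "CA", 4),
]


def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def rle_encode(values):
    """Run-length encode a list as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]


dates, countries, amounts = to_columns(rows)

# Six date values collapse into two (value, run_length) pairs.
encoded_dates = rle_encode(dates)

# Something like SELECT SUM(amount) only needs the amounts column.
total = sum(amounts)
```

A real columnar engine does far more than this (and picks its compression per column), but the shape of the win is the same: sorted or low-cardinality columns shrink dramatically, and scans skip the columns a query never mentions.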

If you have the space (or a spare Amazon virtual server), it might be worth checking out, especially if you grew up on MySQL and can make it dance and sing like Jeremy Zawodny. OK, you don’t need to be that much of an expert. But chances are, if you’ve done any web development over the past few years, you’ve become pretty good at MySQL. Here’s a way to leverage that knowledge into a warehouse.

For more on Open Source BI, see my posts (in reverse chronological order) on:

LucidDB… Open Source DB for Data Warehousing and BI

PHP and BI?

Open Source BI?

* * *


  1. Hi Michael,

    Thanks for the posting about ICE! Just as a clarification – ICE will run with less than 16 GB of RAM. That is just a recommended setting for high-performance servers. The configuration guide available on our Wiki outlines configuration file settings for higher and lower amounts of memory. We’re also bringing out 32-bit binaries shortly, so we expect people with 2-4GB of RAM to be fairly common.

    Best regards,
    Mark Windrim
    Community Manager

    Mark Windrim    Sep 30, 03:44 PM    #

  2. Thanks, Mark… Would be great to get a Windows binary out there to play with as well!

    Michael Wexler    Sep 30, 03:47 PM    #

  3. Hi Michael,

    There is a 32 bit binary available for Windows now. There is also a 64 bit VM available (Linux) should you happen to be running Windows64.


    Mark Windrim    Mar 24, 03:17 PM    #
