The Net Takeaway: Page 13


PHP and BI? · 04/25/2007 12:27 PM, Database Analysis

I hear a lot about how open source and BI (Business Intelligence, the umbrella of data mining, reporting, analytics, etc.) are the waves of the future… but still, it’s hard to find real success stories and easy starts. I know, BI isn’t supposed to be easy, but come on, should I have to have a dedicated server just to do some simple reporting? Should I have to code it all from scratch?

For example, here is a list of the standard toss-outs whenever someone mentions open source BI:

(There are other great things like Weka for data mining, the R Project for Statistical Computing, etc. but these are not germane to the current point, so don’t jump on me for not mentioning these, nor the ETL offerings out there. Also, Bizgres by Greenplum is a custom extension to Postgres (ok, sorry, PostgreSQL) which really ramps it up for warehousing. Compare to Netezza (custom extension of Postgres on custom hardware), ExtenDB (free MPP version, but may be gone now), Datallegro (the cheaper Netezza), ParAccel (“middle layer accelerator”), and EnterpriseDB (Oracle compatible Postgres). These also don’t fit the current point, but they are cool.)

What do all these have in common? They all need Java and J2EE.

There is nothing for a basic LAMP stack. Actuate and BIRT made press when they announced support for PHP, but they still require a J2EE middle layer; they just allow PHP to either call the Java components directly via a “JavaBridge” or call the reporting engine via a services approach. All of the rest have a Java foundation.

The only ones I could find were Agata from Brazil (also available in English) and phpreports. This seems crazy. If LAMP is ready for the enterprise, if PHP is so powerful, why are there no reporting systems?

UPDATE: Thanks to the kind comments of Sherman Woods of the JasperServer project mentioned above, I was reminded of the BEE Project, which is mostly Perl but doesn’t require anything beyond LAMP.

UPDATE: Also stumbled across RLIB which is a C program, full open source, designed to be compiled on Linux. I guess it’s a halfway solution; it’s not PHP, but it’s not hard to compile. Actually using it may be difficult; it’s not a beginner tool. Also found SQLMaestro’s Generator Series which are freeware GUI tools to generate custom PHP reporting on specific tables. Clever idea, if not entirely scalable.

Note that the same could be said for ColdFusion (and to some extent, newer environments like Ruby on Rails or Python); it’s as if Java and .Net are the only viable options, and open source (esp. open source code on open source foundational tools) is considered a waste of time for this hugely important market.

Now, I haven’t done a deep, deep dive… but if I can’t find an open source PHP-based reporting solution after half an hour of searching across 2 search engines and 5 “catalogs” of open source and Linux software, then it becomes clear that there is no clear winner in this space. Sounds like an open field of opportunity, esp. for small to medium businesses. And I’m just starting with reporting; the rest of the BI stack on PHP is also full of unmet needs.

Any takers?

Most Improved Camper · 04/23/2007 05:19 PM, Trivial Marketing

Since no one wants to be seen as bad, just less good:

U.S.D.A.’s Meat Grades:
1. Prime
2. Choice
3. Select

Cog psych people can wax rhapsodic on anchoring and labeling, but the truth of the matter is that powerful lobbies force the US govt. to be, in effect, politically correct. Why? Well, note that Inspection is mandatory but Grading is voluntary and the service is requested and paid for by meat and poultry producers/processors. So, if they are going to grade themselves, might as well stack the deck.

This post stimulated by the Portfolio Magazine Guide Eat Sheet Guide to Steak which has lots of other explanations of fancy words in steak house menus. Luckily, I rarely wind up in these types of restaurants to even be confused by these words.


Online Notepads · 04/14/2007 06:09 PM, Reference Tech

I used to use for many, many years… til they recently made a change to the allowable size of the note, and truncated all my notes (they are limited to 65k, with NO WARNING if your note is too long. It just truncates. Nice.). While the fact that I’ve lost tons of things I wanted to store still irks me (you should have seen the nasty post this one started out as), I’ve decided to get over it by looking elsewhere.

What kinds of lists are we talking about? Say, “Stuff to buy” which are things that if they go on sale, I would order… or “Songs to get” which are songs I hear on online stations that I should buy or otherwise acquire at some point. Simple, right?

The problem is that there really isn’t any other simple notepad. Instead, you either have “stickies” like Stikkit which still suffer from the size issue and have all sorts of baggage like to-do lists and other junk, through to ThinkFree’s Online Office which is a whole Java office suite. Google’s impressive Google Docs is pretty nice, but it has a 512k limit, which worries me. Seems like a lot, but I’ve been burned before with arbitrary size limits on files (Yahoo! couldn’t have emailed my account and let me know it was going to truncate my files? How could you just throw away user-entrusted data? But I digress…)

There are more and more of these AJAX-laden online office suites, and it’s disappointing that so few are trying to just make a simple notepad, a text box that one can plop text into, without making me have to install an entire wiki or CMS on my server, without me having to remember to escape this character or deal with that type of formatting or whatever, and without trying to make it too overblown.

So, what are my options?

Install my own
There are wikis and CMS tools which I could put up. For generic CMS/Content systems, Drupal is really nice. WebGui by PlainBlack, Joomla and Mambo are popular, but more complex. I couldn’t find a simple “notepad” tool; I guess I could write one in PHP but it seems like someone should have already done this. Maybe it’s standard homework for Comp 101 classes, so no one shares their results.

For wikis, it’s also an overkill problem. The “supported” wikis are huge and dense, and the lesser ones tend to disappear. WikiMatrix lists many wikis for installing as well as wiki hosts (see below).

MediaWiki is the fan favorite (it’s the wiki behind Wikipedia), but it’s a monster. It does get around the “small text box” issue by allowing independent editing of section breaks, which is a nice touch. TWiki is also impressive, but way overkill (and not related to Twiki of Buck Rogers quasi-fame).

Finding a simple and supported wiki engine is actually very difficult. PMWiki has turned out to have a nice compromise between simplicity and staying power.

There’s also TiddlyWiki and other “all in one HTML file” variants, but I find them to be gimmicky. Some people love them.

Public Wiki
Either hosted open source or proprietary wikis allow anyone to have a wiki. Some use easier AJAXy or WYSIWYG attractive editors instead of wikitext. Note that some of these do not allow private pages, so your entries would be open to the world if you aren’t careful.

I am not so interested in having others collaborate, but if you are, you can look at CentralDesktop and Nexdo.

BTW, if this doesn’t make you sick of wikis, check out WikiIndex.

Organizer sites
Stikkit and others. All see notes as very short things, or “lists”. These all tend to look great, but either a) add lots of stuff I don’t need b/c they all think they are PDAs/Outlook, like Calendars, ToDo/Tasks, etc. or b) are so Ajaxy that they take forever to load. They all tend to be Web2.0 good looking, but again feel like overkill. They also tend to be stuck on the GTD philosophy, and if you aren’t using that approach, then these tools may not be lots of fun. Finally, they are very dependent on tagging, and not all offer full text search, and if you know my opinion on tagging, you can understand why I’m underwhelmed. You may hear the phrase “digital sticky note apps” as the “catch-phrase” around these…

Online Office Sites
Every one is overkill when you just need to whip in and edit a note.

Not everyone is a suite…

So, what will I use? I’ll update when I have an answer. I’ll look at the public wiki sites and potentially the office sites. It comes down to speed to implement, and speed to edit (how fast can I get it, add a line, and get out). In the future, mobile access would be nice as well… but that can wait for another time.

PS: I was pointed to 50 ways to take notes which is pretty comprehensive, some overlap with my list, but worth looking at.

PPS: Interesting cool site: Competitious is a way to make comparison matrices and track competitors and comparisons between products or offerings.

PPPS: More on notetaking, most of which I wasn’t all that impressed with, but the authors at Web Worker Daily like them: “7 Apps for Online Note-Taking“

Branching Surveys... · 04/06/2007 01:14 AM, Analysis Marketing

One of the best tricks someone showed me a few years ago was how to deal with the missing data resulting from branches (or “skips” but that’s a terrible way to phrase it) in surveys. I provide code for SPSS, but the same approach works in R and SAS.

In summary, you set all the missings to a “user saw but ignored” code, then walk the logic of the survey converting “branched over” items into a “didn’t see” code so you can treat these 2 types of missings appropriately. Didn’t see missings are fine; saw but ignored can be a potential issue in the analysis.

You start by setting all the missings to some high unused number. I’ll use 98. This will be the code for “seen but not answered by the user”. We start off pretending that every missing was a seen-but-not-answered.

RECODE q01 to Q09_7, q10 to q20, q21 to q25
    (SYSMIS = 98).

Then you walk the survey, following each branch. If the user actually did branch over an item, you recode it as another high number, such as 99. So, here, if the user answered 2 to Q01, we branch over q02 to q08: within that range, the missings (which we made 98s previously) become 99, branched over. If we have a branch but these weren’t missing (i.e., weren’t converted to 98s by the first transform), then we have a logical error, denoted here by 97. This is a problem: it means the survey was either coded wrong at presentation, or the user mucked around with it. If lots of people have this, then it’s a flawed execution; if just a few, it’s probably users mucking around with the survey.

* Respondents who answered 2 to q01 never saw q02 to q08.
DO IF (q01=2).
   RECODE q02 to q08
    (98,99 = 99)
    (ELSE  = 97).
END IF.

This can also work if you have linked items:

* The branch over q19 to q20 depends on both q18 and q13.
DO IF (q18=2 and q13<>2).
   RECODE q19 to q20
    (98,99 = 99)
    (ELSE  = 97).
END IF.

(and various iterations of Q18 and Q13 combos follow.)
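For anyone not on SPSS, here is a minimal Python sketch of the same two steps. This is my own illustration, not a port of any production code: the function names (`fill_missing`, `apply_branch`), the three made-up respondents, and the use of `None` for a system-missing value are all assumptions, and only the 97/98/99 codes come from the syntax above.

```python
# Codes as used in the post: 98 = saw but did not answer,
# 99 = branched over (did not see), 97 = logic error.
NOT_ANSWERED, NOT_SEEN, LOGIC_ERROR = 98, 99, 97


def fill_missing(row, items):
    """Step 1: treat every missing (None) as 'saw but did not answer'."""
    for item in items:
        if row[item] is None:
            row[item] = NOT_ANSWERED


def apply_branch(row, took_branch, skipped_items):
    """Step 2: if the respondent took the branch, recode the skipped items."""
    if not took_branch(row):
        return
    for item in skipped_items:
        if row[item] in (NOT_ANSWERED, NOT_SEEN):
            row[item] = NOT_SEEN      # legitimately never shown
        else:
            row[item] = LOGIC_ERROR   # data where none should exist


# Hypothetical respondents; None stands in for a system-missing value.
rows = [
    {"q01": 2, "q02": None, "q03": None},  # branched, nothing answered
    {"q01": 1, "q02": 4, "q03": None},     # no branch, ignored q03
    {"q01": 2, "q02": 5, "q03": None},     # branched, yet q02 has data
]

for row in rows:
    fill_missing(row, ["q01", "q02", "q03"])
    apply_branch(row, lambda r: r["q01"] == 2, ["q02", "q03"])
```

A linked branch is the same call with a compound condition, for example `lambda r: r["q18"] == 2 and r["q13"] != 2`. The order matters just as it does in the SPSS version: fill the missings first, then walk the branches.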

Finally, at the end, we tell SPSS to treat these codes as missings and label the various “special missings” (as SAS liked to call them):

MISSING VALUES   q01 to Q09_7, q10 to q20, q21 to q25  (97 THRU 99).
ADD VALUE LABELS q01 to Q09_7, q10 to q20, q21 to q25
     99 'Did not see'
     98 'Did not answer'
     97 'LOGIC ERROR'.

So, what did we do? We converted the various missings into actual user ignores vs. never-saws, and we also identified places where data was present where it shouldn’t have been. This also allows you to create correct denominators for percentages and tabs, since you shouldn’t use all N for items where only a portion of the sample was exposed.
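To make the denominator point concrete, here is a small Python sketch with made-up data. The helper name `percent_choosing` and the ten sample values are hypothetical. Keeping the 98s (saw but ignored) in the denominator is one reasonable choice, not the only one; drop them if you want percentages among those who actually answered.

```python
NOT_ANSWERED, NOT_SEEN, LOGIC_ERROR = 98, 99, 97


def percent_choosing(values, answer):
    """Percent giving `answer`, among respondents who were shown the item.

    Returns None when nobody was shown the item.
    """
    # Exposed = everyone except those who branched over or hit a logic error.
    exposed = [v for v in values if v not in (NOT_SEEN, LOGIC_ERROR)]
    if not exposed:
        return None
    return 100.0 * sum(1 for v in exposed if v == answer) / len(exposed)


# Hypothetical q02 column for ten respondents, five of whom branched over it.
q02 = [1, 1, 2, 98, 99, 99, 99, 99, 99, 1]

naive = 100.0 * q02.count(1) / len(q02)  # divides by all N
correct = percent_choosing(q02, 1)       # divides by the exposed only
```

With this sample, three people chose answer 1. Dividing by all ten respondents understates the share, since only five were ever shown the question.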

It’s important to double check your code vs. the branches you proposed in the survey. If you skip a branch (such as a nested branch) or don’t precisely duplicate the logic via these transforms, you could substantially change the interpretation you give the resulting counts.

Also, duh, don’t use numbers which could appear in the variables. If you ask people to divvy 100 points, for example, don’t use the codes 97-99 as I have since those could overlap with real values. Use 1000097-1000099, for example, and change my examples appropriately.
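A quick way to check for that overlap before recoding anything, sketched in Python. The function name `colliding_codes` and the sample columns are hypothetical, and `None` again stands in for a missing response.

```python
def colliding_codes(columns, codes):
    """Return the candidate missing codes that already occur as real values."""
    observed = {v for col in columns for v in col if v is not None}
    return sorted(observed & set(codes))


# Hypothetical "divvy up 100 points" items, where 97 and 98 are real answers.
points_a = [50, 98, None, 100]
points_b = [50, 2, 97, 0]

bad = colliding_codes([points_a, points_b], [97, 98, 99])
ok = colliding_codes([points_a, points_b], [1000097, 1000098, 1000099])
```

Run the check on the raw data, before the first recode. Once the 98s are filled in, you can no longer tell a real 98 from a missing one.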


Get The Glass · 03/30/2007 08:03 PM, Marketing Trivial

A friend sent me this blog link about an impressive flash game promoting milk. I didn’t have time to play much of it, but it does look amazing.

The game: Get the Glass

Justin’s Flash Blog: Rocks My World!

Links in the comments on that blog point to behind the scenes info. Well worth a peek.


Save Online Radio · 03/22/2007 10:19 PM, Trivial

Update 3/22/2007: Copyright Board Agrees to Reconsider Web Music Fees at Bloomberg…

If you haven’t been following, the RIAA (who seems to be more powerful than the CIA and FBI put together) have now managed to convince the govt. that online radio is too cheap. So, the Copyright Royalty Board of the Library of Congress has raised rates in such a way so as to destroy pretty much every internet broadcaster not owned by a major corporation. If this stays unchallenged, say goodbye to Pandora, Live365, and all the other great champions of modern alternative tunes.

This is up there with the DMCA in terms of restricting your ability to enjoy media as you choose. Take a few minutes and help stop this latest example of corporations trying to own what you see and hear. is a great place to find out how to help.

A sample story about the ruling at the Chicago Tribune: Web radio reels from royalty ruling but a Yahoo! search will show you lots more.

I enjoyed this... · 03/22/2007 06:56 PM, Trivial

16 things it takes most of us 50 years to learn

Update: Ok, so it was a ripoff of Dave Barry. It’s still funny. Now located at:


An allegory · 03/13/2007 07:05 PM, Personal Trivial

Many, many years ago, a man goes to the marketplace. He is surrounded by tents of people selling every type of ware, from camels to dates, from lamps to statues. He walks around, passing tent after tent, til he sees what he is looking for, the fortune teller, and goes in.

It’s dark inside, even though the light outside was glaring, and as he sits, he realizes that he’s stepped into a different world.

“What do you want?” wafts a voice from the shadows.

“I know the future,” the man says. “Perhaps like you, I see how things should be, how they can be. I see a world where people ride on boxes with wheels, where people have devices which sing and dance, where people have the concept of a 0. I see a world where people recognize that there is a better way, even though that requires sacrifice of the old to get there.”

“So?” the voice floats.

“But I don’t know when the world will come. I also know how today works, of barter and camels, of scarves and gold and dinars. I can do well with this, knowing that it’s not how things should be or could be… but this means giving up the dream, of forgetting how things should be, and just doing the best with how things are. Things will be better for my wife and child, for those who work with me and hire me, for none of them see this vision, and believe that I work against them when I choose to spend energy on that instead of on making the now better. They say that .” He paused, to finally take a breath.

“You talk too much, that is what they say, I think,” the voice keens.

“Yes, you are wise. So tell me, what can I do? Should I continue to fight for what should come, even at the cost of knowing that I am not helping those who need help now? Should I help those now, knowing that I put off creating the better world for all?”

“Don’t tell me. You can’t do both because…” the voice trails off.

“Because it’s a compromise, a sell out, and other phrases that have not yet been invented. It helps no one, and is not honorable. I need your advice, your sage wisdom… What should I do?”

A gentle brrr of snoring came from the shadows. The man sighed, got up, left a gold piece on the table, and trudged back into the blinding sunlight. He realized that in all his wandering around the market, he had no idea where to go to get back home.


"Forgot to mention..." · 03/02/2007 05:42 PM, Marketing

I recently received this email from a service I’ve used in the past called YouSendIt. It’s actually not a bad service for sending files through email (though Pando is prettier and can integrate with Yahoo Instant Messenger… but I digress)... but I will probably not be using it too often til they respect their customers a bit more.

I was disappointed to see this marketing email printed below. You read it, see if you feel the way I do:

Subject: Here’s a Free Upgrade for Your Account

Free Upgrade to Business Plus


Your Lite account is great, but you deserve more. That’s why we are giving you a free upgrade to our most powerful service, YouSendIt Business Plus, to experience the many benefits it provides. This is a $29.99 value and you get it absolutely free – no strings attached.

Compared to your current account, some key advantages include the ability to:

* Track file downloads * Send multiple files at once * Customize file deliveries with your logo * Password protect your uploaded files

Plus, there are no ads on your pages or your recipient’s download pages.
Hurry, this promotion ends soon.

[A link I’ve removed was here]

Sent by Responsys, it included the proper Can-Spam stuff. Note that they left off my first name, but that’s not really the issue.

When you click, you go to a page with a list of the benefits, including larger files, etc. (probably things they should have mentioned in the mail, actually) and a login.

You then get to a page with 2 options, and here’s where I get disappointed.

1) Click here for a FREE upgrade! [Free Upgrade image].

Then in small print, “Upgrade valid for 30 days. Clicking this button upgrades you for free. Your credit card will not be charged and if you choose not to subscribe after the upgrade period ends your account will revert back to Lite status.”

(BTW, 2) is a signup form for full subscription).

Now, why am I disappointed? I shouldn’t be, they were truthful in every aspect of the message and login page… but only near the end did they mention that it’s a 30 day trial.

Like the online stores which hit you with S&H charges right before the final confirmation, I consider this a bait and switch. What could have been a real loyalty creator (give it to me for a year, but ask me to share the site with others by using it, and if not used 1x a month, remove it) really became yet another marketing flim flam.

Would I have even clicked if I knew that it was a trial? No, so creatively, they will have high clicks. But because they refused to let me know what I was going to experience, all it really does is disappoint the user, turning a gift into a negative experience.

There is no shortage of competitors in this One-Click Hosters space, so the only way to stand out is by clever marketing and smooth product functioning. Pando is leading the way on smooth, so others are choosing marketing. And in this case, bad marketing.

My suggestion to YouSendIt? Leverage the very technology you’ve built your business around, email, to send open and honest messages about trial upgrades, and maybe people will try them. But hide constraints or limitations til the end, and all you do is kick yourself. And consider the viral nature of your offering: if I use this to send a file to my friend, they will discover the service and use it for themselves (ala Hotmail in the early days). So, you should really want me to have that upgrade, and let me keep it only if I use it 1-3 times a month. Now, that’s a “free upgrade for my account”, and you’ve challenged me to show how much I appreciate it by using it.

Viral, sharing, and cheap. Wow, compared to the approach involving “almost deception” and “hide the facts”, it might just work.

Update… see Jeremy Wagstaff’s post on a related YouSendIt issue

X1 starts to open up... · 03/01/2007 12:19 PM, Search

No, they aren’t going open source or anything like that, but they’ve finally let slide some features which can basically enable all that stuff they’ve turned off.

UPDATE: The below refers to 5.6.3, in later versions they’ve gone back to being disrespectful and scummy. See X1 says ‘New business model: Take away features, charge more!’.

If you haven’t been following, you can see my past articles on Desktop Search with this link. Basically, I was a huge fan of X1 Desktop Search, even paying for it, but I disliked their continuous “phone home” process to check for piracy. I recommended the still awesome Copernic Desktop Search, now in version 2, still incredible, including network drive searching and FREE.

Then, X1 released a free version… but removed network drive searching (and removable drive searching, btw). So, it was reduced but free… and in my hatred of misleading marketing, I expected them to point out that this free version is not the same as the former commercial version. They will not do this, but at least they are finally allowing us to get it back up to speed.

You have a couple of options: if you like most of the defaults, and just want network search back, you can enable it on Windows XP Professional. If you are on Home or want to re-enable everything (or disable stuff you never use to just remove it from the interface), then you have a longer process, but still doable.

This is all free and poorly documented, but once completed, it puts X1 and Copernic pretty neck and neck.

So, first off: download the X1 Client Deployment Kit which includes an MSI file for the client, and some configurators. If you unzip it to a junk directory, you’ll see a zip file inside the overall zip which has an ADM file. Unzip that and put it somewhere static (say, windows directory).

As these forum topics reveal (Network shares and Accessing network shares, among others), you can then open the Group Policy editor, open up the ADM file (which exposes X1 Admin features), and start to change what’s allowed and what’s not. Restart X1, and bingo. That’s the easy way.

The harder way is to configure the MSI to make a new installer for yourself. Forum Topic 3 gives the step by step for this. There are more features to be configured this way, and it’s more involved, but if you can’t do what you want with the group policy editor, this will work.

Oh, and changing what tabs display? That’s well hidden; it’s not in the Options or View menu like you’d expect. It turns out that they are hidden in that left hand window (click on the dots, or View | Searches Pane). Delete from the “top” and panes go away. I removed Music, for example.

So, comparison? Copernic nicely mixes attachments and email in one search results window; I have to use separate panes (or the ALL tab) to do this in X1. Copernic does not allow you to perform actions on group results such as deleting all the mails you’ve found; X1 does this with ease. X1’s index is smaller. Both have trouble with wildcards. X1 searches Tasks and Calendar; Copernic does not. X1’s display of Contacts info puts more on the screen at one time.

So, still a big Copernic Desktop Search fan, esp. in this latest version 2, and still recommend it in most every case. But if you are technically savvy and want to try a powerful option, you might consider giving X1 a shot.

Oh, and guess what: It doesn’t phone home for piracy anymore, and you can turn off all its other optional phone home stuff.

Finally, a question I get all the time: doesn’t Vista’s desktop search rock? Won’t it kill these other options? Well, to be fair, I haven’t tried it yet, but I expect it will take at least one more rev to be where it needs to be, given past experiences with MS search products and knowledge of how things work there. But that 2nd release will be impressive.

And remember, for my reviews of other tools and what I liked and didn’t, feel free to check out Desktop Search.

