Wednesday, November 18, 2009

Rulebase Benchmarks 2009 Part 3

Greetings:

Once again we return to the world of Rulebase Benchmarks. Benchmarks for rulebased systems seem to have three camps: Those who love them and usually have good numbers to report. Those whose benchmarks are not remarkable one way or the other and who are not worried about them, yet. And those whose benchmarks are not very good and who keep telling their users and/or customers, "Yes. That's OK for academics. But those benchmarks don't represent the 'real world' in which we have to live and work every day."

At October Rules Fest 2009 I asked the T3 (Thursday Think Tank) if they thought benchmarks were important. Dr. Forgy (who has the fastest rulebase engine on the planet Earth - called TECH) and I (having spent the past decade exploring benchmarks, their cause and effect) were ardently in favor of such things. The FICO (Blaze Advisor) folks were kind of OK with the idea, especially since they use Dr. Forgy's engine in their rulebase. The rest of the group, whose performance was anywhere from just barely OK to abysmal, were not really against using benchmarks, but it just wasn't something that they wanted to discuss in public.

So, here's my proposal: If anyone can produce a "real world" benchmark that can be run on any rulebase engine on any platform, PLEASE DO SO!! Otherwise, those of us who are in this for the long haul will continue to examine a rulebase from many different viewpoints, but we shall NOT throw out tried and true tests that continually disclose fatal faults in fancy products that will not perform with large data sets combined with large rule sets. Such applications are found at large banks, large insurance companies and Homeland Security, along with "real world" AI problems in rulebased forecasting, petro-chemical processing, electrical power grid production, shift scheduling for large plants and/or hospitals, and anywhere else that has complex cross products of many objects against many patterns.

Speaking of which, I guess it's about time to publish the 2009 Benchmarks. Our goal is to finish up sometime in December so that at least we can say that we finished this year with a completed product. :-)

SDG
jco

5 comments:

woolfel said...

As a user, I favor full and open disclosure. For me, a truthful and open message is much better than a marketing hype message. The minute I see a marketing message filled with fluff, I can't take it seriously.

Paul Vincent said...

Hi James - it's an interesting quest, but I fear one of limited interest. Why? Apart from the reasons you mention,
1. performance is often "good enough" in the "decision services" that most commercial tools are deployed as.
2. most commercial BREs are deployed executing simple non-inferencing if-then decision rules - for a benchmark here you should probably team up with Prof Jan Vanthienen and cover decision tables...
3. for complex problems exploiting rule engines ("inference programming"?), there are just too few use cases today to warrant interest in benchmarks.

Cheers

PS: TIBCO include a copy of Miss Manners in the rule engine download, but the example doesn't (yet) exploit MultiThreadedRete.

Geoffrey De Smet said...

What happened to the Manners 2009 benchmark? Are there any contestants other than my Drools Solver implementation?

James Owen said...

At the October Rules Fest this year (ORF 2009) there were only two votes to continue benchmarks: mine and Dr. Forgy's. The reason for this is that the customers who NEED performance were not at ORF 2009 and the few vendors who WERE there (excluding FICO, who use the Rete III Algorithm from Dr. Forgy) are not interested BECAUSE they cannot compete with either Rete 2 / III or TECH.

"Good enough" is probably OK for smaller systems (fewer than 1,000 rules and fewer than 1,000 objects, with few cross-match products and very little chaining between rules), but for the large systems needed for Homeland Security, large networks, huge insurance companies needing complex underwriting, etc., performance is woefully lacking in what they are doing.

For example, HLS uses a Prolog system with fewer than 1,000 rules for special occasions only. Most of their rules are run in COBOL on a mainframe and they are EXTREMELY simple rules. I can't go into detail, but take my word for it that they are not doing much with rulebased systems because they don't know about them or how to use them.

SDG
jco
