Sunday, November 16, 2014

Benchmarks 2015

Greetings:

Yes, we will be doing benchmarks for 2015, with a presentation (maybe) at Decision Camp 2015 and/or the Business Rules and Decisions Forum 2015.  In either case, I would like to have a panel discussion of some kind after the one-hour (or less) presentation, much like we did back at BRF 2006.  That year it was a two-hour afternoon session with Dr. Forgy (PST OPSJ), Dr. Friedman-Hill (Sandia Labs Jess), Mark Proctor (Red Hat Rules and Drools), Daniel Selman (ILOG JRules, now IBM ODM) and Pedram Abrari (Corticon) all on the panel.  My part should be far briefer, since all I want to do is introduce the concept of BRMS/rulebase benchmarks, show what we have done over the years and where we are today, and then moderate the panel discussion.

Maybe this year we could expand to 10 representatives: the above plus Gary Riley (NASA CLIPS), Dr. Jacob Feldman (Open Rules), D/M/M X (Visual Rules), Carlos Serrano-Morales (Sparkling Logic SMARTS) and D/M/M X (FICO Blaze Advisor).  The more participants, the more confusion, but I will do my best to moderate without being heavy-handed, not letting anyone "hog" the microphone nor interrupt others before they are finished.

I will probably send out invitations to all of them and ask that each be represented.  Likely, only those blatantly opposed to benchmarks of any kind, or the champions of past years, will show up, so it should be a lively discussion.

Another thing: in the past I allowed only the "interpreted" version of a rulebase to run.  This year each vendor will be allowed to use the compiled Java version as well as the "interpreted" version.  I will try to run both versions where available and report the results in two separate tables, one for compiled and one for interpreted.  The interpreted table will probably be a bit smaller, since most modern rulebases no longer offer an interpreted mode, only compiled Java or compiled C/C++ versions.

Me?  I really like the idea of "standardized" benchmarks where everyone can compete with whatever rules they like, so long as they solve the problem.  I am not a fan of micro-benchmarks, since they do not measure the overall performance of the engine.  The old OPS-type benchmarks were, seemingly, deliberately designed wrong just to tax the engine by loading and unloading the Agenda table with as much junk as possible.  I will probably still run the Waltz-50, WaltzDB-16 and WaltzDB-200 benchmarks and add the Cliques and Vertex Cover benchmarks.  Both of the latter are NP-hard problems and are, for all practical purposes, impossible to cheat.  And the vendors will have a free hand with the last two.
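For anyone who has not run into these two problems before, here is a rough sketch in plain Java (my own illustration only, not code from any benchmark suite or vendor, and the class and graph names are made up) of what the Vertex Cover question asks: given a graph, find a small set of vertices that touches every edge.  Checking a candidate answer is trivial; finding the smallest one is the NP-hard part, which is exactly why it is so hard to cheat.

// Hypothetical illustration only -- not part of any benchmark suite.
// A candidate "cover" is valid if every edge has at least one endpoint in the set.
import java.util.*;

public class VertexCoverCheck {

    static boolean isCover(int[][] edges, Set<Integer> cover) {
        for (int[] e : edges) {
            if (!cover.contains(e[0]) && !cover.contains(e[1])) {
                return false;   // found an edge that the set misses
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A tiny example graph: a square (0-1-2-3) with one diagonal (0-2).
        int[][] edges = { {0, 1}, {1, 2}, {2, 3}, {3, 0}, {0, 2} };
        Set<Integer> candidate = new HashSet<>(Arrays.asList(0, 2));
        System.out.println("Covers every edge? " + isCover(edges, candidate));  // prints true
    }
}

Verifying an answer is cheap; searching for the minimum cover blows up combinatorially as the graph grows, which is what makes this kind of problem a useful stress test for an inference engine rather than a test of clever rule-writing tricks.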

The first three are standard and will be dependent largely on the Conflict Resolution Process (CRP), as so adroitly pointed out by Gary Riley at http://rileyonlife.blogspot.com/2014/05/river-benchmark.html.  As Gary pointed out there, the speed of the benchmark is highly dependent on the CRP, but it also depends on the number and type of CPUs used, the hard drive type and speed, the amount of RAM available, and the command line used in either Java or CLIPS to give the process the "correct" amount of RAM and swap space.  The original code for all of the OPS benchmarks is at http://www.cs.utexas.edu/ftp/ops5-benchmark-suite/ if you want to start at ground zero.
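On that last point, a Java engine only gets the heap you hand it on the command line (something along the lines of java -Xms4g -Xmx4g ... to pin the heap at 4 GB).  The little snippet below is just my own sanity check, not part of any benchmark: it prints what the JVM actually received, so that runs on different machines can be compared honestly.

// My own sanity check (hypothetical class name, not part of any benchmark suite):
// report the heap limits the JVM was actually started with, e.g. via -Xms/-Xmx.
public class HeapReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;
        System.out.println("Max heap   (MB): " + rt.maxMemory() / mb);
        System.out.println("Total heap (MB): " + rt.totalMemory() / mb);
        System.out.println("Free heap  (MB): " + rt.freeMemory() / mb);
        System.out.println("Processors     : " + rt.availableProcessors());
    }
}

Recording numbers like these alongside each run, for the Java engines and the C/C++ engines alike, is the only way the published tables will mean anything across different machines.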

Oh, one other thing: later (probably in June or July) I will publish what I have so far on the benchmarks and try to get some suggestions on improving them, discarding them, or adding other benchmarks that are more "real world" (as some have suggested).  If you want the full nitty-gritty, please ask in the comments section and I will send you a copy.  I do not plan on publishing them until the conference.

Shalom,
James
