Wednesday, April 25, 2012

Benchmarks Summer/Fall 2012


OK, we're off and running again on benchmarks.  CLIPS, Jess, OPSJ and Sparkling Logic are in the running, since they have agreed to provide their programs to run on my machine: a Dell i7 with 12GB of RAM, a 7200rpm 1TB HD and a wonderful graphics board.  Oh, and a 32" high-resolution monitor that can display anything I throw at it.

Anyway, now the technical part.  The first hurdle will be WaltzDB-16, which builds 16 3-D boxes from a series of random lines generated by another program.  After that will come the more cumbersome WaltzDB-200, WaltzDB-400 and WaltzDB-600.  All of these will have to be written and run for each specific rulebase.  The CLIPS, Jess, OPSJ and Sparkling Logic versions have been done for DB-16 and DB-200, and the OPSJ results were published in InfoWorld as the introduction to Rete-NT some time ago.  A Drools DB-16 was written some time ago, but I haven't checked it for accuracy yet since it seems to run inordinately fast and doesn't fire the same number of rules as the other engines.  I've been told that WaltzDB-16 was written for ILOG JRules, but we don't have a copy here.

However, I'm going to re-write the benchmarks so that I don't use the number of rules fired as the acid test that a run was correct.  Rather, the test will check that the engine actually built the 16 boxes correctly for WaltzDB-16, 200 boxes for WaltzDB-200, and so on.  The 36 or 37 rules are absolutely non-trivial and should prove to be a good starting point for testing any rulebase.  I'll have to get FICO to provide me with Blaze Advisor, as well as get IBM to provide me with a valid copy of JRules rather than that silly sandbox that they now provide.
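The result-based check could look something like the sketch below.  This is only an illustration of the idea, not the actual benchmark code: the `box_id`/`label` record layout is my own assumption, and a real WaltzDB check would also validate the junction labeling itself, not just count edges.  (The 12-edges-per-box figure is simply the edge count of a cube.)

```python
# Hypothetical sketch: judge a benchmark run by the artifacts it produced,
# not by how many rules happened to fire.  Field names are illustrative.

EDGES_PER_BOX = 12  # a 3-D box (cube) has 12 edges

def verify_boxes(labeled_edges, expected_boxes):
    """Return True if the run produced exactly `expected_boxes` complete,
    fully labeled boxes, regardless of how many rule firings it took."""
    boxes = {}
    for edge in labeled_edges:
        boxes.setdefault(edge["box_id"], []).append(edge)
    complete = [
        box_id for box_id, edges in boxes.items()
        if len(edges) == EDGES_PER_BOX
        and all(e["label"] is not None for e in edges)
    ]
    return len(complete) == expected_boxes

# Example: a WaltzDB-16 run should yield 16 boxes of 12 labeled edges each.
edges = [{"box_id": b, "label": "+"} for b in range(16) for _ in range(12)]
print(verify_boxes(edges, 16))       # complete run
print(verify_boxes(edges[:-1], 16))  # one edge missing -> run fails
```

The point of a check like this is that it is engine-neutral: Drools firing fewer activations than CLIPS stops mattering, as long as both leave the same correctly built boxes behind.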

Also, I would like to run Open Rules, Drools and some others (they do provide their engines), provided someone supplies the code for WaltzDB-16, -200, -400 and -600.  I can check the code, but I refuse to write it for every engine in the world.  These tests will more than stress the ability of the engines to run large, complex problems.

After that, Mark Proctor of Drools (Red Hat) suggested that we have 1K, 5K and 10K rule sets with a sufficiently large data set to test what he and others call "real world" problems.  I have no issue with this approach, but I do not have such a test suite and, to my knowledge, neither does anyone else.  Or, rather, no one has offered to provide one for testing purposes.  I'm sure there are many of them out there, and I have seen them, but they usually exist at some customer's site and are not for publication.  I do have one that maybe I can use later, but it will be 2013 before I can get around to producing a general-purpose rulebase with all of the proprietary information removed from both the rules and the data.

So, let me know if your company would like to participate and, if so, send me a working version of the engine and a 12-month license, and we'll get cranking.  Also, I will need a technical contact person at the company for help should something go bottoms-up.  :-)
