Wednesday, March 5, 2014

I MUST be Michael Jordan

Greetings:

Long, long ago I was watching "Numbers" on TV.  It was a TV series about a geek helping out the FBI.  Anyway, the geek and his brother are outside playing basketball in the driveway.  The geek proposes a hypothesis: "I am Michael Jordan.  (Not the name he used, but close enough for this story.)  If I am Michael, then I can get around you and score a two-point basket."  He does it and scores.  Ergo, he must be Michael Jordan.

His brother says, "No, that just shows that I still fall for the same old head fake that you've been using since grade school."  The geek then explains to his brother that what this demonstrates is not that the test failed, but that the hypothesis was wrong.

It actually had to do with fingerprints and a hypothesis that I had never really considered.  First, that fingerprint identification is an art form, not a science.  We don't really KNOW that no two fingerprints are alike BECAUSE we have never actually tested all 7 billion persons on the planet.  And with ten fingers (on average) per person, that means comparing 7B * 7B * 10 fingerprints.  Mathematically, 7*10^9 * 7*10^9 * 10 = around 49*10^19 fingerprint comparisons to validate this theory!  Anyway, what we have done is test a sample and "ASSUME" that the hypothesis is correct.  [I really like this show.  One of the few on TV that tries to explain logic to the masses.]
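
Spelled out, here is a back-of-the-envelope version of that count (my own restatement, not anything from the show), assuming an exhaustive test would have to compare every print against every other print:

```latex
% Rough count of exhaustive fingerprint comparisons (order-of-magnitude only)
\[
  \underbrace{7\times10^{9}}_{\text{people}} \times \underbrace{10}_{\text{fingers each}}
  = 7\times10^{10}\ \text{prints},
  \qquad
  \binom{7\times10^{10}}{2} \approx 2.4\times10^{21}\ \text{pairwise comparisons}.
\]
```

Whether you use that pairwise count or the rougher 49*10^19 figure above, the conclusion is the same: nobody has run, or ever will run, anything close to an exhaustive test, so the uniqueness claim rests on a sample.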

Epiphany:  How many rules are written with the same wrong hypothesis?  And how many rulebased systems, aka BRMS, are out there running faulty logic BECAUSE we, the true geeks, don't make the BAs (business analysts) explain in depth why that hypothesis HAS to be true and why no other explanation can exist?  I can think of a couple of systems that had those problems before we came along, and every time I got that "eye roll" that said, "You're a geek.  You don't understand business.  I should not have to explain this to you."

Anyway, just something to think about on a slow weeknight.  :-)

Yaakov
Thanks to Diana Forgy for correcting my population calculations and my math earlier!!  :-)

Tuesday, February 4, 2014

Benchmarks: Who, What, Where, When and Why?

Greetings:

A BRMS (Business Rule Management System) is, after all, a rulebased system that has evolved into what we now refer to as a Decision Manager or Business Decision Manager when applied to business systems.  Over the years we have tried to establish a set of benchmarks that will allow users to test various systems for speed and efficiency on different, complex problems.  The two most famous tests are the Miss Manners test and the Waltz benchmark.  The original five OPS (Official Production System) benchmarks can be found at ftp://ftp.cs.utexas.edu/pub/ops5-benchmark-suite/ and will run on almost any platform.  All major BRMS vendors have written the code for Miss Manners, Waltz and/or WaltzDB in their particular language syntax.  In addition, Dr. Forgy and I have written the NP-hard benchmarks for several systems.

The Miss Manners OPS benchmark originated about 15 years ago with the OPS5 and CLIPS languages.  It's a relatively simple rulebase with only eight rules that does a depth-first search to find a solution, and the program comes with a data generator.  The idea is that Miss Manners has invited 16, 32, 64 or 128 guests with various hobbies to a dinner party.  She wants to seat the guests in a boy-girl-boy-girl arrangement so that each guest has someone on the left or right with a common hobby.  On the original SPARC 1+ reference machine, the Manners 128 program took 5,838.5 CPU seconds to run.  Today, due to massive improvements in hardware, it runs in about 1.5 seconds.  The data for Miss Manners should have X number of guests, each with 2 or 3 hobbies chosen from 5 possible hobbies, and the guest list must have equal numbers of males and females so they can be seated M-F-M-F, etc.
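
To give a feel for what such a rule looks like, here is a minimal sketch of a single seating rule in CLIPS syntax.  The templates and slot names (guest, seating, last-seat) are my own illustration, not the actual rules from the UT suite, and the real benchmark wraps this core pattern in a depth-first search context with backtracking:

```clips
;; Illustrative Manners-style seating rule (CLIPS) -- NOT the benchmark code.
;; The data generator would supply the guest facts and a seed seating fact.

(deftemplate guest   (slot name) (slot sex) (slot hobby))
(deftemplate seating (slot seat) (slot name) (slot sex))

(defrule seat-next-guest
   ?last <- (last-seat ?s)                            ; ordered fact: highest filled seat
   (seating (seat ?s) (name ?n1) (sex ?sex1))         ; the guest sitting in that seat
   (guest (name ?n1) (hobby ?h))                      ; one of that guest's hobbies
   (guest (name ?n2) (sex ?sex2&~?sex1) (hobby ?h))   ; opposite sex, shared hobby
   (not (seating (name ?n2)))                         ; candidate is not already seated
   =>
   (retract ?last)
   (assert (last-seat (+ ?s 1)))
   (assert (seating (seat (+ ?s 1)) (name ?n2) (sex ?sex2))))
```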

It becomes more complex with the number of hobbies and the number of guests.  The original test was written to really "stress" any rulebase, but some vendors found the trick of putting a single "not" statement in one of the rules that makes it run 15 or 20 times faster.  Without that "trick" it's a very good measure of how fast a system will run on any given platform and CPU.  Another "trick" is to re-arrange the data so that the rules will run faster, because the benchmark is data-sensitive.

The Waltz OPS benchmark is another oldie but goodie that will really stress a rulebase system, because it checks how well the rulebase does pattern matching.  Consisting of 32 rules, it analyzes the lines of a two-dimensional drawing and labels them as if they were edges in a three-dimensional object.  The Waltz benchmark also comes with a data generator and is much harder to cheat on than the Miss Manners benchmark.  Waltz comes with a C program for generating data for any number of regions: 12, 25, 37, 50 or even 200.  For convenience, UT maintains the compiled C object files at /ops5c/lib/libops5c.a, along with a math library that is used with the benchmarks.  The SPARC 1+ time for Waltz-50 was 3,831.8 CPU seconds.  Today's times are between 0.2 and 1.9 seconds at worst.
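
To illustrate the kind of pattern matching Waltz exercises, here is a minimal sketch in CLIPS syntax.  The edge template and its slots (p1, p2, label) are my own illustration, not any of the 32 benchmark rules: each physical line in the drawing is seen from both of its end junctions, and once a label is fixed at one end it has to be propagated to the view from the other end.

```clips
;; Illustrative constraint-propagation rule (CLIPS) -- NOT the benchmark code.

(deftemplate edge (slot p1) (slot p2) (slot label (default none)))

(defrule propagate-edge-label
   (edge (p1 ?a) (p2 ?b) (label ?l&~none))          ; labelled at one end
   ?other <- (edge (p1 ?b) (p2 ?a) (label none))    ; still unlabelled at the other
   =>
   (modify ?other (label ?l)))
```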

The WaltzDB OPS benchmark, like Waltz, labels the lines in a 2D drawing in order to configure a 3D object.  The difference is that WaltzDB can handle drawings with junctions of four or six lines, while Waltz handles junctions of only two or three.  WaltzDB has only 35 rules, but its data sets have many, many more junctions.  WaltzDB also has its own data generator, waltzydb.c, and it also needs to be compiled.  I have run tests using 4, 8, 12, 16 and even 200 regions.  The WaltzDB 200 is the most difficult of all.  Sixteen regions on the old SPARC 1+ took 8,033.3 CPU seconds, but today it takes about 0.5 seconds.  The WaltzDB 200 takes only about 10-15 seconds on most systems, but can take 2 seconds or less when running Rete-NT, the latest incarnation of the Rete Algorithm.

The A.R.P. OPS benchmark is an Aeronautical Route Planner that plots a course across a given territory from P1 to P2 for an airplane or cruise missile.  There is a dataset generator that asks about 40 questions and generates a file called rav-sceneXxYxZ.dat, where X, Y and Z are the 3D coordinates from the input data.  There is a sample list of the questions in the README file at UT.  This benchmark is unique in that there are two files that have to be loaded, "filename".dat and "arp-rp-makes".  The best time for the A.R.P. benchmark on a SPARC 1+ with a 10x20x30 data set was 1,220.2 CPU seconds.

The Weaver OPS benchmark is a combination of several expert systems that communicate through a common blackboard (or maybe a whiteboard today).  The "practical" application for this system is VLSI (Very Large Scale Integration) chip design, something that a chip manufacturer such as AMD or Intel faces every day.  Far more detail is provided in the README file.  The best time for the Weaver benchmark on a SPARC 1+ was 1,053.7 CPU seconds.
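
As a rough illustration of the blackboard idea (my own minimal sketch in CLIPS, not Weaver's actual rules), the "experts" never call one another directly; each one just watches the shared working memory and reacts to facts that other experts post.  The task and result templates below are purely hypothetical.

```clips
;; Minimal blackboard sketch (CLIPS) -- illustrative only, not Weaver code.
;; Two independent "experts" cooperate purely through shared facts.

(deftemplate task   (slot id) (slot status (default pending)))
(deftemplate result (slot id) (slot value))

;; Expert 1: claims pending tasks from the blackboard and posts a result.
(defrule routing-expert
   ?t <- (task (id ?id) (status pending))
   =>
   (modify ?t (status routed))
   (assert (result (id ?id) (value routed))))

;; Expert 2: reacts to results posted by any other expert.
(defrule placement-expert
   (result (id ?id) (value routed))
   =>
   (printout t "Placement expert picks up task " ?id crlf))
```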

The next two benchmarks come from a series known as "NP-complete" benchmarks, where NP stands for Non-deterministic Polynomial time.  We started using these this year (1Q2014) because we found Manners and/or Waltz to be either 1) easy to cheat on or 2) prone to firing only one or two rules over and over.  Manners is guilty on both counts.  So, this year we have included both the Clique Problem and the Vertex Cover Problem for starters.  Later we can expand this to other NP-complete problems.

Either of these problems can be converted to Java or C syntax, but for starters I plan on implementing them in Drools, Jess, CLIPS, Smarts, ODM and Blaze Advisor.  That should be enough for comparisons this year.  Dr. Forgy has been kind enough to have already provided the code for these two NP-complete problems in OPS syntax, which we should be able to convert to Java, C/C++ or C# (or BASIC, for that matter).
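
For a feel of what the core rules look like, here is a minimal sketch in CLIPS syntax of the "verification" half of each problem.  The templates (edge, candidate) and the rules are my own illustration, not Dr. Forgy's OPS code: given a graph asserted as edge facts and a candidate node set, one rule rejects a non-clique and the other rejects a non-cover.

```clips
;; Illustrative verification rules for the two NP-complete problems
;; (CLIPS) -- my own sketch, not Dr. Forgy's OPS code.

(deftemplate edge      (slot from) (slot to))    ; undirected edge, stored once
(deftemplate candidate (slot node))              ; node in the candidate set

;; Clique check: two candidate nodes with no edge between them => not a clique.
(defrule not-a-clique
   (candidate (node ?a))
   (candidate (node ?b&~?a))
   (not (edge (from ?a) (to ?b)))
   (not (edge (from ?b) (to ?a)))
   =>
   (assert (verdict not-a-clique)))

;; Vertex-cover check: an edge with neither endpoint in the set => not a cover.
(defrule not-a-cover
   (edge (from ?a) (to ?b))
   (not (candidate (node ?a)))
   (not (candidate (node ?b)))
   =>
   (assert (verdict not-a-cover)))
```

Of course, verification is the easy (polynomial-time) half; the NP-complete part is finding a candidate set of a given size in the first place, and that search is what actually stresses the engine.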

That pretty much covers the five W's of good journalistic articles.  The HOW part (the +H in 5W+H) is not that difficult: our plan is to present the NP-complete benchmarks, along with the first three UT benchmarks, at our talk at Decision Camp 2014, to be held in San Jose this November.  I hope to see many of you there, since registration is, again, free thanks to Sparkling Logic and eBay.


Shalom
Yaakov

Friday, January 17, 2014

More on New Macs

Greetings, Programs:

Still playing with my new MacBookPro - quite a toy as well as a super development tool.  Things to really like about the "new and improved" MacBookPro are:
  • EMail: Seems to be improved and has the VIP feature
  • Speed: both CPU and SSHD.
  • Downloading movies is 12-15 times faster (15 minutes - not 3 hours for a 1-hour movie).
  • Hi-Res Screen (compared to older MacBookPro or other laptops)
  • Lighter Weight (and smaller / thinner)
  • 16GB of RAM vs 3GB or 4GB 
  • HDMI port standard
  • 2 - USB 3 (or USB 2) ports
  • Quad i7 Processor
  • NVIDIA GeForce GT 750M with 2GB of GDDR5 memory and automatic graphics switching
  • Good integration with Microsoft Office 
  • Full-size, back-lit keyboard
  • Much longer battery life (8 hours of movie time and about 30 days - yes, days - on standby)
What do I not like?
  • Smaller screen (15" max size versus 17" in older models)
  • No FireWire 400/800 ports for existing HDs (extra cost for an adapter to the Thunderbolt ports)
  • The adapter for my older 30" Cinema Color Monitor takes up one USB and one Thunderbolt port
  • Microphones on side rather than front
  • No built-in DVD ($70 External)
  • No standard Blu-ray drive, etc.
  • Max Resolution dropped from 3600x2500 down to 2560x1600
  • Not very many options on iMovie downloads
  • Games that come with any Mac are limited to Chess
This is not a complete list, but it should give you some idea about the new MacBookPro and some things to think about before buying one.  How can you go wrong with a new Mac?  OK, they cost more, but they come with all the goodies installed BEFORE you open the box.  For example:
  • Mail is almost ready to go, and the Mac guides you through setting up the EMail accounts.
  • All of the approved Mac software works as advertised.
  • iLife costs about $90 now and comes with great apps as standard
  • iWeb page designer is an excellent design tool 
  • Calendar and your iPhone are tightly integrated
  • iPhoto and your iPhone are tightly integrated, and iPhoto imports directly from cameras
  • Notes
  • Xcode is a great C/C++ or Java editor
  • I have a quickly available terminal screen with great color schemes and easily adjustable sizes
  • Super video and audio tools available at about 1/4 the cost of other professional tools
All in all, I suppose that the good parts outweigh the bad parts.  I think that I really like the light weight in my backpack and the regular hi-res screen on the laptop more than I dislike the missing FireWire 400/800 ports or the brain-damaged iMovie software.  If you have an existing MacBookPro with the older Core 2 Duo processor, it is time to trade up.  If you already have an older MacBookPro with the Retina display, keep it.

Bottom line:  Would I suggest a Mac rather than a high-end Windows machine?  Absolutely!  You can always use Windows on your MacBookPro if push comes to shove.  But why would you want to do that?  Just get the Mac version of whatever software you are using and stay away from the plethora of problems and upgrades with Windows.

Shalom
James