Saturday, March 14, 2009

Corticon and Tibco

Greetings:

There is a blog called "The Orange Mile" [takeoff on "The Green Mile" ?] that published a "review" of Corticon as used with Tibco. They were even less impressed than I was when I reviewed Corticon several years ago - about 2006 or so. The link to the Orange Mile article is at http://orangemile.blogspot.com/2008/06/corticon-rule-engine-review.html so you can read it for yourself. I could not find an author's name or homebase link so it must be anonymous or something.  The link to my article is at http://www.infoworld.com/article/06/07/07/28TCcorticon_1.html if you would like to read the original brouhaha.

Corticon calls what they do DETI, or Design Time Inferencing. The math that they claim to be part of the optimization process is doubtful since they don't show you what they do. ("Pay no attention to the man behind the curtain." - Wizard of Oz) That being said, what they do is basically static and, to my way of thinking, not as flexible as a true rulebased system.

On the other hand, you must consider that ILOG, FICO, and others now have sequential rules that are nothing more than what Corticon is doing. Even the "Decision Tables" by FICO, ILOG and Drools are extremely similar to Corticon in that each row is a rule that is static as well.  True, the row (rule) is processed by the rule engine BUT it's still pretty much a static process.  FICO has just recently introduced a "gap analysis" tool that Corticon has had for years.
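To make "each row is a rule that is static" concrete, here is a minimal sketch of a decision table processed sequentially, top to bottom, with the first matching row winning. This is only an illustration and is not how Corticon, ILOG, FICO or Drools actually implement their tables; the column names, actions and the `evaluate` function are all made up for the example.

```python
# Each row is one static rule: a dict of required condition values plus an action.
# Column names and actions are hypothetical.
TABLE = [
    ({"good_credit": True, "has_income": True}, "approve"),
    ({"good_credit": True, "has_income": False}, "refer"),
    ({"good_credit": False}, "decline"),
]


def evaluate(table, facts):
    """Return the action of the first row whose conditions all match the facts."""
    for conditions, action in table:
        if all(facts.get(name) == wanted for name, wanted in conditions.items()):
            return action
    return None  # no row matched: a gap in the table


print(evaluate(TABLE, {"good_credit": True, "has_income": False}))
```

Nothing here reacts to new facts or re-orders the rows at run time; the order of evaluation is fixed when the table is written, which is the sense in which the process is static.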

Visual Rules and VisiRules are code generators, Visual Rules from a spreadsheet and VisiRules from a model driven process. They, too, are static processes.  Visual Rules generates plain-jane Java code while VisiRules generates a high-level processor for Prolog. 

Being a purist, I prefer a "real" inference engine that can deal with anomalies and incomplete rules and does have some kind of Conflict Resolution Strategy - as has been mentioned and discussed in other blogs at this link.  This is the BEST way to handle complex logic, incomplete rules and anything that might require thinking.

On the other hand, if ALL that you are doing is processing straight "out of the procedure book" business logic, why would you NEED an inference engine?  The answer is obvious:  You don't.  You may as well use a spreadsheet and get it over with as easily as possible.  And, so long as you don't have over a few hundred rules in each set, it should work quite well.

All of that being said, why be critical of a company that is doing the same thing to the rulebase industry as the others except they don't have the fall-back position of a real inference engine should they need it? They (ILOG, FICO et al) gave their "stamp of approval" to the bastardization of the rulebase industry when they started down the Decision Table, Decision Tree, Compiled Sequential route a long time ago. Now that somebody is giving them the "come-uppance" that they deserve, they begin to whine like a mule. It makes you want to gag at the gall and hypocrisy of the "Big Four" vendors.

SDG
jco

8 comments:

James Owen said...

Reference Comment from David Strauss:

David, David, David. You really know how to push emotional buttons, don't you? But, being a marketeer, you would. But, since all of this is in the comments section maybe it won't be considered a "flame war" like some of our other contentious emails... :-)

Originally I really was going to just delete the comment and move on. But, I lived through all of those slanderous remarks from both yourself and Dr. Mark Allen when I originally wrote the review for InfoWorld. You (meaning "you" as in "Corticon") tried to have me removed from the editorial staff of InfoWorld with heavily-breathed threats of possible lawsuits for slander and libel. Most unbecoming since the article was most flattering to Corticon.

But, my Senior Editor stood firm and the article only printed a slight amendment that "This review has been changed to qualify statements in the original regarding Corticon 4’s performance and scalability and its suitability for complex applications. The scoring remains unchanged." The complete article is at http://www.infoworld.com/article/06/07/07/28TCcorticon_1.html for you to review should you so wish.

It is apparent that you, David Strauss, did not read the original article on the "Orange Mile". If you had, you would realize that I am not the only one who believes that Corticon is not suited for developers nor for the serious rulebase problem that might confront them. Now, just a couple of other comments.

First, Dr. Allen (whose degree is in applied physics from Cornell, CEO and founder of Corticon) served a brief stint with a clinical health care system and decided that all of those supposedly brilliant computer science folks at CMU (most of whom were not computer scientists since there was no such thing in those days) were all wrong in trying to use an inference engine for simple, everyday problems. They were just Nobel prize winners and the folks who invented all this stuff. He even went so far as to write a paper called "Rete is Wrong" and publish it on the internet. But, the paper received such scathing criticism that he pulled it off the internet shortly thereafter. (I still have a copy stored away somewhere should you want it.)

The thing is, while he was WAY off base in the article, he was right about inference engines. An inference engine should NOT be used for simple business logic - a spreadsheet would work quite as well and slightly better than just a pencil and paper. And a decision table (as used by ILOG, FICO, Drools et al) is quite sufficient for most simplistic business decisions.

That does NOT mean that you can solve complex problems with a spreadsheet. Corticon's solution to both Miss Manners and Waltz50 was an unbelievably complex spreadsheet that covered pages and pages and pages. (As you recall, you had to do this to prove to the Senior Editor at InfoWorld that you "could" do it.) It was, in a word, horrible!!

As for the "advanced math" behind the spreadsheet that comprises the DETI "algorithm" - no one outside of Corticon (to my knowledge) has ever seen it. On the other hand, the Rete Algorithm is widely published. You (as Corticon) offered to show it to me for the article to prove that the math was "infallible" (your word, not mine) but then retracted the offer when I accepted the challenge. (BTW, I really did major in math and quantitative analysis as well as Electrical Engineering, etc.)

Now, thankfully, all of this folderol is hidden away in the comments of a little known blog on a huge WWW. Maybe nobody will read it but I had no other way to contact you since you left Corticon. Next time, since there are many Davids in this WWW, please sign your full name, etc. Leave an email address if possible...

Now, subject closed. EOL

SDG
jco

David Kim said...
This comment has been removed by a blog administrator.
James Owen said...
This comment has been removed by the author.
James Owen said...

Greetings:

Response to snshor: (Leave a "real" name next time.)

Experience with business and IT has shown that you can put lots and lots of rules into a decision table. The problem is that if you put in a rule of "IF a AND b THEN c", there are corollary rules of

IF a AND b THEN c
IF !a AND b THEN ?
IF a AND !b THEN ?
IF !a AND !b THEN ?

Corticon, as well as FICO Advisor and others, helps you find these overlooked rules so that you can answer them and either insert them or ignore them. For example, if you had 6 variables that evaluated only to a binary true or false, then you would have 64 possible rules. Add one more variable and you have 128 possible rules.
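The arithmetic is simply 2^n combinations for n true/false variables, so the gap check itself is easy to sketch. The following is a rough illustration only, not the gap analysis that Corticon or FICO actually ship; `find_gaps` and the rule format (a dict of required values, with omitted variables meaning "don't care") are assumptions made for the example.

```python
from itertools import product


def find_gaps(variables, rules):
    """Return every True/False combination of `variables` that no rule covers.

    `rules` is a list of dicts mapping a variable name to the value it requires;
    a variable left out of a rule is treated as "don't care".
    """
    gaps = []
    for values in product([True, False], repeat=len(variables)):
        case = dict(zip(variables, values))
        covered = any(
            all(case[name] == wanted for name, wanted in rule.items())
            for rule in rules
        )
        if not covered:
            gaps.append(case)
    return gaps


# Only "IF a AND b THEN c" has been written so far.
rules = [{"a": True, "b": True}]
for gap in find_gaps(["a", "b"], rules):
    print(gap)
```

With only the first rule written, this reports the three combinations marked "?" above. Note that the check enumerates every combination, so its cost doubles with each added variable.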

Now, as to "size" of the decision table: I have noticed that as the table grows beyond about 512 rules (9 boolean variables) performance begins to become a problem. At the 1024 level it really is a problem - normally. Even though marketeers will tell you that they have seen 10K rules in a single Decision Table with no problems, I find that highly doubtful. That would be the equivalent of 10K rules in a single rule set - not good rule architecture.

At any level of rules in decision tables, or even in normal OPS syntax, you will find that individual rule construction is quite important. This is discussed in great detail in Giarratano and Riley's book in section 11.5, about pp 486 - 486 in the 3rd edition and section 9.12 in the 4th edition, pp. 539 - 546. BTW, the 4th edition contains quite a bit more than the 3rd edition and I have not (yet) taught from that one but it is supposed to be quite a bit better. Regardless, if you're doing Jess or CLIPS it is almost indispensable and is a great guide for any rulebased system. The first two chapters alone will make life easier.

Some good examples of where decision tables, decision trees and modeling don't work well (if at all) are such things as engineering problems, plant processing (think cracking towers or distillation towers), heat trace problems, ground fault situations, design (electronic, building, plant, whatever) problems, complex plant shutdown and start ups (as in power companies), etc. I know, you would think FORTRAN before using a rulebase. But sometimes a rulebase is the best answer for some of these. The "prime" example of where you would not use a decision table or tree is that of NP Hard problems such as Subset-Sum, Traveling Salesman, halting problem or map coloring problem. Some of these are on Wikipedia. See either

http://en.wikipedia.org/wiki/Map_color_problem or
http://en.wikipedia.org/wiki/Np_hard or

there is a list of problems under
http://en.wikipedia.org/wiki/NP-complete#NP-complete_problems
on Wikipedia.

The thing is, the rulebase industry as a whole has become totally focused on only ONE thing - business problems. These are the simplest of all, the easiest to solve AND these are the ones that usually pay the bills. This is what both Davids meant when they said that I was ignoring customer demands. I am not ignoring them - I agree that their approach will handle 85% to 90% of their business-only situations. (Some say all business problems can be handled that way but I can't quite agree with that.) I just cannot see doing MYCIN with a decision table. You could, but why? Back in the 80's I had a friend who wrote a word processor using COBOL just to show that it could be done. But why?

I'm not fighting the success of Corticon, Visual Rules nor VisiRules. Rather I'm pointing out that they are "business only" tools that were instigated by FICO Blaze Advisor and ILOG JRules several years ago when they initiated Decision Tables as a way of allowing the business users to have control of the rules. These guys are WAY too sensitive concerning criticism (as evidenced by their comments) and apparently have not tried to use their wonderful products in some truly complex situations outside of the business world. A business problem might be huge, it might be compound (meaning very large) but the rules themselves are rarely complex.

Thanks for your questions and I hope that I have answered them but I'm sure that you have some comments of your own. I would be interested in hearing them sometime. Maybe at ORF 2009? :-)

For now, I think I've said almost everything that I could say on the subject so I think I'm going to move on to something more constructive. Again, thanks for your questions.

SDG
jco

snshor said...

James,

thanks for your thoughtful and educating response,
even though I do not agree with some of your points.



1) First of all, I believe that the fight, that you found yourself somewhat in, is not for the hearts and minds of AI-oriented engineers, but for the fat (not so much lately) wallets of so-called business users (aka pigeons). In this context, leave it for the marketing, Rete-based engines have no real advantages, for many reasons, for example, because BUs can neither comprehend nor use any of the serious Rete features.

2) In terms of performance, I believe that you are incorrect: the modern products can easily handle Decision Tables of very significant size without any performance disadvantage to Rete engines. I would even go out on a limb here and say that Decision Tables provide better possibilities for performance optimization than unstructured rules.

3) Then there is the issue of which approach provides a better user experience. This may seem too prone to personal bias, but I strongly believe that BUs feel more comfortable with anything that looks like Excel.



This was the "simple and primitive" world.



Now, toward the "complex" world of mission-critical, real time and NP-complete problems.

1) There is nothing wrong with having "compile-time" gap and coverage analysis here too. Most complex problems could and should be decomposed into a set of simple subproblems, but the overall complexity becomes a problem of its own, and anything that could minimize possible human error at this stage is more than welcome. (Where would you rather have an undetected logical problem, in a nuclear power-plant control program or a mortgage compliance program? The latter at least can not cause a major catastrophe, right? Wait a minute .... ;)

2) Any problem of NP computational complexity deserves and often has a good problem-specific heuristic algorithm. I would never recommend a general-purpose tool like a Rete engine for these kinds of problems (and, for that matter, there are much better general-purpose tools for this problem domain, in terms of both performance and expressive power). The best use of rules in those applications is to define problem constraints. In this sense, any approach will do, with the table-driven one providing better structuring.

3) And complex equipment/plant control applications are best implemented using the FSM metaphor - this is also a kind of table-driven approach.
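As a rough illustration of why an FSM counts as table-driven, the whole machine can be written as a lookup table from (state, event) to next state. This is a minimal sketch only; the states, events and function names are hypothetical and are not taken from any real plant control system.

```python
# Transition table: (current state, event) -> next state.
# States and events are hypothetical, loosely modeled on a generator start-up.
TRANSITIONS = {
    ("offline", "start"): "warming",
    ("warming", "ready"): "online",
    ("online", "stop"): "cooling",
    ("cooling", "cold"): "offline",
}


def step(state, event):
    """Look up the next state; an event with no table entry leaves the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(state, events):
    """Feed a sequence of events through the table and return the final state."""
    for event in events:
        state = step(state, event)
    return state


print(run("offline", ["start", "ready", "stop"]))
```

Silently ignoring an unlisted event is a design choice made for brevity here; a real controller would more likely treat it as an error.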



It would be wrong to confine any of the approaches to some kind of an "algorithmic ghetto". It is important to get above personal biases and try to get the best out of each approach, while being aware of its possible limitations.


James, if this looks to you like just another set of talking points, it is just a blog comment, right? But I am ready to elaborate on any item of interest to you, if you wish to discuss it in more detail.

Best Regards,

Stanislav Shor

James Owen said...

Stan:

(I like that better than snshor. :-)

I thought this one had died by now. Maybe this will kill it off. So, I'll try and answer your points one-by-one.

1) I have absolutely NO fight for the wallets of the business users. I see myself as fighting FOR them. I consider myself an engineer first, a computer programmer second and an AI guy third. But AI is my first love. I let marketeers fight for the wallets, not me.

2) I was speaking from the experience of having implemented several projects where the customer insisted on large decision tables and found performance to be an issue and had to back off to more "conventional" rulebase programming. Quite possibly it was the way that the JRules (and later Advisor rules) were implemented by the time I arrived on-site and I (we) did not have the time to "debug" the rules that had already been written in order to improve performance of the decision tables.

3) True, business analysts always feel comfortable with something that looks like an Excel spreadsheet. So do most Java programmers. Unfortunately. Declarative programming is a different mindset (paradigm) and something that has to be learned, just like moving from COBOL to Java or C to C++.

1a) I have worked with extremely large power systems that used a rulebase to control the shutdown and startup of their generators for maintenance that was complicated by the changing price of various fuels such as natural gas, high-grade coal, lignite coal, and oil. Their plants stretched from Houston and Corpus Christi to Minnesota. This was something that could NOT be done easily with a "spreadsheet" approach. Also, it was not something that could be easily coded into a C++ or Java program without lots and lots of hidden errors. So your example of the nuclear power plant is spot-on: Do it with an inference engine and thoroughly debug the sucker.

2a) More than likely, a constraint-based engine would be best for the NP-hard problems. I have used them in the past and I liked them. They have some limitations in that they are usually not flexible enough to pick up "rules" that an inference engine handles quite well. Try the map coloring problem with a table-driven approach and let me know what you think. :-)

3a) Mentioned above. Could not do it with a table-driven approach. They tried it and it became "Gi-normous" quite quickly. Rules within inference engines have the advantage of being non-monotonic and can be forward or backward chaining as needed, which is what we ended up using. (We migrated from C/C++ to Nexpert to Expert, a Neuron Data backward chaining engine that was a delight to use on complex problems.)

4a) "Algorithmic Ghetto??" My bias is trying to find the right tool for the right job and if a particular vendor's product does not fit the bill (in my estimation) then their marketeers become agitated and irritated and pounce upon anything to get their product accepted. It is they who are living in the "Ghetto" of Decision Tables, not I.

As I have said before AND WILL SAY AGAIN, I find the Corticon and Visual Rules approach quite acceptable in the business environment in which they are normally used. This goes double for FICO Blaze Advisor and ILOG JRules and JBoss Drools when they deem it proper to use decision tables and decision trees. BUT, they don't fit every problem - BUT they do fit about 85% of them with proper architecture and proper design of the tables themselves.

So, read the paragraph above again, accept that not every product fits every situation and let's drop this thing for now. If you (and the two Davids) show up at ORF 2009, I will schedule a special "pub night" [probably Thursday evening] just for this discussion. No shouting. No personal insults (such as those above.) And, hopefully, the giants of our industry (Dr. Forgy, Dr. Friedman-Hill, Jason Morris, Mark Proctor, Daniel Selman, Gary Riley, Paul Haley, Dr. Feldman, Dr. Hicks, Dr. Armstrong, and some others) will be there to discuss this "problem." :-)

Be there or be square. (Yes, I am that old.)

SDG
jco

Carole-Ann Matignon said...

James,

Let's not re-open the whole discussion. We can do that during Pub night at ORF. I am up for those arguments ;-)

One thing that you are mis-representing though is the decision table thing. We can spend hours arguing whether one decision table is superior or equivalent to another. What should be more valuable is to recognize that not all problems can be or should be represented as decision tables.

Managing 1,000 or 100,000 rules in a decision table when rules are widely disconnected is less than ideal. Most underwriting or claims processing projects do need other representations such as decision trees, scorecards or just business rules in their own language.

Representing well the decision logic is at least as important as its execution mode.

My $.02...

Carole-Ann

Paul said...

James,

Sorry I didn't notice this years ago when it was fresh, but I thought I should reply to the idea of scaling.

The Texas TIERS project computes eligibility for all their assistance programs (Medicaid, Medicare, Food stamps, TANF, etc.). This system uses Decision Tables coded in spreadsheets exclusively. They don't end up with huge Decision Tables, because they allow "calling" Decision Tables from other Decision Tables.
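The idea of one Decision Table "calling" another can be sketched roughly as follows. This is an illustration of the general technique only and is not the TIERS or DTRules format or API; the table names, conditions and the `("call", name)` convention are hypothetical.

```python
# Hypothetical tables: each row is (conditions, outcome). An outcome of the form
# ("call", table_name) hands the case to another table instead of returning a result.
TABLES = {
    "eligibility": [
        ({"resident": False}, "ineligible"),
        ({"resident": True}, ("call", "income_check")),
    ],
    "income_check": [
        ({"low_income": True}, "eligible"),
        ({"low_income": False}, "ineligible"),
    ],
}


def decide(table_name, facts):
    """Evaluate rows in order; follow a ("call", name) outcome into the named table."""
    for conditions, outcome in TABLES[table_name]:
        if all(facts.get(name) == wanted for name, wanted in conditions.items()):
            if isinstance(outcome, tuple) and outcome[0] == "call":
                return decide(outcome[1], facts)
            return outcome
    return None  # no row matched in this table


print(decide("eligibility", {"resident": True, "low_income": True}))
```

Splitting the logic this way keeps each table small, because a table only needs columns for the conditions it actually tests.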

All the Decision Tables are limited to 16 columns. Conditions are all boolean (unlike Corticon).

TIERS uses over 3000 decision tables. And nearly everyone that has done performance testing on the system (The state has used Deloitte, IBM, and I am sure others) expected the Rules Engine to be a performance problem. In fact, the Rules Engine's performance has been so fast that even the smallest database access dwarfs the processing time of the Rules.

In contrast with Corticon, this system just executes the Decision Tables as defined with no claims of "pre-inferencing".

I know this is late, but it should be added. The TIERS rules engine was written in plain old Java in 2000. The system has been deployed in Michigan, Colorado, and New Mexico by Deloitte.

Decision Tables are quite capable of handling huge rule sets and huge amounts of data. They don't do reasoning so well, as you have pointed out. But many very big problems do not require reasoning.

Paul Snow
DTRules.com