Friday, November 28, 2008

Complex Event Processing

Greetings:

Carole-Ann Berlioz-Matignon started another small firestorm on her blog about Complex Event Processing.  In running down the rabbit trails, I found another blog on CEP by Tim Bass of Tibco that takes a more academic approach, using lots of pictures and paragraphs of polysyllabic words and phrases that take some time to read.  Charles Young, he of UK fame, even chimed in with his usual lucid and erudite thoughts.  I did read through the blogs and now I'm pretty sure that no one has a really good grasp on a definitive answer to, "What is CEP?"

I think my FIRST comment is that neither CEP nor rule-based systems nor neural nets are "Business"-only pursuits, although Fair Isaac, ILOG, Tibco, IBM, Microsoft and others are desperately trying to force everything computer-related into their own pre-defined concept of how that particular tool might be applied in the business world, while relegating the rest of the world (physics, chemistry, geology, psychology, medicine, electronics, forecasting, analytics, etc.) to non-essential importance.  They will deny it, of course, but, nevertheless, their actions cry much louder than their words.  If it cannot be forced into their world of business applications then it does not deserve consideration.

Back to the matter at hand:  Wikipedia defines CEP as "... primarily an event processing concept that deals with the task of processing multiple events with the goal of identifying the meaningful events within the event cloud.   CEP employs techniques such as detection of complex patterns of many events, event correlation and abstraction, event hierarchies, and relationships between events such as causality, membership, and timing, and event-driven processes."  By its very name, it is an Event - not something static - and it is Complex in that there MUST be something about the event(s) that makes it not easily solved or analyzed.
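To put a little flesh on that definition, here is a minimal sketch of the "abstracting many member events into one meaningful event" idea.  The scenario (repeated login failures followed by a success) and all the names in it are my own invention for illustration, not anything from the Wikipedia entry or any product:

```python
from collections import deque

def detect_complex_event(events, window=60, threshold=3):
    """Flag a composite 'suspicious login' event when `threshold` or more
    'login_failure' events from a source occur within `window` seconds,
    followed by a 'login_success' from that same source.

    `events` is an iterable of (timestamp, kind, source) tuples, assumed
    sorted by timestamp.  Returns the abstracted complex events."""
    recent_failures = deque()   # (timestamp, source) still inside the window
    alerts = []
    for ts, kind, source in events:
        # slide the window: drop failures older than `window` seconds
        while recent_failures and ts - recent_failures[0][0] > window:
            recent_failures.popleft()
        if kind == "login_failure":
            recent_failures.append((ts, source))
        elif kind == "login_success":
            hits = [f for f in recent_failures if f[1] == source]
            if len(hits) >= threshold:
                alerts.append((ts, source))   # the "complex event"
    return alerts

events = [(0, "login_failure", "a"), (10, "login_failure", "a"),
          (20, "login_failure", "a"), (30, "login_success", "a")]
print(detect_complex_event(events))   # one composite event at t=30
```

Note that the individual events here are trivial; the "complexity" lives entirely in the temporal pattern and the correlation across members, which is the point the definition is trying to make.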

Since the definition seems to be getting in the way, let's say what it is NOT.  It is not:
  • Simple
  • Time comparisons (only)
  • Rules (only)
  • Processes (only)
  • Neural Net (only)
  • Necessarily Distributed
So, now that we (OK, just me) have said what it is NOT, maybe we can say what it might be.  (This part is what is open for discussion.)  How about this?  "A Complex Event is any event in time that is composed of multiple facts - whether static or dynamic - or events, the outcome of which defies ordinary logic, yet is solvable according to either rules or neural networks, and the same set of facts and events always produces the same resultant process."  Or something like that.   What a CEP is NOT is a BRMS, an inference process (only), nor any one set of things that would constrain the solution.

For example, forecasting is an extremely complex process wherein only the short term can be predicted with any degree of accuracy.  If any financial forecasting package could predict the future to within 5% over the next 12 months, then the owner would make a small fortune.  You just can't know everything about everything.  Personally, my goal is that between now and October Rules Fest 2009 I will have made some kind of significant progress on a commercially viable forecasting package.  If nothing else, my understanding of exactly HOW to do event-driven, rule-constrained forecasting will be better.

SDG
jco

4 comments:

Opher Etzion said...

The EPTS (Event Processing Technical Society) glossary definition for "complex event" is:

Complex event - An event that is an abstraction of other events called its members.
(see: http://complexevents.com/?p=409 for the entire glossary and
http://www.ep-ts.com/ for EPTS)

Rules are one of multiple techniques that can be used to create complex events; thus CEP is not equal to BRMS.

Tim Bass said...

The EPTS definition of a "complex event" is too general and simplistic.

For example, if you use the EPTS definition, simply counting events is "complex event processing" and the result of simple aggregation, accounting or filtering is a "complex event".

This definition is way too broad and too simple to have any meaningful context. Basically, the EPTS defines "complex" to be an operation on more than one event.

Anyone can see that this definition of "complex event" and "complex event processing" is way too broad, and because of that, it is basically meaningless.

Yours sincerely, Tim

James Owen said...

Opher, Tim et al:

There is something about standards groups that seems to irritate me - usually because it becomes a consensus opinion of the commercial members rather than that of the more erudite (don't you just love that word?) members. I was privileged at one time to be a member of OMG and a couple of SIGs as well as the IEEE.

Lots of talk, not much action. And even the actions were verbose and full of abstract adjectives rather than concrete definitions. What we need to do is to arrive at a definition where the very word "complex" is defined and examples are given of situations that are not complex enough to be called complex, and some that are way too complex to fit - which hardly seems likely since that is what we are trying to accomplish.

If we were to say, "A complex event encompasses or is caused by at least three events and/or data points that have a very low (0.1 or less) correlation factor to each other and to the event itself," then the event can be called "Complex." So, if this is a start, either kill it or build on it.
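Just to show that the proposed test really is quantitative and checkable, here is a small sketch of it.  The `pearson` and `is_complex` helpers are mine, written for illustration only; the thresholds (at least three members, pairwise |correlation| of 0.1 or less) come straight from the proposal above:

```python
from itertools import combinations
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def is_complex(series, max_corr=0.1, min_members=3):
    """Apply the proposed test: at least `min_members` contributing
    series, and every pairwise |correlation| at or below `max_corr`."""
    if len(series) < min_members:
        return False
    return all(abs(pearson(a, b)) <= max_corr
               for a, b in combinations(series, 2))

# three mutually uncorrelated (orthogonal, zero-mean) member series
a = [1, -1, 1, -1]
b = [1, 1, -1, -1]
c = [1, -1, -1, 1]
print(is_complex([a, b, c]))   # qualifies: 3 members, all correlations 0
print(is_complex([a, a, b]))   # fails: two members perfectly correlated
```

One open question with this formulation: correlating each member to "the event itself" presumes the event already has a measurable signal, which is exactly what is being detected - so the pairwise-member test may be the only practically computable half.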

But, whatever we do, it must be a quantitative and NOT a qualitative definition. Being too broad is not a problem if the descriptions are quantified and closely definable. And we should not have to have 100% agreement, but it should be something more than a simple majority - perhaps something on the order of 75% agreement.

Also, many thanks for your comments and suggestions. Let's keep the ball rolling and try to do something before the end of the year.

SDG
jco

Tim Bass said...

Hi James,

I agree with you that a quantitative definition would be very useful; but we need to be careful. Complex event processing should not be defined by numbers that are unrelated to transforming raw events to actionable knowledge (actionable means high confidence, low false alarms, low false positives). I am talking about the mathematical foundations of detection theory.

For example, you can process 10,000,000 events of 2,000 different kinds in 1 second and still not provide any actionable knowledge with high confidence. The outcome at the end of the processing tunnel might be a (useless) false alarm.

CEP was always envisioned to be about detection (real-time detection of opportunities and threats), so one of the best places to start moving forward is the quantitative math of detection theory. There are a number of areas that would be a useful point of departure for CEP. A few metrics that come to mind are:

Constant false alarm rate (CFAR)
False alarm ratio (FAR)
Probability of detection (POD)
Probability of false detection (POFD, also known as the false alarm rate)

Etc.
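The last three of those ratios fall straight out of confusion-matrix counts, which is what makes them attractive as a quantitative yardstick.  A sketch of the standard verification-theory definitions (the function name and example counts are mine):

```python
def detection_metrics(tp, fp, fn, tn):
    """Standard detection/verification ratios from confusion-matrix counts:
      POD  = TP / (TP + FN)  probability of detection (hit rate)
      FAR  = FP / (TP + FP)  false alarm ratio (share of alerts that are wrong)
      POFD = FP / (FP + TN)  probability of false detection (false alarm rate)
    Guards against empty denominators by returning 0.0."""
    pod = tp / (tp + fn) if tp + fn else 0.0
    far = fp / (tp + fp) if tp + fp else 0.0
    pofd = fp / (fp + tn) if fp + tn else 0.0
    return pod, far, pofd

# e.g. 8 true detections, 2 false alarms, 2 misses, 88 correct rejections
pod, far, pofd = detection_metrics(8, 2, 2, 88)
print(pod, far, pofd)   # 0.8, 0.2, ~0.022
```

Note that FAR and POFD are frequently conflated in marketing material even though they have different denominators - which is precisely the kind of imprecision Tim is objecting to.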

There is quite a bit of quantitative theory in this area.

It is truly amazing (or perhaps disheartening is a better word) to see a technology which is all about real-time detection - avoiding all quantitative prior art and science in the area of detection theory.

Yours sincerely, Tim

PS: This is one major reason I have been consistently critical of the "Event Processing Technical Society" (EPTS). Their work is marketing oriented and their vocabulary is marketing oriented, not technology oriented, and it does not create the required quantitative precision that would stop the confusion and endless debates.

On the other hand, the problem is deeper because if we developed precise detection metrics from basic detection theory, most of the self-described CEP products on the market today would not "rise to the level required," because you cannot detect very many complex events with high accuracy (confidence) with a continuous-query rules engine across a sliding time window. So, the marketing folks have reduced CEP to "processing a lot of events fast," which has very little meaning or purpose.

This is the basic problem with the "class struggle" in the CEP domain. The marketing hype does not match the technical requirements in even elementary detection theory.