My team implemented an alert broker system. The way it works is this:

  1. Event sources generate events, with mostly arbitrary structure, and the broker listens to them.
  2. Users go to the broker and use an expression language to create alerts for the events. The expressions can reference arbitrary attributes of the events.
  3. The broker dispatches alerts accordingly. This involves searching for the event subscriptions that users create in step 2 above.

I'd like some help in understanding my options for the subscription search as I need to scale it out.

Right now we're just doing linear search because the number of events and subscriptions is low. That won't last long at all.
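For reference, the linear search baseline can be sketched roughly as below. This is only an illustration: the subscription ids, the event attributes (`value`, `usedPct`), and the use of plain Python callables in place of the expression language are all assumptions, not our actual code.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Predicate = Callable[[Event], bool]

# Hypothetical subscription table: subscription id -> compiled expression.
# In the real broker these would come from the user-facing expression language.
subscriptions: Dict[str, Predicate] = {
    "high-temp": lambda e: e.get("type") == "temperature" and e.get("value", 0) > 90,
    "any-from-s1": lambda e: e.get("sourceId") == "s1",
    "disk-range": lambda e: e.get("type") == "disk" and 80 <= e.get("usedPct", 0) <= 100,
}


def match_linear(event: Event) -> List[str]:
    """Evaluate every subscription against the event.

    Cost per event grows linearly with the number of subscriptions,
    which is why this stops being viable as the table grows.
    """
    return [sub_id for sub_id, pred in subscriptions.items() if pred(event)]
```

Note that every matching subscription is returned, not just one; that is the behavior any replacement has to preserve.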

The plan for the next step is to index the small number of known event fields (e.g. sourceId, type, name). Once that prunes the search space, we could fall back to linear search over the remaining candidates. That should carry us for a while, and maybe indefinitely.
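A minimal sketch of that prune-then-scan idea, assuming each subscription can be split into exact-match constraints on the known fields plus a residual predicate over everything else. The class names and that split are hypothetical; the sketch also assumes the indexed field values are hashable.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Set

INDEXED_FIELDS = ("sourceId", "type", "name")  # the "known" event fields


def always_true(event: Dict[str, Any]) -> bool:
    return True


@dataclass
class Subscription:
    sub_id: str
    exact: Dict[str, Any]  # exact-match constraints on indexed fields only
    residual: Callable[[Dict[str, Any]], bool] = always_true  # everything else


class SubscriptionIndex:
    def __init__(self) -> None:
        self.subs: Dict[str, Subscription] = {}
        # field -> value -> ids of subscriptions requiring that value
        self.by_value = {f: defaultdict(set) for f in INDEXED_FIELDS}
        # field -> ids of subscriptions that do not constrain the field
        self.unconstrained: Dict[str, Set[str]] = {f: set() for f in INDEXED_FIELDS}

    def add(self, sub: Subscription) -> None:
        self.subs[sub.sub_id] = sub
        for f in INDEXED_FIELDS:
            if f in sub.exact:
                self.by_value[f][sub.exact[f]].add(sub.sub_id)
            else:
                self.unconstrained[f].add(sub.sub_id)

    def match(self, event: Dict[str, Any]) -> List[str]:
        candidates: Optional[Set[str]] = None
        for f in INDEXED_FIELDS:
            # .get avoids creating empty buckets in the defaultdict on lookup
            allowed = self.by_value[f].get(event.get(f), set()) | self.unconstrained[f]
            candidates = allowed if candidates is None else candidates & allowed
        # Linear scan, but only over the pruned candidate set.
        return sorted(s for s in (candidates or set()) if self.subs[s].residual(event))
```

The pruning only helps for subscriptions that actually constrain an indexed field; a subscription with an empty `exact` dict stays in every candidate set and is always scanned.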

But if we hit a wall there, what are the standard approaches? I know RETE is good at this sort of thing, but it seems that RETE implementations appear in the context of rules engines, and those involve conflict resolution to a single rule. In our case there would be multiple subscriptions (rules) we want to match, so just doing a straight rules engine doesn't seem to work. If there are standalone RETE implementations, I haven't seen them.
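To make the "match all subscriptions, no conflict resolution" requirement concrete, here is a rough sketch of one way to share work across subscriptions. To be clear, this is not RETE; it is a simple counting scheme that only borrows the idea of evaluating each distinct condition once. It assumes every subscription is a conjunction of `(attribute, operator, constant)` comparisons with hashable constants, which is a simplification of the expression language, and all names are hypothetical.

```python
import operator
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Set, Tuple

# A predicate is (attribute, operator name, constant), e.g. ("value", "gt", 90).
Predicate = Tuple[str, str, Any]

OPS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


class CountingMatcher:
    def __init__(self) -> None:
        self.required: Dict[str, int] = {}  # sub id -> number of distinct predicates
        self.subs_by_pred: Dict[Predicate, Set[str]] = defaultdict(set)

    def add(self, sub_id: str, predicates: Iterable[Predicate]) -> None:
        distinct = set(predicates)
        self.required[sub_id] = len(distinct)
        for pred in distinct:
            self.subs_by_pred[pred].add(sub_id)

    def match(self, event: Dict[str, Any]) -> List[str]:
        hits: Dict[str, int] = defaultdict(int)
        # Each distinct predicate is evaluated once, however many subscriptions use it.
        for (attr, op, const), sub_ids in self.subs_by_pred.items():
            if attr not in event:
                continue
            try:
                satisfied = OPS[op](event[attr], const)
            except TypeError:
                satisfied = False  # incomparable types count as "not satisfied"
            if satisfied:
                for sub_id in sub_ids:
                    hits[sub_id] += 1
        # A subscription matches when all of its predicates were satisfied.
        return sorted(s for s, n in hits.items() if n == self.required[s])
```

All subscriptions whose predicates are satisfied come back together, so there is no single-rule selection step. In this sketch a subscription with zero predicates never matches, and disjunctions would need to be split into separate conjunctions first.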

I guess finding or implementing standalone RETE is one option.

Other options?

(Also--to be clear, this isn't a case of premature optimization. I don't plan to implement anything optimized unless it looks like I actually need it. But I'd like to know that there's some plausible game plan should that occur.)

Christophe
asked Sep 10, 2016 at 20:43
  • How complex is the expression language? Does it perform calculation? Is it limited to specific types? Is it just basic comparisons? Commented Sep 10, 2016 at 21:21
  • Basic comparisons: exact match, ranges, inequalities, stuff like that. Right now we are using Node.js and JMESPath (like XPath but for JSON), but all of that is up for grabs since it's just a simple search at this point. Commented Sep 10, 2016 at 22:04
  • (Quoting) "If there are standalone RETE implementations, I haven't seen them." Hopefully relevant: jessrules.com/docs/71/rete.html. HTH. Commented Sep 12, 2016 at 16:58
  • Jess is a rules engine, so it uses RETE in the context of a larger rule identification process that produces a single rule in the end. It might be that I can just use the RETE part of Jess, Drools, etc. and ignore the larger rules engine. But it would be nice if there were a standalone RETE implementation without the larger rules engine. Commented Sep 12, 2016 at 17:03

1 Answer


To circle back on this, we ended up using Elasticsearch Percolator. See https://stackoverflow.com/questions/21536599/what-does-percolator-mean-do-in-elasticsearch for more information.
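Roughly, the percolator inverts the usual search: subscriptions are stored as queries, and each incoming event is sent as a document to find the stored queries that match it. The sketch below only builds the JSON request bodies as Python dicts and makes no client calls. The field name `query`, the event fields, and the helper names are hypothetical, and the mapping shape assumes a recent Elasticsearch version with typeless mappings, so treat it as an illustration rather than our actual setup.

```python
from typing import Any, Dict

PERCOLATOR_FIELD = "query"  # hypothetical name for the field holding stored queries


def index_mapping() -> Dict[str, Any]:
    """Body for creating the subscriptions index.

    Fields that stored queries refer to generally need to be mapped here too.
    """
    return {
        "mappings": {
            "properties": {
                PERCOLATOR_FIELD: {"type": "percolator"},
                "sourceId": {"type": "keyword"},
                "type": {"type": "keyword"},
                "value": {"type": "double"},
            }
        }
    }


def subscription_doc(source_id: str, min_value: float) -> Dict[str, Any]:
    """Document body for one subscription: a query stored in the percolator field."""
    return {
        PERCOLATOR_FIELD: {
            "bool": {
                "filter": [
                    {"term": {"sourceId": source_id}},
                    {"range": {"value": {"gt": min_value}}},
                ]
            }
        }
    }


def percolate_request(event: Dict[str, Any]) -> Dict[str, Any]:
    """Search body asking which stored queries match the given event."""
    return {"query": {"percolate": {"field": PERCOLATOR_FIELD, "document": event}}}
```

Each hit in the search response is a stored subscription whose query matches the event, so multiple matching subscriptions come back in one request.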

There are also some papers that describe algorithms for solving this, but we haven't tried them. Example:

answered Apr 17, 2019 at 4:54
