riak_sysmon
===========

[![Build Status](https://secure.travis-ci.org/basho/riak_sysmon.png?branch=master)](http://travis-ci.org/basho/riak_sysmon)

`riak_sysmon` is an Erlang/OTP application that manages the event
messages that can be generated by the Erlang virtual machine's
`system_monitor` BIF (Built-In Function).  These messages can notify a
central data-gathering process about the following events:

* Processes whose private heaps grow beyond a certain size.
* Processes whose private heap garbage collection takes too long.
* Ports that are busy, e.g., blocking file and socket I/O.
* Network distribution ports that are busy, e.g., heavy communication
  with a slow peer Erlang node.
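
For orientation, here is a minimal sketch of subscribing to these
events directly with `erlang:system_monitor/2`, without `riak_sysmon`.
The module name `sysmon_demo` and the threshold values are assumptions
chosen for illustration only.

```erlang
-module(sysmon_demo).
-export([start/0]).

%% Make the calling process the node's system_monitor receiver and
%% print every event that arrives.  Thresholds are arbitrary examples.
start() ->
    Opts = [{long_gc, 50},          % GC taking at least 50 ms
            {large_heap, 1000000},  % heap of at least 1,000,000 words
            busy_port,              % process suspended on a busy port
            busy_dist_port],        % ... or on a busy distribution port
    _OldSettings = erlang:system_monitor(self(), Opts),
    loop().

loop() ->
    receive
        {monitor, Pid, Tag, Info} ->
            io:format("~p: ~p ~p~n", [Tag, Pid, Info]),
            loop()
    end.
```

Note that calling `sysmon_demo:start()` on a node where `riak_sysmon`
is running would replace it as the receiver, since only one process
can hold that role.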

The problem with `system_monitor` events is that there isn't a
mechanism within the Erlang virtual machine that limits the rate at
which the events are generated.  A busy VM can easily create many
hundreds of these messages per second.  Some kind of rate-limiting
filter is required to avoid further overloading a system that may
already be overloaded.

This app uses two processes for `system_monitor` message handling:

1. A `gen_server` process to provide a rate-limiting filter.
2. A `gen_event` server to allow flexible, user-defined functions to
   respond to `system_monitor` events that pass through the
   first-stage filter.
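
The rate-limiting idea can be sketched as a small counter that is
reset once per interval.  This is an illustration only, not the code
of the actual filter process; the module name `rate_limit_sketch`, its
function names, and the single per-interval limit are all assumptions.

```erlang
-module(rate_limit_sketch).
-export([new/1, event/1, tick/1]).

%% limit:      maximum events forwarded per interval
%% forwarded:  events forwarded so far in the current interval
%% suppressed: events dropped so far in the current interval
-record(state, {limit, forwarded = 0, suppressed = 0}).

new(Limit) when is_integer(Limit), Limit >= 0 ->
    #state{limit = Limit}.

%% Decide whether one incoming event is forwarded or dropped.
event(#state{limit = L, forwarded = F} = S) when F < L ->
    {forward, S#state{forwarded = F + 1}};
event(#state{suppressed = N} = S) ->
    {drop, S#state{suppressed = N + 1}}.

%% Call once per interval (for example, from a one-second timer).
%% Returns the number of events dropped and a state with fresh counters.
tick(#state{suppressed = N} = S) ->
    {N, S#state{forwarded = 0, suppressed = 0}}.
```

In a `gen_server`, `event/1` would be called for each incoming
`system_monitor` message and `tick/1` from a periodic timer; a non-zero
count returned by `tick/1` is the kind of information a "suppressed"
notification could carry to the second stage.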

There can be only one system_monitor process
--------------------------------------------

(Silly reference to [The Highlander](http://www.imdb.com/title/tt0091203/)
omitted....)

The Erlang/OTP documentation is clear on this point: only one
process can receive `system_monitor` messages.  With the
`riak_sysmon` OTP app, however, multiple interested parties can each
add their own event handler to the `riak_sysmon_handler` event
manager (the `gen_event` process).

The event handler process in this application uses the registered name
`riak_sysmon_handler`.  To add your handler, use something like:
`gen_event:add_sup_handler(riak_sysmon_handler, yourModuleName, YourInitialArgs)`.

See the
[`gen_event` documentation for `add_sup_handler/3`](http://www.erlang.org/doc/man/gen_event.html#add_sup_handler-3)
for API details.  See the example event handler module in the source
repository, `src/riak_sysmon_example_handler.erl`, for example usage.

Events sent to custom event handlers
------------------------------------

The following events can be sent from the `riak_sysmon`
filtering/rate-limiting process (a.k.a. `riak_sysmon_filter`) to the
event handler process (a.k.a. `riak_sysmon_handler`).

* `{monitor, pid(), atom(), term()}` ... These are
  `system_monitor` messages as they are received verbatim by the
  `riak_sysmon_filter` process.  See the reference documentation for
  `erlang:system_monitor/2` for details.
* `{suppressed, proc_events | port_events, Num::integer()}` ... These
  messages inform your event handler that `Num` events of a certain type
  (`proc_events` or `port_events`) were suppressed in the last second
  (i.e. their arrival rate exceeded the configured rate limit).
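
A minimal sketch of a handler that consumes both event shapes is shown
below.  It is not the repository's example module; the module name
`my_sysmon_handler`, the `add_handler/0` helper, and the choice to log
through `error_logger` are assumptions made for illustration.

```erlang
-module(my_sysmon_handler).
-behaviour(gen_event).

-export([add_handler/0]).
-export([init/1, handle_event/2, handle_call/2, handle_info/2,
         terminate/2, code_change/3]).

%% Attach this module to the registered riak_sysmon_handler process.
%% The calling process is linked to the handler (supervised handler).
add_handler() ->
    gen_event:add_sup_handler(riak_sysmon_handler, ?MODULE, []).

init([]) ->
    {ok, no_state}.

%% A system_monitor message that passed the rate-limiting filter.
handle_event({monitor, Pid, Type, Info}, State) ->
    error_logger:info_msg("system_monitor ~p: ~p ~p~n", [Type, Pid, Info]),
    {ok, State};
%% Notification that Num events of one class were dropped.
handle_event({suppressed, Class, Num}, State) ->
    error_logger:info_msg("~p ~p suppressed in the last second~n",
                          [Num, Class]),
    {ok, State};
%% Ignore anything this sketch does not recognize.
handle_event(_Other, State) ->
    {ok, State}.

handle_call(_Request, State) -> {ok, ok, State}.
handle_info(_Info, State) -> {ok, State}.
terminate(_Reason, _State) -> ok.
code_change(_OldVsn, State, _Extra) -> {ok, State}.
```

Call `my_sysmon_handler:add_handler()` from a long-lived process after
the `riak_sysmon` application has started, so that the link created by
`add_sup_handler/3` ties the handler to a process that stays alive.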