README.md


mnesplit (fork of [reunion](https://github.com/snar/reunion))
=======

This is another approach to make `mnesia` more partition-tolerant. 
Results are not 100% accurate (see below for known problems), but it's much 
better than default "do not even try to recover after split".

Usage:
------

	(node@host)>application:start(mnesplit).

This will start `mnesplit` in default configuraion, tracking all keys in
`set` and `ordered_set` tables with default strategy of `merge_only` and 
trying to reconnect all not running nodes every `net_tick_timeout` seconds.

Advanced usage:
---------------

`mnesplit` tries to detect two types of collisions: 
- INSERT/DELETE collision, when some record is present on one node and 
not present on another.
- Data Collision, when some record present on both nodes but contents differ.

In order to resolve first type of collision for `set` tables, `mnesplit` tracks
insert and delete operations, storing info about Table, Key, Operation and 
WhenItHappened in internal `ets` table. When collision occurs, `mnesplit` 
consults this table trying to determine last event for this key and restores 
record accordingly (if last found event is 'insert' - record is re-inserted 
on node it missed, if 'delete' - record is deleted on node it still present).

For `bag` tables operations are not tracked, and default behaviour is just to 
merge bags, adding missing elements on node they are not present. 

With Data Collision, when elements are present on both nodes with different
content, conflict resolution function is called. Default behaviour for 
`set` tables is to do nothing but to raise alarm (using `sasl` `alarm_handler`),
and for `bag` tables both elements are written on both nodes.

There are two more pre-defined strategies: `last_version`, that selects
"best" record using comparision on some field in the record, 

	(node@host)>mnesia:write_table_property(kvs, {mnesplit_compare, 
		{mnesplit_lib, last_version, [field]}}).

and `last_modified` (variant of last_version using pre-defined `modified` field
for comparison):

	(node@host)>mnesia:write_table_property(kvs, {mnesplit_compare, 
		{mnesplit_lib, last_modified, []}}).

These strategies can be used for `bag` tables too, but configuration is a bit
different: to select "best" element among a bag, elements should have some
"secondary key" field, and "best" element is selected only among elements with
the same "secondary key".

You can define your own conflict resolution functon, which will be called as:

	function(init, {Table :: atom(), Type :: 'set' | 'bag', Fields :: list(atom()), 
		RemoteNode :: atom()) -> 
		{ok, Modstate :: any()};
	function(done, Modstate :: any(), RemoteNode :: atom()) -> 
		any();
	function(LocalRecords, RemoteRecord, ModState :: any()) -> 
		{ok, Actions :: mnesplit:action() | list(mnesplit:action()), 
			NextState :: any()} | 
		{inconsistency, Error, NextState :: any()}
		
where `Action` can be one of 

	{write_local, Record} | {write_remote, Record} | {delete_local, Record} | 
		{delete_remote, Record}

I hope, the names are self-descriptive enough.

Excluding table from automatic merging:
---------------------------------------

	(node@host)>mnesia:write_table_property(bag, {mnesplit_compare, ignore}).

This also disables key learning for this table, useable for tables with
rapidly changing data.

Disabling or tuning automatic reconnects: 
-------------------------------
	
	(node@host)>application:set_env(mnesplit, reconnect, never).

With this setting no new reconnect timers will be scheduled. Other possible
setting is custom timeout value in seconds.

Note on fragmented tables:
--------------------------

As `user_properties` are not auto-propagated to table fragments, 
`compare` strategy always inherited from `base_table`. There are no 
way to implement custom strategy for some fragment.

Known problems: 
---------------

Error window: it's possible that some conflicts will not be found or that 
mnesplit can introduce `missing update` problems. This is caused by the 
fact that no table locking used, so objects in mnesia can be changed 
after object is fetched for compare but before resulting object is 
written or before mnesia nodes are joined again.

Another interesting edge case: let's assume netsplit happens right after
creating some record and then record is deleted during netsplit. This can
be detected and resolved correctly only in case `mnesplit` is running on 
both nodes and clocks are synchronised. If there are no `mnesplit` on 
island where record is deleted - this deletion will be missed.

Other approaches:
-----------------

`mnesplit` is an "almost complete rewrite" of `unsplit` by Ulf Wiger. 
Major difference between our approaches: `unsplit` uses a stateless 
approach and just unable to resolve "object present here and not present 
there" case: this can be a result of local insertion or remote deletion, 
and it's not possible to determine what happened without storing this 
information.

Thanks: 
-------

Ulf Wiger for his `unsplit` application.