WL#2197: Multi-source: Diamond replication

Affects: Server-7.1   —   Status: Assigned

REQUIREMENT
It must be possible to replicate in any directed acyclic graph 
configuration (e.g. A->B, A->C, B->D, C->D) without applying the 
same binlog entry more than once.  This should also work for a 
configuration like (A->B, B->C, B->D, C->E, D->E).
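
To illustrate why duplicates arise, the sketch below counts the distinct
replication paths from an origin to a target server; without any
de-duplication, the target would receive one copy of each entry per path.
The function name and the edge-list representation are hypothetical and
exist only for this illustration.

```python
from functools import lru_cache


def copies_received(edges, origin, target):
    """Count distinct replication paths from origin to target in a DAG."""
    downstream = {}
    for master, slave in edges:
        downstream.setdefault(master, []).append(slave)

    @lru_cache(maxsize=None)
    def paths(node):
        # Reaching the target completes exactly one path.
        if node == target:
            return 1
        return sum(paths(nxt) for nxt in downstream.get(node, []))

    return paths(origin)


# The two configurations named in the requirement.
diamond = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
second = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]

print(copies_received(diamond, "A", "D"))  # 2
print(copies_received(second, "A", "E"))   # 2
```

In both configurations the last server sees every entry from A twice,
which is the case the requirement rules out.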

IMPLEMENTATION
1. Add an identifier, UID, to binlog events.  If a binlog 
   entry arrives at a server whose current UID is larger 
   than the entry's UID, the entry is discarded.
   Room for the id already exists in the 5.0 binlog format.  

   The id may either be a UUID (exists in 4.1), or be a 
   combination of 
     server_id+thread_id + 
     timestamp+a_per_thread_counter (thd->query_id, for example).
   In the latter case, note that we have already had 
     server_id+thread_id+timestamp 
   in the binlog since 4.0, so it only means adding a 
   per_thread_counter.
   TIME LEFT: 40 hrs
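
A minimal sketch of the discard rule, under stated assumptions: the
UID is the composite form (server_id, thread_id, timestamp, per-thread
counter), each receiving server keeps one high-water mark per
(server_id, thread_id), and each upstream delivers entries in order.
The text above says "larger than"; this sketch also treats an equal UID
as a duplicate, which is an assumption, since otherwise the second copy
arriving in a diamond would not be discarded. All class and field names
are hypothetical and are not taken from the server source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventUid:
    server_id: int
    thread_id: int
    timestamp: int
    counter: int  # the proposed per-thread counter


class Applier:
    """Receiving side: applies an entry at most once (illustrative only)."""

    def __init__(self):
        # (server_id, thread_id) -> highest (timestamp, counter) applied
        self.high_water = {}
        self.applied = []

    def receive(self, uid, payload):
        key = (uid.server_id, uid.thread_id)
        mark = (uid.timestamp, uid.counter)
        last = self.high_water.get(key)
        # Assumption: an equal mark counts as already applied.
        if last is not None and last >= mark:
            return False  # discard the entry
        self.high_water[key] = mark
        self.applied.append(payload)
        return True


# Server D in the diamond receives each entry twice (via B and via C).
d = Applier()
e1 = EventUid(server_id=1, thread_id=7, timestamp=1000, counter=1)
e2 = EventUid(server_id=1, thread_id=7, timestamp=1000, counter=2)
results = [
    d.receive(e1, "INSERT 1"),  # via B
    d.receive(e1, "INSERT 1"),  # via C, duplicate
    d.receive(e2, "INSERT 2"),  # via C
    d.receive(e2, "INSERT 2"),  # via B, duplicate
]
print(results)    # [True, False, True, False]
print(d.applied)  # ['INSERT 1', 'INSERT 2']
```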

2. Optionally, care about minimizing the network traffic. 
   Example: M1->M2->M3->M1. 

   Currently an event created on M1 is replicated to M2, 
   then to M3, and is then sent back to M1, where M1 says 
   "it's mine, I discard it". It would be better if M3 did 
   not send it to M1 at all. 

   In my example this does not gain much, but when we have 
   multimaster we will have configs with a lot 
   of connections between machines (a star, for example, or a grid). 

   We want to avoid useless network traffic as much as we can. 
   Remember that a prospect asked about 1000 machines 
   replicating to each other. 

   I thought of storing in the event the last part of its route 
   (the server ids of the last X machines the event went through, 
   kept in the manner of a circular buffer). 
   I don't know whether it would help much. To be thought about.
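
The route idea could be sketched as follows. This is only an
illustration under assumptions: the route is a bounded buffer of the
last X server ids, the originating server counts as the first hop, and
a server skips any slave whose id is still in the buffer. The function
names are hypothetical.

```python
from collections import deque


def new_route(origin_id, max_hops):
    """Start a route holding at most max_hops server ids (the X above)."""
    return deque([origin_id], maxlen=max_hops)


def record_hop(route, server_id):
    """Append a server id; the oldest id is dropped once the buffer is full."""
    route.append(server_id)


def should_send(route, target_id):
    """Skip the send when the target is known to have seen the event."""
    return target_id not in route


# Ring M1->M2->M3->M1 with X = 3: M3 can see that M1 is on the route.
route = new_route(1, max_hops=3)
record_hop(route, 2)
record_hop(route, 3)
print(should_send(route, 1))  # False

# Same ring with X = 2: M1 has been pushed out of the buffer, so M3
# still sends and M1 has to discard the event as it does today.
short = new_route(1, max_hops=2)
record_hop(short, 2)
record_hop(short, 3)
print(should_send(short, 1))  # True
```

The second case shows the trade-off: a small X keeps events small, but
any cycle longer than X still costs one useless transfer.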