TiCDC is a tool used for replicating incremental data of TiDB. Specifically, TiCDC pulls TiKV change logs, sorts captured data, and exports row-based incremental data to downstream databases. TiCDC is suitable for the following scenarios:

- Database disaster recovery: TiCDC can be used for disaster recovery between homogeneous databases to ensure eventual data consistency of primary and secondary databases after a disaster event. This function works only with TiDB primary and secondary clusters.
- Data integration: TiCDC provides the TiCDC Canal-JSON Protocol, which allows other systems to subscribe to data changes from TiCDC. In this way, TiCDC provides data sources for various scenarios such as monitoring, caching, global indexing, data analysis, and primary-secondary replication between heterogeneous databases.

TiCDC architecture

When TiCDC is running, it is a stateless node that achieves high availability through etcd in PD. The TiCDC cluster supports creating multiple replication tasks to replicate data to multiple different downstream platforms. TiCDC consists of the following components:

- TiKV CDC component: Only outputs key-value (KV) change logs.
  - Assembles KV change logs in the internal logic.
  - Provides the interface to output KV change logs. The data sent includes real-time change logs and incremental scan change logs.
- Capture: The operating process of TiCDC. Multiple captures form a TiCDC cluster that replicates KV change logs.
  - Each capture pulls a part of the KV change logs.
  - Restores the transactions to the downstream or outputs the logs based on the TiCDC open protocol.

Replication features

This section introduces the replication features of TiCDC.

Sink support

Currently, the TiCDC sink component supports replicating data to the following downstream platforms:

- Databases compatible with the MySQL protocol. The sink component provides final consistency support.
- Kafka, based on the TiCDC Open Protocol. The sink component ensures row-level order, final consistency, or strict transactional consistency.

Ensure replication order and consistency

Replication order

- For all DDL or DML statements, TiCDC outputs them at least once.
- When the TiKV or TiCDC cluster encounters a failure, TiCDC might send the same DDL/DML statement repeatedly. For duplicated statements:
  - The MySQL sink can execute DDL statements repeatedly. For DDL statements that can be executed repeatedly in the downstream, such as TRUNCATE TABLE, the statement is executed successfully. For those that cannot be executed repeatedly, such as CREATE TABLE, the execution fails, and TiCDC ignores the error and continues the replication.
  - The Kafka sink sends messages repeatedly, but the duplicate messages do not affect the constraints of Resolved Ts. Users can filter the duplicated messages from Kafka consumers.

Replication consistency

- TiCDC does not split single-table transactions and ensures the atomicity of single-table transactions.
- TiCDC splits cross-table transactions in the unit of table and does not ensure the atomicity of cross-table transactions.
- TiCDC ensures that the order of single-row updates is consistent with that in the upstream.
- TiCDC does not ensure that the execution order of downstream transactions is the same as that of upstream transactions.
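The at-least-once delivery described above means a MySQL-protocol sink may receive the same DDL statement twice. The resulting behavior can be sketched as follows, using an in-memory SQLite database as a stand-in for the downstream (the `apply_ddl` helper is hypothetical, not part of TiCDC):

```python
import sqlite3

def apply_ddl(conn, stmt):
    """Apply a possibly-duplicated DDL statement to the downstream.

    A repeatable statement simply succeeds again. A non-repeatable one,
    such as CREATE TABLE, fails on the second attempt; the replicator
    ignores the error and continues, mirroring the behavior described
    in the text above.
    """
    try:
        conn.execute(stmt)
        return "executed"
    except sqlite3.OperationalError:
        return "ignored"  # e.g. "table t already exists" on a resend

conn = sqlite3.connect(":memory:")
print(apply_ddl(conn, "CREATE TABLE t (id INTEGER PRIMARY KEY)"))  # executed
print(apply_ddl(conn, "CREATE TABLE t (id INTEGER PRIMARY KEY)"))  # ignored
```

Ignoring the duplicate error is safe precisely because the first delivery already left the downstream in the intended state.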
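Similarly, a Kafka consumer can filter the duplicated messages mentioned above by remembering the highest commit ts it has already applied for each row. This is a minimal sketch; the message shape (`table`, `key`, `commit_ts` fields) is a hypothetical simplification, not the exact TiCDC Open Protocol format:

```python
def dedup(messages):
    """Drop row-change messages that were already applied.

    Because delivery is at least once, a resent message carries a commit
    ts no greater than one already seen for that row, so it can be
    skipped without affecting the final state.
    """
    applied = {}  # (table, row key) -> highest commit_ts applied so far
    fresh = []
    for msg in messages:
        row = (msg["table"], msg["key"])
        if msg["commit_ts"] <= applied.get(row, -1):
            continue  # duplicate or stale resend: already applied
        applied[row] = msg["commit_ts"]
        fresh.append(msg)
    return fresh

msgs = [
    {"table": "t", "key": 1, "commit_ts": 100, "value": "a"},
    {"table": "t", "key": 1, "commit_ts": 100, "value": "a"},  # resent duplicate
    {"table": "t", "key": 1, "commit_ts": 110, "value": "b"},
]
print([m["commit_ts"] for m in dedup(msgs)])  # [100, 110]
```

In a real consumer the `applied` map would be persisted alongside the consumer offset, so that deduplication survives consumer restarts.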