White-hat change data capture

Certain database engines provide sanctioned ways to notify you of changes to the data. That's better than polling or creating triggers by hand.

Damir Bulic

Aug 25, 2021 • 1 min read

As the need to replicate database data is so common in the world of many systems and data sources, it is small wonder (some) database vendors provided mechanisms to make that easier.

Let's look at some of the implementations.

PostgreSQL

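PostgreSQL has a very nice publisher/subscriber system called logical replication: the source database defines a publication, and each target database creates a subscription to it.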
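As a rough sketch of what that setup involves, the Python helpers below only build the two SQL statements: one for the source database and one for the target. The publication, subscription, and table names and the connection string are made-up placeholders. The sketch assumes PostgreSQL 10 or later with wal_level set to logical on the source, and it does no identifier escaping.

```python
from typing import Sequence

# Sketch only: builds the SQL for PostgreSQL logical replication.
# All names below are hypothetical; nothing is escaped or executed.


def publication_sql(publication: str, tables: Sequence[str]) -> str:
    """Statement to run on the source (publisher) database."""
    return f"CREATE PUBLICATION {publication} FOR TABLE {', '.join(tables)};"


def subscription_sql(subscription: str, conninfo: str, publication: str) -> str:
    """Statement to run on the target (subscriber) database."""
    return (
        f"CREATE SUBSCRIPTION {subscription} "
        f"CONNECTION '{conninfo}' "
        f"PUBLICATION {publication};"
    )


if __name__ == "__main__":
    # Hypothetical example: replicate two tables from a source called source-db.
    print(publication_sql("orders_pub", ["orders", "customers"]))
    print(
        subscription_sql(
            "orders_sub",
            "host=source-db dbname=shop user=replicator",
            "orders_pub",
        )
    )
```

The subscribed tables have to exist on the target already, since logical replication copies rows, not table definitions.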

SQL Server

SQL Server's implementation is messy. You need to enable CDC for each table separately, which creates a shadow table for it. You then need to poll for changes periodically. It looks as if someone on the dev team implemented the previously described trigger-based approach and wrapped it in a few stored procedures.
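To make the per-table ceremony concrete, here is a minimal sketch that builds the T-SQL involved. The stored procedures and functions are SQL Server's own; the dbo.orders table is a hypothetical example, the default capture instance name (schema_table) is assumed, and identifiers are interpolated without escaping.

```python
# Sketch only: builds the T-SQL behind SQL Server CDC.
# Schema and table names are hypothetical; nothing is escaped or executed.


def enable_cdc_sql(schema: str, table: str) -> list:
    """Statements that switch CDC on for the database, then for one table."""
    return [
        # Needed once per database.
        "EXEC sys.sp_cdc_enable_db;",
        # Needed once per table; creates the change table cdc.<schema>_<table>_CT.
        f"EXEC sys.sp_cdc_enable_table "
        f"@source_schema = N'{schema}', "
        f"@source_name = N'{table}', "
        f"@role_name = NULL;",
    ]


def poll_changes_sql(schema: str, table: str) -> str:
    """Batch that reads every change currently held for the table."""
    capture_instance = f"{schema}_{table}"  # default capture instance name
    return "\n".join(
        [
            f"DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('{capture_instance}');",
            "DECLARE @to_lsn binary(10) = sys.fn_cdc_get_max_lsn();",
            f"SELECT * FROM cdc.fn_cdc_get_all_changes_{capture_instance}"
            "(@from_lsn, @to_lsn, N'all');",
        ]
    )


if __name__ == "__main__":
    for statement in enable_cdc_sql("dbo", "orders"):
        print(statement)
    print(poll_changes_sql("dbo", "orders"))
```

A real poller would store the last LSN it processed and start the next read just after it, instead of rereading from the minimum LSN every time.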

MySQL

You will need to edit MySQL's config file and enable binary logs (they are on by default since MySQL 8.0). You can then read those logs, analyze them, and use them to push the data to the target. As MySQL is open-source, many solutions exist to read those logs and push the changes to other databases, Kafka, etc.
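For illustration, the config change might look like the fragment this sketch generates. The option names are MySQL's own; the server ID and the log base name are arbitrary example values, and row-based logging is assumed because it is the format change-capture tools typically expect.

```python
import configparser
import io

# Sketch only: renders a [mysqld] fragment that enables row-based binary logging.
# The default server ID and log base name are arbitrary example values.


def binlog_config(server_id: int = 1, log_basename: str = "mysql-bin") -> str:
    """Return config-file text to merge into the MySQL config file."""
    cfg = configparser.ConfigParser()
    cfg["mysqld"] = {
        "server_id": str(server_id),  # must be unique among replicating servers
        "log_bin": log_basename,      # turns binary logging on
        "binlog_format": "ROW",       # log changed rows, not SQL statements
    }
    buffer = io.StringIO()
    cfg.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(binlog_config())
```

The server needs a restart after the change, because log_bin cannot be set at runtime.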

Oracle

Oracle used to provide an API for change streaming. Then they bought GoldenGate, a product that does low-level log sniffing, and went on to retire their own APIs. Now, to replicate changes, Oracle wants you to buy GoldenGate. Of course, it is priced for enterprises and out of reach of smaller companies.

Tomorrow we will take a look at the most performant approach, write-ahead log sniffing.