Hitler Discovers MooseStack
• 4/8/2026

After months of battling Debezium, Kafka Connect, Schema Registry, sink connectors, and thousands of lines of YAML just to sync Postgres to ClickHouse, things go very wrong. Then a junior engineer sends a GitHub link. github.com/514-labs/moosestack
| Time | Dialogue |
| --- | --- |
| 00:00 - 00:03 | We set up Debezium to capture changes from Postgres |
| 00:04 - 00:05 | then configured Kafka Connect |
| 00:05 - 00:07 | with the sink connector |
| 00:08 - 00:12 | We deployed a 5-node Kafka cluster with Schema Registry |
| 00:12 - 00:15 | and wrote 2,000 lines of YAML to wire it all together |
| 00:17 - 00:19 | We're running CDC from |
| 00:19 - 00:21 | 47 Postgres tables into ClickHouse |
| 00:24 - 00:26 | Mein Führer... |
| 00:27 - 00:28 | The pipeline... |
| 00:31 - 00:33 | Someone changed a column type in Postgres |
| 00:34 - 00:36 | The entire CDC pipeline is down. ClickHouse has been out of sync for 3 days |
| 00:53 - 00:58 | Everyone who has never been woken up by a PagerDuty alert from Debezium, get out |
| 01:13 - 01:15 | How did we let this happen? |
| 01:15 - 01:17 | Who runs a CDC pipeline with NO schema evolution strategy?! |
| 01:18 - 01:23 | You read one blog post about event-driven architecture and mass adopted three JVM services?! |
| 01:25 - 01:28 | We should have just stayed on Postgres... |
| 01:29 - 01:31 | You had to build the most over-engineered pipeline possible |
| 01:31 - 01:34 | just to feel like a 'real' data engineer |
| 01:34 - 01:37 | I could have used a managed Snowflake pipeline |
| 01:37 - 01:40 | for a tenth of the headache |
| 01:40 - 01:42 | Even a cronjob with pg_dump scales better than this |
| 01:42 - 01:46 | The Confluent sales rep sold you a dream. And you bought it. |
| 01:46 - 01:48 | Mein Führer, we can add Flink to handle the schema changes |
| 01:48 - 01:52 | Oh wonderful, another JVM service. Should we also add Spark while we're at it? |
| 01:53 - 01:54 | What's next, a Hadoop cluster? |
| 01:56 - 01:57 | Maybe Airflow on top? |
| 01:57 - 02:00 | I was told to use 'infrastructure as code' for the connectors |
| 02:00 - 02:03 | But the connector configs are 500 lines of JSON that NO ONE understands |
| 02:04 - 02:08 | We wasted 4 months migrating from Kafka to Redpanda to avoid the JVM |
| 02:08 - 02:13 | And we STILL need Debezium which IS a JVM application! |
| 02:14 - 02:16 | The sink connector crashes every time ClickHouse does a merge |
| 02:17 - 02:21 | And the Schema Registry rejects every other schema change |
| 02:27 - 02:29 | Why is syncing two databases so damn hard? |
| 02:30 - 02:34 | Why can't I just define my schema and have everything wired up automatically? |
| 02:34 - 02:36 | Even our YAML configs have YAML configs |
| 02:41 - 02:42 | I'm done with Debezium and Kafka Connect |
| 02:43 - 02:47 | I will rewrite the entire pipeline myself. In Python. |
| 02:48 - 02:53 | All I wanted was to get data from Postgres into ClickHouse |
| 02:54 - 02:56 | Instead I'm running a JVM zoo with 6 services |
| 02:56 - 02:59 | just to move rows from one database to another |
| 03:00 - 03:02 | I've mass adopted Debezium, Kafka, Flink, Schema Registry, and sink connectors |
| 03:04 - 03:07 | Six months of my life. Gone. Debugging offset resets. |
| 03:14 - 03:16 | Sir... a junior engineer just Slacked me a GitHub link. Something called MooseStack |
| 03:19 - 03:23 | You define schema in TypeScript. One command wires up ClickHouse, Redpanda, APIs, everything |
| 03:25 - 03:26 | No JVM. No YAML. No sink connectors. Schema as code with hot reload. |
| 03:31 - 03:33 | ...this is literally everything I just asked for |
| 03:40 - 03:46 | This existed the entire time?! |
| 03:46 - 03:49 | ...six months. I mass adopted six JVM services. And the answer was one npm install away. |
| 03:53 - 04:10 | github.com/514-labs/moosestack |
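The punchline of the transcript is "schema as code": instead of wiring Debezium, Kafka Connect, and sink connectors together with YAML, the schema is defined once in TypeScript and the infrastructure is derived from it. As a toy illustration of that idea only (the type names and `toClickHouseDDL` helper below are made up for this sketch and are not MooseStack's actual API; see the linked repo for the real thing):

```typescript
// Illustrative sketch of "schema as code": one typed schema definition,
// from which downstream infrastructure (here, a ClickHouse DDL statement)
// is generated mechanically instead of being hand-maintained in YAML.
type ColumnType = "String" | "UInt64" | "DateTime";

interface Schema {
  table: string;
  columns: Record<string, ColumnType>;
}

// Derive a ClickHouse CREATE TABLE statement from the schema object,
// so a column-type change happens in exactly one place.
function toClickHouseDDL(schema: Schema): string {
  const cols = Object.entries(schema.columns)
    .map(([name, type]) => `${name} ${type}`)
    .join(", ");
  return `CREATE TABLE ${schema.table} (${cols}) ENGINE = MergeTree ORDER BY tuple()`;
}

const events: Schema = {
  table: "user_events",
  columns: { id: "UInt64", name: "String", created_at: "DateTime" },
};

console.log(toClickHouseDDL(events));
```

The point is not the DDL generator itself but the direction of dependency: the typed schema is the source of truth, and everything else is generated from it.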
