Apache Kafka in KRaft Mode With RisingWave – DZone

Over the past few years, Apache Kafka has emerged as the top event streaming platform for streaming data/event ingestion. However, before version 3.5, ZooKeeper was an additional, mandatory component for managing and coordinating the Kafka cluster. Relying on ZooKeeper in an operational multi-node Kafka cluster introduced complexity and could be a single point of failure.

ZooKeeper is a completely separate system with its own configuration file syntax, management tools, and deployment patterns. In-depth expertise in both technologies is necessary to deploy and manage two individual distributed systems and keep a Kafka cluster up and running. Expertise in Kafka administration without ZooKeeper knowledge won't help you recover from a crisis, especially in a production environment where ZooKeeper runs in a completely isolated environment (cloud).

Kafka's reliance on ZooKeeper for metadata management was eliminated by introducing the Apache Kafka Raft (KRaft) consensus protocol. This removes the need to run and configure two distinct systems (ZooKeeper and Kafka) and significantly simplifies Kafka's architecture by moving metadata management into Kafka itself. Apache Kafka officially deprecated ZooKeeper in version 3.5, and the latest version, 3.8, improves the KRaft metadata version-related messages. Of course, the ingested events are of no use unless we consume them from the Kafka topic and process them further to derive business value.

RisingWave, on the other hand, makes processing streaming data easy, reliable, and efficient once event streams flow into it from a Kafka topic. Impressively, RisingWave excels at delivering continuously updated materialized views: persistent data structures that reflect the results of stream processing with incremental updates.

In this article, I explain step by step how to install and configure the latest version of Apache Kafka, 3.8, on a single-node cluster running Ubuntu 22.04, and subsequently integrate it with RisingWave, which is also installed and configured on the same node.

Assumptions

  • OpenJDK version 17.0.12 has been installed and configured, with JAVA_HOME set in the ~/.bashrc file.

  • SSH has been installed and configured. Later, this node could be joined to a multi-node on-premises cluster.
  • PostgreSQL client version 14.12 (not the PostgreSQL server) has been installed and configured. This is mandatory for connecting via psql to the RisingWave streaming database. psql is a command-line interface for interacting with PostgreSQL databases that is included in the PostgreSQL package. Since RisingWave is wire-compatible with PostgreSQL, we will use psql to connect to RisingWave so that SQL queries can be issued and database objects managed. You can refer here to install and configure psql on Ubuntu.

Installation and Configuration of Apache Kafka 3.8 With KRaft

  • The binary version of Kafka 3.8, the latest release, can be downloaded here.
  • Extract the tarball and move the extracted directory, “kafka_2.13-3.8.0”, to /usr/local/kafka. Make sure you have “root” privileges.
  • Create a directory named “kafka-logs” under /usr/local where the Kafka logs will be stored. Make sure the created directory has read-write permissions.
  • As a configuration step, navigate to the “kraft” directory available inside “/usr/local/kafka_2.13-3.8.0/config” and open server.properties in the vi editor to update the key-value pairs. The following keys should have the corresponding values.
  • In KRaft mode, each Kafka server can be configured as a controller, a broker, or both using the process.roles property. Since this is a single-node cluster, I am setting both broker and controller.

process.roles=broker,controller

And, subsequently, set node.id=1, num.partitions=5, and delete.topic.enable=true.
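Taken together, a minimal KRaft server.properties for this single-node setup could look like the sketch below. The listener addresses and the controller quorum entry are assumptions for a local install and should be adapted to your host:

```properties
# KRaft mode: this node acts as both broker and controller
process.roles=broker,controller
node.id=1

# Single-node controller quorum (nodeId@host:port) - assumed local values
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
controller.listener.names=CONTROLLER

# Log directory created earlier
log.dirs=/usr/local/kafka-logs

num.partitions=5
delete.topic.enable=true
```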

Start and Verify the Cluster

  • The unique cluster ID and other required properties can be generated using the built-in script kafka-storage.sh available inside the bin directory.
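A two-step sketch, assuming Kafka is extracted to /usr/local/kafka_2.13-3.8.0 and log.dirs points at /usr/local/kafka-logs:

```shell
cd /usr/local/kafka_2.13-3.8.0

# Generate a random unique cluster ID
KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"

# Format the log directory configured in server.properties with that ID
bin/kafka-storage.sh format -t "$KAFKA_CLUSTER_ID" \
  -c config/kraft/server.properties
```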

  • Make sure the files bootstrap.checkpoint and meta.properties were generated inside the created directory kafka-logs. The unique cluster ID is available inside the meta.properties file.

  • Start the broker using the following command from the terminal.
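Assuming the same install path, the broker (with its embedded controller) can be started like this:

```shell
cd /usr/local/kafka_2.13-3.8.0

# Start Kafka in KRaft mode with the edited properties file
bin/kafka-server-start.sh config/kraft/server.properties
```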

  • Make sure the startup messages are displayed on the terminal without errors.

Topic Creation

Using Apache Kafka's built-in script, kafka-topics.sh, available inside the bin directory, I can create a topic on the running Kafka broker from the terminal. Create one topic named UPIStream with 3 partitions.
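A sketch of the topic creation, assuming the broker listens on localhost:9092:

```shell
cd /usr/local/kafka_2.13-3.8.0

# Create the topic with 3 partitions (replication factor 1 on a single node)
bin/kafka-topics.sh --create --topic UPIStream \
  --partitions 3 --replication-factor 1 \
  --bootstrap-server localhost:9092

# Verify the topic
bin/kafka-topics.sh --describe --topic UPIStream \
  --bootstrap-server localhost:9092
```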

  • Run RisingWave as a single instance in standalone mode.

As stated above, RisingWave in standalone mode has been installed and configured on the same node where Kafka 3.8 in KRaft mode is operational. In standalone mode, RisingWave leverages an embedded SQLite database to store metadata, and stores data in the file system. Before that, we need to install and configure the PostgreSQL client as mentioned in the assumptions.

  • Open a terminal and execute the following curl command:

$ curl https://risingwave.com/sh | sh

  • We can start a RisingWave instance by running the following command in the terminal:

$ ./risingwave

  • Open a terminal and connect to RisingWave using the following command:

$ psql -h 127.0.0.1 -p 4566 -d dev -U root

Connecting the Kafka Broker With RisingWave

Here, I am going to connect RisingWave with the Kafka broker so that it receives events from the created topic UPIStream. I need to create a source in RisingWave using the CREATE SOURCE command. When creating a source, I can choose to persist the data from the Kafka topic in RisingWave by using the CREATE TABLE command and specifying the connection settings and data format. There are additional parameters available when connecting to the Kafka broker. You can refer here to learn more.

Execute the following on the psql terminal to simply connect to the topic UPIStream.
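A minimal CREATE SOURCE sketch for the topic UPIStream, with columns inferred from the JSON events shown later; the broker address and startup mode are assumptions for this local setup:

```sql
CREATE SOURCE UPI_Transaction_Stream (
  timestamp VARCHAR,
  upiID VARCHAR,
  name VARCHAR,
  note VARCHAR,
  amount VARCHAR,
  currency VARCHAR,
  Latitude VARCHAR,
  Longitude VARCHAR,
  deviceOS VARCHAR,
  targetApp VARCHAR,
  merchantTransactionId VARCHAR,
  merchantUserId VARCHAR
) WITH (
  connector = 'kafka',
  topic = 'UPIStream',
  properties.bootstrap.server = 'localhost:9092',
  scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;
```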

Continuous Pushing of Events From the Kafka Topic to RisingWave

Using a simulator developed in Java, I published a stream of UPI transaction events at an interval of 0.5 seconds in the following JSON format to the created topic UPIStream. Here is one event from the stream.

{"timestamp":"2024-08-20 22:39:20.866","upiID":"9902480505@pnb","name":"Brahma Gupta Sr.","note":" ","amount":"2779.00","currency":"INR","Latitude":"22.5348319","Longitude":"15.1863628","deviceOS":"iOS","targetApp":"GPAY","merchantTransactionId":"3619d3c01f5ad14f521b320100d46318b9","merchantUserId":"11185368866533@sbi"}
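The Java simulator itself is not shown here, but a hypothetical one-off equivalent from the shell would pipe the sample event into Kafka's console producer (install path and broker address assumed as above):

```shell
EVENT='{"timestamp":"2024-08-20 22:39:20.866","upiID":"9902480505@pnb","name":"Brahma Gupta Sr.","note":" ","amount":"2779.00","currency":"INR","Latitude":"22.5348319","Longitude":"15.1863628","deviceOS":"iOS","targetApp":"GPAY","merchantTransactionId":"3619d3c01f5ad14f521b320100d46318b9","merchantUserId":"11185368866533@sbi"}'

# Publish one event to the UPIStream topic
echo "$EVENT" | /usr/local/kafka_2.13-3.8.0/bin/kafka-console-producer.sh \
  --bootstrap-server localhost:9092 --topic UPIStream
```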

Verify and Analyze Events on RisingWave

Move to the psql terminal that is already connected to the RisingWave single instance, which is consuming all the published events from the Kafka topic UPIStream and storing them in the source UPI_Transaction_Stream. On the other side, the Java simulator keeps running, continuously publishing individual events with different data to the topic UPIStream at an interval of 0.5 seconds, and subsequently each event is ingested into the RisingWave instance for further processing/analysis.

After processing/modifying the events using materialized views, I could sink or send these events back to different Kafka topics so that downstream applications can consume them for further analytics. I'll articulate this in my upcoming blog, so please stay tuned.

Since I have not done any processing, modification, or computation on the ingested events in the running RisingWave instance, I created a simple materialized view watching a few fields of the events to check whether the integration of Apache Kafka in KRaft mode with RisingWave is working perfectly fine or not. And the answer is a big YES.
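As an illustration, such a simple materialized view could look like the sketch below; the view name and the selected columns are my own choices, not from the original setup:

```sql
-- Incrementally maintained view over a few fields of the source
CREATE MATERIALIZED VIEW upi_event_watch AS
SELECT upiID, amount, currency, targetApp
FROM UPI_Transaction_Stream;

-- Query the view to confirm events are flowing in
SELECT * FROM upi_event_watch LIMIT 5;
```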

Final Note

Especially for the on-premises deployment of a multi-node Kafka cluster, Apache Kafka 3.8 is an excellent release, as we can completely bypass the ZooKeeper dependency. Besides, it is easy to set up a development environment for those who want to explore event streaming platforms like Apache Kafka. On the other hand, RisingWave functions as a streaming database that innovatively uses materialized views to power continuous analytics and data transformations for time-sensitive applications like alerting, monitoring, and trading. Ultimately, it is becoming a game-changer as Apache Kafka joins forces with RisingWave to unlock business value from real-time stream processing.

I hope you enjoyed reading this. If you found this article valuable, please consider liking and sharing it.
