Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When should we not use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How do we rule Kafka out when it’s not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the do’s and don’ts.
Disclaimer: This blog post shares a lightboard video that explains when NOT to use Apache Kafka.
What Is Apache Kafka, and What Is It Not?
Kafka is often misunderstood. For instance, I still hear far too often that Kafka is a message queue. Part of the reason is that some vendors pitch it only for a specific problem (such as data ingestion into a data lake or data warehouse) to sell their products. So, in short:
Kafka Is…
- A scalable real-time messaging platform to process millions of messages per second.
- A data streaming platform for massive volumes of big data analytics and small volumes of transactional data processing.
- A distributed storage layer that provides true decoupling for backpressure handling, support for various communication protocols, and replayability of events with guaranteed ordering.
- A data integration framework (Kafka Connect) for streaming ETL.
- A data processing framework (Kafka Streams) for continuous stateless or stateful stream processing.
This combination of characteristics in a single platform makes Kafka unique (and successful).
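To make the storage point concrete, here is a minimal sketch of why a log differs from a message queue. It is a toy in-memory model in plain Python, not Kafka’s client API: the names `ToyLog`, `ToyConsumer`, `poll`, and `seek_to_beginning` are assumptions chosen for illustration. Records with the same key land in the same partition, reading never deletes anything, and each consumer tracks its own offset, so it can fall behind or replay without affecting the producer.

```python
from collections import defaultdict
from zlib import crc32


class ToyLog:
    """In-memory stand-in for a partitioned, append-only log (not Kafka's API)."""

    def __init__(self, num_partitions=3):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # The same key always maps to the same partition,
        # so the order of records per key is preserved.
        p = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)

    def read(self, partition, offset, max_records=10):
        # Reading does not remove records; the reader picks the start offset.
        return self.partitions[partition][offset:offset + max_records]


class ToyConsumer:
    """Tracks its own offset per partition and pulls at its own pace."""

    def __init__(self, log):
        self.log = log
        self.offsets = defaultdict(int)

    def poll(self, max_records=10):
        records = []
        for p in range(len(self.log.partitions)):
            batch = self.log.read(p, self.offsets[p], max_records)
            self.offsets[p] += len(batch)
            records.extend(batch)
        return records

    def seek_to_beginning(self):
        # Replay: forget the stored offsets and read the same events again.
        self.offsets.clear()


log = ToyLog()
for i in range(5):
    log.append("sensor-a", i)
    log.append("sensor-b", i * 10)

consumer = ToyConsumer(log)
first_pass = consumer.poll(max_records=100)
consumer.seek_to_beginning()
replayed = consumer.poll(max_records=100)
```

In this sketch, `replayed` contains the same records in the same order as `first_pass`, which is the property a classic queue that deletes messages on consumption cannot offer.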
Kafka Is Not…
- A proxy for millions of clients (like mobile apps) – but Kafka-native proxies (like REST or MQTT) exist for some use cases.
- An API management platform – but these tools are usually complementary and used for the creation, life cycle management, or monetization of Kafka APIs.
- A database for complex queries and batch analytics workloads – but good enough for transactional queries and relatively simple aggregations (especially with ksqlDB).
- An IoT platform with features such as device management – but direct Kafka-native integration with (some) IoT protocols such as MQTT or OPC-UA is possible and the right approach for (some) use cases.
- A technology for hard real-time applications such as safety-critical or deterministic systems – but that’s true for any other IT framework, too. Embedded systems are a different kind of software!
For these reasons, Kafka is complementary, not competitive, to these other technologies. Choose the right tools for the job and combine them!
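As a rough illustration of what a “relatively simple aggregation” looks like, the sketch below counts events per key in fixed one-minute windows. It is plain Python rather than ksqlDB or Kafka Streams code, and the function name `tumbling_window_counts` and the sample click events are assumptions made up for this example. Continuous computations of this shape suit a streaming platform; ad hoc multi-table joins and large batch scans are where a database or data warehouse takes over.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per (window_start, key) in fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, key in events:
        # Align each timestamp to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical (epoch_seconds, page) click events.
clicks = [(0, "home"), (15, "home"), (42, "cart"), (61, "home"), (119, "cart")]
result = tumbling_window_counts(clicks)
```

Here the window starting at second 0 holds two `home` clicks and one `cart` click, and the window starting at second 60 holds one of each.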
Lightboard Video: When Not To Use Apache Kafka
The following video explores the key concepts of Apache Kafka. Afterward, the do’s and don’ts of Kafka show how to complement data streaming with other technologies for analytics, APIs, IoT, and other scenarios.
Data Streaming Vendors and Cloud Services
The research company Forrester defines data streaming platforms as a new software category in a new Forrester Wave. Apache Kafka is the de facto standard used by over 100,000 organizations.
Plenty of vendors offer Kafka platforms and cloud services. Many complementary open-source stream processing frameworks like Apache Flink, along with related cloud offerings, have emerged. Competing technologies like Pulsar, Redpanda, or WarpStream try to gain market share by leveraging the Kafka protocol.
Apache Kafka Is a Data Streaming Platform: Combine It With Other Platforms When Needed!
By now, over 150,000 organizations use Apache Kafka. The Kafka protocol is the de facto standard for many open-source frameworks, commercial products, and serverless cloud SaaS offerings.
However, Kafka is not an all-rounder for every use case. Many projects combine Kafka with other technologies, such as databases, data lakes, data warehouses, IoT platforms, and so on. Additionally, Apache Flink is becoming the de facto standard for stream processing (but Kafka Streams is not going away and is the better choice for specific use cases).
Where do you (not) use Apache Kafka? What other technologies do you combine Kafka with? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.