Course: Processing Data with Apache Kafka

$209.00
$252.89 incl. vat

duration: 11 hours |

Language: English (US) |

access duration: 180 days |

Details

This course introduces you to Apache Kafka. Apache Kafka is an event streaming platform used by Fortune 100 companies. You will explore the characteristics of event streaming and how the KAFKA architecture enables scalable streaming of data. This course also focuses on integrating Python applications with a Kafka environment, implementing consumer groups and customizing Kafka configurations.

Apache Spark is a distributed data processing engine that can process petabytes of data by aggregating and distributing that data across a cluster of resources. This course covers Spark's structured streaming engine.

It also covers the following:

  • Installing Kafka and creating topics.
  • Creating brokers and a cluster of nodes to handle messages and store their replicas.
  • Connecting to Kafka from within Python.
  • Various ways to optimize Kafka's performance, using configurations for brokers and topics, and for producer and consumer apps.
  • Spark's structured streaming engine, including components such as PySpark shell.
  • Building Spark applications that process data streamed to Kafka topics using DataFrames.
  • You will discover Apache Cassandra and learn the steps required to interface Spark with this wide-column database.

Result

After completing this course, you will be familiar with Apache Kafka, Apache Spark and Apache Cassandra. You will recognize the characteristics of event streaming and know how to integrate Python applications with a Kafka environment.

Prerequisites

You have at least basic knowledge of data processing.

Target audience

Software Developer, Data analist

Content

Processing Data with Apache Kafka

11 hours

Processing Data: Getting Started with Apache Kafka

  • Apache Kafka is a popular event streaming platform used by

  • Fortune 100 companies for both real-time and batch data processing.
  • In this course, you will explore the characteristics of event
  • streaming and how Kafka architecture allows for scalable streaming
  • data. Install Kafka and create some topics, which are essentially
  • channels of communication between apps and data. Set up and work
  • with multiple topics for durable storage. Create multiple brokers
  • and cluster of nodes to handle messages and store their replicas.
  • Then, monitor the settings and logs for those brokers. Finally, see
  • how topic partitions and replicas provide redundancy and maintain
  • high availability.

Processing Data: Integrating Kafka with Python & Using Consumer Groups

  • Producers and consumers are applications that write events to

  • and read events from Kafka. In this course, you will focus on
  • integrating Python applications with a Kafka environment,
  • implementing consumer groups, and tweaking Kafka configurations.
  • Begin by connecting to Kafka from Python. You will produce to and
  • consume messages from a Kafka topic using Python. Next, discover
  • how to tweak Kafka broker configurations. You will place limits on
  • the size of messages and disable deletion of topics. Then, publish
  • messages to partitioned topics and explore the use of partitioning
  • algorithms to determine the placement of messages on partitions.
  • Explore consumer groups, which allow a set of consumers to process
  • messages published to partitioned Kafka topics in parallel -
  • without any duplication of effort. Finally, learn different ways to
  • optimize Kafka's performance, using configurations for brokers and
  • topics, as well as producer and consumer apps.

Processing Data: Introducing Apache Spark

  • Apache Spark is a powerful distributed data processing engine

  • that can handle petabytes of data by chunking that data and
  • dividing across a cluster of resources. In this course, explore
  • Spark’s structured streaming engine, including components like
  • PySpark shell. Begin by downloading and installing Apache Spark.
  • Then create a Spark cluster and run a job from the PySpark shell.
  • Monitor an application and job runs from the Spark web user
  • interface. Then, set up a streaming environment, reading and
  • manipulating the contents of files that are added to a folder in
  • real-time. Finally, run apps on both Spark standalone and local
  • modes.

Processing Data: Integrating Kafka with Apache Spark

  • Flexible and Intuitive, DataFrames are a popular data structure

  • in data analytics. In this course, build Spark applications that
  • process data streamed to Kafka topics using DataFrames. Begin by
  • setting up a simple Spark app that streams in messages from a Kafka
  • topic, processes and transforms them, and publishes them to an
  • output sink. Next, leverage the Spark DataFrame application
  • programming interface by performing selections, projections, and
  • aggregations on data streamed in from Kafka, while also exploring
  • the use of SQL queries for those transformations. Finally, you will
  • perform windowing operations - both tumbling windows, where the
  • windows do not overlap, and sliding windows, where there is some
  • overlapping of data.

Processing Data: Using Kafka with Cassandra & Confluent

  • Apache Cassandra is a trusted open-source NoSQL distributed

  • database that easily integrates with Apache Kafka as part of an ETL
  • pipeline. This course focuses on that integration of Kafka, Spark
  • and Cassandra and explores a managed version of Kafka with the
  • Confluent data streaming platform. Begin by integrating Kafka with
  • Apache Cassandra as part of an ETL pipeline involving a Spark
  • application. Discover Apache Cassandra and learn the steps involved
  • in linking Spark with this wide-column database. Next, examine the
  • various features of the Confluent platform and find out how easy it
  • is to set up and work with a Kafka environment. After completing
  • this course, you will be prepared to implement and manage steam
  • processing systems in your organization.

Course options

We offer several optional training products to enhance your learning experience. If you are planning to use our training course in preperation for an official exam then whe highly recommend using these optional training products to ensure an optimal learning experience. Sometimes there is only a practice exam or/and practice lab available.

Optional practice exam (trial exam)

To supplement this training course you may add a special practice exam. This practice exam comprises a number of trial exams which are very similar to the real exam, both in terms of form and content. This is the ultimate way to test whether you are ready for the exam. 

Optional practice lab

To supplement this training course you may add a special practice lab. You perform the tasks on real hardware and/or software applicable to your Lab. The labs are fully hosted in our cloud. The only thing you need to use our practice labs is a web browser. In the LiveLab environment you will find exercises which you can start immediatelyThe lab enviromentconsist of complete networks containing for example, clients, servers,etc. This is the ultimate way to gain extensive hands-on experience. 

WHY_ICTTRAININGEN

Via ons opleidingsconcept bespaar je tot 80% op trainingen

Start met leren wanneer je wilt. Je bepaalt zelf het gewenste tempo

Spar met medecursisten en profileer je als autoriteit in je vakgebied.

Ontvang na succesvolle afronding van je cursus het officiële certificaat van deelname van Icttrainingen.nl

Krijg inzicht in uitgebreide voortgangsinformatie van jezelf of je medewerkers

Kennis opdoen met interactieve e-learning en uitgebreide praktijkopdrachten door gecertificeerde docenten

Orderproces

Once we have processed your order and payment, we will give you access to your courses. If you still have any questions about our ordering process, please refer to the button below.

read more about the order process

What is included?

Certificate of participation Yes
Monitor Progress Yes
Award Winning E-learning Yes
Mobile ready Yes
Sharing knowledge Unlimited access to our IT professionals community
Study advice Our consultants are here for you to advice about your study career and options
Study materials Certified teachers with in depth knowledge about the subject.
Service World's best service

Platform

Na bestelling van je training krijg je toegang tot ons innovatieve leerplatform. Hier vind je al je gekochte (of gevolgde) trainingen, kan je eventueel cursisten aanmaken en krijg je toegang tot uitgebreide voortgangsinformatie.

Life Long Learning

Follow multiple courses? Read more about our Life Long Learning concept

read more

Contact us

Need training advise? Contact us!


contact