Learn the fundamentals of data streaming and build your own data pipeline with three main components:
- data streaming using Apache Kafka
- data processing using Apache Spark
- writing data to Apache Solr
Duration: 16 hours / 2 days
Knowledge: Intermediate
Potential trainer(s): Kristijan Pavlović
Language: Croatian, English
Possible locations: PI EDU center, At client premises, Virtual education
Our Big Data training focuses on building a real-world project with Python and well-known open-source technologies: Apache Kafka, Apache Spark, and Apache Solr. It walks hands-on through the fundamental principles of each technology and applies best practices in software development along the way.
This training is intended for software engineers, data analysts, data scientists, and anyone with prior Python programming experience who wants to build a data pipeline covering data collection, processing, and storage.
Setting up the environment, installing the tools required for the data pipeline, and using Python programming to solve real-world issues.
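The environment setup can be sketched as a single Docker Compose file. The image names, tags, and ports below are illustrative assumptions, not the exact images used in the training:

```yaml
# Minimal single-node sketch; image tags and ports are assumptions.
services:
  kafka:
    image: apache/kafka:latest   # runs in KRaft mode, no ZooKeeper needed
    ports:
      - "9092:9092"              # broker address used by producers/consumers
  spark:
    image: apache/spark:latest
    ports:
      - "8080:8080"              # Spark master web UI
  solr:
    image: solr:latest
    ports:
      - "8983:8983"              # Solr admin UI
```

A setup like this lets all three components talk to each other on one machine while you develop the pipeline.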
Basic Apache Kafka concepts, building a Kafka Docker container, building a Kafka Topic using a graphical and console interface, and building a Kafka Producer using Python.
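The producer step can be sketched roughly as follows, assuming the `kafka-python` package and a broker on `localhost:9092` (both are assumptions, not the training's exact setup):

```python
import json


def serialize(record: dict) -> bytes:
    """Turn a Python dict into UTF-8 JSON bytes, the value format sent to Kafka."""
    return json.dumps(record).encode("utf-8")


def send_events(events, topic: str = "events", bootstrap: str = "localhost:9092") -> None:
    """Publish each event dict to the given Kafka topic.

    Assumes the kafka-python package is installed and a broker is reachable
    at `bootstrap`; topic and address are illustrative.
    """
    from kafka import KafkaProducer  # imported lazily so serialize() works without Kafka

    producer = KafkaProducer(bootstrap_servers=bootstrap, value_serializer=serialize)
    for event in events:
        producer.send(topic, event)
    producer.flush()  # block until all buffered messages are delivered
```

Keeping serialization in its own function makes the value format easy to test without a running broker.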
The fundamentals of Apache Spark, how to build Spark Docker containers, and how to use Python (PySpark) to create Spark Streaming applications.
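The Spark side can be sketched like this, assuming `pyspark` with the Kafka connector and the same local broker and topic name (all assumptions):

```python
import json


def decode_value(raw: bytes) -> dict:
    """Decode one Kafka message value (UTF-8 JSON bytes) into a dict.

    In a Spark job this logic would typically run via from_json or a UDF.
    """
    return json.loads(raw.decode("utf-8"))


def build_kafka_stream(spark, bootstrap: str = "localhost:9092", topic: str = "events"):
    """Return a streaming DataFrame reading raw records from a Kafka topic.

    `spark` is an existing SparkSession; broker address and topic name
    are illustrative assumptions.
    """
    return (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", bootstrap)
        .option("subscribe", topic)
        .load()  # columns include key, value (binary), topic, partition, offset
    )


def start_console_sink(stream_df):
    """Write the stream to the console, a common first sink while developing."""
    return stream_df.writeStream.format("console").start()
```

Printing to the console first lets you verify the pipeline end to end before adding real processing logic.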
The fundamentals of Apache Solr, the creation of Solr and Kafka Connect Docker containers, the beginning of writing data to Solr, and the viewing of written data.
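The final storage step can be sketched as follows, assuming the `pysolr` package and a Solr core named `events` (both illustrative assumptions):

```python
def with_ids(records, key: str = "sensor"):
    """Give each record the `id` field Solr's default schema expects.

    The `sensor` key used as the id source here is an assumption for
    illustration; any unique field would do.
    """
    return [{**r, "id": r[key]} for r in records]


def index_documents(docs, solr_url: str = "http://localhost:8983/solr/events") -> None:
    """Send a batch of documents to a Solr core.

    Assumes pysolr is installed and a core is reachable at `solr_url`.
    """
    import pysolr  # lazy import so with_ids() works without Solr installed

    solr = pysolr.Solr(solr_url, always_commit=True)
    solr.add(docs)  # documents become searchable after the commit
```

Once indexed, the written data can be inspected through the Solr admin UI or a simple `q=*:*` query against the core.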
This module teaches you what Apache Kafka is, what its applications are, and how to put your newfound knowledge into practice by building a Kafka Docker container, a Kafka Topic, and a Kafka Producer.
The purpose of this module is to introduce you to Apache Spark and its applications. You will also learn how to create a Spark Docker container and a Spark Streaming solution for data processing using the Python programming language.
In this module, you will learn what Apache Solr is, what its applications are, and how to use it by building Solr and Kafka Connect Docker containers, writing data to Solr, and examining the written data.
You can register for the training via the registration form at the top of this page. Once we receive your application, we will get back to you within 24 hours.
If something goes wrong in communication and you don’t hear back from us, contact us directly at birdacademy@inteligencija.com
We currently do not support payment in multiple installments.
Of course, all our trainings can be followed virtually.
Students are entitled to a 10% discount on the price shown on this page.
BIRD Academy provides a computer, lunch, coffee and water for each participant. We strive to be the best host possible in every segment!