Q&A Session - Spark, Flink, Cluster Sizing and more - Big Data In Real World


We hosted a webinar on Saturday, October 14, 2017, and answered some great questions that were posted by the Hadoop In Real World community and by participants in the webinar. We would like to thank everyone who joined. Here are some of the questions we covered in the session.

Questions answered

  1. How do you decide on the number of reducers?
  2. How do you deal with records that have different schemas in a dataset?
  3. What is vectorized query execution?
  4. Is there a good use case for Spark in ETL-type workloads?
  5. What is Apache Flink, and how does it change the big data ecosystem?
  6. What are the differences between Flume, Kafka streaming and Spark streaming?
  7. How do you size a cluster?
  8. What are the prerequisites for Hadoop Developer, Administrator and Tester roles?

We host webinars like this one quite often. Sign up below to get invitations to join one of our webinars.

Here is the full recording of the webinar. Enjoy!

Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real-time streaming platforms. We have seen a wide range of real-world big data problems and implemented some innovative and complex (or simple, depending on how you look at it) solutions.


Hadoop In Real World is now Big Data In Real World!
