Q&A Session – Spark, Flink, Cluster Sizing and more – Hadoop In Real World

We hosted a webinar on Saturday, October 14, 2017, and answered some great questions that were posted by the Hadoop In Real World community and by participants in the webinar. We would like to thank everyone who joined. Here are some of the questions we covered in the session.

Questions answered

  1. How to decide on the number of reducers?
  2. How to deal with records that have different schemas in a dataset?
  3. What is vectorized query execution?
  4. Is there a good use case for Spark in ETL-type workloads?
  5. What is Apache Flink and how does it change the big data ecosystem?
  6. What are the differences between Flume, Kafka streaming and Spark Streaming?
  7. How to size a cluster?
  8. What are the prerequisites for Hadoop Developer, Administrator and Tester roles?

We host webinars like this quite often. Sign up below to get invitations to join one of them.

Here is the full recording of the webinar. Enjoy!

Hadoop Team
We are a group of senior Hadoop consultants who are passionate about Hadoop and Big Data technologies. Our collective experience spans finance, retail, social media and gaming. We have worked with Hadoop clusters ranging from 100 nodes all the way to over 1000 nodes.