Hadoop Platform and Application Framework on Coursera

I just finished the “Hadoop Platform and Application Framework” MOOC on Coursera. Here are several thoughts on it.

  1. My primary goal was to make sense of the various names and abbreviations in the Hadoop world (HDFS, YARN, Pig, Spark). I think I reached this goal.
  2. The course is pretty lightweight in terms of time. Each lesson took me around 1-2 hours. Only once did I have to spend around 3 hours, and that was because of an assignment and weird autograder output.
  3. The autograder in this class is not good. You can’t really tell what went wrong with your results or where to look for the error.
  4. I could easily learn everything I needed at 1.25x video speed.
  5. Chances are high that plain Bash utilities will be much faster than a Hadoop MapReduce job if you don’t have gigabytes of data.
  6. Spark is amazing. It really opens up the whole Hadoop cluster to programmers like me. You write normal functional code to process your data rather than struggling with the mapper/reducer paradigm directly.
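
To make point 6 a bit more concrete, here is a minimal word-count sketch in plain Python. It is not code from the course and it does not use the real Hadoop or Spark APIs; the function names (`mapper`, `reducer`, `word_count_mapreduce`, `word_count_functional`) are made up for illustration. The first half mimics the mapper/reducer split, with a `sorted` call standing in for the shuffle-and-sort phase; the second half is the same computation written as one functional expression.

```python
from collections import Counter
from itertools import groupby


def mapper(lines):
    """Mapper-style step: emit a (word, 1) pair for every word."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def reducer(sorted_pairs):
    """Reducer-style step: input must be sorted by word; sum counts per word."""
    for word, group in groupby(sorted_pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def word_count_mapreduce(lines):
    # sorted() is only a stand-in (assumption) for the shuffle-and-sort phase
    shuffled = sorted(mapper(lines))
    return dict(reducer(shuffled))


def word_count_functional(lines):
    # The same result as a single functional expression
    return dict(Counter(word.lower() for line in lines for word in line.split()))


if __name__ == "__main__":
    sample = ["Spark is amazing", "Hadoop is big"]
    print(word_count_mapreduce(sample))
    print(word_count_functional(sample))
```

Both functions return the same dictionary of counts. In PySpark the functional version would typically be a chain of `flatMap`, `map` and `reduceByKey` calls on an RDD, which reads much closer to the second function than to the first.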

This MOOC is definitely worth spending time on. I’m looking forward to the next course in the same specialization, “Introduction to Big Data Analytics”.
