Saturday, July 13, 2019

Basic Programming Guide to Begin with Apache Spark


When you are planning to learn Apache Spark, the first question that comes to mind is:

"How much programming should I know to begin with Apache Spark?"

Database developers are increasingly drawn to big data, and they are usually more comfortable writing SQL or PL/SQL than Python, Scala, or Java. Many people assume that mastering a programming language is the critical prerequisite for learning Spark and big data, and they end up spending a lot of time and enthusiasm on the strenuous intricacies of coding.


It’s obvious that the more programming you learn, the better a developer you will become. This article, however, covers only the bare minimum: how much programming, and which concepts, you should know to get started with Apache Spark. I will mainly cover two programming languages, Python and Scala.

These are the topics you must understand before getting hands-on with Spark:

1. Variables
2. Conditional statements
3. Loops
4. Functions/procedures
5. Exceptions
6. Data structures
7. Lambda functions
8. Creating/importing modules and jars (in Scala)
9. Classes and objects
10. Some commonly used built-in and library functions (e.g. eval(), range(), exec(), len(), rand(), datetime())
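
To give a feel for how small this set of topics really is, here is a minimal Python sketch that touches most of them in one place. It is only an illustration and does not use Spark itself. The names in it (orders, classify, Order, threshold) are made up for this example, so treat it as a rough picture rather than a recommended structure.

```python
# Illustrative sketch only: orders, classify, Order and threshold are made-up names.
import math                      # importing a standard-library module
from datetime import datetime    # importing a single name from a module

# Variables
threshold = 100.0

# Data structures: a list of dictionaries, loosely similar to rows of a table
orders = [
    {"id": 1, "amount": 250.0},
    {"id": 2, "amount": 40.0},
    {"id": 3, "amount": "bad-value"},
]

# Function + conditional statement + exception handling
def classify(amount):
    """Return 'large' or 'small', or 'invalid' if amount is not numeric."""
    try:
        value = float(amount)
    except (TypeError, ValueError):
        return "invalid"
    if value >= threshold:
        return "large"
    else:
        return "small"

# Loop
labels = []
for order in orders:
    labels.append(classify(order["amount"]))

# Lambda functions passed to filter() and map()
valid = list(filter(lambda o: classify(o["amount"]) != "invalid", orders))
doubled = list(map(lambda o: o["amount"] * 2, valid))

# Class and object
class Order:
    def __init__(self, order_id, amount):
        self.order_id = order_id
        self.amount = amount

    def rounded_up(self):
        # Uses the imported math module
        return math.ceil(self.amount)

first = Order(1, 250.4)

# A few built-in and library functions
count = len(orders)
indexes = list(range(count))
created_at = datetime(2019, 7, 13)

print(labels, doubled, first.rounded_up(), count, indexes, created_at.year)
```

The lambda part is worth a second look: passing a small function to map or filter is the same style you will use when you start transforming data in Spark.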


Apart from these, a little understanding of Python modules and Scala/Java jars is also required. To begin with Apache Spark, you only need a basic understanding of the topics above, whichever language you prefer. Once you have covered them, you are good to start.

If you are completely new to programming, I would suggest you go with Python, as it is easier to pick up and has a gentler learning curve. I will prepare separate tutorials for Python and Scala to cover the topics mentioned above.

All the best, and get ready to explore the lightning-fast data processing power of Spark.


To learn more on Spark, click here. Let us know your views or feedback on Facebook or Twitter @BigdataDiscuss.
