
2019 | Book

Scala Programming for Big Data Analytics

Get Started With Big Data Analytics Using Apache Spark

About this book

Gain the key language concepts and programming techniques of Scala in the context of big data analytics and Apache Spark. The book begins by introducing you to Scala and establishes a firm contextual understanding of why you should learn this language, how it stands in comparison to Java, and how Scala is related to Apache Spark for big data analytics. Next, you’ll set up the Scala environment ready for examining your first Scala programs. This is followed by sections on Scala fundamentals including mutable/immutable variables, the type hierarchy system, control flow expressions and code blocks.
The author discusses functions at length and highlights a number of associated concepts such as functional programming and anonymous functions. The book then delves deeper into Scala’s powerful collections system because many of Apache Spark’s APIs bear a strong resemblance to Scala collections.
Along the way you’ll see the development life cycle of a Scala program. This involves compiling and building programs using the industry-standard Scala Build Tool (SBT). You’ll cover guidelines related to dependency management using SBT as this is critical for building large Apache Spark applications. Scala Programming for Big Data Analytics concludes by demonstrating how you can make use of the concepts to write programs that run on the Apache Spark framework. These programs will provide distributed and parallel computing, which is critical for big data analytics.
What You Will Learn
See the fundamentals of Scala as a general-purpose programming language
Understand functional programming and object-oriented programming constructs in Scala
Use Scala collections and functions
Develop, package and run Apache Spark applications for big data analytics

Who This Book Is For
Data scientists, data analysts and data engineers who intend to use Apache Spark for large-scale analytics.

Table of contents

Frontmatter
Chapter 1. Scala Language
Abstract
Programming languages have been around for a very long time and have evolved significantly. Starting from the very foundational binary language, which merely consisted of 0 and 1 bits, a huge array of languages has been developed over the years to address a number of growing challenges in different contexts. All of these languages have a core purpose: to enable developers to write instructions in a way that is understandable by computers with the goal of completing a specific task. Some languages specialize in a specific set of tasks, such as developing web applications, whereas others are preferred for data science and machine learning tasks. Yet others are recommended for developing applications in the Windows operating system, and the list goes on.
Irfan Elahi
Chapter 2. Installing Scala
Abstract
Before you can use Scala to write exciting programs and make your way to excellence in Apache Spark, you need to install Scala on your system. Even if Scala isn’t installed on your system, you can open Notepad or any text editor of your choice, write Scala code/expressions, and then save the contents of that file with a .scala extension. However, that won’t help you achieve your desired goal, which is to compile the program that you wrote, run your Scala program, or package it (mostly in the form of JAR files). Not to mention that you won’t be using the facilities that come with the Scala shell (more on that later). These and many other capabilities are enabled if you have Scala installed on your system.
Irfan Elahi
Chapter 3. Using the Scala Shell
Abstract
In developer communities in nearly all parts of the world, productivity is deemed one of the critical key performance indicators (KPIs). If the experience of using a tool or language enhances a developer’s productivity, this is considered a strong plus for that tool or language. There can be many language characteristics that increase a developer’s productivity, and one of the reasons Scala is loved by many is its REPL/shell feature.
Irfan Elahi
Chapter 4. Variables
Abstract
When you write programs, you use variables and you use a lot of them. The notion of variables was introduced briefly in the previous chapter. You use them to refer to different objects that you create. For example, if you are writing a program to store the result of the mathematical expression 10+5, you will typically store it in a container/placeholder so that you can reuse it with further operations. For instance, if you created a variable that stored an integer value, you could further use it for numerical operations like addition or subtraction. In the context of programming, these placeholders/containers are called variables, and that’s what we will thoroughly explore in this chapter.
Irfan Elahi
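As an illustration of the idea in the Chapter 4 abstract, the following is a minimal sketch (not taken from the book) of storing the result of 10+5 in a variable and reusing it. The object and value names are assumptions chosen for this example; `val` and `var` are Scala's keywords for immutable and mutable variables.

```scala
object VariablesSketch {
  // val declares an immutable variable: it cannot be reassigned.
  val sum: Int = 10 + 5

  // The stored value can be reused in further numerical operations.
  val doubled: Int = sum * 2
  val reduced: Int = sum - 3

  def main(args: Array[String]): Unit = {
    // var declares a mutable variable: it can be reassigned.
    var counter = sum
    counter = counter + 1
    println(s"sum=$sum doubled=$doubled reduced=$reduced counter=$counter")
  }
}
```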
Chapter 5. Data Types
Abstract
In our daily life, we encounter data of varying nature. For instance, our names consist of letters, our mobile numbers are numbers, our decisions are usually yes (true) or no (false). To represent these and many other types, every programming language has a type system to support this notion. Using the combination of different types that come with a language’s type system, we can create variables associated with these types to implement tasks of varying nature. Like many other languages, Scala has a strong type system (which in fact is more sophisticated than many others) and there are different types available out of the box that you can use and work with.
Irfan Elahi
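The everyday examples in the Chapter 5 abstract (names, numbers, yes/no decisions) map onto built-in Scala types roughly as in the sketch below. This is an illustrative assumption rather than the book's own code, and all names and values are hypothetical.

```scala
object DataTypesSketch {
  // Letters: a name is text, so it is a String.
  val name: String = "Ada"

  // Numbers: a made-up numeric value stored as a Long.
  val mobile: Long = 5550100L

  // Yes/no decisions are Boolean values.
  val accepted: Boolean = true

  // Without an annotation, the compiler infers the type (Double here).
  val inferred = 3.14

  def main(args: Array[String]): Unit =
    println(s"$name $mobile $accepted $inferred")
}
```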
Chapter 6. Conditional Statements
Abstract
In life, we make a number of decisions at different moments. For example, if it’s raining then we will not play outside; if a customer is likely to churn, then we will use a particular marketing approach, and so on. There are many examples of conditional statements like this. Similarly, in programming, at different points we have to consider a number of decisions and, based on the results, decide the logic/flow of the program. For example, if a username exists and the password matches, the user will be able to log in; otherwise, they are denied access. If we don’t make these decisions, there can be no notion of intelligence in our programs and thus their usefulness is severely limited.
Irfan Elahi
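The login decision described in the Chapter 6 abstract could be sketched as follows. This is a hedged illustration, not the book's code: the in-memory `users` map, the `login` function, and the returned messages are all hypothetical. It relies on the fact that `if`/`else` in Scala is an expression that yields a value.

```scala
object LoginSketch {
  // Hypothetical in-memory store of usernames and passwords (illustration only).
  private val users: Map[String, String] = Map("alice" -> "s3cret")

  // if/else is an expression, so its result is the function's return value.
  def login(username: String, password: String): String =
    if (users.contains(username) && users(username) == password) "access granted"
    else "access denied"

  def main(args: Array[String]): Unit = {
    println(login("alice", "s3cret"))
    println(login("alice", "wrong"))
    println(login("bob", "s3cret"))
  }
}
```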
Chapter 7. Code Blocks
Abstract
As you progress through this book, you will come to appreciate that Scala embodies a number of constructs that you can leverage to your advantage. Such constructs help structure your code better, reduce verbosity, and improve productivity. In this chapter, we explore a feature in Scala that will help you achieve some of the aforementioned benefits.
Irfan Elahi
Chapter 8. Functions
Abstract
When you are programming, you are solving a specific problem. The nature and complexity of that problem can vary. It can be as simple as finding the square root of a number or it can be as complex as writing data cached in RAM to multiple nodes in your environment in a parallel fashion (hint: Apache Spark). In either case, you write code and expressions that help you address that problem in your program. When you are in the mindset of programming, you think about what your input is and what your output is. In the case of finding the square root of a number, the number is your input and the square root of that number is your output. Then you write statements that help you address that problem. This concept lays the foundation of functions that we will explore in detail in this chapter.
Irfan Elahi
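The input/output view of functions in the Chapter 8 abstract can be sketched with the square-root example. The names `squareRoot` and `square` are assumptions made for this illustration; `math.sqrt` comes from the Scala standard library.

```scala
object FunctionsSketch {
  // A named function: the number is the input, its square root is the output.
  def squareRoot(x: Double): Double = math.sqrt(x)

  // An anonymous function assigned to a value.
  val square: Double => Double = x => x * x

  def main(args: Array[String]): Unit = {
    println(squareRoot(16.0))
    println(square(3.0))
  }
}
```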
Chapter 9. Collections
Abstract
So far, we’ve been working with variables and data types. We have seen that every line in Scala is an expression that returns a value. You can also group multiple expressions in the form of a code block so that the result of the last expression gets returned. But in that case, only one value is returned, and the variables we’ve been working with so far have one value in them.
Irfan Elahi
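The contrast drawn in the Chapter 9 abstract, between a code block that yields a single value and a collection that holds several, might look like the sketch below. It is an illustrative assumption with hypothetical names, not an excerpt from the book.

```scala
object BlocksAndCollectionsSketch {
  // A code block groups expressions; its value is that of the last expression.
  val total: Int = {
    val a = 10
    val b = 5
    a + b
  }

  // A collection holds many values in a single variable.
  val numbers: List[Int] = List(1, 2, 3)

  // Collection methods such as map transform every element.
  val doubled: List[Int] = numbers.map(_ * 2)

  def main(args: Array[String]): Unit = {
    println(total)
    println(doubled)
  }
}
```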
Chapter 10. Loops
Abstract
Many tasks in our life require some degree of iteration or repetition. For instance, books have chapters and you start with one chapter and go until the end. If we want to add numbers from 1 to 100, we have to start at 1 and add them together until we reach the last number. If you are logging into an online web application, you may be prompted for the correct combination of username and password until you get it right. The list of such tasks can go on and on, but the gist of the matter is that if you have to model such tasks in programming languages, you will have to rely on constructs to do so.
Irfan Elahi
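The "add the numbers from 1 to 100" task in the Chapter 10 abstract can be sketched with a `for` loop over a range. The function name `sumTo` is hypothetical and the code is an illustration rather than the book's own listing.

```scala
object LoopsSketch {
  // Adds the integers from 1 to n by iterating over a range.
  def sumTo(n: Int): Int = {
    var total = 0
    for (i <- 1 to n) {
      total += i
    }
    total
  }

  def main(args: Array[String]): Unit =
    println(sumTo(100)) // prints 5050
}
```

The same result is available without an explicit loop as `(1 to 100).sum`, which is closer to the collection-oriented style used with Apache Spark.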
Chapter 11. Classes and Packages
Abstract
We all live in a world composed of different objects. Just pause and look around you. You will probably find objects like chairs, tables, laptops, televisions, and so on. If you put your programming hat on and observe them, you will find that each of these objects has two main parts:
Irfan Elahi
Chapter 12. Exception Handling
Abstract
It’s a cliché that life is unpredictable. At any point in life, the unexpected can happen and in order to survive the surprises of life, it’s important to be well prepared.
Irfan Elahi
Chapter 13. Building and Packaging
Abstract
Doesn’t it feel great that you are nearing your goal to learn Scala for Big Data analytics? With such an ambitious goal, you need to orient yourself according to the development patterns. You need to follow the development lifecycle currently practiced in the development community. You need to be exposed to another dimension in the development endeavors, which will enable you to extend your development efforts beyond Spark Shell. In a nutshell, now is the right time to pivot to topics that relate to building and packaging your Scala code.
Irfan Elahi
Chapter 14. Hello Apache Spark
Abstract
Doesn’t it feel good when you are in the vicinity of your envisioned and cherished destination? When you see in retrospect that you’ve been through a long journey and the milestone that you once dreamed of is in your reach? You must have the same feeling as you start this chapter, because this last chapter of the book is all about how you can put the concepts you’ve learned into practice and work on Big Data analytics and Apache Spark. So buckle up as it’s going to be an exciting ride!
Irfan Elahi
Backmatter
Metadata
Title
Scala Programming for Big Data Analytics
Written by
Irfan Elahi
Copyright year
2019
Publisher
Apress
Electronic ISBN
978-1-4842-4810-2
Print ISBN
978-1-4842-4809-6
DOI
https://doi.org/10.1007/978-1-4842-4810-2