
2018 | Book

Practical Machine Learning with Python

A Problem-Solver's Guide to Building Real-World Intelligent Systems

Written by: Dipanjan Sarkar, Raghav Bali, Tushar Sharma

Publisher: Apress


About This Book

Master the essential skills needed to recognize and solve complex problems with machine learning and deep learning. Using real-world examples that leverage the popular Python machine learning ecosystem, this book is your perfect companion for learning the art and science of machine learning to become a successful practitioner. The concepts, techniques, tools, frameworks, and methodologies used in this book will teach you how to think, design, build, and execute machine learning systems and projects successfully.

Practical Machine Learning with Python follows a structured and comprehensive three-tiered approach packed with hands-on examples and code.

Part 1 focuses on understanding machine learning concepts and tools. This includes machine learning basics with a broad overview of algorithms, techniques, concepts and applications, followed by a tour of the entire Python machine learning ecosystem. Brief guides for useful machine learning tools, libraries and frameworks are also covered.

Part 2 details standard machine learning pipelines, with an emphasis on data processing, analysis, feature engineering, and modeling. You will learn how to process, wrangle, summarize and visualize data in its various forms. Feature engineering and selection methodologies will be covered in detail with real-world datasets followed by model building, tuning, interpretation and deployment.

Part 3 explores multiple real-world case studies spanning diverse domains and industries like retail, transportation, movies, music, marketing, computer vision and finance. For each case study, you will learn the application of various machine learning techniques and methods. The hands-on examples will help you become familiar with state-of-the-art machine learning tools and techniques and understand what algorithms are best suited for any problem.

Practical Machine Learning with Python will empower you to start solving your own problems with machine learning today!

What You'll Learn

Execute end-to-end machine learning projects and systems

Implement hands-on examples with industry standard, open source, robust machine learning tools and frameworks

Review case studies depicting applications of machine learning and deep learning on diverse domains and industries

Apply a wide range of machine learning models including regression, classification, and clustering

Understand and apply the latest models and methodologies from deep learning including CNNs, RNNs, LSTMs and transfer learning

Who This Book Is For

IT professionals, analysts, developers, data scientists, engineers, graduate students

Table of Contents

Frontmatter

Understanding Machine Learning

Frontmatter
Chapter 1. Machine Learning Basics
Abstract
The idea of making intelligent, sentient, and self-aware machines did not suddenly come into existence in the last few years. In fact, a lot of lore from Greek mythology talks about intelligent machines and inventions with self-awareness and intelligence of their own. The origins and evolution of the computer have been revolutionary over a period of several centuries, starting from the basic abacus and its descendant, the slide rule, in the 17th century to the first general-purpose computer designed by Charles Babbage in the 1800s. Once computers started evolving with the invention of the Analytical Engine by Babbage and the first computer program, written by Ada Lovelace in 1842, people started wondering whether there could be a time when computers or machines truly become intelligent and start thinking for themselves. The renowned computer scientist Alan Turing was highly influential in the development of theoretical computer science, algorithms, and formal languages, and addressed concepts like artificial intelligence and Machine Learning as early as the 1950s. This brief insight into the evolution of making machines learn is just to give you an idea of something that has been around for centuries but has only recently started gaining a lot of attention and focus.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 2. The Python Machine Learning Ecosystem
Abstract
In the first chapter we explored the absolute basics of Machine Learning and looked at some of the algorithms that we can use. Machine Learning is a very popular and relevant topic in the world of technology today. Hence there is diverse support for Machine Learning in terms of programming languages and frameworks. There are Machine Learning libraries for almost all popular languages including C++, R, Julia, Scala, Python, etc. In this chapter we try to justify why Python is an apt language for Machine Learning. Once we have argued our selection logically, we give you a brief introduction to the Python Machine Learning (ML) ecosystem. This Python ML ecosystem is a collection of libraries that enable developers to extract and transform data, perform data wrangling operations, apply existing robust Machine Learning algorithms, and also develop custom algorithms easily. These libraries include numpy, scipy, pandas, scikit-learn, statsmodels, tensorflow, keras, and so on. We cover several of these libraries in a nutshell so that you will have some familiarity with the basics of each of them. They will be used extensively in the later chapters of the book. An important thing to keep in mind is that the purpose of this chapter is to acquaint you with the diverse set of frameworks and libraries in the Python ML ecosystem, to give you an idea of what can be leveraged to solve Machine Learning problems. We enrich the content with useful links that you can refer to for extensive documentation and tutorials. We assume some basic proficiency with Python and programming in general. All the code snippets and examples used in this chapter are available in the GitHub repository for this book at https://github.com/dipanjanS/practical-machine-learning-with-python under the directory/folder for Chapter 2. You can refer to the Python file named python_ml_ecosystem.py for all the examples used in this chapter and try the examples as you read, or you can refer to the Jupyter notebook named The Python Machine Learning Ecosystem.ipynb for a more interactive experience.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma

The Machine Learning Pipeline

Frontmatter
Chapter 3. Processing, Wrangling, and Visualizing Data
Abstract
The world around us has changed tremendously since computers and the Internet became mainstream. With ubiquitous mobile phones and now Internet-enabled devices, the line between the digital and physical worlds is more blurred than it ever was. At the heart of all this is data. Data is at the center of everything around us, be it finance, supply chains, medical science, space exploration, or communication. It is not surprising that we have generated 90% of the world's data in just the last few years, and this is just the beginning. Data is rightly being termed the oil of the 21st century. The last couple of chapters introduced the concepts of Machine Learning and the Python ecosystem to get started. This chapter introduces the core entity upon which the Machine Learning world relies to show its magic and wonders.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 4. Feature Engineering and Selection
Abstract
Building Machine Learning systems and pipelines takes significant effort, which is evident from the knowledge you gained in the previous chapters. In the first chapter, we presented a high-level architecture for building Machine Learning pipelines. The path from data to insights and information is not an easy and direct one. It is tough and iterative in nature, requiring data scientists and analysts to repeat several steps multiple times to arrive at the best model and derive correct insights. A limitation of Machine Learning algorithms is that they can only understand numerical values as inputs. This is because, at the heart of any algorithm, we usually have multiple mathematical equations, constraints, optimizations, and computations. Hence it is almost impossible to feed raw data into an algorithm and expect results. This is where features and attributes are extremely helpful in building models on top of our data.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
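The Chapter 4 abstract notes that Machine Learning algorithms only accept numerical inputs, which is why raw attributes must be turned into features. As a minimal sketch of that idea, the following plain-Python snippet one-hot encodes a categorical attribute. The helper name one_hot_encode and the sample values are hypothetical and chosen only for illustration; they are not taken from the book, which may use different techniques and libraries.

```python
# Illustrative sketch only: the helper and sample data are hypothetical,
# not taken from the book.

def one_hot_encode(values):
    """Turn a list of categorical values into rows of 0/1 indicators."""
    categories = sorted(set(values))                 # stable column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1                        # flag the matching category
        encoded.append(row)
    return categories, encoded


wine_types = ["red", "white", "red", "rose"]
columns, features = one_hot_encode(wine_types)
print(columns)
print(features)
```

Each text label becomes a numeric vector with a single 1, so a model can consume it without implying any ordering between the categories.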
Chapter 5. Building, Tuning, and Deploying Models
Abstract
A very popular saying in the Machine Learning community is "70% of Machine Learning is data processing" and going by the structure of this book, the quote seems quite apt. In the preceding chapters, you saw how you can extract, process, and transform data to convert it to a form suitable for learning using Machine Learning algorithms. This chapter deals with the most important part of using that processed data, to learn a model that you can then use to solve real-world problems. You also learned about the CRISP-DM methodology for developing data solutions and projects—the step involving building and tuning these models is the final step in the iterative cycle of Machine Learning.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma

Real-World Case Studies

Frontmatter
Chapter 6. Analyzing Bike Sharing Trends
Abstract
"All work and no play" is a well-known proverb and we certainly do not want to be dull. So far, we have covered the theoretical concepts, frameworks, workflows, and tools required to solve Data Science problems. The use case driven theme begins with this chapter. In this section of the book, we cover a wide range of Machine Learning/Data Science concepts through real life use cases. Through this and coming chapters, we will discuss and apply concepts learned so far to solve some exciting real-world problems.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 7. Analyzing Movie Reviews Sentiment
Abstract
In this chapter, we continue with our focus on case-study oriented chapters, where we will focus on specific real-world problems and scenarios and how we can use Machine Learning to solve them. We will cover aspects pertaining to natural language processing (NLP), text analytics, and Machine Learning in this chapter. The problem at hand is sentiment analysis or opinion mining, where we want to analyze some textual documents and predict their sentiment or opinion based on the content of these documents. Sentiment analysis is perhaps one of the most popular applications of natural language processing and text analytics with a vast number of websites, books and tutorials on this subject. Typically sentiment analysis seems to work best on subjective text, where people express opinions, feelings, and their mood. From a real-world industry standpoint, sentiment analysis is widely used to analyze corporate surveys, feedback surveys, social media data, and reviews for movies, places, commodities, and many more. The idea is to analyze and understand the reactions of people toward a specific entity and take insightful actions based on their sentiment.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
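The Chapter 7 abstract describes sentiment analysis as predicting the sentiment of a document from its content. One simple way to illustrate the task is a lexicon-based scorer that counts positive and negative words. The sketch below assumes a tiny hand-made word list (POSITIVE and NEGATIVE are hypothetical) and is not the approach or code used in the book; real systems typically rely on much larger lexicons or trained classifiers.

```python
# Illustrative sketch only: the word lists are a hypothetical mini-lexicon,
# not taken from the book.
POSITIVE = {"good", "great", "excellent", "loved", "enjoyable"}
NEGATIVE = {"bad", "boring", "terrible", "hated", "dull"}


def lexicon_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    # Lowercase each whitespace-separated token and strip common punctuation.
    tokens = [tok.strip(".,!?;:\"'").lower() for tok in text.split()]
    # +1 for every positive word, -1 for every negative word.
    score = sum((tok in POSITIVE) - (tok in NEGATIVE) for tok in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("I loved this movie, the acting was excellent!"))
print(lexicon_sentiment("A boring and terrible plot."))
```

A scorer like this ignores negation and context ("not good" still counts as positive), which is one reason learned models are often preferred for subjective text.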
Chapter 8. Customer Segmentation and Effective Cross Selling
Abstract
Money makes the world go round and in the current ecosystem of data intensive business practices, it is safe to claim that data also makes the world go round. A very important skill set for data scientists is to match the technical aspects of analytics with its business value, i.e., its monetary value. This can be done in a variety of ways and is very much dependent on the type of business and the data available. In the earlier chapters, we covered problems that can be framed as business problems (leveraging the CRISP-DM model) and linked to revenue generation. In this chapter we will directly focus on two very important problems that can directly have a positive impact on the revenue streams of businesses and establishments particularly from the retail domain. This chapter is also unique in the way that we address a different paradigm of Machine Learning algorithm altogether, focusing more on tasks pertaining to pattern recognition and unsupervised learning.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 9. Analyzing Wine Types and Quality
Abstract
In the last chapter, we looked at specific case studies leveraging unsupervised Machine Learning techniques like clustering and rule-mining frameworks. In this chapter, we focus on some more case studies relevant to supervised Machine Learning algorithms and predictive analytics. We have looked at classification based problems in Chapter 7, where we built sentiment classifiers based on text reviews to predict the sentiment of movie reviews. In this chapter, the problem at hand is to analyze, model, and predict the type and quality of wine using physicochemical attributes. Wine is a pleasant tasting alcoholic beverage, loved by millions across the globe. Indeed many of us love to celebrate our achievements or even unwind at the end of a tough day with a glass of wine! The following quote from Francis Bacon should whet your appetite about wine and its significance.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 10. Analyzing Music Trends and Recommendations
Abstract
Recommendation engines are probably the most well-known Machine Learning applications and one of the most recognized forms of Machine Learning solutions. People outside the Machine Learning community often assume that recommendation engines are its only use. Although we know that Machine Learning covers a vast space in which recommendation engines are just one of the candidates, there is no denying their popularity. One of the reasons for this popularity is their ubiquitous nature; anyone who is online, in any way, has been in touch with a recommendation engine. They are used for recommending products on ecommerce sites, travel destinations on travel portals, songs/videos on streaming sites, restaurants on food aggregator portals, etc. The long list underlines their universal application.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 11. Forecasting Stock and Commodity Prices
Abstract
In the chapters so far, we covered a variety of concepts and solved diverse real-world problems. In this chapter, we will dive into forecast/prediction use cases. Predictive analytics or modeling involves concepts from data mining, advanced statistics, Machine Learning, and so on to model historical data to forecast future events. Predictive modeling has use cases across domains such as financial services, healthcare, telecommunications, etc.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Chapter 12. Deep Learning for Computer Vision
Abstract
Deep Learning is not just a buzzword in industry and academia alike; it has thrown open a whole new field of possibilities. Deep Learning models are being employed in all sorts of use cases and domains, some of which we saw in the previous chapters. Artificial neural networks have tremendous potential to learn complex non-linear functions, patterns, and representations, and their power is driving research in multiple fields, including computer vision, audio-visual analysis, chatbots, and natural language understanding, to name a few. In this chapter, we touch on some of the advanced areas in the field of computer vision, which have recently come into prominence with the advent of Deep Learning. This includes real-world applications like image categorization and classification and the very popular concept of artistic image style transfer. Computer vision is all about the art and science of making machines understand high-level useful patterns and representations from images and video so that they are able to make intelligent decisions similar to what a human would do upon observing their surroundings. Building on core concepts like convolutional neural networks and transfer learning, this chapter provides you with a glimpse into the forefront of Deep Learning research with several real-world case studies from computer vision.
Dipanjan Sarkar, Raghav Bali, Tushar Sharma
Backmatter
Metadata
Title
Practical Machine Learning with Python
Written by
Dipanjan Sarkar
Raghav Bali
Tushar Sharma
Copyright year
2018
Publisher
Apress
Electronic ISBN
978-1-4842-3207-1
Print ISBN
978-1-4842-3206-4
DOI
https://doi.org/10.1007/978-1-4842-3207-1
