2021 | Book

Python Programming for Data Analysis

Author: Dr. José Unpingco

Publisher: Springer International Publishing

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"

About this book

This textbook grew out of notes for the ECE143 Programming for Data Analysis class that the author has been teaching at the University of California, San Diego, which is a requirement for both graduate and undergraduate degrees in Machine Learning and Data Science. This book is ideal for readers with some Python programming experience. The book covers key language concepts that must be understood to program effectively, especially for data analysis applications. Certain low-level language features are discussed in detail, especially Python memory management and data structures. Using Python effectively means taking advantage of its vast ecosystem. The book discusses Python package management and how to use third-party modules as well as how to structure your own Python modules. The section on object-oriented programming explains features of the language that facilitate common programming patterns.

After developing the key Python language features, the book moves on to third-party modules that are foundational for effective data analysis, starting with Numpy. The book develops key Numpy concepts and discusses internal Numpy array data structures and memory usage. Then, the author moves on to Pandas and details its many features for data processing and alignment. Because strong visualizations are important for communicating data analysis, key modules such as Matplotlib are developed in detail, along with web-based options such as Bokeh, Holoviews, Altair, and Plotly.

The text is sprinkled with many tricks of the trade that help avoid common pitfalls. The author explains the internal logic embodied in the Python language so that readers can get into the Python mindset and make better design choices in their code, which is especially helpful for newcomers to both Python and data analysis.

To get the most out of this book, open a Python interpreter and type along with the many code samples.

Table of Contents

Frontmatter

Chapter 1. Basic Programming

Abstract

Understanding the internal logic of the Python language makes it easier to use effectively. We provide the motivations and development for key data structures such as lists and dictionaries as well as looping structures, decorators, and generators. We detail how memory is used with these data structures and provide an in-depth breakdown of the internals of Python functions. Python asynchronous programming via asyncio is discussed, as are methods for debugging and logging code.

José Unpingco
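
As a rough illustration of two of the topics named in the Chapter 1 abstract, decorators and generators, here is a minimal sketch using only the standard library. It is not taken from the book; the names `count_calls`, `square`, and `running_total` are hypothetical and chosen only for this example.

```python
import functools


def count_calls(func):
    """Decorator (hypothetical example) that counts calls to func."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0  # functions are objects, so they can carry attributes
    return wrapper


@count_calls
def square(x):
    return x * x


def running_total(values):
    """Generator that lazily yields the cumulative sum of values."""
    total = 0
    for value in values:
        total += value
        yield total  # execution pauses here until the next item is requested


squares = [square(n) for n in range(4)]
totals = list(running_total(squares))
print(squares, totals, square.calls)
```

The decorator wraps `square` without changing its body, and the generator produces each running total only when the consumer asks for it, so no intermediate list is built inside `running_total`.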

Chapter 2. Object-Oriented Programming

Abstract

Python is an object-oriented language but with many features implemented by convention instead of by the language itself. This leads to more flexibility in object-oriented design. We discuss and develop examples using Python multiple inheritance and break down the individual elements of Python class design including class functions and static methods. Metaprogramming techniques such as monkey-patching are developed alongside abstract base classes. Some common design patterns implemented using Python are also discussed.

José Unpingco
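
The Chapter 2 abstract mentions abstract base classes and monkey-patching. The following is a small sketch of both ideas using the standard `abc` module, not the book's own example; the classes `Shape` and `Square` and the function `describe` are hypothetical names introduced for illustration.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract base class: subclasses must implement area()."""

    @abstractmethod
    def area(self):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def describe(self):
    """Plain function that will be attached to Shape after the fact."""
    return f"{type(self).__name__} with area {self.area()}"


# Monkey-patching: add a method to an existing class at runtime.
# Every subclass instance picks it up through normal attribute lookup.
Shape.describe = describe

sq = Square(3)
print(sq.describe())

# The abstract base class itself cannot be instantiated.
try:
    Shape()
    abstract_blocked = False
except TypeError:
    abstract_blocked = True
```

Monkey-patching is convenient for quick extensions, but because it changes a class for every user of that class, it is usually best kept to narrow, well-documented cases.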

Chapter 3. Using Modules

Abstract

Python comes with an amazing standard library as well as a lively community of third-party modules. Thus, using modules effectively is key to good Python programming. This section develops the methods and strategies for using both the standard library and third-party modules, as well as recommendations for creating virtual environments for code development and deployment. Both the pip and conda package managers are discussed.

José Unpingco

Chapter 4. Numpy

Abstract

Numpy numerical arrays are the foundation of all data science and machine learning in Python. This section develops the Numpy array data structure in detail, especially memory management. Slicing, reshaping, and stacking arrays are developed in detail. Using the wide variety of universal functions to accelerate numerical computations is discussed, as is broadcasting, arguably the most powerful feature of Numpy arrays. Managing numerical types using Numpy dtypes is key to accelerating computations and using memory effectively. Numpy includes a powerful linear algebra library that is also discussed.

José Unpingco
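
To make the broadcasting, slicing, and dtype points in the Chapter 4 abstract concrete, here is a short sketch that assumes Numpy is installed. The array contents and variable names are illustrative assumptions and are not drawn from the book.

```python
import numpy as np

# A small 3x2 array of 64-bit floats: [[0, 1], [2, 3], [4, 5]]
data = np.arange(6, dtype=np.float64).reshape(3, 2)

# Broadcasting: the shape-(2,) vector of column means is stretched
# across all three rows without an explicit loop or copy of the means.
col_means = data.mean(axis=0)
centered = data - col_means

# Basic slicing returns a view onto the same memory, not a copy.
first_column = data[:, 0]
shares = np.shares_memory(data, first_column)

# Choosing a smaller dtype halves the memory footprint here.
small = data.astype(np.float32)

print(centered)
print(shares, data.nbytes, small.nbytes)
```

Because `first_column` is a view, writing to it would also change `data`; call `.copy()` on the slice when an independent array is needed.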

Chapter 5. Pandas

Abstract

Pandas is a powerful data processing library that makes complicated data transformations almost automatic. This chapter develops the key data structures of Pandas, the Series and DataFrame, as well as how to use them effectively. Pandas categorical objects allow for efficient memory usage. Like Numpy, Pandas also supports broadcasting. Understanding the Pandas MultiIndex object helps with slicing and aligning multidimensional data. Pandas provides an extension framework for customizing the visual display of DataFrames, which shortens code by adding new code to the DataFrame itself. Pandas supports methods such as rolling and filling, which are very important for longitudinal time-series analysis.

José Unpingco

Chapter 6. Visualizing Data

Abstract

Data visualization is key for presenting analysis results as well as for debugging code. Matplotlib is developed in detail, as are web-based visualization alternatives such as Bokeh, Altair, Holoviews, and Plotly. The Seaborn statistical visualization module, which is built on top of Matplotlib, is developed in detail.

José Unpingco

Backmatter

Title: Python Programming for Data Analysis
Author: Dr. José Unpingco
Publisher: Springer International Publishing
Electronic ISBN: 978-3-030-68952-0
Print ISBN: 978-3-030-68951-3
DOI: https://doi.org/10.1007/978-3-030-68952-0
