This chapter examines the tension between efficiency and usability in deep learning, highlighting the trade-offs between deferred and eager execution styles. It introduces Hybridize Functions, an open-source tool developed to automatically refactor imperative Deep Learning code for improved performance. The tool integrates static analyses from WALA and Ariadne into a PyDev Eclipse IDE plug-in, enabling developers to transform Python functions for graph execution safely and efficiently. The chapter covers the technical challenges faced in building the tool, including the integration of tensor type inference and the modernization of Ariadne to support contemporary Python constructs and TensorFlow 2 APIs. It also presents an evaluation of the tool on 19 different Python Deep Learning projects, showing an average speedup of 2.16 with negligible semantic differences. The chapter concludes with a discussion of future work, including the potential incorporation of advanced container-based analyses and automatic code splitting to further increase hybridization opportunities.
AI-Generated
This summary of the chapter content was generated with the help of AI.
Abstract
Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code—supporting symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, imperative DL frameworks encouraging eager execution have emerged but at the expense of run-time performance. Though hybrid approaches aim for the “best of both worlds,” using them effectively requires subtle considerations to make code amenable to safe, accurate, and efficient graph execution—avoiding performance bottlenecks and semantically inequivalent results. We discuss the engineering aspects of a refactoring tool that automatically determines when it is safe and potentially advantageous to migrate imperative DL code to graph execution and vice-versa.
1 Introduction
Fig. 1.
Screenshot of the Hybridize Functions tool.
Machine Learning (ML), including Deep Learning (DL), systems are pervasive, and—as datasets grow—efficiency becomes essential to support responsiveness [53]. Efficient DL frameworks have traditionally embraced a deferred execution-style that supports symbolic, graph-based Deep Neural Network (DNN) computation [10, 21]. While scalable, development is error-prone, cumbersome, and difficult to debug [25, 26, 51, 52]. Contrarily, more natural, less error-prone, and easier-to-debug imperative DL frameworks [3, 12, 40] encouraging eager execution have emerged. They are, however, less efficient and scalable than their deferred-execution counterparts [10, 18, 20, 29, 37, 40]. Thus, hybrid approaches [4, 18, 37] execute imperative DL programs as static graphs at run-time. For example, in TensorFlow [1], AutoGraph [37] can enhance run-time performance by decorating (annotating) appropriate Python function(s) with tf.function (Fig. 1).
Though promising, hybrid approaches require non-trivial metadata [29] and exhibit limitations and known issues [19] with native program constructs. Subtle considerations are required to make code amenable to safe, accurate, and efficient graph execution [5, 7‐9]. Alternative approaches [29, 34, 44] may impose custom Python interpreters or require additional or concurrently running components, which may be impractical for industry, support only specific Python constructs, or still require function decoration. Thus, developers are burdened with manually specifying the functions to be converted. Advances in DL are likely to be futile if they cannot be effectively used. Manual analysis and refactoring (semantics-preserving, source-to-source transformation) to achieve optimal results can be overwhelming, error- and omission-prone [13] and is further complicated by the increasing amount of Object-Orientation (OO) in DL code [12] and dynamically-typed languages (e.g., Python), where concrete runtime information is sparse at development time.
In this paper, we report on the design and implementation of a fully automated, publicly available, and open-source refactoring tool named Hybridize Functions [24] that transforms otherwise eagerly-executed imperative (Python) DL code for enhanced performance. The tool—implemented as a PyDev [50] Eclipse [16] Integrated Development Environment (IDE) plug-in that integrates static analyses from WALA [45] and Ariadne [15]—assists developers in specifying whether such code could be reliably and efficiently executed as graphs at run-time. Although it works on Python code, the tool nevertheless utilizes the Java Development Tools (JDT) [17] refactoring infrastructure [6] with a UI, preview pane, and refactoring unit tests. The approach at the tool's foundation is based on a novel tensor (matrix-like data structure) analysis specifically for imperative DL code that infers when it is safe and potentially advantageous to migrate imperative DL code to graph execution or, conversely, to eagerly execute code already running as graphs.
Our tool interprocedurally identifies—at the project-level—Python functions that can execute more efficiently as hybrid functions and those that may be hindered by hybrid execution. It also discovers potential side-effects in Python functions so that functions can be safely transformed to execute either eagerly or in hybrid mode. Though the refactorings operate on imperative DL code that is easier to debug than its deferred-execution counterparts, they themselves do not improve debuggability but instead enable performant yet easily-debuggable (imperative) DL code.
The tool was evaluated on 19 Python imperative DL programs of varying size and domain with a total of 132.05 K lines of code, where we found that 42.56% of candidate functions were refactorable, with an observed average speedup of 2.16 during performance testing. Due to its popularity and extensive analysis by previous work [11, 23, 25, 27, 35, 38, 51, 52], we focus on hybridization in TensorFlow. In this paper, we discuss engineering challenges we faced in implementing the tool used in the study. We make the following specific contributions:
Implementation and motivation. Our tool’s novel engineering aspects are detailed with a focus on its integration of tensor type inference at the instruction-based IR level with a Python development IDE plug-in. Also, architecture, API usage, data representations, algorithms, implementation issues, and a more comprehensive motivation are outlined.
Modernization engineering. We detail engineering aspects of our modernization effort of Ariadne in adding new enhancements, including new Python language features and additional library modeling.
2 Motivation
We present examples that highlight some of the challenges associated with analyzing and refactoring imperative DL code to be executed as graphs at run-time with improved efficiency. Listing 2 portrays TensorFlow imperative (OO) DL code representing a modestly-sized model for classifying images. By default, this code runs eagerly; however, it may be possible to enhance performance by executing it as a graph at run-time. Listing 2, lines 1 and 15 display the refactoring with the imperative DL code executed as a graph at run-time (added code is underlined). AutoGraph [37] is now used to potentially improve performance by decorating the function with tf.function. At run-time, the function's execution will be "traced" and an equivalent graph will be generated [19]. In this case, a speedup (\(\nicefrac{runtime_{old}}{runtime_{new}}\)) of \(\sim\)9.22 ensues [31]. Though promising, using hybridization reliably and efficiently is challenging [7‐9, 19, 29]. If used incorrectly, hybridization may yield programs that result in unexpected run-time behavior. For instance, side-effect-producing, native Python statements are problematic for tf.function-decorated functions [19]. Because their executions are traced, a function's behavior is "etched" (frozen) into its corresponding graph and thus can have unexpected results.
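The tracing pitfall described above can be illustrated with a toy, framework-free sketch. The hypothetical trace_once decorator below is not TensorFlow's actual mechanism; it merely runs a function's Python body once to record its operations and replays the record thereafter, so a native side effect such as print executes only during tracing:

```python
def trace_once(fn):
    """Toy tracer: run fn's Python body once to record its ops, then
    replay the recorded ops on later calls (side effects are 'etched')."""
    cache = {}

    def wrapper(x):
        if "ops" not in cache:
            ops = []

            class Tracer:  # stands in for a symbolic tensor
                def __init__(self, val):
                    self.val = val

                def __mul__(self, k):
                    ops.append(lambda v, k=k: v * k)  # record the op
                    return Tracer(self.val * k)

            fn(Tracer(x))      # tracing: the Python body actually runs
            cache["ops"] = ops
        value = x
        for op in cache["ops"]:  # replay the recorded "graph"
            value = op(value)
        return value

    return wrapper


@trace_once
def double(t):
    print("inside double")  # native side effect: runs only while tracing
    return t * 2


double(3)  # first call traces, so the print executes
double(5)  # later calls replay the recorded ops; no print
```

The second call returns the right numeric result, but the print statement is silently skipped, which mirrors the "unexpected results" the text warns about.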
3 Implementation
3.1 Architecture and Dependencies
Fig. 2.
Overall architecture.
Figure 2 shows the overall architecture of our tool and the dependencies it builds upon. Hybridize Functions [24] is implemented as a publicly available, open-source PyDev [50] Eclipse [16] IDE plug-in and built upon the WALA [45] Ariadne [15] analysis framework. Eclipse is leveraged for its existing, well-documented, and integrated refactoring framework and test engine [6], including its transformation APIs, refactoring preview pane (Fig. 1), precondition checking, and refactoring testing.
A challenge in building the tool was reworking much of the existing Java (JDT) refactoring tooling to work with Python. For example, we re-engineered a common framework for building Java refactoring tools [32] to factor out the refactoring components that are language-agnostic from those specific to Java. In Hybridize Functions, we then extend the language-agnostic parts to work with Python. This approach allowed us to reuse much of the existing Java refactoring tooling, which is well-tested and robust, and to focus only on the parts that are specific to Python. Although PyDev is—itself—an Eclipse plug-in that provides Python development support within Eclipse—including refactoring—it does not use the native Eclipse refactoring APIs. Instead, we use PyDev for other features, as subsequently described.
PyDev is used for its efficient program entity indexing and extensive refactoring support [6], and because it is completely open-source for all Python development. Atop PyDev, we built a fully-qualified name (FQN) lookup that uses the indexing—ideal for large projects—and use it to resolve decorator names. PyDev uses Jython for Python parsing (Fig. 2).
WALA is used for static analyses, such as ModRef, upon which we built our side-effect analysis. Ariadne, which depends on WALA, is used for its Python and tensor analysis, including type inference and (TensorFlow) library modeling. The tensor type inference is used to determine which functions should be hybridized, as those with tensor parameters are ideal for tracing, as demonstrated in §2. The library modeling allows us to simulate the static analysis of large libraries like TensorFlow without analyzing their source code. The modeling also allows us to track tensor values accepted and returned by library APIs. Though TensorFlow includes type hints that could also be used to track tensor values flowing into functions, they do not assist with the graph construction. For transformation, PyDev ASTs with source symbol bindings are used as an intermediate representation (IR), while the static analysis consumes a Static Single Assignment (SSA) [43] IR.
Both PyDev and Ariadne use Jython for generating Python ASTs (Fig. 2). Thus, there is some redundancy in AST generation; however, the ASTs are consumed for different purposes. Future work may involve decoupling the Python ASTs from both tools to have a single intermediate representation. There are some representation differences between the ASTs produced by Ariadne and those produced by PyDev that complicate AST matching. For example, Ariadne considers type hints part of a parameter expression, while PyDev does not.
3.2 Static Analysis Integration
Though both PyDev and Ariadne initially represent the Python source code as ASTs, as PyDev is an IDE plug-in with refactoring support, such a representation is necessary to perform transformations on the source code. Ariadne, on the other hand, transforms the Python ASTs to "CAsts" (Common Abstract Syntax Trees), which are part of WALA, its underlying framework. CAst is meant to represent multiple languages using a single AST and is commonly used by the JavaScript tooling supported by WALA. The CAst is then transformed to an SSA IR, which is the typical input form for advanced static analyses. Thus, in our plug-in, because our input is a PyDev project (in AST form) and because we eventually transform the source code (i.e., the AST), a mapping mechanism is necessary to correlate the original ASTs with the WALA-produced SSA IR. In other words, we need to correlate the results obtained by the static analysis with the input source code so that we know which elements to transform and which not to transform. To do this, we adopt a mechanism similar to that of Khatchadourian et al. [33]. Specifically, we approximate the original source location of the resulting IR using various attributes. The matching is non-trivial, as the SSA (3-operand address format, similar to assembly or bytecode) is very different from the Python source code. We match the file names, functions, and line numbers, and also discover whether the SSA element is a parameter (for function candidate selection). By finding the parameter number in the IR, we then match it with the parameter expression in the original AST. A key difference from Khatchadourian et al. [33] is that their approach analyzes Java bytecode, making it more difficult to correlate the results with the original source code. In our case, we have the original source code in CAst form, which makes the correlation easier, as CAst can store the original source positions. Such correlation is possible in bytecode analysis as well, but the line numbers are approximated.
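The location-based matching described above might look as follows in outline. The record shapes (IRValue, AstParam) and their field names are hypothetical, invented for illustration; they are not the tool's actual data structures:

```python
# Hypothetical sketch: correlate an SSA-level analysis result back to a
# source AST element by matching file, enclosing function, line number,
# and (for parameters) the parameter index.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IRValue:                      # an SSA value flagged by the analysis
    filename: str
    function: str
    line: int
    param_index: Optional[int]      # set when the value is a parameter


@dataclass(frozen=True)
class AstParam:                     # a parameter expression in the source AST
    filename: str
    function: str
    line: int
    index: int


def match(ir: IRValue, params: list[AstParam]) -> Optional[AstParam]:
    """Approximate the AST element for an IR value."""
    for p in params:
        if (p.filename == ir.filename and p.function == ir.function
                and p.line == ir.line
                and (ir.param_index is None or p.index == ir.param_index)):
            return p
    return None


param = AstParam("model.py", "call", 12, 0)
flagged = IRValue("model.py", "call", 12, 0)
assert match(flagged, [param]) is param
```

The real matching must additionally cope with SSA instructions whose source positions are only approximate, which is why several attributes are combined.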
Although Ariadne has Python parsing capabilities, we integrated it with PyDev due to the latter's excellent and long-lived refactoring support for Python, including a refactoring preview pane, element GUI selection, and refactoring undo history. In other words, Ariadne, like WALA, is a program analysis framework and thus does not have the AST manipulation capabilities traditionally found in IDEs. PyDev—an IDE plug-in—can transform Python ASTs. An alternative is to use the analysis results from Ariadne to guide refactoring recommendations that are then sent to the IDE via the Language Server Protocol (LSP) [36]. Then, an IDE receiving the LSP messages would be responsible for executing the refactoring. However, refactoring support in LSP is currently pending [28, 46].
3.3 Modernizing Ariadne
Prior to our integration, Ariadne only worked on TensorFlow 1 code, i.e., deferred-execution-style DL code. Specifically, it lacked analysis summaries for APIs typically used by TensorFlow 2 clients, i.e., imperative DL code, and did not support Python constructs commonly used in this paradigm. We augmented Ariadne to analyze imperative Deep Learning (Python) code by vastly expanding the XML summaries to support a wide variety of popular TensorFlow 2 APIs. We also added support for Python module packages [41], wild card imports, intra-package references (relative imports) [42], package initialization scripts, automatic discovery of unit test entry points, iteration of non-scalar tensor datasets [22], modeling of additional and popular libraries [2, 39], and analyzing static and class methods, custom decorators, and callable objects (functors), which are heavily used by Keras models. We have contributed these enhancements back to the open-source Ariadne project [47].
To implement wild card imports, we use a queue that inserts only at the beginning. When we detect a wild card reference, the wild card import seen last in the Python file is considered first. This resolves the situation where multiple libraries export the same name.
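A minimal sketch of that ordering policy, assuming (as described) that the wild card import seen last should shadow earlier ones; the class and method names here are invented:

```python
# Sketch: each wild card import is pushed onto the FRONT of a deque, so
# resolving a name front-to-back makes the most recent import win.
from collections import deque


class WildcardResolver:
    def __init__(self):
        self._modules = deque()  # most recently seen wild card first

    def add_wildcard(self, exported_names: dict[str, object]):
        """Record the exports of a 'from m import *' statement."""
        self._modules.appendleft(exported_names)

    def resolve(self, name: str):
        for exports in self._modules:  # last-seen import is checked first
            if name in exports:
                return exports[name]
        raise NameError(name)


r = WildcardResolver()
r.add_wildcard({"sqrt": "math.sqrt"})    # earlier wild card import
r.add_wildcard({"sqrt": "numpy.sqrt"})   # later one shadows it
assert r.resolve("sqrt") == "numpy.sqrt"
```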
We further enable Ariadne to process code in Python module packages [41], i.e., input code spread out among multiple files and directories. Although the original analysis is interprocedural, it did not originally support module packages, i.e., local Python modules spanning complex directory structures. We implement this enhancement using a module search path variable that is optionally used as input to our analysis. This variable is a sequence of file system paths where the analysis should look for modules and resembles the search path variable used by the Python interpreter (a similar variable is used in PyDev [50]). We modified Ariadne such that, when it finds a Python module residing in a path contained in the search path, it adjusts the call graph node identifier so that other modules may find it through import statements.
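The search-path lookup can be sketched as follows, mirroring the Python interpreter's convention that a module may be either a file name.py or a package directory containing __init__.py; find_module is a hypothetical name, not Ariadne's API:

```python
# Sketch: resolve a module name against an ordered sequence of search
# paths, returning the first matching source file.
from pathlib import Path
from typing import Optional


def find_module(name: str, search_paths: list[str]) -> Optional[Path]:
    """Check each root, in order, for name.py or name/__init__.py."""
    for root in search_paths:
        for candidate in (Path(root) / f"{name}.py",
                          Path(root) / name / "__init__.py"):
            if candidate.is_file():
                return candidate
    return None
```

Earlier entries in the path sequence take precedence, which matches the first-hit-wins behavior of the interpreter's own module search.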
Fig. 3.
Snippet of package initialization code (__init__.py) [30].
Related to packages, we also enhanced Ariadne to support package initialization scripts. To denote a (sub)package, typically, an (empty) __init__.py file is placed in the (sub)package directory. The Python interpreter then treats any scripts in the directory as part of the package. However, this file may also contain package initialization code. A common idiom is to include import statements here so that the scripts within the package can be more easily referenced by clients using only the package name. For example, the package initialization code in Fig. 3 enables clients to import the BERT model more simply via the package name. Without the initialization code, the imported name would refer to the containing module instead of the class representing the BERT model defined within it.
To achieve this, we break down two cases: one for explicit imports and one for wild cards (the latter is shown in Fig. 3). For the first case, we add to the SSA IR a field to the globally exported value representing the module (shown in Lst. 1.2). The client SSA IR then references the name from the module as if the module itself contained the code declaring the name, as opposed to importing it, as shown on line 6 in Lst. 1.2. The wild card (second) case is more challenging and requires manipulating the pointer analysis, as we did for general wild card imports described above, but with more flexibility in discovering the location of the instance, i.e., we add a two-step jump to the instance from the client code. Essentially, we substitute the package's initialization script for the current script when we detect that the package is being imported.
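The re-export idiom that these cases model is plain Python and can be demonstrated with a throwaway package built on disk (all package, module, and class names here are invented for the demonstration):

```python
# Build a tiny package whose __init__.py re-exports a class, then show
# that clients can reference the class via the package name alone.
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "demo_models"
pkg.mkdir()
(pkg / "bert.py").write_text("class Bert:\n    pass\n")
# Package initialization script with the re-export idiom:
(pkg / "__init__.py").write_text("from .bert import Bert\n")

sys.path.insert(0, str(root))
demo_models = importlib.import_module("demo_models")

# Thanks to __init__.py, clients write demo_models.Bert rather than
# demo_models.bert.Bert -- and it is the very same class object:
assert demo_models.Bert is importlib.import_module("demo_models.bert").Bert
```

Note that the __init__.py line is itself a relative import, the other Python feature discussed in this section.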
As mentioned earlier, we also add support for intra-package references (relative imports) [42], which were popular in our subject set. These may also take the form of from .module import name, where module is a module in the specified relative package and name is a name defined in module, e.g., a function, class, or variable.
We add the ability to analyze static and class methods to Ariadne. We do so by adding a class metadata variable to the constructor call and the trampoline in the SSA IR. The trampoline for class methods passes the class rather than an object instance. Complicating the matter somewhat is that a class method may be invoked on an instance on the LHS, in which case the class must first be extracted.
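The semantic distinction the trampoline must encode is observable in plain Python: a class method receives the class even when invoked on an instance, so handling an instance on the LHS requires first extracting its class:

```python
# Class methods receive the class; static methods receive neither the
# class nor an instance. Both may nevertheless be invoked on an instance.
class Model:
    @classmethod
    def kind(cls):
        return cls.__name__   # cls is the class, not an instance

    @staticmethod
    def version():
        return "1.0"          # no implicit receiver at all


m = Model()
# Invoked on the class or on an instance, a class method gets the class:
assert Model.kind() == m.kind() == "Model"
# Extracting the class from an instance on the LHS, as a trampoline would:
assert type(m).kind() == "Model"
assert Model.version() == m.version() == "1.0"
```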
3.4 Transformation
To transform a function to hybrid, we add the tf.function decorator to the definition of the function. However, we first compute the correct prefix to use by analyzing the import statements in the file. In Python, import statements can reside anywhere in the file and may be scoped to certain blocks. Moreover, import statements can be repeated, with the closest import taking precedence over preceding ones. Lastly, imports can be arbitrarily aliased, e.g., import tensorflow, import tensorflow as tf, and from tensorflow import function are all valid ways to import the TensorFlow library. The inserted decorator then depends on the import in effect. The aforementioned examples would result in the decorators @tensorflow.function, @tf.function, and @function, respectively.
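A simplified sketch of deriving the decorator spelling from the imports in a file, using Python's standard ast module; it ignores block scoping and simply takes the last matching top-level import as the one in effect (decorator_for is a hypothetical helper, not the tool's API):

```python
# Sketch: map how TensorFlow is imported to the decorator text to insert.
import ast
from typing import Optional


def decorator_for(source: str) -> Optional[str]:
    """Return the decorator spelling implied by the last matching import."""
    result = None
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "tensorflow":
                    # Aliased imports yield an aliased prefix.
                    result = f"@{alias.asname or alias.name}.function"
        elif isinstance(node, ast.ImportFrom) and node.module == "tensorflow":
            for alias in node.names:
                if alias.name == "function":
                    result = f"@{alias.asname or alias.name}"
    return result


assert decorator_for("import tensorflow") == "@tensorflow.function"
assert decorator_for("import tensorflow as tf") == "@tf.function"
assert decorator_for("from tensorflow import function") == "@function"
```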
To transform a function to eager, we remove the tf.function decorator by first finding it in the potential list of decorators for the function in question. As in the previous case, finding the decorator depends on the import statements. Thanks to PyDev's indexing, however, this turns out to be an easier case; we simply look up the decorator expression in PyDev's database to see whether it resolves to tf.function. Note that a simple text search may result in incorrect removals; we encountered instances in our subject set where tf.function did not refer to anything in TensorFlow but rather to a custom entity, perhaps for mocking.
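The false-positive risk of a plain text search can be sketched with the standard ast module: resolve what the name tf is actually bound to at the top level before deciding the decorator belongs to TensorFlow (resolves_to_tensorflow is a hypothetical helper, far simpler than PyDev's index):

```python
# Sketch: determine whether the name `tf` is bound by a genuine
# `import tensorflow as tf` at the end of the top-level scope, rather
# than by an assignment (e.g., a mock) or an import of another library.
import ast


def resolves_to_tensorflow(source: str, name: str = "tf") -> bool:
    binding = None
    for node in ast.parse(source).body:  # top-level statements, in order
        if isinstance(node, ast.Import):
            for alias in node.names:
                if (alias.asname or alias.name) == name:
                    binding = alias.name      # track what the name imports
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name) and target.id == name:
                    binding = None            # rebound, e.g., to a mock
    return binding == "tensorflow"


assert resolves_to_tensorflow("import tensorflow as tf")
# A text search for "tf.function" would wrongly fire on both of these:
assert not resolves_to_tensorflow("import numpy as tf")
assert not resolves_to_tensorflow("import tensorflow as tf\ntf = object()")
```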
4 Evaluation Summary
We applied our approach to 19 open-source Python imperative DL programs of varying size and domain, ranging from 0.12 to 36.72 thousand source lines of code. Our tool considered 766 Python functions, automatically refactoring 42.56% despite being highly conservative. During a run-time performance evaluation, we measured an average relative model training speedup of 2.16 (a memory consumption measurement is pending). Python is a complex language with many dynamic features; thus, our tool may not be sound in all cases. To gauge the extent to which our tool produces correct results, we also measured model accuracy and loss before and after refactoring and found negligible differences. Our results suggest that our tool can nevertheless improve model training speed without introducing significant semantic differences. This is most likely due to our tool's conservativeness, practitioners not favoring highly complex features [49], and Ariadne supporting some dynamic features like callbacks. While it is possible for our tool to hybridize an incompatible function, the negligible differences in model accuracy and loss further suggest that these situations were avoided on our subjects. The improved speedup also suggests that the tool does not introduce retracing, although additional testing is ongoing. Code readability could also be impacted; however, our tool only adds or removes a single function decorator.
5 Conclusion & Future Work
Our automated refactoring tool, Hybridize Functions, assists developers with writing optimal imperative DL Python code. It is open-source and available as a PyDev Eclipse plug-in. The tool integrates Eclipse-based refactoring with the Python static analyses offered by WALA and Ariadne, which we expanded for modern versions of TensorFlow and the modern Python constructs commonly used in imperative DL programs. Nineteen Python DL projects totaling 132.05 K lines of code were used in the tool's assessment, and a speedup of 2.16 was observed on the refactored code. In the future, we will explore incorporating advanced container-based analyses [14, 48] and automatically splitting DL code into more functions to increase hybridization opportunities.
Acknowledgments
This material is based upon work supported by the National Science Foundation under Award Nos. CCF 2200343, CNS 2213763, and CCF 2343750.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.