1 Introduction
FMDiff, the supporting tool, to extract a larger corpus of data covering more than twenty architecture-specific feature models applied for over sixteen releases of the Linux kernel, from release 2.6.39 until release 3.14. We use the collected data to draw lessons about the evolution of the Linux kernel.
FMDiff to compare the evolution of those different models and answer the following research question: RQ2: To what extent does a feature change affect all architecture-specific feature models of the Linux kernel? Our data show that the different architecture feature models follow very different evolution paths and that between 10 and 50 % of feature changes affect all architectures depending on the release. This suggests that extrapolation of observations done on the evolution of one architecture-specific feature model should be conducted with care, and points to a potential caveat in the Linux development process.
FMDiff is introduced and evaluated in Sect. 4. We illustrate the capability of our tool in Sect. 5 by answering our two research questions. We reflect on the use of FMDiff and fine-grained feature changes in the context of the evolution of highly variable systems and product lines in Sect. 6. Section 7 presents related work. Finally, we conclude this paper and elaborate on potential future applications of FMDiff in Sect. 8.
2 Background: the Linux kernel variability model
Kconfig language. In this section, we present general information regarding the Kconfig language, the Linux kernel that we used as a case study, and the model transformation we perform on the Linux feature model before analysis.
2.1 The Kconfig language
Kconfig language.
Kconfig language, features have at least a name (following the config keyword on line 3) and a type. The type attribute specifies what kind of values can be associated with a feature. A feature of type Boolean can either be selected (with value y for ‘yes’) or not selected (with value n for ‘no’). Tristate features have a second selected state (m for ‘module’), implying that the features are selected and are meant to be added to the kernel in the form of a loadable kernel module. Finally, features can be of type integer (int or hex) or type string. In our example, the ACPI_AC feature is of type tristate (line 4). Features can also have default values, in our example the feature is selected by default (y on line 5), provided that the condition following the if keyword is satisfied. The text following the type on line 4 is the prompt attribute. It defines whether the feature is visible in the configuration tools during the configuration process. The absence of such text means the feature is not visible.
depends (or depends on) statement followed by an expression of features (see line 6). If the expression is satisfied, the feature becomes selectable. The second one, expressing reverse-dependencies, is declared by the select statement. If the feature is selected, then the target of the select will be selected as well (POWER_SUPPLY is the target of the select statement on line 7). The select statement may be conditional. In such cases, an if statement is appended. depends, select and constrained default statements are used to specify the cross-tree constraints of the Linux kernel FM. A feature can have any number of such statements.
Kconfig provides the means to express constraints on sets of features, such as the if statement shown on line 1. This statement implies that all features declared inside the if block depend on the ACPI feature. This is equivalent to adding a depends ACPI statement to every feature declared within the if block. Another possibility is to use choices. Such a statement provides constructs similar to “alternative” (1 of) and “or” feature constraints (1 or more of) found in the FODA feature modelling notation [18]. A choice itself can also be subjected to constraints and have dependencies expressed using a depends statement.
Kconfig offers the possibility to define a feature hierarchy using menus and menuconfigs. Those objects are used to express logical grouping of features and organize the presentation of features in the kernel configurator. The configurator may also rely on the dependencies declared between features to create the displayed hierarchy. Constraints defined on menus and menuconfigs are applicable to all elements within. Menus can have the “visible” attribute, associated with a Boolean expression of features, complementing the “prompt” attribute. More details about the Kconfig language can be found in the official documentation.2
2.2 The Linux kernel
Menuconfig (among other tools), the kernel configurator. This tool displays available configuration options in the form of a tree, and as the user selects or unselects options, the tree is updated to show only options that are compatible with the current selection.
2.3 Feature model representation
Undertaker, to translate Kconfig features into an easier to process format [43]. This tool has been used in the past for similar purposes. Undertaker uses it to reformat the Kconfig model before using it to determine feature presence conditions. It produces a set of “.rsf” files, containing annotated triplets formatted according to the “Rigi Standard Format” [40]. Each file contains an architecture-specific FM, i.e. an instance of the Linux FM where the choice of hardware architecture is predetermined.
Undertaker.
ACPI_AC and type tristate. The second line declares a prompt attribute for feature ACPI_AC and its value is set to true (1). The third line declares the default value of the ACPI_AC feature, which is set to y if the expression X86 && ACPI evaluates to true. Line 4 adds a select statement reading: when ACPI_AC is selected, the feature POWER_SUPPLY is selected as well, if the expression X86 && ACPI evaluates to true. Finally, the last line adds a cross-tree constraint reading: feature ACPI_AC is selectable (depends) only if X86 && ACPI evaluates to true.
Undertaker eases feature extraction but modifies their declaration. Among the applied modifications, two are most important for our approach: first, Undertaker flattens the feature hierarchy and then resolves the depends statements of features. Concerning the flattening of the hierarchy, Undertaker modifies the depends statement of each feature to mirror the effects of its hierarchy. For instance, Undertaker propagates surrounding if conditions to the depends statements of all features contained in the if-block. This explains the addition of ACPI to the condition of the depends statement on line 5 of Listing 2. Concerning the resolution of depends statements, Undertaker propagates conditions expressed in the depends statement of a feature to its default and select conditions. This explains the condition X86 && ACPI that has been added to the select (ItemSelects) and default value (Default) statements. Such transformations will influence the results of the comparison process and the interpretation of the captured changes. However, it has to be noted that the changes preserve the Kconfig semantics as described in [33].
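The two transformations can be sketched on the ACPI_AC example. This is an illustration under stated assumptions, not Undertaker's implementation: the `Feature` class, `conj` and `flatten` are hypothetical names, the original default and select statements are assumed to be unconditional, and expressions are joined by plain string concatenation (a real tool would need to parenthesise expressions containing `||`).

```python
from dataclasses import dataclass, field


def conj(*exprs):
    """Join the non-empty expressions with &&, skipping duplicates."""
    parts = []
    for e in exprs:
        if e and e not in parts:
            parts.append(e)
    return " && ".join(parts)


@dataclass
class Feature:
    name: str
    depends: str = ""
    defaults: list = field(default_factory=list)  # (value, condition) pairs
    selects: list = field(default_factory=list)   # (target, condition) pairs


def flatten(feature, enclosing_if=""):
    """Mimic the two transformations described in the text."""
    # 1. Hierarchy flattening: the surrounding if-condition joins depends.
    depends = conj(feature.depends, enclosing_if)
    # 2. Depends resolution: depends is pushed into default/select conditions.
    defaults = [(v, conj(c, depends)) for v, c in feature.defaults]
    selects = [(t, conj(c, depends)) for t, c in feature.selects]
    return Feature(feature.name, depends, defaults, selects)


acpi_ac = Feature("ACPI_AC", depends="X86",
                  defaults=[("y", "")], selects=[("POWER_SUPPLY", "")])
flat = flatten(acpi_ac, enclosing_if="ACPI")
print(flat.depends)   # X86 && ACPI
print(flat.defaults)  # [('y', 'X86 && ACPI')]
print(flat.selects)   # [('POWER_SUPPLY', 'X86 && ACPI')]
```

A consequence visible in the sketch is that one edit to the enclosing `if` condition changes three statements of the flattened feature, which is why the transformation influences the captured changes.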
3 Change classification
default statement and the change operation applied, i.e. addition (ADD), removal (REM) or modification (MOD). Figure 1 depicts our change classification scheme.
ADD_FEATURE, REM_FEATURE and MOD_FEATURE. In the following, we abbreviate lower-level change types by prefixing the feature property that can change with the three change operations ADD, REM, and MOD.
{ADD, REM, MOD}_ATTR, {ADD, REM, MOD}_DEPENDS, {ADD, REM, MOD}_DEF_VAL and {ADD, REM, MOD}_SELECT.
- Attribute change types: we track changes occurring on the type and prompt attributes. Combined with the three possible operations, we have {ADD, REM, MOD}_TYPE and {ADD, REM, MOD}_PROMPT.
- Depends statement change types: depends statements contain a Boolean expression of features. We use a set of change types describing changes occurring in that expression, namely {ADD, REM, MOD}_DEPENDS_EXP. In addition, we further detail these changes by recording the addition and removal of feature references (mentions of feature names) in the Boolean expression with the two change types {ADD, REM}_DEPENDS_REF.
- Default statement change types: default statements are composed of a default value and a condition. Both the condition and the value can be Boolean expressions of features. Default values can be either added or removed, recorded as {ADD, REM}_DEF_VAL change types. Changes in the default statement condition are stored as {ADD, REM, MOD}_DEF_VAL_COND. Finally, we track feature references changes in the default value using {ADD, REM}_DEF_VAL_REF and in the default value condition using change types {ADD, REM}_DEF_VAL_COND_REF.
- Select statement change types: select statements are composed of a target and a condition which, if satisfied, will trigger the selection of the target feature. Similar to the default statement change types, we record {ADD, REM, MOD}_SELECT_TARGET changes. Changes to the select condition are recorded as {ADD, REM, MOD}_SELECT_COND. Finally, to track changes in feature references inside a select condition, we use the {ADD, REM}_SELECT_REF change types.
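As an illustration of how expression-level and reference-level change types relate, the following sketch derives them for a depends statement from its old and new expression. It is a minimal example under assumptions: the tokenisation is a naive regular expression, and the `references` and `depends_changes` helpers and the `TYPE(feature)` output notation are hypothetical, not FMDiff's actual code or storage format.

```python
import re

KEYWORDS = {"y", "m", "n"}  # tristate constants are not feature references


def references(expr):
    """Extract the feature names mentioned in a Boolean expression."""
    tokens = re.findall(r"[A-Za-z0-9_]+", expr)
    return {t for t in tokens if t not in KEYWORDS}


def depends_changes(old, new):
    """Classify a depends change at expression and reference level."""
    changes = []
    if not old and new:
        changes.append("ADD_DEPENDS_EXP")
    elif old and not new:
        changes.append("REM_DEPENDS_EXP")
    elif old != new:
        changes.append("MOD_DEPENDS_EXP")
    old_refs, new_refs = references(old), references(new)
    # References are either present or absent, so only ADD and REM apply.
    changes += [f"ADD_DEPENDS_REF({r})" for r in sorted(new_refs - old_refs)]
    changes += [f"REM_DEPENDS_REF({r})" for r in sorted(old_refs - new_refs)]
    return changes


print(depends_changes("X86", "X86 && ACPI"))
```

The same set-difference idea would apply to references in select and default statements, with the corresponding change type names.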
Kconfig language. Note that feature references contained in depends statements, select statements and default value statements can only be added or removed, as a reference is either present or not. This leaves us with seven entities on which three operations are possible and three for which we will consider only two, for a total of twenty-seven change types.
MOD_FEATURE and the sub-category MOD_DEF_VAL, since the feature and default value declaration already existed, and finally the ADD_DEF_VAL_COND change type denoting the addition of a condition to the default value statement, and an ADD_DEF_VAL_REF change type for each of the features referenced in the added default value condition.
Kconfig provides several additional capabilities, namely menus to organize the presentation of features in the Linux kernel configurator tool, the range attribute on features and options such as env, defconfig_list or modules. We do not keep track of menu changes, but we do capture the dependencies induced by menus. Undertaker propagates feature dependencies of menus to the features a menu contains in the same way it propagates if block constraints. Undertaker does not export the range attribute of features; therefore, we cannot keep track of changes on this attribute and do not include them in our feature change classification scheme. We plan to address this issue in our future work. Furthermore, Undertaker does not export options such as env, defconfig_list or modules, and we cannot track changes in such statements. But, because those options are not properties of features and do not change their characteristics, we consider the loss of this information as negligible when studying FM evolution.
REM_FEATURE denoting that the feature declaration was removed. Some combinations are also constrained by Kconfig: for example, the change type ADD_TYPE can only occur in the context of a feature creation, i.e. with the change category ADD_FEATURE.
merge feature or move feature. Such changes can be viewed as a combination of simple changes described by our change classification. A merge operation would then result in the deletion of a feature and probably changes in the constraints of another one. The semantics of the change operation is lost (we cannot know that it was a merge operation), but its effect on the FM itself is captured in the form of a set of change types.
4 FMDiff
FMDiff. We then compare feature changes captured by FMDiff and changes observed in the original model. This allows us to evaluate the consistency of the changes captured with our approach and verify that FMDiff provides more information than textual differencing.
4.1 FMDiff overview
FMDiff is to automate the extraction of changes occurring on the Linux FM and classify those changes according to the scheme presented in the previous section. The extraction of feature changes is performed in several steps as depicted in Fig. 2.
4.1.1 Feature model extraction
Undertaker tool to extract architecture-specific FMs for each version. Undertaker outputs one “.rsf” file per architecture per version, in the format described in Sect. 2.
FMDiff. The rsf triplets contain Kconfig choice structures, which are not always named in the Kconfig files. They are automatically renamed by Undertaker (e.g. CHOICE_32), guaranteeing the consistency of the rsf representation. Because the naming process is automatic and does not depend on the content of the choice or its attributes, the same choice structure can be renamed differently in different versions. As a consequence, we cannot rely on naming to identify uniquely and reliably evolving choice structures. For those reasons, we ignore all choices when reconstructing the feature model from “.rsf” files. Note that the hierarchy constraints imposed by the choices are still reported on the relevant features during the hierarchy flattening process. However, we do lose information regarding mutually exclusive features.
choice, referring to them by their generated name. We replace all choice identifiers in feature statements by CHOICE. Doing this, we cannot trace the evolution of choice structures, but we prevent polluting the results with changes in the choice name generation order while still being able to track changes in feature dependencies on choices.
4.1.2 FMDiff feature model reconstruction
FMDiff compares FMs that are instances of the meta-model shown in Fig. 3.
FeatureModel represents the root element having two attributes denoting the architecture and the version of the FM. A FeatureModel contains any number of features represented as Feature. Each feature has a name, type (Boolean, tristate, integer, etc.) and prompt attribute. In addition, each feature contains a Depends attribute representing the depends statements of a Kconfig feature declaration. All features referenced by the depends statement are stored in a collection of feature names, called DependsReferences.
Default Statements, containing a default value and its associated condition. Furthermore, a feature can have any number of Select Statements containing a select target and a condition. The condition of both statements is recorded as a string by the attribute Condition. The features referenced by the condition of each statement are stored in the collection DefaultValueReferences or Select References respectively.
depends statements, but in our meta-model, we allow features to have only one. In the case where FMDiff finds more than one for a single feature, it concatenates those statements using a logical AND operator. This preserves the Kconfig semantics associated with multiple depends statements.
OR operator if their respective default values are the same. We do the same transformation for select statement conditions, for the same reasons.
4.1.3 Comparing models
FMDiff builds upon the EMF Compare framework.4 EMF Compare is part of the Eclipse Modelling Framework (EMF) and provides a customizable “diff” engine to compare models. It is used to compare models in various domains, like interface history extraction [31], or IT services modelling [13], and is flexible and efficient. EMF Compare takes as input a meta-model, in our case the meta-model shown in Fig. 3, and two instances of that meta-model each representing one version of an architecture-specific Linux FM. EMF Compare outputs the list of differences between them.
FMDiff feature meta-model. For instance, a difference can be an “addition” of a string in the DependsReferences attribute of a feature. Another example is the “change” of the Condition attribute of a Select Statement element, in which case EMF Compare gives us the old and new attribute value.
4.1.4 Classifying changes
FeatureModel object to identify which features have been added and removed, giving us the feature change category. Then, we focus on differences in “contains” relationships on each Feature to extract changes occurring at a statement level, providing us with the change sub-category. The differences in attribute values of the various properties are then analysed to determine the change type. Finally, changes are regrouped by feature name, creating for each feature change the three-level classification.
4.2 Evaluating FMDiff
FMDiff’s value lies in its ability to accurately capture changes occurring on the Linux feature model (consistency) and its ability to provide information that would be otherwise difficult to obtain (interestingness). To evaluate FMDiff with respect to those two aspects, we compare it with the information on changes that we obtained by manually analysing the textual differences between two versions of Kconfig files. We consider FMDiff data to be consistent if it contains all changes seen in Kconfig files, and its data interesting if it provides more information than what can be obtained using textual differences. We start by describing the data set used for the evaluation and then assess them separately.
4.2.1 Data set
4.2.2 Consistency
FMDiff data set. Changes not meeting this criterion would be signs of inconsistencies between the two representations of the same changes. To evaluate the consistency of the captured changes, we verify that a set of feature changes observed in Kconfig files are also recorded by FMDiff.
FMDiff captures feature changes per architecture, we first determine in which architecture(s) those feature changes are visible. Then, we compare the Kconfig file diffs with the feature changes captured by FMDiff for one of those architectures. We pick architectures in such a way that all architectures are used during the experiment.
FMDiff data (1) matches the Kconfig modification if it contains the description of all feature changes, including attribute and value changes; (2) partially matches if FMDiff records a change of a feature but that change differs from what we found out by manually analysing the Kconfig files; (3) mismatches if the change is not captured by FMDiff.
FMDiff misses changes; hence, the more full matches, the more consistent FMDiff data are. We also take into account that renamed features will be seen in FMDiff as “added” and “removed”.
FMDiff data, described by 121 records of our database. A single partial match was recorded, caused by an incomplete “.rsf” file. A default value statement (def_bool_y) was not translated by Undertaker in any of the architecture-specific “.rsf” files. In two cases, the FMDiff changes did not match the Kconfig feature changes. In both cases, developers removed one declaration of a feature that was declared twice, with different default values, in different Kconfig files. In FMDiff, a change in the feature default value was recorded, which is consistent with the effect of the deletion on the architecture-specific FM. Based on this, we argue that FMDiff accurately described this change.
FMDiff did capture all the changes occurring in “.rsf” files. Moreover, a large majority (94 %) of Kconfig file changes were reflected in FMDiff’s data. In the remaining cases, FMDiff still captures accurately the effects of Kconfig file changes on the Linux FM. We conclude, based on our sample, that the data set obtained with FMDiff is consistent with respect to the changes occurring on the Linux FM.
4.2.3 Interestingness
FMDiff to evaluate the interestingness of the collected data. We will consider that FMDiff provides “interesting” information for developers and maintainers if it makes available information otherwise difficult to obtain.
FMDiff data set to the Kconfig file modifications that caused them. For each change, we determine the set of Kconfig files of both versions of the Linux FM that contain the modified feature. We then perform the textual diff on these files and manually analyse the changes. If the diff cannot explain the feature change recorded by FMDiff, we move up the Kconfig file hierarchy and analyse the textual differences of files that include this file via the source statement.
FMDiff changes and Kconfig file changes can either (1) match if the change can be traced to a modification of a feature in a Kconfig file; (2) indirectly match if the change can be explained by a Kconfig file change, but the feature or attribute seen as modified in the Kconfig file is not the same as the one observed in FMDiff data; or finally, (3) mismatch if it cannot be traced to a Kconfig file change.
FMDiff change is the result of Undertaker propagating dependency changes onto other feature attributes or onto its subfeatures (e.g. when a depends statement is modified on a parent feature). Here, indirect matches indicate that FMDiff captures side effects of changes made on Kconfig files, which are more difficult to observe using textual differences.
Undertaker’s output. We obtained 26 matches, 79 indirect matches, and finally 2 features were renamed and those changes were successfully captured as deletion and creation of a new feature. Among the indirect matches, 61 are due to hierarchy expansion and 18 due to depends statement expansion on other attributes.
FMDiff could not be directly linked to feature changes in Kconfig files but to changes in the feature hierarchy or other feature attributes. We argue that even if FMDiff data do not always reflect the actual modifications performed by developers in Kconfig files, they capture the effect of the changes on the Linux FM. In fact, those 79 indirect matches indicate that FMDiff data contain more information than what can be obtained from the textual differences between two versions of the same Kconfig file, where such effects need to be reconstructed manually.
5 Using FMDiff to understand feature changes in the Linux kernel feature model
FMDiff captures changes occurring on features of the Linux kernel and stores each individual change in a database. Thanks to this format, we can easily query the gathered information to study the evolution of the kernel feature model (FM) over time. We use this information to identify the most common change operations performed on features and study the pervasiveness of feature changes across the multiple architecture-specific FMs of the kernel, and to answer the research questions raised in the introduction.
5.1 High-level view of the Linux FM evolution
FMDiff data. RQ1: What are the most common operations performed on features in the Linux kernel feature model?
FMDiff captures: addition, removal and modification of features. We use our database to query, for a given architecture, features that were changed during a specific release. Listing 3 shows an example of such a query, giving us the number of features modified during release 3.0 for a single architecture. We compute, for sixteen releases, the total number of changed features and the number of modified, added and removed features in each architecture-specific FM, using only the first level of our change classification. To obtain an overview of the changes occurring in each release, we average the number of modified, added and removed features per architecture.
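A query of this kind, and the per-architecture averaging, can be sketched with Python's built-in sqlite3 module. Everything specific here is an assumption made for illustration: the `feature_change` table, its column names, and the toy rows are hypothetical, since the actual FMDiff database schema is not given in this text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per changed feature, release and architecture.
conn.execute("""CREATE TABLE feature_change (
    kernel_release TEXT, architecture TEXT, feature TEXT, category TEXT)""")
rows = [  # made-up sample data, not real Linux measurements
    ("3.0", "x86", "ACPI_AC", "MOD_FEATURE"),
    ("3.0", "x86", "FOO", "ADD_FEATURE"),
    ("3.0", "arm", "ACPI_AC", "MOD_FEATURE"),
    ("3.0", "arm", "BAR", "MOD_FEATURE"),
    ("3.0", "arm", "BAZ", "REM_FEATURE"),
]
conn.executemany("INSERT INTO feature_change VALUES (?, ?, ?, ?)", rows)


def count_changed(release, architecture, category):
    """Number of distinct features with the given first-level change."""
    cur = conn.execute(
        """SELECT COUNT(DISTINCT feature) FROM feature_change
           WHERE kernel_release = ? AND architecture = ? AND category = ?""",
        (release, architecture, category))
    return cur.fetchone()[0]


def average_per_architecture(release, category):
    """Average of the per-architecture counts for one release."""
    archs = [r[0] for r in conn.execute(
        "SELECT DISTINCT architecture FROM feature_change "
        "WHERE kernel_release = ?", (release,))]
    return sum(count_changed(release, a, category) for a in archs) / len(archs)


print(count_changed("3.0", "x86", "MOD_FEATURE"))      # 1
print(average_per_architecture("3.0", "MOD_FEATURE"))  # 1.5
```

One limitation of this sketch is that an architecture with no recorded change in a release does not contribute a zero to the average; a real analysis would iterate over the known list of architectures instead.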
5.2 Evolution of architecture-specific FMs
5.2.1 Motivation
FMDiff in the form of the removal of ACPI_POWER_METER and the addition of SENSORS_ACPI_POWER. Using our database, we can observe that the removal of ACPI_POWER_METER only affected two architectures: x86 and IA64. However, the addition of SENSORS_ACPI_POWER can be seen in x86, IA64 and ARM. Given the commit message, it is unclear whether this was the expected outcome or not. The change does not seem to have been reverted since then.
FMDiff data, we can observe that in release 3.0, the depends statement and select condition attributes of these features were modified in the X86 FM (adding references to the X86 feature) as a result of a change in the feature’s hierarchy. However, it is, for instance, also seen as added in ARM and other architecture-specific FMs.
“Untested as I don’t have a cross-compiler.” 8
“We have only tested these patchset on x86 platforms, and have done basic compilation tests using cross-compilers from ftp.kernel.org. That means some code may not pass compilation on some architectures.” 9
“I didn’t compile-test any of it, I don’t do the cross-compile thing, and maybe I missed something.” 10
5.2.2 Methodology
FMDiff database. Then, we isolate unique feature names from that set. We obtain a first list of feature names (marked as “1” in Fig. 5). We split that set into two: features that are seen as changed in FMDiff data in all architecture-specific FMs, and those that are seen changed in only some architectures. This gives us the feature sets marked as “2.1” and “2.2” in Fig. 5.
FMDiff data set.
5.2.3 Experimental setup
FMDiff database. The scripts are available in our code repository.11
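The split between features changed in all architecture-specific FMs and those changed in only some of them, described in the methodology, can be sketched as follows. This is a simplified illustration rather than the authors' scripts: the `share_affecting_all` helper and the sample data are hypothetical, and the sketch does not check whether a feature exists in every architecture-specific FM.

```python
def share_affecting_all(changes, architectures):
    """Percentage of changed features that are changed in every architecture.

    changes: iterable of (architecture, feature) pairs for one release.
    """
    archs_by_feature = {}
    for arch, feature in changes:
        archs_by_feature.setdefault(feature, set()).add(arch)
    if not archs_by_feature:
        return 0.0
    # Set "2.1": features seen as changed in all architecture-specific FMs.
    everywhere = [f for f, archs in archs_by_feature.items()
                  if archs == set(architectures)]
    return 100.0 * len(everywhere) / len(archs_by_feature)


architectures = ["x86", "arm", "ia64"]
changes = [  # made-up sample data
    ("x86", "A"), ("arm", "A"), ("ia64", "A"),   # changed everywhere
    ("x86", "B"), ("ia64", "B"),                 # changed in two only
    ("arm", "C"),                                # changed in one only
    ("x86", "D"), ("arm", "D"), ("ia64", "D"),   # changed everywhere
]
print(share_affecting_all(changes, architectures))  # 50.0
```

Applied per release, a computation of this shape yields a percentage column comparable to the one reported in the results table.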
5.2.4 Results
Linux Kernel release | Total number of changed features | % of changed features affecting all architectures |
---|---|---|
2.6.39 | 1016 | 26.47 |
3.0 | 1020 | 58.43 |
3.1 | 567 | 35.62 |
3.2 | 2361 | 39.00 |
3.3 | 946 | 24.10 |
3.4 | 778 | 32.39 |
3.5 | 1103 | 39.16 |
3.6 | 823 | 34.14 |
3.7 | 1285 | 29.09 |
3.8 | 963 | 29.38 |
3.9 | 1773 | 57.75 |
3.10 | 1299 | 32.10 |
3.11 | 4556 | 8.12 |
3.12 | 1406 | 47.93 |
3.13 | 620 | 52.58 |
3.14 | 704 | 53.12 |
5.2.5 Architecture-specific evolution
6 Discussion
6.1 Fine-grained feature changes
Undertaker hierarchy and attribute expansion, FMDiff not only captures changes visible in Kconfig files, but also the side effects of those changes (indirect matches). It makes explicit FM changes that would otherwise only be visible by manually expanding dependencies and conditions of features and feature attributes. Such an analysis requires expertise in the Kconfig language as well as in-depth knowledge of Linux feature structures. As mentioned in Sect. 4.2, FMDiff captures accurately a large majority of feature changes applied to the Linux kernel FM. Using FMDiff, feature changes are stored as lists of statement changes with the attribute values before and after the change (following our classification). Developers and maintainers modifying Kconfig files can use our tool to assess the effect of the changes they perform on the feature hierarchy. By querying FMDiff data, they can obtain the list of feature changes between their local version and the latest release. This will give them insight into the spread of a change by answering questions such as “which features are impacted?” and “should this feature be impacted?”. Moreover, developers can follow the impact of changes performed by others on their subsystem, by looking at changes occurring on features of their subsystem.
FMDiff will help in such endeavours, pinpointing instances of such scenarios in the history of the Linux kernel FM.
6.2 Architecture-specific evolution
FMDiff, can capture the impact of feature changes across architectures. With this additional information, developers would have a better view of how often their modifications affect different architectures, making them more aware of such situations. If they wish to cross-compile their code, then FMDiff would give them a list of the impacted architectures to consider first.
6.3 Threats to validity
FMDiff include the edits performed by developers on Kconfig files as well as their consequences on the other features of the model. After the model transformation, we cannot differentiate between developer edits in the Kconfig files (human operation) and the propagated effect of those changes on other features. Following this, we transform the Undertaker model into an EMF model for comparison purposes, further modifying the data we use for this study. We argue that both developer edits and their propagated effects are relevant for the study of the evolution of the Linux FM. The transformation performed by Undertaker adheres to the Kconfig semantics as described in [33] (except for the “range” attribute, which is not extracted). This gives us confidence that the transformed model in the “.rsf” format produced by Undertaker can be used as a means to study the evolution of the Linux FM. The model transformation from “.rsf” to EMF does not preserve the semantics of the Kconfig language, as we do not keep track of the order of certain attributes (such as default statements), and we do not consider CHOICE elements. Our data set cannot be used to reflect on the evolution of the allowed configurations of the Linux kernel: we cannot tell which configurations were added or removed by looking at the feature changes captured by FMDiff. But, as we have shown in Sect. 4.2, the changes captured by FMDiff are consistent with the changes observed in Kconfig files. For those reasons, we are confident that the gathered data can be used to observe and reflect on feature changes occurring in Kconfig models.
FMDiff. We have to consider that for a study over a longer period of time, we would have to take into account those changes and adapt the tool and classification in accordance with the evolution of the language.
CHOICE structures, present but with a specific naming convention, are removed from our intermediate model. However, the range attribute is not used widely (less than 170 occurrences in the 3.10 kernel, for over 12,000 features), and for this reason, we do not believe that this influenced our results or conclusions. During our manual evaluation of FMDiff, we found no occurrence of changes on CHOICE structures, which suggests that this is not a common change. But we assume that such changes can occur and would be overlooked by FMDiff. Changes to CHOICE structures would impact the contained features; the hierarchy flattening transformation ensures this. While we do not capture CHOICE changes, we can still observe their effects on features. For those reasons, we believe the loss of information has a minimal impact on our observations but must be taken into account for further analysis.
Menu, Menuconfig, Choice or If construct is modified by developers, changes to its dependencies will be reflected on the features it contains. As a direct consequence, we will observe more feature modifications than if we looked at the actual edits performed by the developers, increasing the number of observed modifications of existing features. We would argue here first that the modifications do occur: the features are indeed modified, but indirectly. In that sense, the captured information is accurate and does reflect the actual state of features in the feature model. Considering the overwhelming majority of modifications of existing features in certain releases (more than 70 % in release 3.7), we believe that our conclusion holds: feature modifications are, if not the most common, at least a very common type of change in every observed release.
FMDiff data should not be used to reflect on the possible configurations of the system, but only on feature changes.
FMDiff on other systems than Linux. The implementation of FMDiff ties us to a specific type of system. Moreover, the Kconfig-based change classification has a pervasive effect on the different components of the tool, making adaptation potentially complicated. But the approach presented in this paper could be applicable to highly variable systems having an explicit variability model, as often found in the software product line domain for instance. While the Linux kernel is not a software product line, it does have the main technical characteristics of such systems [36], hinting that our approach could be applicable in this larger context. Existing feature change classifications [8, 26] can be adapted, as we did in this work, to match other feature notations. Then, one will have to adapt the feature model comparison process to support that new classification. Previous work on feature models showed that their maintenance can be complex and error prone [5, 15]. With an approach such as FMDiff, it would be possible to extract new information about the evolution of the features using already existing artefacts, at the cost of adapting our tool.
7 Related work
depends statement can be either interpreted as a cross-tree constraint or a hierarchy relationship. As a consequence, we cannot automatically decide how a depends statement should be mapped to more standard FODA notation [18] and reuse the appropriate change classifier. Secondly, FMDiff is able to capture changes in feature attributes which are not considered by these classifications.
Undertaker [10, 38, 42] are the main examples of such tools. We chose to rely on Undertaker for its convenient wrapping of kconfigdump, allowing us to use the same tools that are also used by the Linux kernel development team. LVAT could have allowed us to capture the feature hierarchy. However, the flattening of the hierarchy by kconfigdump facilitated capturing feature hierarchy changes through changes of depends statements.
FMDiff captures feature changes but does not use nor rely on commit information and file change details. We have shown that modifications played a major role in the evolution of the Linux FM, and for this reason, the data set built using FMDiff appears to be more suited to describe in detail the evolution of the Linux FM.
8 Conclusion
FMDiff tool, automating our approach, and the data set we built during this study. We showed that the data obtained with this tool is consistent with changes observed in the Kconfig model and provides more comprehensive information about feature changes than what could be obtained using textual differences. We used our tool to extract feature model changes occurring in sixteen releases of the Linux kernel, building a structured and detailed history of the Linux kernel FM evolution.
FMDiff data set to explore the evolution of the Linux kernel feature model. Our findings regarding the evolution of this model constitute our last two contributions, highlighting the informative value of fine-grained feature changes and approaches such as FMDiff.
FMDiff data to compare the evolution of the different architecture-specific FMs of the Linux kernel. This allowed us to show that the different architectures evolved differently and that feature changes affecting multiple architectures were common. Based on this information, we made the following two observations. First, we pointed out that future research on the evolution of the Linux kernel FM should specify which architectures were studied, as observations made on a small subset of architecture-specific FMs are not generalizable to all of them without careful consideration. We then showed that the gathered information allows us to reflect on the development practices of the kernel developers with respect to multi-architecture development processes.
FMDiff can be used to facilitate maintenance operations. The data set built using FMDiff could be used to link the evolution of variability models with the evolution of their implementation. Modifications of feature dependencies captured by our approach could be valuable information when observing changes in code dependencies for instance. Another possibility would be to explore the relationship between the fine-grained changes and delta-oriented approaches used in the management of product lines, where our representation of changes could be of use. While we have shown here that feature changes do not equally affect all architecture-specific feature models of the Linux kernel, a subset of the architecture-specific FMs might evolve similarly. The identification of such groups of architecture-specific FMs would allow us to refine the extent to which conclusions drawn from the observation of a single architecture-specific FM can be generalized.