An ontology-based approach for the reconstruction and analysis of digital incidents timelines
Introduction
Digital investigations now require the analysis of large amounts of heterogeneous data. Studying such volumes is tedious and leads to cognitive overload because of the sheer quantity of information to process. Many tools have been developed to help investigators, but most of them extract unstructured data from various sources without bridging the gap of semantic heterogeneity and without addressing the problem of cognitive overload. A promising way to resolve these issues is to use a precise and reliable representation that structures the data on the one hand and standardises their representation on the other. A structured and formal knowledge representation has two goals: 1) to make automated processes easier to build by making information understandable by machines and 2) to give investigators an easy way to query, analyse and visualise information.
Computer forensic investigations must also comply with a set of legal rules to ensure that results are admissible in court. In particular, all evidence presented at a trial must be credible, and the methods used to produce it must be reproducible and must not alter the objects found at the crime scene. Problems of traceability and reproducibility of reasoning are widely discussed in the literature, and the notion of provenance is particularly relevant to digital investigations. Indeed, as defined by Gil and Miles (2013), the provenance of a resource is a record describing the entities and processes involved in the creation, dispersion or other activities that affect that resource. Provenance provides a fundamental basis for assessing the authenticity and truth value of a resource and its reproducibility.
In our work, we propose an innovative digital forensic approach based on a knowledge model that accurately represents a digital incident and all the steps used during an investigation to produce each result. In addition, we propose a full set of operators to manipulate the content of this ontology. We introduce extraction and instantiation operators that automatically build the knowledge base from digital traces extracted from disk images. We also propose automatic analysis operators that take advantage of this ontology, focusing in particular on an operator used to identify potential correlations between events.
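To make the idea of a correlation operator more concrete, the following minimal Python sketch scores pairs of events and keeps those above a threshold. It is an illustration only, not the SADFC implementation: the `Event` fields, the three equally weighted criteria (shared subject, shared object, temporal proximity), the five-minute window and the 0.6 threshold are all assumptions introduced here, since this excerpt does not specify the criteria actually used.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import combinations


@dataclass(frozen=True)
class Event:
    """An event instantiated from a digital trace (hypothetical schema)."""
    event_id: str
    start: datetime
    end: datetime
    subject: str   # who or what took part, e.g. a user account
    obj: str       # resource acted upon, e.g. a file
    source: str    # provenance: where the trace was extracted from


def correlation_score(a: Event, b: Event, window: timedelta) -> float:
    """Return a score in [0, 1] from three equally weighted, assumed criteria."""
    same_subject = a.subject == b.subject
    same_object = a.obj == b.obj
    # Gap between the two intervals; negative or zero when they overlap.
    gap = max(a.start, b.start) - min(a.end, b.end)
    close_in_time = gap <= window
    return (same_subject + same_object + close_in_time) / 3


def correlate(events, window=timedelta(minutes=5), threshold=0.6):
    """Return (id, id, score) for every pair whose score reaches the threshold."""
    pairs = []
    for a, b in combinations(events, 2):
        score = correlation_score(a, b, window)
        if score >= threshold:
            pairs.append((a.event_id, b.event_id, round(score, 2)))
    return pairs


# Illustrative data only; none of these events come from the paper.
day = datetime(2015, 1, 1)
events = [
    Event("e1", day.replace(hour=10, second=0), day.replace(hour=10, second=5),
          "alice", "invoice.exe", "browser history"),
    Event("e2", day.replace(hour=10, second=6), day.replace(hour=10, second=6),
          "alice", "invoice.exe", "file system"),
    Event("e3", day.replace(hour=14), day.replace(hour=14),
          "bob", "workstation", "event log"),
]

pairs = correlate(events)
print(pairs)
```

Here the download (`e1`) and the file creation (`e2`) share a subject and an object and occur one second apart, so they are reported as potentially correlated, while the unrelated login (`e3`) is not. Keeping the `source` field on each event is a simple way to retain provenance for every reported pair.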
This paper is structured as follows: Section 2 gives a comprehensive state of the art of event reconstruction approaches for digital forensics. Section 3 introduces the SADFC approach. In particular, the structure of the three-layered ontology is detailed and its various operators are described. Section 4 evaluates the performance of our approach and illustrates its capabilities on a case study.
State of the art
Event reconstruction is a complex process because of three main issues: the large volume of data, the heterogeneity of information due to the use of a large number of sources and the legal requirements that the results have to meet. This section aims to review the existing solutions described in the literature to reconstruct and analyse past events. The quality and the relevance of nine reconstruction approaches are reviewed based on the previous three issues and their limitations. These
An ontology-based approach for event reconstruction
To fulfil the seven criteria highlighted in Section 2, we introduce an ontology-based approach. This section shows how this approach meets the problem requirements and is structured as follows. Section 3.1 gives an overview of the proposed approach and justifies the conceptual choices that guided its development. Section 3.2 introduces the concepts and relationships of the ontology, which accurately represents a digital incident. Section 3.3 presents the implemented tools to
Experimentation and results
This section aims to demonstrate the capabilities of our approach in the context of a digital investigation analysis. To this end, we study the performance and relevance of the approach through simulated cases. The machine used to run the experiment and to host the triple store (Stardog server 2.2.4) has a 3.20 GHz Intel Core i5-3470 processor and 8 GB of RAM. The disk images used are generated from a virtual machine running Windows 7.
The first task of the experiment is
Conclusion and future works
In this paper, we proposed an approach called SADFC, for Semantic Analysis of Digital Forensic Cases. This approach is based on an ontology that accurately represents a digital incident and the associated digital investigation. The ontology, named ORD2I, is associated with a set of tools for extracting information from disk images seized at crime scenes, instantiating the ontology, deducing new knowledge and analysing it. SADFC provides answers to the three issues identified in the state of the
Acknowledgements
The above work is part of a collaborative research project between the CheckSem team (Le2i UMR CNRS 6306, University of Burgundy) and the UCD School of Computer Science and Informatics. This project is supported by University College Dublin and the Burgundy region (France). The authors would like to thank Séverine Fock for her advice and for proofreading this manuscript.