Speakers
Description
Artificial Intelligence for Software Engineering (AI4SE) examines how AI can improve the lifecycle of software-intensive systems engineering, which encompasses numerous artifacts spanning all phases of continuous integration and continuous development. Because these phases involve heterogeneous data sources and, depending on the use case, additional levels of complexity, traceability plays a vital role in ensuring the completeness of the software development process. Over time, these linked artifacts are stored in system modellers and relational databases, which ultimately increases end-user effort in terms of heterogeneity, querying, quality assurance, and time and storage complexity.
However, recent advances in graph-based technologies, Natural Language Processing (NLP) and Machine Learning (ML) help to overcome these limitations. Traceability graphs are heterogeneous, dynamic graphs that evolve throughout the project lifecycle and are stored in graph databases. ML and NLP technologies foster research in building and analysing these connected traceability graphs. Building on comprehensive research on domain ontologies and system development activities in AI, algorithms can learn from the structure of large-scale graphs and take advantage of the “relationship-first” nature of graph schemas. This makes it possible to vectorize nodes, edges and even whole graphs in a low-dimensional space for feature learning; the resulting vectors are called “embeddings”. Learning the intricate features of graph topology in this automated way supports tasks such as node/edge reconstruction, pattern discovery and node/edge classification. Further, storing the learned features as embeddings reduces space and time complexity for stakeholders.
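To make the idea of walk-based node vectors concrete, here is a minimal sketch in plain Python. The artifact names and links are hypothetical, not taken from any dataset. As a simplifying assumption, the skip-gram training step used by methods such as DeepWalk is replaced with raw co-occurrence counts, so the resulting vectors are sparse and have one dimension per node rather than being dense and low-dimensional.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical traceability links between artifacts (illustrative names only).
EDGES = [
    ("REQ-1", "DESIGN-1"), ("DESIGN-1", "CODE-1"), ("CODE-1", "TEST-1"),
    ("REQ-2", "DESIGN-2"), ("DESIGN-2", "CODE-2"), ("CODE-2", "TEST-2"),
    ("DESIGN-1", "DESIGN-2"),
]


def build_adjacency(edges):
    """Undirected adjacency lists; neighbours are sorted for reproducibility."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def random_walks(adj, walks_per_node=20, walk_length=6, seed=0):
    """Uniform random walks: the 'sentences' a walk-based method learns from."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


def cooccurrence_vectors(walks, window=2):
    """Count how often each node appears within `window` steps of another."""
    vectors = defaultdict(Counter)
    for walk in walks:
        for i, node in enumerate(walk):
            lo, hi = max(0, i - window), min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[node][walk[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


adj = build_adjacency(EDGES)
walks = random_walks(adj)
vectors = cooccurrence_vectors(walks)
print(cosine(vectors["REQ-1"], vectors["CODE-1"]))
```

Artifacts that tend to appear in the same walks end up with similar vectors, which is the property that downstream tasks such as classification or link reconstruction rely on.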
A Model-Based Systems Engineering (MBSE) use case on the graph-based benchmark traceability dataset mAquaLush is explored. A framework for building and analysing an ontology-driven knowledge graph was implemented as part of a Master’s thesis titled “Conceptualization of a framework for AI-based software graph completion from heterogeneous data sources”. In experiments on the graph completion problem, i.e. proposing a missing or anticipated link between two artifacts, feature-learning-based walk methods (DeepWalk, RandomWalk and Meta2Vec) were collectively found to perform efficiently.
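The graph completion task itself can be illustrated with a small, deterministic sketch. This is not the thesis framework and not one of the methods named above: it swaps in a plain random walk with restart, computed by power iteration, as a simple walk-based proximity score. The graph, the artifact names and the `restart` value are all hypothetical.

```python
from collections import defaultdict

# Hypothetical traceability graph; no requirement is linked to a test yet.
EDGES = [
    ("REQ-1", "DESIGN-1"), ("DESIGN-1", "CODE-1"), ("CODE-1", "TEST-1"),
    ("REQ-2", "DESIGN-2"), ("DESIGN-2", "CODE-2"), ("CODE-2", "TEST-2"),
    ("DESIGN-1", "DESIGN-2"),
]


def build_adjacency(edges):
    """Undirected adjacency lists built from the link list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def walk_with_restart(adj, source, restart=0.15, iterations=100):
    """Visiting probabilities of a walker that jumps back to `source`
    with probability `restart` at every step (power iteration)."""
    scores = {node: 0.0 for node in adj}
    scores[source] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in adj}
        for node, mass in scores.items():
            # Spread the non-restarting mass evenly over the neighbours.
            share = (1.0 - restart) * mass / len(adj[node])
            for nbr in adj[node]:
                nxt[nbr] += share
        nxt[source] += restart
        scores = nxt
    return scores


def rank_missing_links(adj, source, candidates):
    """Rank candidates not yet linked to `source`, most plausible first."""
    scores = walk_with_restart(adj, source)
    unlinked = [c for c in candidates if c not in adj[source]]
    return sorted(unlinked, key=lambda c: scores[c], reverse=True)


adj = build_adjacency(EDGES)
ranking = rank_missing_links(adj, "REQ-1", ["TEST-1", "TEST-2"])
print(ranking)  # ['TEST-1', 'TEST-2']
```

Here `TEST-1` ranks first because it sits on the same requirement-design-code chain as `REQ-1`, so the top-ranked candidate is the link a completion step would propose. An embedding-based approach would score the same candidate pairs from learned node vectors instead of visiting probabilities.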
Keywords: Traceability graphs, Ontologies, Feature engineering, Feature learning, Graph completion, Knowledge graphs