Doctoral thesis

Software documentation: automation and challenges


156 p

Doctoral thesis: Università della Svizzera italiana, 2020

Despite the undeniable practical benefits of documentation during software development and evolution activities, its creation and maintenance are often neglected, leading to inadequate or even nonexistent documentation. Thus, it is not unusual for developers to deal with unfamiliar code that they have difficulty comprehending. Browsing the official documentation or accessing online resources such as Stack Overflow can help in this "code comprehension" activity, which nevertheless remains highly time-consuming. Enhancing the code comprehension process has been the goal of several works aimed at automatically documenting software artifacts. Although these techniques address the issue, they exhibit a number of major limitations, such as working at a coarse-grained level and not allowing developers to document a single line of code of interest. While the creation of such novel systems entails conceptual and technical challenges related to the collection, inference, interpretation, selection, and presentation of useful information, it also requires solid empirical foundations on software developers' needs: what information is (or is not) useful to developers, and when. Our thesis is that empirical knowledge about software documentation issues experienced and considered relevant by practitioners is instrumental in laying the foundations for the next-generation tools and techniques for automated software documentation. To this aim, in this dissertation we present our research accomplishments towards automating developer documentation on two fronts: (1) empirical studies on the nature of software documentation, with a specific focus on documentation issues experienced by software developers, and (2) development of tools supporting the code comprehension process.
In the former direction, we conducted a large-scale empirical study in which we mined, analyzed, and categorized a large number of documentation-related artifacts and developed a detailed taxonomy of documentation issues, from which we infer a series of actionable proposals for both researchers and practitioners. We validated our findings by surveying professional software practitioners. In the latter direction, we developed ADANA, a framework that generates fine-grained code comments for a given piece of code at the granularity level intended by the developer. Our contributions to the body of software documentation knowledge shed light on previously unseen facts about overlooked software documentation matters and lay the foundations for the next-generation tools and techniques for automated software documentation.
  • English
Computer science and technology
License undefined