C2S: Translating Natural Language Comments to Formal Program Specifications

Abstract:

The paper discusses the critical role of formal program specifications in software engineering (SE) tasks such as program verification, synthesis, software testing, and debugging. It acknowledges the challenges associated with manually written specifications, primarily their time-consuming nature and susceptibility to errors. To address these issues, the paper introduces C2S (Comments to Specifications), a novel tool designed to automatically translate natural language comments into formal program specifications.

Introduction:

Formal program specifications are essential in various SE tasks. However, the manual process of writing these specifications is fraught with disadvantages, prompting the need for automated tools. This paper proposes such a tool, motivated by three key observations:

  1. Modern SE projects often include abundant natural language documentation, which provides a potential source of information for automatic specification generation.
  2. Existing tools like @tComment, Toradocu, and Jdoctor rely on manually defined patterns, which limits their flexibility and applicability.
  3. There is a need for a more generalized approach to automatically synthesize specifications from diverse types of comments.

Contributions:

The paper’s contributions are twofold:

  1. It proposes a novel technique for automatically translating natural language comments into formal specifications.
  2. It develops a prototype tool based on this concept.

My personal comments:

  • An additional contribution could be a case study examining the distribution and categories of natural language comments in modern SE projects.
  • This study could also explore the limitations of existing tools in translating these comments.

System Design and Workflow of C2S:

The workflow and design of C2S are central to the paper. Two components are particularly emphasized:

  1. Grammar Rules: These are crucial for generating a range of specification candidates.
  2. Testing: This phase is vital for filtering out incorrect specification candidates.

The paper posits that a combination of these elements within C2S can effectively translate natural language comments into formal specifications, thereby enhancing the efficiency and accuracy of SE tasks.