DON26TZ01-NV008 TITLE: Automated Software Test Generation and Augmentation for Improved Debloating
OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Quantum and Battlefield Information Dominance (Q-BID)
COMPONENT TECHNOLOGY PRIORITY AREA(S): Advanced Computing and Software; Integrated Network Systems-of-Systems; Integrated Sensing and Cyber
PROJECTED CMMC LEVEL REQUIREMENT: Level 2 (Self)
OBJECTIVE: Develop an automated solution for developing, enhancing, expanding, and augmenting software tests to more safely broaden the employment of proactive cyber techniques such as debloating and post-construction software refactoring. Technology is needed to refine a suite of tests to a level such that it may serve as a practical expression of a software transformation objective to drive other tools as well as validate their output. Technology should leverage multi-modal methods such as ingesting code and documentation as well as be compatible with DevOps processes.
DESCRIPTION: Modern software development practices such as industrialized code reuse and artificial intelligence (AI) assistance enable developers to produce increasingly complex and capable software more quickly and cheaply than ever before. The tools to ensure that all this software is well-tested and that all of the included code is well-tailored to the deployment scenario, however, have lagged by comparison.
Modern applications often include hundreds to thousands of libraries and other dependencies, and in any given deployment scenario users typically exercise only a small portion of the code in each. The excess code that remains tends to be less used, less well-scrutinized, and full of obscure features that are often found (sometimes only years later) to contain vulnerabilities. To address this problem, numerous tools have been developed to identify bloat and then modify the software by removing unneeded code [Ref 1]. The configurations, usage logs, and tests fed as inputs to code transformation tools to tell them what to cut are referred to as debloat specifications [Refs 1, 2].
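As a toy illustration of how those three evidence sources might combine into a specification, the sketch below builds a hypothetical spec and uses it to decide which symbols a transformation tool would retain. The feature names and the simple union rule are invented for this example, not taken from any real debloating tool's format:

```python
# Hypothetical sketch of a minimal "debloat specification" combining the three
# input kinds named above: configuration, usage logs, and tests.

def build_spec(config_features, logged_calls, tested_calls):
    """Union the symbols each evidence source shows is actually needed."""
    return set(config_features) | set(logged_calls) | set(tested_calls)

def keep(symbol, spec):
    """A transformation tool would retain only symbols named in the spec."""
    return symbol in spec

spec = build_spec(
    config_features={"parse_args", "load_pdf"},  # enabled in the deploy config
    logged_calls={"parse_args", "render_page"},  # observed in usage logs
    tested_calls={"load_pdf", "render_page"},    # exercised by the test suite
)
print(sorted(spec))               # symbols the spec says to retain
print(keep("export_xml", spec))   # an unevidenced feature would be cut
```

Each input source alone under-approximates what users need, which is why the Description below emphasizes the quality and completeness of the combined specification.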
Because the economics of code reuse will continue to drive library and package developers to maximize generality, debloating must happen through a separate process that begins after those components are built into a specific application. Having code modified by a process separate from the one that designed, implemented, and tested those components adds risk: flawed or incomplete transformations are not uncommon. Evaluation results in [Ref 2] showed that 37% of the debloated binaries they created failed to correctly execute the functionality they were intended to retain.
Many factors can contribute to a transformation yielding a broken application, but one of the biggest is a low-quality debloat specification. Developer-authored tests are often limited, and the users of debloating tools can rarely specify in exact detail all the features they actually need for a given deployment scenario. These incomplete specifications can lead tools to be overly aggressive, removing things like security checks and exception handlers that are critical to application safety and robustness [Ref 3].
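That failure mode can be sketched in miniature: a toy "program" represented as a table of features is debloated by deleting whatever a happy-path-only test run never called, silently removing a safety-critical fallback. All names here are hypothetical, invented for this illustration:

```python
# Toy model of over-aggressive, coverage-driven debloating: the test suite
# exercises only the happy path, so the error-handling feature looks unused
# and gets cut -- and the debloated program breaks on bad input.

def make_program():
    calls = set()  # records which features the "test suite" exercised
    def parse(v):
        calls.add("parse")
        if not v.isdigit():
            return features["recover"](v)  # error path
        return int(v)
    def recover(v):
        calls.add("recover")
        return 0  # safety-critical fallback
    features = {"parse": parse, "recover": recover}
    return features, calls

features, calls = make_program()
features["parse"]("42")           # the entire "test suite": happy path only
bloat = set(features) - calls     # 'recover' was never called...
for name in bloat:
    del features[name]            # ...so an aggressive tool cuts it

try:
    features["parse"]("oops")     # post-debloat, the error path is gone
except KeyError as missing:
    print("broken after debloat:", missing)
```

A higher-quality specification would have included a test that drives the error path, keeping `recover` in the retained set.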
To better address the problem of low-quality and incomplete debloat specifications, new technology is needed to more fully incorporate and automate the capturing of desired software behaviors for input to a debloat tool. The technology should be able to take advantage of code analysis as well as analysis of related artifacts such as documentation, build configs, existing tests, and even direct user input, as long as answering is practical and easy for the user. Various works have explored methods and techniques for capturing exception handlers [Ref 3], balancing reduction with a targeted amount of generality [Ref 4], and leveraging AI to incorporate new tests [Refs 5, 6, 7]. All of these may inform strategies for automated test generation and augmentation that can lead to higher quality debloat specifications.
PHASE I: Define and develop a concept for automated multi-modal processing of code and other DevOps repository artifacts, such as user guides, to generate and augment a suite of tests that can serve as the inputs to proactive cyber security tools, namely debloating. Work toward a design that can develop tests based on unstructured documents and interact with a user to refine the tests. The Phase I Option, if exercised, would develop the initial test augmentation capability to create the full prototype in Phase II.
PHASE II: Develop a prototype containerized test augmentation capability to validate the concept defined in Phase I. Demonstrate the prototype's automated multi-modal processing of code, DevOps repository artifacts, and, if necessary, user interview inputs to develop, enhance, expand, and augment software tests. Ensure that the prototype is deployable in a software factory environment and, by the end of Phase II, able to develop enough tests to sufficiently, reliably, and robustly enable debloating driven by (1) an application's existing limited test suite alone, (2) unstructured program documents such as user guides, and (3) real-time input from non-expert users.
PHASE III DUAL USE APPLICATIONS: Integrate the Phase II developed test augmentation capability with Program of Record systems and their applications. Field containerized solutions that integrate with existing build pipelines.
Potential commercial applications include automated software testing and fuzzing harness generation, a growing need due to the proliferation of AI-generated code.
REFERENCES:
KEYWORDS: Cyber; Software Testing; Automation; Artificial Intelligence; AI; Machine Learning; ML; Large Language Model; LLM; Debloating; Feature Specification
TPOC 1
Ryan Craven
ryan.m.craven2.civ@us.navy.mil
TPOC 2
Dan Koller
daniel.p.koller.civ@us.navy.mil
** TOPIC NOTICE **
The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoW FY-26 Release 1 SBIR BAA. Please see the official DoW Topic website at www.dodsbirsttr.mil/submissions/solicitation-documents/active-solicitations for any updates. The DoW issued its Navy FY-26 Release 1 SBIR Topics pre-release on April 13, 2026, which opens to receive proposals on May 6, 2026, and closes June 3, 2026 (12:00pm ET).

Direct Contact with Topic Authors: During the pre-release period (April 13 through May 5, 2026), proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. The TPOC contact information is listed in each topic description. Once DoW begins accepting proposals on May 6, 2026, no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the pre-release period.

DoW On-line Q&A System: After the pre-release period, until May 20, 2026, at 12:00 PM ET, proposers may submit written questions through the DoW On-line Topic Q&A at https://www.dodsbirsttr.mil/submissions/login/ by logging in and following instructions. In the Topic Q&A system, the questioner and respondent remain anonymous, but all questions and answers are posted for general viewing.

DoW Topics Search Tool: Visit the DoW Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoW Components participating in this BAA.