DON26BZ01-NV010; TITLE: E-2D Large Language Model Entity (ELLMENT)
COMPONENT TECHNOLOGY PRIORITY AREA(S): Advanced Computing and Software; Trusted AI and Autonomy
PROJECTED CMMC LEVEL REQUIREMENT: Level 2 (Self)
The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.
OBJECTIVE: Develop and implement a traceable, explainable, referenced, and reasoned Large Language Model (LLM) that functions as an on-demand Natural Language Processing (NLP) decision-support assistant for Naval Flight Officers (NFOs) and mission crew aboard a carrier-based, all weather, tactical battle management, airborne early warning, and command and control aircraft.
DESCRIPTION: Artificial Intelligence/Machine Learning (AI/ML) technologies are transforming how complex data is understood and acted upon in operational environments. This SBIR topic seeks to explore the development of a domain-specific LLM system to support rapid insight generation from structured and unstructured documents (e.g., Tactics, Techniques, and Procedures [TTPs]), mission logs, communications, and other high-volume data sources relevant to tactical operations.
The goal is to deliver a modular, self-contained AI/NLP solution that can assist NFOs and mission crew by summarizing, reasoning over, and extracting meaning from dense operational material in real time. This LLM must be specifically designed to operate in a stand-alone configuration in accordance with information assurance policies, with mechanisms for traceability (where the information came from and how it connects to the goal), source attribution, and model transparency. The system must also support future extensibility to multi-modal data ingestion.
Work produced in Phase II may become classified. Note: The prospective contractor(s) must be U.S. owned and operated with no foreign influence as defined by 32 U.S.C. § 2004.20 et seq., National Industrial Security Program Executive Agent and Operating Manual, unless acceptable mitigating procedures can and have been implemented and approved by the Defense Counterintelligence and Security Agency (DCSA) formerly Defense Security Service (DSS). The selected contractor must be able to acquire and maintain a secret level facility and Personnel Security Clearances. This will allow contractor personnel to perform on advanced phases of this project as set forth by DCSA and NAVAIR in order to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material during the advanced phases of this contract IAW the National Industrial Security Program Operating Manual (NISPOM), which can be found at Title 32, Part 2004.20 of the Code of Federal Regulations.
PHASE I: Define and develop the foundational architecture and baseline capability for implementing Large Language Model Operations (LLMOps) in support of mission decision-aid tools for the E-2D platform, as outlined by Gallagher et al. [Ref 2].
1. Security, Ethics, and Data Governance Planning
• The small business will collaborate with relevant Navy civilian representatives—such as TPOCs and PMA-231 S&T leads—to:
• Establish appropriate data classification levels for training and deployment environments
• Define a cybersecurity framework aligned with DoW and platform-specific requirements
• Incorporate an ethical AI governance structure, including bias mitigation and auditability provisions
2. LLM Selection and Mission Alignment
• An appropriate LLM architecture will be selected based on mission-specific demands of the aircraft operator, with consideration for:
• Performance in tactical and technical language domains
• Model transparency and explainability
• Compatibility with in-theater deployment constraints
3. Corpus Curation and Model Training
• The selected LLM will be trained on an aircraft-relevant corpus, including, but not limited to, mission-specific Tactics, Techniques, and Procedures (TTPs), doctrine documents, and communication logs. Training methodologies will include:
• Prompt engineering
• Fine-tuning with Navy-specific linguistic patterns and use cases
• Retrieval-Augmented Generation (RAG) to support on-demand referencing of large knowledge bases
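As a rough illustration of how RAG could support on-demand, source-attributed referencing, the sketch below ranks passages by simple term overlap and carries a citation tag for each hit into the prompt. The `Passage` fields, the placeholder corpus, and the Jaccard scoring are assumptions made for illustration only; a fielded system would more likely use embedding-based retrieval followed by a generation step.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    doc_id: str   # hypothetical document identifier
    section: str  # hypothetical section label used in citations
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[tuple[float, Passage]]:
    """Rank passages by Jaccard overlap with the query; drop zero-score hits."""
    q = tokenize(query)
    scored = []
    for passage in corpus:
        terms = tokenize(passage.text)
        union = q | terms
        score = len(q & terms) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, passage))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query: str, hits: list[tuple[float, Passage]]) -> str:
    """Assemble a prompt in which every context line carries a citation tag."""
    lines = [f"[{p.doc_id} {p.section}] {p.text}" for _, p in hits]
    return (
        "Context:\n" + "\n".join(lines)
        + f"\n\nQuestion: {query}\nCite the bracketed sources used."
    )


# Placeholder content invented for illustration; not drawn from any real document.
CORPUS = [
    Passage("DOC-A", "sec 1", "Radio check procedure before launch"),
    Passage("DOC-B", "sec 4", "Fuel planning guidance for extended missions"),
    Passage("DOC-C", "sec 2", "Radio frequency change procedure in flight"),
]

hits = retrieve("radio check procedure", CORPUS)
prompt = build_prompt("radio check procedure", hits)
```

Keeping the document ID and section attached to each retrieved passage is one simple way to let an operator trace an answer back to its source material.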
4. Evaluation and Output Validation
• Model performance will be assessed using a comprehensive metrics suite, as recommended by Diaz-de-Arcaya et al. (2024) and Gallagher et al. (2023), including:
• Response accuracy and relevance
• Appropriateness and alignment with operational context
• Bias detection and mitigation
• Trustworthiness
• Independent Subject Matter Expert (SME) evaluation
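One possible way to roll the metrics listed above into a single scorecard is sketched below. The metric keys mirror the list, but the 1-to-5 rating scale, the pass threshold, and the sample ratings are assumptions made for illustration, not values taken from the cited references.

```python
from statistics import mean

# Metric names follow the list above; the 1-5 scale and the threshold
# are illustrative assumptions.
METRICS = ("accuracy", "relevance", "appropriateness", "bias", "trustworthiness")
PASS_THRESHOLD = 3.5


def scorecard(ratings: list[dict[str, int]]) -> dict[str, float]:
    """Average each metric across all SME rating sheets."""
    return {m: mean(sheet[m] for sheet in ratings) for m in METRICS}


def failing_metrics(card: dict[str, float], threshold: float = PASS_THRESHOLD) -> list[str]:
    """Return the metrics whose average falls below the threshold."""
    return [m for m in METRICS if card[m] < threshold]


# Hypothetical rating sheets from two independent SMEs.
sme_ratings = [
    {"accuracy": 4, "relevance": 5, "appropriateness": 4, "bias": 3, "trustworthiness": 4},
    {"accuracy": 5, "relevance": 4, "appropriateness": 4, "bias": 3, "trustworthiness": 3},
]

card = scorecard(sme_ratings)
flagged = failing_metrics(card)
```

Here `flagged` lists the metrics that would need attention before moving on, under the assumed threshold.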
5. Deployment Pathways and Phase II Readiness
• As part of final Phase I efforts, the small business will:
• Evaluate and down-select hardware and software deployment options (e.g., computer architecture, human-machine interface designs)
• Develop a baseline implementation roadmap for transitioning to Phase II prototype construction and TRL advancement
PHASE II: The developed LLM will be deployed to a stand-alone laboratory environment for rigorous evaluation in an Operator-in-the-Loop (OITL) configuration. In this setup, NFOs and mission operators will engage with the LLM across representative command and control mission scenarios to assess its efficacy as a real-time natural language decision-support system.
Subject Matter Experts (SMEs) in specific operations and AI/ML will conduct structured evaluations using the predefined metrics identified in Phase I, including response accuracy, contextual relevance, trustworthiness, and bias sensitivity. Iterative testing cycles will drive continuous refinement of the model’s behavior and performance, where performance refers to how the system’s suggestions compare with SME suggestions.
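Comparing system suggestions with SME suggestions could be quantified in many ways. The sketch below treats each as a set of recommended actions for one scenario and computes precision, recall, and F1. The set-based framing and the action labels are assumptions made for illustration; the topic does not prescribe a scoring method.

```python
def agreement(system: set[str], sme: set[str]) -> dict[str, float]:
    """Score system suggestions against SME suggestions for one scenario."""
    overlap = len(system & sme)
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(sme) if sme else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical action labels for a single test scenario.
system_suggestions = {"action_a", "action_b", "action_c"}
sme_suggestions = {"action_a", "action_b", "action_d", "action_e"}

result = agreement(system_suggestions, sme_suggestions)
```

Tracking these values across iterative test cycles would give one view of whether refinement is moving the model closer to SME judgment.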
To support future scale-up, candidate computing architectures will be assessed, including emerging platforms such as quantum-accelerated processing (e.g., D-Wave). These evaluations will focus on increasing operational capacity, expanding conversational memory (buffer length), and handling larger mission datasets in constrained compute environments.
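To make the idea of conversational memory (buffer length) concrete, the sketch below keeps only the most recent turns that fit within a fixed token budget. The class name, the budget, and the use of whitespace-delimited words as a stand-in for model tokens are all assumptions made for illustration.

```python
from collections import deque


class ConversationBuffer:
    """Keep the most recent turns that fit within a fixed token budget."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.turns: deque[tuple[str, str]] = deque()
        self.used = 0

    @staticmethod
    def count_tokens(text: str) -> int:
        # Assumption: whitespace-delimited words approximate model tokens.
        return len(text.split())

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))
        self.used += self.count_tokens(text)
        # Evict the oldest turns until the budget is met, always keeping the newest.
        while self.used > self.max_tokens and len(self.turns) > 1:
            _, oldest = self.turns.popleft()
            self.used -= self.count_tokens(oldest)

    def context(self) -> str:
        """Render the retained turns as prompt context."""
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


buf = ConversationBuffer(max_tokens=8)
buf.add("operator", "status of track one")
buf.add("assistant", "track one is holding")
buf.add("operator", "and track two")  # exceeds the budget, so the oldest turn is evicted
```

A larger `max_tokens` retains more history at the cost of more compute per request, which is the trade-off the architecture assessment would need to weigh.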
A lifecycle monitoring framework will also be established to operationalize the LLMOps strategy introduced in Phase I. This includes procedures for tracking long-term model performance, retraining triggers, audit logs, output traceability, and alignment with evolving mission requirements.
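A minimal sketch of how audit logging, output traceability, and a retraining trigger might fit together is shown below. The `LifecycleMonitor` class, the rolling-window size, the score floor, and the log fields are hypothetical; timestamps and operator identifiers are omitted to keep the example deterministic.

```python
import hashlib
import json
from collections import deque
from statistics import mean


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class LifecycleMonitor:
    """Record each exchange and flag retraining when rolling quality drops."""

    def __init__(self, window: int = 3, floor: float = 0.7) -> None:
        self.scores: deque[float] = deque(maxlen=window)  # rolling quality scores
        self.floor = floor                                # assumed minimum acceptable mean
        self.audit_log: list[dict] = []

    def record(self, prompt: str, response: str, sources: list[str], score: float) -> dict:
        entry = {
            "seq": len(self.audit_log),
            "prompt_sha256": sha256(prompt),      # hash rather than store raw text
            "response_sha256": sha256(response),
            "sources": sources,                   # document IDs backing the response
            "score": score,
        }
        self.audit_log.append(entry)
        self.scores.append(score)
        return entry

    def needs_retraining(self) -> bool:
        """Trigger only once the window is full and its mean is below the floor."""
        return len(self.scores) == self.scores.maxlen and mean(self.scores) < self.floor

    def export(self) -> str:
        """Serialize the audit log as JSON Lines."""
        return "\n".join(json.dumps(e, sort_keys=True) for e in self.audit_log)


monitor = LifecycleMonitor(window=3, floor=0.7)
monitor.record("q1", "a1", ["DOC-A"], 0.9)
monitor.record("q2", "a2", ["DOC-C"], 0.6)
early = monitor.needs_retraining()      # window not yet full
monitor.record("q3", "a3", ["DOC-A"], 0.5)
triggered = monitor.needs_retraining()  # mean of 0.9, 0.6, 0.5 is below 0.7
```

Storing hashes instead of raw text is only one option; the appropriate level of detail in the log would depend on the data classification decisions made in Phase I.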
Work in Phase II may become classified. Please see note in the Description section.
PHASE III DUAL USE APPLICATIONS: Upon successful completion of final verification and validation (V&V) testing, the developed system will be authorized for transition to designated operational platforms and associated industry partners, in alignment with established Navy acquisition and technology transition procedures.
In parallel, the capability has garnered interest from additional mission-critical stakeholders—specifically ONR Code 32, in connection with Anti-Submarine Warfare (ASW) mission domains. This cross-domain interest highlights the system’s adaptability and potential for broader operational utility beyond its original use case, further enhancing the value and return on investment for the Department of the Navy.
The development and refinement of such an LLM pushes the boundaries of AI and NLP, contributing to the overall advancement of these technologies. The need for traceable, referenced data management promotes innovation in data governance, lineage tracking, and knowledge management, which are valuable for private sector organizations dealing with large datasets.
Examples of Dual-Use Applications include:
• Predictive Maintenance: Predicting equipment failures and optimizing maintenance schedules
• Supply Chain Optimization: Optimizing supply chain logistics
• Threat Detection: Identifying and responding to cyber threats
• Security Auditing: Automating security audits
Overall, this approach targets a more focused and specialized domain than current commercial applications. While LLMs like Gemini and ChatGPT have focused on a cloud-based approach, the proposed LLM suggests that a targeted local-network approach can be a forward-looking design for specific problems. Some examples of alternatives to cloud-based approaches include neuromorphic computing, local LLMs, LLMs on edge devices, and Small Language Models (SLMs).
REFERENCES:
KEYWORDS: Large language model; LLMs; Natural Language Processing; NLP; Multi-modal approaches
TPOC:
NAVAIR SBIR/STTR POC
navair-sbir@us.navy.mil
** TOPIC NOTICE **
The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoW FY-26 Release 1 SBIR BAA. Please see the official DoW Topic website at www.dodsbirsttr.mil/submissions/solicitation-documents/active-solicitations for any updates. The DoW issued its Navy FY-26 Release 1 SBIR Topics pre-release on April 13, 2026, which opens to receive proposals on May 6, 2026, and closes June 3, 2026 (12:00pm ET).
Direct Contact with Topic Authors: During the pre-release period (April 13 through May 5, 2026), proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. The TPOC contact information is listed in each topic description. Once DoW begins accepting proposals on May 6, 2026, no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the pre-release period.
DoD On-line Q&A System: After the pre-release period, until May 20, 2026, at 12:00 PM ET, proposers may submit written questions through the DoW On-line Topic Q&A at https://www.dodsbirsttr.mil/submissions/login/ by logging in and following instructions. In the Topic Q&A system, the questioner and respondent remain anonymous, but all questions and answers are posted for general viewing.
DoW Topics Search Tool: Visit the DoW Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoW Components participating in this BAA.