N252-109 TITLE: Automated Intelligence Preparation of the Battlespace from Open-Source Information
OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Advanced Computing and Software;Human-Machine Interfaces;Trusted AI and Autonomy
OBJECTIVE: Develop automated tools using artificial intelligence/machine learning (AI/ML) to generate inputs to Intelligence Preparation of the Battlespace (IPB) for the Marine Corps Mission Planning Process.
DESCRIPTION: The Marine Corps Planning Process (MCPP) requires Marines to perform an IPB, which is the "systematic process of analyzing and visualizing the portions of the mission variables of the adversary, terrain, weather, and civil considerations in a specific area of interest and for a specific mission. By applying IPB, commanders gain the information necessary to selectively apply and maximize operational effectiveness at critical points in time and space." [Ref 1].
However, the IPB is a time- and manpower-intensive process requiring Marines to analyze physical, temporal, cognitive, and virtual considerations [Ref 2]. Open-source information is often valuable input to the IPB given the dynamic nature of many IPB considerations, but this information is frequently the costliest to find and incorporate. In a tactical situation, Marines may not have the time or expertise to perform a full analysis across these domains.
Current IPB inputs are also not easily actionable by AI/ML analytics due to the analog nature of the resulting products. Automated generation of the IPB in structured, computer-readable formats would enable additional AI/ML tools to incorporate and exploit those products in other parts of the MCPP. A tool that solves this problem should scan open sources of information for IPB-relevant content. The open sources included in scans should be both user-provided and discovered by web scraping. The tool should then retrieve and parse that information into a structured, computer-readable format usable by other analytics supporting MCPP. IPB content generated by the tool should include references to the original source of the information. The tool should estimate the accuracy of the information based on the source, the correlations between different open sources, and the correlations to authoritative U.S. sources. Marines should also be able to influence rankings of specific generated content or source sites.
The focus in developing the tool should be on discovery of information relevant to the IPB process, computer understanding of that information, AI-assisted data cleaning and transformation to enable rapid, actionable use by other algorithms, and content evaluation (i.e., source quality, accuracy, uncertainty estimates) of the generated information. Previous work in this area used Natural Language Processing (NLP) and Semantic Search techniques but was lacking in both quality and specificity of the generated information. Generally, results from previous techniques still required significant human review and refinement. Recent innovations in Large Language Models (LLMs) and Deep Learning are expected to support significant improvements in the development of a solution.
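As an illustration only, the requirement that generated content carry references to its original sources and an accuracy estimate could be met by a record such as the one sketched below. The class names, field names, and example URL are assumptions made for this sketch; the topic does not prescribe a schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class SourceRef:
    url: str             # where the information was found
    retrieved: str       # ISO 8601 retrieval timestamp
    user_provided: bool  # True if a Marine supplied the source, False if discovered


@dataclass
class IpbItem:
    consideration: str   # e.g. "civil considerations"
    statement: str       # the extracted information
    confidence: float    # estimated accuracy in [0, 1]
    sources: list[SourceRef] = field(default_factory=list)


def to_json(item: IpbItem) -> str:
    """Serialize one item, keeping the source references attached to the content."""
    if not 0.0 <= item.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return json.dumps(asdict(item), indent=2)


# Hypothetical example values, not real data.
item = IpbItem(
    consideration="civil considerations",
    statement="Placeholder statement extracted from an open source.",
    confidence=0.7,
    sources=[SourceRef("https://example.org/report", "2025-04-23T00:00:00Z", False)],
)
print(to_json(item))
```

Keeping the references and the confidence value inside the same record means a downstream analytic never receives a statement without its provenance.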
PHASE I: Determine the technical feasibility of a concept for the automated generation of IPB content. Develop a quality metric for discovered data sources to include accuracy/uncertainty considerations and user preferences. Prepare a Phase II plan.
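One possible shape for a data source quality metric that combines accuracy, uncertainty, and user preference is sketched below. It is not a required method: the Beta-distribution scoring, the uniform prior, and the `user_weight` multiplier are all assumptions chosen for illustration, and the choice of metric is left to the offeror.

```python
from dataclasses import dataclass


@dataclass
class SourceStats:
    agreements: int           # claims that matched authoritative sources
    disagreements: int        # claims that contradicted authoritative sources
    user_weight: float = 1.0  # Marine-supplied preference; 1.0 means no adjustment


def source_quality(
    stats: SourceStats, prior_a: float = 1.0, prior_b: float = 1.0
) -> tuple[float, float]:
    """Return (score, uncertainty) for a source.

    The score is the mean of a Beta(prior_a + agreements, prior_b + disagreements)
    distribution, scaled by the user weight and clipped to [0, 1].
    The uncertainty is the standard deviation of that Beta distribution.
    """
    a = prior_a + stats.agreements
    b = prior_b + stats.disagreements
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    score = min(1.0, max(0.0, mean * stats.user_weight))
    return score, variance ** 0.5


# A source with a track record versus a newly discovered one (hypothetical counts).
print(source_quality(SourceStats(agreements=8, disagreements=2)))
print(source_quality(SourceStats(agreements=0, disagreements=0)))
```

Under these assumptions a newly discovered source starts at a neutral score with high uncertainty, and the uncertainty shrinks as more of its claims are checked.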
PHASE II: Develop and evaluate a prototype for the automatic generation of IPB products from open sources. The prototype should use both a pre-defined list and automatically discovered sources. The prototype should generate IPB products targeting a minimum of three IPB considerations (e.g., civil considerations, adversary order of battle, adversary tactics, etc.). Generated products should be stored in data formats optimized to the structure of the underlying information, such as JSON, Geo Tag Image File Format (GeoTIFF), Network Common Data Form (NetCDF), etc. A product quality metric should be developed as an extension of the data source quality metric. The product quality metric should granularly evaluate information found in data sources for accuracy/uncertainty via correlation with other sources (both other open sources and authoritative sources) or other means. (For Phase II, authoritative sources need only include trusted public sources, e.g., open-access government webpages.) The prototype should support user input to refine source quality and the accuracy of specific information in the system.
Produce the following deliverables: (1) a working prototype developed according to the extended Phase II requirements; (2) product quality metric methodology; (3) a test report documenting results of prototype evaluation.
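The idea of evaluating a piece of information by correlation with other sources can be illustrated with a simple noisy-OR combination, sketched below. This is one hypothetical option, not the expected methodology. It assumes the agreeing sources are independent, which will not hold when sources copy one another or are deliberately coordinated, so a real metric would need to account for that.

```python
from math import prod


def corroborated_confidence(source_scores: list[float]) -> float:
    """Combine the quality scores of sources that agree on one claim.

    Noisy-OR: one minus the probability that every agreeing source is wrong,
    under the (assumed) independence of the sources.
    """
    for s in source_scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("source scores must be between 0 and 1")
    return 1.0 - prod(1.0 - s for s in source_scores)


# Two moderately reliable sources agreeing raise confidence above either alone.
print(corroborated_confidence([0.5, 0.5]))
```

A claim with no supporting sources receives a confidence of zero under this rule, and each additional agreeing source can only raise the value.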
PHASE III DUAL USE APPLICATIONS: Support the Marine Corps in transitioning the technology for Marine Corps use. Develop the software for evaluation to determine its effectiveness in either a formal Marine Corps schoolhouse or other training setting. Incorporate the product into larger AI-enabled mission planning tools such as the Higher Echelon Mission Planner. Generated IPB products will be presented directly to the Marines via the planning tool and used as inputs to other AI/ML analytics supporting course of action analysis.
As appropriate, focus on broadening capabilities and commercialization plans. Development of affordable, scalable, non-proprietary technologies is needed to accelerate the transition of the Marine Corps to an information age model. The commercial sector is developing some of these AI-enabled technologies, but they often do not deal with critical issues regarding non-existent, limited, or low-quality source data, do not address the diversity of data modalities employed by Naval forces, and often come with prohibitive licensing and usage fees. This technology will have broad application in the commercial sector. Examples of businesses that would be interested in this technology include companies developing generative AI, companies providing search engines, and news media. All of these companies would benefit from the ability to identify low-quality source data to improve the accuracy of the information they provide to their customers. Additionally, companies focused on generative AI have been pushing to incorporate non-textual data into their offerings. Research with multi-modal data sources would benefit them as well.
REFERENCES:
- "Marine Corps Planning Process, MCWP 5-1." Department of the Navy, 24 August 2010. https://www.marines.mil/Portals/1/MCWP%205-1.pdf
- "Intelligence Preparation of the Battlespace, MCRP 2-10B.1." United States Marine Corps, 1 July 2023. https://www.marines.mil/Portals/1/Publications/MCRP%202-10B.1.pdf?ver=WZgANNGDKsmcphvgtPdt-Q%3d%3d
KEYWORDS: Artificial Intelligence, Machine Learning, Mission Planning, Intelligence Preparation of the Battlespace, IPB, Marine Corps Planning Process, MCPP, AI/ML
** TOPIC NOTICE **
The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoD 25.2 SBIR BAA. Please see the official DoD Topic website at www.dodsbirsttr.mil/submissions/solicitation-documents/active-solicitations for any updates.
The DoD issued its Navy 25.2 SBIR Topics pre-release on April 2, 2025 which opens to receive proposals on April 23, 2025, and closes May 21, 2025 (12:00pm ET).
Direct Contact with Topic Authors: During the pre-release period (April 2, 2025, through April 22, 2025) proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. The TPOC contact information is listed in each topic description. Once DoD begins accepting proposals on April 23, 2025 no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the Pre-release period.
DoD On-line Q&A System: After the pre-release period, until May 7, 2025, at 12:00 PM ET, proposers may submit written questions through the DoD On-line Topic Q&A at https://www.dodsbirsttr.mil/submissions/login/ by logging in and following instructions. In the Topic Q&A system, the questioner and respondent remain anonymous but all questions and answers are posted for general viewing.
DoD Topics Search Tool: Visit the DoD Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoD Components participating in this BAA.
Help: If you have general questions about the DoD SBIR program, please contact the DoD SBIR Help Desk via email at DoDSBIRSupport@reisystems.com
Topic Q & A
4/9/25
Q.
Topic Clarification provided by TPOCs May 6, 2025:
The goal of this topic is to automatically generate Intelligence Preparation of the Battlespace (IPB) products with improved data accuracy, data currency, and product generation speed compared to existing methods.
Expanding the open data sources used to build an IPB will necessarily include less reliable sources, and the capability should handle the variable reliability intelligently. We want offerors to use their expertise to develop tools to assess the accuracy/uncertainty of data sources and the information they contain. The result should be a structured digital IPB product. The content should mimic what a Marine would manually generate but in digital form using structured and open data formats. The goal is not just to accumulate information for use by a Marine. The goal is to generate the final product and present that to the Marine. References, metadata, and quality evaluations can all be attached to that final product to guide its use.
Questions:
- Q) Who is the target user-base?
- Q) How is the capability expected to be deployed? Are there limitations in hardware or networking?
- Q) Will the government be providing Subject Matter Experts (SMEs) to the chosen performer(s)?
- Q) How will the capability be used? Is a user interface a requirement?
- Q) Is there a preference on which IPB considerations the offeror targets? Is there a priority to the IPB considerations?
- Q) How should accuracy/uncertainty for data sources be handled? Are there specific metrics or methodologies that offerors should use?
- Q) What should the output of the capability look like? Is there a specific format desired by the government?
- Q) Are there specific data sources the offerors should support in their capability? Are there data sources that should be avoided?
- Q) Is there a specific technical approach that offerors should use to conduct this research?
- Q) What data modalities are of interest for this topic? Should the focus be strictly on textual information?
- Q) How will developed capabilities be evaluated?
- Q) Is there a desired run time to generate an IPB product?
A.
- GCE Marines at the Regiment level and below.
- The capability is to be deployed on an on-prem server. If the capability has hardware requirements beyond a modern enterprise server, identify them in the proposal, and note any associated non-standard costs. The capability doesn't need to support tactical edge deployment and communications, or disconnected operation. The capability shouldn't be limited to cloud-only deployments, rely on cloud-specific functionality, or incorporate 3rd party functionality.
- The government won't provide SMEs; the firm is expected to provide the expertise needed to perform the work.
- The capability is intended for use by external tools and should provide machine-to-machine (M2M) interfaces. Offerors may design the interfaces as they see fit; a common pattern would be a RESTful API documented using OpenAPI. Any M2M interface that is open and provides the required functionality is acceptable. A user interface isn't a requirement of the topic. The expectation is that external tools will provide interfaces to the user and pass inputs along to the capability's interfaces. Offerors may assume any level of interactivity with the end user, provided the interactive functionality is made available via the M2M interfaces.
- There is no preference or priority among IPB considerations. Target whichever IPB considerations the capability can best support and that will provide the USMC the most benefit when implemented.
- Assessing, fusing, and reporting the accuracy/uncertainty of sources is critical to this topic. The aim is to increase the breadth of open-source data that can be incorporated into an IPB. Sources will be of questionable reliability, and some may intentionally include inaccurate information. A critical part of the research for this topic is to incorporate less reliable sources by developing methods to assess the accuracy of information and to increase overall accuracy by incorporating information across sources. No specific technical approach is desired. Offerors should use their expertise to design a suitable approach.
- The objective is to produce IPB products in a digital form usable by other software tools. Collecting pieces of information from multiple sources for a specific IPB consideration is not sufficient. The product should be contained in a structured and open data format. No one format is required. Different IPB products may align best with different formats. The IPB product information should be in a structured form such that external tools can easily further analyze the data or visualize it for the user, potentially in conjunction with other data.
- No specific set of data sources is required. The topic does require support for configuring suggested data sources (both persistent suggested sources and one-time suggested sources). Suggested sources are meant to augment the list of data sources rather than limit it. There is no strict set of data sources that should be avoided. The goal of the topic is to include less reliable or even potentially malicious data sources in the discovery process and to intelligently assess and incorporate the information that is discovered. All data modalities are of interest.
- No specific technical approach desired. LLMs are mentioned in the topic as possible approach. However, LLMs are not required to be used, nor will their use be evaluated more favorably. Any approach that can solve the problem is within scope. LLMs may not be the best solution for every (or any) IPB consideration.
- All data modalities are within scope for this topic. Offerors do not have to limit their approach to textual information. Text, audio, images, video, raw gridded data, structured data, etc. are all in scope. Different IPB considerations may benefit from sourcing different data modalities.
- Capability will be evaluated on the quality of the product generated including the accuracy, currency, and clarity of the information in the IPB product. Accuracy/uncertainty metadata associated with information in the IPB will be considered when evaluating the overall accuracy of the product. Active-duty and mission planning SMEs will be evaluating. Time to generate the product and solution costs will also be included in the evaluation.
- Generation of the IPB product should be completed within an hour of initial request.
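The answers above mention a RESTful API documented using OpenAPI as a common pattern, and the need to configure both persistent and one-time suggested sources. A minimal sketch of what such a document might contain follows, built as a Python dictionary and printed as JSON. Every path, schema name, and field here (`/sources`, `/products`, `SuggestedSource`, `persistent`, and so on) is a hypothetical choice for illustration; the topic leaves the interface design to the offeror.

```python
import json

# Hypothetical M2M surface: all paths, schemas, and fields are assumptions.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "IPB generation service (illustrative)", "version": "0.1.0"},
    "paths": {
        "/sources": {
            "post": {
                "summary": "Suggest a data source",
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/SuggestedSource"}
                        }
                    },
                },
                "responses": {"201": {"description": "Source accepted"}},
            }
        },
        "/products": {
            "post": {
                "summary": "Request generation of an IPB product",
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/ProductRequest"}
                        }
                    },
                },
                "responses": {"202": {"description": "Generation started"}},
            }
        },
    },
    "components": {
        "schemas": {
            "SuggestedSource": {
                "type": "object",
                "required": ["url"],
                "properties": {
                    "url": {"type": "string", "format": "uri"},
                    # True keeps the source for later requests; False uses it once.
                    "persistent": {"type": "boolean", "default": False},
                },
            },
            "ProductRequest": {
                "type": "object",
                "required": ["consideration", "area_of_interest"],
                "properties": {
                    "consideration": {"type": "string"},
                    "area_of_interest": {"type": "string"},
                },
            },
        }
    },
}

print(json.dumps(spec, indent=2))
```

Because the document is plain JSON, an external user-facing tool could generate a client from it without any knowledge of how the capability is implemented.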
4/23/25
Q.
- What level of interactivity should the tool provide for Marines to adjust source/site rankings or correct content inaccuracies? Should changes be logged for auditability or adaptive learning?
- How do you envision accuracy, uncertainty, and source reliability being quantified or visualized for the end user? Would you prefer Bayesian models, confidence scoring, user-feedback loops, or a combination?
- What types of NLP/LLM capabilities are most desired—e.g., summarization, information extraction, named entity recognition, semantic search? Should we prioritize open-source LLMs or explore fine-tuning smaller models for secure environments?
- What are the anticipated deployment environments for this system (e.g., cloud-based, edge compute, disconnected operations), and are there cybersecurity or bandwidth constraints we should plan for?
- What specific evaluation benchmarks or criteria will define a successful prototype in Phase II—e.g., reduction in time to produce IPB, percentage accuracy improvement, or usability by Marines with minimal technical background?
- Are there any constraints or preferences regarding the automated discovery and ingestion of open-source content—such as pre-approved domains, government feeds, or multilingual sources? Should the tool also consider semi-structured content like PDFs, maps, or forum posts?
A.
- The tool should allow Marines to adjust the source list (via machine to machine methods from other user facing tools) and site rankings / uncertainty estimates.
- The method of quantifying uncertainty is left to the offeror. The requirement is that the information is communicated along with the data.
- The desire is to get as close to a complete IPB consideration as possible along with an estimate of accuracy / uncertainty for the generated product. The offeror can choose any approach to achieve this goal (with or without LLMs). Right now, the desire is for a system to work on open-source data at the unclassified level. There is no preference on open-source LLMs or fine-tuned smaller models.
- The system should run on an on-prem server with Internet access. Disconnected operations are not required for this effort. The system should limit protocol (and port) usage to HTTP/HTTPS (80/443). There are no specific bandwidth constraints beyond functioning on a standard commercial internet link.
- Reduction in time, improvement in accuracy, and improvement in completeness are all relevant metrics. The intent will be to use the tool in a machine-to-machine manner with other components providing a user interface so usability will not be evaluated. Ease of integration with other tools will be a metric.
- The tool should support Marines adding sources (via machine to machine methods from other user facing tools). Multilingual sources are certainly within scope. Further user refinement of constraints would be useful but not required. The tool may consider all content unstructured, semi-structured, and structured.
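The restriction to HTTP/HTTPS on ports 80/443 could be enforced before any discovered or user-suggested source is fetched, as in the sketch below. The function name is hypothetical, and the strict pairing of each scheme with its default port is an interpretation made for this sketch rather than something stated in the answers.

```python
from urllib.parse import urlsplit

# Assumed pairing: each allowed scheme may only use its default port.
ALLOWED_PORTS = {"http": 80, "https": 443}


def is_allowed_url(url: str) -> bool:
    """Return True if the URL uses HTTP on port 80 or HTTPS on port 443."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError for a malformed port
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_PORTS or not parts.hostname:
        return False
    # No explicit port means the scheme's default port is used.
    return port is None or port == ALLOWED_PORTS[parts.scheme]


for candidate in (
    "https://example.org/report",
    "https://example.org:8443/report",
    "ftp://example.org/file",
):
    print(candidate, is_allowed_url(candidate))
```

Applying the check at the point of retrieval covers both sources supplied by Marines and sources found through automated discovery.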
4/21/25
Q.
- Are there specific mission variables or IPB considerations (e.g., terrain, civil factors, adversary TTPs) that should be prioritized for Phase I modeling?
- Will the government provide a representative list of open-source data repositories or expect offerors to define and justify their own selection strategy for content discovery?
- Are non-textual or multimodal data types (e.g., images, geospatial files, structured web tables) expected to be included in Phase I feasibility, or is the initial focus primarily on text-based inputs?
- Does the government have a preferred method or tool for computing source trust/uncertainty metrics (e.g., correlation scoring, reputation-based weighting), or is that left to the offeror’s discretion?
- What level of content granularity is expected in Phase I outputs—paragraph-level, sentence-level, or full-document-level transformations into structured formats?
- Should system architecture account for offline use or tactical edge deployment, or is initial development assumed to take place in a cloud-accessible unclassified environment?
- Will the Phase I evaluation process prioritize end-to-end automation or the modularity of components (e.g., separate pipelines for retrieval, parsing, formatting, and ranking)?
- Is there a target format or ontology (e.g., JSON, NetCDF, GeoTIFF) for the structured outputs that would most closely match downstream Marine Corps integration goals?
A.
- All mission variables and IPB considerations are in scope.
- The offeror will define their own method of content discovery but Marines must be able to provide specific sources (via machine to machine methods from other user facing tools).
- The offeror should decide what data to include to best support automated IPB consideration generation.
- No. This is left to the offeror’s discretion.
- No specific level of content granularity is required. The offeror can choose the level of granularity most effective to the goal of creating a usable IPB product in a machine-to-machine readable format.
- The system should run on an on-prem server with Internet access. Offline use and tactical edge deployments are not required for this effort.
- Evaluation will be based on the end-to-end system.
- Structured open formats are required. Otherwise, offerors are allowed to pick the best format for the information that is generated.
4/20/25
Q.
- What specific open-source information types or platforms should the tool prioritize for scanning and parsing to support IPB considerations (e.g., social media, news articles, geospatial data)? Are there preferred sources (e.g., specific websites, public databases) or data modalities (e.g., text, images, videos) that are most relevant to the IPB process, and what are their typical formats?
- Can you clarify the expected level of automation for generating structured IPB products, and how much human oversight or interaction is anticipated? Should the tool fully automate the generation of IPB products with minimal Marine input, or is it expected to support Marines by providing editable drafts that they refine?
- What are the key performance metrics for evaluating the tool’s accuracy and quality of generated IPB products, particularly for the source and product quality metrics? Beyond correlations with other sources, are there specific thresholds for accuracy, uncertainty estimates, or relevance that the tool must achieve, and how should trade-offs (e.g., speed vs. accuracy) be prioritized?
- Are there specific data formats or integration requirements for the structured, computer-readable IPB products to ensure compatibility with other MCPP analytics tools? For example, should the tool prioritize formats like JSON, GeoTIFF, or NetCDF, and are there existing Marine Corps systems (e.g., Higher Echelon Mission Planner) that the outputs must interface with?
- What are the operational constraints for deploying the tool in tactical environments, such as connectivity, computational resources, or user interface preferences? Are there limitations on internet access, processing power, or device types (e.g., mobile vs. desktop) that the tool must accommodate, and what are the expectations for usability by Marines with varying technical expertise?
- Who specifically are the anticipated end users of this technology once it is developed?
A.
- The government is not prioritizing specific data sources. It is intended that Marines will be able to provide specific sources at the time of usage.
- The primary goal is to generate tools to help a Marine build an IPB. Building a full IPB report draft would not be out of scope of that goal, but a Marine should be able to review and modify the usage of discovered data as well as customize the final report.
- The government is not prioritizing trade-offs. Offerors will have to develop metrics for IPB consideration and methods of calculating those metrics.
- The preference is for computer-understandable formats (JSON, XML, etc.) that would be usable by other software tools.
- The capability is expected to run on an on-prem server. The system will have internet access. The intent is for this capability to be used by other user facing applications, so a user interface is not required. The capability should support machine to machine interaction from other user facing tools.
- End users are Marines working at Battalion and higher levels focused on Maritime and ground domains.