2025 UAP Workshop: Narrative Data, Infrastructures, and Analysis
Attributed analysis published by AARO/ORNL — an interested party's position, not an independent verdict. Presented alongside the case record, not as a resolution of it.
Workshop Synthesis and Recommendations
August 5-6, 2025
Associated Universities, Inc. (AUI)
Workshop sponsored by:
All-domain Anomaly Resolution Office (AARO)
Table of Contents
Executive Summary
Introduction and Purpose
About the Workshop
Establishing open dialogue
Workshop Summary
Agenda overview
Breakout Discussion Summaries
Breakout Session #1: Identifying, accessing, and integrating data sources [DAY 1]
Breakout Session #2: Pathways for data analysis and interpretation at scale [DAY 1]
Breakout Session #3: Cleaning, organizing, and linking data: What can and should be done? [DAY 2]
Outcomes and Recommendations
Synthesis of Findings
Relevant data types and sources of UAP narrative reports
Barriers and challenges in data collection and use
Metadata and context for usability and analysis
Linking data sources and developing a unified approach
Assessing credibility and quality of reports
AI and analytical methods
Forward-looking strategy and key considerations
Recommended actionable next steps
Appendix A: Invitation Letter
Appendix B: Guidelines for Conduct
Appendix C: Workshop Agenda
Appendix D: Breakout Session Prompts
Executive Summary
From both government and scientific perspectives, advancing Unidentified Anomalous
Phenomena (UAP) research requires rigorous data collection, standardization, and analysis.
Most UAP reports are fragmented, sparse, and unstructured, ranging from military logs and pilot
reports to archival records, social media posts, and civilian testimony. Interpreting this
heterogeneous data at scale is complicated by barriers of classification, translation, and retention.
At the same time, UAP reports also present opportunities for novel methods of integration,
metadata design, and analysis. The 2025 UAP Workshop on Narrative Data, Infrastructures, and
Analysis brought together 40 participants from government, academia, and independent research
organizations. The meeting focused specifically on the challenges and opportunities of working
with UAP narrative reports and related data sources.
Workshop discussions highlighted several cross-cutting findings. First, effective progress
requires clear standards and common reporting templates, with robust metadata capturing time,
location, provenance, morphology, and contextual details. Second, linking across datasets
(military and civilian, including archival, environmental, and technical) must balance
interoperability with privacy, ethical, and classification constraints. Third, credibility is best
assessed through corroboration, but for efficiency there is a need for automated methods to filter
reports and surface the most promising for investigation. Fourth, AI and machine learning tools
offer capacity for transcription, triage, clustering, and semantic search, but they must be
deployed cautiously to avoid hallucination, bias, and amplification of hoaxes. Human oversight
and iterative workflows remain essential. Finally, the workshop underscored the importance of
community engagement and trust-building, encouraging the scientific community to cultivate a
sustainable “community of practice” for UAP research with further work and convenings.
This report concludes with recommended actionable next steps to establish metadata templates;
combine human expertise with AI tools; leverage existing tools and infrastructures; support
triage with awareness of bias; convene community members; facilitate qualitative integration in
investigation, such as interviews; prioritize collection of new high-quality reports while
integrating historical data; and improve reporting interfaces to enhance accessibility,
collaboration, and transparency. Together, these findings and recommendations point toward a
multi-disciplinary and community-engaged approach to UAP narrative data, which may
influence how and where technical sensors are deployed.
Introduction and Purpose
Understanding the nature of Unidentified Anomalous Phenomena (UAP) has emerged in recent
years as a pressing area of inquiry in need of rigorous scientific approaches, as well as cross-
disciplinary, cross-sector and international collaboration. Analyzing reports of UAP related
sightings and experiences presents unique challenges due to the large-scale, heterogeneous, and
qualitative nature of the reports originating from military and civilian sources. These reports
typically lack standardized metadata, making comparative analysis difficult. Additionally, the
integration of UAP reports from disparate sources—such as military databases, online reporting
systems, digital and digitized archival records, and social media—poses significant challenges
for harmonization and verification of data and construction of evidence. The complexity of these
datasets requires innovative data infrastructure solutions to enhance reliability, accessibility, and
interoperability. The workshop explored these challenges and sought strategies to improve UAP
data standardization, integration, and analytical approaches.
Recent advances in artificial intelligence (AI) and machine learning present both potential hazards
and opportunities to address these challenges. Tools such as Large Language Models (LLMs)
can assist with transcription, clustering, and pattern detection at scale, but they risk introducing
bias and hallucination. Responsible use of AI to help organize, analyze, and integrate UAP
reports at scale requires evaluation, human oversight, and shared frameworks for interpretation,
alongside new models to ensure transparency and trust across diverse research communities.
Therefore, the overall purpose of the workshop was to gather perspectives from the broader
scientific community and advance the science of UAP.
About the Workshop
The workshop centered on the collection, organization, and interpretation of UAP reports, with
attention to the challenges and opportunities of working with narrative data. The primary
objectives established for the workshop were to:
• Assess the current landscape of UAP reporting systems and data repositories;
• Identify key challenges and gaps in UAP data collection, standardization, and
accessibility;
• Explore methodologies for data analysis and pattern recognition in UAP reports;
• Nurture trust and collaboration between researchers, government agencies, and civilian
organizations; and
• Propose recommendations for developing a robust UAP data infrastructure.
Outside participation was limited due to budget constraints and institutional capacity. Potential
participants were identified based on demonstrated expertise in one or more of the following
areas: AI and machine learning; UAP research and data; physical and natural sciences;
information and data science; archives and records; analysis methods; cyberinfrastructure and
computation; and human and social sciences.
If an invitee declined to attend, we extended an invitation to another candidate with similar
skills/experience identified through online research and word of mouth. The final workshop
included 40 participants.
Establishing open dialogue
Participant privacy was an important consideration throughout workshop planning, and
Institutional Review Board (IRB) approval governed data collection and security for the
workshop. The organizing committee further wished to establish a neutral environment in which
participants holding diverse beliefs and backgrounds would feel comfortable engaging. It was
very important that those attending the workshop felt comfortable sharing their thoughts and
ideas without being concerned about what others might say or do. The planning committee also
decided not to publicize the workshop online beforehand to limit outside attention and encourage
comfort and open discourse among an intimate group of participants. Participants were urged to
avoid taking photos or attributing statements to individuals without permission. The organizers
made efforts to accommodate privacy concerns after they identified a final list of attendees. This
included:
● Name tag options: individuals could simply list their first name with no institutional
affiliation;
● Individuals could choose to remove themselves from some sessions or conversations if
they felt uncomfortable engaging in various topics;
● Photographing other attendees was not permitted unless an attendee received consent
from all individuals who appeared in a photo; and
● Respect for all and approaching conversations with an open mind was a requirement for
participation. If an individual did not feel this was possible, they were asked to not attend.
See email communication sent to all attendees in Appendix B: Guidelines for Conduct.
Workshop Summary
Agenda overview
The event began with a casual, pre-workshop networking social in the evening of August 4,
2025. The organizers provided welcome and opening remarks on the morning of August 5,
2025. Brief participant introductions followed these remarks. A keynote address about the
importance of good UAP data primed participants for the first breakout session (“Identifying,
accessing, and integrating data sources”), held before breaking for lunch. The afternoon of
August 5, 2025 began with a plenary talk, followed by the first panel discussion, “Opportunities
and challenges with AI”, and a second breakout session (“Pathways for data analysis and
interpretation at scale”). Day 1 concluded with a brief whole group discussion. A workshop
dinner was held at a restaurant near the workshop venue. Day 2 began with a second plenary talk
and second panel discussion, “Harmonizing qualitative and quantitative perspectives on narrative
data.” After lunch, participants delivered a series of lightning talks ahead of the final
breakout session (“Cleaning, organizing and linking data: What can and should be done?”).
Throughout the event, the organizing team collected notes that were later transcribed and
anonymized. For each breakout session, moderators collected records, and notetakers were
assigned to further ensure a robust record of the workshop proceedings.
Breakout Discussion Summaries
Prompts for each breakout session are included in Appendix D: Breakout Session Prompts.
Breakout Session #1: Identifying, accessing, and integrating data sources [DAY 1]
The first breakout session addressed central challenges of UAP research. Discussions revealed
the scope of the UAP data landscape as a patchwork of historical case files, contemporary
narrative reports, sensor-based data (radar, imagery, flight data), and environmental or contextual
datasets (weather, astronomical, seismological). Participants expressed enthusiasm for the
potential to link these disparate sources, but they also acknowledged the barriers posed by
inconsistency in metadata, classification restrictions, missing or inaccessible records, and stigma
around UAP reporting. Despite these challenges, groups converged on the outlook that with clear
standards, prototype integration projects, and intentional collaboration across organizations, it is
possible to create interoperable and sharable datasets that would enable more rigorous and
scalable analysis of UAP reports.
Breakout Session #2: Pathways for data analysis and interpretation at scale [DAY 1]
The second breakout session explored methods and limitations for analyzing UAP narrative data.
Across groups, participants grappled with the tension between extracting operationally useful
signals and respecting the experiential, cultural, and historical richness embedded in reports.
Overall, groups agreed that UAP narratives cannot be reduced to a single analytic approach.
Corpus-level methods (time/space clustering, keyword trends, statistical correlation, graph
analysis) are useful for pattern detection and hypothesis generation, while narrative/experiential
methods (phenomenology, discourse analysis) are useful for preserving meaning, cultural
context, and witness voices. Infrastructures should allow these modes to coexist.
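As one illustration of the corpus-level methods mentioned above, the following is a minimal sketch of time/space clustering, not a method endorsed by the workshop. It assumes each report already carries a latitude, longitude, and timestamp; the field names, the sample reports, and the 50 km / 2 hour thresholds are hypothetical choices made only for this example. Two reports are linked when they are close in both space and time, and clusters are the resulting connected groups.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def cluster_reports(reports, max_km=50.0, max_gap=timedelta(hours=2)):
    """Group reports that are near each other in both space and time.

    Thresholds are illustrative assumptions. Uses a small union-find so
    that chains of linked reports end up in the same cluster.
    """
    n = len(reports)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            a, b = reports[i], reports[j]
            near = haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= max_km
            close = abs(a["time"] - b["time"]) <= max_gap
            if near and close:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(reports[i]["id"])
    return sorted(clusters.values())


# Hypothetical sample reports, for illustration only.
reports = [
    {"id": "r1", "lat": 38.03, "lon": -78.48, "time": datetime(2025, 8, 5, 21, 0)},
    {"id": "r2", "lat": 38.10, "lon": -78.50, "time": datetime(2025, 8, 5, 21, 40)},
    {"id": "r3", "lat": 38.05, "lon": -78.45, "time": datetime(2025, 8, 6, 3, 0)},
    {"id": "r4", "lat": 34.05, "lon": -118.24, "time": datetime(2025, 8, 5, 21, 10)},
]
print(cluster_reports(reports))
```

A grouping like this only proposes candidates for hypothesis generation; the narrative text of each clustered report would still need to be read in full, in keeping with the coexistence of modes described above.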
Breakout Session #3: Cleaning, organizing, and linking data: What can and should be
done? [DAY 2]
The third and final breakout activity analyzed the structure of a hypothetical online reporting
form that has collected 1,000 UAP reports stored as PDF files to identify possibilities for data
analysis with the data collected, as well as potential improvement of the form. The discussion led
to the following overarching suggestions that are broadly informative for online UAP reporting
tools.
1. Intake flow and structure:
• Begin with a free-text box (and optional audio upload) where the witness provides their
account in their own words. Use AI-assisted extraction to propose structured fields,
which the witness can then confirm or correct.
• Frame questions around what was perceived (angular size, shape, movement, sound,
effects) rather than presumed properties (exact distance, solid object dimensions).
2. Additions to the form:
• Ask witnesses to explain how they estimated size, distance, or speed (i.e. context
prompts).
• Capture whether this has happened before and, if so, how often.
• Instead of “mass sighting: yes/no”, include approximate numbers of witnesses.
• Include a field for whether the object seemed to react to observer presence.
• Add examples of technological effects (e.g., radio static, car failure) and basic prompts
about feelings or aftereffects that could be informative (e.g., “Did you discuss this with
others? Would you want professional/peer support?”).
• Automatically ingest and display photo metadata (camera model, timestamp, location),
giving users the option to redact sensitive fields.
3. Standardization and cleaning:
• Accept location information including city/address/zip/lat–long, with simple guidance
and drop-downs, and normalize on the back end.
• Enforce a single-entry format for dates and times (calendar widget or drop-downs).
• Allow multiple inputs for units (imperial/metric) but convert and store consistently.
• Include structured numeric fields for object count and multiple objects, with adaptive
follow-up to describe each object separately.
4. Taxonomical considerations:
• Provide a concise taxonomy of common shapes (disk, sphere, triangle, cigar, “other”) but
allow free-text for unusual forms.
• Update descriptive references for cultural familiarity (using objects such as a coin or debit
card to estimate size) and internationalize/translate forms for broader accessibility.
5. Integration and linkage:
• Include a field to indicate whether the event was reported elsewhere (NUFORC,
MUFON, FAA, etc.).
• Design the schema so reports can be linked to FAA/NASA Aviation Safety Reporting
System (ASRS) data, Automatic Dependent Surveillance-Broadcast (ADS-B) flight
tracks, weather radar, astronomical databases, fireball networks, etc.
• Enable dynamic follow-ups for multiple objects, multiple witnesses, or sequential events.
6. Governance and trust:
• Give reporters clear control over what information (such as geolocation, photo metadata)
is shared publicly.
• Commit to aggregated, de-identified data releases (maps, trend summaries) to build trust
without encouraging hoaxes.
• Light-touch well-being questions were suggested, to help identify if respondents would
like professional or peer follow-up without stepping into clinical assessment.
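Several of the suggestions above (accept multiple units but store consistently, offer a concise shape taxonomy with free text for unusual forms, and let reporters control what is shared) can be sketched in a few lines of back-end code. This is a minimal sketch under stated assumptions, not a specification from the workshop: the function names, the unit table, and the field names in the sample report are hypothetical.

```python
# Shape taxonomy taken from the suggestion above; anything else becomes "other".
SHAPES = {"disk", "sphere", "triangle", "cigar"}

# Conversion factors to metres (standard definitions of each unit).
TO_METRES = {"m": 1.0, "km": 1000.0, "ft": 0.3048, "mi": 1609.344}


def normalize_length(value, unit):
    """Accept imperial or metric input and store the value in metres."""
    unit = unit.strip().lower()
    if unit not in TO_METRES:
        raise ValueError(f"unsupported unit: {unit!r}")
    return float(value) * TO_METRES[unit]


def normalize_shape(text):
    """Return (taxonomy_shape, free_text).

    Known shapes map onto the taxonomy; unusual forms are kept verbatim
    as free text under "other" so no description is lost.
    """
    cleaned = text.strip().lower()
    if cleaned in SHAPES:
        return cleaned, None
    return "other", text.strip()


def public_view(report, redact):
    """Copy of a report without the fields the reporter chose to withhold."""
    return {key: value for key, value in report.items() if key not in redact}


# Hypothetical report, for illustration only.
raw = {"shape": "Boomerang", "estimated_distance": (500, "ft"), "geolocation": "38.03,-78.48"}
shape, shape_text = normalize_shape(raw["shape"])
stored = {
    "shape": shape,
    "shape_free_text": shape_text,
    "estimated_distance_m": normalize_length(*raw["estimated_distance"]),
    "geolocation": raw["geolocation"],
}
print(public_view(stored, redact={"geolocation"}))
```

Keeping the witness's original wording alongside the normalized value is a deliberate choice here: the normalized field supports corpus-level analysis, while the free text preserves the account as given.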
Outcomes and Recommendations
Synthesis of Findings
Relevant data types and sources of UAP narrative reports
Participants emphasized that UAP research requires drawing on a diverse ecosystem of data,
extending beyond witness testimony. Primary narrative reports in formats ranging from PDFs
and CSVs to emails and oral histories remain central, offering firsthand accounts that, when
digitized and transcribed as needed, can be structured for analysis. These reports are
complemented by smartphone photos and videos, which are widely available but often of poor
quality, though improving over time.
Government sources hold both classified and unclassified records, including finished
intelligence and historic documents. Military reports and ship logs are particularly robust,
providing structured information on platforms, flight plans, and pilots, while the FAA continues
to collect pilot reports. Other data streams include social media posts, which are often
multimodal (such as online and social media videos); international partner databases; and
structured technical or scientific sensor data, such as radar or spectrum analyses. Supplementary
contextual data is also critical: flight and weather records, seismological data, satellite
imagery, and even doorbell or CCTV video can corroborate sightings.
Barriers and challenges in data collection and use
Despite many potential sources of data, significant obstacles remain. Access to social media data
has become more restricted due to corporate licensing policies, while ethical and jurisdictional
considerations complicate usage. Classification remains a dominant barrier, as substantial UAP
data may be captured on classified sensors, automatically rendering it inaccessible until
declassified. Other challenges include language and translation barriers, with both human and
automated systems prone to errors, especially in low-resource languages. Stigma in reporting,
particularly among pilots, undermines data timeliness and completeness, while the lack of
standardized reporting formats across agencies and organizations further fragments the
landscape. Time sensitivity and weak retention policies have led to the loss of critical records, as
in the well-known Nimitz case. Technical issues are also substantial. Older data can be difficult
to digitize, cursive writing resists Optical Character Recognition (OCR) systems, and
crowdsourced transcription projects suffer from low-quality outputs, recently worsened by
misuse of generative AI. Finally, the field must grapple with fake data and disinformation,
including AI-generated photos or videos, which pose risks for both public trust and analytic
integrity.
Metadata and context for usability and analysis
Effective use of UAP data requires rich contextual metadata. Every report should ideally contain
time, date, and location, preferably with geospatial precision. Distinguishing between descriptive
metadata (objective characteristics like morphology or frequency band) and interpretive metadata
(subjective effects or experiential meaning) is key. Metadata should also capture event-specific
details, such as behaviors, sensor positions, and witness background, and must extend to
technical parameters for structured data. Provenance (the chain of custody and source of the
data) is essential for ensuring interpretability and trust. For visual evidence, metadata such as
device type and embedded geotags allow validation against reported facts. Participants also
emphasized flexible and well-designed reporting forms, for example including “refuse to
answer” options to prevent fabricated entries when respondents lack knowledge.
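To make the distinctions in this section concrete, the following is a minimal sketch of a report metadata template. It is an assumption-laden illustration rather than a proposed standard: the class names, field names, and sample values are hypothetical. It keeps descriptive and interpretive metadata in separate groups, records provenance, and represents "refuse to answer" and "do not know" as explicit values so that a respondent never needs to invent an entry.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Union


class NoAnswer(Enum):
    """Explicit non-answers, so a blank never has to be filled with a guess."""
    DECLINED = "declined"  # respondent chose "refuse to answer"
    UNKNOWN = "unknown"    # respondent does not know


Answer = Union[str, float, NoAnswer]


@dataclass
class Provenance:
    source: str                                             # where the report came from
    custody_chain: List[str] = field(default_factory=list)  # who has handled it, in order


@dataclass
class ReportMetadata:
    observed_at: str   # date and time, e.g. an ISO 8601 string
    latitude: Answer
    longitude: Answer
    provenance: Provenance
    descriptive: Dict[str, Answer] = field(default_factory=dict)   # e.g. morphology
    interpretive: Dict[str, Answer] = field(default_factory=dict)  # e.g. experienced effects

    def unanswered(self) -> List[str]:
        """Names of fields the respondent declined or could not answer."""
        answers = {"latitude": self.latitude, "longitude": self.longitude,
                   **self.descriptive, **self.interpretive}
        return sorted(k for k, v in answers.items() if isinstance(v, NoAnswer))


# Hypothetical example record, for illustration only.
example = ReportMetadata(
    observed_at="2025-08-05T21:00:00Z",
    latitude=38.03,
    longitude=-78.48,
    provenance=Provenance("online form", ["witness", "intake server"]),
    descriptive={"shape": "sphere", "angular_size_deg": NoAnswer.UNKNOWN},
    interpretive={"felt_effects": NoAnswer.DECLINED},
)
print(example.unanswered())
```

Recording a non-answer as data, rather than as an empty field, lets later analysis distinguish "not asked", "not known", and "withheld".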
Linking data sources and developing a unified approach
Given the fragmented nature of UAP data, participants argued for modest, pilot-scale integration
projects as a starting point. Establishing common terminology and data dictionaries is important
to harmonize datasets across agencies and disciplines. Modular and extensible metadata
standards could lead toward a composable ecosystem, potentially implemented through
standardized templates, Interface Control Documents (ICDs), or APIs. Some form of established
governance is needed to facilitate data management and access and engagement for researchers
while alleviating inter-agency silos. Transparency was highlighted as both a goal and a
challenge, as unclassified data should be made available to academia, while sensitive material
must remain protected. Lessons from other fields, such as genetics and astronomy, were cited as
models for developing interoperable metadata standards and ontology-driven approaches.
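A shared data dictionary of the kind discussed above can start very small. The sketch below is only an illustration under stated assumptions: the common terms, the source-specific aliases, and the sample record are all hypothetical and do not describe any agency's actual schema. It maps source field names onto common terms and sets aside anything it cannot map, so that nothing is silently dropped.

```python
# Hypothetical data dictionary: common term -> source-specific aliases.
DATA_DICTIONARY = {
    "observed_at": ["date_time", "sighting_time", "dtg"],
    "shape": ["object_shape", "morphology"],
    "witness_count": ["witnesses", "num_observers"],
}


def build_lookup(dictionary):
    """Invert the dictionary into a case-insensitive alias -> common term map."""
    lookup = {}
    for common, aliases in dictionary.items():
        for name in [common, *aliases]:
            lookup[name.lower()] = common
    return lookup


def harmonize(record, lookup):
    """Rename known fields to common terms.

    Returns (harmonized, unmapped). Fields with no dictionary entry, or that
    collide with a common term already filled, are kept in `unmapped` for
    human review instead of being discarded.
    """
    harmonized, unmapped = {}, {}
    for name, value in record.items():
        common = lookup.get(name.strip().lower())
        if common is None or common in harmonized:
            unmapped[name] = value
        else:
            harmonized[common] = value
    return harmonized, unmapped


LOOKUP = build_lookup(DATA_DICTIONARY)
# Hypothetical source record, for illustration only.
print(harmonize({"DTG": "052100ZAUG25", "Morphology": "sphere", "crew": "2"}, LOOKUP))
```

Starting from a mapping like this fits the pilot-scale approach: the dictionary can grow one source at a time, and the unmapped fields show where terminology still needs to be agreed.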
Assessing credibility and quality of reports
Participants highlighted the importance of sensor reliability, noting that human perception is
fallible. Establishing gold standard exemplars of high-quality reports could help guide future
collection and analysis. Semi-automated triage, assisted by AI, offers promise for sifting through
massive datasets to identify cases with likely conventional explanations as well as cases of
potential interest, though human oversight remains indispensable. Furthermore, credibility is
enhanced when reports are corroborated by multiple witnesses or independent data streams, such
as radar or weather records. Interviews and psychological screening of witnesses were offered as
examples of how to assess motivations and reduce false reports, though it was acknowledged
that this is difficult to implement at scale. At the same time, biases in favor of certain professions
(pilots, police) must be acknowledged due to enhanced observational training and skills. A
phenomenological approach (qualitative analysis of indicators of lived experiences) allowing
patterns to emerge from narrative accounts was recommended as a complement to quantitative
methods, ensuring that unusual but meaningful details are not prematurely excluded.
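The corroboration idea above can be illustrated with a very simple ordering of a review queue. This is a hedged sketch, not a credibility model proposed by participants: the counting rule, the field names, and the sample reports are assumptions made for the example. It only orders reports for human review by how many independent lines of corroboration they have; it does not dismiss anything.

```python
def corroboration_count(report):
    """Count independent lines of corroboration for one report.

    Assumed rule for this sketch: each distinct independent data stream
    (e.g. radar, weather records) counts once, and the presence of more
    than one witness counts once.
    """
    streams = set(report.get("independent_streams", []))
    has_other_witnesses = report.get("witness_count", 1) > 1
    return len(streams) + (1 if has_other_witnesses else 0)


def triage_queue(reports):
    """Order reports for human review, most corroborated first.

    Uncorroborated reports stay in the queue; they are only ranked lower.
    """
    return sorted(reports, key=lambda r: (-corroboration_count(r), r["id"]))


# Hypothetical reports, for illustration only.
sample = [
    {"id": "a", "witness_count": 1, "independent_streams": []},
    {"id": "b", "witness_count": 3, "independent_streams": ["radar", "weather"]},
    {"id": "c", "witness_count": 1, "independent_streams": ["radar"]},
]
print([r["id"] for r in triage_queue(sample)])
```

Because the ranking never removes a report, unusual but meaningful accounts remain available for the qualitative review described above.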
AI and analytical methods
AI offers opportunities for pattern recognition, hypothesis generation, and efficiency gains in
large-scale text and multimodal data analysis. Techniques such as semantic search, clustering,
and multimodal modeling (for example, combining acoustic and infrared signals) can help
identify anomalies. AI is also valuable for routine tasks, such as extracting dates or locations
from unstructured text, or triaging likely misidentifications. However, there are risks associated
with AI. Hallucination (the generation of convincing but false conclusions) remains a core
concern. AI analysis is only as reliable as the quality of its input, underscoring the “garbage in,
garbage out” principle. Additionally, LLMs are already biased by UFO-related cultural content,
potentially skewing analyses. Small datasets limit the potential for model training, though pre-
trained models may still be repurposed. Best practices involve an iterative human-AI
collaboration, where algorithms provide preliminary analysis that is verified, corrected, and
enriched by human researchers. Ensemble approaches, leveraging multiple models, may reduce
error rates. Overall, tasks must be carefully defined to align AI methods with research goals,
ensuring a balance between qualitative depth and quantitative rigor.
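The combination of ensemble approaches and human verification described above can be sketched as a simple agreement check. This is a minimal illustration under stated assumptions: the labels are hypothetical outputs from several models, and the two-thirds agreement threshold is an arbitrary choice for the example. When the models agree strongly enough the label is accepted provisionally; otherwise the report is flagged for a human researcher.

```python
from collections import Counter


def ensemble_label(labels, min_agreement=2 / 3):
    """Combine labels proposed by several models for one report.

    Returns (label, agreement, needs_human_review). The label is None
    when agreement falls below the (assumed) threshold, so that a human
    decides instead of the ensemble.
    """
    if not labels:
        raise ValueError("at least one model label is required")
    label, votes = Counter(labels).most_common(1)[0]
    agreement = votes / len(labels)
    if agreement >= min_agreement:
        return label, agreement, False
    return None, agreement, True


# Hypothetical model outputs, for illustration only.
print(ensemble_label(["aircraft", "aircraft", "balloon"]))
print(ensemble_label(["aircraft", "balloon", "satellite"]))
```

Even an accepted label is a preliminary analysis in the sense used above; the agreement value can be stored with the report so reviewers can see how contested each label was.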
Forward-looking strategy and key considerations
The group emphasized the need for a forward-looking research infrastructure that integrates
proactive data collection, robust metadata standards, and interdisciplinary collaboration. Some
argued for focusing on new, higher-quality data collection while others urged continued
investment in historical data to preserve its potential value. Future infrastructure priorities
include a unified security solution for managing classified and unclassified data, improved
questionnaire design for witness reports, and benchmarking systems to track analytic
performance over time. Importantly, even “low quality” or stigmatized reports should not be
discarded but made available for diverse lines of inquiry an