2025 UAP Workshop: Narrative Data, Infrastructures, and Analysis

All-domain Anomaly Resolution Office · aaro analysis

Attributed analysis published by AARO/ORNL — an interested party's position, not an independent verdict. Presented alongside the case record, not as a resolution of it.


2025 UAP Workshop: Narrative Data, Infrastructures, and Analysis
Workshop Synthesis and Recommendations
August 5-6, 2025
Associated Universities, Inc. (AUI)
Workshop sponsored by: All-domain Anomaly Resolution Office (AARO)

Table of Contents

Executive Summary
Introduction and Purpose
About the Workshop
    Establishing open dialogue
Workshop Summary
    Agenda overview
    Breakout Discussion Summaries
        Breakout Session #1: Identifying, accessing, and integrating data sources [DAY 1]
        Breakout Session #2: Pathways for data analysis and interpretation at scale [DAY 1]
        Breakout Session #3: Cleaning, organizing, and linking data: What can and should be done? [DAY 2]
Outcomes and Recommendations
    Synthesis of Findings
        Relevant data types and sources of UAP narrative reports
        Barriers and challenges in data collection and use
        Metadata and context for usability and analysis
        Linking data sources and developing a unified approach
        Assessing credibility and quality of reports
        AI and analytical methods
        Forward-looking strategy and key considerations
    Recommended actionable next steps
Appendix A: Invitation Letter
Appendix B: Guidelines for Conduct
Appendix C: Workshop Agenda
Appendix D: Breakout Session Prompts

Executive Summary

From both government and scientific perspectives, advancing Unidentified Anomalous Phenomena (UAP) research requires rigorous data collection, standardization, and analysis. Most UAP reports are fragmented, sparse, and unstructured, ranging from military logs and pilot reports to archival records, social media posts, and civilian testimony. Interpreting this heterogeneous data at scale is complicated by barriers of classification, translation, and retention.
At the same time, UAP reports also present opportunities for novel methods of integration, metadata design, and analysis.

The 2025 UAP Workshop on Narrative Data, Infrastructures, and Analysis brought together 40 participants from government, academia, and independent research organizations. The meeting focused specifically on the challenges and opportunities of working with UAP narrative reports and related data sources.

Workshop discussions highlighted several cross-cutting findings. First, effective progress requires clear standards and common reporting templates, with robust metadata capturing time, location, provenance, morphology, and contextual details. Second, linking across datasets (military and civilian, including archival, environmental, and technical) must balance interoperability with privacy, ethical, and classification constraints. Third, credibility is best assessed through corroboration, but for efficiency there is a need for automated methods to filter reports and surface the most promising for investigation. Fourth, AI and machine learning tools offer capacity for transcription, triage, clustering, and semantic search, but they must be deployed cautiously to avoid hallucination, bias, and amplification of hoaxes. Human oversight and iterative workflows remain essential. Finally, the workshop underscored the importance of community engagement and trust-building, encouraging the scientific community to cultivate a sustainable “community of practice” for UAP research with further work and convenings.

This report concludes with recommended actionable next steps to establish metadata templates; combine human expertise with AI tools; leverage existing tools and infrastructures; support triage with awareness of bias; convene community members; facilitate qualitative integration in investigation, such as interviews; prioritize collection of new high-quality reports while integrating historical data; and improve reporting interfaces to enhance accessibility, collaboration, and transparency. Together, these findings and recommendations point toward a multi-disciplinary and community-engaged approach to UAP narrative data, which may influence how and where technical sensors are deployed.

Introduction and Purpose

Understanding the nature of Unidentified Anomalous Phenomena (UAP) has emerged in recent years as a pressing area of inquiry in need of rigorous scientific approaches, as well as cross-disciplinary, cross-sector, and international collaboration. Analyzing reports of UAP-related sightings and experiences presents unique challenges due to the large-scale, heterogeneous, and qualitative nature of the reports originating from military and civilian sources. These reports typically lack standardized metadata, making comparative analysis difficult. Additionally, the integration of UAP reports from disparate sources (such as military databases, online reporting systems, digital and digitized archival records, and social media) poses significant challenges for harmonization and verification of data and construction of evidence.
The complexity of these datasets requires innovative data infrastructure solutions to enhance reliability, accessibility, and interoperability. The workshop explored these challenges and sought strategies to improve UAP data standardization, integration, and analytical approaches.

Recent advances in artificial intelligence (AI) and machine learning present both opportunities to address these challenges and potential hazards. Tools such as Large Language Models (LLMs) can assist with transcription, clustering, and pattern detection at scale, but they risk introducing bias and hallucination. Responsible use of AI to help organize, analyze, and integrate UAP reports at scale requires evaluation, human oversight, and shared frameworks for interpretation, alongside new models to ensure transparency and trust across diverse research communities. Therefore, the overall purpose of the workshop was to gather perspectives from the broader scientific community and advance the science of UAP.

About the Workshop

The workshop centered on the collection, organization, and interpretation of UAP reports, with attention to the challenges and opportunities of working with narrative data.
The primary objectives established for the workshop were to:
• Assess the current landscape of UAP reporting systems and data repositories;
• Identify key challenges and gaps in UAP data collection, standardization, and accessibility;
• Explore methodologies for data analysis and pattern recognition in UAP reports;
• Nurture trust and collaboration between researchers, government agencies, and civilian organizations; and
• Propose recommendations for developing a robust UAP data infrastructure.

Outside participation was limited due to budget constraints and institutional capacity. Potential participants were identified based on demonstrated expertise in one or more of the following areas: AI and machine learning; UAP research and data; physical and natural sciences; information and data science; archives and records; analysis methods; cyberinfrastructure and computation; and human and social sciences. If an invitee declined to attend, we extended an invitation to another candidate with similar skills or experience identified through online research and word of mouth. The final workshop included 40 participants.

Establishing open dialogue

Participant privacy was an important consideration throughout workshop planning, and Institutional Review Board (IRB) approval governed data collection and security for the workshop. The organizing committee further wished to establish a neutral environment in which participants holding diverse beliefs and backgrounds would feel comfortable engaging. It was important that attendees felt comfortable sharing their thoughts and ideas without being concerned about what others might say or do. The planning committee also decided not to publicize the workshop online beforehand, to limit outside attention and encourage comfort and open discourse among an intimate group of participants. Participants were urged to avoid taking photos or attributing statements to individuals without permission.

The organizers made efforts to accommodate privacy concerns after they identified a final list of attendees. This included:
● Name tag options: individuals could simply list their first name with no institutional affiliation;
● Individuals could choose to remove themselves from some sessions or conversations if they felt uncomfortable engaging in various topics;
● Photographing other attendees was not permitted unless an attendee received consent from all individuals who appeared in a photo; and
● Respect for all and approaching conversations with an open mind was a requirement for participation. If an individual did not feel this was possible, they were asked to not attend.

See the email communication sent to all attendees in Appendix B: Guidelines for Conduct.

Workshop Summary

Agenda overview

The event began with a casual, pre-workshop networking social on the evening of August 4, 2025. The organizers provided welcome and opening remarks on the morning of August 5, 2025. Brief participant introductions followed these remarks. A keynote address about the importance of good UAP data primed participants for the first breakout session (“Identifying, accessing, and integrating data sources”), held before breaking for lunch. The afternoon of August 5, 2025 began with a plenary talk, followed by the first panel discussion, “Opportunities and challenges with AI”, and a second breakout session (“Pathways for data analysis and interpretation at scale”). Day 1 concluded with a brief whole group discussion.
A workshop dinner was held at a restaurant near the workshop venue.

Day 2 began with a second plenary talk and second panel discussion, “Harmonizing qualitative and quantitative perspectives on narrative data.” After lunch, participants delivered a series of lightning talks ahead of the final breakout session (“Cleaning, organizing and linking data: What can and should be done?”).

Throughout the event, the organizing team collected notes that were later transcribed and anonymized. For each breakout session, moderators collected records, and notetakers were assigned to further ensure a robust record of the workshop proceedings.

Breakout Discussion Summaries

Prompts for each breakout session are included in Appendix D: Breakout Session Prompts.

Breakout Session #1: Identifying, accessing, and integrating data sources [DAY 1]

The first breakout session addressed central challenges of UAP research. Discussions revealed the scope of the UAP data landscape as a patchwork of historical case files, contemporary narrative reports, sensor-based data (radar, imagery, flight data), and environmental or contextual datasets (weather, astronomical, seismological). Participants expressed enthusiasm for the potential to link these disparate sources, but they also acknowledged the barriers posed by inconsistency in metadata, classification restrictions, missing or inaccessible records, and stigma around UAP reporting.
Despite these challenges, groups converged on the outlook that with clear standards, prototype integration projects, and intentional collaboration across organizations, it is possible to create interoperable and sharable datasets that would enable more rigorous and scalable analysis of UAP reports.

Breakout Session #2: Pathways for data analysis and interpretation at scale [DAY 1]

The second breakout session explored methods and limitations for analyzing UAP narrative data. Across groups, participants grappled with the tension between extracting operationally useful signals and respecting the experiential, cultural, and historical richness embedded in reports. Overall, groups agreed that UAP narratives cannot be reduced to a single analytic approach. Corpus-level methods (time/space clustering, keyword trends, statistical correlation, graph analysis) are useful for pattern detection and hypothesis generation, while narrative/experiential methods (phenomenology, discourse analysis) are useful for preserving meaning, cultural context, and witness voices. Infrastructures should allow these modes to coexist.

Breakout Session #3: Cleaning, organizing, and linking data: What can and should be done? [DAY 2]

The third and final breakout activity analyzed the structure of a hypothetical online reporting form that had collected 1,000 UAP reports, stored as PDF files, to identify possibilities for analyzing the collected data and for improving the form. The discussion led to the following overarching suggestions that are broadly informative for online UAP reporting tools.

1. Intake flow and structure:
• Begin with a free-text box (and optional audio upload) where the witness provides their account in their own words. Use AI-assisted extraction to propose structured fields, which the witness can then confirm or correct.
• Frame questions around what was perceived (angular size, shape, movement, sound, effects) rather than presumed properties (exact distance, solid object dimensions).

2. Additions to the form:
• Ask witnesses to explain how they estimated size, distance, or speed (i.e., context prompts).
• Capture whether this has happened before and, if so, how often.
• Instead of “mass sighting: yes/no”, include approximate numbers of witnesses.
• Include a field for whether the object seemed to react to observer presence.
• Add examples of technological effects (e.g., radio static, car failure) and basic prompts about feelings or aftereffects that could be informative (e.g., “Did you discuss this with others? Would you want professional/peer support?”).
• Automatically ingest and display photo metadata (camera model, timestamp, location), giving users the option to redact sensitive fields.

3. Standardization and cleaning:
• Accept location information including city/address/zip/lat–long, with simple guidance and drop-downs, and normalize on the back end.
• Enforce a single-entry format for dates and times (calendar widget or drop-downs).
• Allow multiple inputs for units (imperial/metric) but convert and store consistently.
• Include structured numeric fields for object count and multiple objects, with adaptive follow-up to describe each object separately.

4. Taxonomical considerations:
• Provide a concise taxonomy of common shapes (disk, sphere, triangle, cigar, “other”) but allow free-text for unusual forms.
• Update descriptive references for cultural familiarity (using objects such as coins or a debit card to estimate size) and internationalize/translate forms for broader accessibility.

5. Integration and linkage:
• Include a field to indicate whether the event was reported elsewhere (NUFORC, MUFON, FAA, etc.).
• Design the schema so reports can be linked to FAA/NASA Aviation Safety Reporting System (ASRS) data, Automatic Dependent Surveillance-Broadcast (ADS-B) flight tracks, weather radar, astronomical databases, fireball networks, etc.
• Enable dynamic follow-ups for multiple objects, multiple witnesses, or sequential events.

6. Governance and trust:
• Give reporters clear control over what information (such as geolocation, photo metadata) is shared publicly.
• Commit to aggregated, de-identified data releases (maps, trend summaries) to build trust without encouraging hoaxes.
• Light-touch well-being questions were suggested, to help identify if respondents would like professional or peer follow-up without stepping into clinical assessment.

Outcomes and Recommendations

Synthesis of Findings

Relevant data types and sources of UAP narrative reports

Participants emphasized that UAP research requires drawing on a diverse ecosystem of data, extending beyond witness testimony. Primary narrative reports in formats ranging from PDFs and CSVs to emails and oral histories remain central, offering firsthand accounts that, when digitized and transcribed as needed, can be structured for analysis. These reports are complemented by smartphone photos and videos, which are widely available but often of poor quality, though improving over time.
Government sources include both classified and unclassified records, including finished intelligence and historic documents. Military reports and ship logs are particularly robust, providing structured information on platforms, flight plans, and pilots, while the FAA continues to collect pilot reports. Other data streams include social media posts, which are often multimodal (such as online and social media videos); international partner databases; and structured technical or scientific sensor data, such as radar or spectrum analyses. Supplementary contextual data is also critical, including flight and weather records, seismological data, and satellite imagery; even doorbell videos or CCTV systems can corroborate sightings.

Barriers and challenges in data collection and use

Despite many potential sources of data, significant obstacles remain. Access to social media data has become more restricted due to corporate licensing policies, while ethical and jurisdictional considerations complicate usage. Classification remains a dominant barrier, as substantial UAP data may be captured on classified sensors, automatically rendering it inaccessible until declassified. Other challenges include language and translation barriers, with both human and automated systems prone to errors, especially in low-resource languages. Stigma in reporting, particularly among pilots, undermines data timeliness and completeness, while the lack of standardized reporting formats across agencies and organizations further fragments the landscape.
Time sensitivity and weak retention policies have led to the loss of critical records, as in the well-known Nimitz case. Technical issues are also substantial. Older data can be difficult to digitize, cursive writing resists Optical Character Recognition (OCR) systems, and crowdsourced transcription projects suffer from low-quality outputs, recently worsened by misuse of generative AI. Finally, the field must grapple with fake data and disinformation, including AI-generated photos or videos, which pose risks for both public trust and analytic integrity.

Metadata and context for usability and analysis

Effective use of UAP data requires rich contextual metadata. Every report should ideally contain time, date, and location, preferably with geospatial precision. Distinguishing between descriptive metadata (objective characteristics like morphology or frequency band) and interpretive metadata (subjective effects or experiential meaning) is key. Metadata should also capture event-specific details, such as behaviors, sensor positions, and witness background, and must extend to technical parameters for structured data. Provenance (the chain of custody and source of the data) is essential for ensuring interpretability and trust. For visual evidence, metadata such as device type and embedded geotags allow validation against reported facts. Participants also emphasized flexible and well-designed reporting forms, for example including “refuse to answer” options to prevent fabricated entries when respondents lack knowledge.
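
The metadata distinctions described above can be illustrated with a minimal Python sketch. It is not a schema proposed by the workshop: the class names, field names, and the "online_form" source label are all hypothetical, chosen only to show descriptive and interpretive fields kept apart, provenance carried with the record, timestamps normalized to UTC, and an explicit "refuse to answer" value that stays distinct from a missing one.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Union


class Declined(Enum):
    """Explicit 'refuse to answer' marker, distinct from a value that is simply missing."""
    REFUSED = "refuse_to_answer"


@dataclass
class Provenance:
    source: str                 # hypothetical label, e.g. "online_form"
    received_at: datetime       # when the report entered the collection
    custody: list[str] = field(default_factory=list)  # chain of custody, oldest holder first


@dataclass
class Report:
    observed_at: datetime       # always stored in UTC
    latitude: float             # decimal degrees
    longitude: float            # decimal degrees
    provenance: Provenance
    # Descriptive metadata: characteristics as perceived.
    shape: Union[str, Declined, None] = None
    angular_size_deg: Union[float, Declined, None] = None
    # Interpretive metadata: subjective effects or meaning, kept in its own field.
    witness_interpretation: Union[str, Declined, None] = None


def to_utc(local: datetime) -> datetime:
    """Normalize a timezone-aware timestamp to UTC and reject naive ones."""
    if local.tzinfo is None:
        raise ValueError("timestamp must carry a timezone")
    return local.astimezone(timezone.utc)


# Usage: a witness reports 9:00 local time (UTC-4) and declines to name a shape.
seen = to_utc(datetime(2025, 8, 5, 9, 0, tzinfo=timezone(timedelta(hours=-4))))
report = Report(
    observed_at=seen,
    latitude=38.9,
    longitude=-77.0,
    provenance=Provenance(source="online_form", received_at=seen),
    shape=Declined.REFUSED,
)
```

Keeping `Declined.REFUSED` separate from `None` lets an analyst tell "the witness chose not to answer" apart from "the question was never asked", which is the distinction the “refuse to answer” suggestion is meant to preserve.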

Linking data sources and developing a unified approach

Given the fragmented nature of UAP data, participants argued for modest, pilot-scale integration projects as a starting point. Establishing common terminology and data dictionaries is important to harmonize datasets across agencies and disciplines. Modular and extensible metadata standards could lead toward a composable ecosystem, potentially implemented through standardized templates, Interface Control Documents (ICDs), or APIs. Some form of established governance is needed to facilitate data management, access, and engagement for researchers while alleviating inter-agency silos. Transparency was highlighted as both a goal and a challenge, as unclassified data should be made available to academia, while sensitive material must remain protected. Lessons from other fields, such as genetics and astronomy, were cited as models for developing interoperable metadata standards and ontology-driven approaches.

Assessing credibility and quality of reports

Participants highlighted the importance of sensor reliability, noting that human perception is fallible. Establishing gold standard exemplars of high-quality reports could help guide future collection and analysis. Semi-automated triage, assisted by AI, offers promise for sifting through massive datasets to identify cases with likely conventional explanations as well as cases of potential interest, though human oversight remains indispensable. Furthermore, credibility is enhanced when reports are corroborated by multiple witnesses or independent data streams, such as radar or weather records. Interviews and psychological screening of witnesses were offered as an example of how to assess motivations and reduce false reports, though it was acknowledged that this is difficult to implement at scale. At the same time, biases in favor of certain professions (pilots, police), attributed to enhanced observational training and skills, must be acknowledged.
A phenomenological approach (qualitative analysis of indicators of lived experiences) allowing patterns to emerge from narrative accounts was recommended as a complement to quantitative methods, ensuring that unusual but meaningful details are not prematurely excluded.

AI and analytical methods

AI offers opportunities for pattern recognition, hypothesis generation, and efficiency gains in large-scale text and multimodal data analysis. Techniques such as semantic search, clustering, and multimodal modeling (for example, combining acoustic and infrared signals) can help identify anomalies. AI is also valuable for routine tasks, such as extracting dates or locations from unstructured text, or triaging likely misidentifications.

However, there are risks associated with AI. Hallucination (the generation of convincing but false conclusions) remains a core concern. AI analysis is only as reliable as the quality of its input, underscoring the “garbage in, garbage out” principle. Additionally, LLMs are already biased by UFO-related cultural content, potentially skewing analyses. Small datasets limit the potential for model training, though pre-trained models may still be repurposed.

Best practices involve an iterative human-AI collaboration, where algorithms provide preliminary analysis that is verified, corrected, and enriched by human researchers. Ensemble approaches, leveraging multiple models, may reduce error rates.
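
As a rough illustration of the ensemble idea, the Python sketch below takes a majority vote across several classifiers and flags reports on which they disagree. Everything in it is an assumption made for illustration: the label names, the 0.75 agreement threshold, and the three keyword rules, which stand in for real models. The flag only sets review priority; every label would still go to a human for verification.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

Classifier = Callable[[str], str]  # maps report text to a label


def ensemble_triage(text: str, models: Sequence[Classifier],
                    min_agreement: float = 0.75) -> Tuple[str, bool]:
    """Return (majority_label, needs_priority_review) for one report."""
    if not models:
        raise ValueError("at least one model is required")
    votes = Counter(model(text) for model in models)
    label, count = votes.most_common(1)[0]
    # Low agreement among models means a human should look at this report first.
    return label, count / len(models) < min_agreement


# Toy stand-ins for real models (keyword rules, purely illustrative).
def mentions_satellite(text: str) -> str:
    return "likely_conventional" if "satellite" in text.lower() else "unresolved"


def mentions_starlink(text: str) -> str:
    t = text.lower()
    return "likely_conventional" if "starlink" in t or "satellite" in t else "unresolved"


def mentions_light_train(text: str) -> str:
    t = text.lower()
    return "likely_conventional" if "line" in t and "lights" in t else "unresolved"


MODELS = [mentions_satellite, mentions_starlink, mentions_light_train]

unanimous = ensemble_triage("A line of lights, probably Starlink satellites", MODELS)
split = ensemble_triage("Silent triangle hovered, then a satellite passed", MODELS)
```

Here `unanimous` carries a label all three stand-in models agree on, while `split` carries a two-to-one majority and is therefore flagged for earlier human attention.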

Overall, tasks must be carefully defined to align AI methods with research goals, ensuring a balance between qualitative depth and quantitative rigor.

Forward-looking strategy and key considerations

The group emphasized the need for a forward-looking research infrastructure that integrates proactive data collection, robust metadata standards, and interdisciplinary collaboration. Some argued for focusing on new, higher-quality data collection while others urged continued investment in historical data to preserve its potential value. Future infrastructure priorities include a unified security solution for managing classified and unclassified data, improved questionnaire design for witness reports, and benchmarking systems to track analytic performance over time. Importantly, even “low quality” or stigmatized reports should not be discarded but made available for diverse lines of inquiry an

