CNR-IIA Thesis Proposals
AI-Enabled Natural Language Access to Environmental Data through the Discovery and Access Broker (DAB)
The thesis will explore the integration of Artificial Intelligence agents and Large Language Models (LLMs) with the Discovery and Access Broker (DAB) to enable natural language interaction with distributed environmental and geospatial datasets. The work will focus on extending DAB with a Model Context Protocol (MCP) service interface, allowing AI agents to discover, access, and combine heterogeneous data sources through conversational queries.
The proposed system will support agentic workflows for environmental data analysis, including tasks such as retrieving precipitation observations, computing seasonal and annual trends, and identifying extreme rainfall events using statistical indicators. The implementation will involve technologies such as Java, Python, APIs, AI agents, and geospatial interoperability standards. The expected outcome is a prototype AI-enabled interface that demonstrates intelligent and user-friendly access to environmental information systems.
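As an illustration of the kind of analysis step such a workflow might run once precipitation observations have been retrieved, the following minimal Python sketch computes annual totals, a least-squares trend, and threshold-based extreme days. The function names, the input layout (ISO date strings mapped to millimetres), and the fixed threshold are assumptions made for this example; they are not part of DAB or of any MCP interface.

```python
from statistics import mean


def annual_totals(daily):
    """Sum daily precipitation (mm) per calendar year.

    `daily` is a hypothetical input layout: ISO date strings
    ('YYYY-MM-DD') mapped to millimetres of rain.
    """
    totals = {}
    for day, mm in daily.items():
        year = int(day[:4])
        totals[year] = totals.get(year, 0.0) + mm
    return dict(sorted(totals.items()))


def linear_trend(totals):
    """Ordinary least-squares slope of annual totals, in mm per year."""
    years = list(totals)
    values = [totals[y] for y in years]
    y_bar, v_bar = mean(years), mean(values)
    num = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    den = sum((y - y_bar) ** 2 for y in years)
    return num / den


def extreme_days(daily, threshold_mm):
    """Dates whose precipitation exceeds a fixed threshold (an assumed indicator)."""
    return sorted(d for d, mm in daily.items() if mm > threshold_mm)


# Tiny synthetic series, for illustration only.
observations = {
    "2020-01-01": 10.0, "2020-06-01": 20.0,
    "2021-01-01": 15.0, "2021-06-01": 25.0,
    "2022-03-01": 60.0,
}
totals = annual_totals(observations)
slope = linear_trend(totals)
extremes = extreme_days(observations, threshold_mm=50.0)
```

In a real prototype the threshold would more likely be a percentile or a return-period indicator chosen with domain experts; a fixed value is used here only to keep the sketch short.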
AI-Assisted Backend Components for the Discovery and Access Broker (DAB): Metadata Harmonization and Automated Data Integration
The thesis will investigate the application of Artificial Intelligence agents and Large Language Models (LLMs) to enhance backend components of the Discovery and Access Broker (DAB), with particular focus on automated data integration and metadata harmonization. The activity will explore how AI techniques can support the development of DAB accessors and profilers, enabling faster integration of heterogeneous remote services and automated mapping of external APIs and data formats to the DAB internal information model.
A second research area will address the use of LLMs for metadata augmentation, normalization, and semantic alignment with ontologies and controlled vocabularies, improving discoverability, interoperability, and semantic consistency of environmental datasets. The implementation will involve technologies such as Java, Python, AI agents, APIs, and semantic web technologies. Expected outcomes include prototype AI-enhanced backend components and an evaluation of the potential of AI-driven approaches for System-of-Systems environmental data infrastructures.
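To make the normalization and alignment task concrete, the sketch below maps free-text metadata keywords to concepts in a controlled vocabulary. It uses simple lexical matching from the Python standard library as a stand-in baseline, not an LLM, and the vocabulary entries, the `ex:` identifiers, and the function names are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical controlled vocabulary: preferred label -> concept identifier.
VOCABULARY = {
    "precipitation": "ex:Precipitation",
    "air temperature": "ex:AirTemperature",
    "river discharge": "ex:RiverDischarge",
}


def normalize(keyword):
    """Lower-case the keyword and unify separators and whitespace."""
    cleaned = keyword.lower().replace("_", " ").replace("-", " ")
    return " ".join(cleaned.split())


def align(keyword, cutoff=0.8):
    """Return the closest vocabulary concept, or None if nothing is similar enough."""
    label = normalize(keyword)
    matches = get_close_matches(label, list(VOCABULARY), n=1, cutoff=cutoff)
    return VOCABULARY[matches[0]] if matches else None
```

A baseline of this kind could also serve as a reference point when evaluating an LLM-based approach, which would be expected to handle synonyms and multilingual labels that lexical matching misses.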
AI-Based Data Analysis and Knowledge Extraction from Environmental Data Integrated through the Discovery and Access Broker (DAB)
The thesis will investigate the use of Artificial Intelligence and Machine Learning techniques to extract knowledge and added value from environmental datasets integrated through the Discovery and Access Broker (DAB). The work will focus on the development of AI-driven methods for data quality assessment, intelligent data fusion, forecasting, and/or time series analysis within distributed environmental data infrastructures.
The proposed activities include analyzing hydrological and meteorological time series to detect anomalies, gaps, trends, seasonality, and extreme events such as floods and droughts, using state-of-the-art techniques such as geospatial foundation models. The thesis will also explore the integration of in situ observations with satellite data to improve forecasting capabilities and support early warning. Particular attention will be given to explainable AI approaches that provide interpretable outputs for domain experts and decision-making processes.
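For orientation, two of the quality-control checks mentioned above, gap detection and anomaly detection, can be sketched with simple statistics. This is a classical baseline under assumed inputs (a daily series and a median-based modified z-score with the commonly used 3.5 cutoff), not the foundation-model techniques the thesis would investigate; all names are illustrative.

```python
from datetime import date, timedelta
from statistics import median


def find_gaps(days):
    """Return (first_missing, last_missing) date ranges in a daily series."""
    ordered = sorted(days)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if (nxt - prev).days > 1:
            gaps.append((prev + timedelta(days=1), nxt - timedelta(days=1)))
    return gaps


def find_anomalies(values, threshold=3.5):
    """Indices whose modified z-score (based on median and MAD) exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations collapse to zero: the score is undefined, flag nothing.
        return []
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Synthetic examples, for illustration only.
gaps = find_gaps([date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5)])
anomalies = find_anomalies([10.0, 11.0, 10.5, 9.5, 10.2, 50.0])
```

Because both checks return plain dates and indices, their output is straightforward to present to a domain expert, which is in line with the interpretability goal stated above.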
Technologies involved may include Python, Machine Learning frameworks, AI models, geospatial data services, and environmental data standards. Expected outcomes include prototype tools for AI-based quality control and forecasting, together with a demonstration of the role of DAB as an analytics-ready infrastructure supporting next-generation data-to-knowledge workflows in initiatives such as the World Meteorological Organization Hydrological Observing System (WHOS) and the Global Earth Observation System of Systems (GEOSS).