Interested in making representation learning and generative models work for structured data (e.g., tables in relational databases) to automatically retrieve relevant insights from data? Then this 4-year PhD starting Winter 2024/2025 is for you!
Goal of the DataLibra project
Approximately 120 zettabytes of data have been collected worldwide, but less than 1% is actually used. Structured data (relational tables and spreadsheets) is prevalent in organizations and typically informs important decisions in, e.g., healthcare, government, and finance. Yet, while AI has demonstrated a high impact on applications involving text and images, comparable progress on structured data is lacking. With the DataLibra project, we aim to close this gap by developing AI models and tools for structured data (Table Representation Learning) to help organizations of any size, domain, and level of data literacy get insights from structured data efficiently, accurately, and securely.
Goal of this PhD project
Despite the abundance of structured data, many important insights from this kind of data go unnoticed because they are difficult to surface or because organizations lack the expertise to extract them. This project will explore emerging paradigms such as table representation learning (TRL), retrieval-augmented generation, generative retrieval, agentic systems, and conversational user interfaces to automatically surface relevant insights from relational databases and data lakes in a robust and efficient manner. This PhD project is supervised by Dr. Madelon Hulsebos (CWI) and Prof. dr. Maarten de Rijke (UvA).
What you will be doing
- Develop and execute a 4-year research agenda around insight retrieval from structured data.
- Actively collaborate with other researchers in the DataLibra project (students, PhDs, postdocs, PIs) and with external collaborators.
- Communicate research outcomes through papers and presentations at conferences, workshops and other (scientific) gatherings.
- Assist in relevant teaching activities at universities, such as supervising theses and assisting in courses.