Welcome to Prof. Dr. Marco Spruit's academic website. This dashboard highlights various aspects of my current work activities. Navigate to the appropriate pages in the 'Blue MenuBar' (above) for many more details on my work as Full Professor of Advanced Data Science in Population Health at Leiden University's Medical Center (PHEG) and its Faculty of Science (LIACS). Other pages that I actively maintain are my Google Scholar and ORCID profile pages. Feel free to connect with me on LinkedIn.

Active Grants & Collaborations & Students

Postdocs & PhD candidates (12)


The Translational Data Science & AI Lab: https://tdslab.nl

Research grants (24)

2025-2029: UNCAN-Connect, 475K EUR (LIACS).
Decentralized Collaborative Network for Advancing Cancer Research and Innovation. HORIZON-RIA: HORIZON-MISS-2024-CANCER-01-01 (Research and Innovation actions supporting the implementation of the Mission on Cancer). Remarks: grant total: 30M EUR, 53 partners from 19 European and associated countries, comprising 6 SMEs, 3 LEs, 42 RTOs, 3 affiliated partners, and 1 NGO. Researcher(s): 1 PhD student, 1 postdoc. CfP
2025: Understanding Homicide in Indonesia
Leiden University Global Fund (LUGF) Seed Fund. Harnessing Traditional and New Media Data for Insight. Collaboration with Universitas Gadjah Mada in Yogyakarta, Indonesia. Applicants: O. Bogolyubova, M. Liem (ISGA), M. Spruit. /leiden-university-global-fund/
2024-2025: Phaeton, EUR 150K (LUMC) + EUR 50K (LIACS).
Pandemic preparedness. Portable platform as a service for crowdsourced and privacy respecting data analysis and modeling. Financer: ZonMW Modelleren voor Pandemische Paraatheid: een oproep tot innovatie en kennisontwikkeling SA 2023. Applicant(s): Bouwman,J., Haas,M., Spruit,M. Remark: ZonMW dossier #10710062310030, grant total: 500K EUR. Researcher(s): Vinkenoog,M.
2024-2026: ECOTIP, EUR 130K (LUMC).
Identifying tipping points of the effects of living environments on ecosyndemics of lifestyle-related illnesses by ML/NLP modelling of a patient segmentation model based on EHR and environmental data. Financer(s): NWO New Science Agenda (NWA-ORC). Applicant(s): Kiefte,J., Spruit,M., Vos,R., et al. Remark: NWO dossier NWA.1518.22.151; grant total: 4.4M EUR. Researcher(s): Muizelaar,H. www.nwo.nl/en/projects/nwa151822151
2023-2026: INSAFEDARE, EUR 571K (LUMC).
Innovative applications of assessment and assurance of data and synthetic data for regulatory decision support. Generation and evaluation of a benchmarking synthetic dataset amenable to the regulatory process, analytical methods for validation of digital health applications, and components for data integration pipelines. Financer(s): Horizon Europe: HORIZON-HLTH-2022-TOOL-11-02: Tools and technologies for a healthy society. Applicant(s): Despotou,G. et al. Remark: HEU project #101095661; grant total: 4.8M EUR. Researcher(s): Achterberg,J. & Dijk,B. van 10.3030/101095661
2024: EuroQoL-LLM, 1325 EUR (LUMC).
Applying Large Language Models to Identify EQ-5D Bolt-ons Based on Patient Text Data. Financer: EuroQol Group Seed grant: 1792-SG. Applicant: van den Akker-van Marle,E., Spruit,M., et al. Remark: Grant total: 42K EUR. Researcher(s): Heijdra Suasnabar,J. et al. euroqol.org/research-at-euroqol/our-research-portfolio/funded-projects/
2023-2024: HealthBox, EUR 66,000 (LUMC).
A personalized, home-based eHealth intervention to treat metabolic syndrome and prevent its complications by ML/NLP modelling of a patient segmentation model based on EHR and environmental data. Applicant(s): Chavannes,N., Atsma,D., Pijl,H., Vos,R., et al. Remark: grant total: 2.5M EUR. Researcher(s): Muizelaar,H. www.nwo.nl/en/projects/kich1gz0321007
2021-2024: VIPP, EUR 60K (LUMC).
Virtual Patients and Population Dataset. Develop a synthetic ELAN dataset to improve teaching data science. Financer(s): LUMC Interprofessional Education (IPE) programme. Applicant(s): Spruit,M., & Szuhai,K. Remark: Project Raamplan Implementatie Artsopleiding (PRIMA) 2020 working group deliverable wrt Theme 5 on Big Data and AI. Researcher(s): Faiq,A. healthcampusdenhaag.nl/nl/project/virtuele-patient-en-populatie-vipp-dataset/

Research collaborations (16)

2022-2026: PreProMMF (ULEI)
Natural Language Processing in Mental Health: Detection, Prediction and Promotion with Multilingual, Multimodal and Federated Techniques. Sponsor: Arab Academy of Science, Technology & Maritime Transport (AAST). Financed as a 60% lecturer - 40% researcher contract. Researcher(s): Khalil,S.
2021-2025: Data2Bedside (LUMC)
Reusing routinely collected data from regional GP offices in ELAN to create a clinical decision support tool to identify disease progression risk levels in Type Two Diabetes Mellitus (T2DM) patients. Sponsor: Kingdom of Saudi Arabia scholarship. Researcher(s): Alfaraj,S.
2021-2026: PHA (LUMC)
Population Health Analytics. Maturity modelling for situational data infrastructure and scenario planning towards appropriate regional intelligence. Sponsor: Q-Consult Zorg. Researcher(s): Roorda,E.
2018-2024: PbD (UU)
Privacy-by-Design. How organisations can demonstrate responsible data use in information systems through Privacy-by-Design. Sponsor: P&O Rijk. Researcher(s): Dijk,F. van

Research theme

MSc students (103)

  1. Thiel,Haike van (In progress). Personalised and realistic training scenarios with artificial patients using AI models in Trauma Care. Civil-Military Centre of Expertise for Trauma Care (CETC).
  2. Rivetti,Giulia (In progress). Translation-Based Fine-Tuning of English BERT Models for Enhanced Performance in Minority Language NLP Tasks (LUMC). Daily supervisors: Hielke Muizelaar, Marcel Haas (LUMC).
  3. Schinkelshoek,Laurens (In progress). Machine learning for surgical departments. Spruit,Marco; van Nieuwenburg,Evert (LION/LIACS).
  4. Nguyen,Van (Committed). From mobile app to furry social robot: Welzijn.AI (LUMC). Daily supervisor: Bram van Dijk (LUMC).
  5. Koning,Michael de (Committed). Portable platform-as-a-service for crowdsourced and privacy respecting data analysis and modeling in pandemic response: PHAETON (LUMC/TNO). Daily supervisor: Marcel Haas (LUMC).
  6. Meng,Maggie (Committed). An evaluation of data analysis techniques in digital health applications. Daily supervisor: Jim Achterberg (LUMC).
  7. Mian,Belal (Committed). LLMs in the analysis of interviews with older people about goals of care: a pilot study. Daily supervisors: Bram van Dijk, prof. Simon Mooijaart (LUMC).
  8. de Koning,Irene (Committed). A virtual peers method for healthcare institution performance. Daily supervisor: CZ.

BSc students (62)

  1. Sanz Lozano, Rebeca (Orienting). Brain, cognition and wellbeing. Spruit,Marco (UL).
  2. Leito, Roderick (08/10/2024). Integration of the EQ5D PROM questionnaire into a natural and unobtrusive conversation using a RASA-driven chatbot. Spruit,Marco & Lefebvre,Armel (UL). [7.5]
  3. Baghdasaryan, Ruzanna (27/08/2024). Questionnaire-driven Dialogue: Utilizing Large Language Models for Hallucination-free Conversational AI in Elderly Well-being Monitoring. Spruit,Marco & Lefebvre,Armel (UL). [8.5]

Recent Publications & Talks & Committees

Journal articles (114)

  1. Alfaraj,S., Kist,J., Groenwold,R., Spruit,M., Mook-Kanamori,D., & Vos,R. (2024). External validation of SCORE2-Diabetes in the Netherlands across various Socioeconomic levels in native-Dutch and non-Dutch populations. European Journal of Preventive Cardiology, zwae354. 10.1093/eurjpc/zwae354
  2. Roorda,E., Bruijnzeels,M., Struijs,J., & Spruit,M. (2024). Business Intelligence Systems for Population Health Management: A Scoping Review. JAMIA Open, 7(4), ooae122. 10.1093/jamiaopen/ooae122
  3. Drougkas,G., Bakker,E., & Spruit,M. (2024). Multimodal Machine Learning for Language and Speech Markers Identification in Mental Health. BMC Medical Informatics and Decision Making, 24, 354. 10.1186/s12911-024-02772-0
  4. Álvarez-Chaves,H., Spruit,M., & R-Moreno,M. (2024). Improving ED admissions forecasting by using generative AI: An approach based on DGAN. Computer Methods and Programs in Biomedicine, 256, 108363. 10.1016/j.cmpb.2024.108363
  5. Achterberg,J., Haas,M., & Spruit,M. (2024). On the evaluation of synthetic longitudinal electronic health records. BMC Medical Research Methodology, 24, 181. 10.1186/s12874-024-02304-4
  6. Haastrecht,M. van, Haas,M., Brinkhuis,M., & Spruit,M. (2024). Understanding Validity Criteria in Technology-Enhanced Learning: A Systematic Literature Review. Computers & Education, 220, 105128. 10.1016/j.compedu.2024.105128
  7. Rijcken,E., Zervanou,K., Mosteiro,P., Scheepers,F., Spruit,M., & Kaymak,U. (2024). Topic Specificity: a Descriptive Metric for Algorithm Selection and Finding the Right Number of Topics. Natural Language Processing Journal, 8, 100082. 10.1016/j.nlp.2024.100082
  8. Muizelaar,H., Haas,M., van Dortmont,K., van der Putten,P., & Spruit,M. (2024). Extracting Patient Lifestyle Characteristics from Dutch Clinical Text with BERT Models. BMC Medical Informatics and Decision Making, 24, 151. 10.1186/s12911-024-02557-5
  9. Khalil,S., Tawfik,N., & Spruit,M. (2024). Federated learning for privacy-preserving depression detection with multilingual language models in social media posts. Patterns, 5, 100990. 10.1016/j.patter.2024.100990

Wordcloud

gScholar statistics

            All    Since 2019
Citations   4833   3497
h-index     37     30
i10-index   96     77

Conference proceedings (90)

  1. Van Dijk,B., Ul Islam,S., Achterberg,J., Muhammad Waseem,H., Gallos,P., Epiphaniou,G., Maple,C., Haas,M., & Spruit,M. (2024). A Novel Taxonomy for Navigating and Classifying Synthetic Data in Healthcare Applications. In Stoicu-Tivadar et al. (eds), Studies in Health Technology and Informatics, 321, Collaboration across Disciplines for the Health of People, Animals and Ecosystems. EFMI Special Topic Conference (STC 2024) (pp. 259-263), 27-29 Nov 2024, Timisoara, Romania. 10.3233/SHTI241104
  2. Lefebvre,A., de Schipper,L., Haas,M., & Spruit,M. (2024). Empowering Translational Health Data Science Capabilities in Population Health Management: A Case of Building a Data Competence Center. In van de Wetering et al. (Eds.): I3E 2024, 23rd IFIP Conference e-Business, e-Services, and e-Society (I3E 2024), Lecture Notes in Computer Science, 14907. 11-13 September 2024, Heerlen, Netherlands. 10.1007/978-3-031-72234-9_33
  3. Gallos,P., Matragkas,N., Ul Islam,S., Epiphaniou,G., Hansen,S., Harrison,S., Van Dijk,B., Haas,M., Pappous,G., Brouwer,S., Torlontano,F., Farooq Abbasi,S., Pournik,O., Churm,J., Mantas,J., Luis Parra-Calderón,C., Petkousis,D., Weber,P., Dzingina,B., Mraidha,C., Maple,C., Achterberg,J., Spruit,M., Saratsioti,E., Moustaghfir,Y., & Arvanitis,T. (2024). INSAFEDARE Project: Innovative Applications of Assessment and Assurance of Data and Synthetic Data for Regulatory Decision Support. Studies in Health Technology and Informatics, 316, 1193-1197. 34th Medical Informatics Europe Conference (MIE 2024), 25-29 Aug 2024, Athens, Greece.
  4. Haastrecht,M., Brinkhuis,M., & Spruit,M. (2024). Federated Learning Analytics: Investigating the Privacy-Performance Trade-Off in Machine Learning for Educational Analytics. In: Olney et al. (eds), Artificial Intelligence in Education (AIED 2024), Lecture Notes in Computer Science, 14830 (pp. 62-74). 8-12 July 2024, Recife, Brazil. 10.1007/978-3-031-64299-9_5
  5. Dijk,B. van, Duijn,M. van, Kloostra,L., Spruit,M., & Beekhuizen,B. (2024). Using a Language Model to Unravel Semantic Development in Children's Use of a Dutch Perception Verb. 8th Workshop on Cognitive Aspects of the Lexicon (CogALex @ LREC-COLING 2024) (pp. 98-106). 20 May 2024, Torino, Italy. 2024 - Dijk Duijn Kloostra Spruit Beekhuizen.pdf
  6. Wang,R., Verberne,S., & Spruit,M. (2024). Attend All Options at Once: Full Context Input for Multi-choice Reading Comprehension. In European Conference on Information Retrieval (ECIR 2024) (pp. 387-402). 24-28 March 2024, Glasgow, Scotland. Cham: Springer. 10.1007/978-3-031-56027-9_24

Invited talks (46)

  1. 19/09/2024: Natural language processing for enriching real world evidence from electronic health records: AI @ Health Campus The Hague. 3rd Leiden Drug Development Conference (LDDC) - "Artificial Intelligence in drug development, manufacturing and health care", 19 September 2024, ECC, Leiden [20 min] 2024 0919 lddc.pps
  2. 02/07/2024: Natuurlijke Taalverwerking in de Zorg. AI en Technologie week Geneeskunde jaar 1, LUMC. [40 min] 2024 0702 ai-week-gnk-b1 NLP.pdf
  3. 21/03/2024: Natural language processing for enriching real world evidence from electronic health records: NLP @ Health Campus The Hague. Spring Symposium Young Epidemiologists, UMC Utrecht. [30 min] 2024 0321 spruit-haga.pdf
  4. 11/03/2024: Translational Data Science in Population Health: Data Techniques and Methodology for Violence as a Public Health Problem. KIEM Pressure Cooker Workshop, 11 March 2024. [10 min] 2024 0311 Kiem-pitch-tds-en.pdf
  5. 20/02/2024: Translational Data Science & AI: A case of Natural Language Processing for Violence Risk Assessment using CRISP-DM. Lorentz workshop Criminal Justice Settings, Crime, and Reintegration, Session on New insights from computer science and economics for the study of criminal justice involved individuals, Leiden. [30 min] www.lorentzcenter.nl 2024 0220 Lorentz spruit NLP.pdf

Leiden University committees (27)

  1. 2024-present: Lead AI for Health at the Leiden AI Center of Excellence (LACE).
  2. 2024-present: Member Self Steering Committee in UNA Europa for the One Health focus area. una-europa-leiden
  3. 2024-present: Member ELAN Scientific Board.
  4. 2023-present: Member LIACS Scientific Council.
  5. 2023-present: Member PHM Scientific Council.
  6. 2023-present: Member PHEG Stuurgroep Studenten Onderwijs (SSO).
  7. 2022-present: Lead of ELAN implementation case in LUMC/Health-RI node.
  8. 2022-present: Member LUMC Student Research Award committee.
  9. 2021-present: Co-lead Special Interest Group Health Data Science (with profs. Kraaij & Fiocco).
  10. 2021-present: Member core team LUMC Clinical AI Implementation and Research Lab (CAIRELab).
  11. 2021-present: Member Advisory board of LUMC Research Facility Data Analytics.
  12. 2021-present: Member LUMC Ph.D. guidance committee for C. Li (RADI), V. van der Valk (RADI), S. Bagcik (BDS), D. Lyu (RADI).

Oppositions (24)

  1. R. Butz (OU, 12/12/2024). Enhancing Medical Decision Making with Bayesian Networks: A Journey into Interpretability and User Perception (prof. H. van Ditmarsch, prof. R. Helms, A. Hommersom).
  2. M. van Buchem (LUMC, 11/12/2024). Natural Language Processing in Healthcare: Applications and Value (prof. E. Steyerberg, I. Kant, M. Bauer).
  3. D. Misoo Kim (Universidad de Murcia, 3/10/2024). Simulation and Visualization of Spatial-Temporal Data in Hospital Infection Outbreaks (dr. D. Manuel Campos Martínez, dr. José Manuel Juárez Herrero).
  4. L. Yang (LIACS, 20/9/2024). Information-theoretic Partition-based Models for Interpretable Machine Learning (dr. M. van Leeuwen, prof. A. Plaat).
  5. M. Fragkiadakis (LIACS, 9/4/2024, secretary). Digital Tools for Sign Language Research: Towards Recognition and Comparison of Lexical Signs (prof. M. Mous, P. van der Putten, V. Nyst).

Recent News & Observations