Welcome to Prof. Dr. Marco Spruit's academic website. This dashboard highlights various aspects of my current work activities. Navigate to the appropriate pages in the 'Blue MenuBar' (above) for more details on my work as Full Professor of Advanced Data Science in Population Health at Leiden University's Medical Center (PHEG) and its Faculty of Science (LIACS). I also actively maintain my Google Scholar and ORCID profile pages. Feel free to connect with me on LinkedIn.

Active Grants & Collaborations

Recent Publications

Students & Talks & Committees

Research grants (22)

2024-2025: Phaeton, EUR 150K (LUMC) + EUR 50K (LIACS).
Pandemic preparedness. Portable platform as a service for crowdsourced and privacy-respecting data analysis and modeling. Financer: ZonMW Modelleren voor Pandemische Paraatheid: een oproep tot innovatie en kennisontwikkeling SA 2023. Remark: ZonMW dossier #10710062310030, grant total: 500K EUR.
2024-2026: ECOTIP, EUR 130K (LUMC).
Identifying tipping points of the effects of living environments on ecosyndemics of lifestyle-related illnesses by ML/NLP modelling of a patient segmentation model based on EHR and environmental data. Applicant(s): Kiefte,J., Spruit,M., Vos,R., et al. Remark: grant total: 4.4M EUR. Researcher(s): Muizelaar,H. www.nwo.nl/en/projects/nwa151822151
2023-2026: INSAFEDARE, EUR 571K (LUMC).
Innovative applications of assessment and assurance of data and synthetic data for regulatory decision support. Generation and evaluation of a benchmarking synthetic dataset amenable to the regulatory process, analytical methods for validation of digital health applications, and components for data integration pipelines. Financer(s): Horizon Europe: HORIZON-HLTH-2022-TOOL-11-02: Tools and technologies for a healthy society. Applicant(s): Despotou,G. et al. HEU project #101095661; grant total: 4.8M EUR. Researcher(s): Achterberg,J. & Dijk,B. van 10.3030/101095661
2024: EuroQoL-LLM, EUR 1,325 (LUMC).
Applying Large Language Models to Identify EQ-5D Bolt-ons Based on Patient Text Data. Financer: EuroQol Group Seed grant: 1792-SG. Applicant: van den Akker-van Marle,E., Spruit,M., et al. Remark: Grant total: 42K EUR. Researcher(s): Heijdra Suasnabar,J. et al. euroqol.org/research-at-euroqol/our-research-portfolio/funded-projects/
2023-2024: HealthBox, EUR 66K (LUMC).
A personalized, home-based eHealth intervention to treat metabolic syndrome and prevent its complications by ML/NLP modelling of a patient segmentation model based on EHR and environmental data. Applicant(s): Chavannes,N., Atsma,D., Pijl,H., Vos,R., et al. Remark: grant total: 2.5M EUR. Researcher(s): Muizelaar,H. www.nwo.nl/en/projects/kich1gz0321007
2021-2024: VIPP, EUR 60K (LUMC).
Virtual Patients and Population Dataset. Develop a synthetic ELAN dataset to improve data science teaching. Financer(s): LUMC Interprofessional Education (IPE) programme. Applicant(s): Spruit,M., & Szuhai,K. Remark: Project Raamplan Implementatie Artsopleiding (PRIMA) 2020 working group deliverable with respect to Theme 5 on Big Data and AI. Researcher(s): Faiq,A. healthcampusdenhaag.nl/nl/project/virtuele-patient-en-populatie-vipp-dataset/

Research theme

Research collaborations (16)

2022-2026: PreProMMF (ULEI)
Natural Language Processing in Mental Health: Detection, Prediction and Promotion with Multilingual, Multimodal and Federated Techniques. Sponsor: Arab Academy of Science, Technology & Maritime Transport (AAST). Financed as a 60% lecturer - 40% researcher contract. Researcher(s): Khalil,S.
2021-2025: Data2Bedside (LUMC)
Reusing routinely collected data from regional GP offices in ELAN to create a clinical decision support tool to identify disease progression risk levels in Type Two Diabetes Mellitus (T2DM) patients. Sponsor: Kingdom of Saudi Arabia scholarship. Researcher(s): Alfaraj,S.
2021-2026: PHA (LUMC)
Population Health Analytics. Maturity modelling for situational data infrastructure and scenario planning towards appropriate regional intelligence. Sponsor: Q-Consult Zorg. Researcher(s): Roorda,E.
2018-2024: PbD (UU)
Privacy-by-Design. How organisations can demonstrate responsible data use in information systems through Privacy-by-Design. Sponsor: P&O Rijk. Researcher(s): Dijk,F. van

Journal articles (111)

  1. Álvarez-Chaves,H., Spruit,M., & R-Moreno,M. (2024). Improving ED admissions forecasting by using generative AI: An approach based on DGAN. Computer Methods and Programs in Biomedicine, 256, 108363. 10.1016/j.cmpb.2024.108363
  2. Achterberg,J., Haas,M., & Spruit,M. (2024). On the evaluation of synthetic longitudinal electronic health records. BMC Medical Research Methodology, 24, 181. 10.1186/s12874-024-02304-4
  3. Haastrecht,M. van, Haas,M., Brinkhuis,M., & Spruit,M. (2024). Understanding Validity Criteria in Technology-Enhanced Learning: A Systematic Literature Review. Computers & Education, 220, 105128. 10.1016/j.compedu.2024.105128
  4. Rijcken,E., Zervanou,K., Mosteiro,P., Scheepers,F., Spruit,M., & Kaymak,U. (2024). Topic Specificity: a Descriptive Metric for Algorithm Selection and Finding the Right Number of Topics. Natural Language Processing Journal, 8, 100082. 10.1016/j.nlp.2024.100082
  5. Muizelaar,H., Haas,M., van Dortmont,K., van der Putten,P., & Spruit,M. (2024). Extracting Patient Lifestyle Characteristics from Dutch Clinical Text with BERT Models. BMC Medical Informatics and Decision Making, 24, 151. 10.1186/s12911-024-02557-5
  6. Khalil,S., Tawfik,N., & Spruit,M. (2024). Federated learning for privacy-preserving depression detection with multilingual language models in social media posts. Patterns, 5, 100990. 10.1016/j.patter.2024.100990

Wordcloud

gScholar statistics

Metric      All    Since 2019
Citations   4635   3317
h-index     37     30
i10-index   94     75

Conference proceedings (88)

  1. Van Dijk,B., Ul Islam,S., Achterberg,J., Muhammad Waseem,H., Gallos,P., Epiphaniou,G., Maple,C., Haas,M., & Spruit,M. (In press). A Novel Taxonomy for Navigating and Classifying Synthetic Data in Healthcare Applications. EFMI Special Topic Conference (STC 2024), 27-29 Nov 2024, Timisoara, Romania.
  2. Lefebvre,A., de Schipper,L., Haas,M., & Spruit,M. (2024). Empowering Translational Health Data Science Capabilities in Population Health Management: A Case of Building a Data Competence Center. In van de Wetering et al. (Eds.): I3E 2024, 23rd IFIP Conference e-Business, e-Services, and e-Society (I3E 2024), Lecture Notes in Computer Science, 14907. 11-13 September 2024, Heerlen, Netherlands. 10.1007/978-3-031-72234-9_33
  3. Gallos,P., Matragkas,N., Ul Islam,S., Epiphaniou,G., Hansen,S., Harrison,S., Van Dijk,B., Haas,M., Pappous,G., Brouwer,S., Torlontano,F., Farooq Abbasi,S., Pournik,O., Churm,J., Mantas,J., Luis Parra-Calderón,C., Petkousis,D., Weber,P., Dzingina,B., Mraidha,C., Maple,C., Achterberg,J., Spruit,M., Saratsioti,E., Moustaghfir,Y., & Arvanitis,T. (2024). INSAFEDARE Project: Innovative Applications of Assessment and Assurance of Data and Synthetic Data for Regulatory Decision Support. Studies in health technology and informatics, 316, 1193-1197. 34th Medical Informatics Europe Conference (MIE 2024), 25-29 Aug 2024, Athens, Greece.
  4. Haastrecht,M., Brinkhuis,M., & Spruit,M. (2024). Federated Learning Analytics: Investigating the Privacy-Performance Trade-Off in Machine Learning for Educational Analytics. In: Olney et al. (eds), Artificial Intelligence in Education (AIED 2024), Lecture Notes in Computer Science, 14830 (pp. 62-74). 8-12 July 2024, Recife, Brazil. 10.1007/978-3-031-64299-9_5
  5. Dijk,B. van, Duijn,M. van, Kloostra,L., Spruit,M., & Beekhuizen,B. (2024). Using a Language Model to Unravel Semantic Development in Children's Use of a Dutch Perception Verb. 8th Workshop on Cognitive Aspects of the Lexicon (CogALex@LREC-COLING 2024) (pp. 98-106). 20 May 2024, Torino, Italy.
  6. Wang,R., Verberne,S., & Spruit,M. (2024). Attend All Options at Once: Full Context Input for Multi-choice Reading Comprehension. In European Conference on Information Retrieval (ECIR 2024) (pp. 387-402). 24-28 March 2024, Glasgow, Scotland. Cham: Springer. 10.1007/978-3-031-56027-9_24

Postdocs & PhD candidates (11)

MSc students (97)

  1. Thiel,Haike van (In progress). Personalised and realistic training scenarios with artificial patients using AI models in Trauma Care. Civil-Military Centre of Expertise for Trauma Care (CETC).
  2. Nguyen,Van (Committed). From mobile app to furry social robot: Welzijn.AI.
  3. Drougkas,Georgios (25/06/2024). Multimodal Machine Learning for Language/Speech Markers Identification in Mental Health. Spruit,Marco, & Bakker,Erwin (UL). [8.0]
  4. Rameshchandra,Ramya Tumkur (25/06/2024). Unsupervised machine learning methods to understand the social and psychological effects of prescription opioids. Spruit,Marco, & Baratchi,Mitra (UL). [7.5]
  5. Tomassen,Floris (05/02/2024). LLM-Based Data Generation techniques for end-to-end models of grammatical error correction applied to Dutch Care Text. Spruit,Marco, & Wijnholds,Gijs (Prime Vision). [8.5]

BSc students (61)

  1. Leito, Roderick (08/10/2024). Integration of the EQ5D PROM questionnaire into a natural and unobtrusive conversation using a RASA-driven chatbot. Spruit,Marco & Lefebvre,Armel (UL). [7.5]
  2. Baghdasaryan, Ruzanna (27/08/2024). Questionnaire-driven Dialogue: Utilizing Large Language Models for Hallucination-free Conversational AI in Elderly Well-being Monitoring. Spruit,Marco & Lefebvre,Armel (UL). [8.5]
  3. Tanoesemito, Charma (01/03/2024). Reconstructing family relationships using routine primary care Electronic Health Record database. Life Sciences and Technology (LST) programme. Spruit,Marco; Marian Beekman, Niels van den Berg (MOLEPI). [8.0]
  4. Lelasseux, Maxine (05/02/2024). Analyzing offenses against life data: a machine learning approach on data extracted from the Human Relations Area Files (HRAF) database. Spruit,Marco; Liem,Marieke; Syme,Katharina (FGGA/ISGA). [6.5]

Invited talks (46)

  1. 19/09/2024: Natural language processing for enriching real world evidence from electronic health records: AI @ Health Campus The Hague. 3rd Leiden Drug Development Conference (LDDC) - "Artificial Intelligence in drug development, manufacturing and health care", 19 September 2024, ECC, Leiden [20 min]
  2. 02/07/2024: Natural Language Processing in Healthcare (Natuurlijke Taalverwerking in de Zorg). AI and Technology week, Medicine year 1 (AI en Technologie week Geneeskunde jaar 1), LUMC. [40 min]
  3. 21/03/2024: Natural language processing for enriching real world evidence from electronic health records: NLP @ Health Campus The Hague. Spring Symposium Young Epidemiologists, UMC Utrecht. [30 min]
  4. 11/03/2024: Translational Data Science in Population Health: Data Techniques and Methodology for Violence as a Public Health Problem. KIEM Pressure Cooker Workshop, 11 March 2024. [10 min]
  5. 20/02/2024: Translational Data Science & AI: A case of Natural Language Processing for Violence Risk Assessment using CRISP-DM. Lorentz workshop Criminal Justice Settings, Crime, and Reintegration, Session on New insights from computer science and economics for the study of criminal justice involved individuals, Leiden. [30 min] www.lorentzcenter.nl

Leiden University committees (26)

  1. 2024-present: Self Steering Committee Member in UNA Europa, One Health Focus Area. .../una-europa-leiden/self-steering-committees
  2. 2024-present: Member ELAN Scientific Board.
  3. 2023-present: Member LIACS Scientific Council.
  4. 2023-present: Member PHM Scientific Council.
  5. 2023-present: Member PHEG Stuurgroep Studenten Onderwijs (SSO).
  6. 2022-present: Lead of ELAN implementation case in LUMC/Health-RI node.
  7. 2022-present: Member LUMC Student Research Award committee.
  8. 2021-present: Co-lead Special Interest Group Health Data Science (with profs. Kraaij & Fiocco).
  9. 2021-present: Member core team LUMC Clinical AI Implementation and Research Lab (CAIRELab).
  10. 2021-present: Member Advisory board of LUMC Research Facility Data Analytics.
  11. 2021-present: Member LUMC Ph.D. guidance committee for C. Li (RADI), V. van der Valk (RADI), S. Bagcik (BDS), D. Lyu (RADI).

Oppositions (21)

  1. D. Misoo Kim (Universidad de Murcia, 3/10/2024). Simulation and Visualization of Spatial-Temporal Data in Hospital Infection Outbreaks (dr. D. Manuel Campos Martínez, dr. José Manuel Juárez Herrero).
  2. L. Yang (LIACS, 20/9/2024). Information-theoretic Partition-based Models for Interpretable Machine Learning (dr. M. van Leeuwen, prof. A. Plaat).
  3. M. Fragkiadakis (LIACS, 9/4/2024, secretary). Digital Tools for Sign Language Research: Towards Recognition and Comparison of Lexical Signs (prof. M. Mous, P. van der Putten, V. Nyst).
  4. M. Lao (LIACS, 28/11/2023). Exploring Deep Learning for Multimodal Understanding (prof. M. Lew, prof. A. Plaat).
  5. R. Turner (ULEI/MI, 14/11/2023). Safe Anytime-Valid Inference: from Theory to Implementation in Psychiatry Research (prof. P. Grünwald, prof. F. Scheepers, A. Harma).