Guilherme S. Imai Aldeia, MSc, PhD

LinkedIn: https://linkedin.com/in/guilherme-aldeia GitHub: https://github.com/gAldeia Instagram: https://instagram.com/guilhermeseidyo ORCID: https://orcid.org/0000-0002-0102-4958

Guilherme earned his PhD in December 2025 and researches genetic programming and large language models for health applications. He completed three majors at a top Brazilian federal university known for academic quality and international partnerships. His work includes developing methods for symbolic regression, applying machine learning to fMRI data, and advancing explainable AI. He has experience in digital signal processing and has led computational experiments resulting in publications. His interests include evolutionary computing, model interpretability, healthcare prediction, brain data analysis, and projects that improve quality of life.

EDUCATION

PhD in Computer Science
Universidade Federal do ABC

Santo André, SP, Brazil GPA: 4.0/4.0 Feb 2023 - Dec 2025

M.Sc in Computer Science
Universidade Federal do ABC

Santo André, SP, Brazil GPA: 4.0/4.0 Feb 2020 - Dec 2021

B.Sc in Neuroscience
Universidade Federal do ABC

Santo André, SP, Brazil May 2015 - Apr 2023

B.Sc in Computer Science
Universidade Federal do ABC

Santo André, SP, Brazil May 2015 - Feb 2020

B.Sc in Science and Technology
Universidade Federal do ABC

Santo André, SP, Brazil May 2015 - Dec 2019

EXPERIENCE

Postdoctoral Fellow
Boston Children's Hospital, Harvard Medical School

Boston, MA, USA 2026 - ongoing

Postdoctoral researcher at Cavalab, BCH, and Pediatrics, HMS. Contributed to multiple research projects.

Latin-American Summer School in Computational Neuroscience
Universidad de Valparaíso

Valparaíso, Chile 2025 (1 month)

Attended a three-week summer school. Developed a project simulating neuron populations in the rat visual cortex, based on literature review of neuron types and pathways.

Teaching Assistant
FIAP

São Paulo, SP, Brazil 2023 - 2025

Taught AI and ML courses online and in person. Prepared materials, exams, and mentored annual projects in partnership with tech companies. Emphasized hands-on, practical learning.

Visiting PhD Student
Boston Children's Hospital

Boston, MA, USA 2023 (6 months)

Participated in lab meetings, journal clubs, and health science projects. Worked with data, received HIPAA/PHI training, contributed to grant reports, and gained SLURM experience.

Teaching Internship
Universidade Federal do ABC

Santo André, SP, Brazil 2022 (3 months)

Assisted in the Algorithm Complexity course. Prepared exercises and answer keys, taught mathematical proofs, and provided weekly student support.

Teaching Internship
Universidade Federal do ABC

Santo André, SP, Brazil 2020 (3 months)

Developed programming exercises and answer keys in Python and Java. Created test cases for automated grading. Collaborated weekly with the professor on course content.

UFABC Rocket Design
Universidade Federal do ABC

Santo André, SP, Brazil 2017 - 2019

Developed Arduino code for sensor integration and telemetry in rocket embedded systems. Managed flight data collection, communication, and recovery system activation.

Heuristics, Analysis, and Learning (HAL) Laboratory
Universidade Federal do ABC

Santo André, SP, Brazil 2017 - 2025

Conducted research in genetic programming for symbolic regression. Published and collaborated on multiple projects. Designed experiments and analyzed results.

RESEARCH SUMMARY

Published Papers

14 publications, 190+ citations since 2020.
Two journal articles (IEEE Trans. Evol. Comput., Genet. Program. Evolvable Mach.).
Presented at international conferences (GECCO, ML4HC, WCCI CEC).

Peer Review

12 journal and 13 conference reviews since 2022.
Journals: IEEE Trans. Evol. Comput., Royal Society.
Conferences: IEEE WCCI, IEEE CAI.

Science Communication and Outreach

Evaluator for three years at Feira Brasileira de Ciência e Engenharia (FEBRACE), the largest Latin American science fair.
Contributed to Met@Aprendizagem, a project to improve programming skills for undergrad students of different majors.
Volunteered for two years at UFABC para todos, encouraging students to pursue higher education.

TECHNICAL SKILLS

Programming Languages

Fluent in Python, C, C++, Julia, Java, JavaScript, Arduino.
Experienced in Haskell, Matlab, R, Octave, TypeScript, HTML, CSS, SASS, PHP, C#, .NET, SQL.
Comfortable with object-oriented, functional, imperative, declarative, meta-programming, and multiple dispatch paradigms.

Tools & Systems

Linux user for 10+ years (Bash, system management, virtual environments).
Git for collaborative development, project management, code review.
LaTeX for writing and formatting; cloud computing with SSH, Docker, SLURM.

Soft Skills

Quick learning, critical thinking, problem-solving, adaptability, proactivity, self-motivation.

Hard Skills

Data analysis and visualization (advanced plotting of multimodal data).
Statistics (t-test, ANOVA, p-value correction, sample size, confidence intervals, Fisher information).
Big data: mining, pipelines, standardization, harmonization, structured and unstructured databases.
Integrations: building and consuming REST APIs, webhooks, CI/CD.
Complex projects: contributed to codebases with 100+ files and 30k+ lines. Experience with test-driven development and large test suites. Developed and published Python libraries.
Reproducibility: reading papers, source code, and replicating computational experiments.

Languages

Portuguese (native), English (fluent), Spanish (basic).

PROJECTS

Large Language Models for Computable Phenotypes and Medical Calculators

Boston Children's Hospital 2025 - ongoing

Developed computable phenotypes and medical calculators from EHR data using large language models. Evaluated both proprietary and open-source models for medical tasks. Combined black-box optimization with LLM-generated code to improve sensitivity and specificity.

Learning Computable Phenotypes for Hypertension and Its Subtypes

Boston Children's Hospital 2023 - ongoing

Generated computable phenotypes for hypertension from EHR data using symbolic regression and language models. Designed and evaluated prompts, and combined symbolic regression with LLMs for code generation. Assessed phenotype performance on EHR data.

Signal and Noise Correlation in Simulated Neural Activity

Universidade Federal do ABC 2024

Simulated neural populations to estimate Fisher information and study how correlations encode auditory stimuli. Modeled brain pathways using neuron populations with varying firing rates.

Understanding Pediatric Headaches with ML and Explainable AI

Boston Children's Hospital 2023 - ongoing

Trained and interpreted machine learning models on pediatric fMRI data to study pain mechanisms. Used data augmentation, various processing pipelines, and explainable AI to identify key brain regions. Validated findings by predicting brain age and pain scores.

Current Challenges in Symbolic Regression, PhD Thesis

Universidade Federal do ABC 2022 - 2025

Enhanced accuracy and interpretability in symbolic regression. Developed new methods for parent selection, parameter optimization, and function simplification. Conducted large-scale benchmarking to assess computational cost and solution quality.

Interpretability in Symbolic Regression, M.Sc Dissertation

Universidade Federal do ABC 2020 - 2022

Developed new explanation methods for symbolic regression using partial effects. Created a Python library for symbolic regression and robust explanations. Benchmarked explanatory methods for stability and reliability in feature importance.

Functional Connectivity Using Graph Theory to Predict Brain Development, Undergraduate Research

Universidade Federal do ABC 2020 - 2021

Predicted brain development three years ahead using graph theory and fMRI data. Applied machine learning to functional connectivity and graph centralities. Prepared fMRI data using standard preprocessing techniques. Predicted future psychopathology scores.

Digital Signal Processing of Spatial Audio for Source Localization

Universidade Federal do ABC 2019 (6 months)

Applied seismic signal processing to localize sound sources using an array of eight microphones. Designed algorithms to identify victims in hard-to-access areas for drone-assisted rescue. Performed sensitivity analysis and submitted to IEEE DSP competition.

Evolutionary Algorithms for Symbolic Regression Using Constrained Representations, Undergraduate Research

Universidade Federal do ABC 2019 - 2020

Proposed the Interaction-Transformation Evolutionary Algorithm (ITEA) for symbolic regression. Generated closed-form expressions and recovered known physics equations from data. Developed an online interface for model editing and visualization.

AWARDS

Student Presenter, Genetic and Evolutionary Computation Conference (GECCO) 2025
Association for Computing Machinery

2025

Fellowship, Latin-American Summer School in Computational Neuroscience
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

2025

Student Presenter, Genetic and Evolutionary Computation Conference (GECCO) 2024
Association for Computing Machinery

2024

Doctoral Sandwich Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

2023

Doctoral Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

2022

Master's Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

2020

Undergraduate Research Fellowship in Neuroscience
Universidade Federal do ABC

2020

1st Place, Undergraduate Final Project
Brazilian Symposium on Information Systems (CTCCSI)

2020

Extension Project Fellowship, Met@Aprendizagem
Universidade Federal do ABC

2018

PUBLICATIONS

Aldeia, G. S. I., Romano, J. D., de França, F. O., Herman, D. S., & La Cava, W. G. (2025). Towards symbolic regression for interpretable clinical decision scores.
Aldeia, G. S. I., Herman, D. S., & La Cava, W. G. (2025). Iterative Learning of Computable Phenotypes for Treatment Resistant Hypertension using Large Language Models. Proceedings of Machine Learning Research, 298, 1–31. https://proceedings.mlr.press/v298/aldeia25a.html
Aldeia, G. S. I., Zhang, H., Bomarito, G., Cranmer, M., Fonseca, A., Burlacu, B., La Cava, W. G., & de França, F. O. (2025). Call for Action: towards the next generation of symbolic regression benchmark. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’25 Companion), 2529–2538. https://doi.org/10.1145/3712255.3734309
Aldeia, G. S. I., Moon, C., Shulman, J., Sethna, N., Smith, A., Lebel, A., La Cava, W. G., & Holmes, S. (2025). Application of Artificial Neural Networks and Functional Brain Connectivity to Inform Pediatric Headache. The Journal of Pain, 29. https://doi.org/10.1016/j.jpain.2025.105140
Aldeia, G. S. I., de França, F. O., & La Cava, W. G. (2024). Inexact Simplification of Symbolic Regression Expressions with Locality-sensitive Hashing. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’24), 896–904. https://doi.org/10.1145/3638529.3654147
Aldeia, G. S. I., de França, F. O., & La Cava, W. G. (2024). Minimum variance threshold for epsilon-lexicase selection. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’24), 905–913. https://doi.org/10.1145/3638529.3654149
Aldeia, G. S. I., & de França, F. O. (2022). Interpretability in symbolic regression: a benchmark of explanatory methods using the Feynman data set. Genetic Programming and Evolvable Machines, 23, 309–349. https://doi.org/10.1007/s10710-022-09435-x
Aldeia, G. S. I., & de França, F. O. (2022). Interaction-transformation evolutionary algorithm with coefficients optimization. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’22 Companion), 2274–2281. https://doi.org/10.1145/3520304.3533987
de França, F. O., & Aldeia, G. S. I. (2021). Interaction–Transformation Evolutionary Algorithm for Symbolic Regression. Evolutionary Computation, 29(3), 367–390. https://doi.org/10.1162/evco_a_00285
Aldeia, G. S. I., & de França, F. O. (2020). A Parametric Study of Interaction-Transformation Evolutionary Algorithm for Symbolic Regression. IEEE Congress on Evolutionary Computation (CEC), 1–8. https://doi.org/10.1109/CEC48606.2020.9185521
Spadini, T., Aldeia, G. S. I., & others. (2019). On the application of SEGAN for the attenuation of the ego-noise in the speech sound source localization problem. 2019 Workshop on Communication Networks and Power Systems (WCNPS), 1–4. https://doi.org/10.1109/WCNPS.2019.8896308
Aldeia, G. S. I., & de França, F. O. (2018). Lightweight Symbolic Regression with the Interaction-Transformation Representation. IEEE Congress on Evolutionary Computation (CEC), 1–8. https://doi.org/10.1109/CEC.2018.8477951