Guilherme S. Imai Aldeia, MSc, PhD

Guilherme earned his PhD in Computer Science in December 2025, focusing on the computational study and algorithmic improvement of symbolic regression methods and benchmarks. He also holds B.S. degrees in Computer Science and Neuroscience, all from the Federal University of ABC in Brazil. He is currently a postdoctoral researcher at CavaLab, where he joined in February 2026. His current work includes developing methods and applications of symbolic regression and large language models for healthcare, with an emphasis on interpretable decision-making and predictive modeling. He also works on machine learning models using fMRI data and neural network applications for temporal data. His research interests include model interpretability, data visualization, explainable AI, and healthcare prediction and modeling.

EDUCATION


PhD in Computer Science
Universidade Federal do ABC
M.Sc in Computer Science
Universidade Federal do ABC
B.Sc in Neuroscience
Universidade Federal do ABC
B.Sc in Computer Science
Universidade Federal do ABC
B.Sc in Science and Technology
Universidade Federal do ABC

EXPERIENCE


Postdoctoral Fellow
Boston Children's Hospital, Harvard Medical School
  • Postdoctoral researcher at CavaLab, BCH, and Pediatrics, HMS. Contributed to multiple research projects.
Latin-American Summer School in Computational Neuroscience
Universidad de Valparaíso
  • Attended a three-week summer school. Developed a project simulating neuron populations in the rat visual cortex, based on a literature review of neuron types and pathways.
Teaching Assistant
FIAP
  • Taught AI and ML courses online and in person. Prepared materials and exams, and mentored annual projects in partnership with tech companies. Emphasized hands-on, practical learning.
Visiting PhD Student
Boston Children's Hospital
  • Participated in lab meetings, journal clubs, and health science projects. Worked with data, received HIPAA/PHI training, contributed to grant reports, and gained SLURM experience.
Teaching Internship
Universidade Federal do ABC
  • Assisted in the Algorithm Complexity course. Prepared exercises and answer keys, taught mathematical proofs, and provided weekly student support.
Teaching Internship
Universidade Federal do ABC
  • Developed programming exercises and answer keys in Python and Java. Created test cases for automated grading. Collaborated weekly with the professor on course content.
UFABC Rocket Design
Universidade Federal do ABC
  • Developed Arduino code for sensor integration and telemetry in rocket embedded systems. Managed flight data collection, communication, and recovery system activation.
Heuristics, Analysis, and Learning (HAL) Laboratory
Universidade Federal do ABC
  • Conducted research in genetic programming for symbolic regression. Published and collaborated on multiple projects. Designed experiments and analyzed results.

RESEARCH SUMMARY


Published Papers
  • 14 publications, 210+ citations since 2020.
  • Three journal articles (Philos. Trans. R. Soc. A., IEEE Trans. Evol. Comput., Genet. Program. Evolvable Mach.)
  • Presented at international conferences (GECCO, ML4HC, WCCI CEC).
Peer Review
  • 12 journal, 13 conference, and 1 workshop reviews since 2022.
  • Journals: IEEE Trans. Evol. Comput., Royal Society.
  • Conferences: IEEE WCCI, IEEE CAI.
  • Workshop: IEEE WCCI SymReg.
Science Communication and Outreach
  • Evaluator for three years at Feira Brasileira de Ciência e Engenharia (FEBRACE), the largest Latin American science fair.
  • Contributed to Met@Aprendizagem, a project to improve programming skills for undergrad students of different majors.
  • Volunteered for two years at UFABC para todos, encouraging students to pursue higher education.

TECHNICAL SKILLS


Programming Languages
  • Fluent in Python, C, C++, Julia, Java, JavaScript, Arduino.
  • Experienced in Haskell, Matlab, R, Octave, TypeScript, HTML, CSS, SASS, PHP, C#, .NET, SQL.
  • Comfortable with object-oriented, functional, imperative, declarative, meta-programming, and multiple dispatch paradigms.
Tools & Systems
  • Linux user for 10+ years (Bash, system management, virtual environments).
  • Git for collaborative development, project management, and code review.
  • LaTeX for writing and formatting; cloud computing with SSH, Docker, SLURM.
Soft Skills
  • Quick learner, critical thinker, problem solver, adaptable, proactive, self-motivated.
Hard Skills
  • Data analysis and visualization (advanced plotting of multimodal data).
  • Statistics (t-test, ANOVA, p-value correction, sample size, confidence intervals, Fisher information).
  • Big data: mining, pipelines, standardization, harmonization, structured and unstructured databases.
  • Integrations: building and consuming REST APIs, webhooks, CI/CD.
  • Complex projects: contributed to codebases with 100+ files and 30k+ lines. Experience with test-driven development and large test suites. Developed and published Python libraries.
  • Reproducibility: reading papers, source code, and replicating computational experiments.
Languages
  • Portuguese (native), English (fluent), Spanish (basic).

PROJECTS


Large Language Models for Computable Phenotypes and Medical Calculators
  • Developed computable phenotypes and medical calculators from EHR data using large language models. Evaluated both proprietary and open-source models for medical tasks. Combined black-box optimization with LLM-generated code to improve sensitivity and specificity.
Learning Computable Phenotypes for Hypertension and Its Subtypes
  • Generated computable phenotypes for hypertension from EHR data using symbolic regression and language models. Designed and evaluated prompts, and combined symbolic regression with LLMs for code generation. Assessed phenotype performance on EHR data.
Signal and Noise Correlation in Simulated Neural Activity
  • Simulated neural populations to estimate Fisher information and study how correlations encode auditory stimuli. Modeled brain pathways using neuron populations with varying firing rates.
Understanding Pediatric Headaches with ML and Explainable AI
  • Trained and interpreted machine learning models on pediatric fMRI data to study pain mechanisms. Used data augmentation, various processing pipelines, and explainable AI to identify key brain regions. Validated findings by predicting brain age and pain scores.
Current Challenges in Symbolic Regression, PhD Thesis
  • Enhanced accuracy and interpretability in symbolic regression. Developed new methods for parent selection, parameter optimization, and function simplification. Conducted large-scale benchmarking to assess computational cost and solution quality.
Interpretability in Symbolic Regression, M.Sc Dissertation
  • Developed new explanation methods for symbolic regression using partial effects. Created a Python library for symbolic regression and robust explanations. Benchmarked explanatory methods for stability and reliability in feature importance.
Functional Connectivity Using Graph Theory to Predict Brain Development, Undergraduate Research
  • Predicted brain development three years ahead using graph theory and fMRI data. Applied machine learning to functional connectivity and graph centralities. Prepared fMRI data using standard preprocessing techniques. Predicted future psychopathology scores.
Digital Signal Processing of Spatial Audio for Source Localization
  • Applied seismic signal processing to localize sound sources using an array of eight microphones. Designed algorithms to identify victims in hard-to-access areas for drone-assisted rescue. Performed sensitivity analysis and submitted the project to an IEEE DSP competition.
Evolutionary Algorithms for Symbolic Regression Using Constrained Representations, Undergraduate Research
  • Proposed the Interaction-Transformation Evolutionary Algorithm (ITEA) for symbolic regression. Generated closed-form expressions and recovered known physics equations from data. Developed an online interface for model editing and visualization.

AWARDS


Student Presenter, Genetic and Evolutionary Computation Conference (GECCO) 2025
Association for Computing Machinery
Fellowship, Latin-American Summer School in Computational Neuroscience
Universidad de Valparaíso
Student Presenter, Genetic and Evolutionary Computation Conference (GECCO) 2024
Association for Computing Machinery
Doctoral Sandwich Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Doctoral Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Master's Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Undergraduate Research Fellowship in Neuroscience
Universidade Federal do ABC
1st Place, Undergraduate Final Project
Brazilian Symposium on Information Systems (CTCCSI)
Extension Project Fellowship, Met@Aprendizagem
Universidade Federal do ABC

PUBLICATIONS


  1. Imai Aldeia, G. S., Romano, J. D., Olivetti de França, F., Herman, D. S., & La Cava, W. G. (2026). Towards symbolic regression for interpretable clinical decision scores. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 384(2317), 20240588. https://doi.org/10.1098/rsta.2024.0588

  2. Aldeia, G. S. I., Herman, D. S., & La Cava, W. G. (2025). Iterative Learning of Computable Phenotypes for Treatment Resistant Hypertension using Large Language Models. Proceedings of Machine Learning Research, 298, 1–31. https://proceedings.mlr.press/v298/aldeia25a.html

  3. Aldeia, G. S. I., Zhang, H., Bomarito, G., Cranmer, M., Fonseca, A., Burlacu, B., La Cava, W. G., & de França, F. O. (2025). Call for Action: towards the next generation of symbolic regression benchmark. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’25 Companion), 2529–2538. https://doi.org/10.1145/3712255.3734309

  4. Aldeia, G. S. I., Moon, C., Shulman, J., Sethna, N., Smith, A., Lebel, A., La Cava, W. G., & Holmes, S. (2025). Application of Artificial Neural Networks and Functional Brain Connectivity to Inform Pediatric Headache. The Journal of Pain, 29. https://doi.org/10.1016/j.jpain.2025.105140

  5. Aldeia, G. S. I., de França, F. O., & La Cava, W. G. (2024). Inexact Simplification of Symbolic Regression Expressions with Locality-sensitive Hashing. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’24), 896–904. https://doi.org/10.1145/3638529.3654147

  6. Aldeia, G. S. I., de França, F. O., & La Cava, W. G. (2024). Minimum variance threshold for epsilon-lexicase selection. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’24), 905–913. https://doi.org/10.1145/3638529.3654149

  7. Aldeia, G. S. I., & de França, F. O. (2022). Interpretability in symbolic regression: a benchmark of explanatory methods using the Feynman data set. Genetic Programming and Evolvable Machines, 23, 309–349. https://doi.org/10.1007/s10710-022-09435-x

  8. Aldeia, G. S. I., & de França, F. O. (2022). Interaction-transformation evolutionary algorithm with coefficients optimization. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’22), 2274–2281. https://doi.org/10.1145/3520304.3533987

  9. de França, F. O., & Aldeia, G. S. I. (2021). Interaction-Transformation Evolutionary Algorithm for Symbolic Regression. Evolutionary Computation, 29(3), 367–390. https://doi.org/10.1162/evco_a_00285

  10. Aldeia, G. S. I., & de França, F. O. (2020). A Parametric Study of Interaction-Transformation Evolutionary Algorithm for Symbolic Regression. IEEE Congress on Evolutionary Computation (CEC), 1–8. https://doi.org/10.1109/CEC48606.2020.9185521

  11. Spadini, T., Aldeia, G. S. I., & others. (2019). On the application of SEGAN for the attenuation of the ego-noise in the speech sound source localization problem. 2019 Workshop on Communication Networks and Power Systems (WCNPS), 1–4. https://doi.org/10.1109/WCNPS.2019.8896308

  12. Aldeia, G. S. I., & de França, F. O. (2018). Lightweight Symbolic Regression with the Interaction-Transformation Representation. IEEE Congress on Evolutionary Computation (CEC), 1–8. https://doi.org/10.1109/CEC.2018.8477951

Hardcoded version, updated April 15, 2026

A collection of my posters and presentations can be found here.