Guilherme S. Imai Aldeia, MSc, PhD

Guilherme earned his PhD in December 2025 and researches genetic programming and large language models for health applications. He completed three majors at a top Brazilian federal university known for academic quality and international partnerships. His work includes developing methods for symbolic regression, applying machine learning to fMRI data, and advancing explainable AI. He has experience in digital signal processing and has led computational experiments resulting in publications. His interests include evolutionary computing, model interpretability, healthcare prediction, brain data analysis, and projects that improve quality of life.

EDUCATION


PhD in Computer Science
Universidade Federal do ABC
M.Sc in Computer Science
Universidade Federal do ABC
B.Sc in Neuroscience
Universidade Federal do ABC
B.Sc in Computer Science
Universidade Federal do ABC
B.Sc in Science and Technology
Universidade Federal do ABC

EXPERIENCE


Postdoctoral Fellow
Boston Children's Hospital, Harvard Medical School
  • Postdoctoral researcher at Cavalab, BCH, and Pediatrics, HMS. Contributed to multiple research projects.
Latin-American Summer School in Computational Neuroscience
Universidad de Valparaíso
  • Attended a three-week summer school. Developed a project simulating neuron populations in the rat visual cortex, based on literature review of neuron types and pathways.
Teaching Assistant
FIAP
  • Taught AI and ML courses online and in person. Prepared materials, exams, and mentored annual projects in partnership with tech companies. Emphasized hands-on, practical learning.
Visiting PhD Student
Boston Children's Hospital
  • Participated in lab meetings, journal clubs, and health science projects. Worked with data, received HIPAA/PHI training, contributed to grant reports, and gained SLURM experience.
Teaching Internship
Universidade Federal do ABC
  • Assisted in the Algorithm Complexity course. Prepared exercises and answer keys, taught mathematical proofs, and provided weekly student support.
Teaching Internship
Universidade Federal do ABC
  • Developed programming exercises and answer keys in Python and Java. Created test cases for automated grading. Collaborated weekly with the professor on course content.
UFABC Rocket Design
Universidade Federal do ABC
  • Developed Arduino code for sensor integration and telemetry in rocket embedded systems. Managed flight data collection, communication, and recovery system activation.
Heuristics, Analysis, and Learning (HAL) Laboratory
Universidade Federal do ABC
  • Conducted research in genetic programming for symbolic regression. Published and collaborated on multiple projects. Designed experiments and analyzed results.

RESEARCH SUMMARY


Published Papers
  • 14 publications, 190+ citations since 2020.
  • Two journal articles (IEEE Trans. Evol. Comput., Genet. Program. Evolvable Mach.)
  • Presented at international conferences (GECCO, ML4HC, WCCI CEC).
Peer Review
  • 12 journal and 13 conference reviews since 2022.
  • Journals: IEEE Trans. Evol. Comput., Royal Society.
  • Conferences: IEEE WCCI, IEEE CAI.
Science Communication and Outreach
  • Evaluator for three years at Feira Brasileira de Ciência e Engenharia (FEBRACE), the largest Latin American science fair.
  • Contributed to Met@Aprendizagem, a project to improve the programming skills of undergraduate students across different majors.
  • Volunteered for two years at UFABC para todos, encouraging students to pursue higher education.

TECHNICAL SKILLS


Programming Languages
  • Fluent in Python, C, C++, Julia, Java, JavaScript, Arduino.
  • Experienced in Haskell, Matlab, R, Octave, TypeScript, HTML, CSS, SASS, PHP, C#, .NET, SQL.
  • Comfortable with object-oriented, functional, imperative, declarative, meta-programming, and multiple dispatch paradigms.
Tools & Systems
  • Linux user for 10+ years (Bash, system management, virtual environments).
  • Git for collaborative development, project management, code review.
  • LaTeX for writing and formatting; cloud computing with SSH, Docker, SLURM.
Soft Skills
  • Quick learner, critical thinker, problem solver, adaptable, proactive, self-motivated.
Hard Skills
  • Data analysis and visualization (advanced plotting of multimodal data).
  • Statistics (t-test, ANOVA, p-value correction, sample size, confidence intervals, Fisher information).
  • Big data: mining, pipelines, standardization, harmonization, structured and unstructured databases.
  • Integrations: building and consuming REST APIs, webhooks, CI/CD.
  • Complex projects: contributed to codebases with 100+ files and 30k+ lines. Experience with test-driven development and large test suites. Developed and published Python libraries.
  • Reproducibility: reading papers, source code, and replicating computational experiments.
Languages
  • Portuguese (native), English (fluent), Spanish (basic).

PROJECTS


Large Language Models for Computable Phenotypes and Medical Calculators
  • Developed computable phenotypes and medical calculators from EHR data using large language models. Evaluated both proprietary and open-source models for medical tasks. Combined black-box optimization with LLM-generated code to improve sensitivity and specificity.
Learning Computable Phenotypes for Hypertension and its Subtypes
  • Generated computable phenotypes for hypertension from EHR data using symbolic regression and language models. Designed and evaluated prompts, and combined symbolic regression with LLMs for code generation. Assessed phenotype performance on EHR data.
Signal and Noise Correlation in Simulated Neural Activity
  • Simulated neural populations to estimate Fisher information and study how correlations encode auditory stimuli. Modeled brain pathways using neuron populations with varying firing rates.
Understanding Pediatric Headaches with ML and Explainable AI
  • Trained and interpreted machine learning models on pediatric fMRI data to study pain mechanisms. Used data augmentation, various processing pipelines, and explainable AI to identify key brain regions. Validated findings by predicting brain age and pain scores.
Current Challenges in Symbolic Regression, PhD Thesis
  • Enhanced accuracy and interpretability in symbolic regression. Developed new methods for parent selection, parameter optimization, and function simplification. Conducted large-scale benchmarking to assess computational cost and solution quality.
Interpretability in Symbolic Regression, M.Sc Dissertation
  • Developed new explanation methods for symbolic regression using partial effects. Created a Python library for symbolic regression and robust explanations. Benchmarked explanatory methods for stability and reliability in feature importance.
Functional Connectivity Using Graph Theory to Predict Brain Development, Undergraduate Research
  • Predicted brain development three years ahead using graph theory and fMRI data. Applied machine learning to functional connectivity and graph centralities. Prepared fMRI data using standard preprocessing techniques. Predicted future psychopathology scores.
Digital Signal Processing of Spatial Audio for Source Localization
  • Applied seismic signal processing to localize sound sources using an array of eight microphones. Designed algorithms to identify victims in hard-to-access areas for drone-assisted rescue. Performed sensitivity analysis and submitted to IEEE DSP competition.
Evolutionary Algorithms for Symbolic Regression Using Constrained Representations, Undergraduate Research
  • Proposed the Interaction-Transformation Evolutionary Algorithm (ITEA) for symbolic regression. Generated closed-form expressions and recovered known physics equations from data. Developed an online interface for model editing and visualization.

AWARDS


Student Presenter, Genetic and Evolutionary Computation Conference (GECCO) 2025
Association for Computing Machinery
Fellowship, Latin-American Summer School in Computational Neuroscience
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Student Presenter, Genetic and Evolutionary Computation Conference (GECCO) 2024
Association for Computing Machinery
Doctoral Sandwich Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Doctoral Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Master's Fellowship
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Undergraduate Research Fellowship in Neuroscience
Universidade Federal do ABC
1st Place, Undergraduate Final Project
Brazilian Symposium on Information Systems (CTCCSI)
Extension Project Fellowship, Met@Aprendizagem
Universidade Federal do ABC

PUBLICATIONS


  1. Aldeia, G. S. I., Romano, J. D., de França, F. O., Herman, D. S., & La Cava, W. G. (2025). Towards symbolic regression for interpretable clinical decision scores.

  2. Aldeia, G. S. I., Herman, D. S., & La Cava, W. G. (2025). Iterative Learning of Computable Phenotypes for Treatment Resistant Hypertension using Large Language Models. Proceedings of Machine Learning Research, 298, 1–31. https://proceedings.mlr.press/v298/aldeia25a.html

  3. Aldeia, G. S. I., Zhang, H., Bomarito, G., Cranmer, M., Fonseca, A., Burlacu, B., La Cava, W. G., & de França, F. O. (2025). Call for Action: towards the next generation of symbolic regression benchmark. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’25 Companion), 2529–2538. https://doi.org/10.1145/3712255.3734309

  4. Aldeia, G. S. I., Moon, C., Shulman, J., Sethna, N., Smith, A., Lebel, A., La Cava, W. G., & Holmes, S. (2025). Application of Artificial Neural Networks and Functional Brain Connectivity to Inform Pediatric Headache. The Journal of Pain, 29. https://doi.org/10.1016/j.jpain.2025.105140

  5. Aldeia, G. S. I., de França, F. O., & La Cava, W. G. (2024). Inexact Simplification of Symbolic Regression Expressions with Locality-sensitive Hashing. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’24), 896–904. https://doi.org/10.1145/3638529.3654147

  6. Aldeia, G. S. I., de França, F. O., & La Cava, W. G. (2024). Minimum variance threshold for epsilon-lexicase selection. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO ’24), 905–913. https://doi.org/10.1145/3638529.3654149

  7. Aldeia, G. S. I., & de França, F. O. (2022). Interpretability in symbolic regression: a benchmark of explanatory methods using the Feynman data set. Genetic Programming and Evolvable Machines, 23, 309–349. https://doi.org/10.1007/s10710-022-09435-x

  8. Aldeia, G. S. I., & de França, F. O. (2022). Interaction-transformation evolutionary algorithm with coefficients optimization. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO ’22), 2274–2281. https://doi.org/10.1145/3520304.3533987

  9. de França, F. O., & Aldeia, G. S. I. (2021). Interaction–Transformation Evolutionary Algorithm for Symbolic Regression. Evolutionary Computation, 29(3), 367–390. https://doi.org/10.1162/evco_a_00285

  10. Aldeia, G. S. I., & de França, F. O. (2020). A Parametric Study of Interaction-Transformation Evolutionary Algorithm for Symbolic Regression. IEEE Congress on Evolutionary Computation (CEC), 1–8. https://doi.org/10.1109/CEC48606.2020.9185521

  11. Spadini, T., Aldeia, G. S. I., & others. (2019). On the application of SEGAN for the attenuation of the ego-noise in the speech sound source localization problem. 2019 Workshop on Communication Networks and Power Systems (WCNPS), 1–4. https://doi.org/10.1109/WCNPS.2019.8896308

  12. Aldeia, G. S. I., & de França, F. O. (2018). Lightweight Symbolic Regression with the Interaction-Transformation Representation. IEEE Congress on Evolutionary Computation (CEC), 1–8. https://doi.org/10.1109/CEC.2018.8477951