MSc Bioinformatics student at the University of Bologna, decoding the language of life through machine learning, LLMs, and computational biology. Bridging the gap between biology and AI — one sequence at a time.
Applying large language models to DNA sequence annotation, functional prediction, and uncovering latent patterns in large-scale biological datasets.
Exploring ESM, ProtTrans and AlphaFold integration for protein structure prediction and understanding protein–ligand interactions.
Building CNNs and Transformer architectures for DNA/RNA sequence classification, promoter analysis, and cancer genomics.
Applying GNNs to protein–protein interaction networks, molecular property prediction, and computational biology graphs (CS224W, Stanford).
Fusing genomics, transcriptomics, and proteomics data — including single-cell & spatial omics — to model biological systems holistically.
Using AI to identify optimal gene-editing targets, improve CRISPR precision, and mine biomedical text for novel drug candidates.
Classified promoter DNA sequences using SVM, Neural Networks, KNN, AdaBoost, and Naive Bayes. Achieved 96.3% accuracy with RBF-kernel SVM, further optimized via Particle Swarm Optimization.
Random Forest model trained on ClinVar data to predict the pathogenicity of KCNB1 gene variants. Benchmarked against the in-silico tools PolyPhen and SIFT using leave-one-out cross-validation (LOOCV) and a comprehensive performance analysis.
Predictive modeling pipeline for identifying signal peptides, which are critical for understanding protein secretion mechanisms and subcellular localization, using ML on protein sequence features.
Statistical analysis of fluorescence intensity data and methylation status from Illumina arrays in R. Covers beta values, M-values, probe characteristics, and differential methylation visualization.
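For readers unfamiliar with the two methylation measures mentioned here, the following is a minimal sketch of how they relate. It is written in Python for illustration and is not the project's R code. The function names, the example intensities, and the offset of 100 are assumptions chosen for the example.

```python
import math

def beta_value(meth, unmeth, offset=100.0):
    """Beta = M / (M + U + offset), a methylation fraction in [0, 1).

    `meth` and `unmeth` are the methylated and unmethylated probe
    intensities; `offset` (assumed 100 here) stabilises low-intensity probes.
    """
    return meth / (meth + unmeth + offset)

def m_value(beta):
    """M-value: the base-2 logit of the beta value."""
    return math.log2(beta / (1.0 - beta))

# Hypothetical probe intensities, for illustration only.
beta = beta_value(meth=3000.0, unmeth=900.0)
m = m_value(beta)
print(f"beta = {beta:.3f}, M = {m:.3f}")
```

Beta values are easier to interpret as a fraction of methylation, while M-values tend to behave better statistically in differential analysis, which is why both are usually reported.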
Built a profile Hidden Markov Model for the Kunitz-type protease inhibitor domain using HMMER and multiple sequence alignment, as part of a rigorous structural bioinformatics pipeline.
Exploring LLMs in genomics, deep reinforcement learning for sequence alignment, and generative models for protein design.
Open to research collaborations, PhD opportunities, and conversations about AI in genomics, LLMs in biology, or any exciting project at the intersection of computation and life.
Erfanzohrabi.ez@gmail.com