Bioinformatics Experience


R (Fluent) | Python (Intermediate)
Bioinformatics Service

I established and lead the phenotypic screening facility’s bioinformatics service, providing bespoke computational support to academic and industry users. This chargeable service supports a wide range of life science disciplines, including cancer biology, immunology, neuroscience, developmental biology, and skin research.

I specialise in deriving insights from complex imaging and omics datasets and in developing scalable, reproducible analysis pipelines to advance cutting-edge research.

I also promote best practice and version control through the facility GitHub.

Machine Learning (ML) and Artificial Intelligence (AI) Techniques:
  • Supervised learning: Logistic regression, Lasso regularisation, Elastic Net, Support Vector Machines (SVM), Random Forest, Multiple Discriminant Analysis, Neural Networks
  • Unsupervised learning: Hierarchical clustering, k-means clustering, Principal Component Analysis (PCA), Exploratory Factor Analysis (EFA)
Projects:
  • SAMP-Score: Ensemble machine learning model for senescence detection to support drug discovery in cancer - Project Link
  • SenPred: Senescence classification model for single-cell RNA sequencing transcriptomic data developed from 3D in vitro cell culture models - Project Link
  • Prognostic multiplexed immunofluorescence model utilising spatial assessment in oral epithelial dysplasia (manuscript in preparation)
ML Reviews:
High-Content Image Analysis Techniques:
  • Analysis: Z-scores, cluster analysis, IC50 determination, batch normalisation
  • Visualisations: Heatmaps, frequency distributions, dimensionality reduction (UMAP/t-SNE)
Projects:
  • Unsupervised characterisation of senescence via phenotypic assessment of morphology - Project Link
  • Phenocopying methodology comparing a genome-wide siRNA screen to a novel compound for target identification (in evaluation phase with commercial partner) - Press Release
  • Developed reusable pipelines and HTML guides for users to standardise imaging data analysis - GitHub
HCA Reviews:
Spatial Biology

I develop bespoke analysis pipelines for users of our Cell DIVE and PhenoCycler-Fusion multiplexed immunofluorescence platforms using the HALO digital pathology software. I have also mined published spatial transcriptomics datasets for users as a service.

Techniques:
  • Cluster-based cell phenotyping utilising dimensionality reduction and subject knowledge to assign cell types
  • k-nearest neighbour analysis to determine cell type tissue distribution
  • Cell neighbourhood analysis to determine cell type enrichment within tissue
  • Spatial distance and proximity assessments between cells and tissue architecture
High Performance Computing

I analyse both my own and users’ projects on the university’s High Performance Computing (HPC) cluster, working through the web interface and the command line.

Genomic / Proteomic Analysis Projects:
  • Proteomic assessment of mass spectrometry data from exosomes and conditioned media - Project Link
  • scRNA-seq transcriptomic analysis using example datasets from the Satija Lab - Project Link
Formal Courses
  • DataCamp “Data Scientist with R” track (88-hour online course).
  • Biochemical Society “R for Biochemists 101” (5-week online introduction to R).
  • LinkedIn Learning “Become a Data Scientist” (17-hour introductory course).
  • DataCamp “Data Analyst with Python” track (36-hour online course).
  • DataCamp “Machine Learning Fundamentals with Python” track (16-hour online course).