David Nasonov

MSc Applied Bioinformatics Graduate looking for work

About Me

Hello, I'm David! I am a recent graduate in Applied Bioinformatics with hands-on experience in NGS processing and analysis in Python, Bash and R. I built this page so that you can easily find information about my past projects and experience, as well as what I'm currently working on. I am looking for work in bioinformatics and computational biology, and have recently been diving into the world of machine learning. Currently reading: "Comprehensive benchmarking of large language models for RNA secondary structure prediction" (Zablocki et al., 2025). When I'm not bashing my head trying to debug a piece of code, I enjoy reading and competing in Brazilian Jiu-Jitsu!

Experience

Research Placement

University of Nottingham - Nottingham

Jul 2023 - Aug 2023

- Pre-processed single-cell RNA-Seq data obtained from different types of pluripotent stem cells in mice (Bash)
- Visualised the data (volcano plots, PCA, heatmaps)
- Performed differential gene expression analysis, GO-term analysis and pathway mapping (KEGG/Reactome) using Python and Bash
- Identified key genes for pluripotent differentiation using the analysis and existing literature
- Built a Boolean gene regulation network that recreates different phenotypes based on extracellular signals
- Model used as a prototype for a grant appeal

Under Prof. Dov Stekel

Python DESeq2 FASTQ PCA GO-term KEGG Gene Regulation Network

Consultancy Group Leader

SnazzyBird - Nottingham

Feb 2023 - Mar 2023

- Led a group of 5 students working on a consultancy project for a real client
- Scheduled, organised and led weekly meetings with the group and the client
- Set the project timeline and adjusted it as circumstances changed
- Conducted market research and analysed market data, creating figures in MS Excel and Python
- Prepared a report and presented it to the client at the end of the project
- Received an excellent mark on the module and overwhelmingly positive feedback from the client

MS Office Trello forms

Private Tutor

1:2:1 Private - Russia

Oct 2019 - May 2021

- Taught 6 students in preparation for the Russian State Exam in English, as well as the Cambridge B2 English exam
- 2 of the students also received tutoring in the school subject of Biology
- Managed scheduling of 2 classes a week per student while also preparing for my own exams
- All students passed their exams with excellent marks, allowing them to continue into their desired higher education programmes

Scientific Communication Time Management

Featured Projects

MSc Applied Bioinformatics Dissertation

Part 1: Developed an aggregate model combining results from three unsupervised learning algorithms (Isolation Forest, DBSCAN, autoencoder) to flag athletes suspected of growth hormone abuse based on biomarkers.

Part 2: Built a pre-processing pipeline for raw liquid chromatography-mass spectrometry (LC-MS) data, then used the pre-processed data to train a Siamese neural network that identifies whether two samples come from the same person. AUC = 92.3%, F1-score = 0.91
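
One simple way to combine several anomaly detectors, as in Part 1, is a vote over their per-sample flags. The sketch below is illustrative only: the voting rule, the `aggregate_flags` helper, the `min_votes` threshold and the example flags are all assumptions, not the aggregation actually used in the dissertation.

```python
from typing import Dict, List


def aggregate_flags(detector_flags: Dict[str, List[bool]], min_votes: int = 2) -> List[bool]:
    """Flag a sample when at least `min_votes` detectors mark it as anomalous."""
    columns = list(detector_flags.values())
    if not columns:
        return []
    n_samples = len(columns[0])
    if any(len(col) != n_samples for col in columns):
        raise ValueError("all detectors must score the same samples")
    # Count the True votes per sample and compare against the threshold.
    return [sum(col[i] for col in columns) >= min_votes for i in range(n_samples)]


# Hypothetical per-sample outputs (True = anomalous) from the three detectors.
flags = {
    "isolation_forest": [True, False, True, False],
    "dbscan": [True, False, False, False],
    "autoencoder": [False, False, True, True],
}
suspects = aggregate_flags(flags)  # samples flagged by at least two detectors
```

Raising `min_votes` trades sensitivity for fewer false positives, which matters when a flag triggers follow-up testing.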

Machine Learning Python Nextflow Docker Conda R pandas numpy PyTorch TensorFlow LC-MS
Boolean Gene Regulation Network

- Processed raw counts from single-cell RNA-Seq of three different types of pluripotent stem cells in mice
- Visualised the data using volcano plots, PCA and heatmaps (seaborn, matplotlib)
- Performed downstream analysis, including differential gene expression, GO-term analysis and pathway mapping (KEGG/Reactome), in Python and Bash
- Determined a key set of genes for pluripotent differentiation and their relationships, using the analysis results and the existing literature
- Constructed a Boolean GRN in which manipulating the presence or absence of 5 extracellular signals accurately modelled the gene activation state
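
To illustrate the idea of a Boolean GRN driven by extracellular signals, here is a minimal sketch in plain Python. It does not use pyBoolNet, and the genes, signals and update rules (`gene_a`, `gene_b`, `signal_1`, `signal_2`) are hypothetical placeholders rather than the network from the project.

```python
from typing import Callable, Dict

State = Dict[str, bool]

# Hypothetical update rules: each gene's next state is a Boolean function
# of the current state (genes plus fixed extracellular signals).
RULES: Dict[str, Callable[[State], bool]] = {
    "gene_a": lambda s: s["signal_1"] and not s["gene_b"],
    "gene_b": lambda s: s["signal_2"] or (s["gene_b"] and not s["gene_a"]),
}


def step(state: State) -> State:
    """Synchronous update: every gene is recomputed from the same current state."""
    nxt = dict(state)  # signals are carried over unchanged
    for gene, rule in RULES.items():
        nxt[gene] = rule(state)
    return nxt


def steady_state(state: State, max_steps: int = 50) -> State:
    """Iterate until the state stops changing (a fixed-point attractor)."""
    for _ in range(max_steps):
        nxt = step(state)
        if nxt == state:
            return state
        state = nxt
    raise RuntimeError("no fixed point reached (the network may cycle)")


# Switch on one signal and read off the resulting gene activation pattern.
start = {"signal_1": True, "signal_2": False, "gene_a": False, "gene_b": False}
phenotype = steady_state(start)
```

Each combination of signals can be run the same way, and the resulting fixed points compared against the observed expression states.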

Python Bash matplotlib numpy pandas DESeq2 KEGG pyBoolNet graph networks
BSc Biotechnology Dissertation

Heat stress during flowering poses a significant threat to oilseed rape (Brassica napus) production, necessitating a comprehensive understanding of the molecular mechanisms underlying plant responses to heat stress. Oilseed rape is a globally important crop with diverse applications, including as a source of vegetable oil, animal feed, and biofuel. However, its susceptibility to heat stress presents a challenge to sustainable cultivation. In this study, the genetic factors mediating heat stress response during flowering in oilseed rape were investigated. Here we show that heat shock factors (HSFs) and heat shock proteins (HSPs) play pivotal roles in the plant's ability to withstand heat stress. Complementing previous studies in other species, our findings reveal the specific upregulation of HSP20-like chaperones superfamily proteins, CLPBs, GolS1, BAG6, HOP3 and MBF1C in response to heat stress, underscoring their potential as targets for genetic modification to enhance heat tolerance in oilseed rape. These results lay the groundwork for further study of heat stress response in Brassica and the potential development of genetically engineered varieties with improved heat tolerance. By elucidating the genetic and molecular basis of heat stress response in oilseed rape, this study has broader implications for sustainable crop production in the face of escalating environmental challenges associated with climate change.

R DEG Brassica heat stress KEGG GO knock-out
Mouse Phenotyping data analysis

- Collated and cleaned raw data from the IMPC: 180,000 individual CSV files, each describing a single analysis of the effect of knocking out a gene on a given phenotype parameter
- Grouped parameters to reduce the parameter space (from 275 to 19)
- Created a scalable and comprehensive SQL database to store and query the data efficiently (architecture shown on GitHub)
- Created an R Shiny app to visualise a) the effect of knocking out every gene on one parameter, and b) the effect of knocking out one particular gene on every parameter (figure below and more on the GitHub page)
- Project received a mark of 85%
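
A relational layout for this kind of data might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for MySQL, and the table names, columns (including `p_value`) and example rows are all assumptions made for illustration; the actual architecture is the one documented on GitHub.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: genes, grouped parameters, and one row per analysis.
conn.executescript("""
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, symbol TEXT UNIQUE NOT NULL);
CREATE TABLE parameter_group (group_id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE parameter (
    parameter_id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    group_id INTEGER NOT NULL REFERENCES parameter_group(group_id)
);
CREATE TABLE analysis (
    analysis_id INTEGER PRIMARY KEY,
    gene_id INTEGER NOT NULL REFERENCES gene(gene_id),
    parameter_id INTEGER NOT NULL REFERENCES parameter(parameter_id),
    p_value REAL NOT NULL
);
""")

# Made-up example rows, only to exercise the query below.
conn.executemany("INSERT INTO gene VALUES (?, ?)", [(1, "GeneA"), (2, "GeneB")])
conn.executemany("INSERT INTO parameter_group VALUES (?, ?)", [(1, "Weight"), (2, "Blood")])
conn.executemany("INSERT INTO parameter VALUES (?, ?, ?)",
                 [(1, "Body weight", 1), (2, "Glucose", 2)])
conn.executemany("INSERT INTO analysis VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 0.01), (2, 1, 2, 0.20), (3, 2, 1, 0.50)])


def effects_of_gene(connection: sqlite3.Connection, symbol: str) -> list:
    """Return (parameter, group, p_value) rows for one knocked-out gene."""
    return connection.execute(
        """
        SELECT p.name, pg.name, a.p_value
        FROM analysis AS a
        JOIN gene AS g ON g.gene_id = a.gene_id
        JOIN parameter AS p ON p.parameter_id = a.parameter_id
        JOIN parameter_group AS pg ON pg.group_id = p.group_id
        WHERE g.symbol = ?
        ORDER BY a.p_value
        """,
        (symbol,),
    ).fetchall()


rows = effects_of_gene(conn, "GeneA")
```

Keeping parameters and their groups in separate tables means the 275-to-19 grouping can be revised without touching the analysis rows.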

R R Shiny SQL MySQL

Let's Connect

I'm always interested in discussing new opportunities and exciting projects.

Email: davidnasonov@gmail.com