Bioinformatics / Biostatistics student
for MSc thesis research internship project
(6-9 months)

The Department of Genetic Identification at Erasmus University Medical Center is looking for a motivated bioinformatics Master’s student to conduct a 6-9 month MSc thesis internship on forensic microbiome research. The starting date is late March or early April 2019 (a later start is possible if needed).

The central scientific question of this research project is: “Can smoking habits be predicted from human oral microbiome data?” This research is part of a broader PhD project investigating the microbiome as a potential investigative tool in forensic genetics, with the aim of addressing open questions that cannot be answered by conventional human DNA- or RNA-based profiling.

Predicting smoking habits from crime scene DNA may help narrow down the pool of potential suspects in forensic cases without known suspects. Our department has ongoing research on using human epigenetic markers to differentiate between smokers and non-smokers. However, there is also emerging evidence in the literature that smoking affects the oral microbiome, and this project will investigate whether this knowledge can be used to predict smoking habits from microbiome data in future forensic applications. Available oral microbiome sequencing data from different studies will be harmonized for downstream analysis, with the aim of identifying microbial biomarkers that differ between smokers and non-smokers. Prediction modelling will then be carried out. If successful, the work will be published in a high-impact scientific journal with the student as co-author, and may be presented at conferences.

We are looking for a creative, independent, highly motivated bioinformatics / biostatistics student who will be part of a multidisciplinary international team, including molecular biologists, biotechnologists, bioinformaticians, geneticists and forensic scientists. Good working skills with Linux, R and bash, as well as good knowledge of statistical inference and its application, are required. Experience with (microbiome) sequencing data processing and analysis is a plus. Packages used in the project will include (but are not limited to):
DADA2 (R package) > to process sequencing data: filtering and trimming, error rate estimation, dereplication, merging of paired reads, removal of chimeras, construction of amplicon sequence variant (ASV) tables.
decontam (R package) > to remove potential bacterial contaminant reads.
phyloseq/DESeq(2)/edgeR/metagenomeSeq (R packages) > to analyze microbiome data: distance measures (un-/weighted UniFrac, Bray-Curtis), permutational multivariate analysis of variance (PERMANOVA), analysis of group similarities (ANOSIM), multi-response permutation procedures (MRPP), Mantel’s test (MANTEL). Compositional data analysis: ANOVA-like differential expression (ALDEx), analysis of composition of microbiomes (ANCOM).

For more information about the project contact:
Dr Athina Vidaki, and
Celia Díez López