In recent years, several large-scale studies have characterised the human microbiome in great detail as a key player in intestinal and non-intestinal diseases, e.g. inflammatory bowel disease, diabetes and liver cirrhosis, as well as in brain development and behaviour. As more associations between the microbiome and phenotypes are elucidated, the research focus is now shifting towards causality and clinical use for diagnostics, prognostics and therapeutics, where some promising applications have recently been showcased. Microbiome data are inherently convoluted, noisy and highly variable, and non-standard analytical methodologies are therefore required to unlock their clinical and scientific potential. While a range of statistical modelling and Machine Learning (ML) methods are now available, sub-optimal implementation often leads to errors, over-fitting and misleading results, due to a lack of good analytical practices and ML expertise in the microbiome community. Thus, this COST Action network will create a productive symbiosis between discovery-oriented microbiome researchers and data-driven ML experts, through regular meetings, workshops and training courses. Together, they will first optimise and then standardise the use of these techniques, following the creation of publicly available benchmark datasets.
Correct usage of these approaches will allow for better identification of predictive and discriminatory ‘omics’ features, increase study repeatability, and provide mechanistic insights into possible causal or contributing roles of the microbiome. This Action will also investigate automation opportunities and define priority areas for the development of novel ML and statistical methods targeting microbiome data.