Research on Optimization of Species Classification Algorithms for Metagenomic Data Based on Deep Learning

Authors

  • Peiyu Zheng, Imperial College London, UK

DOI:

https://doi.org/10.56397/JPEPS.2025.06.08

Keywords:

deep learning, metagenomic species, classification algorithm optimization

Abstract

Traditional methods for analyzing complex microbial communities face several limitations: closely related species are difficult to distinguish because of high sequence similarity, results are susceptible to noise from sequencing errors and host DNA contamination, and existing deep learning models have limited ability to extract features from long sequences. To address these challenges, this study investigates deep learning-based optimization strategies for metagenomic species classification algorithms. First, k-mer frequency statistics are combined with sequence truncation to mitigate noise interference. Second, an attention mechanism is integrated into a CNN framework to increase the weighting of critical features. Third, Focal Loss is introduced to address class imbalance in species classification. Experiments on both simulated and real metagenomic samples show that the enhanced algorithm outperforms standard machine learning techniques and the basic CNN model, identifying and classifying microbial species more accurately across all tested datasets. Performance gains appear consistently across all evaluation metrics and are particularly evident on complex, real-world microbiome data. These results confirm the practical value of the optimization approach for microbial community analysis.
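
As a rough illustration of two components named in the abstract, the sketch below computes a normalized k-mer frequency vector from a truncated read and evaluates Focal Loss for a single sample. It is not the author's implementation: the function names, the choice of k, the truncation length, the handling of non-ACGT symbols, and the default gamma and alpha values are all assumptions made for this example.

```python
import math
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=3, max_len=None):
    """Normalized k-mer frequency vector over the ACGT alphabet.

    Assumptions for this sketch: the read is truncated to max_len
    before counting, and k-mers containing other symbols (such as N)
    are ignored.
    """
    seq = seq.upper()
    if max_len is not None:
        seq = seq[:max_len]
    # Fixed vocabulary in lexicographic order: AA, AC, AG, AT, CA, ...
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[kmer] for kmer in vocab)
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    probs is a list of predicted class probabilities and target is the
    index of the true class. gamma and alpha are hypothetical defaults.
    """
    # Clamp to avoid log(0) for a class predicted with zero probability.
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    features = kmer_frequencies("ACGTACGTNNACGT", k=2, max_len=8)
    print(len(features), round(sum(features), 6))
    print(focal_loss([0.9, 0.1], target=0))
    print(focal_loss([0.1, 0.9], target=0))
```

With gamma set to 0 and alpha set to 1, the loss reduces to ordinary cross-entropy; larger gamma values shrink the contribution of samples the model already classifies confidently, which is why the loss is commonly used when some classes are rare.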

Published

2025-07-04

How to Cite

Peiyu Zheng. (2025). Research on Optimization of Species Classification Algorithms for Metagenomic Data Based on Deep Learning. Journal of Progress in Engineering and Physical Science, 4(3), 60–66. https://doi.org/10.56397/JPEPS.2025.06.08

Section

Articles