Abstract
In-depth investigation of high-risk oncogenic viruses is critical for the early prevention and control of related cancers and for the development of effective vaccines. Viral carcinogenesis involves numerous risk factors, including viral genomic variation, lifestyle, and environmental influences. Using literature data on eight oncogenic viruses, we built a large-scale, semantically rich corpus of viral carcinogenic factors, comprising 551,715 abstracts and 5,821,308 entities, by combining natural language processing with expert knowledge. We also developed a semantic filter to improve entity recognition performance. In addition, we collected transcriptomic data related to oncogenic viruses and performed differential gene expression analysis, feature gene identification, and immune microenvironment analysis. A visual knowledge platform, an open-source dataset, and a tool for automatically identifying internal and external semantic factors related to viral carcinogenesis are available at http://www.biomedinfo.cn:8281/. This study provides new insights into the key factors involved in viral carcinogenesis and helps researchers and clinicians quickly obtain leads for further experimental research and clinical validation.
| Original language | English |
|---|---|
| Article number | baaf038 |
| Journal | Database |
| Volume | 2025 |
| Publication status | Published - 24 Sept 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
Keywords
- Humans
- Carcinogenesis/genetics
- Oncogenic Viruses/genetics
- Natural Language Processing
- Semantics
- Databases, Genetic
- Molecular Sequence Annotation
ASJC Scopus subject areas
- Information Systems
- General Biochemistry, Genetics and Molecular Biology
- General Agricultural and Biological Sciences
Fingerprint
Dive into the research topics of 'An open-source multi-semantic annotation dataset and automated recognition tool for viral carcinogenesis factors'. Together they form a unique fingerprint.