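Identification of Candidate Markers for Alzheimer’s Disease by Combined Machine Learning with GEO Database and Experimental Validation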

ZHANG Huie 1, XIAO Mengli 1, JI Jinjin 1, CHENG Yurong 2, LU Fang 1

Neural Injury and Functional Reconstruction

DOI: 10.16780/j.cnki.sjssgncj.20250944
Original Article


Abstract

Objective: To screen for biomarkers of Alzheimer’s disease (AD) using the Gene Expression Omnibus (GEO) database combined with machine learning. Methods: A total of 339 samples were included, comprising 168 AD samples and 171 samples from healthy controls. Differentially expressed genes were identified from datasets selected from the GEO database. Candidate gene models were then screened with two algorithms, least absolute shrinkage and selection operator (LASSO) logistic regression and random forest (RF), and receiver operating characteristic (ROC) curves were plotted to evaluate the models. Clinical datasets (including multiple groups of AD patients and healthy control samples) were used to validate the predicted genes. Reverse transcription quantitative polymerase chain reaction (RT-qPCR) was used to quantify the expression of the candidate markers in the normal and model groups of an AD cell model. Results: LASSO yielded 84 key markers, and the RF algorithm identified 7 genes. A Venn diagram of the two gene sets gave four overlapping genes: PLSCR4, GLIS3, PHYHD1, and HVCN1. In the test set, the area under the ROC curve (AUC) was greater than 0.7 for all 4 candidate genes; in the validation set, the AUC was greater than 0.7 for 3 of these candidates, among which GLIS3 (AUC=0.891) and HVCN1 (AUC=0.953) exhibited excellent diagnostic performance (AUC>0.89). RT-qPCR showed that the relative expression of PLSCR4, GLIS3, PHYHD1, and HVCN1 was higher in the AD cell model than in the normal control group (all P<0.01), consistent with the bioinformatic predictions. Conclusion: PLSCR4, GLIS3, PHYHD1, and HVCN1 may serve as candidate molecular markers for the clinical diagnosis of AD.
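The overlap and ROC steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the expression values are synthetic, the gene names GENE_A to GENE_D are hypothetical placeholders for genes picked by only one algorithm, and the helper names `overlap` and `auc` are introduced here for illustration. The AUC is computed in its rank-based (Mann-Whitney) form, that is, the fraction of AD/control sample pairs in which the AD sample has the higher value, with ties counted as one half.

```python
from itertools import product


def overlap(lasso_genes, rf_genes):
    """Genes selected by both algorithms (the Venn-diagram intersection)."""
    return sorted(set(lasso_genes) & set(rf_genes))


def auc(case_values, control_values):
    """Area under the ROC curve for a single gene used as a score.

    Equals the fraction of (case, control) pairs in which the case has the
    higher value, counting ties as one half (Mann-Whitney formulation).
    """
    if not case_values or not control_values:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for case, control in product(case_values, control_values):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical selections: only the four shared genes come from the abstract;
# GENE_A..GENE_D are placeholders for genes chosen by one algorithm only.
lasso_selected = ["PLSCR4", "GLIS3", "PHYHD1", "HVCN1", "GENE_A", "GENE_B"]
rf_selected = ["HVCN1", "GENE_C", "GLIS3", "PHYHD1", "PLSCR4", "GENE_D"]
candidates = overlap(lasso_selected, rf_selected)

# Synthetic expression values for one gene (not data from the study).
ad_expr = [7.9, 8.4, 8.1, 7.2, 8.8]
control_expr = [7.0, 7.5, 7.2, 6.8, 8.0]
gene_auc = auc(ad_expr, control_expr)

print(candidates)
print(f"AUC = {gene_auc:.2f}, above 0.7 threshold: {gene_auc > 0.7}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few hundred samples; a library routine would be the usual choice for larger data.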

Key words

Alzheimer’s disease / machine learning / diagnostic markers / experimental validation

Cite this article

ZHANG Huie, XIAO Mengli, JI Jinjin, CHENG Yurong, LU Fang. Identification of Candidate Markers for Alzheimer’s Disease by Combined Machine Learning with GEO Database and Experimental Validation [J]. Neural Injury and Functional Reconstruction. https://doi.org/10.16780/j.cnki.sjssgncj.20250944

Funding

Parallel Project of the Research Ward Excellence Clinical Research Program (No. BRWEP2024Z014170102)
