By integrating bioinformatics and machine learning methods, we aimed to
systematically screen for core genes linking the environmental pollutant perfluorooctane sulfonate (PFOS) with
Alzheimer's disease (AD) and to explore their potential as diagnostic and prognostic biomarkers. Methods: The
GSE95587 dataset was obtained from the GEO database, and differential expression analysis was performed
using the limma package. The differentially expressed genes were intersected with PFOS-related genes from the
CTD database to identify common candidate genes. GO and KEGG enrichment analyses of the intersected genes
were conducted using clusterProfiler. LASSO regression was employed to screen for core genes, and a logistic
regression model was constructed to evaluate their diagnostic performance and their ability to predict
progression of Braak stage. Results: A total of 281 differentially expressed genes in AD were identified, of which 18
overlapped with PFOS-related genes. Enrichment analysis revealed that these genes were significantly involved
in pathways such as neuroinflammation, astrocyte activation, and the JAK-STAT signaling pathway. LASSO
regression identified five key genes (MIR338, CCDC198, MMP13, FGF12, IL1β), all of which were
significantly differentially expressed between the AD and control groups. The five-gene signature achieved an
AUC of 0.800 in distinguishing AD from controls, and PCA demonstrated clear separation between the two
groups. The signature predicted progression to Braak stage ≥4 with an AUC of 0.746, and the
high-risk group exhibited a significantly higher risk of disease progression than the low-risk group.
Conclusion: PFOS may contribute to the progression of AD by modulating pathways related to
neuroinflammation and glial cell function. The identified five-gene signature discriminated AD from controls
(AUC 0.800) and predicted progression to Braak stage ≥4 (AUC 0.746).
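
Two steps in the workflow above can be illustrated with a minimal, hedged sketch: intersecting differentially expressed genes with PFOS-related genes, and scoring a risk signature by AUC. This is not the authors' pipeline (which used limma, clusterProfiler, and LASSO). The function names `intersect_genes` and `roc_auc`, the placeholder symbols `GENE_X` and `GENE_Y`, and all toy labels and scores are hypothetical and chosen for illustration only.

```python
def intersect_genes(deg_symbols, pfos_symbols):
    """Return gene symbols found in both lists (case-insensitive), sorted."""
    deg = {g.strip().upper() for g in deg_symbols}
    pfos = {g.strip().upper() for g in pfos_symbols}
    return sorted(deg & pfos)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample; tied scores count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy inputs (hypothetical, not taken from GSE95587 or CTD).
toy_deg = ["IL1B", "MMP13", "FGF12", "GENE_X"]
toy_pfos = ["il1b", "MMP13", "GENE_Y"]
common = intersect_genes(toy_deg, toy_pfos)

# 1 = AD, 0 = control; scores stand in for model-predicted risk.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.7, 0.5, 0.2]
auc = roc_auc(toy_labels, toy_scores)

print(common)
print(round(auc, 3))
```

The pairwise definition of AUC used here is quadratic in sample size; it is meant to make the quantity behind figures such as 0.800 and 0.746 concrete, not to replace a library implementation.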