We downloaded the transcriptome statistics for LUAD from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/, until October 29th, 2022), consisting of 59 normal samples and 526 tumor samples (585 samples), including FPKM data and count data. The count data were log2-transformed using the “limma” package14. Additionally, for further study, the survival information, clinical information, and simple nucleotide variation (SNP) information for LUAD were retrieved from TCGA. The term “immunogenic cell death” was searched in GeneCards (https://www.genecards.org/, until October 29th, 2022), and we selected 138 ICD-related genes (IRGs) based on correlation scores > 35. All statistical analyses were carried out in accordance with R4.2.1.
Investigation of ICD-related lncRNAs with differential expression
Our method is displayed in a flowchart (Fig. 1). To identify ICD-related lncRNAs that are differentially expressed, we isolated lncRNAs expressed in LUAD for Pearson analysis with IRGs, selected ICD-related lncRNAs based on P < 0.05 and |correlation coefficient| > 0.3, and then performed difference analysis with the criteria |log2FoldChange| > 1 and P < 0.05.
Constructing the model
All cases with normal or no survival details were removed, leaving a total of 477 tumor samples. At a random ratio of 1:1, the 477 patients were divided into two groups, with 239 samples in the training subgroup and 238 samples in the test subgroup. The training cohort was used for model construction. To identify ICD-associated lncRNAs that play a major role in LUAD, we carried out univariate Cox analysis to identify ICD-related lncRNAs linked to prognosis (P < 0.05, HR > 1), followed by LASSO analysis to avoid overfitting. Then, we performed multivariate Cox analysis to confirm the prognostic ICD-related lncRNAs to construct the model. RiskScore = EXPgene1 * genecoef1 + EXPgene2 * genecoef2 + ⋯ + EXPgenen * genecoefn.
Verifying the model’s feasibility
The test cohort and entire cohort were used for model validation, and employing the RiskScore algorithm, the risk score for each individual was computed. Then, the median of each cohort was used to classify each cohort into high- and low-risk groups. All three cohorts underwent Kaplan‒Meier (K‒M) analysis15, receiver operating characteristic (ROC) analysis16, risk mapping, heatmapping to visualize expression, and principal component analysis (PCA)17 to assess the predictive effect and determine whether there was a discernible distinction in life expectancy between the two risk groups. Stratified analysis can help us understand whether this prognostic model is applicable to the entire population. To confirm whether the constructed model was an independent predictor, we used age, gender, stage, T stage, N stage, and M stage together with the risk score and conducted univariate and multivariate Cox analyses18.
Structuring the nomogram
We investigated the link between the risk score and other clinical factors. To identify whether the risk score has the potential to outperform other clinical indicators as an independent predictor, the predictive effect was compared based on the area under the ROC curve (AUC). Stage and risk score (P < 0.001, HR > 1) were incorporated into a nomogram to predict survival rates at 1 year, 3 years, and 5 years19. Two calibration curves were made, one containing risk and stage and one without risk, and then, the fit of the two calibration curves was compared. Decision curve analysis (DCA) was developed to evaluate whether a prognostic model that included a risk score would increase clinical benefit20.
Gene set enrichment analysis (GSEA)
After dividing all samples into two risk groups, we used KEGG, BIOCARTA, PID, GO, REACTOME, and WIKIPATHWAYS in GSEA4.3.2 software to derive different pathways enriched in the two risk groups using P < 0.05 and FDR < 0.25 as criteria to help understand the different mechanisms of LUAD progression and to provide guidance for treatment ideas21.
Evaluation of the tumor immune microenvironment
The stromal cell and immune cell concentrations in the tumors were compared between the two risk groups using the ESTIMATE method to determine whether there were any variations. Overall, 16 immune cell types and 13 immune-associated regulatory mechanisms were included in the 29 immunological markers measured and compared by single sample gene set enrichment analysis (ssGSEA) in the two risk groups using the “GSVA” R package22. Finally, we constructed an immune-related heatmap. In timer 2.0 (http://timer.cistrome.org/, until October 29th, 2022), the proportion of immune cells was determined using seven different methods: TIMER, CIBERSORT, CIBERSORT-ABS, QUANTISEQ, XCELL, EPIC, and MCPCOUNTER. Additionally, scatter plots were generated using the CIBERSORT method (P < 0.05) to display the interactions between immune cells and the risk score23.
Somatic cell mutation analysis
Simple nucleotide variation (SNP) data were downloaded from TCGA for LUAD. We utilized the “maftools” package to determine the mutation status of individuals and to estimate the tumor mutation burden (TMB) score for every individual. We then compared the TMB between the two risk groups24.
Tumor immune dysfunction and exclusion (TIDE) (http://tide.dfci.harvard.edu/, until October 29th, 2022) is a method for simulating immunological escape of tumors and predicts sensitivity to PD1 drugs and CTLA4 drugs, and the probability of immune evasion is positively associated with the TIDE score25. We applied the “oncoPredict” R package to predict the susceptibility of LUAD patients to 198 chemotherapeutic agents and to search for the most effective chemotherapeutic drugs for the two risk groups. We defined drugs with P < 0.05 as sensitive drugs26.
Real-time quantitative polymerase chain reaction (RT-qPCR) for human lung adenocarcinoma
At the Second Affiliated Hospital of Nanchang University, we collected 18 lung adenocarcinoma tissues from patients after surgical resection. The samples were quickly frozen and kept in liquid nitrogen at − 80 °C. Informed consent was obtained from all participants, and the study was authorized by the ethics committee of the Second Affiliated Hospital of Nanchang University.
All samples were divided into a high-risk group and a low-risk group. According to the manufacturer’s recommendations, RNA was obtained from LUAD tissues using TRlzol reagent (Life Technologies CA, USA) and then randomly assigned for RT-qPCR analysis. This was followed by reverse transcription using the SureScript First-Strand cDNA Synthesis kit (GeneCopropol, Guangzhou, China) at 45 °C for 1 h. The Applied Biosystems 7500 Fast Real-Time PCR System and BlazeTaq SYBR Green qPCR Master Mix (GeneCopropol, Guangzhou, China) were used to complete the RT-qPCR analysis (Applied Biosystems). To determine the level of RNA expression in each sample, we employed 2ΔΔCt values.
In the Human Protein Atlas (HPA) (https://www.proteinatlas.org/, until January 20th, 2023), we analyzed the variations in protein expression of selected ICD-related genes in normal tissues and LUAD tissues, which provides some evidence for the role of ICDs in LUAD.
Ethics approval and consent
The current study investigated the publicly available data, and was also approved by the ethics committee of the Second Affiliated Hospital of Nanchang University (Nanchang, China). All methods were carried out in accordance with the Declaration of Helsinki.
Written informed consent was obtained from all participants included in the study.