Last updated: 2025-05-29

Checks: 7 passed, 0 failed

Knit directory: Lung_scMultiomics_paper/

This reproducible R Markdown analysis was created with workflowr (version 1.7.1). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


Great! Since the R Markdown file has been committed to the Git repository, you know the exact version of the code that produced these results.

Great job! The global environment was empty. Objects defined in the global environment can affect the analysis in your R Markdown file in unknown ways. For reproducibility it’s best to always run the code in an empty environment.

The command set.seed(20250512) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version 8e90a14. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    analysis/figure/

Untracked files:
    Untracked:  ArchRLogs/
    Untracked:  Lung_scMultiomics_paper.Rproj
    Untracked:  _workflowr.yml
    Untracked:  analysis/ArchRLogs/
    Untracked:  analysis/about.knit.md
    Untracked:  analysis/archive.Rmd
    Untracked:  analysis/figures_for_grant_application.Rmd
    Untracked:  analysis/link_peaks_to_genes.Rmd
    Untracked:  code/run_GO_enrichment.R
    Untracked:  data/asthma_related_CREs_full.RDS
    Untracked:  data/comparing_min_pct_GO.RData
    Untracked:  data/p2g_res/
    Untracked:  data/peak2gene_all_ArchR.RData
    Untracked:  data/peak2gene_lung_CD4_T_ArchR.RData
    Untracked:  data/peak2gene_lung_CD8_T_ArchR.RData
    Untracked:  data/peak2gene_lung_Memory_B_ArchR.RData
    Untracked:  data/peak2gene_lung_NK_ArchR.RData
    Untracked:  data/peak2gene_lung_Naive_B_ArchR.RData
    Untracked:  data/peak2gene_lung_T_ArchR.RData
    Untracked:  data/peak2gene_lung_Treg_ArchR.RData
    Untracked:  data/peak2gene_onlyT_ArchR.RData
    Untracked:  data/raw_p2g_lung_CD4_T_ArchR.RData
    Untracked:  data/raw_p2g_lung_CD4_T_joint_ArchR.RData
    Untracked:  data/raw_p2g_lung_CD8.CD4_T_ArchR.RData
    Untracked:  data/raw_p2g_lung_CD8_T_ArchR.RData
    Untracked:  data/raw_p2g_lung_Memory_B_ArchR.RData
    Untracked:  data/raw_p2g_lung_NK_ArchR.RData
    Untracked:  data/raw_p2g_lung_Naive_B_ArchR.RData
    Untracked:  data/raw_p2g_lung_Th17.CD4_T_ArchR.RData
    Untracked:  data/raw_p2g_lung_Treg_ArchR.RData
    Untracked:  data/u19_full_atac_cell_metadata.RDS
    Untracked:  output/u19_multiomics

Unstaged changes:
    Modified:   README.md
    Modified:   analysis/heritability_enrichment_for_lung_open_chromatin.Rmd
    Modified:   analysis/test.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/identify_lung_specific_transcriptomic_features.Rmd) and HTML (docs/identify_lung_specific_transcriptomic_features.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd 8e90a14 Jing Gu 2025-05-29 check how effect sizes of DE genes correlated
html f5d7da7 Jing Gu 2025-05-15 Build site.
Rmd 08fb865 Jing Gu 2025-05-15 added comments
html 0af8c12 Jing Gu 2025-05-15 Build site.
Rmd 3cae897 Jing Gu 2025-05-15 added comments
html 53899ad Jing Gu 2025-05-15 Build site.
Rmd dab59d1 Jing Gu 2025-05-15 updated DEG analyses
Rmd 1c96702 Jing Gu 2025-05-14 fixed errors for table
html 497b42c Jing Gu 2025-05-14 Build site.
Rmd 4faa63e Jing Gu 2025-05-14 DEG analyses
html 9008bd6 Jing Gu 2025-05-14 Build site.
Rmd c40f0da Jing Gu 2025-05-14 DEG analyses

Differential gene expression analyses across tissue

The Wilcoxon rank-sum test at the single-cell level gives more conservative results.
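For readers unfamiliar with the test, here is a minimal Python sketch of a two-sided rank-sum test on one gene, treating each cell as an observation. It is illustrative only: the analysis itself was run in R, the function name `rank_sum_test` and the expression values are made up, and the sketch assumes the normal approximation with a tie correction and no continuity correction.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test for two independent samples.

    Uses the normal approximation with a tie correction and no
    continuity correction. Returns (U statistic for x, p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    # Pool the values, remembering which sample each came from (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        t = j - i + 1
        tie_term += t**3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 0)
        i = j + 1

    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all values identical: no evidence of a shift
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u_x, p_value


# Toy normalized expression values for one gene (made-up numbers).
lung_cells = [0.0, 0.0, 1.2, 2.3, 2.9, 3.1, 3.4]
spleen_cells = [0.0, 0.0, 0.0, 0.4, 0.7, 1.0]
u, p = rank_sum_test(lung_cells, spleen_cells)
print(f"U = {u}, p = {p:.3f}")
```

Many zero counts produce large tie groups, which is why the tie correction matters for single-cell data.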

Summarizing DE genes by selecting p-value cutoffs

A table of cell counts by tissue and cell-type.

          
           lungs spleens
  Other     1654     104
  Treg      1336      47
  Th17      2732      68
  CD4_T     6980     886
  CD8_T    12210     421
  NK        8067     464
  Memory_B  5287   10507
  Naive_B   1174    1710

A bar plot of the number of DE genes detected in each cell type. Th17 and Treg are excluded because they have too few cells in spleen (68 and 47 cells, respectively).

Version Author Date
53899ad Jing Gu 2025-05-15
497b42c Jing Gu 2025-05-14
9008bd6 Jing Gu 2025-05-14

Compare and contrast DE genes across immune subsets

A Venn diagram of DE genes across cell types other than memory B cells suggests that DE genes are largely cell-type specific.

Version Author Date
9008bd6 Jing Gu 2025-05-14

A full summary of shared and unique DE genes across cell types

The UpSet-style plot shows only the count of elements specific to each intersection.
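To make "specific to each intersection" concrete: a gene is counted only in the bar for the exact combination of cell types in which it is DE, not in every intersection that contains it. A small Python sketch with placeholder gene names (illustrative only; the plots here were produced in R, and `exclusive_intersections` is a hypothetical helper):

```python
def exclusive_intersections(sets_by_name):
    """Group elements by the exact combination of sets that contain them."""
    membership = {}
    for name, items in sets_by_name.items():
        for item in items:
            membership.setdefault(item, set()).add(name)

    groups = {}
    for item, names in membership.items():
        groups.setdefault(tuple(sorted(names)), set()).add(item)
    return groups


# Placeholder DE gene sets (not real results).
de_genes = {
    "CD4_T": {"g1", "g2", "g3"},
    "CD8_T": {"g2", "g3", "g4"},
    "NK": {"g3", "g5"},
}

for combo, genes in sorted(exclusive_intersections(de_genes).items()):
    print(combo, len(genes))
```

Here the ordinary intersection of CD4_T and CD8_T contains two genes (g2 and g3), but the bar for exactly that pair counts only g2, because g3 is also DE in NK.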

Lung up-regulated genes

Version Author Date
9008bd6 Jing Gu 2025-05-14

Check the lung up-regulated genes shared across all immune subsets

[1] "HSP90AA1 HSPA1A HSP90AB1 HSPD1 RPS26 DNAJB1 HSPA6 HSPA1B HSPH1 HSPE1 HSPB1 CACYBP HSPA8 UBC BAG3 STIP1 ABHD3 ZFAND2A FKBP4 GBP2 CGAS HSPA4 AHSA1"

Spleen up-regulated genes

Version Author Date
9008bd6 Jing Gu 2025-05-14

Check whether effect sizes for common genes are correlated

Version Author Date
f5d7da7 Jing Gu 2025-05-15
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14
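One simple way to run this check is a Pearson correlation of log2FC over the genes that are DE in both cell types. The Python sketch below is illustrative only (the analysis was run in R); the correlation measure is an assumption, and the gene names and values are placeholders.

```python
import math


def pearson_on_shared(lfc_a, lfc_b):
    """Pearson correlation of log2FC over genes present in both dicts.

    Returns (r, number of shared genes).
    """
    shared = sorted(set(lfc_a) & set(lfc_b))
    n = len(shared)
    if n < 3:
        raise ValueError("need at least three shared genes")
    xs = [lfc_a[g] for g in shared]
    ys = [lfc_b[g] for g in shared]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("effect sizes are constant in one cell type")
    return sxy / math.sqrt(sxx * syy), n


# Placeholder log2FC values (lung vs. spleen) in two cell types.
lfc_cd4 = {"g1": 1.0, "g2": 2.1, "g3": 2.9, "g4": 0.5}
lfc_cd8 = {"g1": 1.8, "g2": 4.0, "g3": 6.3, "g5": 1.0}
r, n_shared = pearson_on_shared(lfc_cd4, lfc_cd8)
print(f"r = {r:.3f} over {n_shared} shared genes")
```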

Check whether effect sizes for unique genes are not correlated

GO enrichment results

Lung Up-regulated genes

Overall, lung up-regulated genes are enriched for immune-related hallmark gene sets and for GO terms related to T cell activation, response to cytokines, and regulation of cell adhesion. In particular, lung-specific genes in CD4+ T cells and memory B cells are enriched for Th1/Th2 differentiation.
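As background, over-representation of a gene set among DE genes is commonly assessed with a one-sided hypergeometric test. The Python sketch below uses toy numbers and is illustrative only: the enrichment method and gene universe used in this analysis are not stated here, so both are assumptions, and `hypergeom_enrichment_p` is a hypothetical helper.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_in_set, n_query, n_overlap):
    """P(overlap >= n_overlap) when n_query genes are drawn at random
    from a universe that contains n_in_set gene-set members."""
    upper = min(n_in_set, n_query)
    total = comb(n_universe, n_query)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10,000 tested genes, a 200-gene set, 300 DE genes, 15 in the set.
p = hypergeom_enrichment_p(10_000, 200, 300, 15)
print(f"p = {p:.2e}")
```

With these toy numbers about 6 overlapping genes would be expected by chance (300 × 200 / 10,000), so an overlap of 15 gives a small p-value.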

Comparing GO enrichment sets between different min.pct values

  • Set1 - min.pct = 0.1

  • Set2 - min.pct = 0.01
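The two settings differ in which genes are tested at all. The Python sketch below is illustrative only (the analysis was run in R). It assumes min.pct has the meaning it has in Seurat's FindMarkers, that is, the minimum fraction of cells in either group in which a gene must be detected; the page does not state which function was used.

```python
def fraction_detected(values):
    """Fraction of cells with non-zero expression of a gene."""
    if not values:
        raise ValueError("group has no cells")
    return sum(1 for v in values if v > 0) / len(values)


def passes_min_pct(group1, group2, min_pct):
    """Keep a gene if it is detected in at least min_pct of cells in either group."""
    return max(fraction_detected(group1), fraction_detected(group2)) >= min_pct


# Toy gene detected in 3 of 100 lung cells and in no spleen cell.
lung = [1.0] * 3 + [0.0] * 97
spleen = [0.0] * 100

print(passes_min_pct(lung, spleen, min_pct=0.1))   # False: 0.03 < 0.1
print(passes_min_pct(lung, spleen, min_pct=0.01))  # True: 0.03 >= 0.01
```

Lowering min.pct therefore admits sparsely detected genes into the DE test, which can change the downstream enrichment input.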

community-contributed_Hallmark50

Setting a lower min.pct mainly affects enrichment results for B cells.

KEGG Pathway

Setting a lower min.pct mainly affects enrichment results for B cells and CD4+ T cells.

min.pct = 0.1 and log2FC >= 1

[[1]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

[[2]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
53899ad Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

[[3]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
53899ad Jing Gu 2025-05-15

[[4]]


[[5]]

min.pct = 0.01 and log2FC >= 1

[[1]]

Version Author Date
0af8c12 Jing Gu 2025-05-15

[[2]]

Version Author Date
0af8c12 Jing Gu 2025-05-15

[[3]]


[[4]]


[[5]]

Spleen Up-regulated genes

Overall, spleen up-regulated genes show less GO term enrichment at the more stringent threshold of FDR < 0.05.
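For reference, an FDR threshold is applied to adjusted p-values. The Python sketch below assumes the Benjamini-Hochberg procedure, which this page does not name, and uses made-up p-values; it is illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is
    # capped by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.5]
fdr = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(fdr) if q < 0.05]
print(significant)  # [0]
```

In this toy case three raw p-values are below 0.05, but only one term survives at FDR < 0.05.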

In memory B cells, the DE genes down-regulated in lung are significantly enriched for asthma risk genes from the KEGG pathway. The overlapping genes are HLA genes and CD40, which in B cells may enhance subsequent interaction with T cells.

min.pct = 0.1 and log2FC >= 1

[[1]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

[[2]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

[[3]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

[[4]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

[[5]]

Version Author Date
0af8c12 Jing Gu 2025-05-15
9008bd6 Jing Gu 2025-05-14

min.pct = 0.01 and log2FC >= 1

[Output for list elements [[1]] to [[5]] not recovered.]

Dot plots for genes in selected GO terms

[assets/dotplot_spleen_upreg_asthma_genes.png]

Summarizing DE genes by effect sizes

We performed k-means clustering on the log2FC values of all genes with at most one NA across cell types.

[1] "Number of genes in each k-mean cluster:"

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
  35  847  158  634  718  469  292  816   35  160  683  865  921 1198  207  873 
[1] "The pair-wise correlation of genes for most clusters form a distribution skewed to 1."
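The filtering and clustering step can be sketched as follows in Python (illustrative only; the analysis was run in R). Two things are assumptions: how the single allowed NA is handled (here it is replaced by the gene's mean over the observed cell types) and the toy values. The output above shows 16 clusters; the toy example uses k = 2.

```python
import random


def keep_mostly_complete(lfc_by_gene, max_na=1):
    """Keep genes with at most max_na missing (None) log2FC values."""
    return {
        gene: values
        for gene, values in lfc_by_gene.items()
        if sum(v is None for v in values) <= max_na
    }


def impute_row_mean(values):
    """Assumption: replace a missing value with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def kmeans(rows, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(row) for row in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # leave an empty cluster's center where it was
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Toy log2FC values across three cell types (made-up numbers).
lfc = {
    "gene_a": [1.0, 1.1, 0.9],
    "gene_b": [1.2, 1.0, 1.0],
    "gene_c": [-1.0, -0.9, -1.1],
    "gene_d": [-1.1, None, -1.0],  # one NA: kept
    "gene_e": [None, None, 0.5],   # two NAs: dropped
}
kept = keep_mostly_complete(lfc)
genes = list(kept)
rows = [impute_row_mean(kept[g]) for g in genes]
labels, centers = kmeans(rows, k=2)
print(dict(zip(genes, labels)))
```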

Heatmap for average log2FC for each cluster

Clustering on effect sizes does not show cell-type specificity, except for memory B cells.

Heatmap for average log2FC for each cell-type (Memory B excluded)

Clustering on effect sizes shows stronger cell-type specificity.

Several clusters were selected for GSEA because their cluster mean in one cell type is distinct from the rest. The pattern is less clear to me. Currently, all genes in each cluster are used for GSEA, so I may try restricting the analysis to the top few hundred genes.
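One possible way to pick the top genes from a cluster is to rank them by absolute log2FC in the cell type of interest. That ranking criterion is an assumption, as are the helper name and the toy values in this illustrative Python sketch (the analysis itself was run in R).

```python
def top_genes_by_effect(lfc_by_gene, n):
    """Return the n genes with the largest absolute log2FC, strongest first."""
    ranked = sorted(lfc_by_gene, key=lambda gene: abs(lfc_by_gene[gene]), reverse=True)
    return ranked[:n]


# Toy cluster: log2FC in the cell type of interest (made-up numbers).
cluster_lfc = {"gene_a": 0.2, "gene_b": -1.5, "gene_c": 0.9, "gene_d": 0.05}
print(top_genes_by_effect(cluster_lfc, n=2))  # ['gene_b', 'gene_c']
```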

[Output for list elements [[1]] to [[5]] not recovered.]



R version 4.2.0 (2022-04-22)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /software/openblas-0.3.13-el7-x86_64/lib/libopenblas_haswellp-r0.3.13.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
 [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
 [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
[10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 

attached base packages:
[1] stats4    grid      stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
 [1] RVenn_1.1.0                 ggVennDiagram_1.5.2        
 [3] SingleCellExperiment_1.20.1 cowplot_1.1.3              
 [5] ComplexHeatmap_2.14.0       htmltools_0.5.8.1          
 [7] scales_1.4.0                colorRamp2_0.1.0           
 [9] tidyr_1.3.1                 dplyr_1.1.4                
[11] rhdf5_2.42.1                SummarizedExperiment_1.28.0
[13] Biobase_2.58.0              MatrixGenerics_1.10.0      
[15] Rcpp_1.0.14                 Matrix_1.6-5               
[17] GenomicRanges_1.50.2        GenomeInfoDb_1.34.9        
[19] IRanges_2.32.0              S4Vectors_0.36.2           
[21] BiocGenerics_0.44.0         matrixStats_1.5.0          
[23] data.table_1.17.4           stringr_1.5.1              
[25] plyr_1.8.9                  magrittr_2.0.3             
[27] ggplot2_3.5.2               gtable_0.3.6               
[29] gtools_3.9.5                gridExtra_2.3              
[31] ArchR_1.0.2                

loaded via a namespace (and not attached):
 [1] bitops_1.0-9           fs_1.6.6               doParallel_1.0.17     
 [4] RColorBrewer_1.1-3     rprojroot_2.0.4        tools_4.2.0           
 [7] bslib_0.9.0            DT_0.33                R6_2.6.1              
[10] colorspace_2.1-1       rhdf5filters_1.10.1    GetoptLong_1.0.5      
[13] withr_3.0.2            tidyselect_1.2.1       compiler_4.2.0        
[16] git2r_0.33.0           cli_3.6.5              Cairo_1.6-2           
[19] DelayedArray_0.24.0    labeling_0.4.3         sass_0.4.10           
[22] yulab.utils_0.2.0      digest_0.6.37          rmarkdown_2.29        
[25] XVector_0.38.0         dichromat_2.0-0.1      pkgconfig_2.0.3       
[28] fastmap_1.2.0          htmlwidgets_1.6.4      rlang_1.1.6           
[31] GlobalOptions_0.1.2    rstudioapi_0.17.1      gridGraphics_0.5-1    
[34] shape_1.4.6            jquerylib_0.1.4        farver_2.1.2          
[37] generics_0.1.4         jsonlite_2.0.0         crosstalk_1.2.1       
[40] RCurl_1.98-1.17        ggplotify_0.1.2        GenomeInfoDbData_1.2.9
[43] patchwork_1.3.0        Rhdf5lib_1.20.0        lifecycle_1.0.4       
[46] stringi_1.8.4          whisker_0.4.1          yaml_2.3.10           
[49] zlibbioc_1.44.0        parallel_4.2.0         promises_1.3.2        
[52] forcats_1.0.0          crayon_1.5.3           lattice_0.22-7        
[55] circlize_0.4.15        knitr_1.50             pillar_1.10.2         
[58] rjson_0.2.23           codetools_0.2-20       glue_1.8.0            
[61] evaluate_1.0.3         ggfun_0.1.8            png_0.1-8             
[64] vctrs_0.6.5            httpuv_1.6.16          foreach_1.5.2         
[67] purrr_1.0.4            clue_0.3-66            cachem_1.1.0          
[70] xfun_0.52              later_1.4.2            viridisLite_0.4.2     
[73] tibble_3.2.1           aplot_0.2.5            iterators_1.0.14      
[76] workflowr_1.7.1        cluster_2.1.8.1