Example Workflow: CELLxGENE

Introduction

This vignette demonstrates a basic workflow for accessing and analysing single-cell RNA-seq data from the CELLxGENE repository using {laminr}. CZ CELLxGENE Discover is a standardised collection of scRNA-seq datasets, and LaminDB makes it easy to query and access data in this repository. We will go through the steps of finding and downloading a dataset using {laminr}, performing some simple analysis using {Seurat} and saving the results to your own LaminDB database.

Before we start

Before we begin, please take some time to check out the Getting Started vignette (vignette("laminr", package = "laminr")). In particular, make sure you have run the commands in the “Initial Setup” section.

Once that is done, we can load the {laminr} library.

library(laminr)

Connecting to LaminDB

The first thing we need to do is connect to the LaminDB database. For this tutorial, we will connect to two instances: a default instance, where we will store results, and the CELLxGENE instance, which we will search for datasets.

Connect to the default instance

We will start by connecting to your default LaminDB instance. You can set the default instance using the lamin CLI on the command line:

lamin connect <owner>/<name>

Once a default instance has been set, we can connect to it with {laminr}:

db <- connect()
#> ! schema module 'bionty' is not installed → no access to its labels & registries (resolve via `pip install bionty`)
#> → connected lamindb: laminlabs/cellxgene
db
#> cellxgene
#>   Core registries
#>     $Run
#>     $User
#>     $Param
#>     $ULabel
#>     $Feature
#>     $Storage
#>     $Artifact
#>     $Transform
#>     $Collection
#>     $FeatureSet
#>     $ParamValue
#>     $FeatureValue
#>   Additional modules
#>     bionty

This gives us an object we can use to interact with the database.

Note that only the default instance can create new records. This tutorial assumes you have access to an instance where you have permission to add data.

Track data provenance

Before we start, we will track the code that is run in this notebook.

db$track("I8BlHXFXqZOG0000", path = "example_workflow.Rmd")

Tip: The ID should be obtained by running db$track(path = "example_workflow.Rmd") and copying the ID from the output.

Connect to the CELLxGENE instance

We can connect to other instances by providing a slug to the connect() function. Instances connected in this way can be used to query data, but we cannot make any changes to them. Let’s connect to the CELLxGENE instance:

cellxgene <- connect("laminlabs/cellxgene")
cellxgene
#> cellxgene
#>   Core registries
#>     $Run
#>     $User
#>     $Param
#>     $ULabel
#>     $Feature
#>     $Storage
#>     $Artifact
#>     $Transform
#>     $Collection
#>     $FeatureSet
#>     $ParamValue
#>     $FeatureValue
#>   Additional modules
#>     bionty

Downloading a dataset

In Lamin, artifacts are objects that contain information (single-cell data, images, data frames etc.) as well as associated metadata. You can see what artifacts are available using the database instance object.

cellxgene$Artifact$df(limit = 5)
#>     id suffix X_accessor n_objects visibility
#> 1 2846        tiledbsoma       290          1
#> 2 3665        tiledbsoma       330          1
#> 3 1270  .h5ad    AnnData        NA          1
#> 4 2840 .ipynb       <NA>        NA          0
#> 5 2842  .html       <NA>        NA          0
#>                                                                      key
#> 1                                            cell-census/2023-12-15/soma
#> 2                                            cell-census/2024-07-01/soma
#> 3 cell-census/2023-07-25/h5ads/7a0a8891-9a22-4549-a55b-c2aca23c3a2a.h5ad
#> 4                                                                   <NA>
#> 5                                                                   <NA>
#>                    uid         size                   hash
#> 1 FYMewVq5twKMDXVy0000 635848093433 Mfyw8VuqftX5REITfQH_yg
#> 2 FYMewVq5twKMDXVy0001 870700998221 bzrXBPNvitSVKvb3GG38_w
#> 3 tczTlSHFPOcAcBnfyxKA   1297573950 UlsVvBz9kMzn2r9RdoAAOg
#> 4 JIIPyQX5l9qELPl42d75        36297 gNdUkonYgQJP_Mi3xLzt_g
#> 5 Whyxwf3k2GjJwTPCl1FK       716529 BDGZac3qU3oLVFpO035Qhg
#>                            description n_observations is_latest X_hash_type
#> 1                    Census 2023-12-15       68683222     FALSE       md5-d
#> 2                    Census 2024-07-01      115556140      TRUE       md5-d
#> 3      Supercluster: Hippocampal CA1-3          74979     FALSE       md5-n
#> 4 Source of transform G69jtgzKO0eJ6K79             NA     FALSE         md5
#> 5   Report of run UAAiLAi0BrLvlKnsuvP3             NA     FALSE         md5
#>      type                       created_at X_key_is_virtual
#> 1 dataset 2024-07-12T12:12:16.091881+00:00            FALSE
#> 2 dataset 2024-07-16T12:52:01.424629+00:00            FALSE
#> 3    <NA> 2023-11-28T21:46:12.685907+00:00            FALSE
#> 4    <NA> 2024-01-29T08:32:13.311741+00:00             TRUE
#> 5    <NA> 2024-01-29T08:32:18.346499+00:00             TRUE
#>                         updated_at    version
#> 1 2024-09-17T13:00:13.714256+00:00 2023-12-15
#> 2 2024-09-17T13:01:23.739635+00:00 2024-07-01
#> 3 2024-01-24T07:10:21.725547+00:00 2023-07-25
#> 4 2024-01-29T08:32:13.311792+00:00          0
#> 5 2024-01-30T09:12:06.027928+00:00          1
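
The table printed above looks like an ordinary R data frame, so one way to narrow it down is with base R subsetting. The following is a minimal sketch under that assumption. The `artifacts` object is a hypothetical stand-in built by hand from a few of the columns and values shown in the output, not a live query result.

```r
# Hypothetical stand-in for the result of cellxgene$Artifact$df(limit = 5).
# Only a few columns are kept; the values are copied from the output above.
artifacts <- data.frame(
  id = c(2846, 3665, 1270, 2840, 2842),
  suffix = c("", "", ".h5ad", ".ipynb", ".html"),
  uid = c(
    "FYMewVq5twKMDXVy0000", "FYMewVq5twKMDXVy0001", "tczTlSHFPOcAcBnfyxKA",
    "JIIPyQX5l9qELPl42d75", "Whyxwf3k2GjJwTPCl1FK"
  ),
  description = c(
    "Census 2023-12-15", "Census 2024-07-01",
    "Supercluster: Hippocampal CA1-3",
    "Source of transform G69jtgzKO0eJ6K79",
    "Report of run UAAiLAi0BrLvlKnsuvP3"
  ),
  stringsAsFactors = FALSE
)

# Keep only the artifacts stored as .h5ad files
h5ad <- artifacts[artifacts$suffix == ".h5ad", ]

# Search the descriptions for a term, ignoring case
hits <- h5ad[grepl("hippocampal", h5ad$description, ignore.case = TRUE), ]

# The uid column holds the identifier that can be used to fetch an artifact
hits$uid
```

With `limit = 5` only the first five records are returned, so a search like this would only be useful on a larger slice of the registry.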

This is useful, but it’s not the nicest or easiest way to find a particular dataset. Instead, we will use the Lamin Hub website to find the data we want to load.

1. Open a browser and go to https://lamin.ai/laminlabs/cellxgene
2. On the top toolbar, click the “Artifacts” tab.
3. Use the search field and the filters to find a dataset you are interested in.

   We use the “Suffix” filter to find .h5ad files and search for “renal cell carcinoma”.

4. Select the entry for the dataset you want to load to open a page with more details.
5. Click the copy button at the top right. This copies a command that includes the ID for the artifact.

Once we have the artifact ID, we can load information about the artifact, similar to what we see on the website. Notice that the command we use is slightly different from the one we copied from the website.

artifact <- cellxgene$Artifact$get("7dVluLROpalzEh8mNyxk")
artifact
#> Artifact(uid='7dVluLROpalzEh8mNyxk', description='Renal cell carcinoma, pre aPD1, kidney Puck_200727_12', key='cell-census/2023-12-15/h5ads/02faf712-92d4-4589-bec7-13105059cf86.h5ad', id=1742, run_id=22, hash='YNYuokfAoDFxdaRILjmU9w', size=13997860, suffix='.h5ad', storage_id=2, version='2023-12-15', _accessor='AnnData', is_latest=TRUE, transform_id=16, _hash_type='md5-n', created_at='2024-01-11T09:13:23.143694+00:00', created_by_id=1, updated_at='2024-01-24T07:17:47.009288+00:00', visibility=1, n_observations=17612, _key_is_virtual=FALSE)
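
The printed metadata includes a `size` field, which LaminDB records in bytes. Before downloading, it can be worth converting this to a more readable unit. The sketch below hard-codes the value shown above so that it runs on its own. With a live connection, the same number would presumably be available as `artifact$size`, but that accessor is an assumption here.

```r
# Size in bytes, copied from the artifact metadata printed above
size_bytes <- 13997860

# Convert to megabytes (1 MB = 1e6 bytes)
size_mb <- size_bytes / 1e6

message(sprintf("Download size: about %.1f MB", size_mb))
```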

So far we have only retrieved the metadata about this object. To download the data itself we need to run another command.

adata <- artifact$load()
adata
#> AnnData object with n_obs × n_vars = 17612 × 23254
#>     obs: 'n_genes', 'n_UMIs', 'log10_n_UMIs', 'log10_n_genes', 'Cell_Type', 'cell_type_ontology_term_id', 'organism_ontology_term_id', 'tissue_ontology_term_id', 'assay_ontology_term_id', 'disease_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'donor_id', 'is_primary_data', 'suspension_type', 'cell_type', 'assay', 'disease', 'organism', 'sex', 'tissue', 'self_reported_ethnicity', 'development_stage'
#>     var: 'gene', 'n_beads', 'n_UMIs', 'feature_is_filtered', 'feature_name', 'feature_reference', 'feature_biotype'
#>     uns: 'Cell_Type_colors', 'schema_version', 'title'
#>     obsm: 'X_spatial'

This dataset has been stored as an AnnData object. In the next sections we will convert it to a Seurat object and perform some simple analysis.

Convert to Seurat

There are various approaches for converting between different single-cell objects, some of which are described in the Interoperability chapter of the Single-cell Best Practices book.

Because we already have the data loaded in memory, the simplest option is to extract the information we need and create a new Seurat object.

seurat <- SeuratObject::CreateSeuratObject(
  counts = Matrix::t(adata$X),
  meta.data = adata$obs
)
#> Warning: Data is of class dgRMatrix. Coercing to dgCMatrix.
seurat
#> An object of class Seurat 
#> 23254 features across 17612 samples within 1 assay 
#> Active assay: RNA (23254 features, 0 variable features)
#>  1 layer present: counts
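
The transpose in the call above is needed because AnnData stores cells as rows and genes as columns, while Seurat expects genes as rows and cells as columns. The following is a minimal sketch of that step on a toy matrix, assuming only the {Matrix} package. The object names (`toy_x`, `toy_counts`) and the values are hypothetical and not part of the dataset.

```r
library(Matrix)

# Hypothetical AnnData-style matrix: 3 cells (rows) x 4 genes (columns)
toy_x <- Matrix::sparseMatrix(
  i = c(1, 2, 3, 3),
  j = c(1, 2, 3, 4),
  x = c(5, 1, 2, 7),
  dims = c(3, 4),
  dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4))
)

# Transpose so that genes become rows and cells become columns,
# which is the orientation used for the counts argument
toy_counts <- Matrix::t(toy_x)

dim(toy_counts)
rownames(toy_counts)
```

After the transpose, the row names are the gene identifiers and the column names are the cell identifiers, so the cell metadata in `meta.data` lines up with the columns.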

Analysis

We could perform any standard analysis with {Seurat}, but as an example we will calculate marker genes for each of the annotated cell types. To keep things quick, we only test the first 1000 genes, but if you have a few minutes you can get results for all features.

# Set cell identities to the provided cell type annotation
SeuratObject::Idents(seurat) <- "Cell_Type"
# Normalise the data
seurat <- Seurat::NormalizeData(seurat)
#> Normalizing layer: counts
# Test for marker genes
markers <- Seurat::FindAllMarkers(
  seurat,
  features = SeuratObject::Features(seurat)[1:1000]
)
#> Calculating cluster Epithelial
#> Calculating cluster Fibroblast
#> For a (much!) faster implementation of the Wilcoxon Rank Sum Test,
#> (default method for FindMarkers) please install the presto package
#> --------------------------------------------
#> install.packages('devtools')
#> devtools::install_github('immunogenomics/presto')
#> --------------------------------------------
#> After installation of presto, Seurat will automatically use the more 
#> efficient implementation (no further action necessary).
#> This message will be shown once per session
#> Calculating cluster Myeloid
#> Calculating cluster Tumor
#> Warning: The following tests were not performed:
#> Warning: When testing Epithelial versus all:
#>  Cell group 1 has fewer than 3 cells
# The output is a data.frame
head(markers)
#>                        p_val avg_log2FC pct.1 pct.2    p_val_adj    cluster
#> ENSG00000164283 1.030703e-89  2.7485040 0.205 0.048 2.396797e-85 Fibroblast
#> ENSG00000116016 3.606838e-38  2.0721038 0.152 0.051 8.387340e-34 Fibroblast
#> ENSG00000074800 5.097282e-25 -0.9810317 0.185 0.366 1.185322e-20 Fibroblast
#> ENSG00000112715 6.663398e-18 -1.1826785 0.078 0.202 1.549507e-13 Fibroblast
#> ENSG00000140416 1.844156e-17 -0.6994000 0.175 0.326 4.288400e-13 Fibroblast
#> ENSG00000125810 8.916133e-15  1.8102270 0.057 0.019 2.073358e-10 Fibroblast
#>                            gene
#> ENSG00000164283 ENSG00000164283
#> ENSG00000116016 ENSG00000116016
#> ENSG00000074800 ENSG00000074800
#> ENSG00000112715 ENSG00000112715
#> ENSG00000140416 ENSG00000140416
#> ENSG00000125810 ENSG00000125810
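
Because the result is a plain data.frame, it can be filtered with base R. The sketch below keeps only up-regulated, significant markers. The `markers_head` object is a hand-copied stand-in for a few columns of the six rows printed above, and the cutoffs (adjusted p-value below 0.05, log2 fold change above 1) are illustrative assumptions rather than recommendations from this vignette.

```r
# Stand-in for selected columns of head(markers), copied from the printed output
markers_head <- data.frame(
  avg_log2FC = c(2.7485040, 2.0721038, -0.9810317,
                 -1.1826785, -0.6994000, 1.8102270),
  p_val_adj = c(2.396797e-85, 8.387340e-34, 1.185322e-20,
                1.549507e-13, 4.288400e-13, 2.073358e-10),
  cluster = "Fibroblast",
  gene = c("ENSG00000164283", "ENSG00000116016", "ENSG00000074800",
           "ENSG00000112715", "ENSG00000140416", "ENSG00000125810"),
  stringsAsFactors = FALSE
)

# Keep significant markers with higher expression in the cluster
# (thresholds are assumptions chosen for illustration)
top_up <- subset(markers_head, p_val_adj < 0.05 & avg_log2FC > 1)

# Sort by decreasing effect size
top_up <- top_up[order(-top_up$avg_log2FC), ]

top_up$gene
```

The same `subset()` call should work on the full `markers` table, since it uses only columns that appear in the output above.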

Store the results in LaminDB

Now that we have our results, we can save them to the LaminDB instance.

seu_path <- tempfile(fileext = ".rds")
saveRDS(seurat, seu_path)

db$Artifact$from_df(
  markers,
  description = "Marker genes for renal cell carcinoma dataset"
)$save()

db$Artifact$from_path(
  seu_path,
  description = "Seurat object for renal cell carcinoma dataset"
)$save()
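
Before uploading a serialised object, it can be worth confirming that the file reads back unchanged. This is a minimal base R sketch of such a check, using a small hypothetical list (`toy_result`) in place of the Seurat object. It is an optional sanity check and not something {laminr} requires.

```r
# Hypothetical stand-in for the analysis result
toy_result <- list(
  description = "toy object",
  counts = matrix(
    1:6,
    nrow = 2,
    dimnames = list(c("geneA", "geneB"), paste0("cell", 1:3))
  )
)

# Write to a temporary .rds file, as done for the Seurat object above
toy_path <- tempfile(fileext = ".rds")
saveRDS(toy_result, toy_path)

# Read the file back and compare it with the in-memory object
restored <- readRDS(toy_path)
round_trip_ok <- identical(toy_result, restored)

round_trip_ok
file.size(toy_path)
```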

Close the connection

Finally, we can close the connection to the database.

db$finish()

Render and upload the notebook

You can render this notebook to HTML in one of the following ways:

In RStudio, click the “Knit” button

From the command line, run:

Rscript -e 'rmarkdown::render("example_workflow.Rmd")'

Or use the rmarkdown package in R:

rmarkdown::render("example_workflow.Rmd")

Then save it to your LaminDB instance using the lamin CLI:

lamin save example_workflow.Rmd