Spatially resolved omics technologies are transforming our understanding of biological tissues. However, large data volumes, heterogeneous data types, and a lack of flexible, spatially aware data structures make it difficult to process uni- and multimodal spatial omics datasets. Researchers present SpatialData, a framework that provides a unified and extensible multiplatform file format, a lazy representation of larger-than-memory data, and transformations and alignment to common coordinate systems. SpatialData simplifies spatial annotation and cross-modal aggregation and analysis, as demonstrated through several vignettes, including an integrative analysis of a multimodal Xenium and Visium breast cancer study.

Introduction

Technological developments in imaging and spatial molecular profiling are making it possible to analyze the structure and composition of biological tissues in detail. These technologies quantify DNA, RNA, protein, and metabolite abundances in situ and resolve morphological features at length scales ranging from subcellular structures to whole organisms. Spatial omics technologies are developing quickly, but each comes with trade-offs in spatial resolution, molecular multiplexing, and detection sensitivity, among other constraints. A comprehensive understanding of biological systems therefore depends on effectively integrating data from several spatial omics modalities.

The variety of data types and file formats makes integrating uni- and multimodal spatial omics data extremely difficult. Further difficulties arise because modalities differ in spatial resolution and in the tissue regions they capture. Overcoming these issues requires transforming and aligning data to a common coordinate system (CCS), a step toward building global common coordinate frameworks (CCFs). The size and complexity of multimodal spatial omics datasets also call for specialized solutions for large-scale interactive data exploration and annotation. A unified programmatic interface to store, explore, analyze, and annotate data across spatial omics technologies is therefore crucial for unlocking the potential of spatial multiomics studies.

Understanding SpatialData

The SpatialData framework is a Python toolkit that makes it easier to integrate multimodal spatial omics data in a way that is findable, accessible, interoperable, and reusable (FAIR). It uses a language-independent storage format that improves interoperability across data sources and standardizes access to many kinds of data. The SpatialData format comprises five primitive elements (images, labels, points, shapes, and tables) and supports all major spatial omics technologies and derived quantities. To facilitate joint integrative analysis, the file format tracks the coordinate transformations and alignment steps applied to individual datasets. The format builds on the Open Microscopy Environment Next-Generation File Format (OME-NGFF) specification and uses the Zarr file format for efficient, interoperable access.
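
To make the five element types more concrete, the following is a minimal pure-Python sketch of a container that groups them and records, per element, which named coordinate systems it can be mapped to. All names here (`SpatialContainer`, `Element`, `to_coordinate_system`) are hypothetical and chosen for illustration only; this is not the SpatialData API, nor its on-disk Zarr layout.

```python
from dataclasses import dataclass, field

# Hypothetical names throughout: this sketches the idea of the format,
# not the SpatialData library's classes or its Zarr layout.
ELEMENT_KINDS = ("images", "labels", "points", "shapes", "tables")


@dataclass
class Element:
    kind: str       # one of ELEMENT_KINDS
    data: object    # raster, coordinates, polygons, or a table
    # Maps a coordinate-system name to a description of the transformation.
    to_coordinate_system: dict = field(default_factory=dict)

    def __post_init__(self):
        if self.kind not in ELEMENT_KINDS:
            raise ValueError(f"unknown element kind: {self.kind!r}")


@dataclass
class SpatialContainer:
    elements: dict = field(default_factory=dict)  # element name -> Element

    def add(self, name, element):
        self.elements[name] = element

    def in_coordinate_system(self, cs_name):
        """Return the sorted names of elements that can be mapped to cs_name."""
        return sorted(
            name for name, el in self.elements.items()
            if cs_name in el.to_coordinate_system
        )


container = SpatialContainer()
container.add("he_image", Element("images", data=[[0, 1], [1, 0]],
                                  to_coordinate_system={"global": "identity"}))
container.add("transcripts", Element("points", data=[(0.5, 1.5), (2.0, 3.0)],
                                     to_coordinate_system={"global": "scale 0.5"}))
# A table carries annotations rather than coordinates, so in this sketch
# it has no transformation of its own.
container.add("cell_table", Element("tables", data={"cell_1": {"GeneA": 3}}))

print(container.in_coordinate_system("global"))
```

Storing the transformation alongside each element, rather than resampling the data, is the design choice this sketch is meant to highlight: the raw data stay untouched, and alignment becomes a matter of metadata.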

Key Findings of the Study

  • Processing a variety of uni- and multimodal datasets is made easier by SpatialData. Vignettes illustrating further use cases are included in the SpatialData online documentation.
  • For instance, the researchers demonstrate how SpatialData can act as a backend for training deep learning models and can enable downstream analysis with spatial analysis tools such as Squidpy.
  • The study also provides preformatted SpatialData objects for more than 40 datasets acquired with eight technologies, which can serve as a starting point for using SpatialData with different technologies.
  • Datasets with single or multiple modalities can be annotated interactively.
  • Finally, the study mapped 12 Visium slides to a large prostate section to examine how SpatialData can align multiple fields of view to a global reference coordinate system.
  • Further details, including extensive documentation of the SpatialData Python library, tutorials, example datasets, and a contributor guide, are available at https://spatialdata.scverse.org.

SpatialData Framework

The SpatialData framework is compatible with Python 3.9 and above and consists of the core package, spatialdata, and the related satellite packages napari-spatialdata, spatialdata-io, and spatialdata-plot. All code is part of the scverse organization, is available on GitHub, and is released under the permissive BSD 3-Clause License. The project structures derive from the scverse cookiecutter and the napari plugin cookiecutter, which set up unit tests and pre-commit checks in a continuous integration environment. The documentation is built with Sphinx and hosted on Read the Docs. It includes application programming interface (API) descriptions, example notebooks, and a table linking to downloadable spatial omics datasets. All of the datasets can be downloaded as a zip file or accessed directly from the cloud through public S3 storage.

SpatialData Python Library

The SpatialData Python library represents this format in memory as SpatialData objects and supports lazy loading of larger-than-memory data. In addition to flexible functions for manipulating and accessing SpatialData objects and for defining CCSs of biological tissues, the library offers reader functions for popular spatial omics technologies. In short, each dataset is associated with modality-specific coordinate transformations, such as affine transformations and compositions of transformations. Once a set of datasets is aligned, they can be queried and aggregated, for instance by applying spatial annotations at different scales (such as cells, grids, or anatomical regions), both within and between modalities. To facilitate access, selective data sharing, and exploration, the query and aggregation interfaces also make it possible to create new datasets from large dataset collections, organized by biologically informed factors.
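
The idea of per-modality transformations followed by a shared query can be sketched in a few lines of plain Python. The example below is an illustration under stated assumptions, not the SpatialData API: the helper names (`scale`, `translation`, `compose`, `apply`, `count_in_box`) are hypothetical, and the pixel sizes, offsets, and coordinates are made-up values. Each modality gets its own affine matrix into a common coordinate system, and a single rectangular annotation drawn in that system is then used to count points from both modalities.

```python
# Pure-Python sketch with hypothetical helper names; not the SpatialData API.
# A 2D affine transformation is stored as a 3x3 matrix in homogeneous coordinates.

def scale(sx, sy):
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def compose(*matrices):
    """Compose transformations so that the first argument is applied first."""
    result = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for m in matrices:
        # Left-multiply the running result by the next matrix: result = m @ result
        result = [
            [sum(m[i][k] * result[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)
        ]
    return result


def apply(matrix, points):
    """Map (x, y) points through an affine matrix."""
    return [
        (matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
         matrix[1][0] * x + matrix[1][1] * y + matrix[1][2])
        for x, y in points
    ]


def count_in_box(points, min_xy, max_xy):
    """Count points inside an axis-aligned rectangle (bounds inclusive)."""
    return sum(
        min_xy[0] <= x <= max_xy[0] and min_xy[1] <= y <= max_xy[1]
        for x, y in points
    )


# Modality A is stored in pixels; assume 0.5 units per pixel (made-up value).
a_to_global = scale(0.5, 0.5)
# Modality B needs a composite operation: scale first, then shift (made-up values).
b_to_global = compose(scale(2.0, 2.0), translation(10.0, 20.0))

points_a = apply(a_to_global, [(20.0, 40.0), (50.0, 30.0), (100.0, 100.0)])
points_b = apply(b_to_global, [(5.0, 2.0), (40.0, 40.0)])

# One annotation, drawn in the common coordinate system, queries both modalities.
roi_min, roi_max = (0.0, 0.0), (30.0, 30.0)
counts = {
    "modality_a": count_in_box(points_a, roi_min, roi_max),
    "modality_b": count_in_box(points_b, roi_min, roi_max),
}
print(counts)
```

Because the annotation lives in the common coordinate system, the same region can be applied to any modality that has a transformation into that system, which is what makes cross-modal aggregation possible.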

Features of SpatialData

  • Napari-spatialdata is a napari plugin for interactive annotation of SpatialData objects. With the plugin, one can interactively define spatial annotations such as regions of interest, or add landmarks to help with multidataset registration. The spatialdata-plot package can be used to generate static figures and visualizations.
  • The SpatialData library builds upon standard scientific Python data types, allowing smooth integration with the Python ecosystem. To train deep learning models directly from SpatialData objects, the researchers have built a PyTorch Dataset class. Furthermore, because of the modular nature of the data format, SpatialData objects can be analyzed with tools from the scverse ecosystem such as Scanpy, Squidpy, and scvi-tools. Taken together, the SpatialData framework offers the necessary infrastructure for the analysis and integration of spatial omics data.
  • SpatialData is an effective tool for multimodal integration, for example in a study that combines Xenium assays with hematoxylin and eosin (H&E) images. In this study, SpatialData was used to represent and process spatial transcriptomics and in situ sequencing datasets from a breast cancer tumor. Landmark points marking corresponding spatial regions across datasets were defined and used to align all datasets. This multimodal integration strategy improves the efficiency and precision of the analysis.
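
Landmark-based alignment, as mentioned in the list above, can be illustrated with a small least-squares fit. The sketch below assumes an affine model and made-up landmark coordinates, and its helper names (`solve_3x3`, `fit_affine`, `transform`) are hypothetical; it is not the SpatialData API and does not claim to reproduce the method used in the study. Given matching landmark pairs in two datasets, it estimates the affine map that carries one set onto the other.

```python
# Pure-Python sketch with hypothetical helper names; not the SpatialData API.

def solve_3x3(m, rhs):
    """Solve m @ p = rhs by Gaussian elimination with partial pivoting."""
    aug = [list(row) + [r] for row, r in zip(m, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("need at least three non-collinear landmarks")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    p = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * p[c] for c in range(r + 1, n))
        p[r] = (aug[r][n] - s) / aug[r][r]
    return p


def fit_affine(src, dst):
    """Least-squares affine map taking src landmarks onto dst landmarks."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T b, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in range(2):
        atb = [sum(r[i] * d[axis] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve_3x3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]: u = a*x + b*y + c, v = d*x + e*y + f


def transform(params, point):
    x, y = point
    return tuple(a * x + b * y + c for a, b, c in params)


# Made-up landmarks: the second dataset is the first scaled by 2 and shifted by (5, -3).
landmarks_src = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
landmarks_dst = [(5.0, -3.0), (25.0, -3.0), (5.0, 17.0), (25.0, 17.0)]

params = fit_affine(landmarks_src, landmarks_dst)
mapped = transform(params, (5.0, 5.0))
print([[round(v, 6) for v in row] for row in params])
print(tuple(round(v, 6) for v in mapped))
```

Using more than three landmark pairs makes the fit overdetermined, so small errors in individual landmark placement are averaged out rather than propagated directly into the transformation.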

Conclusion

SpatialData is a community standards-based framework created specifically for spatial omics that enables the storage, processing, and annotation of data from different spatial omics technologies. Its flexibility makes it possible to establish common coordinate systems, enabling reliable comparison and reuse of samples across studies. The flexibility and accessibility of SpatialData improve the reproducibility of integrated spatial analyses and open up new avenues for study. Planned developments include improved interoperability with R/Bioconductor, multiscale point and polygon representations, and support for cloud-based data access, and the framework's utility should grow with wider adoption. All things considered, SpatialData is a universal and open data framework for spatial omics.

Article Source: Reference Paper | Reference Article

SpatialData is available as a Python package via pip. Examples and tutorials can be accessed from the documentation on the project website.

Deotima is a consulting scientific content writing intern at CBIRT. She is currently pursuing a Master's in Bioinformatics at Maulana Abul Kalam Azad University of Technology. As an emerging scientific writer, she is eager to apply her expertise in making intricate scientific concepts comprehensible to individuals from diverse backgrounds. Deotima has a particular passion for Structural Bioinformatics and Molecular Dynamics.
