Long-read Single-Cell RNA-seq analysis tutorial
2025-11-28
Chapter 1 Introduction
Welcome to the FLAMESv2 analysis tutorial!
In this tutorial, we demonstrate how to process and analyze long-read single-cell RNA sequencing data using outputs from the FLAMES package (Tian et al., 2021; Wang et al., 2025). FLAMES enables the identification and quantification of isoform-level expression in single cells, providing a unique opportunity to uncover transcriptomic complexity that is often undetectable in short-read data.
We will demonstrate how to load and explore FLAMES outputs in Seurat and other popular single-cell analysis tools. By following this workflow, you’ll learn how to:
Preprocess long-read single-cell data
Visualize isoform expression patterns and isoform structure
Identify differentially expressed isoforms across cell types
Detect novel isoforms with potential functional impact
If you’re familiar with short-read data processing, much of the pre-processing workflow will feel intuitive. However, long-read single-cell sequencing also provides isoform-level information, which lets you explore isoform dynamics in single cells. This can be useful for studying complex developmental systems or disease pathogenesis.
1.1 Prerequisites
This tutorial assumes you have already processed your long-read single-cell data using FLAMES, either through the SingleCellPipeline or MultiSampleSCPipeline. Please ensure that the following parameters in your configuration file are set to true to enable isoform identification with Bambu (Chen et al., 2023) and quantification with Oarfish (Jousheghani & Patro, 2024):
"bambu_isoform_identification": [true]
"oarfish_quantification": [true]
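As a quick sanity check before starting, the two flags can be confirmed from the command line. The following is a minimal sketch rather than part of FLAMES: the helper name `check_flames_flag` and the file name `config.json` are hypothetical, and the check assumes each flag and its value sit on a single line of the JSON file, in the `[true]` form shown above.

```shell
# Hypothetical helper (not part of FLAMES): report whether a flag is set to
# [true] in a FLAMES JSON configuration file.
# Assumes the key and its value appear on the same line.
check_flames_flag() {
  config="$1"
  flag="$2"
  # Match: "<flag>" : [ true ]  (whitespace between tokens is optional)
  pattern="\"${flag}\""'[[:space:]]*:[[:space:]]*\[[[:space:]]*true[[:space:]]*]'
  if grep -Eq "$pattern" "$config"; then
    echo "${flag}: enabled"
  else
    echo "${flag}: NOT enabled"
    return 1
  fi
}

# Example usage (config.json is a placeholder for your own file):
# check_flames_flag config.json bambu_isoform_identification
# check_flames_flag config.json oarfish_quantification
```

The function returns a non-zero status when a flag is missing or not set to true, so it can also be used to stop a wrapper script early.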
While FLAMES is optimized for specific isoform discovery and quantification tools, much of this workflow can be adapted to the other tools that FLAMES supports. We recommend Bambu and Oarfish because they have been validated for the type of analysis demonstrated here.
Additionally, we provide an optional step for users interested in removing empty droplets and ambient RNA contamination. If you plan to use this feature, ensure that you have previously calculated the ambient RNA profile. Detailed instructions for this step can be found here ??
1.2 Getting Started with the Data
To follow along with this tutorial, you can download the data from GitHub.
Download it using the following command:
curl -L -o data.zip https://github.com/Sefi196/FLAMESv2_LR_sc_tutorial/releases/latest/download/data.zip \
&& unzip data.zip
Once all files are unzipped, you are ready to begin. If you prefer to run the tutorial using your own FLAMES output, no unzipping is needed. However, be sure to use the correct GTF file: the GTF file used during FLAMES processing must be the same one used for downstream analyses. The version used in this tutorial can be downloaded using the following command:
wget https://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_47/gencode.v47.annotation.gtf.gz
1.3 Dataset Information
This tutorial uses data generated by the Clark Lab at the University of Melbourne. It consists of two datasets designed to demonstrate different aspects of analysis in the FLAMESv2 workflow.
The single-sample dataset is a small, self-contained collection of approximately 400 cells, obtained at Day 55 of an excitatory neural differentiation protocol. It provides a lightweight example that users can run quickly to gain an understanding of the major analysis steps, including removing sources of variation, building a Seurat object with gene and isoform counts, and exploring interesting isoform structures and their expression profiles. The dataset is also compact enough for users to run entirely through FLAMESv2, allowing them to experience the complete analysis workflow from raw FASTQ data to informative isoform-level outcomes. The raw data are available for download at ENA: PRJEB100552.
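For readers who want to start from the raw reads, one possible way to list the files under this accession is the ENA Portal API file report. The sketch below only builds the request URL; the helper name `ena_filereport_url` and the output file name are hypothetical, and which columns are populated for this study (for example `fastq_ftp`) is an assumption that should be checked against the returned table.

```shell
# Hypothetical helper: build an ENA Portal API "filereport" URL for a study
# accession. The default fields are an assumption about what is useful here.
ena_filereport_url() {
  accession="$1"
  fields="${2:-run_accession,fastq_ftp,fastq_md5}"
  printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=%s&format=tsv\n' \
    "$accession" "$fields"
}

# Example usage (requires network access):
# curl -fsSL "$(ena_filereport_url PRJEB100552)" -o PRJEB100552_runs.tsv
```

The resulting tab-separated table has one row per run, which can then be used to download the individual files.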
The multi-sample dataset extends the workflow to a comparative setting and demonstrates how FLAMESv2 can be used to explore isoform-level changes across samples and developmental stages in a neuronal differentiation context. It forms part of the neuronal differentiation dataset described in the manuscript Wang et al. (2025), and for the purposes of the tutorial, it has been simplified to include only a subset of cells. This version allows users to perform sample integration, comparative isoform analysis, and trajectory-based exploration of isoform expression across neural developmental time points, all within a manageable runtime.
More details on the dataset, sequencing methodology, and differentiation protocol can be found in Wang et al. (2025) and You et al. (2023).
1.4 Citation
If you find this tutorial useful, please cite Wang et al. (2025) and Tian et al. (2021).
1.5 Contact
For questions or suggestions, please feel free to email us at sefi.prawer@unimelb.edu.au or leave a comment on our GitHub page: https://github.com/Sefi196/FLAMESv2_LR_sc_tutorial.