Add Relative Log Expression (RLE) Plots #348

jonathjd · 2024-12-07T19:15:07Z

Reference Issue or PRs

Issue Reference: #320

What does your PR implement? Be specific.

The Relative Log Expression (RLE) plot is a useful diagnostic for visualizing differences in count distributions between samples. The x-axis shows each sample from the count matrix, and the y-axis shows the log ratio of each gene's expression to the median expression of that gene across all samples.

Given:

gene_ij is the count of gene j in sample i
median_j is the median count of gene j across all samples

The RLE for gene j in sample i is calculated as:

RLE_ij = log2(gene_ij / median_j)
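As a rough illustration of the formula above, here is a minimal NumPy sketch. The function name `relative_log_expression` and the toy matrix are made up for this example and are not part of the PR; the matrix is assumed to be samples x genes with strictly positive counts, so zero counts (which would give infinite or undefined values) are not handled here.

```python
import numpy as np


def relative_log_expression(counts):
    """Return log2(count / gene-wise median) for a samples x genes matrix."""
    counts = np.asarray(counts, dtype=float)
    # median_j: median of each gene (column) across all samples (rows)
    gene_medians = np.median(counts, axis=0)
    # RLE_ij = log2(gene_ij / median_j), broadcast over rows
    return np.log2(counts / gene_medians)


# Toy data: 3 samples (rows) x 3 genes (columns), all counts positive
counts = np.array([[10, 200, 30], [20, 100, 30], [40, 400, 30]])
rle = relative_log_expression(counts)
```

Each row of `rle` then holds the values that would go into one sample's box in the plot.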

This PR takes in the raw counts self.X, a normalize boolean, design_matrix.index for the sample IDs, and a save_path, and produces an RLE plot.

The normalize boolean is False by default but can be set to True to normalize the raw counts before plotting.

This example was produced with the synthetic data in the ./datasets/ directory.

BorisMuzellec

Thanks a lot for this PR @jonathjd !

I just have one suggestion to make: unless I missed something, I think we should avoid re-computing size factors when we already have them.

Happy to merge once this is fixed!

BorisMuzellec · 2024-12-11T08:39:38Z

pydeseq2/utils.py

+    if normalize:
+        print("Plotting normalized RLE plot...")
+        geometric_mean = np.exp(np.mean(np.log(count_matrix + 1), axis=0))
+        size_factors = np.median(count_matrix / geometric_mean, axis=1)
+        count_matrix = count_matrix / size_factors[:, np.newaxis]


This is recomputing median-of-ratios size factors, right? (up to the +1 in the log)

In that case, it might be better to add a size_factors argument to make_rle_plot.

From there, in dds.plot_rle, if normalize=True, we would first check whether size factors (self.obsm["size_factors"]) were already computed. If so, we pass them directly. If not, we call self.fit_size_factors first.
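The flow suggested above could look roughly like the sketch below. `ToyDataset`, `rle_values`, and the `n_fits` counter are hypothetical stand-ins written for this comment, not the pydeseq2 classes or the code in this PR; the toy `fit_size_factors` is a plain median-of-ratios on strictly positive counts, without the +1 pseudocount.

```python
import numpy as np


def rle_values(counts, size_factors=None):
    """log2 ratio of each count to its gene-wise median (samples x genes)."""
    counts = np.asarray(counts, dtype=float)
    if size_factors is not None:
        # Divide each sample (row) by its size factor before taking ratios.
        counts = counts / np.asarray(size_factors, dtype=float)[:, np.newaxis]
    return np.log2(counts / np.median(counts, axis=0))


class ToyDataset:
    """Hypothetical stand-in for the dataset object, not the pydeseq2 class."""

    def __init__(self, counts):
        self.X = np.asarray(counts, dtype=float)
        self.obsm = {}
        self.n_fits = 0  # counts how often size factors are computed

    def fit_size_factors(self):
        # Toy median-of-ratios on strictly positive counts (no +1 pseudocount).
        log_counts = np.log(self.X)
        log_ratios = log_counts - log_counts.mean(axis=0)
        self.obsm["size_factors"] = np.exp(np.median(log_ratios, axis=1))
        self.n_fits += 1

    def rle(self, normalize=False):
        if not normalize:
            return rle_values(self.X)
        if "size_factors" not in self.obsm:
            self.fit_size_factors()  # fit only when nothing is cached
        return rle_values(self.X, self.obsm["size_factors"])


dds = ToyDataset([[10, 20, 30], [20, 40, 60], [40, 80, 120]])
first = dds.rle(normalize=True)   # fits size factors once
second = dds.rle(normalize=True)  # reuses the cached size factors
```

The point of the sketch is only the control flow: the plotting helper receives size factors as an argument, and the dataset method decides whether they need to be fitted first.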

Hi @BorisMuzellec ,

For some reason, when I use the size factors computed in dds.fit_size_factors(), I do not get an RLE plot with the sample medians centered around 0.

But when I compute the size factors internally, I do get an RLE plot with the sample medians centered around 0.

Do you know what could be causing this?

I'm not sure why we would expect the sample medians to be zero, given that the RLE plot subtracts the gene medians. Also, I would be wary of drawing any conclusions from the test data.

That being said, what you wrote is what is implemented in R's plotRLE method (https://rdrr.io/github/davismcc/scater/src/R/plotRLE.R). If it's standard, I'm happy to keep it as is.
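To make the point about medians concrete, here is a small toy check (the matrix is invented for illustration, with an odd number of samples and no zero counts). Subtracting gene medians on the log scale centers each gene across samples, which by itself says nothing about the median within each sample.

```python
import numpy as np

# Toy matrix: 3 samples (rows) x 3 genes (columns); the second sample
# has twice the sequencing depth of the other two.
counts = np.array([[10, 20, 30], [20, 40, 60], [10, 20, 30]], dtype=float)

# RLE on raw counts: log2 ratio to the gene-wise median
rle = np.log2(counts / np.median(counts, axis=0))

# Median of each gene across samples: zero by construction here
gene_medians = np.median(rle, axis=0)

# Median of each sample across genes: what the boxes in the plot show
sample_medians = np.median(rle, axis=1)
```

In this toy case `gene_medians` is all zeros, while the deeper sample has a median of 1 on the log2 scale, so non-zero sample medians on unnormalized counts are not surprising.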

@jonathjd I think we may still have some consistency issues though: if we're using log(counts + 1) for size factors, shouldn't we also do the same everywhere (gene medians and plotting)?

pydeseq2/dds.py

Add RLE plot to dds and utils

8d9dfef

jonathjd requested review from BorisMuzellec, maikia and umarteauowkin as code owners December 7, 2024 19:15

docs: fix typehint

7012b83

BorisMuzellec requested changes Dec 11, 2024


BorisMuzellec added 5 commits December 17, 2024 09:53

test: add plot_rle test

9c60aee

refactor: use obs_names

5141dea

refactor: fix type hints and remove print

6e797e2

docs: add plots to docs

d563a3e

fix: test

6c93fed


jonathjd commented Dec 7, 2024

BorisMuzellec left a comment

BorisMuzellec Dec 11, 2024

jonathjd Dec 12, 2024

BorisMuzellec Dec 17, 2024

BorisMuzellec Dec 17, 2024
