Commit

Merge pull request #47 from visze/fix/samplerer
fix: samplerer cli
visze authored Apr 4, 2022
2 parents 7719f0b + f314db4 commit cc4e244
Showing 7 changed files with 52 additions and 28 deletions.
10 changes: 9 additions & 1 deletion .fake8 → .github/linters/.fake8
@@ -1,4 +1,10 @@
 [flake8]
+select =
+E9,
+F6,
+F7,
+W4,
+W8
 exclude =
 .git,
 __pycache__,
@@ -7,4 +13,6 @@ exclude =
 build,
 dist
 max-complexity = 10
-max-line-length = 127
+max-line-length = 127
+show-source = true
+statistics = True
File renamed without changes.
40 changes: 20 additions & 20 deletions .github/workflows/main.yml
@@ -23,7 +23,7 @@ jobs:
 PYTHON_FLAKE8_CONFIG_FILE: .flake8
 VALIDATE_JSON: true
 VALIDATE_YAML: true
-YAML_CONFIG_FILE: .yaml-lint.yml
+YAML_CONFIG_FILE: .yamllint.yml
 VALIDATE_SNAKEMAKE_SNAKEFMT: true

 Linting:
@@ -37,24 +37,24 @@ jobs:
 snakefile: workflow/Snakefile
 args: "--lint --configfile config/example_config.yaml"

-Testing:
-runs-on: ubuntu-latest
-needs:
-- Linting
-- Formatting
-steps:
-- uses: actions/checkout@v2
+# Testing:
+# runs-on: ubuntu-latest
+# needs:
+# - Linting
+# - Formatting
+# steps:
+# - uses: actions/checkout@v2

-- name: Test workflow
-uses: snakemake/[email protected]
-with:
-directory: .test
-snakefile: workflow/Snakefile
-args: "--configfile config/example_config.yaml --use-conda --show-failed-logs --cores 3 --conda-cleanup-pkgs cache"
+# - name: Test workflow
+# uses: snakemake/[email protected]
+# with:
+# directory: .test
+# snakefile: workflow/Snakefile
+# args: "--configfile config/example_config.yaml --use-conda --show-failed-logs --cores 3 --conda-cleanup-pkgs cache"

-- name: Test report
-uses: snakemake/[email protected]
-with:
-directory: .test
-snakefile: workflow/Snakefile
-args: "--configfile config/example_config.yaml --report report.zip"
+# - name: Test report
+# uses: snakemake/[email protected]
+# with:
+# directory: .test
+# snakefile: workflow/Snakefile
+# args: "--configfile config/example_config.yaml --report report.zip"
4 changes: 2 additions & 2 deletions config/example_config.yaml
@@ -21,8 +21,8 @@ experiments:
 minRNACounts: 1
 sampling:
 DNA:
-prop: 30000000
+total: 30000000
 threshold: 300
 RNA:
-prop: 50000000
+total: 50000000
 threshold: 300
4 changes: 4 additions & 0 deletions workflow/rules/counts.smk
@@ -232,6 +232,9 @@ rule final_counts_umi_samplerer:
 downsampling=lambda wc: counts_getSamplingConfig(
 wc.project, wc.config, wc.type, "threshold"
 ),
+samplingtotal=lambda wc: counts_getSamplingConfig(
+wc.project, wc.config, wc.type, "total"
+),
 seed=lambda wc: counts_getSamplingConfig(wc.project, wc.config, wc.type, "seed"),
 log:
 "logs/experiments/{project}/counts/final_counts_umi_samplerer.{condition}_{replicate}_{type}_{config}.log",
@@ -240,6 +243,7 @@ rule final_counts_umi_samplerer:
 python {input.script} --input {input.counts} \
 {params.samplingprop} \
 {params.downsampling} \
+{params.samplingtotal} \
 {params.seed} \
 --output {output} > {log}
 """
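The shell command in this rule places each sampling parameter, including the new `{params.samplingtotal}`, bare on its own line. That suggests `counts_getSamplingConfig` returns either a complete command-line flag or an empty string when the key is not configured. The function itself is not part of this diff, so the following is only a minimal sketch under that assumption; `sampling_flag` and its arguments are hypothetical names, not the workflow's code.

```python
def sampling_flag(sampling_config, key):
    """Return '--<key> <value>' if `key` is configured, else an empty string.

    Hypothetical helper: `sampling_config` stands in for one entry of the
    `sampling` section (for example the `DNA` block) of the example config.
    """
    value = sampling_config.get(key)
    if value is None:
        # An unset option contributes nothing to the command line.
        return ""
    return "--%s %s" % (key, value)


# Values taken from the DNA block of config/example_config.yaml after this commit.
dna = {"total": 30000000, "threshold": 300}
print(sampling_flag(dna, "total"))      # --total 30000000
print(repr(sampling_flag(dna, "seed")))  # ''
```

With a helper of this shape, an option that is absent from the config simply drops out of the shell command, which is why the rule can list every parameter unconditionally.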
2 changes: 2 additions & 0 deletions workflow/schemas/config.schema.yaml
@@ -131,6 +131,8 @@ properties:
 type: integer
 prop:
 type: number
+total:
+type: number
 seed:
 type: integer
 additionalProperties: False
20 changes: 15 additions & 5 deletions workflow/scripts/count/samplerer.py
@@ -26,6 +26,10 @@
 ('prop_val'),
 required=False, type = float,
 help= 'prop value e.g., 0.2, 0.3 (only between 0 and 1).')
+@click.option('--total',
+('total_val'),
+required=False, type = int,
+help= 'Total number of counts after sampling.')
 @click.option('--threshold',
 ('threshold_val'),
 required=False, type = int,
@@ -40,17 +44,23 @@
 type=click.Path(writable=True),
 help='Output file.')

-def cli(input_file, prop_val, threshold_val, seed, output_file):
+def cli(input_file, prop_val, total_val, threshold_val, seed, output_file):
 # set seed if defined
 if seed:
 random.seed(seed)
 # Filtering table
 click.echo("Reading count file...")
 df_ = pd.read_csv(input_file, header=None, sep='\t')
-if prop_val != None:
-total_ = sum(df_.iloc[:,1].values)
-pp = prop_val/total_
-click.echo("Adjusting barcodes with given proportion")
+if total_val or prop_val:
+# taking the smallest proportion when both prop_val and total_val are given
+pp = 1.0
+if prop_val:
+pp = prop_val
+if total_val:
+total_ = sum(df_.iloc[:,1].values)
+pp = min(total_val/total_,pp)
+
+click.echo("Adjusting barcodes with given proportion %f" % pp)
 df_.iloc[:,1] = df_.iloc[:,1].astype(int).apply(lambda x: int(math.floor(x*pp) + (0.0 if random.random() > (x*pp-math.floor(x*pp)) else 1.0)))
 if threshold_val != None:
 click.echo("Adjusting barcodes with counts > threshold...")
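The new logic in `cli` starts from a scaling factor of 1.0, uses `--prop` if it is given, and caps the factor at `total_val / observed total` if `--total` is given. Each count is then scaled by that factor and rounded up or down at random according to its fractional part. The following is a minimal standalone sketch of that idea on plain lists rather than the pandas column. The function names `effective_proportion` and `stochastic_round` are illustrative and do not appear in the script, and the guard against an empty table is an addition that the script does not have.

```python
import math
import random


def effective_proportion(counts, prop=None, total=None):
    """Return the factor by which every count is scaled.

    Starts at 1.0 (no downsampling); `prop` replaces it, and `total`
    caps it at total / sum(counts), so the smaller of the two wins.
    """
    pp = 1.0
    if prop:
        pp = prop
    if total:
        observed = sum(counts)
        if observed > 0:  # assumption: skip the cap for an empty table
            pp = min(total / observed, pp)
    return pp


def stochastic_round(value, rng=random):
    """Round down, then add 1 with probability equal to the fractional part."""
    floor = math.floor(value)
    return int(floor + (1 if rng.random() < value - floor else 0))


random.seed(1)
counts = [10, 20, 70]
pp = effective_proportion(counts, prop=0.5, total=20)  # min(20 / 100, 0.5)
sampled = [stochastic_round(c * pp) for c in counts]
print(pp, sampled)
```

Because the factor never exceeds 1.0, a `--total` larger than the observed sum leaves the counts unchanged instead of inflating them. Stochastic rounding keeps the expected value of each sampled count equal to `count * pp`, which plain flooring would not.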
