Releases: iqbal-lab-org/make_prg
Version 0.5.0
Version 0.4.0
0.4.0 - 02/12/2022
Added
- make_prg update command, which updates PRGs without rebuilding the MSAs and the PRG itself from scratch;
- Trace (-vv) logging level, to track make_prg behaviour (intended for developers only);
- Multithreading support (-t parameter);
- A sample example;
- 365 new tests (from 116 to 481 total tests), with test coverage >99% in non-argument parsing code;
- Precompiled binary;
Changed
- make_prg from_msa: input can now be a single file or a directory. If it is a single file, a <prefix>.prg.bin, a <prefix>.prg.fa, a <prefix>.prg.gfa and a <prefix>.update_DS.zip file are created. If it is a directory, all files in the directory are scanned and the single-file execution is run on each input file found. The output files are then a collection of the single-input outputs: a <prefix>.prg.bin.zip file will contain a collection of .prg.bin files, and similarly for <prefix>.prg.gfa.zip and <prefix>.update_DS.zip; <prefix>.prg.fa will be a multi fasta;
- Other make_prg from_msa CLI changes (please run make_prg from_msa -h for a full description of the new parameters):
  - Parameters removed: --prg_name, --seqid, --no_overwrite;
  - Parameters added: -s, --suffix; -F, --force; -t, --threads; -g, --output-graphs;
  - Parameters changed: Replaced --outdir by --output_prefix;
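The output naming described above for make_prg from_msa can be summarised in a small sketch. The function name expected_outputs and its input_is_dir argument are hypothetical and are not part of make_prg; only the file suffixes come from the notes above.

```python
from typing import List


def expected_outputs(output_prefix: str, input_is_dir: bool) -> List[str]:
    """Return the output file names described in the 0.4.0 notes.

    Hypothetical helper for illustration only; make_prg does not expose it.
    """
    if input_is_dir:
        # Directory input: per-file outputs are bundled into zip archives,
        # and the PRG sequences are gathered into one multi fasta.
        suffixes = [".prg.bin.zip", ".prg.gfa.zip", ".update_DS.zip", ".prg.fa"]
    else:
        # Single-file input: one file of each kind.
        suffixes = [".prg.bin", ".prg.fa", ".prg.gfa", ".update_DS.zip"]
    return [output_prefix + suffix for suffix in suffixes]


for name in expected_outputs("out/sample", input_is_dir=False):
    print(name)
```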
- The recursive clustering and collapse algorithm is now explicitly represented as a tree whose internal data structures remember the multiple sequence subalignment at each point of the recursion, along with other internal data, allowing the recursion tree to be serialized and deserialized at any point. Updates can thus avoid recomputation by first saving the state of the recursion tree to disk, then loading it, adding de novo sequences to specific nodes, and triggering recomputation of only the modified nodes. Any preorder traversal of the recursion tree yields the same order of recursive calls as the previous algorithm, which allows the algorithms of the previous version to be expressed as preorder traversals with custom visit operations.
- Moved from setup.py to pyproject.toml
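The recursion-tree idea described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the Node class, its fields, the JSON round trip and the preorder helper are hypothetical and do not reflect make_prg's actual classes or on-disk format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable, Dict, List


@dataclass
class Node:
    """Hypothetical recursion-tree node, not make_prg's real data structure."""
    name: str
    alignment: List[str]                 # subalignment remembered at this node
    children: List["Node"] = field(default_factory=list)
    modified: bool = False               # set when new sequences are added


def from_dict(data: Dict) -> Node:
    """Rebuild a tree from the plain dict produced by dataclasses.asdict."""
    return Node(
        name=data["name"],
        alignment=list(data["alignment"]),
        children=[from_dict(child) for child in data["children"]],
        modified=data["modified"],
    )


def preorder(node: Node, visit: Callable[[Node], None]) -> None:
    """Visit a node before its children, mirroring the order of recursive calls."""
    visit(node)
    for child in node.children:
        preorder(child, visit)


# Build a tiny tree and save its state (here to a string rather than to disk).
root = Node("root", ["ACGT", "ACTT"],
            [Node("left", ["AC", "AC"]), Node("right", ["GT", "TT"])])
saved = json.dumps(asdict(root))

# Later: load the tree and add a new sequence to one specific node.
restored = from_dict(json.loads(saved))
restored.children[1].alignment.append("GA")
restored.children[1].modified = True

# Two custom visit operations over the same preorder traversal.
order: List[str] = []
preorder(restored, lambda n: order.append(n.name))
to_recompute: List[str] = []
preorder(restored, lambda n: to_recompute.append(n.name) if n.modified else None)
print(order, to_recompute)
```

Only the node flagged as modified would need recomputation in this sketch; untouched nodes keep the state loaded from the saved tree.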
Fixed
- Several minor bugs;
- Heavy refactoring of almost the whole codebase;
Removed
- Dropped support for python 3.7; supported python versions are: 3.8, 3.9, 3.10, 3.11.
- Dropped support for Mac OS X;
Version 0.2.0
New command-line
- Added:
  - -S, --seqid option to name the PRG sequence, which by default uses the file name.
  - -N shortcut for max nesting
  - -L shortcut for min match length
  - --log to specify the path the log file is written to. By default, the log now goes to stderr
  - -O, --output-type option to specify what output files are required. Defaults to all
- Removed:
  - summary file
- Changed:
  - --prefix CLI parameter of the from_msa subcommand removed in favor of CLI parameters --outdir and --prg_name, with sensible defaults (current working directory and MSA file name stem respectively). This allows finer control over where to place output files.
Output files
- Output files:
  - No longer contain 'max_nesting' and 'min_match_length' in their names; these appear in the log files, and in the .prg fasta header.
  - .bin file now stores even integer markers at site ends; this is the format used by gramtools.
  - Summary file not written by default
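The change to site-end markers can be pictured with a small sketch over a list of integers. It rests on assumptions that are not stated in these notes: that nucleotides are encoded as 1 to 4, that markers are integers of 5 or more, that a site opens with an odd marker, and that the earlier convention closed a site by repeating that odd marker. The function name is hypothetical, and the sketch does not read or write real .bin files.

```python
from typing import List


def close_sites_with_even_markers(prg: List[int]) -> List[int]:
    """Rewrite site-closing odd markers as the matching even marker.

    Assumed encoding (not taken from the release notes): 1-4 are nucleotides,
    an odd marker m >= 5 opens a site, and m + 1 separates its alleles.
    """
    open_sites: List[int] = []   # stack of odd markers for currently open sites
    converted: List[int] = []
    for value in prg:
        is_odd_marker = value >= 5 and value % 2 == 1
        if is_odd_marker and open_sites and open_sites[-1] == value:
            # Second occurrence of the innermost open marker: the site ends here.
            open_sites.pop()
            converted.append(value + 1)
        else:
            if is_odd_marker:
                open_sites.append(value)
            converted.append(value)
    return converted


flat = close_sites_with_even_markers([5, 1, 6, 2, 5])
nested = close_sites_with_even_markers([5, 1, 7, 2, 8, 3, 7, 6, 4, 5])
print(flat, nested)
```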
Bug fixes
Version 0.1.1
0.1.1 - 2021-01-27
Added
- Dockerfile
- -V option to get version
Changed
- A test that was clustering all unique 5-mers was reduced to all 4-mers as the memory
usage of all 5-mers was causing a segfault when trying to run the tests during the
docker image build.
Removed
- Singularity file as it is redundant with the new Dockerfile (that will be hosted on quay.io)
- scipy dependency. We never actually explicitly use scipy.