***********************************
version-3.1.0
SIGNIFICANT USER-VISIBLE CHANGES
- Move to iced 0.5.10 (#433)
- update conda env
BUG FIXES
- Fix small bug in digest_genome.py (#429)
- PR split_reads by kevm2 (#401)
- extract_snps is expected to return phased SNPs (0|1) (#351)
- Add tbb module in conda recipe (#413)
***********************************
version-3.0.0
NEW FEATURES
- vcf file can be zipped in vcf.gz
- Add a Dockerfile for automatic Docker build on DockerHub
- Set up Travis continuous testing
- iced has been removed from HiC-Pro and is now an independent module
- Add an environment.yml file for installation with conda
- HiC-Pro is now based on python3. /!\ Python2 is no longer maintained /!\
SIGNIFICANT USER-VISIBLE CHANGES
- SORT_RAM is not divided by N_CPU for samtools (#369)
- Fix bug in make_viewpoints.py if multiple viewpoints are specified in the bed file
- Add '--float' option for hicpro2higlass to manage normalized Hi-C data
- Update of the Singularity image
***********************************
version-2.11.4
BUG FIXES
- Fix major bug in parallel mode from 2.11.3
***********************************
version-2.11.3
NEW FEATURES
- 'N' is now supported in the digest_genome.py utility
- 'N' is now supported in the ligation motif (for Arima Hi-C kit)
SIGNIFICANT USER-VISIBLE CHANGES
- Update ice usage. Pull request from NVaroquaux (#323)
- Update of the sparse2dense.py utils to include insulation score (Crane et al.) format
- Update of Singularity image
- Update hicpro2juicebox.sh script for memory issue (#246)
- Update hicpro2higlass.sh script (#254)
BUG FIXES
- Check if annotation files are readable (#282)
- If RM_DUP is unset, the duplicates are not filtered (#256)
- Support both .fastq.gz and .fq.gz files (#311)
- Bug in R plots legends (#257, #250, #240)
- Fix small printing bugs in mapped_2hic_fragments.py
- N_CPU for LSF usage (#250) and bowtie2 (#262)
- Update doc (#307, #273)
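The 'N' support listed under NEW FEATURES for this version can be pictured as wildcard matching. The following is a minimal sketch in Python, assuming 'N' stands for any of A, C, G or T. The function names and the example motif are illustrative and are not taken from digest_genome.py.

```python
import re


def motif_to_regex(motif):
    """Compile a motif into a regex, treating 'N' as any of A, C, G, T (assumption)."""
    parts = ["[ACGT]" if base == "N" else re.escape(base) for base in motif.upper()]
    return re.compile("".join(parts))


def find_sites(sequence, motif):
    """Return the 0-based start of every motif match, overlapping ones included."""
    # A lookahead lets the scan report matches that overlap each other.
    lookahead = re.compile("(?=" + motif_to_regex(motif).pattern + ")")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    # Illustrative motif containing one ambiguous base.
    print(find_sites("TTGAATCGGAGTC", "GANTC"))
```

Whether an 'N' in the motif should also match an 'N' in the sequence is a design choice this sketch leaves out.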
***********************************
version-2.11.2
SIGNIFICANT USER-VISIBLE CHANGES
- add SORT_RAM parameter to set the maximum memory per thread used by samtools.
Default is 768M (pull request F. Reinecke)
- update of Singularity file adding MultiQC
BUG FIXES
- fix bug in bowtie2 cpu usage (pull request F. Reinecke)
- Fix bug in build_contact_maps.sh when capture and allele-specific analysis are combined
***********************************
version-2.11.1
SIGNIFICANT USER-VISIBLE CHANGES
- Update ice_mod source organization
BUG FIXES
- Fix bug in mapped_2hic_fragments.py. 'dist' was not defined for filtered pairs (issue #185)
***********************************
version-2.11.0
NEW FEATURES
- Improved capture-Hi-C analysis
- Add MultiQC support for HiC-Pro (issue #144)
- New utils: split_sparse.py allowing to split a genome-wide matrix into per-chromosome sparse matrices
SIGNIFICANT USER-VISIBLE CHANGES
- Filtered pairs based on min/max insert/fragment size are now distinguished from the dumped pairs
- Reads are now reported based on the 5' end and not on their middle position
- _allValidPairs extension is now changed to .allValidPairs
- Default MAPQ filter in the configuration file is now set to 10
- Update log reports
- Stats files per sample are now put in hic_results/stats for clarity reasons
- GET_PROCESS_SAM now generates a bam file, instead of a sam file
- FORMAT variable in the configuration is now deprecated as currently all sequencing data are encoded in phred33
- Only one BED file is now generated by build_matrix (issue #158)
- Update of Singularity image
- Small changes during installation process to deal with --user
- Update layout of quality control plots
- hicpro2juicebox - add temporary folder for sorting, and update juicer path variable (issue #147)
BUG FIXES
- Fix bug when RM_SINGLETON=0
- Fix bug in hicpro2higlass.sh - added option --balance if normalization is specified
- Fix bug in read name when merging SAM files (issue #153)
- Fix inversion between cis/long range contacts in plot_hic_contacts.R
- Fix bug in mapped_2hic_fragments.py (issue #134)
- Fix shebang in make_viewpoints.py
***********************************
version-2.10.0
NEW FEATURES
- New utility - hicpro2higlass.sh to convert HiC-Pro output into higlass .cool files
- HiC-Pro is now available as a Singularity container!
- hicpro2juicebox.sh utility now supports old HiC-Pro format (< 2.7.5)
SIGNIFICANT USER-VISIBLE CHANGES
- N_CPU parameter is now correctly used for the mapping step. R1 and R2 reads are now mapped using N_CPU/2 CPUs
- add simple script for unit testing 'test-op'
- update R scripts to be compatible with the latest ggplot2 version (>2.2.1) and fix graphical bugs in quality controls
- add new checks on input files and configuration files
- Only provide HindIII annotation files for Mouse and Human as examples.
BUG FIXES
- Remove the --user option during iced installation
- R sessions are no longer saved and restored
- hicpro2fithic - bug fix when no -o option specified
- Fix bug to avoid floating values in valid pair positions
- Fix bug in order of samtools sort parameter in bowtie_combine.sh
- Fix bug in output option of makeViewpoints
- fix bugs in plots if no trans interactions were detected
- fix bug in split_reads.py when -o is specified
- fix issue in hicpro2juicebox.sh. The first column was duplicated, and positions are now 1-based
- fix bug in ice_norm.sh when HiC-Pro is run with -s ice_norm option
***********************************
version-2.9.0
NEW FEATURES
- update iced_0.4.2
SIGNIFICANT USER-VISIBLE CHANGES
- Fix issues with floating values in hicpro2fithic.py (F. Ay)
- Update hicpro2fithic.py script. New '-r' option to set up the data resolution (F. Ay)
- samtools >1.0 is now required
- The -s build_contact_maps option now directly creates matrix files from .allValidPairs file. It does not merge the .validPairs anymore. Please use -s merge_persample to merge .validPairs files and remove duplicates.
- Update manual and correct errors
BUG FIXES
- Fix bug for SLURM support
- Fix python path in merge_valid_interaction.sh
- Fix bug in hicpro2juicebox when chromosome names have "_"
- Fix a bug in detection of religation events
- Fix some issues with sort -T. This option is no longer used and is replaced by setting the TMPDIR variable
***********************************
version-2.8.0
NEW FEATURES
- Return bias vector after IC normalization
- New utils hicpro2fithic.py to convert HiC-Pro output into fitHiC input
SIGNIFICANT USER-VISIBLE CHANGES
- Update iced version (-> 0.4.0). This version includes major changes to matrix filtering before iced normalization
- sparseToDense.py utils. Create output file in the current folder by default
BUG FIXES
- Fix bug in the help message of digest_genome.py
***********************************
version-2.7.9
NEW FEATURES
- Update for capture Hi-C. New variable CAPTURE_TARGET allowing to restrict the analysis to the captured region(s)
SIGNIFICANT USER-VISIBLE CHANGES
- Update iced version (-> 0.3.0)
- Exit with an error if the restriction fragment file is set but not found. In the previous version, HiC-Pro automatically switched to DNAse mode.
- Optimisation of python scripts
***********************************
version-2.7.8
NEW FEATURES
- Religation events are now discarded and considered as invalid pairs. Religation events are defined as FR read pairs involving contiguous restriction fragments
SIGNIFICANT USER-VISIBLE CHANGES
- Installation process has been changed! Please read the manual!
- Update of sparseToDense utils. It now provides a --perchr option to generate intrachromosomal dense contact maps
- Clean hicpro2juicebox.sh utils
BUG FIXES
- Update script allowing to check the python version
- Bug fix in R code in case of NA values
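The religation definition given under NEW FEATURES for this version (FR read pairs on contiguous restriction fragments) can be sketched as follows. This is a minimal Python illustration, not the pipeline's own code: the function name, the argument layout, and the use of an integer fragment index per chromosome are all assumptions.

```python
def is_religation(chrom1, pos1, strand1, frag1, chrom2, pos2, strand2, frag2):
    """Return True for an FR pair whose reads fall on adjacent fragments.

    frag1 and frag2 are hypothetical 0-based indices of the restriction
    fragments along the chromosome; strands are "+" or "-".
    """
    if chrom1 != chrom2:
        return False
    # Order the two reads by genomic position so "FR" is well defined.
    first, second = sorted([(pos1, strand1, frag1), (pos2, strand2, frag2)])
    facing = first[1] == "+" and second[1] == "-"
    contiguous = abs(first[2] - second[2]) == 1
    return facing and contiguous


if __name__ == "__main__":
    print(is_religation("chr1", 100, "+", 10, "chr1", 400, "-", 11))
```

Pairs that satisfy this test would be counted as invalid rather than kept as valid interactions.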
***********************************
version-2.7.7
NEW FEATURES
- New utility: sparseToDense.py - Convert a sparse symmetric matrix file into dense format for further analysis
- Pull request from J. Brayet for installation process
SIGNIFICANT USER-VISIBLE CHANGES
- Update of the hicpro2juicebox.sh utils. -g option now requires the chromosome size files from HiC-Pro. This version now works for any organism
- Update of the manual
- Check python module version during installation
- Pull request NV - update iced version. Note that this version of iced automatically rescales the normalized matrix such that the total number of counts is identical to the original matrix
- Add -c option in hicpro2juicebox.sh utils to write the "chr" prefix before the chromosome name. The use of "chr" depends on the reference genome of juicebox
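The rescaling noted above for the updated iced version (normalized matrix brought back to the original total count) amounts to multiplying by a single factor. The following Python sketch uses plain nested lists; the function name is hypothetical, and summing over the full matrix rather than one triangle is an assumption, not a statement about how iced computes it.

```python
def rescale_to_raw_total(raw, normalized):
    """Scale `normalized` so that its total equals the total of `raw`."""
    raw_total = sum(sum(row) for row in raw)
    norm_total = sum(sum(row) for row in normalized)
    if norm_total == 0:
        # Nothing to rescale; return an unchanged copy.
        return [list(row) for row in normalized]
    factor = raw_total / norm_total
    return [[value * factor for value in row] for row in normalized]


if __name__ == "__main__":
    raw = [[4, 2], [2, 2]]
    normalized = [[0.4, 0.1], [0.1, 0.4]]
    rescaled = rescale_to_raw_total(raw, normalized)
    print(sum(sum(row) for row in rescaled))
```

Relative values between bins are unchanged by this step; only the overall scale moves.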
***********************************
version-2.7.6
NEW FEATURES
- New utils: hicpro2juicebox. HiC-Pro results can now be converted into Juicebox input for visualization (A. Barrera, N. Servant)
- New utils: make_viewpoints.py. Generates viewpoints from the list of valid interaction products. Initially developed for capture-C analysis
SIGNIFICANT USER-VISIBLE CHANGES
- validPairs output format was updated for Juicebox compatibility. New columns were added with restriction fragment names and mapping quality
BUG FIXES
- Fix bug in hic_inc.sh for numeric comparison
- Fix bug in bam/sam extension removal
***********************************
version-2.7.4
NEW FEATURES
- Support for LSF scheduler thanks to J. Phillips-Cremins
SIGNIFICANT USER-VISIBLE CHANGES
- Update singleton reporting in pairing mode (pull request N. Varoquaux, F. Ay)
- Additional checks on input files to report more "user-friendly" errors
BUG FIXES
- Fix bugs in singleton and multiple hits reporting
- Fix bug when fastq.gz files are processed without parallel mode but using a cluster
***********************************
version-2.7.3b
BUG FIXES
- Bug fixed if space in the configuration file
- Bug fixed in plot_pairing_portion.R (pull request M. Blum)
- Update samtools sort for compatibility with version >1.1
SIGNIFICANT USER-VISIBLE CHANGES
- Update python access in utils (pull request D. Vanichkina)
- Update iced version (0.2.1 official release)
***********************************
version-2.7.2
NEW FEATURES
- HiC-Pro is now compatible with the HiCPlotter viewer (see https://github.com/kcakdemir/HiCPlotter)
- Add support for the SLURM scheduler thanks to A. D'Ippolito !
- Add support for the SGE scheduler thanks to G. Li !
SIGNIFICANT USER-VISIBLE CHANGES
- be careful - configuration files for installing and running HiC-Pro have been updated to manage multiple schedulers !
***********************************
version-2.7.1
NEW FEATURES
- Add new stepwise option "merge_persample". Might be useful in some cases to merge the different samples and remove
the duplicates if specified in the config files. This step is also run in the "build_contact_map" steps, before generating the contact maps.
- Can now generate raw contact maps at the restriction fragment resolution (build_matrix v1.2). To do so, specify BIN_SIZE=-1.
Note that in practice ICE will be run on these matrices too, although its assumption of equal visibility across bins may require further exploration in the present case.
- New version of the iced package 0.2.1
SIGNIFICANT USER-VISIBLE CHANGES
- Change in the calculation of insert size distribution. The lengths of DNA fragments after ligation are now calculated using the read start.
In the previous version, we used the middle of the read as starting point. The insert size was therefore shifted by a factor equal to the read length
- Change default behavior of split_reads.py utility (output = "./", nreads=20e6)
- ALLELE_SPECIFIC_SNP file will be detected using the full absolute path or in the annotation folder
- R1 and R2 reads are now ordered by genomic position in the validPairs files. This might have an impact on the duplicates level
- BIN_STEP variable from the configuration file is now deprecated
- Update some graphical labels
BUG FIXES
- Bug fix in C++ compilation. Add sys/stat.h inclusion
- Fix bug in ice normalization. FILTER_HIGH_COUNT_PERC was not considered in ice_norm.sh
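The insert size change described for this version (read start instead of read middle) can be checked with a little arithmetic. The Python sketch below uses a simplified model in which each read points toward its ligation junction and the insert size is the sum of the two read-to-junction distances. The function names and coordinates are illustrative assumptions, not the pipeline's code.

```python
def distance_to_junction(read_start, read_length, site_pos, use_middle=False):
    """Distance from the read anchor (start or middle) to the junction site."""
    anchor = read_start + read_length // 2 if use_middle else read_start
    return site_pos - anchor


def insert_size(mate1, mate2, read_length, use_middle=False):
    """Sum both mates' distances; each mate is a (read_start, site_pos) tuple."""
    return sum(
        distance_to_junction(start, read_length, site, use_middle)
        for start, site in (mate1, mate2)
    )


if __name__ == "__main__":
    mate1, mate2, read_length = (1000, 1200), (5000, 5150), 50
    from_start = insert_size(mate1, mate2, read_length)
    from_middle = insert_size(mate1, mate2, read_length, use_middle=True)
    print(from_start, from_middle, from_start - from_middle)
```

In this model each mate contributes half a read length to the difference, so the two methods differ by one full read length.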
***********************************
version-2.7.0
NEW FEATURES
- New parameters - FILTER_HIGH_COUNT_PERC / FILTER_LOW_COUNT_PERC to filter out extreme count before normalization.
Note that the parameter SPARSE_FILTERING is now deprecated and replaced by FILTER_LOW_COUNT_PERC
- New utility: digest_genome.py, which takes a fasta file and the name(s) or sequence(s) of the restriction enzyme(s) in order to generate the list of restriction fragments after genome digestion.
- HiC-Pro is now able to process data generated from protocols without restriction enzyme such as DNase Hi-C. See the manual for more information
SIGNIFICANT USER-VISIBLE CHANGES
- The mapping step2 is now optional and will be run only if the ligation site is specified
- New parameter MIN_CIS_DIST in order to discard all contacts below the specified distance. This is mainly used for DNase Hi-C to remove artefacts which are likely to be self-ligation products.
BUG FIXES
- Fix bug in plot scale
***********************************
version-2.6.0
NEW FEATURES
- Allele specific version of HiC-Pro
- New utility to build the VCF file for allele specific analysis
- New manual version
- Manual is now online at http://nservant.github.io/HiC-Pro/
SIGNIFICANT USER-VISIBLE CHANGES
- New configuration file with ALLELE_SPECIFIC_SNP field. If specified HiC-Pro is run in allele specific mode.
- Improve stepwise analysis. Check input files according to the specified step
- The split_reads.py utility can now handle fastq.gz files
BUG FIXES
- Fix bug in build_raw_maps.sh in --chrsize (reported by D. Robelin)
- Fix bugs in CIS short/long ranges interaction reporting
- Fix bugs in bowtie pairing - option -m and -s
- Fix bug in mapped_2hic_fragments when singletons are not removed during the pairing
***********************************
version-2.5.2
NEW FEATURES
- New quality control plots, i.e. duplicates level, short/long range interactions, fragment size
SIGNIFICANT USER-VISIBLE CHANGES
- HiC-Pro can now be run from BAM aligned files
- Stepwise analysis can now be run in addition to the parallel mode
- Additional checks when running the pipeline
- Quality control plots are now run after each step instead of once at the end of the pipeline
BUG FIXES
- Bug fixed when HiC-Pro is run with relative path
- Bug fixed in mergeSAM.py in case of duplicated reference in the SAM header
- Check PREFIX installation variable during installation process
***********************************
version-2.5.1
SIGNIFICANT USER-VISIBLE CHANGES
- samtools 0.1.19 or higher is required
- Any characters after "\" in read names are now discarded so that myread\1 == myread\2
- A warning is printed during the restriction fragment assignment if a chromosome is not defined in the annotation file
BUG FIXES
- Add python path in ice_norm.sh
- Bug fixed in iced genomic intervals symlink
- Add missing ${PYTHON_PATH} variable in scripts
- Add pysam check during installation
- Bug in install process
- Bug in python PATH
***********************************
version-2.5.0
SIGNIFICANT USER-VISIBLE CHANGES
- MergeSAM.pl is now in python based on pysam library
- All SAM outputs are converted into BAM files
- Check dependencies version number during installation
***********************************
version-2.4.2
NEW FEATURES
- Add HiC-Pro utilities - split_reads
***********************************
version-2.4.1
SIGNIFICANT USER-VISIBLE CHANGES
- Add PREFIX variable in installation configuration to set the installation directory
- Improve logs reporting and organization
- Remove SAM files by default when running the complete workflow
BUG FIXES
- Bug fixed in rmdup
***********************************
version-2.4.0
BUG FIXES
- Bug fixed if input path is not absolute
NEW FEATURES
- Support both short and long options
SIGNIFICANT USER-VISIBLE CHANGES
- ORGANISM option in config file replaced by REFERENCE_GENOME
- CUT_SITE_5OVER option in config file replaced by LIGATION_SITE
- GENOME_SIZE and GENOME_FRAGMENT files are detected as they are specified or from the annotation folder
***********************************
version-2.3.0
NEW FEATURES
- Update error and log reporting
- Change scripts folder
- Automatic installation process, using make install
- HiC-Pro can now be run using the bin/HiC-Pro bash script
- Add licence in all scripts
- Bug fixed in build_map - new option matrix_format asis/lower/upper/complete
- New version of ICE normalization
***********************************
version-2.2.0
NEW FEATURES
- Add ICE normalization
- Change mapping strategy. Local mapping is replaced by a global approach on trimmed reads
***********************************
version-2.1.0
NEW FEATURES
- Manage .fastq.gz files
- New pictures for mapping, pairing and valid pairs results
- Check cutSite for local mapped reads (mergeSAM.pl)
- New output option for overlapMapped2HiCFragment.py --samOutput
- Change read position used for fragment overlap (middle instead of 5' end to avoid cutting site sequencing)
- Remove file conversion sam to .aln
- Additional filters on MAPQ (mergeSAM.pl)
- Additional check to discard invalid reads pair (mergeSAM.pl)
***********************************
version-2.0.0
NEW FEATURES
- Major release by E. Viara
- New parallelized version of the pipeline. The workflow is parallelized by read pairs from the alignment, to the list of valid interactions
- New version of build_matrix tool written in C++
***********************************
version-1.0.0
NEW FEATURES
- First version of the pipeline