Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get alignment positions #39

Open
FlorianBrnrd opened this issue Jun 6, 2019 · 4 comments
Open

get alignment positions #39

FlorianBrnrd opened this issue Jun 6, 2019 · 4 comments
Labels

Comments

@FlorianBrnrd
Copy link

FlorianBrnrd commented Jun 6, 2019

Hi,
I'm performing semi-global alignments and I was wondering if there's any way to directly get the position of the beginning and end of the alignment on the reference ? Or do I have to use the cigar string to get the information ?
Thanks a lot for your help.
Florian.

@jeffdaily
Copy link
Owner

jeffdaily commented Jun 6, 2019

The end positions of the alignment for both query and reference is readily available using the Result properties end_query and end_ref because those are calculated during the alignment step. You would need to use the Cigar object to get the beginning positions beg_query and beg_ref because they are calculated during the traceback.

@FlorianBrnrd
Copy link
Author

Thank you for the very quick answer ! I managed to get what I wanted thanks to your help, however I seem to have a encountered a strange behavior while looking at the start positions of both the query and reference.

I used this bit of code:

aln = pair_align(SEQ, ref)
cigar = aln.cigar
print(aln.traceback.ref)
print(aln.traceback.comp)
print(aln.traceback.query)
print('\n')
print('query end = {}'.format(aln.end_query))
print('ref end = {}'.format(aln.end_ref))
print('query start = {}'.format(cigar.beg_query))
print('ref start = {}'.format(cigar.beg_ref)) 

And here's what it returns:

Capture d’écran 2019-06-06 à 19 36 57

I wonder why it it returns 0 for the beginning of the reference alignment when I can clearly see the alignment starts around 10 ? Or did I misunderstood something ?

Thank you.

@mustafa0x
Copy link

I have this same question — cigar.beg_query and cigar.beg_ref are 0.

@cavelandiah
Copy link

cavelandiah commented Jun 27, 2024

Still when using cigar.beg_ref is 0. Here is a partial solution:

import re
import parasail

# Perform the sequence Align
result = parasail.sw_trace_striped_64(read, ref, 3, 2, parasail.dnafull)
# Decode CIGAR using UTF-8
new_cigar = result.cigar.decode.decode('utf-8')
# 'D' since is a deletion when compared to reference
no_align_pattern = r'(\d+)D'
# Search pattern if exists
match = re.search(no_align_pattern, new_cigar)
# If a match is found, extract the number of 'deleted' positions and convert to integer
if match:
    no_align_positions = match.group(1)
    new_start = int(no_align_positions)
else:
   new_start = 0

I would like to have some updates in that regard. Many thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants