Skip to content

KyleStiers/mmtf-fragment-search

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

24 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MMTF Fragment Search

This program is meant to enable comprehensive querying of the PDB utilizing a fragmentation of every structure (and each chain) looking for matches based on backbone geometry - initially independent of all other properties. Under the hood the Java Apache Spark API is used in conjunction with the mmtf-spark project.

Conceptually, this allows for a strictly spatial search of fragments that utilize similar geometry to accomplish important chemistry.

Window (fragment) size can be varied, and the only required input is a PDB file of the query fragment. The query fragment should be identical in length to the window size chosen.

Example Output Example output of top 1000 hits

Planned improvements:

  • Improve scoring metric and add statistics that guide a threshold or cutoff
  • Output results in more parsable format (CSV)
  • Make executable build
  • Add sequence motif filter (regex for fragments)
  • Add filter to only let a PDB ID occur once (currently implemented in python PML generating script)

Potential future improvements:

  • GUI
  • Automated checking of pre-fragmented HADOOP sequence files
  • Automated updating of HADOOP sequence files with additions/changes to PDB
  • Inline creation of PML script to open results in PyMOL (currently implemented with a separate python script)

Many thanks to @lafita for all his work on this project and the UCSD actively developing mmtf-spark. This software is in development and as such is provided as is, with no guarantees.

About

Search similar fragments across the PDB using MMTF

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 100.0%