Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Periodic polymer #1170

Merged
merged 20 commits into from
Apr 16, 2024
Merged

Periodic polymer #1170

merged 20 commits into from
Apr 16, 2024

Conversation

jaclark5
Copy link
Contributor

@jaclark5 jaclark5 commented Mar 8, 2024

PR Summary:

Resolves #1171

PR Checklist


  • Includes appropriate unit test(s)
  • Appropriate docstring(s) are added/updated
  • Code is (approximately) PEP8 compliant
  • Issue(s) raised/addressed?

@jaclark5 jaclark5 requested a review from chrisjonesBSU March 8, 2024 15:09
Copy link

codecov bot commented Mar 8, 2024

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 87.35%. Comparing base (2bb047e) to head (e5a1364).
Report is 2 commits behind head on main.

❗ Current head e5a1364 differs from pull request most recent head a36afc7. Consider uploading reports for the commit a36afc7 to get more accurate results

Files Patch % Lines
mbuild/lib/recipes/polymer.py 92.85% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1170      +/-   ##
==========================================
+ Coverage   87.32%   87.35%   +0.02%     
==========================================
  Files          62       62              
  Lines        6525     6540      +15     
==========================================
+ Hits         5698     5713      +15     
  Misses        827      827              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@chrisjonesBSU
Copy link
Contributor

chrisjonesBSU commented Mar 11, 2024

@jaclark5 Funnily enough, just last week a co-worker and I were discussing how we could do this to help with a problem we are facing, so thank you for getting this started!

I'm reading through the lines 202 to 221 where the error is raised. It seems like the goal here is to align the chain backbone along the z-axis so that this periodic bond is roughly <0, 0, 1>, and the periodicity of the compound is [False, False, True], is that correct?

This seems to work great with the CH2() class since the atoms are conveniently set up perfectly, but issues come up with monomers that aren't as curated.

For example, if I try using SMILES to create a polyethylene chain:

import mbuild as mb
from mbuild.lib.recipes import Polymer

mon = mb.load("CC", smiles=True)
poly = Polymer()
poly.add_monomer(compound=mon, indices=[2, 6], separation=0.15)
poly.build(n=10, add_hydrogens=False, make_periodic=True)

then this line:

212             self.periodic = list(
213                 ~np.isclose(
214                     head_port.pos - tail_port.pos, 0.0, np.finfo(float).eps
215                 )
216             )

returns [True, True, True] and the error is raised.

If z-axis alignment is going to be enforced, maybe we should call the z_axis_transform method first (which is in mbuild.coordinate_transform):

        if make_periodic:
            if np.any([x is not None for x in self._end_groups]):
                raise ValueError(
                    "Keyword 'make_periodic' cannot be defined if 'end_groups' are provided."
                )
            z_axis_transform(
                    compound=self,
                    point_on_z_axis=head_port.pos,
                    point_on_zx_plane=tail_port.pos
            )

With the polyethylene example above,

head_port.pos - tail_port.pos is array([-2.03379188, -1.14121417, 0.8149497 ]) without calling the z_axis_transform but if we do call it, then it becomes array([ 3.83137532e-13, -5.59901469e-05, 2.47038908e+00]) which is a lot closer to [0, 0, 1] if we allow for some tolerance.

So, for the sake of being more generally usable (especially with monomers created from SMILES), I think we should 1) call the z_axis_transform method under the if statement first and 2) add a parameter that allows one to set atol in the np.isclose method, maybe we add it as a prameter to build() call it something like z_axis_tol

Let me know what you think, or if this causes any other issues that I'm missing.

@jaclark5
Copy link
Contributor Author

jaclark5 commented Mar 11, 2024

Hey @chrisjonesBSU that sounds good to me! Do you want to add that as a contributor to this PR, or I can add it in.

I think this change will provide a more robust function, but I don't think that's the overall problem. I've been using this with a LJ system which is easy to align perfectly, but it's throwing an error because the x and y coordinates aren't perfect, despite being within 0.001 Å. I could add a tolerance keyword argument, but I'll just make it relative to the longest bond length in the system.

Copy link
Contributor

@CalCraven CalCraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few code quality changes to make it more readable. As an overall change, I think this should live as a separate method for the Polymer recipe, as opposed to an argument to the build function. e.g.
Current use case:

polymer.build(n=3, make_periodic=True)

New use case

polymer.build(n=3, add_hydrogens=False)
polymer.align_periodic_bond(axis="z")
"""This function could align on the x,y, or z dimension using coordinate transforms. 
Only functions if `self.head_groups is [None, None]`
Could check polymer.box to see if bond is periodic or non-periodic and raise Errors/Warnings
"""

May also need to adjust some port removal at the end of the build stage, since it seems like it wipes all ports. Either that, or this align_periodic_bond method takes the particles to create the bonds between as the argument, and if it isn't passed just takes the first and last children in the list.

mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
)
]
)
self.periodic = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should be making an extra compound attribute here. Compound.periodicity already exists, which is a tuple of size three Booleans. Should we use that?

I also have issue with modifying this attribute as a whole here. What if you've already set the polymer peridocity for some reason? This would overwrite and could be missed.

polymer = mb.lib.recipes.Polymer()
polymer.periodicity = (False, True, False) # periodic in Y dimension
polymer.add_monomer(some_monomer)
polymer.build(n=3, make_periodic=True)
print(polymer.periodicity) # == (False, False, True)

If we do keep it as self.periodic, then that information will not be converted anywhere, which would be confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could use periodicity, as it is written here, that may have been my intention when I first wrote it. I could also make it an internal variable for the error check so that it's not exposed.

Until the polymer is built, I'm not sure how the periodicity could be set, so this would take care of that for you. I will admit that I haven't used or set the periodicity when using mbuild since I usually make a large molecule, add a box around it, and solvate it with packmol.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's fair. Periodicity is typically used when doing the simulation side of things anyways, so most of the time it's set in the control files (which is external to MoSDeF). In mBuild, it's mostly used when tiling out compounds, you can tile them out based on their periodicity. However, it is an attribute of a compound that recipes and other building methods can take advantage of. So it's possible to have some use case that would build a polymer, and then do something with that periodicity, which would be overwritten here. That's why I'm a little hesitant to have it modified here, so I agree an internal variable may be the way to go.

mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
@jaclark5
Copy link
Contributor Author

jaclark5 commented Mar 14, 2024

I think it should be a method, but that it should still be a keyword argument in build() for two reasons:

  1. All current methods of the Polymer recipe are also inputs for build
  2. With add_hydrogens and end_groups as kwargs in build() there is a precedent for including capping options in that method, and honestly as a user I didn't know about the other methods in Polymer until recently because all I needed was build(), so why remove another capping option away from a user's fingertips?

@CalCraven
Copy link
Contributor

I think it should be a method, but that it should still be a keyword argument in build() for two reasons:

  1. All current methods of the Polymer recipe are also inputs for build
    I'm not sure about this. Methods that are not arguments to build: add_monomer, add_end_groups are the two main methods to the class, which are not part of the build method. The build arguments are build(self, n, sequence="A", add_hydrogens=True), none of which are methods in the class.
  1. With add_hydrogens and end_groups as kwargs in build() there is a precedent for including capping options in that method, and honestly as a user I didn't know about the other methods in Polymer until recently because all I needed was build(), so why remove another capping option away from a user's fingertips?
    Hmmm, I think this is an interesting argument. I guess I'm less worried about precedent as a whole, and more worried about functionality for the code. The things I'm considering are:
  • How often used are the two functions in conjunction as opposed to separately?

While the wrapping is useful and cool, I would say at least 90% or greater of the time, people are just using these functions to bond monomers together.

  • How independently can the two functions be implemented?

Should be pretty easy, there's nothing core about one function that needs to influence the other.

  • What are the purposes of the sections of code?

build(): Just to recursively bond together monomer groups into your pattern and replicates
periodic_wrap(): Take the end groups and bond them across a periodic boundary while aligning the group

  • What is the amount of information we're providing the user

I think the build function is pretty clear and intuitive.
I think the periodic wrap is a bit of extra complexity. Let's move the whole molecule with some complex transformation. Let's add this bond to make a periodic molecule in one dimension that needs to be handled carefully when setting the simulation box.

I think with these answers in mind, and the Zen of python echoing that explicit is better than implicit, this should be a separate method to call explicitly.

@chrisjonesBSU
Copy link
Contributor

chrisjonesBSU commented Mar 18, 2024

I am learning towards agreeing with @jaclark5.

First, the periodic bond doesn't necessarily need to be handled carefully. The bonds and angles being off initially is fixed easily by a quick energy minimization (e.g. short simulation after finishing mbuild -> gmso -> simulation engine). But yeah, I also agree that the self.periodicity attribute doesn't need to be set or changed here.

Second, the information needed to form this bond is already conveniently available during the build step and is lost after building up the polymer is complete. If I create a 20mer from a monomer, I don't know what 2 particles (or ports) to pass into force_overlap after the fact without some extra work.

Also, I agree that this process isn't that different in principle than 1) Placing hydrogen atoms back onto the head and tail monomers out of convenience and 2) adding end-groups monomers automatically after building up the polymer.

@chrisjonesBSU
Copy link
Contributor

chrisjonesBSU commented Mar 18, 2024

Ahh, I see you addressed the second point in an earlier comment @CalCraven

May also need to adjust some port removal at the end of the build stage, since it seems like it wipes all ports. Either that, or this align_periodic_bond method takes the particles to create the bonds between as the argument, and if it isn't passed just takes the first and last children in the list.

Either way, I think that we should make it easier to automatically bond the head and tail monomers. We shouldn't have more indices/particles/ports parameters that need to be manually defined later. Putting this in build seems fine IMO.

If we are going to have a separate function to both orient a polymer chain along an axis and form a head-tail-bond, then maybe the head and tail ports should be stored as properties or attributes of the Polymer class? I think keeping these around and making them more accessible would be beneficial. Having these as attributes would also make using the coordiante_transform methods easier. Aligning long polymer chains along a specific direction can be difficult without choosing the right particles, and is useful even in cases where you aren't needing to form a head-tail bond.

@CalCraven
Copy link
Contributor

Either way, I think that we should make it easier to automatically bond the head and tail monomers. We shouldn't have more indices/particles/ports parameters that need to be manually defined later. Putting this in build seems fine IMO.

Agreed, the indices stuff can get confusing since we don't provide a great way to visualize which particles have which indices other than print with enumeration.

If we are going to have a separate function to both orient a polymer chain along an axis and form a head-tail-bond, then maybe the head and tail ports should be stored as properties or attributes of the Polymer class? I think keeping these around and making them more accessible would be beneficial. Having these as attributes would also make using the coordiante_transform methods easier. Aligning long polymer chains along a specific direction can be difficult without choosing the right particles, and is useful even in cases where you aren't needing to form a head-tail bond.

I think that's a great idea, that would be a useful feature to have for other Polymer based methods in general, such as some sort of crosslinking algorithm. I'm also thinking if you wanted to make a polymer with more ports such that you could form other bonds later, just bllindly removing all the ports maybe should be adjusted.

@CalCraven
Copy link
Contributor

I am learning towards agreeing with @jaclark5.

First, the periodic bond doesn't necessarily need to be handled carefully. The bonds and angles being off initially is fixed easily by a quick energy minimization (e.g. short simulation after finishing mbuild -> gmso -> simulation engine). But yeah, I also agree that the self.periodicity attribute doesn't need to be set or changed here.

Second, the information needed to form this bond is already conveniently available during the build step and is lost after building up the polymer is complete. If I create a 20mer from a monomer, I don't know what 2 particles (or ports) to pass into force_overlap after the fact without some extra work.

Also, I agree that this process isn't that different in principle than 1) Placing hydrogen atoms back onto the head and tail monomers out of convenience and 2) adding end-groups monomers automatically after building up the polymer.

Yeah these are good points. I am surprised that the energy minimization works at all with a molecule with this high of energy. I think that's what I'm most worried about, adding a few hydrogens should work in almost all cases. However, creating this bond seems to me like it would cause lots of issues in a simulation workflow if you didn't really know what you were doing.

Copy link
Contributor

@chrisjonesBSU chrisjonesBSU left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jaclark5 based on the discussion yesterday, I think we should go in the direction of having head and tail port properties, and using them where needed. Since a periodic bond will always be between head and tail ports, we can remove the 2 port parameters from create_periodic_bond.

I tried to push all of these changes, but I don't think this PR selected the option to allow edits by maintainers.

mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
mbuild/lib/recipes/polymer.py Outdated Show resolved Hide resolved
@chrisjonesBSU
Copy link
Contributor

If we go in the direction of having properties for head and tail ports, then I think the workflow of having create_periodic_bond() outside of build() becomes more approachable:

polymer = Polymer(...)
polymer.build(n=5)
polymer.create_periodic_bond(axis="z")

as opposed to

polymer = Polymer(...)
polymer.build(n=5, periodic_axis="z")

One thing that could come up is if add_hydrogens=True is set in build and create_periodic_bond is called later.

I think there are multiple ways to handle this. We could throw and error in create_periodic_bond() and the user has to re-call the build() with add_hydrogens=False. Or, even if the head and tail monomers were capped with hydrogens, I think we can find them easily enough and remove them in create_periodic_bond() using self.head_port.anchor.

Let us know what you think @jaclark5

@CalCraven
Copy link
Contributor

Agreed, I think you could still add the bond even if the polymer is capped with hydrogens. I think the way to do that would be to add an argument remove_caps=True. If this is true, make sure to remove a hydrogen from the anchors of the head_port and tail_port.

head = head_port.anchor
bonded_atoms = list(nx.neighbors(self.root.bond_graph, head))
cappedBool = False
for atom in bonded_atoms:
    if atom.name == "H":
        self.remove(atom)
        raise UserWarning(f"Removing {atom} because remove_caps was set to True")
        cappedBool = True
        break
if not cappedBool and remove_caps:
     raise ValueError(f"Could not find Hydrogen on {head =} or {tail =} to remove. Consider setting remove_caps=False")

@CalCraven
Copy link
Contributor

Code coverage should pass if you just add a loop to check the axis arguments y and z to go along with the x test that is already being performed.

@chrisjonesBSU
Copy link
Contributor

I think this is ready to merge. It could use a couple more tests in test_polymers. For example, ensure an error is raised when someone builds with end groups or add_hydrogens = True and then calls create_periodic_bond, and like Cal suggested, try passing in all of x, y z to create_periodic_bond

Thank you @jaclark5, this is a really helpful addition!

@CalCraven
Copy link
Contributor

Once this is updated to the main branch, we can merge!

Copy link
Contributor

@CalCraven CalCraven left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@daico007 daico007 merged commit 5e302f2 into mosdef-hub:main Apr 16, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ability for Polymer Builder to connect ends for a loop
4 participants