-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Duplicate defaults in atomate #268
Comments
This by design - putting the settings in atomate leaves any changes to the default settings in atomate's control, not pymatgen's control. e.g., if a user updates pymatgen (e.g., to get some new structure analysis feature, but which comes bundled with a change in pymatgen default settings) but leaves their atomate version alone, should their workflows change? It seems to me a user should need to explicitly update atomate in order to see new default settings. |
I think that sounds reasonable if we kept up in updating atomate. The issue is that we don't update the workflows that often since a lot of them are written by postdocs who come in for a short period of time, write a workflow, and then leave. We don't want to end up with a situation where theres tons of stale technical debt simply because defaults in atomate lag behind pymatgen which a lot more people use even for running VASP. |
Seconding that this is causing confusion for new atomate users. Ideally there should only be one source of truth. It also seems like an easy way for a bug to creep in if the pymatgen input sets are updated to fix an issue but atomate then overrides a particular setting. For example, it's not clear to me why atomate has changed the default reciprocal density for MPNonSCFSet in NonSCFFW from 100 to 1000, if the value in pymatgen is too low, why not change it there? |
If you guys are able to:
then I'm happy to move forward with putting everything in pymatgen. Note that our DOS is already pretty bad as per many discussions. I'm more concerned about atomate still being too low than I am about pymatgen being closer to the a good value. |
I would say that where atomate overlaps pymatgen, e.g., specifying parameters for MP-style calculations, these should be in pymatgen and atomate should avoid duplication. For other cases, there is no absolute truth, since it all depends what user requirements. I am fine if atomate has a different workflow called CapAmsRecipeForUltraAccurateDOS that is very different from MP defaults. But users who just want to perform a single calculation for comparison with MP should be able to do so without having to resort to atomate workflows. |
I agree with @shyuep. Further, I think that atomate calculations should be entirely driven by PMG input sets (or any object that implements the API). To me, it seems rare that a user would actually want to pass different reciprocal densities to something like We have been using completely different input sets more tuned for metals and we basically had to rewrite several Firetasks, e.g. |
@bocklund |
@shyuep Yes, it should also be easy to start from the MPRelaxSet, make a minor modification and give that to atomate without creating an entirely new class. In my experience, it's not easy to make modifications to the PMG input sets without creating a new class, but that's probably more a PMG issue rather than atomate. |
As an aside, the current way custom VASP input sets are handled seems dangerous also, e.g. if you have a Broadly speaking, I agree with all above + of course we are (collectively) MP, so it doesn't make sense that we've ended up in this situation where vanilla MPSet calculations have different defaults between atomate and pymatgen. I'm reminded of Conway's Law here :) |
Ok I think this is a good discussion but I am not sure we are going to solve this problem over this thread, especially given that there are multiple concerns in the air now. How about I suggest the following:
|
I will just have one final comment - I am confused what @bocklund means by saying that it is not easy to make a change to PMG input sets. The user_incar_settings, user_kpoints_settings, etc. all exist for that purpose. @mkhorton The reason why it is dangerous is because the input set is supplied as an object, not a class/type or interpretable string. That is why the input set overrides structure in the atomate workflows. As for deleting parameters, we can easily add a feature that None => unsetting a parameter. E.g., a I will of course not be able to attend the meeting. Deputy Dictator @mkhorton can speak on pymatgen's behalf. |
I think we can continue the discussion here until the meeting actually happens
|
I have addressed some of these challenges in dfttk where we have rewritten parts of atomate that don't play well with custom input sets, such as the wrong structure problem for instances of InputSets that @mkhorton pointed out (though that "solution" is more of a hack). IMO, a core part of this issue comes from how the I'm happy to participate on any calls and offer my perspective. In my ideal world, this can reduce the amount of reimplemented code in dfttk, but I understand the difference in priorities. @shyuep TLDR: user_XYZ_settings works, but it's a pain if you want to make the same modifications every time. You can imagine that ultimately a user will develop custom settings are not one-offs that they want to apply to every run of atomate (e.g. I always want ENCUT to be 400 eV). Then you either have to apply user_XYZ_settings all over the place or automate it in some way. Automating it, you end up either
|
To give a real example of how this could be dangerous - the init() of the DictSet might sort and reduce the structure. If the structure is updated from the time an input set is created, there is no quick and easy way to make sure these sortings / reductions are properly updated without re-initializing the input set again. I can't just take the input set as given by the user and use it |
So it sounds like some of the issues may be when the InputSet is being constructed. Currently, it happens twice in a lot of Fireworks, once on the construction of the Firework, and once on running the Firetask. Deciding when that should happen should make this a bit clearer. This could also depend on whats given resulting in a different firetask. |
@bocklund This is a fundamental philosophical difference. The raison d'etre for having an input set is to standardize. By definition, that means modifications should be rare and exceptional. So I don't see a problem with a user writing a simple:
To claim this is as complex as subclassing DictSet is inaccurate. As for sorting/reducing structures, the input set is by definition a manipulation of the structure to generate a set of inputs. Most of the time, the structure processing is done in accordance to some predefined rules, e.g., conventional/primitive standardization. By default, the only thing that DictSet does is to sort the structure, which is what you'd want in most cases to avoid having atoms of different species spread all around the input, with POSCAR species like |
The None to unset incar setting has been implemented. materialsproject/pymatgen@111f44f |
I have continued to give some thought on what it means (to me) to standardize. Keep in mind I am coming at this from outside the core MP group -- I have finite computer time, different projects with different accuracy needs, and I need to automate no more than a few hundred to a few thousand calculations for each. I see atomate as a way to communicate with the community and develop consensus on what the standard is for a type of calculation (e.g. a phonon calculation) in a code-agnostic fashion. To this extent, we should focus on implementing methods and let users determine the accuracy. We can develop a guiding principle for calculation Fireworks that can be used to check a PR or proposed change against this principle. I propose the following: In more words: Each Firework has some core task where a calculation setting is critical to that task. For example, a static calculation should always have What are some concrete examples?
I think we can also use this as a guiding principle for designing a calculation superclass, where one of the main common features of the superclass is to provide an inheritable set of arguments to control calculation accuracy. |
@bocklund Isn't that what the InputSets are for? If we get the defaults out of Atomate, then as long you define an inputSet you get your settings propagated through. |
Right now there's not a 1:1 map between input sets and calculations (see DFPTFW, workflows that use |
@bocklund I would modify a few things with regards to principles:
|
@bocklund I have a quick question. It is clear that ISMEAR = -5 is good for accurate energies in metals (no relaxation). What about forces (no relaxation)? Is ISMEAR=1 still preferred or is ISMEAR=-5 fine? |
@shyuep In my experience, ISMEAR=-5 is not good enough for accurate forces. IMO, if you don't know beforehand whether you have a metal or non-metal, you can't do one calculation for both accurate energy and accurate forces. You need to pick one or the other:
If I know a priori whether I have a metal or a not, I can specialize the forces to either use ISMEAR=-5 or ISMEAR=1 for non-metals/metals, respectively. |
@bocklund What about non-metals that are metals in DFT? i.e., should have small gap but ends up having zero gap in DFT. |
If DFT predicts occupied states at the Fermi energy, then the settings should reflect the behavior of that calculated electronic structure, even if we know that it's not metallic in reality |
@bocklund Ok, thanks. I implemented a warning for obvious situations. materialsproject/pymatgen@8c88211 |
Currently atomate has a lot of default values for parameters in the input sets that are already defined in pymatgen. It would be good to remove those from atomate so that default for any calculation type fully defined in pymatgen reside there. This makes understanding where settings are coming from a bit more transparent.
The text was updated successfully, but these errors were encountered: