Skip to content

Lattice dynamics workflow using Pheasy#1063

Open
hrushikesh-s wants to merge 182 commits intomaterialsproject:mainfrom
leslie-zheng:atomate2_jz_pheasy_anharmonic
Open

Lattice dynamics workflow using Pheasy#1063
hrushikesh-s wants to merge 182 commits intomaterialsproject:mainfrom
leslie-zheng:atomate2_jz_pheasy_anharmonic

Conversation

@hrushikesh-s
Copy link
Copy Markdown
Collaborator

@hrushikesh-s hrushikesh-s commented Nov 22, 2024

Summary

Include a summary of major changes in bullet points:

  • Feature 1 -- Add lattice dynamics workflow using pheasy for harmonic & anharmonic phonon calculations

Additional dependencies introduced (if any)

  • pheasy – used to generate symmetry-aware cluster spaces, construct null-space constraints, build displacement/sensing matrices, and fit harmonic force constants from displacement–force data; also used for optional higher-order IFC fitting, phonon renormalization, and force-constant conversion workflows.
  • alamode (alm) – used to estimate the number of irreducible force-constant parameters and thereby determine how many displaced supercells are needed for harmonic and anharmonic IFC extraction; also used to estimate higher-order anharmonic parameter counts.
  • hiphive – used as a post-processing and recovery backend for force constants: it enforces rotational sum rules on second-order IFCs when imaginary modes are present, reconstructs corrected force constants, and is also used to convert third-order force constants to phono3py format for thermal-conductivity workflows.

Before a pull request can be merged, the following items must be checked:

  • Code is in the standard Python style.
    The easiest way to handle this is to run the following in the correct sequence on
    your local machine. Start with running ruff and ruff format on your new code. This will
    automatically reformat your code to PEP8 conventions and fix many linting issues.
  • Doc strings have been added in the Numpy docstring format.
    Run ruff on your code.
  • Type annotations are highly encouraged. Run mypy to
    type check your code.
  • Tests have been added for any new functionality or bug fixes.
  • All linting and tests pass.

hrushikesh-s and others added 4 commits March 26, 2026 23:15
- Update thermo reference values for matgl-backed phonon workflows
- Enable thermal displacement generation in pheasy test (create_thermal_displacements=True)
- Relax force_constants type check (tuple --> list | tuple)
- Update total_dft_energy expectation (per-atom normalization)
- Fix CI dependency stack for torch-limited forcefields
@hrushikesh-s
Copy link
Copy Markdown
Collaborator Author

Hi @JaGeo, does this look okay?

@JaGeo
Copy link
Copy Markdown
Member

JaGeo commented Mar 27, 2026

@hrushikesh-s thanks!

I think one point we need to discuss is the deprecation of the phonon related schema. Our codes actually depend on the old schema and i think we should handle this carefully.

@hrushikesh-s
Copy link
Copy Markdown
Collaborator Author

Hi @JaGeo , Thanks for bringing this up, that makes a lot of sense.

I agree that switching directly to the Emmet models could be disruptive since earlier production codes may still rely on the current atomate2 schema. Would it be reasonable to handle this in a phased way?

What I had in mind is:

  • In this PR, we transition to using the Emmet models internally
  • Keep the existing atomate2.common.schemas.phonons import path for backward compatibility
  • Preserve any atomate2-specific helper methods so existing workflows don’t break
  • Add a DeprecationWarning with a clear timeline

This way we can move toward the new schema without introducing a breaking change immediately. Does that sound like a reasonable approach to you? If yes, then I can make these edits to the current PR.

@JaGeo
Copy link
Copy Markdown
Member

JaGeo commented Mar 27, 2026

I do recall a discussion involving @esoteric-ephemera that we wanted to potentially keep both options.
I am in principle okay with a long-term deprecation. I would need to check the emmet models in detail to judge if they would still fully serve our purposes.

This commit restores support for the legacy atomate2 phonon document schemas while introducing optional support for the newer Emmet schemas across phonon-related workflows.

Deprecation decision:
No deprecation is introduced in this commit
Both schemas are supported going forward
Deprecation of legacy atomate2 phonon schemas is deferred and will be discussed separately with a proper timeline
@esoteric-ephemera
Copy link
Copy Markdown
Collaborator

I don't remember off hand how much the two schemas differ, can you point me to the fields that are needed in the legacy schema that are missing in the new emmet one @hrushikesh-s?

@hrushikesh-s
Copy link
Copy Markdown
Collaborator Author

Hi @esoteric-ephemera ,

I compared the legacy atomate2 phonon schema against the new emmet one. The main differences are:

  1. The new emmet schema replaces the old PhononBSDOSDoc with a PhononBSDOSTask + built PhononBSDOSDoc split.

  2. phonon_bandstructure and phonon_dos are no longer stored as raw pymatgen objects; they use emmet wrapper schemas (PhononBS, PhononDOS).

  3. The following legacy stored fields are not present as explicit stored fields in the new emmet schema:

    • free_energies
    • heat_capacities
    • internal_energies
    • entropies
    • temperatures
    • uuids
    • jobdirs
  4. In emmet, thermodynamic quantities are computed from the DOS on demand rather than stored as arrays on the main BSDOS doc.

  5. has_imaginary_modes is now a computed field rather than a plain stored field.

  6. phonopy_settings was renamed to post_process_settings.

  7. force_constants changed shape from the legacy wrapped ForceConstants object to a plain nested matrix list.

  8. total_dft_energy changed convention: legacy atomate2 stored it per formula unit, while emmet stores it per atom.

  9. The new schema also adds fields the legacy one did not have, such as phonon_method, epsilon_electronic, and sum-rule diagnostics (sum_rules_breaking, ASR/CNSR-related computed data).

Aside -- I tried to restore the legacy phonon schema functionality for the phonon, QHA, and Grüneisen workflows, and have fixed the associated test failures. Can you pls take a look and confirm whether these changes align with the goal of continuing to support the legacy schema for these workflows in the short term, with a plan to migrate to the emmet schema later?

@hrushikesh-s
Copy link
Copy Markdown
Collaborator Author

Also, it's worthwhile to mention that the current PR makes use of my fork of Pheasy repo, and the PR is underway on Pheasy to make it compatible with the current atomate2 style reading of .npy files for disp & force matrices

@JaGeo
Copy link
Copy Markdown
Member

JaGeo commented Mar 31, 2026

Thanks!

From the timeline: i will likely look at it after Easter (next week earliest). I have to grade and prepare my teaching this week. I would also like to test and explore the changes in detail.

return Response(output={"plus": plus_struct, "minus": minus_struct})


@job(data=[PhononBSDOSDoc])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this new workflow, no object needs to be transferred to "data"?

)


@job(data=[PhononBSDOSDoc])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not needed?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at least a replacement for this one?

Comment on lines -8 to -14
from atomate2.common.schemas.phonons import (
PhononBSDOSDoc,
PhononComputationalSettings,
PhononJobDirs,
PhononUUIDs,
ThermalDisplacementData,
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understood that we keep this for now, yes? Why not keeping the tests as long as we keep the schema?

@@ -0,0 +1,362 @@
"""Flow for calculating (an)harmonic FCs and phonon energy renorma. with pheasy."""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Flow for calculating (an)harmonic FCs and phonon energy renorma. with pheasy."""
"""Flow for calculating (an)harmonic FCs and phonon energy renormalisation with pheasy."""

if cal_anhar_fcs:
# Due to the cutoff radius of the force constants use the unit of Borh in ALM,
# we need to convert the cutoff radius from Angstrom to Bohr.
with ALM(lattice * 1.89, positions, numbers) as alm:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is accurate enough for a cutoff but maybe more digits?

Comment on lines +802 to +829
force_constants = parse_FORCE_CONSTANTS(filename=new_fc_file)
phonon.force_constants = force_constants
phonon.symmetrize_force_constants()

phonon.save(_DEFAULT_FILE_PATHS["phonopy"])

# get phonon band structure
kpath_dict, kpath_concrete = _get_kpath(
structure=get_pmg_structure(phonon.primitive),
kpath_scheme=kpath_scheme,
symprec=symprec,
)

npoints_band = kwargs.get("npoints_band", 101)
qpoints, connections = get_band_qpoints_and_path_connections(
kpath_concrete, npoints=kwargs.get("npoints_band", 101)
)

# phonon band structures will always be computed
phonon.run_band_structure(
qpoints, path_connections=connections, with_eigenvectors=True
)
phonon.write_yaml_band_structure(filename=_DEFAULT_FILE_PATHS["band_structure"])
bs_symm_line = get_ph_bs_symm_line(
_DEFAULT_FILE_PATHS["band_structure"],
labels_dict=kpath_dict,
has_nac=born is not None,
)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can more code be reused from phonon.py? If we have an update for the plotting, we would need to fix it here as well. Same for the code below.

Comment on lines +3 to +10
from importlib.util import find_spec

if not find_spec("openff"):
raise ImportError(
"openff must be installed via conda forge to use with atomate2:\n"
"conda install -c conda-force openff"
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you check if this is consistent with how we handle this accross atomate2?

Comment on lines +115 to +130
assert_allclose(
ph_doc.primitive_matrix,
[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
rtol=1e-5,
atol=1e-10,
)
assert ph_doc.code == "vasp"
assert isinstance(
ph_doc.post_process_settings,
PhononComputationalSettings,
)
assert ph_doc.post_process_settings.npoints_band == 101
assert ph_doc.post_process_settings.kpath_scheme == "seekpath"
assert ph_doc.post_process_settings.kpoint_density_dos == 7000

assert ph_doc.chemsys == "Si"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a test that actually checks the new implemented functionality. How do we make sure that the results stay consistent?

@JaGeo
Copy link
Copy Markdown
Member

JaGeo commented Apr 8, 2026

Hi @hrushikesh-s ,

I left a few comments.

My main points:

  • pheasy functionality must be tested, e.g., by testing some of the phonon results.
  • could you check that both Grüneisen and QHA workflow still work in practice when they are run with fireworks or jobflow-remote. The data objects are a really important point
  • should we provide a forcefield implementation for testing?
  • should we provide a tutorial for pheasy? if we had a forcefield implementation, this would be much easier to do.
  • conflicts need to be solved :)

@esoteric-ephemera
Copy link
Copy Markdown
Collaborator

Hey @hrushikesh-s sorry I missed your messages. These schema changes are very much intentional:

  1. The new emmet schema replaces the old PhononBSDOSDoc with a PhononBSDOSTask + built PhononBSDOSDoc split.

This is to explicitly allow for storing these as separate objects. PhononBSDOSTask should go to regular mongo collections, whereas band structures and DOS should go to GridFS

  1. phonon_bandstructure and phonon_dos are no longer stored as raw pymatgen objects; they use emmet wrapper schemas (PhononBS, PhononDOS).

There is an opt-in setting for using the new emmet models. The default is still to use the pymatgen models in atomate2 (opt-out).

  1. The following legacy stored fields are not present as explicit stored fields in the new emmet schema:
free_energies
heat_capacities
internal_energies
entropies
temperatures
uuids
jobdirs

There is no reason to store free_energies, heat_capacities, internal_energies, entropies, and temperatures in the document, since these are computed from a set of temperatures. What if the user wants a different set of temps than the one prescribed in the document?

The uuids and jobdirs are stored in a calc_meta field

  1. has_imaginary_modes is now a computed field rather than a plain stored field.

A model computed field will still be saved in the document if it has been calculated. It is just a way to defer calculating an expensive model field until the user requests it

Regarding the other fields being renamed, the document model explicitly handles the migration internally. You don't have to do anything to coerce the fields. Put a decent amount of work into it to do that since it needed to migrate our existing data seamlessly too

Please revert these changes regarding the schema or communicate what changes you need in the schema to get the workflows to run correctly

@hrushikesh-s
Copy link
Copy Markdown
Collaborator Author

Hi @JaGeo and @esoteric-ephemera, thanks for your detailed comments. I will start working on the comments this Friday and will keep you updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants