dara.utils module

Utility functions for the dara package.

angular_correction(tt, eps1, eps2)

bool2yn(value)

Convert boolean to Y (yes) or N (no).

Return type:

str
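As a minimal sketch of the mapping described above (the name `bool2yn_sketch` is hypothetical, and treating any truthy value as "yes" is an assumption rather than the package's documented behavior):

```python
def bool2yn_sketch(value: bool) -> str:
    """Map a truthy value to "Y" and a falsy value to "N"."""
    return "Y" if value else "N"


print(bool2yn_sketch(True))   # Y
print(bool2yn_sketch(False))  # N
```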

compositions_to_array(compositions)

Convert a list of compositions/formulas to an array of their fractional elemental components.

copy_and_rename_files(file_map, dest_directory, verbose=True)

Copy files (and rename them) into a destination directory using a provided mapping.

Parameters:
  • file_map – Dictionary where keys are original filenames and values are new filenames

  • dest_directory – Path to the destination directory

  • src_directory – Path to the source directory (listed in the docstring, but not present in the signature above)

datetime_str()

Get a string representation of the current time.

Return type:

str

find_optimal_intensity_threshold(intensities, percentile=90)

Find the intensity threshold that captures percentile% of the intensities.

Parameters:
  • intensities (list[float] | ndarray) – the list of intensities

  • percentile (float) – the percentile to capture, defaults to 90

Return type:

float

Returns:

the intensity threshold

find_optimal_score_threshold(scores)

Find the inflection point in a list of scores. The percentiles of the scores are calculated first.

Return type:

tuple[float, ndarray]

fuzzy_compare(a, b)

get_chemsys_from_formulas(formulas)

Convert a list of formulas to a chemsys.
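The idea can be illustrated with a standard-library sketch. It assumes simple formulas without parentheses, extracts element symbols with a regular expression, and joins them alphabetically with hyphens; the name `chemsys_from_formulas_sketch` is hypothetical, and the package itself may rely on a proper formula parser instead.

```python
import re


def chemsys_from_formulas_sketch(formulas):
    """Collect the unique element symbols of all formulas into "A-B-C" form."""
    elements = set()
    for formula in formulas:
        # An element symbol is one uppercase letter, optionally followed
        # by one lowercase letter (e.g. "O", "Ba", "Sn").
        elements.update(re.findall(r"[A-Z][a-z]?", formula))
    return "-".join(sorted(elements))


print(chemsys_from_formulas_sketch(["BaSnO3", "Li2O"]))  # Ba-Li-O-Sn
```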

get_composition_distance(comp1, comp2, order=2)

Calculate the distance between two compositions.

The default (order=2) is the Euclidean distance; order=1 gives the Manhattan distance.

Return type:

float
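To make the role of `order` concrete, here is a sketch that assumes `order` is the order of a Minkowski norm applied to the difference of fractional compositions (so 1 is Manhattan and 2 is Euclidean). Compositions are represented as plain dictionaries of element fractions, and `composition_distance_sketch` is a hypothetical name, not the package function.

```python
def composition_distance_sketch(comp1, comp2, order=2):
    """Minkowski distance between two {element: fraction} dictionaries."""
    elements = set(comp1) | set(comp2)
    # Elements missing from one composition count as a fraction of zero.
    diffs = [abs(comp1.get(el, 0.0) - comp2.get(el, 0.0)) for el in elements]
    return sum(d ** order for d in diffs) ** (1.0 / order)


iron = {"Fe": 1.0}
oxygen = {"O": 1.0}
print(composition_distance_sketch(iron, oxygen, order=1))  # Manhattan
print(composition_distance_sketch(iron, oxygen, order=2))  # Euclidean
```

With these two pure-element compositions, the Manhattan distance is 2 and the Euclidean distance is the square root of 2.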

get_composition_from_filename(file_name)

Get the composition from the filename. The composition is assumed to be the first part of the filename. For example, “BaSnO3_01.xrdml” will return “BaSnO3”.

Return type:

Composition
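The filename convention can be sketched as follows. This version assumes an underscore separates the composition from the rest of the name, and it returns the formula as a plain string rather than a Composition object; `formula_from_filename_sketch` is a hypothetical helper.

```python
from pathlib import Path


def formula_from_filename_sketch(file_name):
    """Return the part of the file stem that precedes the first underscore."""
    stem = Path(file_name).stem  # "BaSnO3_01.xrdml" -> "BaSnO3_01"
    return stem.split("_")[0]


print(formula_from_filename_sketch("BaSnO3_01.xrdml"))  # BaSnO3
```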

get_compositional_clusters(paths, distance_threshold=0.1)

Group compositions into clusters based on their compositional similarity. Uses AgglomerativeClustering with the given distance_threshold (default 0.1).

Return type:

list[list[Path | str]]

get_entries_db(db, chemsys)

Get entries for a specific chemical system from a database.

get_entries_in_chemsys_db(db, chemsys)

Get all computed entries from a database covering all possible sub-chemical systems.

This is equivalent to MPRester.get_entries_in_chemsys.

Parameters:
  • db (MongoStore) – the database (must be connected!)

  • chemsys (list[str] | str) – a chemical system, either as a string (e.g., “Li-Fe-O”) or as a list of elements.
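What "all possible sub-chemical systems" means can be shown with a short enumeration. This sketch only lists the sub-systems and does not query any database; `sub_chemical_systems_sketch` is a hypothetical name, and the alphabetical ordering of elements is an assumption.

```python
from itertools import combinations


def sub_chemical_systems_sketch(chemsys):
    """List every non-empty sub-system of a chemical system."""
    # Accept either "Li-Fe-O" or ["Li", "Fe", "O"].
    elements = chemsys.split("-") if isinstance(chemsys, str) else list(chemsys)
    elements = sorted(set(elements))
    return [
        "-".join(combo)
        for size in range(1, len(elements) + 1)
        for combo in combinations(elements, size)
    ]


print(sub_chemical_systems_sketch("Li-Fe-O"))
# ['Fe', 'Li', 'O', 'Fe-Li', 'Fe-O', 'Li-O', 'Fe-Li-O']
```

A system of n elements has 2^n - 1 non-empty sub-systems, so the ternary example yields seven.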

get_entries_in_chemsys_mp(chemsys)

Download ComputedStructureEntry objects from Materials Project.

get_head_of_compositional_cluster(paths)

Get head of a compositional cluster. This returns the closest stoichiometric composition to the average composition. If no stoichiometric composition is found, then the nonstoichiometric composition with the smallest distance to the average composition is returned.

Return type:

Composition

get_logger(name, level=10, log_format='%(asctime)s %(levelname)s %(name)s %(message)s', stream=sys.stdout)

Helper method for acquiring a logger.

Code borrowed from the atomate package.
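A helper with this signature is typically a thin wrapper around the standard `logging` module. The following is a sketch under that assumption, not the package's actual code; `get_logger_sketch` is a hypothetical name, and level 10 corresponds to `logging.DEBUG`.

```python
import io
import logging
import sys


def get_logger_sketch(
    name,
    level=logging.DEBUG,  # numeric value 10
    log_format="%(asctime)s %(levelname)s %(name)s %(message)s",
    stream=sys.stdout,
):
    """Return a named logger that writes formatted records to a stream."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler(stream=stream)
    handler.setFormatter(logging.Formatter(log_format))
    logger.addHandler(handler)
    return logger


# Usage: capture the output in memory instead of writing to stdout.
buffer = io.StringIO()
log = get_logger_sketch("dara.example", stream=buffer)
log.info("refinement started")
print(buffer.getvalue().strip())
```

Note that in this sketch each call attaches a new handler, so calling it twice with the same name would emit every message twice.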

get_number(s)

Get the number from a float or tuple of floats.

Return type:

Optional[float]

get_optimal_max_two_theta(peak_data, fraction=0.7, intensity_filter=0.1)

Get the optimal maximum 2theta given the detected peaks. The range is determined by the proportion of the detected peaks.

Parameters:
  • fraction (float) – The fraction of the detected peaks. Defaults to 0.7.

  • intensity_filter – The intensity filter; the fraction of the max intensity that is required for a peak to be acknowledged in this analysis. Defaults to 0.1.

  • min_threshold – The minimum threshold for the 2theta range. Defaults to 50. If set to None, the minimum threshold will be the 2theta of the last peak. (This parameter does not appear in the signature above.)

Return type:

float

Returns:

The optimal maximum 2theta.

get_wavelength(wavelength_or_target_metal)

Return type:

float

intensity_correction(intensity, d_inv, gsum, wavelength, pol=1)

Translated from Profex source (bgmnparparser.cpp:L112)

Parameters:
  • intensity (float) – the intensity of the peak

  • gsum (float) – the gsum of the peak

  • d_inv (float) – the inverse of the d-spacing

  • wavelength (float) – the wavelength of the X-ray

  • pol (float) – the polarization factor, defaults to 1

Returns:

the corrected intensity

load_symmetrized_structure(cif_path)

Load the symmetrized structure from a CIF file. This function will symmetrize the structure and provide a spacegroup analyzer object as well.

Return type:

tuple[SymmetrizedStructure, SpacegroupAnalyzer]

parse_refinement_param(refinement_param)

Return type:

tuple[str | float, float | None, float | None]

process_phase_name(phase_name)

Process the phase name to remove special characters.

Return type:

str

read_phase_name_from_str(str_path)

Get the phase name from the str file path.

Example of str:

    PHASE=BaSnO3 // generated from pymatgen
    FORMULA=BaSnO3 //
    Lattice=Cubic HermannMauguin=P4/m-32/m Setting=1 SpacegroupNo=221 //
    PARAM=A=0.41168_0.40756^0.41580 //
    RP=4 PARAM=k1=0_0^1 k2=0 PARAM=B1=0_0^0.01 PARAM=GEWICHT=0_0 //
    GOAL:BaSnO3=GEWICHT //
    GOAL=GrainSize(1,1,1) //
    E=BA+2 Wyckoff=b x=0.500000 y=0.500000 z=0.500000 TDS=0.010000
    E=SN+4 Wyckoff=a x=0.000000 y=0.000000 z=0.000000 TDS=0.010000
    E=O-2 Wyckoff=d x=0.000000 y=0.000000 z=0.500000 TDS=0.010000

Return type:

str
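Extracting the name from text like the example above can be sketched with a regular expression. This version works on the file contents rather than a path, and it assumes the name is the run of non-whitespace characters after `PHASE=`; `phase_name_from_text_sketch` is a hypothetical helper.

```python
import re


def phase_name_from_text_sketch(text):
    """Return the value of the first PHASE= key, or None if it is absent."""
    match = re.search(r"PHASE=(\S+)", text)
    return match.group(1) if match else None


example = "PHASE=BaSnO3 // generated from pymatgen\nFORMULA=BaSnO3 //\n"
print(phase_name_from_text_sketch(example))  # BaSnO3
```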

rpb(y_calc, y_obs, y_bkg)

Calculate the Rietveld profile without background (RPB) for a refinement.

The result is in percentage.

Parameters:
  • y_calc (ndarray) – the calculated intensity

  • y_obs (ndarray) – the observed intensity

  • y_bkg (ndarray) – the background intensity

Return type:

float

Returns:

the RPB
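One common definition of a background-corrected profile R-factor divides the summed absolute residual by the summed background-subtracted observed intensity. The sketch below assumes that definition, which may differ in detail from the package's implementation, and uses plain lists in place of arrays; `rpb_sketch` is a hypothetical name.

```python
def rpb_sketch(y_calc, y_obs, y_bkg):
    """Profile R-factor with the background removed, in percent."""
    # Total absolute misfit between observed and calculated patterns.
    residual = sum(abs(obs - calc) for calc, obs in zip(y_calc, y_obs))
    # Observed signal above the background.
    signal = sum(abs(obs - bkg) for obs, bkg in zip(y_obs, y_bkg))
    return 100 * residual / signal


print(rpb_sketch([90, 110], [100, 100], [50, 50]))
```

In this toy pattern the residual is 20 counts against 100 counts of signal above background, so the value is 20 percent.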

rwp(y_calc, y_obs)

Calculate the Rietveld weighted profile (RWP) for a refinement.

The result is in percentage.

Parameters:
  • y_calc (ndarray) – the calculated intensity

  • y_obs (ndarray) – the observed intensity

Return type:

float

Returns:

the RWP
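The weighted profile R-factor is conventionally defined as the square root of the weighted squared residual divided by the weighted squared observation. The sketch below assumes counting-statistics weights of 1/y_obs, which is the usual choice but an assumption about this package; `rwp_sketch` is a hypothetical name, and all observed intensities must be positive.

```python
import math


def rwp_sketch(y_calc, y_obs):
    """Weighted profile R-factor in percent, with weights w_i = 1 / y_obs_i."""
    # sum_i w_i * (y_obs_i - y_calc_i)^2
    numerator = sum((obs - calc) ** 2 / obs for calc, obs in zip(y_calc, y_obs))
    # sum_i w_i * y_obs_i^2, which reduces to sum_i y_obs_i
    denominator = sum(y_obs)
    return 100 * math.sqrt(numerator / denominator)


print(rwp_sketch([90, 110], [100, 100]))
```

For this toy pattern the weighted residual is 2 and the denominator is 200, giving a value of about 10 percent.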

standardize_coords(x, y, z)