dara.utils module
Utility functions for the dara package.
- compositions_to_array(compositions)
Convert a list of compositions/formulas to an array of their fractional elemental components.
- copy_and_rename_files(file_map, dest_directory, verbose=True)
Copy files (and rename them) into a destination directory using a provided mapping.
- Parameters:
src_directory – Path to the source directory
dest_directory – Path to the destination directory
file_map – Dictionary where keys are original filenames and values are new filenames
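The mapping-based copy described above can be sketched with the standard library. This is a minimal illustration, not the package's implementation: the name `copy_and_rename_sketch` is hypothetical, the keys of `file_map` are assumed to be full paths to the source files, and creating the destination directory when it is missing is also an assumption.

```python
import shutil
import tempfile
from pathlib import Path


def copy_and_rename_sketch(file_map, dest_directory, verbose=True):
    """Copy each source file into dest_directory under its new name (hypothetical helper)."""
    dest = Path(dest_directory)
    dest.mkdir(parents=True, exist_ok=True)  # assumption: create the destination if missing
    copied = []
    for src, new_name in file_map.items():
        target = dest / new_name
        shutil.copy(src, target)  # copies file contents and permission bits
        if verbose:
            print(f"Copied {src} -> {target}")
        copied.append(target)
    return copied


# Demo inside a throwaway directory so nothing else on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "scan_001.xy"
    source.write_text("10.0 123\n")
    out = copy_and_rename_sketch(
        {source: "BaSnO3_01.xy"}, Path(tmp) / "renamed", verbose=False
    )
    copied_names = [p.name for p in out]
    copied_text = out[0].read_text()
```

Returning the list of written paths is a design choice of this sketch; it makes the result easy to check, and the real function may return nothing.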
- find_optimal_intensity_threshold(intensities, percentile=90)
Find the intensity threshold that captures percentile% of the intensities.
- Parameters:
intensities (list[float] | ndarray) – the list of intensities
percentile (float) – the percentile to capture, defaults to 90
- Return type:
float
- Returns:
the intensity threshold
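As an illustration of what a percentile-based threshold means, the sketch below computes a linearly interpolated percentile in pure Python. It assumes the threshold is simply the requested percentile of the intensities; the function name `percentile_threshold` is hypothetical and the package may compute the value differently.

```python
def percentile_threshold(intensities, percentile=90):
    """Return the linearly interpolated percentile of the intensities (hypothetical helper)."""
    values = sorted(float(v) for v in intensities)
    if not values:
        raise ValueError("intensities must not be empty")
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(values) - 1) * percentile / 100
    lower = int(rank)
    upper = min(lower + 1, len(values) - 1)
    frac = rank - lower
    return values[lower] + (values[upper] - values[lower]) * frac


threshold = percentile_threshold([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], percentile=90)
median = percentile_threshold([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], percentile=50)
```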
- find_optimal_score_threshold(scores)
Find the inflection point from a list of scores. We will calculate the percentile first.
- Return type:
tuple[float,ndarray]
- get_composition_distance(comp1, comp2, order=2)
Calculate the distance between two compositions.
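Assuming order is the order of the norm, order=1 gives the Manhattan distance and the default order=2 gives the Euclidean distance.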
- Return type:
float
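The effect of the order argument can be illustrated with a small pure-Python sketch. It assumes the distance is a p-norm of the difference between fractional compositions; compositions are represented as plain dictionaries here instead of pymatgen Composition objects, and the names `fractional` and `composition_distance_sketch` are hypothetical.

```python
def fractional(comp):
    """Normalize element amounts so they sum to 1."""
    total = sum(comp.values())
    return {el: amount / total for el, amount in comp.items()}


def composition_distance_sketch(comp1, comp2, order=2):
    """p-norm of the difference between two fractional compositions (assumed definition)."""
    f1, f2 = fractional(comp1), fractional(comp2)
    elements = sorted(set(f1) | set(f2))
    diffs = [abs(f1.get(el, 0.0) - f2.get(el, 0.0)) for el in elements]
    return sum(d**order for d in diffs) ** (1 / order)


basno3 = {"Ba": 1, "Sn": 1, "O": 3}
bao = {"Ba": 1, "O": 1}
manhattan = composition_distance_sketch(basno3, bao, order=1)  # |0.3| + |0.2| + |0.1|
euclidean = composition_distance_sketch(basno3, bao, order=2)  # sqrt(0.09 + 0.04 + 0.01)
```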
- get_composition_from_filename(file_name)
Get the composition from the filename. The composition is assumed to be the first part of the filename. For example, “BaSnO3_01.xrdml” will return “BaSnO3”.
- Return type:
Composition
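The filename convention can be sketched as follows. This assumes the "first part" of the filename is everything before the first underscore, as in the example above; the helper name `formula_from_filename` is hypothetical, and it returns the formula as a string, whereas the documented function returns a Composition.

```python
from pathlib import Path


def formula_from_filename(file_name):
    """Return the text before the first underscore of the file stem (hypothetical helper)."""
    stem = Path(file_name).stem  # "BaSnO3_01.xrdml" -> "BaSnO3_01"
    return stem.split("_")[0]


formula = formula_from_filename("BaSnO3_01.xrdml")
formula_with_dir = formula_from_filename("data/BaSnO3_01.xrdml")
```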
- get_compositional_clusters(paths, distance_threshold=0.1)
Group paths into clusters based on the compositional similarity of their compositions. Uses AgglomerativeClustering with the given distance_threshold (0.1 by default).
- Return type:
list[list[Path|str]]
- get_entries_in_chemsys_db(db, chemsys)
Get all computed entries from a database covering all possible sub-chemical systems.
This is equivalent to MPRester.get_entries_in_chemsys.
- Parameters:
db (MongoStore) – the database (must be connected!)
chemsys (list[str] | str) – a chemical system, either as a string (e.g., “Li-Fe-O”) or as a list of elements.
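"All possible sub-chemical systems" means every non-empty subset of the elements. The sketch below enumerates them for either input form; the helper name `sub_chemical_systems` is hypothetical and the alphabetical ordering of elements is an assumption of this example, not something the package is known to do.

```python
from itertools import combinations


def sub_chemical_systems(chemsys):
    """List every non-empty sub-system of a chemical system (hypothetical helper)."""
    elements = chemsys.split("-") if isinstance(chemsys, str) else list(chemsys)
    elements = sorted(set(elements))  # assumption: alphabetical, de-duplicated
    return [
        "-".join(combo)
        for size in range(1, len(elements) + 1)
        for combo in combinations(elements, size)
    ]


systems = sub_chemical_systems("Li-Fe-O")
same_from_list = sub_chemical_systems(["Li", "Fe", "O"])
```

A three-element system yields seven sub-systems: three elements, three binaries, and the full ternary.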
- get_entries_in_chemsys_mp(chemsys)
Download ComputedStructureEntry objects from Materials Project.
- get_head_of_compositional_cluster(paths)
Get head of a compositional cluster. This returns the closest stoichiometric composition to the average composition. If no stoichiometric composition is found, then the nonstoichiometric composition with the smallest distance to the average composition is returned.
- Return type:
Composition
- get_logger(name, level=10, log_format='%(asctime)s %(levelname)s %(name)s %(message)s', stream=sys.stdout)
Code borrowed from the atomate package.
Helper method for acquiring logger.
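A logger helper with this signature can be sketched with the standard logging module. The function below is an assumption about how such a helper is typically written, not a copy of the package's code; the name `get_logger_sketch` is hypothetical. The default level=10 in the signature above equals logging.DEBUG.

```python
import io
import logging
import sys


def get_logger_sketch(
    name,
    level=logging.DEBUG,  # logging.DEBUG == 10
    log_format="%(asctime)s %(levelname)s %(name)s %(message)s",
    stream=sys.stdout,
):
    """Create a logger that writes formatted records to the given stream (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    handler = logging.StreamHandler(stream=stream)
    handler.setFormatter(logging.Formatter(log_format))
    logger.addHandler(handler)
    return logger


# Write to an in-memory buffer so the formatted line can be inspected.
buffer = io.StringIO()
log = get_logger_sketch("dara.demo", stream=buffer)
log.info("refinement started")
logged_line = buffer.getvalue()
```

Note that calling such a helper twice with the same name attaches a second handler and duplicates output, so callers usually create each logger once.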
- get_optimal_max_two_theta(peak_data, fraction=0.7, intensity_filter=0.1)
Get the optimal 2theta max given detected peaks. The range is determined by proportion of the detected peaks.
- Parameters:
fraction (float) – The fraction of the detected peaks. Defaults to 0.7.
intensity_filter – The intensity filter; the fraction of the max intensity that is required for a peak to be acknowledged in this analysis. Defaults to 0.1.
min_threshold – The minimum threshold for the 2theta range. Defaults to 50. If set to None, the minimum threshold will be the 2theta of the last peak.
- Return type:
float
- Returns:
The optimal maximum 2theta.
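One possible reading of the parameter descriptions above is sketched below. Everything here is an assumption made for illustration: peaks are given as (two_theta, intensity) pairs rather than the package's peak_data structure, the cutoff is taken as the 2theta of the peak that covers the requested fraction of the retained peaks, and the name `optimal_max_two_theta_sketch` is hypothetical.

```python
import math


def optimal_max_two_theta_sketch(peaks, fraction=0.7, intensity_filter=0.1, min_threshold=50.0):
    """Pick a 2theta upper bound covering a fraction of sufficiently intense peaks (assumed logic)."""
    if not peaks:
        raise ValueError("peaks must not be empty")
    max_intensity = max(intensity for _, intensity in peaks)
    # Ignore peaks weaker than intensity_filter times the strongest peak.
    kept = sorted(t for t, intensity in peaks if intensity >= intensity_filter * max_intensity)
    # 2theta of the peak that covers the requested fraction of retained peaks.
    n_keep = max(1, math.ceil(fraction * len(kept)))
    cutoff = kept[n_keep - 1]
    # Per the description, None means the floor is the 2theta of the last peak.
    floor = kept[-1] if min_threshold is None else min_threshold
    return max(cutoff, floor)


peaks = [(20, 100), (30, 80), (40, 5), (55, 60), (60, 50),
         (70, 40), (80, 30), (90, 20), (100, 15), (110, 12)]
default_max = optimal_max_two_theta_sketch(peaks)
floored_max = optimal_max_two_theta_sketch(peaks, fraction=0.2)
last_peak_max = optimal_max_two_theta_sketch(peaks, min_threshold=None)
```

In this example the weak peak at 40 is filtered out, leaving nine peaks; 70% of them are covered at 2theta = 90, while a small fraction falls back to the floor of 50.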
- intensity_correction(intensity, d_inv, gsum, wavelength, pol=1)
Translated from Profex source (bgmnparparser.cpp:L112)
- Parameters:
intensity (float) – the intensity of the peak
gsum (float) – the gsum of the peak
d_inv (float) – the inverse of the d-spacing
wavelength (float) – the wavelength of the X-ray
pol (float) – the polarization factor, defaults to 1
- Returns:
the corrected intensity
- load_symmetrized_structure(cif_path)
Load the symmetrized structure from a CIF file. This function will symmetrize the structure and provide a spacegroup analyzer object as well.
- Return type:
tuple[SymmetrizedStructure,SpacegroupAnalyzer]
- parse_refinement_param(refinement_param)
- Return type:
tuple[str|float,float|None,float|None]
- process_phase_name(phase_name)
Process the phase name to remove special characters.
- Return type:
str
- read_phase_name_from_str(str_path)
Get the phase name from the str file path.
Example of str:
PHASE=BaSnO3 // generated from pymatgen
FORMULA=BaSnO3 //
Lattice=Cubic HermannMauguin=P4/m-32/m Setting=1 SpacegroupNo=221 //
PARAM=A=0.41168_0.40756^0.41580 //
RP=4 PARAM=k1=0_0^1 k2=0 PARAM=B1=0_0^0.01 PARAM=GEWICHT=0_0 //
GOAL:BaSnO3=GEWICHT //
GOAL=GrainSize(1,1,1) //
E=BA+2 Wyckoff=b x=0.500000 y=0.500000 z=0.500000 TDS=0.010000
E=SN+4 Wyckoff=a x=0.000000 y=0.000000 z=0.000000 TDS=0.010000
E=O-2 Wyckoff=d x=0.000000 y=0.000000 z=0.500000 TDS=0.010000
- Return type:
str
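Extracting the phase name from text like the example above can be sketched with a regular expression. This assumes the name is the token following `PHASE=`; the helper `phase_name_from_text` is hypothetical and works on the file contents, whereas the documented function takes a file path.

```python
import re


def phase_name_from_text(str_text):
    """Return the token that follows 'PHASE=' in the contents of a str file (hypothetical helper)."""
    match = re.search(r"PHASE=(\S+)", str_text)
    if match is None:
        raise ValueError("no PHASE= entry found")
    return match.group(1)


example = "PHASE=BaSnO3 // generated from pymatgen\nFORMULA=BaSnO3 //\n"
phase = phase_name_from_text(example)
```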
- rpb(y_calc, y_obs, y_bkg)
Calculate the Rietveld profile without background (RPB) for a refinement.
The result is in percentage.
- Parameters:
y_calc (ndarray) – the calculated intensity
y_obs (ndarray) – the observed intensity
- Return type:
float
- Returns:
the RPB
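A background-corrected profile R factor is commonly written as 100 · Σ|y_obs − y_calc| / Σ|y_obs − y_bkg|. The sketch below assumes that definition, which is not confirmed by the text above; the name `rpb_sketch` is hypothetical and plain lists stand in for ndarrays.

```python
def rpb_sketch(y_calc, y_obs, y_bkg):
    """Profile R factor with the background removed from the denominator, in percent (assumed definition)."""
    numerator = sum(abs(obs - calc) for obs, calc in zip(y_obs, y_calc))
    denominator = sum(abs(obs - bkg) for obs, bkg in zip(y_obs, y_bkg))
    return 100 * numerator / denominator


value = rpb_sketch(
    y_calc=[110, 190, 300],
    y_obs=[100, 200, 300],
    y_bkg=[50, 50, 50],
)  # 100 * (10 + 10 + 0) / (50 + 150 + 250)
```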