dara.search.tree module

class BaseSearchTree(pattern_path, all_phases_result, peak_obs, refine_params, phase_params, intensity_threshold, wavelength, instrument_profile, express_mode, maximum_grouping_distance, max_phases, rpb_threshold, pinned_phases=None, record_peak_matcher_scores=False, *args, **kwargs)

Bases: Tree

A base class for the search tree. It is not intended to be used directly.

Parameters:
  • pattern_path (Path) – the path to the pattern

  • all_phases_result (dict[RefinementPhase, RefinementResult] | None) – the result of all the phases

  • peak_obs (ndarray | None) – the observed peaks

  • refine_params (dict[str, ...] | None) – the refinement parameters, which are passed to the refinement function

  • phase_params (dict[str, ...] | None) – the phase parameters, which are passed to the refinement function

  • intensity_threshold (float) – the intensity threshold to tell if a peak is significant

  • instrument_profile (str | Path) – the name or path of the instrument file, which is passed to the refinement function

  • maximum_grouping_distance (float) – the maximum grouping distance (SearchTree defaults this to 0.1)

  • max_phases (float) – the maximum number of phases

  • rpb_threshold (float) – the minimum RPB improvement in each step

  • pinned_phases (list[RefinementPhase] | None) – the phases that are pinned and will be included in all the results

add_subtree(anchor_nid, search_tree)

Add a subtree to the search tree.

Parameters:
  • anchor_nid (str) – the node id that the subtree will be added to

  • search_tree (BaseSearchTree) – the search tree that will be added to the search tree

Returns:

the merged search tree

expand_node(nid)

Expand a node in the search tree.

This method first runs a naive search-match to find the best-matched phases. It then refines those phases and adds the results to the search tree.

Parameters:

nid (str) – the node id

Return type:

list[str]

expand_root()

Expand the root node.

Return type:

list[str]

classmethod from_search_tree(root_nid, search_tree)

Create a new search tree from an existing search tree.

Parameters:
  • root_nid (str) – the node id that will be used as the root node for the new search tree

  • search_tree (BaseSearchTree) – the search tree that will be used to create the new search tree

Return type:

BaseSearchTree

Returns:

the new search tree

get_all_possible_nodes_at_same_level(node)

Get all possible phases that can be added to the current phase combination at this level.

Parameters:

node (Node) – the node in the search tree

Return type:

tuple[Node, ...]

Returns:

a tuple of the selected nodes

get_expandable_children(nid)

Get the expandable children of a node.

The expandable children are the children that have not been expanded yet; they are marked as “pending”.

Parameters:

nid (str) – the node id

Return type:

list[str]

Returns:

a list of node ids that are expandable
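As an illustration of the “pending” filter described above, here is a minimal sketch on a toy tree. The children and status dictionaries, and every status string other than “pending”, are hypothetical stand-ins; they are not the data structures used by BaseSearchTree.

```python
# Toy tree: node id -> list of child node ids (hypothetical layout).
children = {"root": ["a", "b", "c"], "a": [], "b": [], "c": []}

# Toy status per node; only "pending" is taken from the documentation above.
status = {"root": "expanded", "a": "pending", "b": "expanded", "c": "pending"}


def expandable_children(nid):
    """Return the ids of the children of ``nid`` that are still pending."""
    return [child for child in children[nid] if status[child] == "pending"]


print(expandable_children("root"))  # ['a', 'c']
```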

get_phase_combinations(node)

Get all the phase combinations at this node.

Parameters:

node (Node) – the node that will be used to get the phase combinations

Return type:

tuple[list[tuple[RefinementPhase, ...]], list[tuple[float, ...]], list[tuple[float, ...]]]

Returns:

a tuple of the phase combinations

refine_phases(phases, pinned_phases=None)

Get the result of all the phases.

Parameters:
  • phases (list[RefinementPhase]) – the phases

  • pinned_phases (list[RefinementPhase] | None) – the pinned phases that will be included in every refinement

Return type:

dict[RefinementPhase, RefinementResult | None]

Returns:

a dictionary containing the phase and its result

score_phases(all_phases_result, current_result=None)

Get the best matched phases.

This is a naive search-match method based on the peak matching score. It will return the best matched phases, all phases’ scores, and the score’s threshold.

The threshold is determined by finding the inflection point of the percentile of the scores.

Return type:

tuple[list[RefinementPhase], dict[RefinementPhase, list[float]], float]

Returns:

a tuple containing the best matched phases, all phases’ scores, and the score’s threshold
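The thresholding idea can be sketched as follows. This is only one possible reading of “inflection point of the percentile of the scores”: the scores are sorted so that position stands in for percentile, and the first sign change of the discrete second difference is taken as the inflection point. The function name inflection_threshold and the sample scores are hypothetical, and the actual score_phases implementation may differ.

```python
def inflection_threshold(scores):
    """Return the score at the first curvature sign change, or None if there is none."""
    ys = sorted(scores)  # position in the sorted list plays the role of percentile
    prev_sign = 0
    for i in range(1, len(ys) - 1):
        # Discrete second difference approximates the curvature at point i.
        curvature = ys[i + 1] - 2 * ys[i] + ys[i - 1]
        sign = (curvature > 0) - (curvature < 0)
        if sign != 0 and prev_sign != 0 and sign != prev_sign:
            return ys[i]
        if sign != 0:
            prev_sign = sign
    return None


# Hypothetical scores: convex at the low end, concave at the high end.
print(inflection_threshold([0, 1, 4, 9, 14, 17, 18]))  # 14
```

A curve with no sign change (for example, evenly spaced scores) yields None in this sketch, so a caller would need its own fallback.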

class SearchTree(pattern_path, cif_paths, pinned_phases=None, refine_params=None, phase_params=None, wavelength='Cu', instrument_profile='Aeris-fds-Pixcel1d-Medipix3', express_mode=True, enable_angular_cut=True, maximum_grouping_distance=0.1, max_phases=5, rpb_threshold=4, record_peak_matcher_scores=False, *args, **kwargs)

Bases: BaseSearchTree

A class for the search tree.

Parameters:
  • pattern_path (Path | str) – the path to the pattern

  • cif_paths (list[RefinementPhase | Path | str]) – the paths to the CIF files

  • pinned_phases (list[RefinementPhase | Path | str] | None) – the phases that will be included in all the refinement

  • refine_params (dict[str, ...] | None) – the refinement parameters, which are passed to the refinement function

  • phase_params (dict[str, ...] | None) – the phase parameters, which are passed to the refinement function

  • instrument_profile (str | Path) – the name or path of the instrument file, which is passed to the refinement function

  • maximum_grouping_distance (float) – the maximum grouping distance, defaults to 0.1

  • max_phases (float) – the maximum number of phases, note that the pinned phases are COUNTED as well

  • rpb_threshold (float) – the minimum RPB improvement required for the search tree to keep expanding a node

get_search_results()

Get the search results.

The search results are the results of the nodes that have been expanded and have no expandable children.

Return type:

list[SearchResult]

Returns:

a list of search results, each holding a phase combination and its result
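A minimal sketch of how the documented methods might be combined to run a search. It assumes that the node ids returned by expand_root and expand_node are the nodes that still need expanding; the documentation above states only the return type, so that is an assumption. The helper name run_search and its cif_dir argument are hypothetical, and the library may provide its own higher-level entry point.

```python
from pathlib import Path


def run_search(pattern_path, cif_dir):
    """Expand a SearchTree node by node and return its search results."""
    # Imported lazily so this sketch can be loaded without dara installed.
    from dara.search.tree import SearchTree

    tree = SearchTree(
        pattern_path=Path(pattern_path),
        cif_paths=sorted(Path(cif_dir).glob("*.cif")),
        max_phases=5,     # documented default
        rpb_threshold=4,  # documented default
    )

    # Assumption: the returned ids are the nodes that are still expandable.
    to_expand = tree.expand_root()
    while to_expand:
        nid = to_expand.pop()
        to_expand.extend(tree.expand_node(nid))

    return tree.get_search_results()
```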

show(nid=None, level=0, idhidden=False, filter=None, key=None, reverse=False, line_type='ascii-ex', data_property='pretty_output', stdout=False, sorting=True)

Show the search tree.

Parameters:
  • nid – the node id

  • level – the level of the tree

  • idhidden – whether the node id is hidden

  • filter – the filter function

  • key – the sorting key

  • reverse – whether to reverse the sorting

  • line_type – the line type

  • data_property – the data property

  • stdout – whether to print the result

  • sorting – whether to sort the result

Returns:

the string representation of the search tree

batch_peak_matching(peak_calcs, peak_obs, return_type='PeakMatcher', batch_size=100)
Return type:

list[PeakMatcher | float]

batch_refinement(pattern_path, cif_paths, wavelength='Cu', instrument_profile='Aeris-fds-Pixcel1d-Medipix3', phase_params=None, refinement_params=None)
Return type:

list[RefinementResult]

calculate_fom_and_strain(phase, result)

Calculate the figure of merit for a phase and lattice strain.

For more detail, refer to https://journals.iucr.org/j/issues/2019/03/00/nb5231/.

Parameters:

result (RefinementResult) – the refinement result

Return type:

tuple[float, float]

Returns:

the figure of merit of the target phase. If it cannot be calculated, 0 is returned.

get_natural_break_results(results, sorting=True)

Get the natural break results based on (1-rho) value.

Return type:

list[SearchResult]
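One common way to find a natural break is to split the sorted values into two classes so that the within-class squared deviation is minimal. The sketch below shows that idea on plain numbers. The function name natural_break_index, the sample (1-rho) values, and the choice to keep the lower class are all assumptions; the documentation above does not say how get_natural_break_results computes or applies the break.

```python
def natural_break_index(values):
    """Return k such that sorted(values)[:k] and [k:] form the best 2-class split."""
    ordered = sorted(values)

    def sse(chunk):
        # Sum of squared deviations from the chunk mean.
        mean = sum(chunk) / len(chunk)
        return sum((v - mean) ** 2 for v in chunk)

    best_k, best_cost = len(ordered), float("inf")
    for k in range(1, len(ordered)):
        cost = sse(ordered[:k]) + sse(ordered[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


# Hypothetical (1-rho) values: three low ones and two clearly higher ones.
one_minus_rho = [0.10, 0.12, 0.11, 0.45, 0.50]
k = natural_break_index(one_minus_rho)
kept = sorted(one_minus_rho)[:k]
print(kept)  # [0.1, 0.11, 0.12]
```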

group_phases(all_phases_result, distance_threshold=0.1)

Group the phases based on their similarity.

Parameters:
  • all_phases_result (dict[RefinementPhase, RefinementResult | None]) – the result of all the phases

  • distance_threshold (float) – the distance threshold for clustering, default to 0.1

Return type:

dict[RefinementPhase, dict[str, float | int]]

Returns:

a dictionary containing the group id and the figure of merit for each phase
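Grouping by a distance threshold can be sketched with a small union-find: any two items whose distance is at or below the threshold end up in the same group. This is a generic single-linkage illustration, not the clustering that group_phases performs. The function group_by_distance, the toy feature values, and the absolute-difference distance are hypothetical.

```python
def group_by_distance(names, distance, distance_threshold=0.1):
    """Map each name to a group id; names within the threshold share a group."""
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance(a, b) <= distance_threshold:
                parent[find(b)] = find(a)

    group_ids = {}
    return {name: group_ids.setdefault(find(name), len(group_ids)) for name in names}


# Hypothetical one-dimensional feature per phase, used only to define a distance.
feature = {"A1": 0.00, "A2": 0.05, "B": 0.80}
groups = group_by_distance(list(feature), lambda a, b: abs(feature[a] - feature[b]))
print(groups)  # {'A1': 0, 'A2': 0, 'B': 1}
```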

remove_unnecessary_phases(result, cif_paths, rpb_threshold=0.0)

Remove unnecessary phases from the result.

If a phase cannot cause an increase in RWP, it is removed.

Return type:

list[Path]
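The pruning idea can be sketched as a leave-one-out loop: drop a phase, re-evaluate the fit, and keep the phase out if the error does not rise by more than the threshold. The sketch assumes this reading of the description above. The names prune_phases and fit_error and the toy contributions are hypothetical; the real function works on a refinement result and CIF paths rather than on a toy error function.

```python
def prune_phases(phases, fit_error, threshold=0.0):
    """Drop every phase whose removal does not raise the fit error above the threshold."""
    kept = list(phases)
    for phase in list(phases):
        if len(kept) == 1:
            break  # always keep at least one phase
        without = [p for p in kept if p != phase]
        if fit_error(without) - fit_error(kept) <= threshold:
            kept = without
    return kept


# Toy error model: each phase lowers the error by a fixed, hypothetical amount.
contribution = {"quartz": 10, "rutile": 5, "spurious": 0}


def toy_fit_error(phases):
    return 20 - sum(contribution[p] for p in phases)


print(prune_phases(["quartz", "rutile", "spurious"], toy_fit_error))
# ['quartz', 'rutile']
```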