dara.search.peak_matcher module

class PeakMatcher(peak_calc, peak_obs, intensity_resolution=0.01, angle_resolution=0.1, angle_tolerance=0.2, intensity_tolerance=2, max_intensity_tolerance=5)

Bases: object

Peak matcher class to match the calculated peaks with the observed peaks.

Parameters:
  • peak_calc (ndarray) – the calculated peaks, (n, 2) array of peaks with [position, intensity]

  • peak_obs (ndarray) – the observed peaks, (m, 2) array of peaks with [position, intensity]

  • intensity_resolution (float) – the resolution for the intensity, default to 0.01. Peaks with an intensity below this value are filtered out

  • angle_resolution (float) – the resolution for the angle, default to 0.1

  • angle_tolerance (float) – the maximum difference in angle, default to 0.2

  • intensity_tolerance (float) – the maximum ratio of the intensities, default to 2

  • max_intensity_tolerance (float) – the maximum ratio of the intensities to be considered as missing or extra, default to 5

property extra: ndarray

Get the extra peaks in the calculated peaks.

get_isolated_peaks(peak_type, min_angle_difference=0.3, min_intensity_ratio=0.03)

Get the isolated missing or extra peaks.

The isolated missing/extra peaks are the missing/extra peaks that are not close to any matched or wrong-intensity peak.

Parameters:
  • peak_type (Literal['missing', 'extra']) – the type of the peaks to consider, either “missing” or “extra”

  • min_angle_difference (float) – the tolerance to consider a peak as close to another peak, default to 0.3 degree

  • min_intensity_ratio (float) – the minimum ratio of the intensity to be considered as a peak, default to 0.03

Return type:

ndarray

Returns:

the isolated missing or extra peaks with [position, intensity]
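The isolation rule described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name `isolated_peaks_sketch`, the `max_intensity` argument, and the choice to measure `min_intensity_ratio` against that maximum are all assumptions, and plain tuples stand in for the ndarray inputs.

```python
def isolated_peaks_sketch(candidates, neighbours, max_intensity,
                          min_angle_difference=0.3, min_intensity_ratio=0.03):
    """Keep candidate peaks that are strong enough and far from every neighbour.

    candidates: missing or extra peaks as (position, intensity) pairs.
    neighbours: matched and wrong-intensity peaks as (position, intensity) pairs.
    max_intensity: reference intensity for the ratio test (an assumption).
    """
    isolated = []
    for position, intensity in candidates:
        # Drop peaks that are too weak to count as a peak at all.
        if intensity < min_intensity_ratio * max_intensity:
            continue
        # Keep the peak only if no neighbour lies within the angle window.
        if all(abs(position - other) > min_angle_difference
               for other, _ in neighbours):
            isolated.append((position, intensity))
    return isolated


missing = [(12.0, 40.0), (25.1, 30.0), (40.0, 1.0)]
matched_or_wrong = [(25.0, 90.0), (33.0, 100.0)]
result = isolated_peaks_sketch(missing, matched_or_wrong, max_intensity=100.0)
```

Here the peak at 25.1 is dropped because a matched peak sits 0.1 degree away, and the peak at 40.0 is dropped because it falls below 3% of the reference intensity.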

jaccard_index()

Calculate the Jaccard index of the matching result.

Return type:

float

Returns:

the Jaccard index of the matching result
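For orientation, the Jaccard index of two sets is the size of their intersection divided by the size of their union. The sketch below applies that definition to peak counts. It assumes that matched peaks form the intersection and that missing and extra peaks complete the union. How the library treats wrong-intensity peaks, and whether it weights by intensity, is not stated here, so `jaccard_index_sketch` is a hypothetical helper only.

```python
def jaccard_index_sketch(n_matched, n_missing, n_extra):
    """Intersection over union, counting peaks (assumed, not intensity-weighted)."""
    union = n_matched + n_missing + n_extra
    # Two empty peak lists have no overlap to measure; return 0.0 by convention.
    return n_matched / union if union else 0.0


example = jaccard_index_sketch(n_matched=6, n_missing=2, n_extra=2)
```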

property matched: tuple[ndarray, ndarray]

Get the matched peaks in both the calculated peaks and the observed peaks.

property missing: ndarray

Get the missing peaks in the observed peaks. The shape should be (N, 2) with [position, intensity].

score(matched_coeff=1, wrong_intensity_coeff=1, missing_coeff=-0.1, extra_coeff=-0.5, normalize=True)

Calculate the score of the matching result.

Parameters:
  • matched_coeff (float) – the coefficient of the matched peaks

  • wrong_intensity_coeff (float) – the coefficient of the peaks with wrong intensities

  • missing_coeff (float) – the coefficient of the missing peaks

  • extra_coeff (float) – the coefficient of the extra peaks

  • normalize (bool) – whether to normalize the score by the total intensity of the observed peaks

Return type:

float

Returns:

the score of the matching result
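One plausible reading of these coefficients is a weighted sum over the four peak categories, optionally divided by the total observed intensity. The sketch below follows that reading. The function name and the assumption that each category contributes its summed intensity are illustrative guesses, not the library's confirmed formula. Only the default coefficients are taken from the signature above.

```python
def score_sketch(matched, wrong_intensity, missing, extra, total_obs,
                 matched_coeff=1, wrong_intensity_coeff=1,
                 missing_coeff=-0.1, extra_coeff=-0.5, normalize=True):
    """Weighted sum of per-category intensities (assumed form of the score).

    matched, wrong_intensity, missing, extra: summed intensity per category.
    total_obs: total intensity of the observed peaks, used for normalization.
    """
    raw = (matched_coeff * matched
           + wrong_intensity_coeff * wrong_intensity
           + missing_coeff * missing
           + extra_coeff * extra)
    # Skip normalization if it is disabled or there is nothing to divide by.
    return raw / total_obs if normalize and total_obs else raw


normalized = score_sketch(80.0, 10.0, 10.0, 20.0, total_obs=100.0)
unnormalized = score_sketch(80.0, 10.0, 10.0, 20.0, total_obs=100.0,
                            normalize=False)
```

With the default coefficients, extra peaks are penalized five times more heavily than missing peaks of the same intensity.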

visualize()

property wrong_intensity: tuple[ndarray, ndarray]

Get the indices of the peaks with wrong intensities in both the calculated peaks and the observed peaks.

absolute_log_error(x, y)

Calculate the absolute error of two arrays in log space.

Parameters:
  • x (ndarray) – array 1

  • y (ndarray) – array 2

Return type:

ndarray

Returns:

the absolute error in log space
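As a rough illustration, the absolute error in log space is |log(x) − log(y)| taken element by element. The sketch below uses the natural logarithm and plain Python lists. Both choices are assumptions made to keep the example dependency-free, since the library function takes ndarrays and its log base is not stated here.

```python
import math


def absolute_log_error_sketch(x, y):
    """Element-wise |log(a) - log(b)| for equal-length sequences of positive numbers."""
    return [abs(math.log(a) - math.log(b)) for a, b in zip(x, y)]


errors = absolute_log_error_sketch([1.0, 2.0, 8.0], [1.0, 4.0, 2.0])
```

A factor-of-two mismatch yields log(2) whether the first value is twice or half the second, which is what makes the measure useful for comparing intensity ratios.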

distance_matrix(peaks1, peaks2)

Return the distance matrix between two sets of peaks.

The distance is defined as the maximum of the distance in position and the distance in intensity. The position distance is the absolute difference in position. The intensity distance is the absolute difference in log intensity.

Parameters:
  • peaks1 (ndarray) – (n, 2) array of peaks with [position, intensity]

  • peaks2 (ndarray) – (m, 2) array of peaks with [position, intensity]

Return type:

ndarray

Returns:

(n, m) distance matrix
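The distance definition above can be written out directly. The following sketch builds the (n, m) matrix as a nested list in plain Python. The name `distance_matrix_sketch` and the use of the natural logarithm are assumptions, and the library version operates on ndarrays instead.

```python
import math


def distance_matrix_sketch(peaks1, peaks2):
    """Return an n-by-m nested list of max(position gap, log-intensity gap)."""
    return [
        [
            max(abs(pos1 - pos2), abs(math.log(int1) - math.log(int2)))
            for pos2, int2 in peaks2
        ]
        for pos1, int1 in peaks1
    ]


calc = [(10.0, 100.0), (20.0, 50.0)]
obs = [(10.1, 100.0), (20.0, 100.0), (35.0, 10.0)]
dist = distance_matrix_sketch(calc, obs)
```

In this example `dist[0][0]` is set by the 0.1 degree position gap, while `dist[1][1]` is set by the factor-of-two intensity gap, since the two positions coincide.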

find_best_match(peak_calc, peak_obs, angle_tolerance=0.2, intensity_tolerance=2, max_intensity_tolerance=5)

Find the best match between two sets of peaks.

Parameters:
  • peak_calc (ndarray) – the calculated peaks, (n, 2) array of peaks with [position, intensity]

  • peak_obs (ndarray) – the observed peaks, (m, 2) array of peaks with [position, intensity]

  • angle_tolerance (float) – the maximum difference in angle

  • intensity_tolerance (float) – the maximum ratio of the intensities

  • max_intensity_tolerance (float) – the maximum ratio of the intensities to be considered as missing or extra

Return type:

dict[str, Any]

Returns:

a dictionary with the following keys:

missing[j]:

the indices of the missing peaks in the observed peaks

matched[i, j]:

the indices of the matched peaks in both the calculated peaks and the observed peaks

extra[i]:

the indices of the extra peaks in the calculated peaks

wrong_intensity[i, j]:

the indices of the peaks with wrong intensities in both the calculated peaks and the observed peaks

residual_peaks (N_peak_obs, 2):

the residual peaks after matching (not including extra peaks in peak_calc)
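To show how the three tolerances could interact, the sketch below classifies a single calculated/observed pair. It is one interpretation of the parameter descriptions, not the library's matching algorithm. It does not solve the assignment between the two peak lists, and both the helper name `classify_pair_sketch` and the symmetric intensity ratio are assumptions.

```python
def classify_pair_sketch(calc, obs, angle_tolerance=0.2,
                         intensity_tolerance=2, max_intensity_tolerance=5):
    """Label one (position, intensity) pair under the assumed tolerance rules."""
    (pos_calc, int_calc), (pos_obs, int_obs) = calc, obs
    # Peaks further apart than the angle tolerance cannot be paired.
    if abs(pos_calc - pos_obs) > angle_tolerance:
        return "unmatched"
    # Symmetric ratio: always >= 1, whichever peak is the stronger one.
    ratio = max(int_calc / int_obs, int_obs / int_calc)
    if ratio <= intensity_tolerance:
        return "matched"
    if ratio <= max_intensity_tolerance:
        return "wrong_intensity"
    # Beyond the maximum ratio the peaks count as missing or extra instead.
    return "unmatched"


labels = [
    classify_pair_sketch((10.0, 100.0), (10.1, 80.0)),
    classify_pair_sketch((10.0, 100.0), (10.1, 30.0)),
    classify_pair_sketch((10.0, 100.0), (10.1, 10.0)),
    classify_pair_sketch((10.0, 100.0), (11.0, 100.0)),
]
```

Under this reading, an unmatched observed peak would be reported under `missing` and an unmatched calculated peak under `extra`.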

merge_peaks(peaks, resolution=0.0)

Merge peaks that are too close to each other (closer than resolution).

Parameters:
  • peaks (ndarray) – the peaks to merge

  • resolution (float) – the resolution to use for merging

Return type:

ndarray

Returns:

the merged peaks
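A simple way to picture the merge is a single pass over the peaks sorted by position. The sketch below makes two assumptions that are not confirmed by the description above: merged intensities are summed, and the merged position is the intensity-weighted mean. It also assumes strictly positive intensities and uses tuples rather than ndarrays.

```python
def merge_peaks_sketch(peaks, resolution=0.0):
    """Merge neighbouring (position, intensity) peaks closer than `resolution`."""
    merged = []
    for position, intensity in sorted(peaks):
        if merged and position - merged[-1][0] < resolution:
            prev_position, prev_intensity = merged[-1]
            total = prev_intensity + intensity
            # Assumed rule: intensity-weighted position, summed intensity.
            weighted = (prev_position * prev_intensity
                        + position * intensity) / total
            merged[-1] = (weighted, total)
        else:
            merged.append((position, intensity))
    return merged


peaks = [(10.0, 50.0), (10.05, 50.0), (20.0, 30.0)]
merged = merge_peaks_sketch(peaks, resolution=0.1)
untouched = merge_peaks_sketch(peaks)
```

With the default `resolution=0.0` no gap is smaller than the resolution, so the sketch returns the sorted peaks unchanged.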