dara.search.peak_matcher module
- class PeakMatcher(peak_calc, peak_obs, intensity_resolution=0.01, angle_resolution=0.1, angle_tolerance=0.2, intensity_tolerance=2, max_intensity_tolerance=5)
Bases: object
Peak matcher class to match the calculated peaks with the observed peaks.
- Parameters:
peak_calc (ndarray) – the calculated peaks, (n, 2) array of peaks with [position, intensity]
peak_obs (ndarray) – the observed peaks, (m, 2) array of peaks with [position, intensity]
intensity_resolution (float) – the resolution for the intensity, default 0.01. Peaks with lower intensity are filtered out
angle_resolution (float) – the resolution for the angle, default 0.1
angle_tolerance (float) – the maximum difference in angle, default 0.2
intensity_tolerance (float) – the maximum ratio of the intensities, default 2
max_intensity_tolerance (float) – the maximum ratio of the intensities to be considered as missing or extra, default 5
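One possible reading of the three tolerance parameters is sketched below. The function `classify_pair_sketch` is hypothetical and not part of dara; it assumes that a pair within `angle_tolerance` counts as matched when the intensity ratio is at most `intensity_tolerance`, as a wrong-intensity pair when the ratio is at most `max_intensity_tolerance`, and as unmatched (missing or extra) otherwise. The library's actual rules may differ.

```python
def classify_pair_sketch(calc_peak, obs_peak, angle_tolerance=0.2,
                         intensity_tolerance=2, max_intensity_tolerance=5):
    """Classify one (calculated, observed) peak pair. Assumed logic, not dara's code."""
    calc_pos, calc_int = calc_peak
    obs_pos, obs_int = obs_peak
    # Peaks further apart than the angle tolerance are never paired.
    if abs(calc_pos - obs_pos) > angle_tolerance:
        return "unmatched"
    # Symmetric intensity ratio (always >= 1); assumes positive intensities.
    ratio = max(calc_int, obs_int) / min(calc_int, obs_int)
    if ratio <= intensity_tolerance:
        return "matched"
    if ratio <= max_intensity_tolerance:
        return "wrong_intensity"
    return "unmatched"


close_pair = classify_pair_sketch((10.0, 100.0), (10.1, 80.0))
off_intensity = classify_pair_sketch((10.0, 100.0), (10.1, 30.0))
too_weak = classify_pair_sketch((10.0, 100.0), (10.1, 10.0))
too_far = classify_pair_sketch((10.0, 100.0), (10.5, 100.0))
```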
- property extra: ndarray
Get the extra peaks in the calculated peaks.
- get_isolated_peaks(peak_type, min_angle_difference=0.3, min_intensity_ratio=0.03)
Get the isolated missing or extra peaks, depending on peak_type.
The isolated missing/extra peaks are the missing/extra peaks that are not close to any other peaks in matched and wrong intensity peaks.
- Parameters:
peak_type (Literal['missing', 'extra']) – the type of the peaks to consider, either “missing” or “extra”
min_angle_difference (float) – the tolerance to consider a peak as close to another peak, default 0.3 degrees
min_intensity_ratio (float) – the minimum ratio of the intensity to be considered as a peak, default 0.03
- Return type:
ndarray
- Returns:
the isolated missing or extra peaks with [position, intensity]
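The isolation filter described above can be sketched in NumPy. The helper `isolated_peaks_sketch` is hypothetical: it takes the candidate (missing or extra) peaks and the reference (matched and wrong-intensity) peaks as explicit arguments, and it assumes that `min_intensity_ratio` is measured against the strongest peak among both arrays. The source does not say what the ratio is relative to, so treat that part as a guess.

```python
import numpy as np


def isolated_peaks_sketch(candidates, reference,
                          min_angle_difference=0.3, min_intensity_ratio=0.03):
    """Keep candidate peaks that are strong enough and far from every reference peak."""
    candidates = np.asarray(candidates, dtype=float).reshape(-1, 2)
    reference = np.asarray(reference, dtype=float).reshape(-1, 2)
    if len(candidates) == 0:
        return candidates
    # Assumption: the intensity floor is relative to the strongest peak overall.
    max_intensity = np.concatenate([candidates[:, 1], reference[:, 1]]).max()
    strong = candidates[:, 1] >= min_intensity_ratio * max_intensity
    if len(reference):
        # Distance in position from each candidate to its nearest reference peak.
        gaps = np.abs(candidates[:, None, 0] - reference[None, :, 0]).min(axis=1)
        isolated = gaps > min_angle_difference
    else:
        isolated = np.ones(len(candidates), dtype=bool)
    return candidates[strong & isolated]


missing = np.array([[15.0, 20.0], [20.1, 30.0], [40.0, 1.0]])
matched_and_wrong = np.array([[20.0, 100.0], [30.0, 50.0]])
isolated = isolated_peaks_sketch(missing, matched_and_wrong)
```

Here the peak at 20.1 is dropped because it sits 0.1 degrees from a reference peak, and the peak at 40.0 is dropped because its intensity falls below the assumed floor.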
- jaccard_index()
Calculate the Jaccard index of the matching result.
- Return type:
float
- Returns:
the Jaccard index of the matching result
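For orientation, the Jaccard index of two sets is the size of their intersection divided by the size of their union. A count-based sketch follows; `jaccard_index_sketch` is hypothetical, and the source does not state how dara treats wrong-intensity peaks or whether it weights peaks by intensity, so this only illustrates the general definition.

```python
def jaccard_index_sketch(n_matched, n_missing, n_extra):
    """Intersection over union, treating matched peaks as the intersection.

    Assumption: the union is matched + missing + extra peaks, counted equally.
    """
    union = n_matched + n_missing + n_extra
    return n_matched / union if union else 0.0


jaccard = jaccard_index_sketch(n_matched=6, n_missing=2, n_extra=2)
```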
- property matched: tuple[ndarray, ndarray]
Get the matched peaks in both the calculated peaks and the observed peaks.
- property missing: ndarray
Get the missing peaks in the observed peaks. The shape should be (N, 2) with [position, intensity].
- score(matched_coeff=1, wrong_intensity_coeff=1, missing_coeff=-0.1, extra_coeff=-0.5, normalize=True)
Calculate the score of the matching result.
- Parameters:
matched_coeff (float) – the coefficient of the matched peaks
wrong_intensity_coeff (float) – the coefficient of the peaks with wrong intensities
missing_coeff (float) – the coefficient of the missing peaks
extra_coeff (float) – the coefficient of the extra peaks
normalize (bool) – whether to normalize the score by the total intensity of the observed peaks
- Return type:
float
- Returns:
the score of the matching result
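A minimal sketch of how such coefficients could combine, assuming the score is a weighted sum of the summed intensities in each category, optionally divided by the total observed intensity. The function `score_sketch` and its explicit peak arguments are hypothetical; only the coefficient names, their defaults, and the normalization option come from the signature above.

```python
import numpy as np


def score_sketch(matched, wrong_intensity, missing, extra, peak_obs,
                 matched_coeff=1.0, wrong_intensity_coeff=1.0,
                 missing_coeff=-0.1, extra_coeff=-0.5, normalize=True):
    """Weighted sum of per-category intensities. Assumed formula, not dara's code."""
    def total_intensity(peaks):
        # Each input is an (k, 2) array of [position, intensity]; k may be 0.
        return np.asarray(peaks, dtype=float).reshape(-1, 2)[:, 1].sum()

    raw = (matched_coeff * total_intensity(matched)
           + wrong_intensity_coeff * total_intensity(wrong_intensity)
           + missing_coeff * total_intensity(missing)
           + extra_coeff * total_intensity(extra))
    if normalize:
        raw /= total_intensity(peak_obs)
    return float(raw)


peak_obs = np.array([[10.0, 60.0], [20.0, 30.0], [30.0, 10.0]])
example_score = score_sketch(
    matched=[[10.0, 60.0]],
    wrong_intensity=[[20.0, 30.0]],
    missing=[[30.0, 10.0]],
    extra=[[40.0, 20.0]],
    peak_obs=peak_obs,
)
```

With the default coefficients, missing and extra peaks lower the score, and extra peaks are penalized more heavily than missing ones.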
- property wrong_intensity: tuple[ndarray, ndarray]
Get the indices of the peaks with wrong intensities in both the calculated peaks and the observed peaks.
- absolute_log_error(x, y)
Calculate the absolute error of two arrays in log space.
- Parameters:
x (ndarray) – array 1
y (ndarray) – array 2
- Return type:
ndarray
- Returns:
the absolute error in log space
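The quantity described here can be sketched as the element-wise absolute difference of logarithms. The name `absolute_log_error_sketch` is hypothetical, and the choice of the natural logarithm is an assumption; the source does not state the base.

```python
import numpy as np


def absolute_log_error_sketch(x, y):
    """Element-wise |log(x) - log(y)|; assumes natural log and positive inputs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.abs(np.log(x) - np.log(y))


# Both pairs differ by a factor of 2, in opposite directions.
log_err = absolute_log_error_sketch(np.array([1.0, 10.0]), np.array([2.0, 5.0]))
```

Working in log space makes the error depend on the ratio of the two values, so a factor-of-two difference gives the same error regardless of direction or scale.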
- distance_matrix(peaks1, peaks2)
Return the distance matrix between two sets of peaks.
The distance is defined as the maximum of the distance in position and the distance in intensity. The position distance is the absolute difference in position. The intensity distance is the absolute difference in log intensity.
- Parameters:
peaks1 (ndarray) – (n, 2) array of peaks with [position, intensity]
peaks2 (ndarray) – (m, 2) array of peaks with [position, intensity]
- Return type:
ndarray
- Returns:
(n, m) distance matrix
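The distance definition above maps directly onto NumPy broadcasting. The sketch below uses a hypothetical name, `distance_matrix_sketch`, and assumes the natural logarithm for the intensity term; it is an illustration of the stated definition, not the library's implementation.

```python
import numpy as np


def distance_matrix_sketch(peaks1, peaks2):
    """(n, m) matrix of max(|position difference|, |log intensity difference|)."""
    peaks1 = np.asarray(peaks1, dtype=float)
    peaks2 = np.asarray(peaks2, dtype=float)
    # Broadcast (n, 1) against (1, m) to compare every pair of peaks.
    position_dist = np.abs(peaks1[:, None, 0] - peaks2[None, :, 0])
    intensity_dist = np.abs(np.log(peaks1[:, None, 1]) - np.log(peaks2[None, :, 1]))
    return np.maximum(position_dist, intensity_dist)


calc = np.array([[10.0, 100.0], [20.0, 50.0]])
obs = np.array([[10.1, 100.0], [20.0, 25.0], [30.0, 10.0]])
dist = distance_matrix_sketch(calc, obs)
```

In this example `dist[0, 0]` is dominated by the 0.1 position offset, while `dist[1, 1]` is dominated by the factor-of-two intensity difference.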
- find_best_match(peak_calc, peak_obs, angle_tolerance=0.2, intensity_tolerance=2, max_intensity_tolerance=5)
Find the best match between two sets of peaks.
- Parameters:
peak_calc (ndarray) – the calculated peaks, (n, 2) array of peaks with [position, intensity]
peak_obs (ndarray) – the observed peaks, (m, 2) array of peaks with [position, intensity]
angle_tolerance (float) – the maximum difference in angle
intensity_tolerance (float) – the maximum ratio of the intensities
max_intensity_tolerance (float) – the maximum ratio of the intensities to be considered as missing or extra
- Return type:
dict[str, Any]
- Returns:
missing[j]: the indices of the missing peaks in the observed peaks
matched[i, j]: the indices of the matched peaks in both the calculated peaks and the observed peaks
extra[i]: the indices of the extra peaks in the calculated peaks
wrong_intensity[i, j]: the indices of the peaks with wrong intensities in both the calculated peaks and the observed peaks
residual_peaks (N_peak_obs, 2): the residual peaks after matching (not including extra peaks in peak_calc)
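The index arrays in the returned dictionary can be used to pull the corresponding rows out of the input arrays. The sketch below does not call dara; `result` is a hand-built stand-in, and the assumed layout (pairs stored as (k, 2) integer arrays of [calc index, obs index], single indices as 1-D arrays) is a guess based on the `[i, j]` and `[j]` notation above.

```python
import numpy as np

peak_calc = np.array([[10.0, 100.0], [20.0, 50.0], [35.0, 5.0]])
peak_obs = np.array([[10.1, 95.0], [20.0, 10.0], [30.0, 40.0]])

# Hand-built stand-in for the returned dict; layout is an assumption.
result = {
    "matched": np.array([[0, 0]]),          # calc peak 0 pairs with obs peak 0
    "wrong_intensity": np.array([[1, 1]]),  # calc peak 1 pairs with obs peak 1
    "missing": np.array([2]),               # obs peak 2 has no calculated partner
    "extra": np.array([2]),                 # calc peak 2 has no observed partner
}

# Column 0 indexes peak_calc, column 1 indexes peak_obs.
matched_calc = peak_calc[result["matched"][:, 0]]
matched_obs = peak_obs[result["matched"][:, 1]]
missing_peaks = peak_obs[result["missing"]]
extra_peaks = peak_calc[result["extra"]]
```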