Tutorial 2: Phase analysis with tree search
Dara is equipped with a parallelized tree search algorithm to identify possible phases present in a given XRD pattern.
In this tutorial, we will try to identify the phases in an experimental sample from a
solid-state reaction between GeO2 and ZnO.
You can download this tutorial project from here.
%pip install ipywidgets nbformat
from pathlib import Path
from dara import search_phases
pattern_path = "tutorial_data/GeO2-ZnO_700C_60min.xrdml"
# three elements are present in the sample
chemical_system = "Ge-O-Zn"
Step 1: Prepare reference phases
Dara pre-builds an index of all the unique, low-energy phases in the ICSD and COD databases. It also implements a method to download CIF structures from the COD data server, so there is no need to obtain an offline copy of the database.
Before every search, we will need to gather all the reference phases in the chemical
system for the search algorithm. Dara provides ICSDDatabase and CODDatabase to do
the filtering.
In this example, we will use CODDatabase to download all the phases in the chemical system of Ge-O-Zn.
from dara.structure_db import CODDatabase
# The COD database contains methods to filter phases in the chemical system
cod_database = CODDatabase()
# gather reference phases and save them to a directory called "cifs"
all_cod_ids = cod_database.get_cifs_by_chemsys(chemical_system, dest_dir="cifs")
2026-01-25 06:25:19,804 WARNING dara.structure_db Local copy of database not found. Attempting to download structures...
2026-01-25 06:25:22,744 INFO dara.structure_db Saving downloaded CIFs to dara_downloaded_cifs
Skipping high-energy phase: 1528389 (Ge, 96): e_hull = 0.1494
Skipping high-energy phase: 9013109 (Ge, 64): e_hull = 0.3137
2026-01-25 06:25:22,754 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,754 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,755 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,755 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,756 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,756 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,756 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,757 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,757 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,757 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,758 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,758 INFO dara.structure_db Skipping common gas: O2
2026-01-25 06:25:22,759 INFO dara.structure_db Skipping common gas: O2
Skipping high-energy phase: 1011223 (ZnO2, 19): e_hull = 0.1674
Skipping high-energy phase: 1529590 (ZnO2, 164): e_hull = 0.4588
Skipping high-energy phase: 1534836 (ZnO, 225): e_hull = 0.1473
Skipping high-energy phase: 1525835 (GeO2, 205): e_hull = 0.2246
Skipping high-energy phase: 1533322 (Ge7O23, 215): e_hull = 0.6571
Successfully copied 9012435.cif to Zn_194_(cod_9012435)-0.cif in cifs
Successfully copied 4030923.cif to Zn_12_(cod_4030923)-None.cif in cifs
Successfully copied 9011050.cif to Ge_227_(cod_9011050)-0.cif in cifs
Successfully copied 7101738.cif to Ge_227_(cod_7101738)-0.cif in cifs
Successfully copied 1538108.cif to O17.28_12_(cod_1538108)-None.cif in cifs
Successfully copied 9004178.cif to ZnO_186_(cod_9004178)-0.cif in cifs
Successfully copied 1527883.cif to ZnO2_44_(cod_1527883)-None.cif in cifs
Successfully copied 1536063.cif to Zn10.26O48_160_(cod_1536063)-None.cif in cifs
Successfully copied 1537875.cif to ZnO_216_(cod_1537875)-7.cif in cifs
Successfully copied 4517837.cif to Zn5O12_15_(cod_4517837)-None.cif in cifs
Successfully copied 9007435.cif to GeO2_136_(cod_9007435)-0.cif in cifs
Successfully copied 1525833.cif to GeO2_60_(cod_1525833)-36.cif in cifs
Successfully copied 2104024.cif to GeO2_60_(cod_2104024)-36.cif in cifs
Successfully copied 1526227.cif to GeO2_14_(cod_1526227)-None.cif in cifs
Successfully copied 2300365.cif to GeO2_152_(cod_2300365)-0.cif in cifs
Successfully copied 8000212.cif to Ge5O11_12_(cod_8000212)-None.cif in cifs
Successfully copied 9006858.cif to GeO2_58_(cod_9006858)-6.cif in cifs
Successfully copied 9007477.cif to GeO2_154_(cod_9007477)-0.cif in cifs
Successfully copied 9015579.cif to GeO2_92_(cod_9015579)-1.cif in cifs
Successfully copied 1007256.cif to Zn2Ge3O8_212_(cod_1007256)-2.cif in cifs
Successfully copied 1549040.cif to Zn2GeO4_227_(cod_1549040)-None.cif in cifs
Successfully copied 1549041.cif to Zn2GeO4_95_(cod_1549041)-None.cif in cifs
Successfully copied 9014631.cif to Zn2GeO4_148_(cod_9014631)-0.cif in cifs
Since we are using a pre-filtered database (i.e., the COD), the downloaded CIF files are automatically named according to the following convention:
{composition}_{spacegroup}_({cod|icsd}_{id})-{e_hull}.cif
Here, e_hull is the energy above the convex hull in meV/atom, as determined from
the Materials Project database for the ground-state entry with matching composition and space group.
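The naming convention can be read back with a small parser. The sketch below is not part of Dara: the function name `parse_cif_name` and the regular expression are assumptions based only on the file names printed in the log above, and the literal `None` suffix is assumed to mean that no e_hull value was available for that entry.

```python
import re
from pathlib import Path

# Pattern inferred from the file names in the log above (an assumption, not Dara's API).
CIF_NAME = re.compile(
    r"^(?P<composition>.+?)_(?P<spacegroup>\d+)"
    r"_\((?P<source>cod|icsd)_(?P<id>\d+)\)-(?P<e_hull>.+)\.cif$"
)


def parse_cif_name(path):
    """Split a reference CIF file name into its components (hypothetical helper)."""
    match = CIF_NAME.match(Path(path).name)
    if match is None:
        raise ValueError(f"Unexpected CIF file name: {path}")
    fields = match.groupdict()
    fields["spacegroup"] = int(fields["spacegroup"])
    # Assumed meaning: "None" marks an entry without an e_hull value.
    fields["e_hull"] = None if fields["e_hull"] == "None" else float(fields["e_hull"])
    return fields


info = parse_cif_name("cifs/GeO2_60_(cod_1525833)-36.cif")
print(info["composition"], info["spacegroup"], info["source"], info["id"], info["e_hull"])
```

A parser like this makes it possible to filter the downloaded CIFs, for example by e_hull, before passing them to the search.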
Step 2: Search for phases
After preparing the reference CIFs, we can start the phase search on a provided XRD pattern.
In this case, we are using the XRD pattern of the solid-state reaction sample, measured
on our laboratory’s Aeris diffractometer (tutorial_data/GeO2-ZnO_700C_60min.xrdml).
# gather all the phases in the "cifs" directory
all_cifs = list(Path("cifs").glob("*.cif"))
search_results = search_phases(
    pattern_path=pattern_path,
    phases=all_cifs,
    wavelength="Cu",
    instrument_profile="Aeris-fds-Pixcel1d-Medipix3",
)
2026-01-25 06:25:23,659 INFO worker.py:1927 -- Started a local Ray instance.
2026-01-25 06:25:24,746 INFO dara.search.tree Detecting peaks in the pattern.
2026-01-25 06:25:51,291 INFO dara.search.tree The wmax is automatically adjusted to 60.04.
2026-01-25 06:25:51,292 INFO dara.search.tree The intensity threshold is automatically set to 9.06 % of maximum peak intensity.
2026-01-25 06:25:51,293 INFO dara.search.tree Creating the root node.
2026-01-25 06:25:51,293 INFO dara.search.tree Refining all the phases in the dataset.
2026-01-25 06:26:13,448 INFO dara.search.tree The initial value of eps2 is automatically set to 0.000000_-0.05^0.05.
2026-01-25 06:26:13,450 INFO dara.search.tree Finished refining 23 phases, with 7 phases removed.
2026-01-25 06:26:13,450 INFO dara.search.tree Express mode is enabled. Grouping phases before starting.
2026-01-25 06:26:13,978 INFO dara.search.tree Phases are grouped into 15 groups. In express mode, only the best phase in each group will be considered during the search.
(_remote_expand_node pid=2968) 2026-01-25 06:26:14,018 INFO dara.search.tree Expanding node b1ecd0ac-f9b6-11f0-a095-7ced8d093cd6 with current phases [], Rwp = None
(_remote_expand_node pid=2968) 2026-01-25 06:26:14,935 INFO dara.search.tree Expanding node bfe8aabe-f9b6-11f0-a095-7ced8d093cd6 with current phases [RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'})], Rwp = 41.3
(_remote_expand_node pid=2968) 2026-01-25 06:26:20,028 INFO dara.search.tree Expanding node c2ef882c-f9b6-11f0-a095-7ced8d093cd6 with current phases [RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'}), RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={'k1': '0.000000_0.0^0.01', 'b1': '0.004610_0.0^0.005'}), RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'})], Rwp = 12.11 [repeated 4x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
Step 3: Result analysis
The returned search result is a list of SearchResult objects.
search_results
[SearchResult(refinement_result=RefinementResult(lst_data=LstResult(raw_lst='Rietveld refinement to file(s) GeO2-ZnO_700C_60min.xy\nBGMN version 4.2.23, 4614 measured points, 135 peaks, 24 parameters\nStart: Sun Jan 25 06:26:17 2026; End: Sun Jan 25 06:26:19 2026\n23 iteration steps\n\nRp=9.82% Rpb=18.72% R=10.35% Rwp=12.11% Rexp=2.68%\nDurbin-Watson d=0.10\n1-rho=2.04%\n\nGlobal parameters and GOALs\n****************************\nQGeO2152cod23003650=0.4809+-0.0021\nQZnO186cod90041780=0.3862+-0.0024\nQZn2GeO4148cod90146310=0.1329+-0.0013\nEPS2=-0.002894+-0.000012\n\nLocal parameters and GOALs for phase GeO2152cod23003650\n******************************************************\nSpacegroupNo=152\nHermannMauguin=P3_121\nXrayDensity=4.276\nRphase=11.17%\nUNIT=NM\nA=0.499118+-0.000020\nC=0.564812+-0.000033\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.2613+-0.0011\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase GeO2152cod23003650\n---------------------------------------------\n 3 0.4512 0.0000 0.3333 E=(GE(1.0000))\n 6 0.3974 0.3022 0.2429 E=(O(1.0000))\n\nLocal parameters and GOALs for phase ZnO186cod90041780\n******************************************************\nSpacegroupNo=186\nHermannMauguin=P6_3mc\nXrayDensity=5.669\nRphase=9.24%\nUNIT=NM\nA=0.325086+-0.000010\nC=0.520833+-0.000029\nk1=0\nB1=0.003365+-0.000094\nGEWICHT=0.2098+-0.0019\nGrainSize(1,1,1)=126.1+-3.5\nAtomic positions for phase ZnO186cod90041780\n---------------------------------------------\n 2 0.3333 0.6667 0.0000 E=(ZN(1.0000))\n 2 0.3333 0.6667 0.3821 E=(O(1.0000))\n\nLocal parameters and GOALs for phase Zn2GeO4148cod90146310\n******************************************************\nSpacegroupNo=148\nHermannMauguin=R-3\nXrayDensity=4.776\nRphase=19.33%\nUNIT=NM\nA=1.423920+-0.000081\nC=0.952754+-0.000072\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.07221+-0.00069\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase Zn2GeO4148cod90146310\n---------------------------------------------\n 18 0.2150 0.1940 
0.5830 E=(ZN(1.0000))\n 18 0.5483 0.8607 0.5837 E=(ZN(1.0000))\n 18 0.2150 0.1940 0.2500 E=(GE(1.0000))\n 18 0.8877 0.4633 0.4293 E=(O(1.0000))\n 18 0.2220 0.1310 0.4030 E=(O(1.0000))\n 18 0.2230 0.1140 0.7500 E=(O(1.0000))\n 18 0.9957 0.6613 0.5833 E=(O(1.0000))\n', pattern_name='GeO2-ZnO_700C_60min.xy', num_steps=23, rp=9.82, rpb=18.72, r=10.35, rwp=12.11, rexp=2.68, d=0.1, rho=2.04, phases_results={'GeO2_152_(cod_2300365)-0': PhaseResult(spacegroup_no=152, hermann_mauguin='P3_121', xray_density=4.276, rphase=11.17, unit='NM', gewicht=(0.2613, 0.0011), gewicht_name=None, a=(0.499118, 2e-05), b=None, c=(0.564812, 3.3e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 3 0.4512 0.0000 0.3333 E=(GE(1.0000))\n 6 0.3974 0.3022 0.2429 E=(O(1.0000))', k1=0.01, B1=0.005), 'ZnO_186_(cod_9004178)-0': PhaseResult(spacegroup_no=186, hermann_mauguin='P6_3mc', xray_density=5.669, rphase=9.24, unit='NM', gewicht=(0.2098, 0.0019), gewicht_name=None, a=(0.325086, 1e-05), b=None, c=(0.520833, 2.9e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 2 0.3333 0.6667 0.0000 E=(ZN(1.0000))\n 2 0.3333 0.6667 0.3821 E=(O(1.0000))', k1=0, B1=(0.003365, 9.4e-05)), 'Zn2GeO4_148_(cod_9014631)-0': PhaseResult(spacegroup_no=148, hermann_mauguin='R-3', xray_density=4.776, rphase=19.33, unit='NM', gewicht=(0.07221, 0.00069), gewicht_name=None, a=(1.42392, 8.1e-05), b=None, c=(0.952754, 7.2e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 18 0.2150 0.1940 0.5830 E=(ZN(1.0000))\n 18 0.5483 0.8607 0.5837 E=(ZN(1.0000))\n 18 0.2150 0.1940 0.2500 E=(GE(1.0000))\n 18 0.8877 0.4633 0.4293 E=(O(1.0000))\n 18 0.2220 0.1310 0.4030 E=(O(1.0000))\n 18 0.2230 0.1140 0.7500 E=(O(1.0000))\n 18 0.9957 0.6613 0.5833 E=(O(1.0000))', k1=0.01, B1=0.005)}, QGeO2152cod23003650=(0.4809, 0.0021), QZnO186cod90041780=(0.3862, 0.0024), QZn2GeO4148cod90146310=(0.1329, 0.0013), EPS2=(-0.002894, 1.2e-05))), phases=((RefinementPhase(path=PosixPath('cifs/GeO2_154_(cod_9007477)-0.cif'), 
params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'}), RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'})), (RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={'k1': '0.000000_0.0^0.01', 'b1': '0.004610_0.0^0.005'}),), (RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'}),)), foms=((0.036542242871968895, 0.036651544160763515), (0.023575326529864105,), (0.014573994434645588,), (0.33021199329441936,)), lattice_strains=((0.0005516300389977617, 0.0002442188732087964), (0.00039042658300650893,), (-0.007717561975400113,), (-0.003294192005003817,)), missing_peaks=[], extra_peaks=[])]
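Each entry in `phases_results` above carries a `gewicht` pair (value, uncertainty). Normalizing these values so that they sum to one reproduces the `Q...` parameters in the raw output to four decimal places, which suggests that they can be read as weight fractions. The sketch below does that arithmetic with numbers copied from the output; the helper name `normalize_weights` is hypothetical, and reading the result as weight fractions is an assumption rather than something this tutorial states.

```python
# (value, uncertainty) pairs copied from the `gewicht` fields printed above.
gewicht = {
    "GeO2_152_(cod_2300365)-0": (0.2613, 0.0011),
    "ZnO_186_(cod_9004178)-0": (0.2098, 0.0019),
    "Zn2GeO4_148_(cod_9014631)-0": (0.07221, 0.00069),
}


def normalize_weights(gewicht):
    """Scale the gewicht values so that they sum to 1 (hypothetical helper)."""
    total = sum(value for value, _ in gewicht.values())
    return {name: value / total for name, (value, _) in gewicht.items()}


weight_fractions = normalize_weights(gewicht)
for name, fraction in weight_fractions.items():
    print(f"{name}: {100 * fraction:.1f} %")
```

The uncertainties are ignored here; propagating them would require the covariance between the refined parameters, which the output does not show.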
For this pattern, only one solution is found, with Rwp = 12.11 %.
for i, result in enumerate(search_results):
    print(f"Rwp of solution {i} = {result.refinement_result.lst_data.rwp} %")
Rwp of solution 0 = 12.11 %
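Rwp is easier to judge next to the expected R-factor, Rexp, which the raw output above also reports (Rexp = 2.68 %). A common convention in Rietveld refinement defines the goodness of fit as Rwp / Rexp and the reduced chi-squared as its square. The sketch below applies that convention to the two numbers copied from the output. The function name `goodness_of_fit` is hypothetical, and in a live session the inputs would presumably come from the `rwp` and `rexp` fields that appear in the printed `lst_data`.

```python
def goodness_of_fit(rwp, rexp):
    """Return (GOF, reduced chi-squared) using GOF = Rwp / Rexp (hypothetical helper)."""
    if rexp <= 0:
        raise ValueError("Rexp must be positive")
    gof = rwp / rexp
    return gof, gof**2


# Values copied from the refinement output above (both in percent).
gof, chi_squared = goodness_of_fit(rwp=12.11, rexp=2.68)
print(f"GOF = {gof:.2f}, reduced chi-squared = {chi_squared:.1f}")
```

A goodness of fit well above 1 means the model does not yet describe the pattern within counting statistics, so the plotted difference curve is worth inspecting alongside the Rwp value.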
Each SearchResult has a .visualize() method to visualize the refined pattern and
missing/extra peaks in the solution. If there are no missing or extra peaks, this option
will not appear.
search_results[0].visualize()
You can also view all the alternative phases in a solution through the SearchResult.phases attribute.
print("Phases found in solution 0:")
for i, phases_ in enumerate(search_results[0].phases):
    print(f" - Phase {i}: {[phase.path.name for phase in phases_]}")
Phases found in solution 0:
- Phase 0: ['GeO2_154_(cod_9007477)-0.cif', 'GeO2_152_(cod_2300365)-0.cif']
- Phase 1: ['ZnO_186_(cod_9004178)-0.cif']
- Phase 2: ['Zn2GeO4_148_(cod_9014631)-0.cif']
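From the result, you can see that for the phase GeO2, the algorithm identifies two
similar phases with slightly different space groups (152 and 154). These two space
groups, P3_121 and P3_221, form an enantiomorphic pair, so the structures are practically
indistinguishable by powder XRD.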
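Groups with more than one alternative can be flagged automatically. The sketch below works on file names copied from the printed groups rather than on live Dara objects; the helper `summarize_group` is hypothetical and relies on the composition and space group being the first two underscore-separated fields of each file name.

```python
from pathlib import Path

# File names copied from the printed groups above; in a live session they could
# be taken from `phase.path` for each entry in `search_results[0].phases`.
groups = [
    [Path("cifs/GeO2_154_(cod_9007477)-0.cif"), Path("cifs/GeO2_152_(cod_2300365)-0.cif")],
    [Path("cifs/ZnO_186_(cod_9004178)-0.cif")],
    [Path("cifs/Zn2GeO4_148_(cod_9014631)-0.cif")],
]


def summarize_group(paths):
    """Return (compositions, space groups) for one group of alternatives (hypothetical helper)."""
    compositions = sorted({p.name.split("_")[0] for p in paths})
    spacegroups = sorted({int(p.name.split("_")[1]) for p in paths})
    return compositions, spacegroups


for index, paths in enumerate(groups):
    compositions, spacegroups = summarize_group(paths)
    note = "ambiguous" if len(paths) > 1 else "unique"
    print(f"Phase {index}: {'/'.join(compositions)}, space group(s) {spacegroups} ({note})")
```

A group marked as ambiguous needs outside information, such as the known polymorph of the precursor, to settle which alternative to report.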