Tutorial 2: Phase analysis with tree search
Dara is equipped with a parallelized tree search algorithm to identify possible phases present in a given XRD pattern.
In this tutorial, we will try to identify the phases in a sample from an experimental
solid-state reaction between GeO2 and ZnO.
You can download this tutorial project from here.
%pip install ipywidgets nbformat
from pathlib import Path
from dara import search_phases
pattern_path = "tutorial_data/GeO2-ZnO_700C_60min.xrdml"
# three elements are present in the sample
chemical_system = "Ge-O-Zn"
Step 1: Prepare reference phases
Dara ships with a pre-built index of all the unique, low-energy phases in the ICSD and COD databases. It can also download CIF structures directly from the COD data server, so you do not need an offline copy of the database.
Before every search, we will need to gather all the reference phases in the chemical
system for the search algorithm. Dara provides ICSDDatabase and CODDatabase to do
the filtering.
In this example, we will use CODDatabase to download all the phases in the Ge-O-Zn chemical system.
from dara.structure_db import CODDatabase
# The COD database contains methods to filter phases in the chemical system
cod_database = CODDatabase()
# gather reference phases and save them to a directory called "cifs"
all_cod_ids = cod_database.get_cifs_by_chemsys(chemical_system, dest_dir="cifs")
2026-01-03 20:25:25,702 WARNING dara.structure_db Local copy of database not found. Attempting to download structures...
2026-01-03 20:25:29,383 INFO dara.structure_db Saving downloaded CIFs to dara_downloaded_cifs
Skipping high-energy phase: 1528389 (Ge, 96): e_hull = 0.1494
Skipping high-energy phase: 9013109 (Ge, 64): e_hull = 0.3137
2026-01-03 20:25:29,393 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,394 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,394 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,395 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,395 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,396 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,396 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,397 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,397 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,398 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,399 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,399 INFO dara.structure_db Skipping common gas: O2
2026-01-03 20:25:29,400 INFO dara.structure_db Skipping common gas: O2
Skipping high-energy phase: 1525835 (GeO2, 205): e_hull = 0.2246
Skipping high-energy phase: 1533322 (Ge7O23, 215): e_hull = 0.6571
Skipping high-energy phase: 1011223 (ZnO2, 19): e_hull = 0.1674
Skipping high-energy phase: 1529590 (ZnO2, 164): e_hull = 0.4588
Skipping high-energy phase: 1534836 (ZnO, 225): e_hull = 0.1473
Successfully copied 9011050.cif to Ge_227_(cod_9011050)-0.cif in cifs
Successfully copied 7101738.cif to Ge_227_(cod_7101738)-0.cif in cifs
Successfully copied 1538108.cif to O17.28_12_(cod_1538108)-None.cif in cifs
Successfully copied 9012435.cif to Zn_194_(cod_9012435)-0.cif in cifs
Successfully copied 4030923.cif to Zn_12_(cod_4030923)-None.cif in cifs
Successfully copied 9007435.cif to GeO2_136_(cod_9007435)-0.cif in cifs
Successfully copied 1525833.cif to GeO2_60_(cod_1525833)-36.cif in cifs
Successfully copied 2104024.cif to GeO2_60_(cod_2104024)-36.cif in cifs
Successfully copied 1526227.cif to GeO2_14_(cod_1526227)-None.cif in cifs
Successfully copied 2300365.cif to GeO2_152_(cod_2300365)-0.cif in cifs
Successfully copied 8000212.cif to Ge5O11_12_(cod_8000212)-None.cif in cifs
Successfully copied 9006858.cif to GeO2_58_(cod_9006858)-6.cif in cifs
Successfully copied 9007477.cif to GeO2_154_(cod_9007477)-0.cif in cifs
Successfully copied 9015579.cif to GeO2_92_(cod_9015579)-1.cif in cifs
Successfully copied 9004178.cif to ZnO_186_(cod_9004178)-0.cif in cifs
Successfully copied 1527883.cif to ZnO2_44_(cod_1527883)-None.cif in cifs
Successfully copied 1536063.cif to Zn10.26O48_160_(cod_1536063)-None.cif in cifs
Successfully copied 1537875.cif to ZnO_216_(cod_1537875)-7.cif in cifs
Successfully copied 4517837.cif to Zn5O12_15_(cod_4517837)-None.cif in cifs
Successfully copied 1007256.cif to Zn2Ge3O8_212_(cod_1007256)-2.cif in cifs
Successfully copied 1549040.cif to Zn2GeO4_227_(cod_1549040)-None.cif in cifs
Successfully copied 1549041.cif to Zn2GeO4_95_(cod_1549041)-None.cif in cifs
Successfully copied 9014631.cif to Zn2GeO4_148_(cod_9014631)-0.cif in cifs
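The log above lists entries for Ge, Zn, O2, GeO2, ZnO, and Zn2GeO4, which suggests that a chemical-system query covers every sub-system of the listed elements, not only the ternary compounds. The sketch below enumerates those sub-systems using only the standard library. The helper `sub_systems` is hypothetical and written for illustration; it is not part of the Dara API.

```python
from itertools import combinations


def sub_systems(chemical_system: str) -> list[str]:
    """Enumerate every sub-system of a dash-separated chemical system.

    Hypothetical helper for illustration; not part of the Dara API.
    """
    elements = sorted(chemical_system.split("-"))
    return [
        "-".join(combo)
        for size in range(1, len(elements) + 1)
        for combo in combinations(elements, size)
    ]


# Unary, binary, and ternary sub-systems of the tutorial's chemical system
print(sub_systems("Ge-O-Zn"))
```

For Ge-O-Zn this gives seven sub-systems, which is consistent with the elemental, binary, and ternary entries that appear in the download log.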
Since we are using a pre-filtered database (here, the COD), the downloaded CIF files are automatically named according to the following convention:
{composition}_{spacegroup}_(cod|icsd_{id})-{e_hull}.cif
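Here, e_hull is the energy above the convex hull in meV/atom, as determined from
the Materials Project database for the ground-state entry with matching composition and space group.
Some files in the log above carry None in place of a number (e.g., Zn_12_(cod_4030923)-None.cif), presumably because no matching entry was found.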
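Because the file name encodes the composition, space group, database ID, and e_hull, it can be handy to turn it back into structured data, for example to sort or filter the reference set before a search. The following is a minimal sketch of such a parser. The regular expression is inferred from the file names in the download log, the `icsd_{id}` form is assumed from the convention above, and `parse_reference_name` and `ReferenceName` are hypothetical names, not part of Dara.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the file names in the download log.
# This is an assumption for illustration, not Dara's own parser.
_CIF_NAME = re.compile(
    r"^(?P<composition>.+)_(?P<spacegroup>\d+)"
    r"_\((?P<source>cod|icsd)_(?P<entry_id>\d+)\)"
    r"-(?P<e_hull>None|\d+(?:\.\d+)?)\.cif$"
)


class ReferenceName(NamedTuple):
    composition: str
    spacegroup: int
    source: str
    entry_id: str
    e_hull_mev: Optional[float]  # None when the name carries no e_hull value


def parse_reference_name(filename: str) -> ReferenceName:
    """Split a reference CIF file name into its components."""
    match = _CIF_NAME.match(filename)
    if match is None:
        raise ValueError(f"Unexpected reference file name: {filename!r}")
    e_hull = match["e_hull"]
    return ReferenceName(
        composition=match["composition"],
        spacegroup=int(match["spacegroup"]),
        source=match["source"],
        entry_id=match["entry_id"],
        e_hull_mev=None if e_hull == "None" else float(e_hull),
    )


print(parse_reference_name("GeO2_60_(cod_1525833)-36.cif"))
print(parse_reference_name("Zn_12_(cod_4030923)-None.cif"))
```

Keeping e_hull as `Optional[float]` makes the entries without a value explicit instead of silently treating them as zero.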
Step 2: Search for phases
After preparing the reference CIFs, we can start the phase search on a provided XRD pattern.
In this case, we use the XRD pattern of the solid-state reaction sample, measured
on our laboratory’s Aeris diffractometer (tutorial_data/GeO2-ZnO_700C_60min.xrdml).
# gather all the phases in the "cifs" directory
all_cifs = list(Path("cifs").glob("*.cif"))
search_results = search_phases(
    pattern_path=pattern_path,
    phases=all_cifs,
    wavelength="Cu",
    instrument_profile="Aeris-fds-Pixcel1d-Medipix3",
)
2026-01-03 20:25:30,291 INFO worker.py:1927 -- Started a local Ray instance.
2026-01-03 20:25:31,342 INFO dara.search.tree Detecting peaks in the pattern.
2026-01-03 20:25:57,843 INFO dara.search.tree The wmax is automatically adjusted to 60.04.
2026-01-03 20:25:57,845 INFO dara.search.tree The intensity threshold is automatically set to 9.06 % of maximum peak intensity.
2026-01-03 20:25:57,845 INFO dara.search.tree Creating the root node.
2026-01-03 20:25:57,846 INFO dara.search.tree Refining all the phases in the dataset.
2026-01-03 20:26:18,628 INFO dara.search.tree The initial value of eps2 is automatically set to 0.000000_-0.05^0.05.
2026-01-03 20:26:18,629 INFO dara.search.tree Finished refining 23 phases, with 7 phases removed.
2026-01-03 20:26:18,630 INFO dara.search.tree Express mode is enabled. Grouping phases before starting.
2026-01-03 20:26:19,226 INFO dara.search.tree Phases are grouped into 15 groups. In express mode, only the best phase in each group will be considered during the search.
(_remote_expand_node pid=2943) 2026-01-03 20:26:19,287 INFO dara.search.tree Expanding node 697bd50a-e8e2-11f0-9c97-000d3a35404d with current phases [], Rwp = None
(_remote_expand_node pid=2945) 2026-01-03 20:26:20,222 INFO dara.search.tree Expanding node 76b335b1-e8e2-11f0-9c97-000d3a35404d with current phases [RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={'k1': '0.000000_0.0^0.01', 'b1': '0.004610_0.0^0.005'})], Rwp = 49.17
Step 3: Result analysis
The returned search result is a list of SearchResult objects.
search_results
[SearchResult(refinement_result=RefinementResult(lst_data=LstResult(raw_lst='Rietveld refinement to file(s) GeO2-ZnO_700C_60min.xy\nBGMN version 4.2.23, 4614 measured points, 135 peaks, 24 parameters\nStart: Sat Jan 3 20:26:22 2026; End: Sat Jan 3 20:26:24 2026\n23 iteration steps\n\nRp=9.82% Rpb=18.72% R=10.35% Rwp=12.11% Rexp=2.68%\nDurbin-Watson d=0.10\n1-rho=2.04%\n\nGlobal parameters and GOALs\n****************************\nQGeO2152cod23003650=0.4809+-0.0021\nQZnO186cod90041780=0.3862+-0.0024\nQZn2GeO4148cod90146310=0.1329+-0.0013\nEPS2=-0.002894+-0.000012\n\nLocal parameters and GOALs for phase GeO2152cod23003650\n******************************************************\nSpacegroupNo=152\nHermannMauguin=P3_121\nXrayDensity=4.276\nRphase=11.17%\nUNIT=NM\nA=0.499118+-0.000020\nC=0.564812+-0.000033\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.2613+-0.0011\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase GeO2152cod23003650\n---------------------------------------------\n 3 0.4512 0.0000 0.3333 E=(GE(1.0000))\n 6 0.3974 0.3022 0.2429 E=(O(1.0000))\n\nLocal parameters and GOALs for phase ZnO186cod90041780\n******************************************************\nSpacegroupNo=186\nHermannMauguin=P6_3mc\nXrayDensity=5.669\nRphase=9.24%\nUNIT=NM\nA=0.325086+-0.000010\nC=0.520833+-0.000029\nk1=0\nB1=0.003365+-0.000094\nGEWICHT=0.2098+-0.0019\nGrainSize(1,1,1)=126.1+-3.5\nAtomic positions for phase ZnO186cod90041780\n---------------------------------------------\n 2 0.3333 0.6667 0.0000 E=(ZN(1.0000))\n 2 0.3333 0.6667 0.3821 E=(O(1.0000))\n\nLocal parameters and GOALs for phase Zn2GeO4148cod90146310\n******************************************************\nSpacegroupNo=148\nHermannMauguin=R-3\nXrayDensity=4.776\nRphase=19.33%\nUNIT=NM\nA=1.423920+-0.000081\nC=0.952754+-0.000072\nk1=0.0100000\nB1=0.00500000\nGEWICHT=0.07221+-0.00069\nGrainSize(1,1,1)=84.1811\nAtomic positions for phase Zn2GeO4148cod90146310\n---------------------------------------------\n 18 0.2150 0.1940 
0.5830 E=(ZN(1.0000))\n 18 0.5483 0.8607 0.5837 E=(ZN(1.0000))\n 18 0.2150 0.1940 0.2500 E=(GE(1.0000))\n 18 0.8877 0.4633 0.4293 E=(O(1.0000))\n 18 0.2220 0.1310 0.4030 E=(O(1.0000))\n 18 0.2230 0.1140 0.7500 E=(O(1.0000))\n 18 0.9957 0.6613 0.5833 E=(O(1.0000))\n', pattern_name='GeO2-ZnO_700C_60min.xy', num_steps=23, rp=9.82, rpb=18.72, r=10.35, rwp=12.11, rexp=2.68, d=0.1, rho=2.04, phases_results={'GeO2_152_(cod_2300365)-0': PhaseResult(spacegroup_no=152, hermann_mauguin='P3_121', xray_density=4.276, rphase=11.17, unit='NM', gewicht=(0.2613, 0.0011), gewicht_name=None, a=(0.499118, 2e-05), b=None, c=(0.564812, 3.3e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 3 0.4512 0.0000 0.3333 E=(GE(1.0000))\n 6 0.3974 0.3022 0.2429 E=(O(1.0000))', k1=0.01, B1=0.005), 'ZnO_186_(cod_9004178)-0': PhaseResult(spacegroup_no=186, hermann_mauguin='P6_3mc', xray_density=5.669, rphase=9.24, unit='NM', gewicht=(0.2098, 0.0019), gewicht_name=None, a=(0.325086, 1e-05), b=None, c=(0.520833, 2.9e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 2 0.3333 0.6667 0.0000 E=(ZN(1.0000))\n 2 0.3333 0.6667 0.3821 E=(O(1.0000))', k1=0, B1=(0.003365, 9.4e-05)), 'Zn2GeO4_148_(cod_9014631)-0': PhaseResult(spacegroup_no=148, hermann_mauguin='R-3', xray_density=4.776, rphase=19.33, unit='NM', gewicht=(0.07221, 0.00069), gewicht_name=None, a=(1.42392, 8.1e-05), b=None, c=(0.952754, 7.2e-05), alpha=None, beta=None, gamma=None, atom_positions_string=' 18 0.2150 0.1940 0.5830 E=(ZN(1.0000))\n 18 0.5483 0.8607 0.5837 E=(ZN(1.0000))\n 18 0.2150 0.1940 0.2500 E=(GE(1.0000))\n 18 0.8877 0.4633 0.4293 E=(O(1.0000))\n 18 0.2220 0.1310 0.4030 E=(O(1.0000))\n 18 0.2230 0.1140 0.7500 E=(O(1.0000))\n 18 0.9957 0.6613 0.5833 E=(O(1.0000))', k1=0.01, B1=0.005)}, QGeO2152cod23003650=(0.4809, 0.0021), QZnO186cod90041780=(0.3862, 0.0024), QZn2GeO4148cod90146310=(0.1329, 0.0013), EPS2=(-0.002894, 1.2e-05))), phases=((RefinementPhase(path=PosixPath('cifs/GeO2_152_(cod_2300365)-0.cif'), 
params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'}), RefinementPhase(path=PosixPath('cifs/GeO2_154_(cod_9007477)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'})), (RefinementPhase(path=PosixPath('cifs/ZnO_186_(cod_9004178)-0.cif'), params={'k1': '0.000000_0.0^0.01', 'b1': '0.004610_0.0^0.005'}),), (RefinementPhase(path=PosixPath('cifs/Zn2GeO4_148_(cod_9014631)-0.cif'), params={'k1': '0.010000_0.0^0.01', 'b1': '0.005000_0.0^0.005'}),)), foms=((0.036651544160763515, 0.036542242871968895), (0.023575326529864105,), (0.014573994434645588,), (0.33021199329441936,)), lattice_strains=((0.0002442188732087964, 0.0005516300389977617), (0.00039042658300650893,), (-0.007717561975400113,), (-0.003294192005003817,)), missing_peaks=[], extra_peaks=[])]
For this pattern, only one solution was found, with Rwp = 12.11 %.
for i, result in enumerate(search_results):
    print(f"Rwp of solution {i} = {result.refinement_result.lst_data.rwp} %")
Rwp of solution 0 = 12.11 %
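Rwp is easier to judge next to the expected R-factor, Rexp, which the refinement output also reports (rexp=2.68). In Rietveld analysis the ratio Rwp/Rexp is commonly called the goodness of fit (GOF), and its square is the reduced chi-squared. The sketch below works through the arithmetic with the two values copied from the output above; `goodness_of_fit` is a hypothetical helper, not a Dara function.

```python
def goodness_of_fit(rwp: float, rexp: float) -> float:
    """Return GOF = Rwp / Rexp (both given in the same units, here %)."""
    if rexp <= 0:
        raise ValueError("rexp must be positive")
    return rwp / rexp


# Values copied from the refinement output above (rwp and rexp, in %)
rwp = 12.11
rexp = 2.68

gof = goodness_of_fit(rwp, rexp)
print(f"GOF = {gof:.2f}, chi^2 = {gof**2:.1f}")
```

A GOF close to 1 would mean the fit approaches the statistical limit of the data. A larger value, as here, indicates remaining misfit that a dedicated refinement of the identified phases could reduce.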
Each SearchResult has a .visualize() method that plots the refined pattern together with
any missing or extra peaks in the solution. If there are no missing or extra peaks, the
corresponding option will not appear in the plot.
search_results[0].visualize()
You can also view all the alternative phases in a solution through the SearchResult.phases attribute.
print("Phases found in solution 0:")
for i, phases_ in enumerate(search_results[0].phases):
    print(f" - Phase {i}: {[phase.path.name for phase in phases_]}")
Phases found in solution 0:
- Phase 0: ['GeO2_152_(cod_2300365)-0.cif', 'GeO2_154_(cod_9007477)-0.cif']
- Phase 1: ['ZnO_186_(cod_9004178)-0.cif']
- Phase 2: ['Zn2GeO4_148_(cod_9014631)-0.cif']
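From the result, you can see that for GeO2 the algorithm identifies two
similar phases with slightly different space groups (152 and 154). These two space
groups, P3_121 and P3_221, form an enantiomorphic pair, so powder XRD cannot tell
them apart and both are listed as alternatives for the same phase.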
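The refinement output printed earlier also carries quantitative information. In the raw BGMN listing, the Q parameters (0.4809, 0.3862, 0.1329) equal each phase's GEWICHT value divided by the sum over all phases, so they can be read as weight fractions. The sketch below reproduces that normalization from the gewicht values copied out of the phases_results entries. It assumes that GEWICHT is proportional to the mass of each phase, and `weight_fractions` is a hypothetical helper, not part of Dara.

```python
def weight_fractions(gewicht: dict[str, float]) -> dict[str, float]:
    """Normalize per-phase GEWICHT values so that they sum to 1."""
    total = sum(gewicht.values())
    if total <= 0:
        raise ValueError("GEWICHT values must sum to a positive number")
    return {name: value / total for name, value in gewicht.items()}


# gewicht values copied from the phases_results entries in the output above
# (assumed to be proportional to the mass of each phase)
gewicht = {
    "GeO2_152_(cod_2300365)-0": 0.2613,
    "ZnO_186_(cod_9004178)-0": 0.2098,
    "Zn2GeO4_148_(cod_9014631)-0": 0.07221,
}

fractions = weight_fractions(gewicht)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

The normalized values agree with the Q parameters in the listing to the printed precision, which is a quick consistency check on how the numbers were read.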