Databases

Figshare-based databases

JARVIS databases

Database name Number of data-points Description
dft_3d 75993 Various 3D materials properties in JARVIS-DFT database computed with OptB88vdW and TBmBJ methods
dft_2d 1109 Various 2D materials properties in JARVIS-DFT database computed with OptB88vdW
dft_3d_2021 55723 Various 3D materials properties in JARVIS-DFT database computed with OptB88vdW and TBmBJ methods (2021 version)
dft_2d_2021 1079 Various 2D materials properties in JARVIS-DFT database computed with OptB88vdW (2021 version)
cfid_3d 55723 Various 3D materials properties in JARVIS-DFT database computed with OptB88vdW and TBmBJ methods with CFID
jff 2538 Various 3D materials properties in JARVIS-FF database computed with several force-fields
alignn_ff_db 307113 Energy per atom, forces and stresses for ALIGNN-FF training for 75k materials
edos_pdos 48469 Normalized electron and phonon density of states with interpolated values and fixed number of bins
qe_tb 829574 Various 3D materials properties in JARVIS-QETB database
supercon_3d 1058 3D superconductor DFT dataset
supercon_2d 161 2D superconductor DFT dataset
vacancydb 464 Vacancy formation energy dataset
surfacedb 607 Surface property dataset
interfacedb 593 Interface property dataset
ramandb 5000 Raman spectra dataset
raw_files 144895 Figshare links to download raw VASP calculation files from JARVIS-DFT
stm 1132 2D materials STM images in JARVIS-STM database
wtbh_electron 1440 3D and 2D materials Wannier tight-binding Hamiltonian database for electrons with spin-orbit coupling in JARVIS-WTB (Keyword: 'WANN')
wtbh_phonon 15502 3D and 2D materials Wannier tight-binding Hamiltonian database for phonons at Gamma with finite difference (Keyword: 'FD-ELAST')

Alexandria databases

Database name Number of data-points Description
alex_pbe_hull 116k Alexandria DB convex hull stable materials with PBE functional
alex_pbe_3d_all 5 million Alexandria DB all 3D materials with PBE
alex_pbe_2d_all 200k Alexandria DB all 2D materials with PBE
alex_pbe_1d_all 100k Alexandria DB all 1D materials with PBE
alex_scan_3d_all 500k Alexandria DB all 3D materials with SCAN
alex_pbesol_3d_all 500k Alexandria DB all 3D materials with PBEsol
alex_supercon 8253 Alexandria superconductor database

RRUFF databases

Database name Number of data-points Description
rruff_powder_xrd 1362 RRUFF powder XRD dataset
rruff_raman_excellent 7688 RRUFF Raman spectra dataset
rruff_ir 824 RRUFF IR spectra dataset

Materials Project databases

Database name Number of data-points Description
mp_3d_2020 127k CFID descriptors for the Materials Project (2020)
mp_3d 84k CFID descriptors for 84k Materials Project materials
megnet 69239 Formation energies and bandgaps of 3D materials in the Materials Project database as of 2018, used in MEGNet
megnet2 133k 133k materials and their formation energy in MP
m3gnet_mpf 168k 168k structures and their energy, forces and stresses in MP
m3gnet_mpf_1.5mil 1.5 million 1.5 million structures and their energy, forces and stresses in MP

OQMD databases

Database name Number of data-points Description
oqmd_3d 460k CFID descriptors for 460k materials in OQMD
oqmd_3d_no_cfid 817636 Formation energies and bandgaps of 3D materials from OQMD database

Open Catalyst databases

Database name Number of data-points Description
ocp_all 510214 Open Catalyst dataset with 460328 training entries; the rest are validation and test data
ocp100k 149886 Open Catalyst dataset with 100000 training entries; the rest are validation and test data
ocp10k 59886 Open Catalyst dataset with 10000 training entries; the rest are validation and test data

Catalyst databases

Database name Number of data-points Description
AGRA_O 1000 AGRA Oxygen catalyst dataset
AGRA_OH 875 AGRA OH catalyst dataset
AGRA_COOH 280 AGRA COOH catalyst dataset
AGRA_CHO 214 AGRA CHO catalyst dataset
AGRA_CO 193 AGRA CO catalyst dataset
tinnet_N 329 TinNet Nitrogen catalyst dataset
tinnet_O 747 TinNet Oxygen catalyst dataset
tinnet_OH 748 TinNet OH group catalyst dataset

QM9 and molecular databases

Database name Number of data-points Description
qm9_std_jctc 130829 Various properties of molecules in QM9 database (standardized)
qm9_dgl 130829 Various properties of molecules in QM9 dgl database
qm9 134k Various properties of molecules in QM9 database with CFID
hopv 4855 Various properties of molecules in HOPV15 dataset
pdbbind 11189 Bio-molecular complexes database from PDBBind v2015
pdbbind_core 195 Bio-molecular complexes database from PDBBind core
cccbdb 1333 NIST CCCBDB computational chemistry dataset

MOF databases

Database name Number of data-points Description
qmof 20425 Bandgaps and total energies of metal organic frameworks in QMOF database
hmof 137651 Hypothetical MOF database

2D materials databases (external)

Database name Number of data-points Description
c2db 3514 Various properties in C2DB database
twod_matpd 6351 Formation energies and bandgaps of 2D materials in the 2DMatPedia database
mxene275 275 MXene dataset

Other materials databases

Database name Number of data-points Description
aflow2 400k AFLOW dataset
cod 431778 Atomic structures from the Crystallography Open Database (COD)
snumat 10481 Bandgaps with hybrid functional
polymer_genome 1073 Electronic bandgap and dielectric constants of crystalline polymers in polymer genome database
omdb 12500 Bandgaps for organic polymers in OMDB database
halide_peroskites 229 Halide perovskite dataset
supercon_chem 16414 Superconductor chemical formula dataset
mag2d_chem 226 Magnetic 2D materials chemical formula dataset
ssub 1726 SSUB formation energy for chemical formula dataset
mlearn 1730 Machine learning force-field for elements datasets
foundry_ml_exp_bandgaps 2069 Foundry ML experimental bandgaps dataset

Text and NLP databases

Database name Number of data-points Description
arXiv 1796911 arXiv dataset with 1.8 million titles, abstracts and ids
arxiv_summary 137927 arXiv summary dataset (cond-mat)
cord19 223k CORD-19 COVID-19 research articles dataset

All of these datasets can be obtained with jarvis-tools as shown below, except stm, wtbh_electron, and wtbh_phonon, which have their own modules in jarvis.db.figshare:

from jarvis.db.figshare import data

# Choose a dataset name from the tables above
d = data('dft_3d')

# See the available keys of the first entry
print(d[0].keys())

# Dataset size (number of entries)
print(len(d))

# Build an Atoms object from the first entry
from jarvis.core.atoms import Atoms
a = Atoms.from_dict(d[0]['atoms'])
# Printing shows the structure; it can also be visualized in VESTA or similar packages
print(a)

# Load the entries into a pandas DataFrame if needed
import pandas as pd
df = pd.DataFrame(d)
print(df)
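
Since data() returns a list of dictionaries (one per entry), entries can be filtered with plain Python before they are passed to pandas. The following is a minimal sketch of that idea, not part of jarvis-tools: the select helper, the key names ("id", "bandgap", "formation_energy"), and all record values are invented placeholders so that the example runs without downloading anything. Check d[0].keys() for the real key names in a given dataset. It also assumes that missing values may be stored as non-numeric placeholders, which the helper skips.

```python
# Placeholder records that mimic the list-of-dicts shape returned by data().
# The ids, key names, and values are invented for illustration only.
records = [
    {"id": "EXAMPLE-1", "bandgap": 0.0, "formation_energy": -0.5},
    {"id": "EXAMPLE-2", "bandgap": 1.2, "formation_energy": -1.1},
    {"id": "EXAMPLE-3", "bandgap": "na", "formation_energy": -0.3},
    {"id": "EXAMPLE-4", "bandgap": 3.4, "formation_energy": 0.2},
]


def select(entries, key, low, high):
    """Return the entries whose numeric value for `key` lies in [low, high]."""
    selected = []
    for entry in entries:
        value = entry.get(key)
        # Skip missing or non-numeric values (for example a string placeholder).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if low <= value <= high:
            selected.append(entry)
    return selected


# Keep entries whose (hypothetical) bandgap lies between 0.5 and 4.0
semiconductors = select(records, "bandgap", 0.5, 4.0)
print([entry["id"] for entry in semiconductors])
```

With a real dataset, the same helper could be called as select(d, key, low, high), using a key taken from d[0].keys().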