
EDKB Database with Chemical Structure Search
Estrogen Receptor Binding Dataset
Androgen Receptor Binding Dataset
Comparative Molecular Field Analysis Model for Estrogen Receptor Binding
Comparative Molecular Field Analysis Model for Androgen Receptor Binding
Decision Forest Model with Prediction Confidence
Prediction of ER Binding for 58,000 Chemicals Using an Integrated System of a Tree-Based Model with Structural Alerts
Decision Forest Predictions for 6500 Industrial and Environmental Chemicals
Mold2—Chemical Descriptor Generator Software for Biological Activity Prediction
Four-Phase Screening and Priority-Setting Model
Future Web-Based Prediction
EDKB Keywords


EDKB Database with Chemical Structure Search 

The EDKB database is a curated database containing the Estrogen Receptor (ER) and Androgen Receptor (AR) training datasets together with considerable additional data from the literature for various types of in vitro and in vivo assays. The web-based EDKB consists of a biological activity database, QSAR (Quantitative Structure-Activity Relationship) training sets, in vitro and in vivo experimental data for more than 3,000 chemicals, literature citations, and chemical-structure search capabilities.

Estrogen Receptor Binding Dataset

Blair (2000) and Branham (2002) published the EDKB estrogen receptor (ER) binding dataset, which was produced expressly as a training set for developing predictive models. The data is based on a validated assay using rat uteri. The dataset contains 131 ER binders and 101 non-ER binders. This structurally diverse dataset has 312 predictors generated using Molconn-Z software version 4.07 and was analyzed using CERP (Ahn et al., 2007). The training-set chemicals were selected for both chemical-structure diversity and range of activity, both of which are essential for developing robust QSAR and other models (Perkins, 2003). Guided by the SAR (structure-activity relationship) studies described by Fang (2001), the chemicals were selected to provide uniform coverage of a diverse chemical structure domain, as well as coverage of an activity range extending a million-fold below that of the endogenous hormones.
Download ER Binding Dataset SD File

Androgen Receptor Binding Dataset

Fang (2003) published the EDKB androgen receptor (AR) binding dataset, which was produced expressly as a training set for developing predictive models. The data is based on a validated assay using recombinant AR. The dataset contains 146 AR binders and 56 non-AR binders. The training-set chemicals were selected for both chemical-structure diversity and range of activity, both of which are essential for developing robust QSAR (quantitative structure-activity relationship) and other models (Perkins, 2003). Guided by the SAR studies described by Fang (2003), the chemicals were selected to provide uniform coverage of a diverse chemical structure domain, as well as coverage of an activity range extending a million-fold below that of the endogenous hormones.
Download AR Binding Dataset SD File

Comparative Molecular Field Analysis Model for Estrogen Receptor Binding 

Shi (2001) published the CoMFA QSAR model for ER binding that is based on the EDKB estrogen receptor binding dataset. The model has a cross-validated r² = 0.66.
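
For readers unfamiliar with the statistic, the cross-validated r² (often written q²) is conventionally computed as one minus the ratio of the predictive residual sum of squares to the total sum of squares about the mean of the observed values. The sketch below illustrates that conventional definition only. This page does not state which cross-validation scheme the published CoMFA models used, so the function simply takes held-out predictions as input, and the numbers in the usage lines are made up for illustration.

```python
from typing import Sequence


def cross_validated_r2(observed: Sequence[float],
                       predicted: Sequence[float]) -> float:
    """Return 1 - PRESS / SS_total.

    `predicted` must hold, for each chemical, the value predicted by a
    model that was fitted without that chemical (e.g. leave-one-out).
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Predictive residual sum of squares over the held-out predictions.
    press = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares of the observed activities about their mean.
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    if ss_total == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - press / ss_total


# Illustrative values only (not EDKB data).
obs = [1.0, 2.0, 3.0, 4.0]
print(cross_validated_r2(obs, obs))                   # perfect predictions
print(cross_validated_r2(obs, [2.5, 2.5, 2.5, 2.5]))  # predicting the mean
```

Under this definition a model that always predicts the mean scores zero, and a model can score below zero if its held-out predictions are worse than the mean.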

Comparative Molecular Field Analysis Model for Androgen Receptor Binding 

Hong (2003) published the CoMFA QSAR model for AR binding that is based on the EDKB androgen receptor binding dataset. The model has a cross-validated r² = 0.57.

Decision Forest Model with Prediction Confidence 

More recent EDKB research has focused on developing predictive models that quantify the accuracy of their predictions. Tong (2003) published a model, based on the Decision Forest method, that predicts estrogen receptor binding activity and quantifies confidence in each prediction.
Install Decision Forest 
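
To illustrate what a prediction with quantified confidence can look like, the sketch below averages the binder probabilities from several decision trees and scales the distance of that average from 0.5 into a confidence value between 0 and 1. This is an assumption made for illustration, not a description of the Decision Forest software's internals: the function name, the 0.5 cutoff, and the confidence formula are all hypothetical choices, and the probabilities in the usage lines are made up.

```python
from typing import Sequence, Tuple


def forest_prediction(tree_probabilities: Sequence[float]) -> Tuple[str, float]:
    """Combine per-tree binder probabilities into a label and a confidence.

    Assumed scheme (hypothetical): the forest probability is the mean of
    the tree probabilities, the call is 'binder' above 0.5, and the
    confidence is |mean - 0.5| / 0.5, so 0 means undecided and 1 means
    every tree gave the same fully certain answer.
    """
    if not tree_probabilities:
        raise ValueError("at least one tree probability is required")
    if any(p < 0.0 or p > 1.0 for p in tree_probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    mean_p = sum(tree_probabilities) / len(tree_probabilities)
    label = "binder" if mean_p > 0.5 else "non-binder"
    confidence = abs(mean_p - 0.5) / 0.5
    return label, confidence


# Illustrative probabilities only (not output of the EDKB model).
print(forest_prediction([0.9, 0.8, 1.0, 0.9]))  # trees agree: high confidence
print(forest_prediction([0.6, 0.4, 0.5, 0.5]))  # trees split: low confidence
```

A confidence value of this kind lets low-confidence predictions be set aside for experimental follow-up instead of being treated the same as confident calls.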

Prediction of ER Binding for 58,000 Chemicals Using an Integrated System of a Tree-Based Model with Structural Alerts

The models were developed using data for 232 structurally diverse chemicals (the training set). They were then validated by predicting estrogen receptor (ER) relative binding activities (RBAs) for 463 chemicals that had ER activity data (the testing set). The integrated model was applied to approximately 58,000 potential EDCs. Because most chemicals have no biologic data of any kind, the ability to process large numbers of chemicals, predict which are inactive for ER binding, and categorically prioritize the remainder provides one biologic measure for prioritizing chemicals for entry into more expensive assays. The general approach for predicting ER binding reported here may be applied to other receptors and/or reversible binding mechanisms involved in endocrine disruption (Hong, 2002).

Decision Forest Predictions for 6500 Industrial and Environmental Chemicals 

The Decision Forest model with prediction confidence (Tong, 2004) has been applied to a dataset of 6573 industrial and environmental chemicals.
Install Decision Forest 

Mold2—Chemical Descriptor Generator Software for Biological Activity Prediction

Mold2 is free, publicly available software developed at the FDA's National Center for Toxicological Research (NCTR). It rapidly calculates descriptors from two-dimensional chemical structures and is suitable for both small and large datasets.
Download Mold2

Four-Phase Screening and Priority-Setting Model 

Shi (2002), Hong (2002), and Tong (2002) have described integrated suites of multiple SAR and QSAR models that are used in sequence to prioritize very large numbers of chemicals for testing based on their likelihood of activity. The models are calibrated to minimize false-negative predictions.

Future Web-Based Prediction 

Plans are to provide web-based models that predict estrogen and androgen activity from chemical structure.

EDKB Keywords

AR — Androgen Receptor

CoMFA — Comparative Molecular Field Analysis

DF — Decision Forest

DT — Decision Tree

EDCs — Endocrine Disrupting Chemicals

EDKB — Endocrine Disruptor Knowledge Base

EDSTAC — Endocrine Disruptor Screening and Testing Advisory Committee

ER — Estrogen Receptor

FDA — Food and Drug Administration

LOO — Leave-one-out

LNO — Leave-N-out

NCTR — National Center for Toxicological Research

RBA — Relative Binding Activity

SAR — Structure Activity Relationship

QSAR — Quantitative Structure Activity Relationship 

Contact Information

Please address any questions and suggestions to Dr. Weida Tong at 870-543-7142 or weida.tong@fda.hhs.gov.


EDKB is a product designed and produced by the National Center for Toxicological Research (NCTR).  FDA and NCTR retain ownership of this product.

