EDKB Resources
EDKB Database with Chemical Structure Search
Estrogen Receptor Binding Dataset
Androgen Receptor Binding Dataset
Comparative Molecular Field Analysis Model for Estrogen Receptor Binding
Comparative Molecular Field Analysis Model for Androgen Receptor Binding
Decision Forest Model with Prediction Confidence
Prediction of ER Binding for 58,000 Chemicals Using an Integrated System of a Tree-Based Model with Structural Alerts
Decision Forest Predictions for 6500 Industrial and Environmental Chemicals
Mold2—Chemical Descriptor Generator Software for Biological Activity Prediction
Four-Phase Screening and Priority-Setting Model
Future Web-Based Prediction
EDKB Keywords
EDKB Database with Chemical Structure Search
The EDKB database is a curated database containing the Estrogen Receptor (ER) and Androgen Receptor (AR) training datasets together with considerable additional data from the literature for various types of in vitro and in vivo assays. The web-based EDKB consists of a biological activity database, QSAR (Quantitative Structure-Activity Relationship) training sets, in vitro and in vivo experimental data for more than 3,000 chemicals, literature citations, and chemical-structure search capabilities.
Estrogen Receptor Binding Dataset
Blair (2000) and Branham (2002) published the EDKB estrogen receptor (ER) binding dataset, which was produced expressly as a training set for developing predictive models. The data are based on a validated assay using rat uteri. The dataset contains 131 ER binders and 101 non-ER binders. This structurally diverse dataset has 312 predictors generated with Molconn-Z software version 4.07 and was analyzed using CERP (Ahn et al., 2007). The training-set chemicals were selected for both chemical-structure diversity and range of activity, both of which are essential for developing robust QSAR and other models (Perkins, 2003). Guided by the SAR (structure-activity relationship) studies described by Fang (2001), the chemicals were selected to provide uniform coverage of a diverse chemical-structure domain, as well as coverage of an activity range extending a million-fold below that of the endogenous hormones.
Download ER Binding Dataset SD File
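An SD file stores one record per chemical: a structure block, then optional data fields introduced by header lines of the form "> <FIELD_NAME>", with each record terminated by a line containing "$$$$". The sketch below shows one way to read those data fields and tally binders and non-binders using only the Python standard library. The field name ACTIVITY, its values, and the three placeholder records are assumptions made for illustration; the field names actually used in the EDKB files should be checked after download.

```python
from collections import Counter


def read_sdf_records(text):
    """Return one dict of data fields per SD-file record.

    Simplified reader: it ignores the structure block and only collects
    '> <NAME>' data fields. It is a sketch, not a full SD-file parser.
    """
    records = []
    for block in text.split("$$$$"):
        if not block.strip():
            continue  # skip the empty tail after the last record
        fields = {}
        lines = block.splitlines()
        i = 0
        while i < len(lines):
            line = lines[i]
            # A data header looks like "> <FIELD_NAME>"; its value runs
            # until the next blank line.
            if line.startswith(">") and "<" in line and ">" in line[1:]:
                name = line[line.index("<") + 1 : line.rindex(">")]
                values = []
                i += 1
                while i < len(lines) and lines[i].strip():
                    values.append(lines[i])
                    i += 1
                fields[name] = "\n".join(values)
            i += 1
        records.append(fields)
    return records


def fake_record(name, activity):
    # Hypothetical, structure-free record used only to exercise the reader.
    return (
        f"{name}\n  sketch\n\n"
        "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
        "M  END\n"
        f"> <ACTIVITY>\n{activity}\n\n"
        "$$$$\n"
    )


sample = (
    fake_record("chem_A", "active")
    + fake_record("chem_B", "inactive")
    + fake_record("chem_C", "active")
)
records = read_sdf_records(sample)
counts = Counter(r.get("ACTIVITY", "unknown") for r in records)
print(counts)
```

To apply the same reader to a downloaded file, pass it the file's text and replace ACTIVITY with whichever field holds the binding result.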
Androgen Receptor Binding Dataset
Fang (2003) published the EDKB androgen receptor (AR) binding dataset, which was produced expressly as a training set for developing predictive models. The data are based on a validated assay using recombinant AR. The dataset contains 146 AR binders and 56 non-AR binders. The training-set chemicals were selected for both chemical-structure diversity and range of activity, both of which are essential for developing robust QSAR (quantitative structure-activity relationship) and other models (Perkins, 2003). Guided by the SAR studies described by Fang (2003), the chemicals were selected to provide uniform coverage of a diverse chemical-structure domain, as well as coverage of an activity range extending a million-fold below that of the endogenous hormones.
Download AR Binding Dataset SD File
Comparative Molecular Field Analysis Model for Estrogen Receptor Binding
Shi (2001) published the CoMFA QSAR model for ER binding that is based on the EDKB estrogen receptor binding dataset. The model has a cross-validated r² = 0.66.
Comparative Molecular Field Analysis Model for Androgen Receptor Binding
Hong (2003) published the CoMFA QSAR model for AR binding that is based on the EDKB androgen receptor binding dataset. The model has a cross-validated r² = 0.57.
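A cross-validated r² (often written q²) measures how well a model predicts chemicals that were held out while it was fitted. With leave-one-out (LOO) cross-validation it is commonly computed as 1 − PRESS / SS, where PRESS is the sum of squared errors of the held-out predictions and SS is the sum of squared deviations of the observed values from their mean. The sketch below illustrates that calculation only. It substitutes a one-descriptor least-squares line for the partial least squares regression that CoMFA uses, and the data are invented, so it does not reproduce either published model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out cross-validated r2: 1 - PRESS / SS."""
    press = 0.0
    for i in range(len(xs)):
        # Refit without chemical i, then predict chemical i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Invented descriptor values and activities, for illustration only.
descriptor = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
activity = [0.1, 0.9, 2.3, 2.8, 4.2, 4.9]
q2 = loo_q2(descriptor, activity)
print(round(q2, 3))
```

Because every prediction is made for a chemical the model has not seen, q² is normally lower than the ordinary fitted r² for the same data.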
Decision Forest Model with Prediction Confidence
More recent EDKB research has focused on developing predictive models that provide predictions with quantified accuracy. Tong (2003) published a model based on the Decision Forest method for predicting estrogen receptor binding activity that also quantifies confidence in predictions.
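One way to picture a forest prediction with a confidence value is sketched below. It assumes that each decision tree returns a probability that the chemical is active, that the forest averages those probabilities, and that confidence is the distance of the averaged probability from the 0.5 decision boundary, scaled to the range 0 to 1. These are simplifying assumptions made for illustration, not the published Decision Forest implementation, and the function name and tree outputs are hypothetical.

```python
def forest_predict(tree_probabilities):
    """Combine per-tree probabilities into a label and a confidence value.

    Assumptions (illustrative only): the forest output is the plain mean
    of the tree probabilities, the class boundary is 0.5, and confidence
    is |mean - 0.5| / 0.5.
    """
    mean_p = sum(tree_probabilities) / len(tree_probabilities)
    label = "active" if mean_p > 0.5 else "inactive"
    confidence = abs(mean_p - 0.5) / 0.5
    return mean_p, label, confidence


# Hypothetical outputs from four trees for a single chemical.
mean_p, label, confidence = forest_predict([0.75, 1.0, 0.875, 0.875])
print(mean_p, label, confidence)
```

Under this scheme a chemical whose averaged probability sits near 0.5 receives a confidence near 0, which flags the prediction as one to treat with caution.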
Install Decision Forest
Prediction of ER Binding for 58,000 Chemicals Using an Integrated System of a Tree-Based Model with Structural Alerts
The models were developed using data for 232 structurally diverse chemicals (the training set) and were then validated by predicting estrogen receptor (ER) RBAs for 463 chemicals that had ER activity data (the testing set). The integrated model was then applied to approximately 58,000 potential EDCs. Most chemicals have no biologic data of any kind, so the ability to process large numbers of chemicals, predict which are inactive for ER binding, and categorically prioritize the remainder provides one biologic measure for prioritizing chemicals for entry into more expensive assays. The general approach for predicting ER binding reported here may be applied to other receptors and/or reversible binding mechanisms involved in endocrine disruption (Hong, 2002).
Decision Forest Predictions for 6500 Industrial and Environmental Chemicals
The Decision Forest model with prediction confidence (Tong, 2004) has been applied to a dataset of 6573 industrial and environmental chemicals.
Install Decision Forest
Mold2—Chemical Descriptor Generator Software for Biological Activity Prediction
Mold2 is free, publicly available software developed at the FDA's National Center for Toxicological Research (NCTR). It rapidly calculates descriptors from two-dimensional chemical structures and is suitable for both small and large datasets.
Download Mold2
Four-Phase Screening and Priority-Setting Model
Shi (2002), Hong (2002) and Tong (2002) have described integrated suites of multiple SAR and QSAR models that are applied in sequence to prioritize very large numbers of chemicals for testing, based on their likelihood of activity. The models are calibrated to minimize false-negative predictions.
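The idea of applying models in sequence can be sketched as a chain of filters in which each phase removes only the chemicals it predicts to be inactive and passes the rest on, leaving a shorter, ranked list for testing. In the sketch below the two phases, the field names passes_rule and score, and the cutoff of 0.1 are all hypothetical placeholders and not the published phase criteria. The deliberately low cutoff illustrates calibrating toward few false negatives, at the cost of letting more inactive chemicals through.

```python
def run_phases(chemicals, phases):
    """Apply each (name, keep) phase in order; keep() returns True to retain."""
    remaining = list(chemicals)
    log = []
    for name, keep in phases:
        remaining = [c for c in remaining if keep(c)]
        log.append((name, len(remaining)))  # chemicals left after this phase
    return remaining, log


# Hypothetical chemicals with placeholder attributes.
chemicals = [
    {"name": "chem_A", "passes_rule": True, "score": 0.9},
    {"name": "chem_B", "passes_rule": False, "score": 0.8},
    {"name": "chem_C", "passes_rule": True, "score": 0.05},
    {"name": "chem_D", "passes_rule": True, "score": 0.4},
]

# Placeholder phases: a cheap rule first, then a model score with a low
# cutoff so that borderline chemicals are kept rather than discarded.
phases = [
    ("placeholder rule-based filter", lambda c: c["passes_rule"]),
    ("placeholder model cutoff", lambda c: c["score"] >= 0.1),
]

survivors, log = run_phases(chemicals, phases)
ranked = sorted(survivors, key=lambda c: c["score"], reverse=True)
print(log)
print([c["name"] for c in ranked])
```

Ordering the phases from cheapest to most expensive means the costlier models only run on the chemicals that earlier phases could not rule out.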
Future Web-Based Prediction
Future plans are to provide web-based models that predict estrogen and androgen activity from chemical structure.
EDKB Keywords
AR — Androgen Receptor
CoMFA — Comparative Molecular Field Analysis
DF — Decision Forest
DT — Decision Tree
EDCs — Endocrine Disrupting Chemicals
EDKB — Endocrine Disruptor Knowledge Base
EDSTAC — Endocrine Disruptor Screening and Testing Advisory Committee
ER — Estrogen Receptor
FDA — Food and Drug Administration
LOO — Leave-one-out
LNO — Leave-N-out
NCTR — National Center for Toxicological Research
RBA — Relative Binding Affinity
SAR — Structure-Activity Relationship
QSAR — Quantitative Structure-Activity Relationship
Contact Information
Please address any questions and suggestions to Dr. Weida Tong at 870-543-7142 or weida.tong@fda.hhs.gov.
EDKB is a product designed and produced by the National Center for Toxicological Research (NCTR). FDA and NCTR retain ownership of this product.