Downloadable Computational Toxicology Data

EPA’s computational toxicology research evaluates the potential health effects of thousands of chemicals. This evaluation involves generating data on a chemical’s potential harm (its hazard), the degree of exposure to the chemical, and its unique chemical characteristics.

As part of EPA’s commitment to share data, all of its computational toxicology data are publicly available for anyone to access and use. These data are considered "open data": all of the data below are free of copyright restrictions and fully and freely available for both non-commercial and commercial use.

High-throughput Screening Data

EPA researchers use rapid chemical screening (called high-throughput screening assays) to limit the number of laboratory animal tests while quickly and efficiently testing thousands of chemicals for potential health effects.

  • ToxCast Data: High-throughput screening data on thousands of chemicals.

Rapid Exposure and Dose Data

EPA researchers develop and use rapid exposure estimates to predict potential exposure for thousands of chemicals.

  • High-throughput toxicokinetics data: Toxicokinetics links the external dose of a chemical to an internal blood or tissue concentration. EPA researchers measure the critical factors that determine the distribution and metabolic clearance of hundreds of chemicals and incorporate these data into computer models. The high-throughput toxicokinetic data can be paired with the high-throughput screening data to estimate real-world exposures.
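One common way to pair the two kinds of data is reverse dosimetry, sketched below. This is only an illustration under stated assumptions, not EPA's implementation: it assumes linear kinetics, so that the steady-state plasma concentration scales with dose, and the function name, parameter names, and numeric values are all hypothetical.

```python
def oral_equivalent_dose(ac50_um: float, css_um_per_unit_dose: float) -> float:
    """Estimate the daily dose (mg/kg/day) expected to produce a steady-state
    plasma concentration equal to an in vitro active concentration.

    ac50_um: concentration (uM) at which a screening assay shows half-maximal
        activity (hypothetical input).
    css_um_per_unit_dose: modeled steady-state plasma concentration (uM)
        resulting from a constant dose of 1 mg/kg/day (hypothetical input).

    Assumes linear kinetics: concentration is proportional to dose.
    """
    if ac50_um <= 0 or css_um_per_unit_dose <= 0:
        raise ValueError("concentrations must be positive")
    return ac50_um / css_um_per_unit_dose


# Made-up illustrative values, not measurements for any real chemical.
dose = oral_equivalent_dose(ac50_um=10.0, css_um_per_unit_dose=2.5)
print(f"Oral equivalent dose: {dose:.2f} mg/kg/day")
```

A dose obtained this way can then be compared with an exposure estimate for the same chemical to judge how close predicted exposures come to bioactive levels.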

Sustainable Chemistry Data

EPA researchers use chemistry data such as chemical structures and physicochemical property information to evaluate thousands of chemicals for potential health effects.

  • Distributed Structure Searchable Toxicity Database (DSSTox): Downloadable, structure-searchable, standardized chemical structure files associated with chemical inventories or toxicity data sets of environmental relevance.
  • Collaborative Estrogen Receptor Activity Prediction Project Data: Data and supplemental files from CERAPP, a large-scale modeling project. CERAPP combined multiple models developed in collaboration with 17 groups in the United States and Europe to predict the estrogen receptor (ER) activity of a common set of 32,464 chemical structures. Quantitative structure-activity relationship models and docking approaches were used to build a total of 40 categorical and 8 continuous models for binding, agonist, and antagonist ER activity.
  • Chemistry Dashboard Data: Data from the Chemistry Dashboard, including the mappings between DTXSIDs and InChI strings and InChIKeys, SDF files containing all chemical structures and relevant information, and a file containing the CAS Number, Preferred Chemical Name, and DTXSID for each chemical.
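As an illustration of how an identifier mapping file like the one described above might be used, the sketch below loads a tab-separated table into a lookup dictionary. The file layout, column names (DTXSID, INCHIKEY), and row values are assumptions made for this example; check the header of the actual download before adapting it.

```python
import csv
import io

# Placeholder content standing in for a downloaded mapping file.
# Column names and values are hypothetical, not real identifiers.
SAMPLE = (
    "DTXSID\tINCHIKEY\n"
    "DTXSID_PLACEHOLDER_1\tINCHIKEY_PLACEHOLDER_1\n"
    "DTXSID_PLACEHOLDER_2\tINCHIKEY_PLACEHOLDER_2\n"
)


def load_mapping(handle, key_col="DTXSID", value_col="INCHIKEY"):
    """Read a tab-separated file with a header row into a {key: value} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[key_col]: row[value_col] for row in reader}


# With a real file, pass open(path, newline="", encoding="utf-8") instead.
mapping = load_mapping(io.StringIO(SAMPLE))
print(mapping["DTXSID_PLACEHOLDER_1"])
```

Keying the dictionary on the DTXSID makes it straightforward to join this table with other files that carry the same identifier.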

Virtual Tissues Data

EPA researchers develop virtual tissue computer models to simulate how chemicals may affect human development. Virtual tissue models are some of the most advanced methods being developed today. The models will help reduce dependence on animal study data and provide much faster chemical risk assessments.

  • Tipping Point Data: EPA researchers develop mathematical models to predict the perturbation of biological systems and use them to determine the “Tipping Point”, the point at which cellular systems can no longer recover from or adapt to chemical exposure. Beyond that point, chemical exposures could lead to adverse outcomes such as cancer.