Per- and Polyfluoroalkyl Substances (PFAS) in PubChem: 7 Million and Growing
By Emma L. Schymanski, Jian Zhang, Paul A. Thiessen, Parviel Chirsir, Todor Kondic, and Evan E. Bolton*
Environ. Sci. Technol.
October 23, 2023
Per- and polyfluoroalkyl substances (PFAS) are of high concern, with calls to regulate them as a class. In 2021, the Organisation for Economic Co-operation and Development (OECD) revised the definition of PFAS to include any chemical containing at least one saturated CF2 or CF3 moiety. The consequence is that one of the largest open chemical collections, PubChem, with 116 million compounds, now contains over 7 million PFAS under this revised definition. These numbers are several orders of magnitude higher than those in previously established PFAS lists (typically thousands of entries) and pose a formidable challenge to researchers and computational workflows alike. This article describes a dynamic, openly accessible effort to navigate and explore the >7 million PFAS and >21 million fluorinated compounds (September 2023) in PubChem by establishing the “PFAS and Fluorinated Compounds in PubChem” Classification Browser (or “PubChem PFAS Tree”). A total of 36,500 nodes support browsing of the content according to several categories, including classification, structural properties, regulatory status, and presence in existing PFAS suspect lists. Additional annotation and associated data can be used to create subsets (and thus manageable suspect lists or databases) of interest for a wide range of environmental, regulatory, exposomics, and other applications.
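To make the structural criterion concrete, the following is a minimal sketch of how the "at least one saturated CF2 or CF3 moiety" test could be applied to a single molecule. It is not the procedure used to build the PubChem PFAS Tree. The graph representation (a list of element symbols plus a list of bonds with explicit hydrogens), the function name `has_saturated_cf2_or_cf3`, and the `EXCLUDED` set are all hypothetical choices made for illustration. In particular, rejecting carbons that also carry H, Cl, Br, or I is an assumption taken from the fuller wording of the OECD definition and is not stated in the abstract.

```python
from typing import List, Tuple

# (atom index, atom index, bond order); hydrogens must be listed explicitly.
Bond = Tuple[int, int, int]

# Assumption: a carbon also bearing one of these atoms does not qualify.
EXCLUDED = {"H", "Cl", "Br", "I"}


def has_saturated_cf2_or_cf3(atoms: List[str], bonds: List[Bond]) -> bool:
    """Return True if any carbon has four single bonds, at least two of
    them to fluorine, and no H/Cl/Br/I attached."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b, order in bonds:
        neighbours[a].append((b, order))
        neighbours[b].append((a, order))

    for i, element in enumerate(atoms):
        if element != "C":
            continue
        attached = neighbours[i]
        # Saturated carbon: exactly four neighbours, all single bonds.
        if len(attached) != 4 or any(order != 1 for _, order in attached):
            continue
        symbols = [atoms[j] for j, _ in attached]
        if symbols.count("F") >= 2 and not EXCLUDED.intersection(symbols):
            return True
    return False


# Trifluoroacetic acid, CF3-COOH: contains a saturated CF3 group.
tfa_atoms = ["C", "F", "F", "F", "C", "O", "O", "H"]
tfa_bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1),
             (4, 5, 2), (4, 6, 1), (6, 7, 1)]

# Difluoromethane, CH2F2: two F on carbon, but hydrogens are attached.
dfm_atoms = ["C", "F", "F", "H", "H"]
dfm_bonds = [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1)]

# Tetrafluoroethylene, F2C=CF2: both carbons are unsaturated.
tfe_atoms = ["C", "C", "F", "F", "F", "F"]
tfe_bonds = [(0, 1, 2), (0, 2, 1), (0, 3, 1), (1, 4, 1), (1, 5, 1)]

print(has_saturated_cf2_or_cf3(tfa_atoms, tfa_bonds))
print(has_saturated_cf2_or_cf3(dfm_atoms, dfm_bonds))
print(has_saturated_cf2_or_cf3(tfe_atoms, tfe_bonds))
```

Under these assumptions only trifluoroacetic acid matches. A test this permissive, which needs just one qualifying carbon anywhere in the molecule, helps explain why so many compounds in a collection the size of PubChem fall under the revised definition. At that scale a cheminformatics toolkit with substructure matching would be the practical choice rather than a hand-rolled graph walk.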