The California GeoTracker Database: A Unique Public Resource for Understanding Contaminated Sites
By Lila Beckley, Steven McMasters, Matthew Cohen, Dayna Cordano, Sharon Rauch, and Thomas McHugh
Ground Water Monit Remediat
April 26, 2022
Data mining as a research tool requires access to high-quality datasets. Investigation and cleanup of contaminated sites yield large amounts of monitoring data; historically, however, these data have not been available in large, consolidated datasets. The California GeoTracker website and database form a public repository for a wide variety of information related to the investigation and remediation of cleanup sites in California. Under California regulations, responsible parties must submit laboratory analytical results for environmental samples in electronic form, along with reports and other information. The GeoTracker website also supports public access to the entire database of laboratory analytical results, which, for some sites, date back to 2001. This database includes approximately 285,000,000 analytical records for more than 50,000 contaminated and formerly contaminated sites in California. Because of the large volume of publicly available data, GeoTracker has been used as the primary data source for a number of data mining studies in the last ten years. This paper describes the origin of GeoTracker and how it has evolved, while maintaining database continuity, to account for changes in regulatory priorities such as understanding vapor intrusion mechanisms and the distribution of per- and polyfluoroalkyl substances (PFAS) in the environment. Finally, we review data mining projects that have used GeoTracker to better understand various aspects of contaminated site management. This review illustrates how a long-term commitment to the collection and sharing of environmental data can support the general public and the regulatory and research communities.