Automation of exploratory data analysis and univariate geochemical processing using Python


  • Brayan Jarry Castillo Requiz, Universidad Nacional Mayor de San Marcos, Facultad de Ingeniería Geológica, Minera, Metalúrgica y Geográfica, Lima, Peru
  • Jesús Daniel Tarazona Silva, Universidad Nacional Mayor de San Marcos, Facultad de Ingeniería Geológica, Minera, Metalúrgica y Geográfica, Lima, Peru
  • Cristian Eugenio Tarazona Silva, Universidad Nacional Mayor de San Marcos, Facultad de Ingeniería Eléctrica y Electrónica, Lima, Peru
  • Christian Hurtado Enriquez, Universidad Nacional Mayor de San Marcos, Facultad de Ingeniería Geológica, Minera, Metalúrgica y Geográfica, Lima, Peru
  • Félix Abraham Cornelio Orbegoso, Universidad Nacional Mayor de San Marcos, Facultad de Ingeniería Geológica, Minera, Metalúrgica y Geográfica, Lima, Peru



Keywords: exploratory data analysis, univariate analysis, automation, Python, script


Process automation is being adopted across the earth sciences, as shown by libraries such as Pyrolite, PyGeochemCalc, dh2loop 1.0, NeuralHydrology, and GeoPyTool, among others. This work presents a methodology for automating univariate geochemical analysis with Python and open-source packages such as pandas, seaborn, matplotlib, and statsmodels, integrated into a script that runs in a local environment such as Jupyter Notebook or in an online environment such as Google Colaboratory. The script is designed to process any type of geochemical data: it removes outliers, performs calculations, and generates plots for each element within its respective geological domain. The outputs include boxplots and quantile-quantile plots, together with normality tests and geochemical parameters from which the background and threshold of each analyzed element can be determined. The resulting geochemical parameters are then processed in geographic information system (GIS) software to generate the univariate anomaly map and delineate the anomalous basins.
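The per-domain workflow summarized above (remove outliers, then derive background and threshold for an element) can be illustrated with a minimal pandas sketch. This is not the authors' script. It assumes Tukey's 1.5 × IQR fences for outlier removal and the classic convention of background = mean and threshold = mean + 2 standard deviations, neither of which is stated in the abstract. The column names `domain` and `Cu_ppm`, the function names, and the sample values are hypothetical.

```python
import pandas as pd


def remove_outliers_iqr(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Keep values inside Tukey's fences [Q1 - k*IQR, Q3 + k*IQR] (assumed rule)."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return values[(values >= q1 - k * iqr) & (values <= q3 + k * iqr)]


def geochemical_parameters(df: pd.DataFrame, element: str,
                           domain_col: str = "domain") -> pd.DataFrame:
    """Compute background and threshold of one element for each geological domain."""
    rows = []
    for domain, group in df.groupby(domain_col):
        raw = group[element].dropna()
        clean = remove_outliers_iqr(raw)
        background = clean.mean()          # assumed: background = mean
        std = clean.std()                  # sample standard deviation
        rows.append({
            domain_col: domain,
            "n_samples": len(raw),
            "n_removed": len(raw) - len(clean),
            "background": background,
            "threshold": background + 2 * std,  # assumed: mean + 2*std
        })
    return pd.DataFrame(rows).set_index(domain_col)


# Hypothetical copper concentrations (ppm) in two geological domains.
samples = pd.DataFrame({
    "domain": ["A"] * 8 + ["B"] * 8,
    "Cu_ppm": [10, 12, 11, 13, 12, 14, 11, 90,
               30, 32, 31, 29, 33, 30, 31, 32],
})

params = geochemical_parameters(samples, "Cu_ppm")
print(params)
```

A table of this shape, with one row per geological domain, is the kind of output that could be exported (for example with `params.to_csv(...)`) and joined to sample locations in GIS software to flag values above the threshold as anomalous.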







How to Cite

Castillo Requiz, B. J., Tarazona Silva, J. D., Tarazona Silva, C. E., Hurtado Enriquez, C., & Cornelio Orbegoso, F. A. (2023). Automation of exploratory data analysis and univariate geochemical processing using Python. Revista del Instituto de Investigación de la Facultad de Minas, Metalurgia y Ciencias Geográficas, 26(51), e24493.