Exploring Machine Learning on Geochemistry Data For Efficient Prediction of Metal Concentrations in Copper Deposits

Date

2024-01-25

Publisher

Namibia University of Science and Technology

Abstract

Copper ore bodies often occur together with other useful metals such as silver, lead and zinc. Because of the cost, mining companies find it difficult to pay for analysis of every metal in their samples and end up analysing only one or a few metals, leaving the concentrations of other associated metals in the deposit unmeasured. Additionally, analysing several metals per sample takes time, and the longer turnaround for laboratory results can negatively affect production. This research used a geochemistry dataset comprising 3,282 samples from the Kombat Copper deposit area in Namibia to predict copper (Cu) concentrations from zinc (Zn) and lead (Pb) concentrations. In addition to the metal concentrations, the dataset included sample coordinates and grid names as features. Four machine learning algorithms were used: Random Forest (RF), K-Nearest Neighbour (KNN), Decision Tree (DT), and Support Vector Machine (SVM). These models were chosen because they were the ones most commonly employed for similar purposes in the literature reviewed. The learning task was a regression problem; therefore, the primary metric used to assess the models and draw performance conclusions was the regression score (R-squared), which quantifies how well a model explains the variance in the data. The R-squared score represents the proportion of variance in the dependent variable (target) that can be predicted from the independent variables (features). It ranges on a scale of 0 to 1, where 1 indicates a perfect fit. In addition, mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), adjusted R-squared, and explained variance were also examined. Based on the R-squared metric, the KNN model outperformed the other three models with a score of 0.57, that is, it explained 57% of the variance in the target. KNN was followed by RF with a score of 0.55, DT with 0.49 and SVM with 0.44. The KNN model therefore appeared to be the best choice among the four for making predictions on this dataset. Further optimisation of the models improved their prediction accuracy, and the KNN model remained the best performer, with an R-squared of 0.70 (70%) with n-estimators set at 4 and the test size set to 10%. Predicting metal contents from geochemistry data with machine learning can help mining companies reduce costs by supplementing lab-based analyses with model-based predictions when determining grades.
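
The evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the thesis code: the function name `regression_metrics`, the toy values, and the choice of two features (standing in for Zn and Pb) are assumptions made only for illustration, and the numbers are not drawn from the Kombat dataset.

```python
import math

def regression_metrics(y_true, y_pred, n_features):
    """Compute the regression metrics listed in the abstract.

    Hypothetical helper for illustration; n_features is the number of
    predictors (assumed to be 2 here, e.g. Zn and Pb).
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    mse = ss_res / n
    rmse = math.sqrt(mse)
    mae = sum(abs(r) for r in residuals) / n
    r2 = 1 - ss_res / ss_tot
    # Adjusted R-squared penalises R-squared for the number of predictors.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_features - 1)
    # Explained variance uses the variance of the residuals, so unlike
    # R-squared it ignores a constant bias in the predictions.
    mean_res = sum(residuals) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    explained_variance = 1 - var_res / (ss_tot / n)

    return {
        "mse": mse,
        "rmse": rmse,
        "mae": mae,
        "r2": r2,
        "adjusted_r2": adj_r2,
        "explained_variance": explained_variance,
    }

# Toy example with made-up values (not thesis data).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.5, 2.0, 2.5, 4.0, 5.5]
metrics = regression_metrics(y_true, y_pred, n_features=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

On these toy values R-squared is 0.925, meaning the predictions account for 92.5% of the variance in the target; the reported KNN score of 0.57 is read the same way.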

Keywords

machine learning, geochemistry, copper deposits

Citation

Joel, L. (2024). Exploring machine learning on geochemistry data for efficient prediction of metal concentrations in copper deposits [Master’s thesis, Namibia University of Science and Technology].