Exploring Machine Learning on Geochemistry Data For Efficient Prediction of Metal Concentrations in Copper Deposits

Joel, Lydia.

dc.contributor.author	Joel, Lydia.
dc.date.accessioned	2024-09-16T11:55:51Z
dc.date.available	2024-09-16T11:55:51Z
dc.date.issued	2024-01-25
dc.description.abstract	Copper ore bodies often occur together with other useful metals such as silver, lead and zinc. Because of the cost, mining companies often cannot afford to assay every metal in their samples and end up analysing only one or a few, leaving the concentrations of other associated metals in the deposit unmeasured. Additionally, analysing several metals per sample takes time, and the longer laboratory turnaround can negatively affect production. The research used a geochemistry dataset comprising 3,282 samples from the Kombat Copper deposit area in Namibia to predict copper (Cu) concentrations from zinc (Zn) and lead (Pb) concentrations. In addition to the metal concentrations, the dataset included sample coordinates and grid names as features. The four machine learning algorithms used were Random Forest (RF), K-Nearest Neighbour (KNN), Decision Tree (DT), and Support Vector Machine (SVM). These models were chosen because they were commonly employed for similar purposes in the literature reviewed. Because the learning task was a regression problem, the primary metric used to assess the models and draw performance conclusions was the regression score (R-squared), which quantifies how well the model explains the variance in the data. The R-squared score represents the proportion of variance in the dependent variable (target) that can be predicted from the independent variables (features). It typically ranges from 0 to 1, where 1 indicates a perfect fit. In addition, mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), adjusted R-squared and explained variance were also examined. Based on the R-squared metric, the KNN model outperformed the other three models, explaining 57% of the variance in the target (an R-squared of 0.57).
KNN was followed by RF with a score of 0.55, DT with 0.49 and SVM with 0.44. The KNN model therefore appeared to be the best choice among the four for making predictions on this dataset. Further optimisation improved the models' prediction accuracy, and the KNN model remained the best performer, with an R-squared of 0.70 (70%) with n-estimators set at 4 and the test size set to 10%. Predicting metal contents from geochemistry data with machine learning can help mining companies reduce costs by supplementing lab-based analyses with model-based predictions in determining grades.
dc.identifier.citation	Joel, L. (2024). Exploring machine learning on geochemistry data for efficient prediction of metal concentrations in copper deposits [Master’s thesis, Namibia University of Science and Technology].
dc.identifier.uri	http://hdl.handle.net/10628/1024
dc.language.iso	en
dc.publisher	Namibia University of Science and Technology
dc.subject	machine learning
dc.subject	geochemistry
dc.subject	copper deposits
dc.title	Exploring Machine Learning on Geochemistry Data For Efficient Prediction of Metal Concentrations in Copper Deposits
dc.type	Thesis
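
The abstract names several regression metrics (R-squared, adjusted R-squared, MSE, RMSE, MAE and explained variance). As a minimal sketch of how these relate to one another, the snippet below computes them from their standard definitions in plain Python. The function name `regression_metrics`, the sample values and the choice of two features (standing in for Zn and Pb) are illustrative assumptions, not taken from the thesis or its dataset.

```python
import math


def regression_metrics(y_true, y_pred, n_features):
    """Compute common regression metrics from their textbook definitions.

    n_features is the number of predictors, needed only for adjusted R-squared.
    """
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares

    mse = ss_res / n
    mae = sum(abs(r) for r in residuals) / n
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalises R-squared for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1)

    # Explained variance ignores any constant bias in the residuals.
    mean_res = sum(residuals) / n
    var_res = sum((r - mean_res) ** 2 for r in residuals) / n
    explained_variance = 1.0 - var_res / (ss_tot / n)

    return {
        "r2": r2,
        "adjusted_r2": adj_r2,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "explained_variance": explained_variance,
    }


# Hypothetical measured vs. predicted Cu values (made up for illustration).
cu_measured = [1.0, 2.0, 3.0, 4.0, 5.0]
cu_predicted = [1.5, 2.0, 2.5, 4.5, 5.0]

# Two features assumed here, standing in for Zn and Pb concentrations.
metrics = regression_metrics(cu_measured, cu_predicted, n_features=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

R-squared and explained variance coincide only when the residuals average to zero, so a gap between the two hints at a systematic over- or under-prediction.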

Files

Original bundle

Name: Lydia Joel_222085010_Dissertation.pdf
Size: 4.62 MB
Format: Adobe Portable Document Format

Collections

Theses and Dissertations