Granite classification using machine learning and edge computing

Background: The look and atmosphere of any place depend heavily on how it is decorated and on the materials used in its design. Granite is a rock widely used for this purpose. Granite flooring and counters have a major influence on the interior décor, which sets the mood and ambience of a house. A system is needed to help end users differentiate between granites that enhance the grandeur of their house, and to detect fraud in which a merchant delivers granite of a different color from the one the end user selected. Several models have been developed for this purpose using CNNs and other image processing techniques. However, a solution for this purpose must be precise and computationally efficient. Methods: In this work, the researchers developed a machine learning based granite classifier using Edge Computing, and also built a website to help users choose which granite would go well with their décor. The developed system consists of a color sensor [TCS3200] integrated with an ESP8266 board. Data on the RGB contrasts of different rocks is acquired at a dealership using the color sensor. This data is used to train a Machine Learning algorithm that classifies the rock into one of the granite types stocked by the dealer and yields the category prediction. Results: The proposed system achieves 94% accuracy when classification is performed with the Random Forest algorithm. Conclusion: This system therefore gives end users an advantage in differentiating between different types of granite.


Introduction
Granite is a coarse-grained igneous rock. It forms when magma cools and crystallizes slowly under pressure deep underground. It is one of the most common rocks in the Earth's crust and supports a multimillion dollar industry. Owing to its beauty and composition, granite is used in a wide range of applications, including house décor, flooring and counter tops. This versatility has tempted many people to cheat on the quality and type of granite supplied. With advancing technologies such as machine learning and the Internet of Things (IoT), the verification and analysis of such samples has become much easier. These technologies make it possible to carry out a task efficiently, remotely and with high precision. This has led researchers to explore the field and produce several workable solutions. Some models recognize the type of granite by applying a Convolutional Neural Network (CNN) to images of granite; these models classify granites based on the image patterns of the different types (Ref. 1). Other models use Computer Vision to classify rocks from the colors and textures obtained from images, treating the color of the granite and its texture as the distinguishing factors (Ref. 2). A third group of models gathers data with a spectrophotometer and then applies Machine Learning to differentiate between granite types (Ref. 3).
The solutions developed so far all rely on heavy computing techniques and are not portable. A solution that can be run anywhere at any time needs to be developed to better cater to the needs of end users. To meet these demands and make the system more portable and efficient, a model combining Machine Learning and Edge Computing is tried in this work. Rising standards of living and per capita income have also inspired people to try different styles and luxuries, primarily in the look of their houses. This increase in income and in demand for granite has tempted some merchants to defraud end users by charging for a higher quality rock and delivering a lower quality one. As these frauds increase, buyers bear the financial loss. It is high time to develop a method to verify the quality of the rock and put an end to such fraud. End users also often seek suggestions when choosing from the plethora of granites available in the market. It is now more important than ever to devise a solution that corroborates the granite received, preventing fraud efficiently and effectively without compromising portability. Integrating ML with Edge Computing is an effective way to accomplish this. This project uses the color variations of granites to classify them into categories by collecting color values from separate points on the rock and applying Machine Learning to them. The major objective of this work is to verify the delivered rock based on its color.

Previous work
The color homogeneity of the granite pieces used is very important for the aesthetics of the place where they are installed. In general, humans carry out the classification based on quality and look, drawing on experience and exposure along with a bit of creativity. This opens the door for artificial intelligence based work on selecting granite of a specific color.
The work in Ref. 1 uses convolutional neural networks for granite classification. The researchers applied transfer learning with MNIST and CIFAR networks to a dataset of 1000 RGB images. Using a Nearest Neighbor classifier with CIFAR, they claim an accuracy above 85% in classifying granites.
In Ref. 2, an expert system based on computer vision was developed to classify granites. The authors claim that the Support Vector Machine outperforms the other methods they tried for classifying granites by color and texture. Classifying granites with automatic, artificial intelligence based systems would provide a repeatable procedure, which is not possible with the manual process followed in the market for decades. This would further add to a substantial improvement in companies' quality assessment procedures. It could also reduce losses from orders cancelled at the customer site and ease warehouse management. The work used a lot of 48 granites in twelve classes in total, subdivided into 64 samples per class; images were captured in the lab using a standard procedure, and cropped images of "544×544 pixels" were used in the experiments. They used methods based on the co-occurrence matrix, Gabor features, chromatic features and Local Binary Patterns. The work made use of multiple classifiers, including "Support Vector classifier", "Linear Classifier", "Naïve Bayes" and "Nearest Neighbor".
In some works, sum and difference histograms are used along with LVQ neural networks to classify granites (Ref. 4). In the letter published as Ref. 5, the researchers claim to use "wavelet analysis" to classify granites. A discrete wavelet transform is used to generate sequences of signals in their work. The researchers used a database of 30 images covering three classes of granite. The work claims 90% accuracy using LVQ neural networks.
In Ref. 6, the researchers proposed an "electro-mechanical system" that automatically categorizes marbles on the conveyor belt itself using a "hierarchical clustering approach". They used a Programmable Logic Controller (PLC) along with an auxiliary microcontroller to link the PLC and Matlab. Clustering based classification was implemented along with hierarchical classification based on cascaded color, texture and spectral features. An accuracy of 83.6% was recorded in their work for the correct classification of marbles. The work in Ref. 7 applies transfer learning to AlexNet and VGGNet to develop a Convolutional Neural Network based classifier for granite tiles, using a dataset of 2000 RGB images in 25 classes. In their experiments, fine-tuning VGGNet gave an accuracy of almost 99.3%.
The authors of Ref. 8 implemented their own CNN architecture and reached an accuracy of around 96.1% in marble quality classification. The authors of Ref. 9 experimented extensively, examining fifteen varieties of convolutional neural networks to sort dolomitic stone tiles. They observed that the "DenseNet201" model yielded an accuracy of 83.24% when trained on a data set of 489 digital images of granites in 3 classes.

Methods
The model developed takes the color values of the granite as input and feeds them to a machine learning algorithm. The sensor relays data to a NodeMCU, which analyses it and runs a machine learning model that classifies the granite being sampled into a specific class of rock.

Machine learning
Machine learning is a part of Artificial Intelligence (AI) that trains a computing machine to learn from data and improve. The term machine learning describes an automated process of pattern recognition and feature detection in data. Machine learning has also increased the comforts of living: personal assistants are now available that cater to a person's needs and desires. Machine learning can be categorised as supervised machine learning, unsupervised machine learning and semi-supervised learning, as shown in Figure 1.
Machine learning models require a large amount of data to be trained on for efficient prediction and increased accuracy. The data can be obtained in a variety of ways, including real-time data collection as done in this work. This data can be processed in many ways, such as normalization, removing records, or filling in missing data with the mean, median or mode. After processing the data, the machine learning models are ready to be trained.

Edge computing
Edge computing refers to a distributed computing paradigm in which data is stored and computations are executed closer to the actual site in order to improve response time and carry out the functions of any process faster. It is better thought of as a topology. The purpose of edge computing is to move computation away from data centers and closer to the edge of the network, reducing the load on the data center and giving quicker responses. It exploits the capabilities of smart objects such as smartphones and controller boards, which have built-in memory, to perform tasks and provide services immediately rather than contacting the data center and retrieving information for every task.

Dataset
The dataset was collected in real time from a granite dealership with the consent of the dealers concerned. This data is then used to train the IoT based machine learning model that runs with edge computing. The features of the dataset are the RGB values of different types of granite, and the last column gives the color of the granite. A sample of the collected data set is shown in Figure 3. The sensors were taken to a dealership and the RGB values of the different classes of granite were recorded. This data is then stored in a CSV file for easier training of ML models.
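A minimal sketch of how such a CSV file could be read back for training, using only the Python standard library. The column names (r1 ... b4, color), the column ordering and the three inline rows are hypothetical placeholders, not values from the authors' dataset; only the overall layout of 12 sensor columns followed by a label column is taken from the description of the data.

```python
import csv
import io

# Hypothetical sample in the layout described in the text: 12 RGB readings
# (four sampled points x three channels) followed by the granite colour.
# The column names and the numbers are illustrative, not the authors' data.
SAMPLE_CSV = """r1,g1,b1,r2,g2,b2,r3,g3,b3,r4,g4,b4,color
201,64,58,198,70,61,205,66,55,199,68,60,red
32,30,35,28,29,31,35,33,36,30,31,34,black
230,228,225,233,231,229,228,227,224,231,230,226,white
"""


def load_dataset(csv_text):
    """Return (header, features, labels) parsed from CSV text with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)                 # first row holds the column names
    features, labels = [], []
    for row in reader:
        if not row:                       # ignore blank lines
            continue
        features.append([int(v) for v in row[:-1]])  # the 12 sensor readings
        labels.append(row[-1])                        # last column is the class
    return header, features, labels


header, X, y = load_dataset(SAMPLE_CSV)
print(len(X), "rows,", len(X[0]), "features per row, classes:", sorted(set(y)))
```

Keeping the label as the last column means the same file can be passed to most training libraries with a simple column split.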

Data collection
The ESP8266 and color sensor were integrated to build the experimental setup for recording the RGB values of granites and collecting the dataset. A dealer was contacted for permission to record the values, and the readings were taken after obtaining permission and taking the necessary precautions. The values of red granite, black granite and white granite were collected and tabulated as shown in Figure 3. The sensor gives RGB values at each point by adjusting the s0 and s1 pins. These RGB values are stored in an array. The sequence diagram of the idea can be seen in the corresponding figure. The overall system architecture of the implemented idea is shown in Figure 5. This work is implemented using the Arduino IDE for edge computing, Python for machine learning, and HTML and CSS for the web application. With the help of special libraries, machine learning is carried out on the Arduino platform. The Arduino IDE makes uploading code to the NodeMCU much easier and quicker. Python is used to implement the machine learning modules with the scikit-learn library, and the resulting model is then transferred to the Arduino.
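The step of storing the RGB values of several points in an array can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Arduino program: the function name `readings_to_row`, the point-by-point column order and the sample readings are all hypothetical, and on the device the readings would come from the TCS3200 rather than from a hard-coded list.

```python
import csv
import io

POINTS_PER_SAMPLE = 4  # four points per slab gives 4 x 3 = 12 feature columns


def readings_to_row(points, label):
    """Flatten [(r, g, b), ...] readings from one slab into a labelled row."""
    if len(points) != POINTS_PER_SAMPLE:
        raise ValueError(
            "expected %d points, got %d" % (POINTS_PER_SAMPLE, len(points))
        )
    row = []
    for r, g, b in points:
        row.extend([r, g, b])   # assumed order: r1, g1, b1, r2, g2, b2, ...
    row.append(label)           # the class label goes in the last column
    return row


# Illustrative readings only; real values would come from the colour sensor.
slab_points = [(201, 64, 58), (198, 70, 61), (205, 66, 55), (199, 68, 60)]
row = readings_to_row(slab_points, "red")

# Write the row to an in-memory CSV; a file opened with newline="" works the same way.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
print(buffer.getvalue().strip())
```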
The dataset is used to train the machine learning models. After training, the algorithms are evaluated and compared to select the one with the highest accuracy for classification, i.e. Random Forest. The project is implemented in 3 steps: 1. Data collection: data is collected from the granite dealership; 2. Model selection: various ML algorithms are compared and one model is chosen; 3. Deployment: the model is deployed onto a NodeMCU for real-time use.
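The model selection step can be sketched with scikit-learn as follows. The data here is synthetic (invented class centres with added noise) and the hyperparameters and the 70/30 split are assumptions made for illustration; the paper does not report them, and the accuracies this sketch prints say nothing about the results obtained on the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the dealership data: 30 rows per class, 12 columns
# (four RGB readings per row). The class centres are invented.
centres = {"red": (200, 65, 60), "black": (30, 30, 33), "white": (230, 228, 225)}
X_parts, y_parts = [], []
for name, rgb in centres.items():
    base = np.tile(rgb, 4)                          # repeat (r, g, b) for 4 points
    X_parts.append(base + rng.normal(0, 12, size=(30, 12)))
    y_parts.extend([name] * 30)
X = np.clip(np.vstack(X_parts), 0, 255)
y = np.array(y_parts)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Candidate models; all settings below are assumptions, not the authors' values.
candidates = {
    "Random forest": RandomForestClassifier(n_estimators=20, max_depth=5, random_state=0),
    "SVM": SVC(kernel="rbf", gamma="scale"),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

accuracies = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))

best = max(accuracies, key=accuracies.get)
for name, acc in accuracies.items():
    print(f"{name}: {acc:.2f}")
print("Selected:", best)
```

A small forest with shallow trees is used here because the selected model must later fit in the limited memory of the microcontroller.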
Various machine learning algorithms can be used to classify data, and each relies on a specific mathematical concept. These concepts are applied to find the most suitable correlation with the given data, or a function that best fits the data being analyzed. The algorithms compared in this work are: Random Forest (Ref. 10); Support Vector Machine (Ref. 11); and K-Nearest Neighbors (KNN) (Ref. 12). The algorithms are implemented in Python and then converted to C so that they can be executed on the NodeMCU. NodeMCU is an open-source firmware used for IoT applications on open-source boards. The firmware uses the Lua scripting language and is built on the Espressif SDK for the ESP8266. An image of the ESP8266 is shown in the corresponding figure. The board carries a Wi-Fi SoC from Espressif Systems. The prototyping hardware is a dual in-line package that integrates a USB controller with a board carrying the MCU and antenna. UART, DAC and ADC interfaces are supported by the board and can be accessed through a specific set of pins among the 21 pins available.
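Of the three algorithms compared above, KNN is the simplest to write out in full, so it is used here to make the idea of classification by color concrete. The sketch below is a plain Python illustration, not the authors' implementation; the training points, the choice of k = 3 and the use of a single RGB triple per sample (rather than 12 columns) are assumptions made to keep the example short.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, sample, k=3):
    """Classify `sample` by majority vote among its k nearest training rows."""
    # Pair every training row's Euclidean distance to the sample with its label.
    distances = sorted(
        (math.dist(row, sample), label) for row, label in zip(train_X, train_y)
    )
    # Count the labels of the k closest rows and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Invented RGB readings, two per class.
train_X = [
    (200, 65, 60), (205, 70, 58),      # red granite
    (30, 30, 33), (28, 32, 35),        # black granite
    (230, 228, 225), (233, 230, 226),  # white granite
]
train_y = ["red", "red", "black", "black", "white", "white"]

print(knn_predict(train_X, train_y, (210, 60, 62)))
print(knn_predict(train_X, train_y, (25, 28, 30)))
```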
Deployment is done using Python IDLE and the Arduino IDE. Micromlgen is an open-source library developed to bring machine learning to microcontrollers. It essentially converts a trained ML model into optimized C code that the microcontroller can execute. This serves as an alternative to the TensorFlow package, which is computationally heavy and cannot be used on boards with limited capacity and memory. Micromlgen supports several basic machine learning algorithms, including SVM, Random Forests, Decision Trees, Gaussian Naive Bayes and XGBoost. The generated C code is written to a header file that is included in the Arduino sketch. In this way, ML is implemented on the NodeMCU. The sensor is then connected to the NodeMCU and the model is loaded onto the board. The physical device is then ready to classify granites in real time.
The TCS3200 color sensor module contains a TAOS TCS3200 RGB sensor chip and 4 white LEDs. An image of the TCS3200 color sensor is shown in the corresponding figure. The chip is the most important part of the module and is essentially a color light-to-frequency converter. The sensor is Arduino compatible, which makes it easy to handle. It senses red, green and blue light and outputs a frequency corresponding to each color. The module can be used in a variety of applications such as granite classification, color matching tests and color sorting robots. The setup built for Edge Computing using the ESP8266 and TCS3200 can be seen in Figure 6. Figure 7 shows sample results obtained during experimentation.
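The idea behind converting a trained tree-based model into C for a header file can be illustrated with a small generator. This is a sketch of the general technique only: it is not micromlgen's code and does not reproduce its output format. The tree, its thresholds, the class order and the function name `predict_granite` are all hypothetical.

```python
# A hand-written decision tree over x = (r, g, b); thresholds are invented.
# Internal nodes: (feature_index, threshold, left_subtree, right_subtree).
# Leaves: the class name as a string.
TREE = (1, 120.0, (0, 120.0, "black", "red"), "white")
CLASSES = ["black", "red", "white"]


def predict(tree, x):
    """Walk the tree in Python; useful for checking the generated C logic."""
    while not isinstance(tree, str):
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
    return tree


def tree_to_c(tree, indent=1):
    """Emit nested if/else C statements equivalent to `tree`."""
    pad = "    " * indent
    if isinstance(tree, str):
        return f"{pad}return {CLASSES.index(tree)};  /* {tree} */\n"
    feature, threshold, left, right = tree
    return (
        f"{pad}if (x[{feature}] <= {threshold}f) {{\n"
        + tree_to_c(left, indent + 1)
        + f"{pad}}} else {{\n"
        + tree_to_c(right, indent + 1)
        + f"{pad}}}\n"
    )


def export_header(tree):
    """Wrap the generated body in a C function suitable for a header file."""
    return "int predict_granite(const float *x) {\n" + tree_to_c(tree) + "}\n"


header_text = export_header(TREE)
print(header_text)
```

A random forest extends this by generating one such function per tree and returning the class that receives the most votes, which keeps inference to a series of comparisons that a small microcontroller can execute without a floating-point-heavy runtime.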

Conclusion
In this work, the model built helps users choose a granite that enhances the grandeur of their house, and also reduces merchants' losses from returned orders. The model aims to help end users ascertain the quality and type of granite they received. Improving on long-established image classification techniques, this model performs the task with edge computing for increased efficiency and effectiveness. The researchers tested the dataset with several machine learning algorithms, which produced satisfactory results. The Random Forest classifier gave 94% accuracy, SVM 78% and KNN only 75%. These values are tabulated in Table 1. The Random Forest algorithm yielded the best results. The accuracy could be improved by collecting a larger dataset.

Author contributions
Mrs. K. Madhavi was the chief mentor for the work and supervised the whole research, providing inputs from time to time. She reviewed the article and the methodology.
Mr. Krishna Chythanya conceptualised the idea, assisted in implementing the research with regular evaluations, and helped write the article. Mr. P. Gowtham Sai was instrumental in collecting data and implementing the code. Mr. S. Saikiran Manasa contributed to algorithm implementation and evaluation. Mr. G. Pranay Krishna contributed to documentation and review.

This project contains the following underlying data:
- The data set has around 90 rows, each consisting of 12 columns of RGB values of granite. Each of R, G and B is represented by 4 values in a row. These values were recorded using a TCS3200 color sensor on granite pieces, by visiting a couple of nearby granite dealer shops in Hyderabad, Telangana, India. The last column denotes the color of the granite and acts as the target variable that the machine learning algorithms predict.
- The RAR file consists of three folders, named KNN, SVM and Random Forest. Each folder contains the code for one of the three machine learning algorithms implemented for granite classification using Edge Computing.

Table 1. Different algorithms tried and their accuracy.

Algorithm        Accuracy
Random forest    94%
SVM              78%
KNN              75%

Blessing Olamide Taiwo
Federal University of Technology Akure, Akure, Ondo, Nigeria
Firstly, I want to thank the authors for putting this great work together and for their efforts.
The manuscript presents the development of a machine learning based granite classifier using Edge Computing, together with a website to help users choose which granite would go well with their decor. The authors used a color sensor [TCS3200] integrated with an ESP8266 board in the work.
I have the following comments and recommendation to improve the current state of the work.

Grammar and Presentation:
The work needs some rephrasing and corrections to improve the grammar and presentation (Pg 4, last paragraph, and other areas).

Literature Review:
The authors provide some literature on previous work, but I recommend that they include recent works on optimization algorithms and ensemble learning classifiers to give readers more grounding in the methodology used in this work.

Model Result validation and Data description:
I recommend that the author should provide the data in Table form and also provide the testing input dataset along with Figure 7.
The use of ROC as part of the result assessment should be included in the result section.
"ROC curve is a graphical representation of the performance of a binary classifier system as its discrimination threshold is varied.

Ratan Lal
Northwest Missouri State University, Maryville, MO, USA
The authors consider the problem of identifying the quality of granite rock to help people and avoid fraud in granite rocks. The broad approach is the following. First, a set of RGB colors for real images is collected through the color sensors. Then, the data is used to train different machine learning models, such as Random Forest, Support Vector Machine, and K Nearest Neighbor. Upon training, the best of the selected models is deployed on NodeMCU, which is an open-source firmware used for IoT applications on open-source boards. The model is trained with the scikit-learn library in Python, and then converted into C using Micromlgen in order to deploy the ML model on the NodeMCU. Finally, the Random Forest model has 94% accuracy, and it is deployed on the NodeMCU.

Pros:
1. The idea of identifying the quality of granite rock using IoT devices is interesting.
2. The paper is written well.

Major Comments:
1. I recommend the authors use the term "prediction" in place of "verification" as the machine learning model does not always give an accurate result.
2. It would be good if the authors could report the confusion matrix for the Random Forest model.
3. I recommend the authors remove the "Methods:" and "Results:" headings and the "Conclusion:" paragraph from the abstract.

Are sufficient details of methods and analysis provided to allow replication by others? Yes
If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? No

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Formal Verification, Cyber-Physical Systems, Robotics, Machine Learning, Internet of Things
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

The authors opted for a classification model, as the objective of this work is to classify granites based on color. There are several algorithms that analyze data and predict or classify the output, each working on a different principle. Training is the stage in which the machine learns from and analyses the data; the output and the efficiency of the model depend largely on this stage. The model is trained to produce a specific output for a given input. Evaluation is the stage in which the model's performance is tested. In this work, accuracy and the confusion matrix are used to evaluate the trained model. The overall machine learning process is depicted in Figure 2.
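The two evaluation measures named above, accuracy and the confusion matrix, can be computed as in the following sketch. The true and predicted labels are invented for illustration and are not results from this work; the class order used for the rows and columns is likewise an assumption.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Build a matrix whose rows are true classes and columns are predictions."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Invented labels for six test samples.
classes = ["black", "red", "white"]
y_true = ["red", "red", "black", "white", "white", "black"]
y_pred = ["red", "black", "black", "white", "white", "black"]

cm = confusion_matrix(y_true, y_pred, classes)
acc = accuracy(y_true, y_pred)

print("classes:", classes)
for name, row in zip(classes, cm):
    print(name, row)
print("accuracy:", round(acc, 3))
```

The diagonal of the matrix counts correct predictions, so accuracy equals the sum of the diagonal divided by the number of samples; the off-diagonal cells show which granite classes are confused with each other.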

Figure 1. Types of machine learning.
The data was collected by one of the authors of this work, as can be seen in Figure 4. The dataset acquired comprises 90 rows with 12 columns of RGB values describing 3 classes. A program was written in the Arduino IDE to collect the sensor readings, which the sensor reports as RGB values.

Figure 3. The sample of the data set used in this experiment.

Figure 4. One of the authors collecting data using sensors.

Figure 5. Architecture of the system implemented.

Figure 6. Setup for Edge Computing using the ESP8266 board and TCS3200 used for implementing this work.

Figure 7. Sample outputs for the test inputs given.

It is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings." The authors should provide the chart of both FPR and TPR for both the training data and the testing data to enhance the result interpretation.

Is the work clearly and accurately presented and does it cite the current literature? Partly
Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others? Partly
If applicable, is the statistical analysis and its interpretation appropriate? Partly
Are all the source data underlying the results available to ensure full reproducibility? Partly
Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
This is an open access peer review report distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.