Research Article

Utilizing data sampling techniques on algorithmic fairness for customer churn prediction with data imbalance problems

[version 1; peer review: 2 approved with reservations]
PUBLISHED 30 Sep 2021

This article is included in the Research Synergy Foundation gateway.

Abstract

Background: Customer churn prediction (CCP) refers to detecting which customers are likely to cancel the services provided by a service provider, for example, internet services. The class imbalance problem (CIP) in machine learning occurs when there is a large difference between the number of samples in the positive class and the negative class. It is one of the major obstacles in CCP, as it deteriorates performance in the classification process. Utilizing data sampling techniques (DSTs) helps to resolve the CIP to some extent.
Methods: In this paper, we review the effect of using DSTs on algorithmic fairness, i.e., we investigate whether the results pose any discrimination between male and female groups and compare the results before and after using DSTs. Three real-world datasets with unequal balancing rates were prepared and four ubiquitous DSTs were applied to them. Six popular classification techniques were utilized in the classification process. Both classification performance and algorithmic fairness were evaluated with notable metrics.
Results: The results indicated that the Random Forest classifier outperforms the other classifiers in all three datasets and that using the SMOTE and ADASYN techniques causes more discrimination against the female group. The rate of unintentional discrimination seems to be higher in the original data of extremely unbalanced datasets under the following classifiers: Logistic Regression, LightGBM, and XGBoost.
Conclusions: Algorithmic fairness has become a broadly studied area in recent years, yet there is very little systematic study on the effect of using DSTs on algorithmic fairness. This study presents important findings to further the use of algorithmic fairness in CCP research.

Keywords

Customer churn prediction, Data sampling techniques, Algorithmic fairness, Class imbalance problem

Introduction

Customer churn, the phenomenon in which customers shift to rival companies due to dissatisfaction with existing services or other unavoidable reasons,1 is a common issue encountered in every customer-oriented sector, including telecommunications. Customer churn prediction (CCP) is a supervised binary classification procedure that detects potential churners before they churn. Since there are no standardized principles for collecting data for CCP tasks, the data distribution between classes varies from one dataset to another, so one class may be extremely underrepresented compared to the other. In CCP, the target class indicates whether a customer churned or not. Churn is almost always the minority class, while the non-churn class usually comes in large numbers; churn is therefore considered a rare object2 in service-based domains, including telecom. Thus, telecom datasets typically suffer from the class imbalance problem (CIP), which leads to a situation in which minority instances remain unlearned.

Advanced machine learning techniques can be applied to predict potential churners. Consider a dataset with 10,000 data instances of which 10% are churn samples, i.e., 1,000 churners and 9,000 non-churners. Even if a carefully built model predicts 90% of the minority class correctly, 100 customers are still assigned to the wrong class. Suppose 60 churners are misclassified as non-churners, i.e., false negatives; the company will lose a considerable amount of revenue, since recruiting new customers is more expensive than keeping existing ones.3 The ultimate goal in the telecom sector is to increase profit by decreasing customer churn. Hence, CIP is an obstacle to achieving the major goal of CCP, since it degrades classification accuracy. Algorithmic fairness has become a very active research topic since ProPublica observed that algorithms could yield discriminative outcomes, which impacted a minority group in real life.4

Algorithmic fairness is monitored in line with the protected features or sensitive variables in the dataset. Sensitive data generally include, but are not limited to, gender, race, age group, or religion. Algorithmic fairness is achieved if the decisions generated by a model do not favor any individual or group more or less than others.5 The less bias in the training data, the greater the chance of achieving algorithmic fairness. However, it is almost impossible to train a zero-bias model, since historical data can contain bias for many reasons.6 Common sources of bias in training data include the compounding of initial bias over time, the use of proxy variables, and imbalanced sample sizes between minority and majority groups.7

In the CCP process, customers’ behavior is analyzed within specific time windows, for example within one month.8 Once a prediction is made, the outcomes are reused as training data for the next prediction. Therefore, there is a high chance of repeated bias entering the historical data without anyone noticing. One solution for CIP is to apply data sampling techniques (DSTs) to the training data. Since the main function of DSTs is to increase or decrease the number of sample instances to balance the majority and minority classes, they change the number of samples in different groups in the dataset. The main goal of this study is to explore and identify the impact of applying DSTs to training data on algorithmic fairness in the CCP process. To the best of our knowledge, there is very little research concerning algorithmic fairness in the CCP process. We believe the findings of this study will provide valuable insights for future CCP research.

Methods

Ethical Approval Number: EA1742021

Ethical Approval Body: Research Ethics Committee 2021, Multimedia University

In this study, the original dataset was used to prepare three versions of unbalanced datasets, with churn rates of 5%, 15%, and 30%. Four DSTs were applied to each version, and the results were compared with the unsampled original dataset to evaluate classification performance and the impact on algorithmic fairness. The step-by-step methods used to conduct the study are presented in Figure 1.


Figure 1. Procedures of the study.

Datasets

A real-world telecom dataset was provided by one of Malaysia’s leading telecom companies (see Underlying data for details on access to this dataset). The original dataset contains 1,265,535 customer records, collected from January 2011 to December 2011. Since the original dataset is huge in volume, we randomly selected 100,000 records for this study. We included demographics, call information, network usage, billing information, and customer satisfaction data in our dataset, since these are considered influential factors in the CCP process.9,10 A total of 22 features were extracted after careful aggregation, i.e., new features were created based on the original data and some unnecessary features were removed; the features are listed in Table 1.

Table 1. Features used in the real-world dataset.

No. | Name of the feature | Description
1 | Customer ID | Customer ID
2 | Age | Age of customer
3 | Is senior | Is the customer over 60 or not
4 | Gender | Gender of customer
5 | Is local | Is the customer Malaysian or international?
6 | Race | Is the customer Malay, Indian, Chinese, or Other?
7 | Technical-problem-count | Total technical complaints and general complaints made by a customer
8 | Complain-count | Total general complaints made by a customer
9 | Avrg download | Average download rate
10 | Avrg upload | Average upload rate
11 | T-Location | The location where the customer registered for the service
12 | HSBB area | Is the customer in an area where a high-speed connection is required or not
13 | Speed | Broadband speed the customer has registered for
14 | Price start | The value of the package the customer has bought
15 | Contract period | The contract period of the customer
16 | Median-outstanding | Average overdue fees
17 | Avrg local amt | Average amount spent on local calls
18 | Avrg std amt | Average amount spent on subscriber trunk dialing
19 | Avrg idd amt | Average amount spent on international calls
20 | Avrg voice usage | Average amount spent on voice calls
21 | Avrg dialup amt | Average amount spent on dialup service
22 | Churn | Whether the customer is churned or not

The final dataset was prepared with three different rates of unbalancing: 5%, 15%, and 30%. We created a Python script (see Extended data) that uses the Pandas data analysis library to prepare the three versions of the dataset. We chose these specific rates because we wanted to cover cases ranging from extremely unbalanced to intermediate levels.
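As a rough illustration of this step, the sketch below shows one way such unbalanced versions could be produced with Pandas. The column name Churn, its 0/1 coding, and the helper name are assumptions for illustration, not the authors' actual script.

```python
import pandas as pd

def make_unbalanced_version(df, churn_rate, target_col="Churn", seed=42):
    """Keep all non-churners and subsample churners so they form `churn_rate` of the result.
    Assumes the target column is coded as 0 (non-churn) / 1 (churn)."""
    non_churn = df[df[target_col] == 0]
    churn = df[df[target_col] == 1]
    # choose n_churn so that n_churn / (n_churn + len(non_churn)) == churn_rate
    n_churn = int(len(non_churn) * churn_rate / (1.0 - churn_rate))
    churn_sample = churn.sample(n=min(n_churn, len(churn)), random_state=seed)
    # shuffle the combined frame so churners are not clustered at the end
    return pd.concat([non_churn, churn_sample]).sample(frac=1.0, random_state=seed)

# Three versions with 5%, 15% and 30% churn, assuming `data` holds the 100,000 records:
# versions = {rate: make_unbalanced_version(data, rate) for rate in (0.05, 0.15, 0.30)}
```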

Data preprocessing

In the data preprocessing stage, we excluded any null values. Since we found only a few outliers in the selected dataset, we removed them manually without using any specific procedure. We applied four DSTs to the data: Random Over Sampler (ROS), Random Under Sampler (RUS),11 Synthetic Minority Oversampling Technique (SMOTE),12 and Adaptive Synthetic Sampling (ADASYN).13 The DSTs were selected based on their popularity and on our aim to assess the impact of each of them on algorithmic fairness in the CCP process.
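All four DSTs are implemented in the imbalanced-learn library; the following is a minimal sketch of how they could be applied, assuming X_train and y_train hold the encoded training features and churn labels. This is an illustrative sketch, not necessarily the authors' exact script.

```python
# pip install imbalanced-learn
from imblearn.over_sampling import RandomOverSampler, SMOTE, ADASYN
from imblearn.under_sampling import RandomUnderSampler

# the four data sampling techniques used in the study
samplers = {
    "ROS": RandomOverSampler(random_state=42),
    "RUS": RandomUnderSampler(random_state=42),
    "SMOTE": SMOTE(random_state=42),
    "ADASYN": ADASYN(random_state=42),
}

# X_train, y_train: encoded training features and churn labels for one dataset version
resampled = {name: sampler.fit_resample(X_train, y_train)
             for name, sampler in samplers.items()}
```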

Classification of data

We applied six popular classifiers: Random Forest (RF), Decision Tree (DT), LightGBM (LGBM), Gradient Boosting (GB), Logistic Regression (LR), and XGBoost.14 We created our own Python script (see Extended data) using the Scikit-learn machine learning library to perform this step. After careful exploratory data analysis, we dropped Customer ID, Avrg local amt, Avrg std amt, Avrg idd amt, and Avrg dialup amt from the predictor variable list, since they were weakly correlated with the target variable.
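For illustration, the six classifiers could be instantiated as below. Note that LightGBM and XGBoost ship in their own packages rather than in Scikit-learn; the hyperparameters shown (library defaults with fixed seeds) are assumptions rather than the settings used in the study.

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from lightgbm import LGBMClassifier   # from the lightgbm package
from xgboost import XGBClassifier     # from the xgboost package

classifiers = {
    "RF": RandomForestClassifier(random_state=42),
    "DT": DecisionTreeClassifier(random_state=42),
    "LGBM": LGBMClassifier(random_state=42),
    "GB": GradientBoostingClassifier(random_state=42),
    "LR": LogisticRegression(max_iter=1000),
    "XGBoost": XGBClassifier(random_state=42),
}

# drop weakly correlated predictors before fitting, e.g.:
# drop_cols = ["Customer ID", "Avrg local amt", "Avrg std amt", "Avrg idd amt", "Avrg dialup amt"]
# X = data.drop(columns=drop_cols + ["Churn"]); y = data["Churn"]
predictions = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    predictions[name] = clf.predict(X_test)
```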

Evaluation of experiment

We performed two evaluations: performance measures15 and algorithmic fairness metrics.16

Performance measures

In measuring classifier performance, we applied standard measures commonly used in machine learning classification tasks, including precision, recall, and accuracy. We also applied the F1 and AUC-ROC scores, since accuracy alone is not enough to evaluate the actual performance of the classifiers. We created our own script (see Extended data) using Scikit-learn, a free machine learning library for the Python programming language. The performance of each classifier was computed as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$$

where TP = true positive, TN = true negative, FP = false positive, FN = false negative.

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

$$\text{AUC-ROC} = \frac{\sum \text{Rank}(+) \; - \; \frac{|+|\,(|+|+1)}{2}}{|+| \times |-|},$$

where Σ Rank(+) is the sum of the ranks of all positively classified examples, |+| is the number of positive examples in the dataset, and |−| is the number of negative examples in the dataset.
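For reference, these scores map directly onto Scikit-learn metric functions. The sketch below assumes binary 0/1 churn labels and uses weighted averaging for precision, recall, and F1; the averaging choice is our assumption for illustration, not a documented choice of the authors.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

def evaluate(clf, X_test, y_test):
    """Compute the five performance measures for a fitted classifier."""
    y_pred = clf.predict(X_test)
    y_score = clf.predict_proba(X_test)[:, 1]   # probability of the churn class
    return {
        "Accuracy": accuracy_score(y_test, y_pred),
        "Precision": precision_score(y_test, y_pred, average="weighted"),
        "Recall": recall_score(y_test, y_pred, average="weighted"),
        "F1-score": f1_score(y_test, y_pred, average="weighted"),
        "AUC-ROC": roc_auc_score(y_test, y_score),
    }
```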

Algorithmic fairness metrics

We assessed whether the classifiers discriminate between women, the protected group, and men, the non-protected group. We applied two well-known fairness definitions in measuring algorithmic fairness and utilized the popular AI Fairness 360 toolkit to calculate the scores.16

Statistical parity (SP): Also known as the equal acceptance rate. SP is achieved if women have the same probability of being predicted in the positive, i.e., churn, class as men.17

The SP difference measures the difference in a specific outcome between the protected (female) and non-protected (male) groups. The smaller the SP difference between the two groups, the more the model treats the protected group statistically similarly to the non-protected group.

SP is calculated as follows:

$$\Pr(Y = 1 \mid \text{Group} = \text{male}) = \Pr(Y = 1 \mid \text{Group} = \text{female}),$$

where Y is the predicted decision.

Disparate Impact (DI): Also known as indirect discrimination, in which no protected variables are applied directly but biased outcomes are still produced through variables correlated with the protected variables.18 The standard threshold in the calculation of DI is 0.8, meaning that a group whose DI value falls below 0.8 is being discriminated against by the classifier.

The 80% threshold is advised by the US Equal Employment Opportunity Commission.19 The model can be considered free of disparate impact when the value is above 80%, but it should also remain below 125%.20

DI is calculated as follows:

$$DI = \frac{\Pr(Y = 1 \mid \text{Group} = \text{female})}{\Pr(Y = 1 \mid \text{Group} = \text{male})} \geq \tau = 0.8,$$

where Y is the predicted decision.
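The study computes these quantities with the AI Fairness 360 toolkit. Purely for intuition, the hand-rolled sketch below reproduces the two definitions with Pandas, assuming a gender column coded as "female"/"male" and 0/1 churn predictions; it is not the toolkit's API.

```python
import pandas as pd

def fairness_metrics(gender, y_pred, protected="female", unprotected="male"):
    """SP difference and DI for the predicted churn class (Y = 1)."""
    gender = pd.Series(gender).reset_index(drop=True)
    y_pred = pd.Series(y_pred).reset_index(drop=True)
    p_protected = y_pred[gender == protected].mean()       # Pr(Y=1 | Group=female)
    p_unprotected = y_pred[gender == unprotected].mean()   # Pr(Y=1 | Group=male)
    return {
        "SP difference": p_protected - p_unprotected,
        "DI": p_protected / p_unprotected,
        "passes 80% rule": (p_protected / p_unprotected) >= 0.8,
    }
```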

Results

The preliminary classification results for the datasets with different unbalanced rates using the four DSTs are shown in Tables 2–4. Table 2 shows the classification performance obtained when testing on the dataset with a 5% unbalanced rate with respect to the chosen classifiers and the four DSTs.

Table 2. The classification results for the dataset with 5% unbalanced rate.

5% imbalanced with ROS
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.99 | 0.99 | 0.99 | 0.99 | 0.99
DT | 0.98 | 0.98 | 0.98 | 0.98 | 0.97
LGBM | 0.89 | 0.90 | 0.89 | 0.89 | 0.96
GB | 0.85 | 0.85 | 0.85 | 0.85 | 0.99
LG | 0.78 | 0.81 | 0.78 | 0.78 | 0.87
XGBoost | 0.93 | 0.93 | 0.93 | 0.93 | 0.97

5% imbalanced with RUS
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.84 | 0.84 | 0.84 | 0.84 | 0.93
DT | 0.80 | 0.80 | 0.80 | 0.80 | 0.79
LGBM | 0.84 | 0.84 | 0.84 | 0.84 | 0.93
GB | 0.85 | 0.85 | 0.85 | 0.85 | 0.93
LG | 0.80 | 0.83 | 0.80 | 0.81 | 0.88
XGBoost | 0.83 | 0.83 | 0.83 | 0.83 | 0.92

5% imbalanced with SMOTE
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.98 | 0.98 | 0.98 | 0.98 | 0.99
DT | 0.96 | 0.96 | 0.96 | 0.96 | 0.96
LGBM | 0.98 | 0.98 | 0.98 | 0.98 | 0.99
GB | 0.96 | 0.96 | 0.96 | 0.96 | 0.99
LG | 0.79 | 0.81 | 0.79 | 0.79 | 0.87
XGBoost | 0.98 | 0.98 | 0.98 | 0.98 | 0.99

5% imbalanced with ADASYN
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.98 | 0.98 | 0.98 | 0.98 | 0.99
DT | 0.96 | 0.96 | 0.96 | 0.96 | 0.96
LGBM | 0.98 | 0.98 | 0.98 | 0.98 | 0.99
GB | 0.96 | 0.96 | 0.96 | 0.96 | 0.99
LG | 0.77 | 0.81 | 0.77 | 0.77 | 0.84
XGBoost | 0.98 | 0.98 | 0.98 | 0.98 | 0.99

Table 3. The classification results for the dataset with 15% unbalanced rate.

15% imbalanced with ROS
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.97 | 0.97 | 0.97 | 0.97 | 0.99
DT | 0.93 | 0.93 | 0.93 | 0.93 | 0.93
LGBM | 0.78 | 0.78 | 0.78 | 0.78 | 0.87
GB | 0.76 | 0.77 | 0.76 | 0.76 | 0.84
LG | 0.64 | 0.64 | 0.64 | 0.64 | 0.69
XGBoost | 0.81 | 0.81 | 0.81 | 0.81 | 0.90

15% imbalanced with RUS
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.76 | 0.75 | 0.76 | 0.76 | 0.83
DT | 0.68 | 0.68 | 0.68 | 0.68 | 0.68
LGBM | 0.76 | 0.77 | 0.76 | 0.76 | 0.84
GB | 0.76 | 0.76 | 0.76 | 0.76 | 0.84
LG | 0.64 | 0.64 | 0.64 | 0.64 | 0.70
XGBoost | 0.75 | 0.76 | 0.75 | 0.76 | 0.84

15% imbalanced with SMOTE
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.93 | 0.94 | 0.93 | 0.93 | 0.97
DT | 0.89 | 0.89 | 0.89 | 0.89 | 0.89
LGBM | 0.94 | 0.94 | 0.94 | 0.94 | 0.97
GB | 0.91 | 0.92 | 0.91 | 0.91 | 0.96
LG | 0.64 | 0.64 | 0.64 | 0.64 | 0.70
XGBoost | 0.94 | 0.76 | 0.94 | 0.94 | 0.97

15% imbalanced with ADASYN
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.93 | 0.93 | 0.93 | 0.93 | 0.97
DT | 0.88 | 0.89 | 0.88 | 0.89 | 0.88
LGBM | 0.94 | 0.94 | 0.94 | 0.94 | 0.97
GB | 0.91 | 0.92 | 0.91 | 0.91 | 0.96
LG | 0.59 | 0.59 | 0.59 | 0.59 | 0.63
XGBoost | 0.94 | 0.94 | 0.94 | 0.94 | 0.97

Table 4. The classification results for the dataset with 30% unbalanced rate.

30% imbalanced with ROS
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.89 | 0.89 | 0.89 | 0.89 | 0.95
DT | 0.83 | 0.83 | 0.83 | 0.83 | 0.83
LGBM | 0.76 | 0.76 | 0.76 | 0.76 | 0.84
GB | 0.74 | 0.75 | 0.74 | 0.74 | 0.82
LG | 0.66 | 0.67 | 0.66 | 0.66 | 0.68
XGBoost | 0.77 | 0.77 | 0.77 | 0.77 | 0.85

30% imbalanced with RUS
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.74 | 0.74 | 0.74 | 0.74 | 0.82
DT | 0.66 | 0.66 | 0.66 | 0.66 | 0.66
LGBM | 0.75 | 0.76 | 0.75 | 0.75 | 0.83
GB | 0.75 | 0.76 | 0.75 | 0.75 | 0.82
LG | 0.63 | 0.64 | 0.63 | 0.63 | 0.68
XGBoost | 0.75 | 0.75 | 0.75 | 0.75 | 0.83

30% imbalanced with SMOTE
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.86 | 0.86 | 0.86 | 0.86 | 0.92
DT | 0.79 | 0.79 | 0.79 | 0.79 | 0.79
LGBM | 0.86 | 0.87 | 0.86 | 0.86 | 0.92
GB | 0.84 | 0.86 | 0.84 | 0.84 | 0.91
LG | 0.66 | 0.67 | 0.66 | 0.66 | 0.68
XGBoost | 0.86 | 0.86 | 0.86 | 0.86 | 0.92

30% imbalanced with ADASYN
Classifier | Accuracy | Precision | Recall | F1-score | AUC-ROC
RF | 0.86 | 0.86 | 0.86 | 0.86 | 0.92
DT | 0.79 | 0.79 | 0.79 | 0.79 | 0.79
LGBM | 0.86 | 0.87 | 0.86 | 0.86 | 0.93
GB | 0.84 | 0.85 | 0.84 | 0.84 | 0.91
LG | 0.55 | 0.92 | 0.55 | 0.65 | 0.63
XGBoost | 0.86 | 0.87 | 0.86 | 0.86 | 0.93

Table 3 shows the detailed classification performance obtained when testing on the 15% unbalanced dataset with respect to the chosen classifiers and the four DSTs.

Table 4 shows the detailed classification performance obtained when testing on the 30% unbalanced dataset with respect to the chosen classifiers and the four DSTs.

In our study, we observed that one variable, Is senior, remained unbalanced even after applying the DSTs. The algorithmic fairness scores for each group at the different unbalanced rates are described in Tables 5–7. Table 5 shows the comparative results of the SP difference and DI scores calculated on the 5% unbalanced dataset and the original dataset.

Table 5. The algorithmic fairness measures on 5% unbalanced dataset.

Classifier | Metric | 5% original data | 5% imbalanced with ROS | 5% imbalanced with RUS | 5% imbalanced with SMOTE | 5% imbalanced with ADASYN
RF | SP Difference | −0.0056 | −0.0703 | −0.0524 | 0.1402 | 0.1401
RF | DI | 0.80 | 0.86 | 0.89 | 1.32 | 1.32
LightGBM | SP Difference | −0.0067 | −0.0703 | −0.0524 | 0.1402 | 0.1390
LightGBM | DI | 0.79 | 0.86 | 0.89 | 1.32 | 1.32
GB | SP Difference | −0.0057 | −0.0854 | −0.5456 | 0.1349 | 0.1319
GB | DI | 0.80 | 0.84 | 0.89 | 1.31 | 1.29
LR | SP Difference | −0.0024 | −0.0789 | −0.0227 | 0.1765 | 0.1591
LR | DI | 0.64 | 0.94 | 0.96 | 1.34 | 1.28
XGBoost | SP Difference | −0.0069 | −0.0842 | −0.5446 | 0.1387 | 0.1383
XGBoost | DI | 0.78 | 0.85 | 0.89 | 1.32 | 1.32
DT | SP Difference | −0.0073 | −0.0698 | −0.0397 | 0.1331 | 0.1317
DT | DI | 0.86 | 0.87 | 0.91 | 1.30 | 1.29

Table 6. The algorithmic fairness measures on 15% unbalanced dataset.

Classifier | Metric | 15% original data | 15% imbalanced with ROS | 15% imbalanced with RUS
RF | SP Difference | −0.009644 | −0.058223 | −0.049729
RF | DI | 0.871212 | 0.892587 | 0.891628
LightGBM | SP Difference | −0.011044 | −0.070341 | −0.059666
LightGBM | DI | 0.856964 | 0.853293 | 0.872208
Gradient Boosting | SP Difference | −0.006168 | −0.057517 | −0.055826
Gradient Boosting | DI | 0.913621 | 0.874225 | 0.877351
Logistic Regression | SP Difference | −0.007872 | −0.1004 | 0.01865
Logistic Regression | DI | 0.711771 | 0.79581 | 1.03868
XGBoost | SP Difference | −0.011008 | −0.069346 | −0.059144
XGBoost | DI | 0.864076 | 0.858386 | 0.874791
Decision Tree | SP Difference | −0.016556 | −0.05406 | −0.016812
Decision Tree | DI | 0.900914 | 0.907414 | 0.967054

Table 7. The algorithmic fairness measures on 30% unbalanced dataset.

Classifier | Metric | 30% original data | 30% imbalanced with ROS | 30% imbalanced with RUS
RF | SP Difference | −0.035246 | −0.046984 | −0.043816
RF | DI | 0.824513 | 0.911002 | 0.905239
LightGBM | SP Difference | −0.036931 | −0.055603 | −0.059621
LightGBM | DI | 0.811701 | 0.882727 | 0.872917
Gradient Boosting | SP Difference | −0.027146 | −0.04126 | −0.044747
Gradient Boosting | DI | 0.847531 | 0.910397 | 0.901757
Logistic Regression | SP Difference | 0.002424 | 0.011873 | 0.029787
Logistic Regression | DI | 1.042928 | 1.033943 | 1.063294
XGBoost | SP Difference | −0.036665 | −0.056972 | −0.058089
XGBoost | DI | 0.826001 | 0.881622 | 0.877416
Decision Tree | SP Difference | −0.28866 | −0.033586 | −0.028326
Decision Tree | DI | 0.911711 | 0.94218 | 0.944787

Table 6 displays the comparative results of the SP difference and DI scores calculated on the 15% unbalanced dataset and the original dataset.

Table 7 describes the comparative results of the SP difference and DI scores calculated on the 30% unbalanced dataset and the original dataset.

Discussion

Overview of experimental results

Recent work on algorithmic fairness in machine learning applications can be broadly organized into three main trends. Some studies emphasize enhancing or proposing better fairness notions and evaluation metrics in line with the domains concerned,17,21 some focus on ways to mitigate bias in the classification process (which can be further divided into three main groups: pre-, in-, and post-processing techniques),22-25 while the last trend proposes how to maintain ethical AI standards and policies when practicing machine learning applications in different sectors.26,27

Despite some previous empirical studies on the impact of preprocessing techniques on algorithmic fairness, their findings could not pinpoint the direct impact of using DSTs on algorithmic fairness. Lourenc and Antunes,28 the closest work to our research, examine the effect of data preparation on algorithmic fairness. However, their work was tested on two small datasets and provides only general results for random under- and over-sampling DSTs. Importantly, it does not cover the widely applied DSTs SMOTE and ADASYN. In contrast, we apply real-world business data and show how different DSTs behave at dissimilar levels of imbalance.

In the classification task, RF appears to be the best classifier, since it yielded the best results over the other five models, while LR produced the worst scores for almost all metrics. We observed that RUS worked better for the extremely unbalanced case than for the 15% and 30% imbalance rates. The best outcomes were obtained with ROS, SMOTE, and ADASYN at all unbalanced rates; thus, it can be concluded that oversampling techniques provide more promising prediction results than undersampling techniques. This might be because undersampling modifies the data by discarding majority instances, leaving the dataset with less useful information for learning.

For all three unbalanced rates, the original dataset always gave smaller statistical parity differences (SPD) compared to the sampled datasets created with the four DSTs, while the datasets resampled with RUS and ROS yielded slightly larger SPD, although the statistics showed no disparate impact. However, we can hypothetically consider that bias may still be present, as both RUS and ROS have their limitations. With RUS, important and essential data may have been removed, and the classifier could produce biased results because there was less information to learn from. With ROS, on the other hand, prediction performance could be biased due to overfitting. In this sense, it is advisable to apply different fairness measures and compare the fairness scores. For the DI scores, a DI below 0.8 indicates indirect discrimination against the unprotected group. The mathematical formulation of DI suggests equalizing the outcomes between protected and unprotected groups. In reality, however, the conditions of the context of interest may lead us to allow DI for a specific group up to some percentage. For example, in telecom CCP, the number of female customers could be much smaller in the dataset, since males more often register for a network plan on behalf of the whole household. Therefore, we consider applying DI with the 80% rule to be reasonable.

In the 5% unbalanced original dataset, LGBM, LR, and XGBoost showed DI values of 0.79, 0.64, and 0.78, respectively, whereas there was no disparate impact in the other two original datasets (15% and 30%). This suggests that more discrimination can occur in a more unbalanced dataset. The analysis of all datasets resampled with SMOTE and ADASYN provides alarming information on the classifiers' discrimination against the unprotected group. The 30% unbalanced dataset yields the most unfair results, since it shows the highest SPD between the female and male groups under LR, at 0.38 and 0.43, respectively. Overall, among all DSTs, ADASYN and SMOTE tend to produce more unfair outcomes than the other DSTs. Conversely, they both provide better classification performance than RUS and ROS. There is no large difference among the three data unbalanced levels. Note, however, that in this study we experimented only with the gender attribute as the sensitive variable.

Opportunities and challenges

Due to the nature of the CCP process and the rarity issue, training datasets are likely to contain compounded bias and to suffer from imbalance problems not only in the target class but also in other attributes, including sensitive variables. We noticed that one variable remained unbalanced even after applying the DSTs; in such a case, a careful selection of data attributes should be made to avoid selection bias.

As the quality of training data is important, we would suggest enhanced data repairing mechanisms to prevent bias in the training data. Furthermore, the algorithmic fairness problem mostly concerns societal discrimination. For example, in a scholarship selection process, if classifiers favor males over females with the same qualifications, the females' chances of obtaining a scholarship decrease. In a profit-centered industry like telecom, one might think that customers lose nothing if any group is favored more or less. However, it is important to consider the impact of biased decisions for the sake of the company's reputation, the importance of equal treatment of customers, and the practice of ethical AI policies.

Conclusions

In this paper, we experimented on three versions of unbalanced real-world telecom datasets to assess the impact of using four types of DSTs on algorithmic fairness in the CCP process and compared the results with the unsampled original dataset. Classification performance and algorithmic fairness were evaluated with well-known metrics. The outcomes imply that RF provides the best classification results. Using SMOTE and ADASYN yields larger SPD between male and female groups as well as a disparate impact on the female group relative to the male group. Previous work has emphasized fairness in applications such as choosing scholarship candidates, releasing prisoners on parole, and selecting credit candidates. Since machine learning applications will be applied to almost every sector in the near future, the practice of using fairer or unbiased systems is essential. Our study highlights the importance of paying attention to algorithmic fairness in machine-driven decision-making in profit-centered and customer-oriented sectors, for which very little research has been done. In particular, our findings highlight that a careful choice of DSTs must be made to achieve unbiased prediction results. In future work, we would like to test the same procedure on a larger dataset and measure additional algorithmic fairness metrics to investigate the most suitable fairness measures for the CCP task. Moreover, we would like to test more sensitive variables beyond gender.

Data availability

Underlying data

The real-world telecom dataset was obtained from the Business Intelligence and Analytics department of Telekom Malaysia Bhd. The authors were required to go through a strict approval process following the company's established data governance framework. Interested readers/reviewers may contact the Business Intelligence and Analytics department to request the data (technicalsuport@tm.com.my). The decision as to whether or not to grant access to the data is at the discretion of Telekom Malaysia Bhd.

As most telco companies own similar customer data, other customer churn datasets that are representative of the data being used in this research can be found as follows:

Extended data

Analysis code available from: https://github.com/mawmaw/fairness_churn.

Archived analysis code as at time of publication: https://doi.org/10.5281/zenodo.5516218.29

License: MIT License.

