Final-year project: Development of a framework to explore Internet finance risks in the banking sector

Student: MUSTAPHA AMARIGH

Program: Master Big Data Analytics & Smart Systems (BDSaS)

Supervisor: Pr. ZINEDINE AHMED

Year: 2024

Abstract: Internet finance is a new financial business model in which technology is used to enable, improve, and streamline financial services. Many banks already offer online financial products such as money transfers and payments, and some have also developed peer-to-peer (P2P) online lending. Because Internet finance is still developing, it faces many risks, including technical, operational, and credit risks. As a first version, this study focuses only on credit risk in the banking sector. The usual approach to credit risk assessment applies a classification model to historical customer data, covering both good and bad customers, to find the relationship between customer features and potential credit risk. To control credit risk in banks, I propose a framework that divides customers into three groups: "Normal", "Low-risk", and "High-risk". The problem is therefore treated as semi-supervised: an unsupervised step first identifies the credit risk levels, and a supervised step then finds the best classification model to support decision-making. The framework has three main steps. The first is duplicate elimination after feature selection. The second is balancing the categorical data with my own algorithm, "MySMOTEC" (My Synthetic Minority Oversampling Technique for Categorical data). The third is the Chi-square feature selection method, which selects the important features that have a positive dependency on the target and is compared with other feature selection methods, in particular mutual information and random forest feature selection. The decision models used to compare these three methods are a feed-forward neural network, a random forest classifier, and logistic regression, which are among the most common models for assessing credit risk.
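To make the Chi-square feature selection step more concrete, the sketch below scores categorical features against a three-level risk label using the standard Pearson chi-square statistic of independence. It is only an illustration under stated assumptions, not the implementation used in this project: the function names (`chi_square_score`, `rank_features`), the feature names (`employment`, `favorite_color`), and the six toy customers are all hypothetical.

```python
from collections import Counter


def chi_square_score(feature_values, labels):
    """Pearson chi-square statistic between one categorical feature and the target."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))  # joint counts (value, label)
    feature_totals = Counter(feature_values)         # marginal counts per feature value
    label_totals = Counter(labels)                   # marginal counts per class
    score = 0.0
    for value, value_count in feature_totals.items():
        for label, label_count in label_totals.items():
            # Count expected in this cell if feature and target were independent.
            expected = value_count * label_count / n
            score += (observed.get((value, label), 0) - expected) ** 2 / expected
    return score


def rank_features(rows, labels, feature_names):
    """Return (feature name, score) pairs sorted from most to least dependent."""
    scores = {
        name: chi_square_score([row[j] for row in rows], labels)
        for j, name in enumerate(feature_names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: six customers, two categorical features.
feature_names = ["employment", "favorite_color"]
rows = [
    ("stable", "blue"),
    ("stable", "red"),
    ("temporary", "blue"),
    ("temporary", "red"),
    ("none", "blue"),
    ("none", "red"),
]
labels = ["Normal", "Normal", "Low-risk", "Low-risk", "High-risk", "High-risk"]

ranking = rank_features(rows, labels, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy data, `employment` determines the risk group exactly and so receives a high score, while `favorite_color` is spread evenly across the groups and scores zero. A selection step would keep the top-ranked features; in practice the cutoff is usually tied to a significance level or a fixed number of features.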