Join thousands of AI enthusiasts and experts at the Learn AI Community.

Publication

Latest

Applying Classification Algorithms to Past Loan Data

Last Updated on July 5, 2022 by Editorial Team

Author(s): Gencay I.

Originally published on Towards AI the World’s Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses.

KNN, Decision Tree, Support Vector Machine, Logistic Regression

Photo by Scott Graham onย Unsplash

In this data set, I am going to conduct classification machine learning analysis on past loan data whichย are;

Content Table
ยท Data Visualization
ยท One hot encoding
ยท Feature Selection
ยท Normalize Data
ยท Classification
โˆ˜ K Nearest Neighbor
โˆ˜ Evaluation Metrics of KNN
โˆ˜ Decision Tree
โˆ˜ Evaluation Metrics of Decision Tree
โˆ˜ Support Vector Machine
โˆ˜ Evaluation Metrics of SVM
โˆ˜ Logistic Regression
โˆ˜ Evaluation Metrics of Logistic Regression
โˆ˜ Model Evaluation using a Test set
โˆ˜ Jaccard Scores
โˆ˜ F1 Scores
โˆ˜ Final Evaluation

Let's load the necessary libraries;

Image byย Author

The Loan_train.csv data set includes details of 346 customers whose loans are already paid off or defaulted.

Image byย Author

Lets loadย data;

Image byย Author

It is always efficient to look shape of data, to see the bigย picture.

Image byย Author

Now let's fix the data frames columnย type.

Image byย Author

Data Visualization

Let's see how many of each class is in our dataย set

Image byย Author

Let's plot some columns to understand better

Image byย Author
Image byย Author

Let's look at the day of week people get theย loan

Image byย Author

We see that people who get the loan at the end of the week don't pay it off, so let's use Feature binarization to set threshold values less than dayย 4

Image byย Author

Now it is time to change categorical features to numerical because we will use machine learning algorithms.

Image byย Author

86 % of females pay their loans while only 73 % of males pay theirย loan

Let's convert male to 0 and female toย 1:

Image byย Author

One hotย encoding

Now letโ€™s look education column.

Image byย Author
Image by Author- These are the features that weโ€™re gonna use in our prediction.

We use dummies to transform education from categorical to numerical.

Image byย Author

Feature Selection

Letโ€™s define features;

Image byย Author

Now it is time to define ourย label;

Image byย Author

Normalize Data

Image byย Author

Classification

These are the classification techniques that I will use in thisย Dataset.

  • K Nearest Neighbor(KNN)
  • Decision Tree
  • Support Vectorย Machine
  • Logistic Regression

K Nearestย Neighbor

Now it is time to split train and test data, as usual, 0.2โ€“0.8ย portion.

Image byย Author
Image byย Author
Image byย Author

Now it is time to look into the accuracy of test and trainย data.

Image byย Author

To define bestย K;

Image byย Author

As we can see result 7 is the best K for ourย data.

Image byย Author
Image byย Author
Image by Author- Fit theย Model

Evaluation Metrics ofย KNN

Image byย Author

Decision Tree

Now let's try using Decision Tree algorithms.

Image byย Author
Image byย Author

To define the best of theย depth;

Image byย Author

5 is the best depth score according to accuracyย scores.

Image byย Author

Letโ€™s conduct our algorithm then and evaluate;

Evaluation Metrics of Decisionย Tree

Image byย Author

Support Vectorย Machine

Now letโ€™s useย SVM.

Image byย Author

To find out the best model inย SVM;

Image byย Author
Image byย Author
Image byย Author

Evaluation Metrics ofย SVM

Image byย Author

Logistic Regression

Now it is time to use Logistic Regression.

Lets lock andย load;

Image byย Author

Train-test split;

Image byย Author

Find the bestย solver;

Image byย Author
Image byย Author

Evaluation Metrics of Logistic Regression

Image byย Author

Model Evaluation using a Testย set

Image byย Author
Image byย Author

Data processing;

Image byย Author

Jaccard Scores

Image byย Author
Image byย Author
Image byย Author
Image byย Author

F1 Scores

Image byย Author
Image byย Author
Image byย Author
Image byย Author

Final Evaluation

Image byย Author

Thanks, IBM for Machine Learning Tutorial which gets meย there.


Applying Classification Algorithms to Past Loan Data was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Join thousands of data leaders on the AI newsletter. Itโ€™s free, we donโ€™t spam, and we never share your email address. Keep up to date with the latest work in AI. From research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming aย sponsor.

Published via Towards AI

Feedback โ†“