Bank Scan: Your Personal Financial Advisor
Last Updated on May 24, 2022 by Editorial Team
Author(s): M Khorasani
Artificial Intelligence, Programming
Developing an AI Fintech application to analyze and determine the financial health of banking clients
In a world where 2-3 billion people are underbanked, including 25% of households in the United States, the need for intelligent banking analytics can hardly be overstated. For those unfamiliar with the term, an "underbanked" individual is one who has access to a bank account but only limited financial services and insights. Especially for many of us millennials, the absence of a solid financial advisor, such as the likes of Apple Pay, can see us siphon off our wealth at warp speed.
Admittedly, not everyone has access to advanced analytics provided by their bank or smartphone. Most of us, however, have access to some form of bank statement. While current analytics services have the luxury of interfacing directly with your bank account and acquiring information first-hand, a third-party service would need to quite literally read and digest the information printed on your bank statement. This is where a Fintech application with artificial intelligence can step in and add value for the underbanked client. And this is precisely what this AI Fintech application, hereafter referred to as "Bank Scan", will do.
Methodology
Bank Scan is an application that reads your bank statement, calculates your total credit, debit, and balance, and even deciphers which categories you have been spending your money in. Finally, it produces a "financial health score" that indicates how healthy your finances are. All of this is done using techniques such as natural language processing, data mining, and, most saliently, artificial intelligence. Specifically, the program uses several Python packages to extract data from your bank statement that would otherwise have to be extracted by a human: Tabula extracts the tabular data and feeds it into a Pandas dataframe, while spaCy handles the natural language processing of the document's human-readable text and uses AI to find items such as name and currency.
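As a rough sketch of how these two libraries might be wired together (this is not the application's actual code): `tabula.read_pdf` from the tabula-py package returns a list of pandas DataFrames, one per detected table, and a loaded spaCy pipeline exposes named entities through `doc.ents`. The function names, the `en_core_web_sm` model, and the choice of the `PERSON` and `MONEY` labels are assumptions made for illustration.

```python
def extract_transactions(pdf_path):
    """Read every table in a PDF statement into one pandas DataFrame.

    Assumes tabula-py (which needs a Java runtime) and pandas are installed;
    they are imported lazily so the rest of the sketch works without them.
    """
    import pandas as pd
    import tabula

    # read_pdf returns a list of DataFrames, one per table it detects.
    tables = tabula.read_pdf(pdf_path, pages="all", multiple_tables=True)
    return pd.concat(tables, ignore_index=True)


def find_entities(text, nlp, labels=("PERSON", "MONEY")):
    """Return (text, label) pairs for the named entities spaCy finds.

    `nlp` is a loaded spaCy pipeline, e.g. spacy.load("en_core_web_sm").
    Only entities whose label is in `labels` are kept.
    """
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents if ent.label_ in labels]
```

In practice, column names and layouts differ between banks, so the concatenated frame would usually need renaming and cleaning before any totals are computed.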
Once the content of the bank statement is in a dataframe, computing the total credit, debit, and balance is fairly simple. What is not so trivial is determining which categories the credit and debit transactions fall into. For this purpose, a bag-of-words approach was used to analyze the description of each transaction and decipher which category the item belongs to. For instance, if the description contains any of the following words, the transaction is classified as spending on food:
food_words = ['restau', 'burger', 'food', 'sandwich', 'steak', 'grocer', 'meal', 'mcdonald', 'lunch', 'dinner', 'breakfast', 'gourmet', 'wine', 'bar', 'drink', 'f&b', 'beverage', 'nutri', 'meat', 'eat', 'mexic']
A dictionary of hundreds of stemmed and lemmatized words was compiled for each of the following spending categories and used to classify each transaction:
- Housing
- Food
- Insurance
- Utilities
- Transportation
- Healthcare
- Recreation
- Personal
- Education
- Investments
- Other
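The keyword matching described above can be sketched in a few lines. This is a minimal illustration, assuming simple case-insensitive substring matching and a first-match-wins rule; the shortened keyword lists and the `categorize` helper are hypothetical and stand in for the much larger dictionaries the application uses.

```python
# Hypothetical, heavily shortened keyword lists; the real dictionaries
# described in the article hold hundreds of stems per category.
CATEGORY_KEYWORDS = {
    "Food": ["restau", "burger", "food", "grocer", "mcdonald", "lunch"],
    "Housing": ["rent", "mortgage", "landlord"],
    "Transportation": ["uber", "taxi", "fuel", "parking"],
    "Utilities": ["electric", "water", "internet"],
}


def categorize(description, keywords=CATEGORY_KEYWORDS, default="Other"):
    """Return the first category with a keyword contained in the description."""
    text = description.lower()
    for category, stems in keywords.items():
        if any(stem in text for stem in stems):
            return category
    return default


print(categorize("MCDONALDS #1234 NEW YORK"))  # Food
print(categorize("Monthly rent payment"))      # Housing
print(categorize("ATM withdrawal"))            # Other
```

One design caveat with plain substring matching: very short stems can produce false positives (for example, `eat` also occurs in "heating"), so matching on word boundaries or ordering the categories carefully may be worth considering.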
Likewise, any credits in the bank statement were categorized in the same manner, using the following classifications:
- Salary
- Earnings
- Other
Visualizations
The credit/debit of the account is visualized as follows, where blue and red bars denote credit and debit respectively.
Subsequently, the chart is broken down into each of the aforementioned credit/debit categories, with each classification color-coded.
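For readers who want to reproduce a chart of this kind, here is a hedged sketch using matplotlib. Grouping by month, the sign convention for amounts, and both function names are assumptions; the article does not state how the bars are grouped.

```python
from collections import defaultdict


def monthly_totals(transactions):
    """Sum credits and debits per month.

    `transactions` is an iterable of (iso_date, amount) pairs where a
    positive amount is a credit and a negative amount is a debit
    (an assumed convention, not taken from the article).
    """
    totals = defaultdict(lambda: {"credit": 0.0, "debit": 0.0})
    for iso_date, amount in transactions:
        month = iso_date[:7]  # "2022-05-24" -> "2022-05"
        if amount >= 0:
            totals[month]["credit"] += amount
        else:
            totals[month]["debit"] += -amount
    return dict(sorted(totals.items()))


def plot_credit_debit(totals):
    """Draw side-by-side blue (credit) and red (debit) bars per month."""
    import matplotlib.pyplot as plt  # imported lazily; assumed installed

    months = list(totals)
    positions = list(range(len(months)))
    width = 0.4
    credits = [totals[m]["credit"] for m in months]
    debits = [totals[m]["debit"] for m in months]

    fig, ax = plt.subplots()
    ax.bar([p - width / 2 for p in positions], credits, width,
           color="blue", label="Credit")
    ax.bar([p + width / 2 for p in positions], debits, width,
           color="red", label="Debit")
    ax.set_xticks(positions)
    ax.set_xticklabels(months)
    ax.legend()
    return fig
```

A per-category breakdown could reuse the same grouping with the category as a second key, passing `bottom=` to `ax.bar` to stack one colored segment per category.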
Financial Health Score
The spending in each category is scored linearly based on how much under- or overspending is detected. Overspending is penalized while underspending is not, and the threshold for each category was loosely derived from an aggregate of financial pundits' advice on spending; for example, you should spend no more than 25-35% of your total spending on housing and no more than 10-15% on food.
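One way such a linear, overspending-only penalty might look is sketched below. The 0 to 100 scale, the use of the upper end of each quoted range as the limit, and the slope (reaching zero at double the limit) are all assumptions; the article does not give the exact function.

```python
# Upper spending limits as a share of total spending. The housing and food
# values are the upper ends of the ranges quoted above; the shape of the
# scoring function itself is an assumption.
LIMITS = {"Housing": 0.35, "Food": 0.15}


def category_score(share, limit):
    """Score one category from 0 to 100.

    Spending at or below `limit` earns 100 (underspending is not penalized).
    Above it, the score falls linearly and reaches 0 once spending is
    double the limit (assumed slope).
    """
    if share <= limit:
        return 100.0
    overshoot = (share - limit) / limit
    return max(0.0, 100.0 * (1.0 - overshoot))


print(category_score(0.30, LIMITS["Housing"]))        # 100.0
print(round(category_score(0.225, LIMITS["Food"]), 1))  # 50.0
```

Here 30% on housing is within the limit and scores full marks, while 22.5% on food is halfway between the 15% limit and double that limit, so it scores half.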
Finally, to compute the financial health score, I first calculated a "debit score" that combines your scores for all eleven spending categories and gave it a weight of 50%.
Then I calculated the "balance score", which takes into consideration whether you have a surplus or a deficit and normalizes it by your account's starting balance, as shown in the relation below. Likewise, the balance score is given a weight of 50%.
Subsequently, the "debit score" and "balance score" are combined as follows to give you an overall "financial health score".
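The exact relations are not reproduced in this text, so the following is only a sketch of how the pieces could fit together. The 50/50 weights come from the description above; the equal weighting of categories inside the debit score and the mapping of surplus or deficit onto a 0 to 100 scale are assumptions, as are all three function names.

```python
def debit_score(category_scores):
    """Average the per-category scores (equal weighting is an assumption)."""
    return sum(category_scores) / len(category_scores)


def balance_score(total_credit, total_debit, starting_balance):
    """Map surplus or deficit, relative to the starting balance, onto 0-100.

    Assumed mapping: breaking even scores 50, a surplus equal to the
    starting balance scores 100, and an equal deficit scores 0.
    """
    if starting_balance <= 0:
        raise ValueError("starting_balance must be positive")
    ratio = (total_credit - total_debit) / starting_balance
    ratio = max(-1.0, min(1.0, ratio))  # clamp to [-1, 1]
    return 50.0 * (1.0 + ratio)


def financial_health_score(debit, balance, debit_weight=0.5, balance_weight=0.5):
    """Weighted combination of the two scores, 50% each by default."""
    return debit_weight * debit + balance_weight * balance


d = debit_score([100, 80, 60])          # 80.0
b = balance_score(3000, 2500, 1000)     # surplus of 500 on 1000 -> 75.0
print(financial_health_score(d, b))     # 77.5
```

Clamping the ratio keeps a single unusually large deposit or withdrawal from pushing the balance score outside the 0 to 100 range.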
Conclusion
The beta version of this application is complete and can be downloaded at https://github.com/mkhorasani/Bank_Scan. While this is a comprehensive application that addresses the needs of underbanked clients, it is not exhaustive and remains a work in progress. By releasing it as an open-source service and providing access to the source code, I hope to engage in a collaborative and iterative approach to enhance this product and to release it as a web app in the near future.
Bank Scan: Your Personal Financial Advisor was originally published in Towards AI on Medium.
Published via Towards AI