How Can Hardcoded Rules Overperform ML?
Last Updated on March 21, 2023 by Editorial Team

Author(s): Ivan Reznikov

Originally published on Towards AI.

I have a confession to make.

When I was younger, I was sure that ML could, if not outperform, at least match the pre-ML-era solutions almost everywhere. I looked at rule constraints in deployment and wondered why they weren’t replaced with tree-based ML models.

However, as I gained more industry experience, I realized that the world is not always black and white. While machine learning certainly has its place in problem-solving, it’s not always the best solution. Rule-based systems can even outperform machine learning, especially in areas where interpretability, robustness, and transparency are critical.

In this article, I’ll share what I’ve learned, where hybrid systems can be used, and what benefits you get by introducing them to an ML pipeline. We’ll look into practical examples in industries such as healthcare, finance, and supply chain management.

Content:

  1. Rule-Based Systems
  2. ML-Based Systems
  3. Hybrid Systems
  4. Practical Cases
  5. Personal Experience
  6. Conclusions

Rule-based systems

A rule-based system uses a set of predefined rules to make decisions or provide recommendations. The system evaluates the data against the stored rules and performs the action mapped to the rules that match.

Below are a few examples:

Fraud Detection: Rule-based systems can quickly flag suspicious transactions for investigation based on predefined rules.

Around ten years ago, I created an algorithm to catch chess cheaters. Cheaters typically kept a computer chess app open in another window that suggested the best moves. As a result, no matter how complex the position was, each move took 4–5 seconds to make. Add an “accuracy threshold” (how close the player’s moves were to the “best” computer line), and you have quite a robust system.
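The rule described above can be sketched in a few lines. This is a minimal illustration, not the original algorithm: the function name, the minimum number of moves, and both thresholds are assumptions chosen for the example.

```python
from statistics import mean, pstdev


def looks_like_engine_use(move_times, engine_match_flags,
                          max_time_spread=1.0, accuracy_threshold=0.9):
    """Flag a game when move times are near-constant AND most moves match the engine.

    move_times: seconds spent on each move.
    engine_match_flags: True where the move equals the engine's top choice.
    Both thresholds are hypothetical values for illustration.
    """
    if len(move_times) < 10:  # too few moves to judge
        return False
    time_spread = pstdev(move_times)  # low spread = suspiciously steady tempo
    accuracy = mean(1.0 if matched else 0.0 for matched in engine_match_flags)
    return time_spread <= max_time_spread and accuracy >= accuracy_threshold


# Steady 4-5 second moves that all match the engine line
suspicious = looks_like_engine_use(
    move_times=[4.2, 4.8, 4.5, 4.1, 4.9, 4.4, 4.6, 4.3, 4.7, 4.5],
    engine_match_flags=[True] * 10,
)

# Irregular tempo and imperfect play, typical for a human
human = looks_like_engine_use(
    move_times=[1.0, 35.0, 3.0, 80.0, 2.0, 12.0, 6.0, 45.0, 1.5, 20.0],
    engine_match_flags=[True, False, True, False, True,
                        True, False, True, False, True],
)
```

Requiring both conditions keeps the rule conservative: a strong player who simply moves at a steady pace, or a fast player who happens to find good moves, is not flagged by one signal alone.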

A chess game vs. a cheater. Image generated by the author.

Healthcare: Rule-based systems can be used to manage prescriptions and prevent medication errors. They can also help doctors decide which additional tests to prescribe based on the results of previous ones.

Supply Chain Management: Rule-based systems can be used to generate low-inventory alerts, help manage expiration dates, or support new product introductions.
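As a rough sketch of what such supply chain rules might look like, the snippet below raises alerts for low stock and approaching expiration dates. The field names, SKUs, and thresholds are hypothetical and would come from the business in practice.

```python
from datetime import date, timedelta


def inventory_alerts(item, today, reorder_point=20, expiry_warning_days=7):
    """Return the list of alerts triggered by simple, predefined rules."""
    alerts = []
    # Rule 1: stock at or below the reorder point
    if item["stock"] <= reorder_point:
        alerts.append("LOW_STOCK")
    # Rule 2: expiration date within the warning window
    if item["expires_on"] - today <= timedelta(days=expiry_warning_days):
        alerts.append("EXPIRING_SOON")
    return alerts


today = date(2023, 3, 21)
milk = {"sku": "MILK-1L", "stock": 12, "expires_on": date(2023, 3, 25)}
rice = {"sku": "RICE-5KG", "stock": 140, "expires_on": date(2024, 1, 1)}

milk_alerts = inventory_alerts(milk, today)  # both rules fire
rice_alerts = inventory_alerts(rice, today)  # no rule fires
```

Each rule is a single readable condition, which is what makes this kind of system easy to audit and modify.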

ML-based systems

Machine Learning (ML) systems use algorithms to learn from data and make predictions or take actions without being explicitly programmed to do so. They apply the knowledge gained from training on large amounts of data to make predictions and decisions for new data, and their performance can improve as more training data becomes available. Applications include natural language processing, image and speech recognition, predictive analytics, etc.

Fraud detection: A bank might use an ML system to learn from past fraudulent transactions and identify potentially fraudulent activity in real time. Alternatively, it might approach the problem from the other side and look for transactions that look very “outlierish”.

Healthcare: a hospital might use an ML system to analyze patient data and predict the likelihood of a patient developing a certain disease based on some X-rays.

Computer Vision for X-ray Shots. Github

Supply Chain: A retailer might forecast demand based on historical sales for different users, locations, items, or SKUs.

Pros and Cons

Both rule-based and ML systems have their advantages and disadvantages. Let’s go through them.

Rule-based systems: Advantages

  • Simple to comprehend and interpret
  • Fast to implement
  • Easy to modify
  • Robust

Rule-based systems: Disadvantages

  • Struggle with problems involving a vast number of variables
  • Struggle with problems that have numerous constraints
  • Limited to existing rules

ML-based systems: Advantages

  • Autonomous learning systems
  • Ability to tackle more intricate problems
  • Higher efficiency with less human intervention than rule-based systems
  • Flexibility to adapt to changes in data and environment over time through continuous learning

ML-based systems: Disadvantages

  • Require data, sometimes a lot
  • Limited to the data the model has seen before
  • Limited cognitive ability

There is a way to combine the strengths of both methods: hybrid systems.

Hybrid systems

Rule, ML, and Hybrid Systems. Image generated by the author.

Hybrid systems, combining rule-based systems and machine learning algorithms, have become increasingly popular recently. They can provide more robust, accurate, and efficient results, particularly when dealing with complex problems.

Let’s look at the types of hybrid systems that can be implemented using a rental dataset:

Dataset example. Image generated by the author.

1. Rule(Xi) → ML(Xi): Feature engineering. For example, converting a floor to one of three categories: high, medium, or low, depending on the number of floors in the building.

2. ML(Yi) → Rule(Yi): Post-processing. Rounding or normalizing final results.

3. ML(Yi) → Rule(Zi): Using ML predictions for other decisions. For example, should we place a bid for a flat based on the probability of the flat being sold this week (sold_7) and the probability that the price will be dropped (price_drop)?

4. ML(Yi) + Rule(Yi) → aggregated forecast. Blending ML results with rule-based, domain-based, or table-based estimates.
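The four patterns above can be sketched as small standalone functions. All thresholds, the rounding step, and the blending weight are assumptions made for illustration; the ML outputs are passed in as plain numbers so the example does not depend on any particular model.

```python
def floor_category(floor, floors_in_building):
    """Pattern 1, Rule(X) -> ML(X): turn a raw floor into a category feature."""
    ratio = floor / floors_in_building
    if ratio <= 1 / 3:
        return "low"
    if ratio <= 2 / 3:
        return "medium"
    return "high"


def round_price(predicted_price, step=50):
    """Pattern 2, ML(Y) -> Rule(Y): snap a predicted rent to the nearest step."""
    return int(round(predicted_price / step) * step)


def should_bid(p_sold_7, p_price_drop, sold_threshold=0.7, drop_threshold=0.3):
    """Pattern 3, ML(Y) -> Rule(Z): bid when the flat will likely sell this week
    and a price drop is unlikely."""
    return p_sold_7 >= sold_threshold and p_price_drop <= drop_threshold


def blend(ml_forecast, rule_forecast, ml_weight=0.75):
    """Pattern 4, ML(Y) + Rule(Y): weighted average of the two estimates."""
    return ml_weight * ml_forecast + (1 - ml_weight) * rule_forecast


categories = [floor_category(2, 9), floor_category(5, 9), floor_category(9, 9)]
rounded = round_price(1234.0)
bid_now = should_bid(p_sold_7=0.85, p_price_drop=0.10)
wait = should_bid(p_sold_7=0.85, p_price_drop=0.60)
blended = blend(ml_forecast=1000.0, rule_forecast=1200.0)
```

In patterns 1 and 2 the rule sits before or after the model on the same variable, while in patterns 3 and 4 the rule consumes the model output to produce a different decision or a combined estimate.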

Before jumping into code, let’s think of some known companies or tools that use rule-ML systems under the hood.

I would expect companies like Grammarly or QuillBot to use hybrid NLP systems for spell-checking and rephrasing. Another case is search and recommendation systems that have configurable parameters to boost particular results higher.

OK, let’s take a look at a couple of practical cases.

Practical Cases

You can find the notebook with the examples below, and more, on GitHub.

Heart disease

Let’s take a look at a heart disease dataframe:

Heart Disease DataFrame. Image created by the author

Let’s implement a Random Forest to predict the target class:

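from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

clf = RandomForestClassifier(n_estimators=100, random_state=random_seed)
# Features are all columns except the last one; the last column is the target
X_train, X_test, y_train, y_test = train_test_split(
    df.iloc[:, :-1], df.iloc[:, -1], test_size=0.30, random_state=random_seed
)
clf.fit(X_train, y_train)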

One of the reasons we chose Random Forest is its built-in feature importance. Below you can take a look at the importance of the features used for training:

Feature Importance for Selected Dataset. Generated by author.

Let’s look at our results:

import pandas as pd
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Keep the test index so that rules can later be applied by row label
y_pred = pd.Series(clf.predict(X_test), index=y_test.index)
cm = confusion_matrix(y_test, y_pred, labels=clf.classes_)
conf_matrix = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=clf.classes_)
conf_matrix.plot()

Confusion Matrix for Pure Random Forest Model

f1_score(y_test, y_pred): 0.74
recall_score(y_test, y_pred): 0.747

Now, let’s imagine that a cardiologist sees your model. Based on his experience and domain knowledge, he suggests that the Thalassemia feature (thal) is much more important than shown above. You decide to build a histogram plot and look at the results.

Histogram Plot for Features and Target

Let’s add a mandatory rule that overrides the model’s prediction whenever thal equals 2:

y_pred[X_test[X_test["thal"] == 2].index] = 1

The confusion matrix, in this case, will change:

Confusion Matrix for Hybrid Random Forest Model
f1_score(y_test, y_pred): 0.818
recall_score(y_test, y_pred): 0.9

As you can see, both metrics improved: F1 rose from 0.74 to 0.818 and recall from 0.747 to 0.9. Domain knowledge played an important role in improving the predictions for our patients.
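One way to package such an override so that it travels with the model is a thin wrapper class. The sketch below is an assumption about how this could be organized, not the code from the notebook; `RuleOverrideClassifier` and the `AlwaysHealthy` stand-in model are hypothetical names, and rows are plain dictionaries to keep the example dependency-free.

```python
class RuleOverrideClassifier:
    """Wrap any object with predict(rows) and force a label when a rule fires."""

    def __init__(self, model, rule, forced_label):
        self.model = model
        self.rule = rule  # callable: row -> bool
        self.forced_label = forced_label

    def predict(self, rows):
        base_predictions = self.model.predict(rows)
        # The rule wins over the model wherever it applies
        return [
            self.forced_label if self.rule(row) else label
            for row, label in zip(rows, base_predictions)
        ]


class AlwaysHealthy:
    """Stand-in for a trained model: predicts class 0 for every row."""

    def predict(self, rows):
        return [0 for _ in rows]


patients = [
    {"thal": 2, "age": 54},
    {"thal": 3, "age": 61},
    {"thal": 2, "age": 47},
]

hybrid = RuleOverrideClassifier(
    AlwaysHealthy(), rule=lambda row: row["thal"] == 2, forced_label=1
)
predictions = hybrid.predict(patients)
```

Keeping the rule inside the estimator means the same logic is applied at training-time evaluation and in production, instead of being patched onto predictions by hand.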

Fraud transactions

Another dataset we’ll look at contains bank fraud transactions.

Fraud DataFrame. Image created by the author

The dataset is highly imbalanced:

df["Class"].value_counts()
0 28431
1 4925

In order to create rules, let’s view the distribution boxplots of features:

BoxPlot for Features and Target. Image created by the author

One possible solution, besides writing our own HybridEstimator class, is to use the human-learn package:

import numpy as np
import pandas as pd
from hulearn.classification import FunctionClassifier

# Each rule is (comparison operator, threshold) for one feature
rules = {
    "V3": ("<=", -2),
    "V12": ("<=", -3),
    "V17": ("<=", -2),
}

def create_rules(data: pd.DataFrame, rules):
    filtered_data = data.copy()
    for col in rules:
        # Replace the column with a boolean: does the row satisfy this rule?
        filtered_data[col] = eval(f"filtered_data[col] {rules[col][0]} {rules[col][1]}")
    # The row-wise min is 1 only when every rule holds, so fraud is flagged only then
    result = np.array(filtered_data[list(rules.keys())].min(axis=1)).astype(int)
    return result

hybrid_classifier = FunctionClassifier(create_rules, rules=rules)

We can compare the results of our pure three-rule system with those of a kNN model, which handles the imbalance quite well:

Confusion Matrix for 3-Rules Model and kNN. Image created by the author

You can find more examples of hybrid systems and their usage, including safety switches and dealing with bad predictions, in the notebook on GitHub.

Personal Experience

The above examples are great, but what about ML systems in production?

I recently ordered an item for delivery and found an interesting field to track: the estimated delivery time. I understand showing it when the estimate is three days away, but isn’t displaying seconds a bit too much?

Image created by the author

In one of my recent projects, we were doing demand forecasting. Here are three ways hybrid systems were implemented:

1. Format Outputs (without Data Filtering)

Converting ML forecast into the business quantity to replenish:

Converting Forecast to Actual Quantity. Image created by the author
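The exact conversion used in the project is only shown in the image, so the following is a sketch under assumed business constraints: orders are placed in whole packs and must meet a minimum order quantity. The function name and both parameters are hypothetical.

```python
import math


def replenishment_quantity(forecast, on_hand, pack_size=6, min_order=12):
    """Convert a fractional ML forecast into an orderable quantity."""
    shortfall = forecast - on_hand
    if shortfall <= 0:
        return 0  # enough stock, nothing to order
    packs = math.ceil(shortfall / pack_size)  # order whole packs only
    return max(packs * pack_size, min_order)  # respect the minimum order


large_gap = replenishment_quantity(forecast=37.4, on_hand=10)   # 5 packs of 6
small_gap = replenishment_quantity(forecast=13.2, on_hand=10)   # bumped to minimum
no_gap = replenishment_quantity(forecast=8.0, on_hand=10)       # no order needed
```

The model keeps predicting a continuous demand value, and the rule layer turns it into a number that the ordering system can actually accept.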

2. Format NPI Outputs (without Data Filtering)

Whenever a new product is introduced to the market, there is no sales history for the model to learn from, so for the first few days we may rely on sales domain knowledge instead:

Using NPI Domain Knowledge for Demand Forecasting
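A simple way to express this, assuming the sales team provides a launch plan and the warm-up period is a fixed number of days, is a switch between the two sources. The function name, the seven-day window, and the numbers are hypothetical.

```python
def npi_forecast(days_on_market, ml_forecast, launch_plan, warmup_days=7):
    """Use the sales team's launch plan during warm-up, then the ML forecast."""
    if days_on_market < warmup_days:
        return launch_plan  # not enough history for the model yet
    return ml_forecast


day_2 = npi_forecast(days_on_market=2, ml_forecast=3.0, launch_plan=40.0)
day_10 = npi_forecast(days_on_market=10, ml_forecast=35.0, launch_plan=40.0)
```

A hard switch is the simplest option; a gradual handover that blends the two values over the warm-up period would follow the same idea.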

3. Select Threads (with Data Filtering) for Time Series

Time series models may take a long time to run, especially if they are designed to process a single item/SKU at a time. One way to speed up the forecast is to select and sort the most valuable threads so that they are calculated first:

Possible Distribution of Sales Forecast for Items
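As an illustration, threads can be ranked by a simple value measure and only the top ones sent to the expensive model within the available compute budget. Using recent sales as the value measure, as well as the SKU names and figures below, is an assumption for the example.

```python
def select_threads(recent_sales, budget):
    """Return the SKUs to forecast first, highest recent sales first.

    recent_sales: mapping of SKU -> recent sales volume (hypothetical metric).
    budget: how many threads can be processed in the available time.
    """
    ranked = sorted(recent_sales.items(), key=lambda pair: pair[1], reverse=True)
    return [sku for sku, _ in ranked[:budget]]


recent_sales = {"SKU-A": 120, "SKU-B": 15, "SKU-C": 560, "SKU-D": 3, "SKU-E": 75}
priority = select_threads(recent_sales, budget=3)
```

The remaining low-volume threads can be forecast later or handled by a cheaper fallback rule.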

Conclusions

Hybrid rule-ML systems offer practical benefits such as fast implementation, robustness to outliers, and increased transparency. They are beneficial when combining business logic with machine learning. For instance, hybrid rule-ML systems in healthcare can diagnose diseases by combining clinical rules and machine-learning algorithms that analyze patient data.

As business and data science continue to integrate, hybrid systems can be crucial in making that integration smoother. These systems offer a valuable combination of rule-based knowledge and machine learning capabilities, providing a more comprehensive and accurate solution to complex problems. The flexibility and adaptability of hybrid rule-ML systems make them effective across many industries.

Have you ever faced a case where rule-based or hybrid models outperformed pure ML? Let me know in the comments below.

Originally published on LinkedIn

Published via Towards AI
