
The Danger of High (or Small) Numbers In Your Computer And ML Models

Last Updated on October 7, 2025 by Editorial Team

Author(s): Nelson Cruz

Originally published on Towards AI.

During day-to-day programming or general computer use, it’s easy to overlook how the computer actually represents numbers internally. But this can quickly become a problem when we try to optimize a solution, and sometimes the situation is simply unavoidable.

What the danger really is

Computers represent numbers using bits, their most basic binary unit. Each storage location holds a fixed number of bits, determined by the hardware, so if we imagine a computer that operates with 3 bits, it can represent only 2³ = 8 distinct values, and we have the following situation:

[Figure: the values representable with 3 bits. Source: Image by the author.]

When we add 1 while already at the maximum representable value, we run into a problem: we would need one more bit to represent the result. In computing, this is called an integer overflow. When a sum produces an integer too large for the chosen type, the value “wraps around.”

For example, in Python:

import numpy as np

# Define the largest 32-bit signed integer
x = np.int32(2147483647)
print("Before overflow:", x)

# Add 1 -> causes overflow (NumPy may also emit a RuntimeWarning here)
x = x + np.int32(1)
print("After overflow:", x)

Output:

Before overflow: 2147483647
After overflow: -2147483648

This behavior isn’t a bug, but rather a consequence of the limits of binary representation. It has shown up in several famous real-world incidents.

Unexpected famous cases

The Boeing 787 Case (2015)

In 2015, Boeing discovered that the Boeing 787 Dreamliner’s generators could shut down mid-flight if they were left on for 248 consecutive days without being restarted.

The reason? An internal timer, based on 32-bit integers, would overflow after this period, leading to a failure in the aircraft’s power management.

The fix was simple: periodically restart the system so the counter resets to zero. But the potential impact was enormous.
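A quick back-of-the-envelope check shows where the 248 days come from, assuming (as widely reported) that the internal counter ticked in hundredths of a second:

max_ticks = 2**31 - 1              # largest signed 32-bit value
seconds = max_ticks / 100          # counter assumed to tick in centiseconds
days = seconds / (60 * 60 * 24)
print(f"{days:.1f} days")          # ~248.6 days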

The Level 256 Bug in Pac-Man

Those who played Pac-Man in the arcades may be familiar with the “Kill Screen.” After level 255, the level counter (stored in 8 bits) overflows upon reaching 256. This creates a glitched screen, with half the maze unreadable, making the game impossible to complete.

The developers didn’t expect anyone to play 256 levels of Pac-Man, so they didn’t handle this exception!

[Image: the Pac-Man level 256 kill screen. Source: techspot]
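The same wrap-around is easy to reproduce with an 8-bit counter in NumPy (a sketch of the mechanism, not the actual Pac-Man code):

import numpy as np

level = np.uint8(255)            # 8-bit level counter at its maximum
level = level + np.uint8(1)      # NumPy may emit an overflow RuntimeWarning here
print(level)                     # 0 -- the counter wraps back around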

The bug of 2038

In the past, just before the year 2000, the millennium bug (Y2K) was a very popular topic: many computers were expected to malfunction when the date rolled over from 31/12/99 to 01/01/00 at midnight. Thankfully, everything turned out fine, but another potentially catastrophic event looms, just like a new Maya prophecy.

Many Unix and C systems use a signed 32-bit integer to count seconds since January 1, 1970 (the famous Unix timestamp). This counter will reach its limit of 2,147,483,647 seconds on January 19, 2038, and overflow. If left unfixed, any software that relies on time could exhibit unpredictable behavior.
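We can check the exact moment the counter runs out with a few lines of Python:

from datetime import datetime, timezone

max_seconds = 2**31 - 1  # 2,147,483,647: largest signed 32-bit value
print(datetime.fromtimestamp(max_seconds, tz=timezone.utc))
# 2038-01-19 03:14:07+00:00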

And these situations don’t just happen with integers. With floating-point numbers the situation is even more delicate, especially when it comes to numerical precision in areas such as Machine Learning.

How Float Variables Work

Floats (floating-point numbers) are used to represent real numbers in computers, but unlike integers, they cannot represent every value exactly. Instead, they store numbers approximately using a sign, exponent, and mantissa (according to the IEEE 754 standard).

And just like the integers in the previous examples, the mantissa and exponent are represented by a finite number of bits. Their size depends on the width of the type, such as 16, 32 or 64 bits, chosen when the variable is declared.
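A classic illustration of this approximation in plain Python (which uses 64-bit floats under the hood):

x = 0.1 + 0.2
print(x)          # 0.30000000000000004 -- not exactly 0.3
print(x == 0.3)   # False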

Float16 (16 bits):

  • Can represent values roughly from 6.1 × 10⁻⁵ to 6.5 × 10⁴
  • Precision of about 3–4 decimal digits
  • Uses 2 bytes (16 bits) of memory

Float32 (32 bits):

  • Can represent values roughly from 1.4 × 10⁻⁴⁵ to 3.4 × 10³⁸
  • Precision of about 7 decimal digits
  • Uses 4 bytes of memory

Float64 (64 bits):

  • Can represent values roughly from 5 × 10⁻³²⁴ to 1.8 × 10³⁰⁸
  • Precision of about 16 decimal digits
  • Uses 8 bytes of memory
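These limits can be inspected directly with NumPy's finfo:

import numpy as np

for dtype in (np.float16, np.float32, np.float64):
    info = np.finfo(dtype)
    print(f"{dtype.__name__}: max={info.max}, ~{info.precision} decimal digits, eps={info.eps}")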

The trade-offs applied in Machine Learning

The higher precision of float64 uses twice the memory of float32 and, partly because of that, can be slower than float32 and float16. But is float64 really necessary?

Deep learning models can have hundreds of millions of parameters. Using float64 would double the memory consumption. For many ML models, including neural networks, float32 is sufficient and allows faster computation with lower memory usage. Some are even studying the application of float16.

In theory, always using the highest-precision type seems safe, but in practice modern GPUs (the RTX line, for example) perform poorly on float64, while they are optimized for float32 and, in some cases, float16. For example, float64 can be 10–30x slower on GPUs optimized for float32.

A simple benchmark can be run by multiplying matrices:

import numpy as np
import time

# Matrix size
N = 500

# Matrix with different float bit sizes
A32 = np.random.rand(N, N).astype(np.float32)
B32 = np.random.rand(N, N).astype(np.float32)

A64 = A32.astype(np.float64)
B64 = B32.astype(np.float64)

A16 = A32.astype(np.float16)
B16 = B32.astype(np.float16)

def benchmark(A, B, dtype_name):
    # Time a single matrix multiplication for the given dtype
    start = time.time()
    C = A @ B  # matrix multiplication
    end = time.time()
    print(f"{dtype_name}: {end - start:.5f} seconds")

benchmark(A16, B16, "float16")
benchmark(A32, B32, "float32")
benchmark(A64, B64, "float64")

Example of output (it will depend on your computational resources):

float16: 0.01 seconds
float32: 0.02 seconds
float64: 0.15 seconds

That said, an important point is that common problems in Machine Learning models, such as vanishing gradients, are not solved simply by increasing numerical precision, but rather by making good architectural choices.

Some good practices to address it

In deep networks, gradients can become very small after passing through several layers. In float32, values smaller than roughly 1e-45 underflow and literally become zero.
This means the weights are no longer updated: the infamous vanishing gradient problem.
But the solution isn’t to migrate to float64. Instead, we have smarter options.

ReLU: Unlike sigmoid and tanh, which squash values and make the gradient vanish, ReLU keeps the derivative equal to 1 for x > 0.
This prevents the gradient from shrinking to zero too quickly.

[Figure: the ReLU function]
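A toy sketch of the effect (not a full backpropagation implementation): the gradient reaching the early layers is roughly a product of per-layer derivatives, and sigmoid's derivative never exceeds 0.25, while ReLU's is exactly 1 for positive inputs.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = 1.0
sigmoid_grad = sigmoid(x) * (1.0 - sigmoid(x))  # ~0.20 at x = 1
relu_grad = 1.0                                 # derivative of ReLU for x > 0

layers = 20
print("sigmoid:", np.float32(sigmoid_grad) ** layers)  # ~7e-15, nearly vanished
print("relu   :", np.float32(relu_grad) ** layers)     # 1.0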

Batch Normalization: Normalizes the activations in each batch to keep their mean close to 0 and variance close to 1. This way, the values remain within the safe range of float32 representation.
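A minimal sketch of the normalization step (without the learnable scale and shift parameters of a full BatchNorm layer):

import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch to zero mean and unit variance
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

activations = 1000.0 * np.random.rand(32, 128).astype(np.float32)  # large raw values
normalized = batch_norm(activations)
print(normalized.mean(), normalized.std())  # close to 0 and 1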

Residual Connections (ResNet): They create “shortcuts” that add a layer’s input directly to its output (y = F(x) + x), so the gradient can cross multiple layers without disappearing. They allow networks with 100+ layers to work well in float32.
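A minimal sketch of the idea, with a hypothetical layer_fn standing in for a block of layers:

import numpy as np

def residual_block(x, layer_fn):
    # The shortcut adds the input back to the block's output: y = F(x) + x.
    # During backpropagation the gradient flows through the "+ x" path
    # even when the gradient through layer_fn becomes very small.
    return layer_fn(x) + x

x = np.random.rand(4, 16).astype(np.float32)
y = residual_block(x, lambda v: 0.0 * v)  # even if the block outputs zeros...
print(np.allclose(y, x))                  # ...the input still passes through: True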


Published via Towards AI

