


My Experience with the Apple Vision Pro and Future Perspectives in Computer Vision and Healthcare

Last Updated on February 5, 2024 by Editorial Team

Author(s): Alberto Paderno

Originally published on Towards AI.

Just me and the Apple Vision Pro

“It was a rainy day in Palo Alto. The line in front of the Apple Store was interminable, and only a few strong-willed people could be persistent enough to wait for hours and have a shot at getting one of the first commercially available Apple Vision Pro units. Each step brought me closer to a revolution in computing…”

Ok, let’s be real. This sounds great as an opening, but I would never wait in line for any commercial product. However, thanks to the amazing Digital Health team at the Stanford Byers Center for Biodesign, I was able to try the new Apple Vision Pro and discuss its potential in computer vision and healthcare.

Biodesign for Digital Health
biodesign.stanford.edu

Why Computer Vision and Healthcare?

I won’t talk about the manufacturing quality. Apple usually does an excellent job, and its new top-of-the-line product is no exception. There are already plenty of professional reviews covering that, and it’s not really my field. Let’s look at things from a computer vision and AI perspective: what is the potential, what should we expect, and what are the future lines of research?
And, of course, being a surgeon, it’s difficult not to think about where this type of technology could take healthcare. My head has bumped into enough screens in the operating room to realize that they might not be the best solution for our current applications. Screens are everywhere in healthcare: endoscopic, laparoscopic, exoscopic, and robotic surgery, patient monitoring (ECG, saturation, ventilation), radiology, and so on. And if you think the Apple Vision Pro is expensive, you should look at the prices of medical-grade monitors.
But screens were a necessary evil that allowed us to shift from purely optical visualization technologies (e.g., optical microscopes and loupes) to a direct digital input — the dream of every computer vision researcher. With this type of input, it’s possible to collect training data (data that we previously threw away!), analyze procedures, and develop AI applications that are valuable in clinical practice.

From Eyes to Algorithms

While my initial focus was on testing visual quality, I was struck by how intuitive the experience was, and I found myself more impressed by some less-discussed features:

– Eye-tracking

– Passthrough quality

– Hand/pinch detection and tracking

These are the elements that make the experience particularly seamless and that, together with the concept of “Spatial Computing,” will revolutionize current UI/UX design standards. But they also made me more aware of how this hardware could interact with computer vision and AI.

Let me explain.

Eye-tracking is not just the “new mouse”; it’s a data source

Attention is a central concept in AI and computer vision. The paper “Attention Is All You Need” brought it to the fore by introducing the transformer architecture in natural language processing, and “An Image Is Worth 16×16 Words: Transformers for Image Recognition at Scale” transferred the idea to the visual field. We are talking about a mathematical form of attention, but the broad meaning still holds.
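For reference, the core operation both papers rely on is scaled dot-product attention, in which queries Q, keys K, and values V determine how much each element “looks at” every other element (d_k is the key dimension):

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```

In a vision transformer, those elements are image patches, so the attention weights form an explicit map of which image regions the model focuses on, conceptually not far from a gaze heatmap.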

Eye movements are the external manifestation of human visual attention. For the first time, we (well, Apple) will have an instrument that tracks and records eye movements prospectively and over the long term. This is an exciting prospect: eye micromovements are mostly involuntary, so this data will help us understand how we perceive the visual world, with potential implications for how we structure and evaluate computer vision algorithms. On the other hand, it’s scary to consider that, by virtue of their subconscious nature, eye movements might help profile customers or track what a person is thinking: a sort of “you are what you look at” concept.
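To make this concrete, here is a minimal sketch (plain NumPy, with simulated data; visionOS does not currently expose raw gaze coordinates to third-party apps, so think research-grade eye tracker) of how recorded gaze samples could be aggregated into an attention-style heatmap:

```python
import numpy as np

def gaze_heatmap(gaze_xy, width, height, sigma=25.0):
    """Aggregate (x, y) gaze samples into a smoothed attention heatmap.

    gaze_xy: (N, 2) array of pixel coordinates. Hypothetical input: on the
    Vision Pro this data stays private, so assume a research-grade tracker.
    """
    heat = np.zeros((height, width), dtype=np.float64)
    for x, y in gaze_xy:
        xi, yi = int(round(x)), int(round(y))
        if 0 <= xi < width and 0 <= yi < height:
            heat[yi, xi] += 1.0

    # Separable Gaussian smoothing (rows, then columns), no SciPy needed.
    radius = int(3 * sigma)
    ax = np.arange(-radius, radius + 1)
    kernel = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    heat = np.apply_along_axis(np.convolve, 1, heat, kernel, mode="same")
    heat = np.apply_along_axis(np.convolve, 0, heat, kernel, mode="same")
    return heat / heat.max() if heat.max() > 0 else heat

# Example: 500 simulated gaze samples clustered near the center of a 640x480 view.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[320, 240], scale=40, size=(500, 2))
heatmap = gaze_heatmap(samples, width=640, height=480)
```

Comparing maps like this against the attention maps of a vision transformer is exactly the kind of analysis that a long-term gaze data stream would enable.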

Finally, medical conditions (especially neurologic and balance disorders) can influence eye micromovements and eye-hand coordination. Constant tracking may be a beneficial “opportunistic screening” tool to diagnose these conditions early.
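If such data ever becomes available for research, the metrics could start simple. Here is a minimal sketch of the classic velocity-threshold identification (I-VT) algorithm for separating fixations from saccades; the input and the 30 deg/s cutoff are illustrative, and clinical use would require validated hardware and thresholds:

```python
import numpy as np

def ivt_classify(t, x, y, velocity_threshold=30.0):
    """Velocity-threshold identification (I-VT): label each gaze sample
    as part of a saccade (fast movement) or a fixation (slow movement).

    t: timestamps in seconds; x, y: gaze angles in degrees (hypothetical,
    pre-calibrated input).
    velocity_threshold: deg/s; ~30 deg/s is a common textbook default.
    """
    vx = np.gradient(x, t)                # horizontal angular velocity
    vy = np.gradient(y, t)                # vertical angular velocity
    speed = np.hypot(vx, vy)              # total angular speed, deg/s
    return np.where(speed > velocity_threshold, "saccade", "fixation")

# Example: a steady fixation with micro-jitter, then a rapid 10-degree saccade.
t = np.linspace(0.0, 1.0, 1000)
x = np.where(t < 0.5, 0.1 * np.sin(50.0 * t), 10.0)
y = np.zeros_like(t)
labels = ivt_classify(t, x, y)
print(f"saccade samples: {(labels == 'saccade').sum()}")
```

Statistics built on top of segmentations like this (fixation stability, saccade velocity and latency) are the kind of signal an opportunistic screening tool would track over time.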

High passthrough quality brings computer vision to vision

One of the most striking elements is the impact of high passthrough quality on the overall experience. The objective is to eliminate the perception of looking at a screen and to integrate digital elements into the real world. And the Apple Vision Pro gets really close to that.

This has been achieved thanks to the concomitant increase in the resolution and quality of camera sensors and micro-OLED displays, and we are approaching the point where it will be impossible to tell whether we are looking through a digital camera plus a screen or through plain glass.

As a consequence, it will be possible to apply computer vision to virtually every setting in everyday life, not just autonomous driving and other specialized applications. Computer vision won’t need a separate device (smartphone, tablet, computer, endoscope); the interaction will be direct.

Spatial computing is the perfect platform for computer vision.

Interfaces based on hand-eye input will change UI/UX design principles

Conventional interfaces are built around well-defined input devices (e.g., mouse, keyboard, trackpad). Here, everything in the visual field can potentially become an input source, starting from the hands and extending to the entire available space. Again, this rests on computer vision (above all, eye tracking and hand/gesture tracking): the entire video feed from the numerous cameras must be processed as an “input,” reprocessed, and integrated with digital components (creating the “output”), shattering the conventional separation between input and output. This will significantly increase the interactions between applications, the user, and the environment, ultimately requiring new UI/UX design paradigms.
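To give a feel for the kind of computation involved, here is a minimal sketch of pinch detection from two tracked 3D hand landmarks. This is generic illustrative Python, not Apple’s implementation; the thresholds are assumed values, and on visionOS the joint positions would come from the system’s hand-tracking data:

```python
import numpy as np

PINCH_ON = 0.015    # meters; engage threshold (assumed value)
PINCH_OFF = 0.025   # meters; release threshold, wider to add hysteresis

class PinchDetector:
    """Detect a pinch gesture from thumb-tip and index-tip positions.

    Two thresholds (hysteresis) prevent the state from flickering when
    the fingertip distance hovers near a single cutoff.
    """
    def __init__(self):
        self.pinching = False

    def update(self, thumb_tip, index_tip):
        """thumb_tip, index_tip: 3D points in meters (e.g., from a tracked
        hand skeleton). Returns the current pinch state."""
        distance = np.linalg.norm(np.asarray(thumb_tip) - np.asarray(index_tip))
        if self.pinching and distance > PINCH_OFF:
            self.pinching = False
        elif not self.pinching and distance < PINCH_ON:
            self.pinching = True
        return self.pinching

# Example: fingers closing from 4 cm apart to 1 cm, then reopening.
detector = PinchDetector()
for d in [0.04, 0.03, 0.02, 0.01, 0.02, 0.03]:
    state = detector.update([0, 0, 0], [d, 0, 0])
    print(f"distance={d:.2f} m -> pinching={state}")
```

The hysteresis is the interesting design choice: with a single cutoff, sensor noise around the threshold would make the gesture register and release repeatedly within a few frames.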

Yes, And What About Healthcare?

As a physician and surgeon, it’s difficult not to think about the potential revolution this technology could bring to healthcare, apart from the previously mentioned struggle between my head and floating monitors. The surgeon could freely position and look at 2D or 3D screens during endoscopic or exoscopic surgery, integrating the view with patient information from vital-signs tracking, radiologic imaging, and image-enhancement techniques. The view could be further extended with dedicated computer vision algorithms that detect instruments, tissues, and anatomical structures.
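To sketch that last step in code: given detections from an instrument-recognition model (entirely hypothetical here), overlaying them on the video feed is the easy part; the model and its clinical validation are the hard part. A minimal example using OpenCV drawing primitives:

```python
import cv2
import numpy as np

def overlay_detections(frame, detections):
    """Draw labeled boxes onto an endoscopic video frame.

    detections: list of (label, confidence, (x1, y1, x2, y2)) tuples,
    assumed to come from a hypothetical surgical-instrument detector;
    the model itself is out of scope for this sketch.
    """
    for label, conf, (x1, y1, x2, y2) in detections:
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, f"{label} {conf:.2f}", (x1, max(y1 - 5, 0)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return frame

# Example with a dummy detection on a blank frame.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
annotated = overlay_detections(frame, [("grasper", 0.91, (100, 150, 220, 300))])
```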

Finally, UI/UX design in healthcare is far from ideal. Ease of use and functional layouts are often low priorities when dealing with complex medical data. However, the advent of spatial computing offers a blank slate to build on, maybe following better concepts of design and usability.

The Spezi framework from Stanford caters to these needs thanks to its modular structure. Specifically, Spezi is an open-source framework for the rapid development of modern, interoperable digital health applications, and the team is already working on integrating applications on visionOS.

Spezi
spezi.sites.stanford.edu

In wrapping up my dive into the Apple Vision Pro and its intersection with computer vision and healthcare, it’s clear we’re on the cusp of a transformative period. This device isn’t just about sharper images or smoother interfaces; it’s about redefining our interaction with technology and its application in medicine. The Vision Pro exemplifies how technology can seamlessly integrate into our lives, offering insights that extend far beyond the screen.

