Last Updated on January 6, 2023 by Editorial Team

Author(s): Andrew D #datascience

Originally published on Towards AI.

Pull and Push — How Machines Deliver Text Data To Humans

Learn how pull and push strategies define how relevant information is delivered to the end user

Information Retrieval (IR) is the process of obtaining relevant information from a source of data in the environment. This environment can be explored in several ways, depending on its state and the state of the user.

The main goal of IR is to minimize the delivery of noise and maximize the delivery of signal.
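One common way to put numbers on "signal" and "noise" is precision and recall. The article does not name these metrics, so treat the snippet below as an illustrative sketch: `precision_recall`, the document ids, and the relevance labels are all made up for the example.

```python
def precision_recall(delivered, relevant):
    """Return (precision, recall) for a set of delivered documents.

    precision: share of delivered documents that are relevant (low noise)
    recall:    share of relevant documents that were delivered (high signal)
    """
    delivered, relevant = set(delivered), set(relevant)
    hits = delivered & relevant
    precision = len(hits) / len(delivered) if delivered else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: what the system delivered vs. what the user needed.
delivered = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc1", "doc3", "doc5"]

precision, recall = precision_recall(delivered, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four delivered documents are relevant, and two of the three relevant documents were delivered, so the system still leaks noise and misses some signal.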

Think about Google: it is safe to say that it is the main source of information retrieval in the world. Now think about Amazon: we all know how powerful its recommendation systems are. But how do Google and Amazon work, and what strategies do they use to deliver relevant content to the end user? We won’t look at how search engines and recommender systems work internally, but we’ll go through the strategies they employ to deliver information.

Strategies to deliver text data

We have mentioned Google because it is a search engine and Amazon because of its recommender systems. They are the de facto standard of the industry, and how widely their products are used suggests that users are satisfied with them.

But they are radically different in terms of how they deliver information to the user in some of their specific functionalities. While they both use search and recommendation systems (for instance, Google suggests related keywords, which can be interpreted as a recommendation, while Amazon delivers product info through the search bar), we can dissect how Google delivers information through search and how Amazon delivers information through recommendation.

The user queries the system: the Pull strategy

It literally means that the user pulls the information from the system. An example is when users query a database or a search engine. In this context, the user takes the initiative and searches the environment for information.

In tangible terms, whenever we search Google for something, we are pulling information from its database to do something with that information.

Pulling involves two aspects:

Querying
We query the search engine through a keyword, and the engine returns relevant documents. The ability of the engine to deliver relevant content dictates whether the search engine is doing a good job or not. Querying works very well when users know what they are looking for.

Browsing
The user navigates the structure of the documents to find the information they are looking for. This strategy works well when the user doesn’t know what to look for or can’t conveniently query the system.
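The two aspects of pulling can be sketched in a few lines of Python. This is a toy illustration, not how any real search engine works: the `DOCUMENTS` collection, its category paths, and the term-overlap scoring are assumptions made for the example.

```python
# Hypothetical mini-collection; ids, categories and texts are invented.
DOCUMENTS = {
    "d1": {"category": "shoes/sneakers", "text": "where to buy sneakers in Milan"},
    "d2": {"category": "shoes/boots", "text": "best hiking boots for winter"},
    "d3": {"category": "travel/italy", "text": "things to do in Milan Italy"},
}


def query(keywords):
    """Querying: rank documents by how many query terms they contain."""
    terms = set(keywords.lower().split())
    scored = []
    for doc_id, doc in DOCUMENTS.items():
        overlap = len(terms & set(doc["text"].lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


def browse(category_prefix):
    """Browsing: follow the category structure instead of typing a query."""
    return sorted(
        doc_id
        for doc_id, doc in DOCUMENTS.items()
        if doc["category"].startswith(category_prefix)
    )


print(query("sneakers Milan"))
print(browse("shoes"))
```

In both cases the user takes the initiative. With `query` they must already know the words to type, while `browse` lets them explore a structure when they do not.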

How Do Google and Amazon Use Pull Strategies?

In search and in visualizing their results.
Whenever Google returns a SERP (Search Engine Results Page), or whenever Amazon displays a list of products, they are moving the user from the querying space to the browsing space. This does not happen if the end user lands directly on the result they were looking for (as with Google’s I’m Feeling Lucky feature, for instance).

Users query the system → “where to buy sneakers in Milan, Italy.”

Users browse the results → documents (items) match the intent of the user

The system guesses what info is relevant: the Push strategy

This strategy is used when systems take the initiative to deliver presumably relevant information to the end user. This strategy is employed by recommendation systems. The better these systems are at pushing information, the better their performance and usage.

Amazon is a perfect example of how these systems work at a professional level. Netflix’s system is another one worth mentioning. We can all acknowledge how powerful these systems are, in that they directly increase (or decrease) the value of each user to the business.

But why are these systems so difficult to tune? Why is Amazon so good at suggesting items while some other engines fail when it comes to creating more complex associations?

It’s because these systems require stable, clean information coming in from the end user. In other words, they must have access to user behavior data. Of course, the more traffic you have on your website, the more data you can store and feed into the system.
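To make the dependence on behavior data concrete, here is a minimal sketch of a push-style recommender based on item co-occurrence. It is only one simple technique and not a description of Amazon’s or Netflix’s actual systems. The `sessions` data and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(sessions):
    """Count how often two items appear in the same user session."""
    co = defaultdict(Counter)
    for items in sessions:
        for a, b in permutations(set(items), 2):
            co[a][b] += 1
    return co


def recommend(co, item, k=2):
    """Push the k items most often seen together with `item`."""
    neighbours = co.get(item, Counter())
    ranked = sorted(neighbours.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:k]]


# Hypothetical behavior data: each list is what one user viewed or bought.
sessions = [
    ["sneakers", "socks"],
    ["sneakers", "socks", "laces"],
    ["sneakers", "laces"],
    ["sneakers", "socks"],
    ["boots", "socks"],
]

co = build_cooccurrence(sessions)
print(recommend(co, "sneakers"))
```

With only five sessions the counts are fragile: one extra session could flip the ranking. That is the practical reason more traffic, and cleaner data, lead to better pushed suggestions.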

The Problem of User Intent

Natural Language Processing is a very difficult domain. Having machines decode what humans imply during conversation is turning out to be quite the challenge.

John saw a kid with a telescope.

This sentence alone is enough to trip up many NLP algorithms of the past 20 years. The phrase “with a telescope” can refer either to John (John saw the kid by looking through a telescope) or to the kid (the kid was holding a telescope when John saw him). Resolving ambiguity is one of the greatest challenges in NLP today. Google and Facebook have done great work in the field, together with many other big players in the industry.
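The two readings can be made visible with a tiny parser. The sketch below counts the parse trees of the sentence with the classic CKY chart algorithm. The grammar and lexicon are toy assumptions written for this one sentence, not a general model of English.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption for this sketch).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),  # PP attaches to the verb: John uses the telescope
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),  # PP attaches to the noun: the kid has the telescope
    ("PP", "P", "NP"),
]
LEXICON = {
    "john": "NP", "saw": "V", "a": "Det",
    "kid": "N", "with": "P", "telescope": "N",
}


def count_parses(words):
    """Return how many distinct sentence-level trees the grammar allows."""
    n = len(words)
    # chart[(i, j)][label] = number of trees with that label over words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        chart[(i, i + 1)][LEXICON[word]] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # every split point inside the span
                for parent, left, right in BINARY_RULES:
                    chart[(i, j)][parent] += (
                        chart[(i, k)][left] * chart[(k, j)][right]
                    )
    return chart[(0, n)]["S"]


sentence = "john saw a kid with a telescope".split()
print(count_parses(sentence))
```

The grammar accepts the sentence in two ways, one per attachment of “with a telescope”. Nothing in the syntax says which one the speaker meant, which is why systems need context or statistics to choose.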

It goes without saying that understanding user intent during search is a tough task. Google, the leading search engine in the world, whose job is literally to predict what users intend, is still trying to figure out how to achieve this. Updates to its core algorithm are pushed frequently, and these updates often tune the engine’s ability to understand user queries.

Data scientists and ML engineers implementing search and recommendation systems face the problem of understanding user intent more than anything else. That being said, IR systems still allow businesses to make millions in spite of the difficulty of accurately understanding what the user really seeks. And we are just scratching the surface.

Bonus: Google’s generative AI, LaMDA

As a bonus for those who stayed with me until the end, here’s a video of the incredible conversational capabilities of Google’s new AI, LaMDA. Not related to IR… but who knows? Maybe in the future search will be conducted via conversation? We already have Siri and Google Assistant…
