Advancing Neural Search with Jina 2.0

Last Updated on January 6, 2023 by Editorial Team

Last Updated on August 10, 2021 by Editorial Team

Author(s): Shubham Saboo

Deep Learning

With the onset of the Information Age, the ability to intelligently search through massive amounts of information has become an integral part of our day-to-day lives…

Defying Convention: Navigating the future of search with Jina 2.0!

Pre-Requisites

To understand the basics of neural search and how it differs from conventional search, please read my previous blog post, “Next-gen powered by Jina”. It explains how Jina, a cloud-native, open-source company, is pioneering the field of neural search. It builds on the idea of semantic search and explains the basic building blocks of the Jina framework required to build intelligent search applications.

RE: Neural Search

As a recap, the idea behind neural search is to leverage state-of-the-art deep neural networks to retrieve contextually and semantically relevant information from large volumes of data. A neural search system can go well beyond simple text search by letting you search across many data formats, including images, video, audio, and even PDFs.

Applications of Neural Search

  • A question-answering chatbot can be powered by neural search: by first indexing all hard-coded QA pairs and then semantically mapping the user dialog to those pairs.
  • A smart speaker can be powered by neural search: by applying STT (speech-to-text) recognition and then semantically mapping text to internal commands.
  • A recommendation system can be powered by neural search: by embedding user-item information in the form of numerical vectors and finding top-K nearest neighbors of a particular user/item.
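
To make the recommendation case concrete, here is a minimal, library-free sketch of a top-K nearest-neighbor lookup over embedding vectors. The item names and vectors are made up for illustration, and cosine similarity is only one reasonable choice of distance; in a real system the vectors would come from a neural encoder.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, items, k):
    """Return the ids of the k items whose vectors are most similar to the query."""
    ranked = sorted(
        items,
        key=lambda item_id: cosine_similarity(query, items[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical, hand-made item embeddings (assumption: 3-dimensional for readability).
item_vectors = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}
user_vector = [1.0, 0.0, 0.0]

nearest = top_k(user_vector, item_vectors, k=2)
print(nearest)  # ['running shoes', 'trail sneakers']
```

A brute-force scan like this is fine for a handful of items; large collections usually call for an approximate nearest-neighbor index instead.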

Neural search has created a new way to comprehend the world, giving us the ability to perform intelligent information retrieval on the vast amounts of data found across the internet. Jina is a cloud-native neural search platform at the forefront of creating the future of search!

Jina 2.0: What’s Changed?

Jina 1.x vs. Jina 2.0

Jina 1.x was a complex beast with a lot of boilerplate code and limited transparency, whereas Jina 2.0 embraces the principle of “explicit above implicit”. To make the power of neural search accessible to a wider audience, Jina launched version 2.0, which is easier and faster to learn and leaner to adopt.

Jina 1.x was hard to learn: it was difficult to get used to its many components and to put them together coherently into a working application. In Jina 2.0, these individual components are abstracted behind a simple layer of just a Flow and an Executor. The middle layers, including Pods and Peas, are hidden behind the scenes, allowing you to focus on “what really matters”.

To get started with Jina 2.0 and build intelligent search systems, you only need to know three concepts: Document, Executor, and Flow. With the user-friendly Pythonic interface, you will get up to speed in no time.

Fundamental Components of Jina 2.0

Document, Executor, and Flow are the three fundamental concepts in Jina.

  • A Document is the basic data type in Jina
  • An Executor is how Jina processes Documents
  • A Flow is how Jina streamlines and scales Executors

Document

Document is the basic data type that Jina operates on, and it is agnostic to the type and format of the data: text, pictures, audio, and video are all treated as Documents in Jina. A DocumentArray is a container that wraps multiple individual Documents.

You can think of a DocumentArray as a text file composed of multiple sentences, where each sentence is a Document. A DocumentArray is a first-class citizen of Jina’s Executor, serving as the Executor’s input and output. For the data folks, the relationship between the two can be understood through a simple analogy to the well-known NumPy library.

Document = np.float64 (a single value); DocumentArray = np.ndarray (a collection of values)

Executor

Executor is the smallest algorithmic unit in Jina and is used to process Documents. Encoding images into vectors, storing vectors on disk, and ranking results are all formulated as Executors. Executors provide intuitive interfaces, allowing AI developers and engineers to focus on the algorithm. Some common Executors are as follows:

  • Crafter: pre-processes the documents into chunks.
  • Encoder: takes the pre-processed chunks from the Crafter and encodes them into embedding vectors.
  • Indexer: takes the encoded vectors as input, then indexes and stores them in a key-value fashion.
  • Ranker: runs on the indexed storage and sorts the results according to a ranking criterion.
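
The four roles above can be sketched as plain Python functions. This is a toy illustration that does not use Jina at all: the function names are hypothetical, and a word-count vector stands in for the embedding a neural model would produce.

```python
import math
from collections import Counter


def craft(text):
    """Crafter: split a raw text into sentence-level chunks."""
    return [chunk.strip() for chunk in text.split(".") if chunk.strip()]


def encode(chunk):
    """Encoder: map a chunk to a vector (word counts stand in for a neural model)."""
    return Counter(chunk.lower().split())


def index(chunks):
    """Indexer: store each chunk and its vector under an integer key."""
    return {key: (chunk, encode(chunk)) for key, chunk in enumerate(chunks)}


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(query, storage):
    """Ranker: sort the indexed chunks by similarity to the query, best first."""
    query_vector = encode(query)
    hits = sorted(
        storage.values(),
        key=lambda entry: cosine(query_vector, entry[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in hits]


text = (
    "Jina is a neural search framework. "
    "Cats sleep most of the day. "
    "Neural search uses deep learning"
)
storage = index(craft(text))
results = rank("neural search", storage)
print(results[0])  # Neural search uses deep learning
```

Each stage consumes the output of the previous one, which is the same hand-off the Crafter, Encoder, Indexer, and Ranker perform on Documents.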

An Executor processes a DocumentArray in place via functions decorated with @requests. The following are the features of an Executor in Jina:

  • An Executor should subclass directly from the jina.Executor class.
  • An Executor class is a bag of functions with a shared state (via self), allowing it to contain an arbitrary number of functions with arbitrary names.
  • Functions decorated with the @requests decorator are invoked according to their on= endpoint.
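
To show the idea behind endpoint-based dispatch, here is a small stand-alone model in plain Python. It is not Jina's implementation and does not import jina: `toy_requests`, `ToyExecutor`, and `handle` are hypothetical names invented for this sketch, and documents are modeled as plain dictionaries.

```python
def toy_requests(on):
    """Mark a method as the handler for endpoint `on` (stand-in for @requests)."""
    def decorator(func):
        func._endpoint = on
        return func
    return decorator


class ToyExecutor:
    """Collects every marked method of a subclass into an endpoint -> method table."""

    def __init__(self):
        self._handlers = {}
        for name in dir(self):
            method = getattr(self, name)
            endpoint = getattr(method, "_endpoint", None)
            if callable(method) and endpoint is not None:
                self._handlers[endpoint] = method

    def handle(self, endpoint, docs):
        """Invoke the method bound to `endpoint`; docs are modified in place."""
        self._handlers[endpoint](docs)
        return docs


class MyExecutor(ToyExecutor):
    def __init__(self):
        super().__init__()
        self.calls = 0  # shared state, reachable from every method via self

    @toy_requests(on="/index")
    def add_prefix(self, docs):
        self.calls += 1
        for doc in docs:
            doc["text"] = "indexed: " + doc["text"]

    @toy_requests(on="/search")
    def shout(self, docs):
        self.calls += 1
        for doc in docs:
            doc["text"] = doc["text"].upper()


executor = MyExecutor()
docs = [{"text": "hello"}]
executor.handle("/index", docs)
executor.handle("/search", docs)
print(docs, executor.calls)  # [{'text': 'INDEXED: HELLO'}] 2
```

The method names are arbitrary; only the endpoint given to the decorator decides which function runs, and both functions share state through `self`.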

There are mainly two ways to design an Executor in Jina, so let’s look at a simple example of how to create one using both Python and YAML:

  • Using Python

To create an Executor in Python, import the native Executor class from the Jina core and create a subclass of it. Under the MyExecutor subclass, you can define any number of functions and attach the @requests decorator to them to make them accessible within the Flow.

After defining the Executor subclass, you can create a Flow and call the Executor via the request endpoint /random_work. The following code snippet shows how to use an Executor:

  • Using YAML

An Executor can be loaded from and stored to a YAML file. The following is a replica of the Python Executor created above; you can save this file as “exec.yml”.

After saving the exec.yml file, you can construct an Executor from it and add it to a new or existing Jina Flow:

Flow

A Flow is how Jina streamlines and scales Executors. It represents a high-level task such as indexing, searching, or training. It acts as a context manager and orchestrates a group of Executors to accomplish a single task. For example, to index data you need a sequence of Executors, such as a Crafter, an Encoder, and an Indexer, working in tandem to achieve the desired result.
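
The orchestration idea can be modeled in a few lines of plain Python. The sketch below is an assumption-laden toy, not Jina's Flow: `ToyFlow` and its `post` method are hypothetical, executors are reduced to simple callables, and text length stands in for an embedding.

```python
class ToyFlow:
    """Chains executors (plain callables here) and only runs inside a `with` block."""

    def __init__(self):
        self._executors = []
        self._running = False

    def add(self, executor):
        """Register an executor and return self so calls can be chained."""
        self._executors.append(executor)
        return self

    def __enter__(self):
        self._running = True  # a real system would start its workers here
        return self

    def __exit__(self, exc_type, exc, tb):
        self._running = False  # ...and shut them down here
        return False

    def post(self, docs):
        """Pass docs through every executor in the order they were added."""
        if not self._running:
            raise RuntimeError("open the flow with `with` before sending requests")
        for executor in self._executors:
            docs = executor(docs)
        return docs


def crafter(docs):
    return [doc.strip() for doc in docs]


def encoder(docs):
    return [(doc, len(doc)) for doc in docs]  # length stands in for an embedding


def indexer(docs):
    return dict(docs)  # key-value storage: text -> "embedding"


flow = ToyFlow().add(crafter).add(encoder).add(indexer)  # nothing runs yet
with flow:
    result = flow.post(["  hello ", "neural search"])
print(result)  # {'hello': 5, 'neural search': 13}
```

Building the chain with `add` does no work by itself; requests are only served while the `with` block is open, and calling `post` outside of it raises an error.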

A Flow is a service, allowing multiple clients to access it via gRPC, REST, or WebSocket from a public or private network.

A Flow follows a lazy construction pattern, so it won’t actually run until you open it using with. A Flow can be created by importing the Flow class from the jina core library and then adding Executors to it. To run a Flow, open it via with and send data requests, as in the example below:


If you would like to learn more or want me to write more on this subject, feel free to reach out.

My social links: LinkedIn | Twitter | GitHub

If you liked this post or found it helpful, please take a minute to press the clap button; it increases the post’s visibility for other Medium users.


Advancing Neural Search with Jina 2.0 was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
