
PyTorch vs. TensorFlow 2022: Which Deep Learning Framework Should You Use?

Author(s): Tharun P

 


Deep Learning Framework Performance and Speed

This post will lead you through the primary benefits and drawbacks of PyTorch versus TensorFlow, as well as how to choose the best framework for your needs.

Image by artificialintelligencememes

Today, the two most popular Deep Learning frameworks are PyTorch and TensorFlow. The question of which framework is superior has long been a source of contention, with both camps having passionate supporters.

The debate terrain between TensorFlow and PyTorch is constantly shifting because both projects have grown so rapidly. Outdated or incomplete information abounds, further obscuring the already intricate question of which framework is superior in a given area.

Despite TensorFlow’s reputation as an industry framework and PyTorch’s reputation as a research framework, we will demonstrate that these perceptions are partly based on outdated information. Let’s examine the distinctions today to understand how the debate over which framework is dominant will evolve in 2022.

Practical Considerations

PyTorch and TensorFlow, like any pair of long-lived frameworks, have distinct development paths and convoluted design-decision histories. Previously, comparing the two required a lengthy technical examination of their current and anticipated future features. Given that both frameworks have grown at an exponential rate since their creation, many of those technical distinctions are now relics.

Fortunately for those of us who don’t want our eyes to glaze over, the PyTorch vs TensorFlow argument is now down to three practical considerations:

  1. Model Availability: With the Deep Learning domain expanding every year and models growing in size, creating State-of-the-Art (SOTA) models from scratch is simply not feasible for most teams. Thankfully, numerous SOTA models are publicly available, and it is critical to use them wherever possible.
  2. Deployment Infrastructure: It is worthless to train high-performing models if they cannot be used. Reducing time-to-deployment is critical, especially with the rising popularity of microservice business models, and slow deployment can make or break many Machine Learning-centered organizations.
  3. Ecosystems: Deep Learning is no longer limited to specific use cases in carefully controlled contexts. Because AI is bringing new power to a variety of sectors, a framework that sits inside a bigger ecosystem and supports development for mobile, local, and server applications is critical. Furthermore, with the arrival of specialized Machine Learning hardware, such as Google’s Edge TPU, effective practitioners need a framework that can interface seamlessly with that hardware.

We’ll go through each of these three practical factors in turn, and then provide suggestions on which framework to employ in specific situations.

PyTorch vs TensorFlow — Model Availability

In applications like Natural Language Processing (NLP), where engineering and optimization are hard, building an effective Deep Learning model from scratch is a serious challenge. For small-scale organizations, training and tuning SOTA models is simply not feasible given their ever-increasing complexity. OpenAI’s GPT-3, for example, has roughly 175 billion parameters, and its successors are widely expected to be larger still. With limited computing resources, startups and researchers alike cannot develop and test such models on their own, so being able to use pre-trained models, whether for transfer learning, fine-tuning, or out-of-the-box inference, is essential.

PyTorch and TensorFlow vary dramatically in terms of model availability. Both PyTorch and TensorFlow have official model repositories, which we’ll look at in the Ecosystems section below, but practitioners may wish to use models from other sources. Let’s look at the model availability for each framework quantitatively.

HuggingFace

HuggingFace allows you to incorporate pre-trained and fine-tuned SOTA models into your workflows with only a few lines of code.
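
To make this concrete, here is a minimal sketch, assuming the transformers library is installed; the pipeline call below pulls down a default pre-trained sentiment-analysis model:

```python
from transformers import pipeline

# Downloads a default pre-trained sentiment-analysis model on first use
classifier = pipeline("sentiment-analysis")

print(classifier("PyTorch and TensorFlow are both mature frameworks."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```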

When we compare HuggingFace model availability in PyTorch with TensorFlow, the results are astounding. The figure below shows the total number of models on HuggingFace that are specific to PyTorch or TensorFlow, or available for both frameworks. The number of models available specifically in PyTorch completely dwarfs the competition: almost 85 percent of models are PyTorch-exclusive, and even those that are not have roughly a 50 percent chance of also being available in PyTorch. In comparison, just around 16% of all models are TensorFlow-compatible, with only about 8% being TensorFlow-exclusive.

HuggingFace chart of popular models

We get similar findings if we limit our scope to the top 30 models on HuggingFace. Only about two-thirds of the top 30 models are accessible in TensorFlow, whereas all are available in PyTorch. There are no top-30 models that are unique to TensorFlow.

This graph depicts the fraction of papers that use PyTorch or TensorFlow, based on data pooled from eight leading academic publications over several years. As you can see, PyTorch’s growth was extraordinarily quick, increasing from approximately 7% to over 80% of publications that use either PyTorch or TensorFlow in just a few years.

Data source

Much of this quick growth was due to issues with TensorFlow 1 that were amplified in research settings, pushing academics toward the newer alternative, PyTorch. While many of TensorFlow’s flaws were addressed with the release of TensorFlow 2 in 2019, PyTorch’s momentum has been strong enough for it to hold its position as the established research-centric framework, at least from a community standpoint.

Mentions of PyTorch in arXiv papers grew 194% in the first half of 2019, while the number of contributors to the project grew more than 50%.

We notice a similar tendency when we look at the percentage of researchers who migrated between frameworks. Examining publications by authors who used PyTorch or TensorFlow in 2018 and 2019, we find that a majority of authors who used TensorFlow in 2018 switched to PyTorch in 2019 (55 percent), whereas the vast majority of authors who used PyTorch in 2018 stayed with PyTorch in 2019 (85 percent). This is depicted in the Sankey diagram below, where the left side corresponds to 2018 and the right side to 2019. Note that the figures indicate the percentage of each framework’s users in 2018, not overall numbers.

Data source

Research Papers

The figures shown above clearly show that PyTorch now dominates the research scene. While TensorFlow 2 made it simpler to use TensorFlow for research, PyTorch has given researchers little incentive to go back and give TensorFlow another shot. Furthermore, backward-compatibility issues between old TensorFlow 1 research and current TensorFlow 2 research worsen the problem.

For the time being, PyTorch is the obvious victor in the field of research since it has been widely accepted by the community and is used in the majority of publications and downloadable models.

There are a few exceptions:

  • Google AI: Naturally, Google’s research relies heavily on TensorFlow. Given that Google publishes far more than Facebook (292 papers at NeurIPS or ICML in 2020 versus Facebook’s 92), some researchers may find it advantageous to use, or at least be proficient in, TensorFlow. Google Brain also makes use of JAX in combination with Flax, Google’s neural network library for JAX.
  • Tesla’s workflow with PyTorch: PyTorch’s timely releases align with Elon Musk’s self-imposed deadlines for his Tesla team. With the success of Smart Summon, Tesla plans to become fully autonomous in the next few years, and we can fairly say it has wisely picked PyTorch to do the heavy lifting.
  • DeepMind: DeepMind is even more prolific in research than Facebook. It developed Sonnet, a high-level API for TensorFlow that is geared toward research and is sometimes called “the research version of Keras,” which may appeal to anyone considering TensorFlow for research. DeepMind’s Acme framework may also be useful for Reinforcement Learning practitioners.
  • OpenAI: On the other hand, OpenAI standardized on PyTorch internally in 2020; however, for those in Reinforcement Learning, its earlier Baselines repository is written in TensorFlow. Because Baselines provides high-quality implementations of Reinforcement Learning algorithms, TensorFlow may be the better choice for Reinforcement Learning practitioners.
  • JAX: Google has another framework, JAX, that is gaining popularity among researchers. In certain ways, JAX has less overhead than PyTorch or TensorFlow; however, its core design differs from both, so migrating to JAX may not be a suitable move for most. A growing number of models and papers use JAX, but it is unclear how popular it will become in the research community over the coming years relative to PyTorch and TensorFlow.

TensorFlow faces a long and difficult, if not impossible, path to reassert itself as the leading research framework.

In this round of the PyTorch vs TensorFlow debate, PyTorch is the clear winner: its dominance in research shows why more researchers prefer it.

PyTorch vs TensorFlow — Deployment

From an inference standpoint, SOTA models are the holy grail of Deep Learning applications. However, this isn’t always possible or even applicable in an enterprise environment. Accessing SOTA models is worthless if their intelligence cannot be used due to inefficient, error-prone methods. Hence, it is critical to analyze each Deep Learning framework’s end-to-end process, not just which framework brings you the prettiest models.

TensorFlow has long been the go-to framework for deployment-oriented apps, and for good reason. TensorFlow is accompanied by a slew of tools that make the end-to-end Deep Learning process simple and efficient. TensorFlow Serving and TensorFlow Lite, in particular, make it easy to deploy on clouds, servers, smartphones, and IoT devices.

PyTorch used to be fairly underwhelming in terms of deployment, but it has worked hard in recent years to close the gap. TorchServe, released in 2020, and PyTorch Live, released in December 2021, provided much-needed native deployment capabilities, but has PyTorch reduced the deployment gap enough to be viable in an industry setting? Let’s take a closer look.

TensorFlow

TensorFlow provides scalable production deployment built on static graphs tuned for inference performance. Depending on the application, you deploy a model with either TensorFlow Serving or TensorFlow Lite.
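
As a minimal sketch of what “static graphs” means in practice, tf.function traces an ordinary Python function into a graph that TensorFlow can optimize and reuse; the computation below is a toy stand-in for a real model’s forward pass:

```python
import tensorflow as tf

@tf.function  # traces this Python function into a static TensorFlow graph
def predict(x):
    # Toy computation standing in for a model's forward pass
    return tf.nn.relu(tf.matmul(x, tf.ones((3, 2))))

x = tf.random.normal((1, 3))
print(predict(x))  # the first call traces the graph; later calls reuse it
```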

TensorFlow Serving:

TensorFlow Serving is used within the TensorFlow Extended (TFX) end-to-end Machine Learning platform to deliver TensorFlow models to servers, whether in-house or in the cloud. Serving makes it simple to serialize models into well-defined directories with model tags and to choose which model handles inference requests, all while keeping the server architecture and APIs unchanged.

Serving enables you to deploy models on dedicated gRPC servers that use Google’s open-source framework for high-performance RPC. Because gRPC was designed to link a wide ecosystem of microservices, these servers are well suited for model deployment. Serving as a whole is tightly integrated with Google Cloud via Vertex AI and connects with Kubernetes and Docker.
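
As a rough sketch of the first step in this workflow, a Keras model is exported as a versioned SavedModel that Serving can pick up; the model below is a placeholder, and the version subdirectory (“1”) follows the layout Serving expects:

```python
import tensorflow as tf

# Placeholder model standing in for a real trained network
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])

# Serving watches a base directory and loads numbered model versions,
# so we export under a version subdirectory ("1" here)
model.save("/tmp/my_model/1")
```

Pointing a TensorFlow Serving instance (for example, the tensorflow/serving Docker image) at the base directory then exposes the model over REST and gRPC with no additional code.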

TensorFlow Lite:

TensorFlow Lite (TFLite) is a lightweight tool for deploying TensorFlow models on mobile and IoT/embedded devices. TFLite compresses and optimizes models for these devices and addresses five constraints of on-device AI: latency, connectivity, privacy, size, and power consumption. The same pipeline can export a regular Keras-based SavedModel (used with Serving) and a TFLite model simultaneously, allowing model quality to be compared.

TFLite is compatible with Android and iOS, as well as microcontrollers (ARM with Bazel or CMake) and embedded Linux (e.g. a Coral device). TensorFlow’s APIs for Python, Java, C++, JavaScript, and Swift (archived in 2021) give developers a diverse choice of programming languages.
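
A minimal sketch of the conversion step, again using a placeholder Keras model in place of a trained network:

```python
import tensorflow as tf

# Placeholder model; in practice this would be your trained network
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, activation="relu", input_shape=(4,)),
    tf.keras.layers.Dense(1),
])

# Convert the Keras model into a compact TFLite flatbuffer
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables quantization
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```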

PyTorch

PyTorch has invested in making deployment easier, despite previously being infamously weak in this area. Previously, PyTorch users had to use Flask or Django to build a REST API on top of the model, but they now have native deployment options in the form of TorchServe and PyTorch Live.
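
For context, the older do-it-yourself approach looked roughly like the Flask sketch below; the model file and input format here are hypothetical, and a real service would add batching, validation, and error handling:

```python
import torch
from flask import Flask, request, jsonify

app = Flask(__name__)

# A TorchScript model assumed to have been exported beforehand
model = torch.jit.load("model.pt")
model.eval()

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"inputs": [[...], ...]}
    inputs = torch.tensor(request.json["inputs"])
    with torch.no_grad():
        outputs = model(inputs)
    return jsonify(outputs.tolist())

if __name__ == "__main__":
    app.run(port=8080)
```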

TorchServe

TorchServe is an open-source deployment platform launched in 2020 as a collaboration between AWS and Facebook (now Meta). It provides basic features like endpoint specification, model archiving, and metrics monitoring, but it remains less mature than TensorFlow’s offering. TorchServe supports both REST and gRPC APIs.

PyTorch Live

PyTorch Mobile was initially launched in 2019 with the goal of creating an end-to-end workflow for the deployment of optimized machine learning models for Android, iOS, and Linux.

PyTorch Live, a follow-up to PyTorch Mobile, launched in early December 2021. It uses JavaScript and React Native to build cross-platform, AI-powered iOS and Android apps with associated UIs; on-device inference is still performed by PyTorch Mobile. Live ships with example projects to get you started and plans to support audio and video input in the future.

Deployment — Final Words

As of now, TensorFlow still leads in deployment: Serving and TFLite are more robust than their PyTorch counterparts, and the ability to use local AI in conjunction with Google’s Coral devices is a must-have for many industries. TorchServe, by contrast, is still in its infancy, and PyTorch Live focuses only on mobile. The deployment arena will probably continue to shift in the coming years.

But for now, TensorFlow wins the deployment round of the PyTorch vs TensorFlow debate.

PyTorch vs TensorFlow — Ecosystems


The ecosystems in which PyTorch and TensorFlow are situated are the final crucial aspect that distinguishes them in 2022. Both PyTorch and TensorFlow are capable modeling frameworks, and their technical differences are less relevant at this stage than the ecosystems that surround them, which include tools for simple deployment, maintenance, distributed training, and so forth.

Now let’s have a look at the ecosystems of each framework.

PyTorch

  • SpeechBrain, an open-source, all-in-one speech toolkit built on PyTorch
  • TorchServe, an open-source model server developed in collaboration between AWS and Facebook
  • TorchElastic for training deep neural networks at scale using Kubernetes
  • PyTorch Hub, an active community for sharing and extending cutting-edge models
  • TorchX, an SDK for launching distributed jobs, with native support for jobs managed locally by TorchElastic
  • Lightning, sometimes called the Keras of PyTorch, which simplifies model engineering and training (a minimal sketch follows this list)
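
To give a feel for what Lightning removes, here is a minimal, hypothetical LightningModule; the architecture, layer sizes, and data are placeholders:

```python
import torch
from torch import nn
import pytorch_lightning as pl

class LitClassifier(pl.LightningModule):
    """A toy classifier: Lightning factors out the training-loop boilerplate."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))

    def forward(self, x):
        return self.net(x.view(x.size(0), -1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = nn.functional.cross_entropy(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

# Given any PyTorch DataLoader `train_loader`, training is then one line:
# pl.Trainer(max_epochs=3).fit(LitClassifier(), train_loader)
```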

TensorFlow

Some highlights of the APIs, extensions, and useful tools in the broader TensorFlow ecosystem include:

  • TensorFlow Hub, a library for reusable machine learning modules (see the sketch after this list)
  • Model Garden, an official collection of models that use TensorFlow’s high-level APIs
  • TensorFlow Extended (TFX), an end-to-end platform for model deployment. You can load, validate, analyze, and transform data; train and evaluate models; deploy models using Serving or Lite; and then track artifacts and their dependencies
  • Vertex AI can help you automate, monitor, and govern Machine Learning systems by orchestrating workflows in a serverless manner. Vertex AI can also store artifacts of a workflow, allowing you to keep track of dependencies and a model’s training data, hyperparameters, and source code.
  • MediaPipe is a framework for building multimodal, cross-platform applied Machine Learning pipelines which can be used for face detection, multi-hand tracking, object detection, and more
  • Coral, which addresses the privacy and efficiency issues raised in the TFLite portion of the Deployment section, along with the difficulty of deploying on-device AI. It provides a wide range of hardware components for prototyping, production, and sensing, some of which are essentially more powerful Raspberry Pis built specifically for AI applications. Its products use Edge TPUs for high-performance inference on low-power devices. Coral also provides pre-compiled models for image segmentation, pose estimation, speech recognition, and other tasks for developers who want to build their own local AI systems. The flowchart below depicts the basic steps for creating a model.
Image source
  • TensorFlow.js is a JavaScript machine learning toolkit that allows you to train and deploy models in the browser as well as server-side using Node.js. They include documentation with examples and instructions on how to import Python models, pre-trained models that are ready to use right away, and live demos with related code.
  • Cloud, a library that connects your local environment to Google Cloud. The provided APIs are meant to bridge the gap between local model development and debugging on one side and distributed training and hyperparameter tuning on Google Cloud on the other, without requiring the Cloud Console.
  • Google Colab, a cloud-based notebook environment much like Jupyter. Connecting Colab to Google Cloud for GPU or TPU training is simple. Note that PyTorch can also be used with Colab.
  • Playground, a small but polished educational tool. It presents a basic dense network inside a tidy user interface: you can vary the number and size of network layers and watch how features are learned in real time, and see how adjusting hyperparameters like learning rate and regularization strength influences learning on various datasets. Playground lets you watch the learning process unfold and examine how inputs are transformed during training in a very visual way. It even ships with the open-source tiny neural network library it was built on, so you can inspect the network’s inner workings.
  • Datasets, Google Research’s dataset site, where Google regularly publishes datasets. Google also offers Dataset Search for querying an even larger dataset database. PyTorch users can, of course, benefit from these datasets as well.
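
As a small sketch of the TensorFlow Hub bullet above, a published module can be dropped into a Keras model as an ordinary layer; the URL below is one real text-embedding module, but any compatible module works the same way:

```python
import tensorflow as tf
import tensorflow_hub as hub

# Reuse a published text-embedding module as an ordinary Keras layer
embed = hub.KerasLayer("https://tfhub.dev/google/nnlm-en-dim50/2",
                       input_shape=[], dtype=tf.string, trainable=False)

model = tf.keras.Sequential([
    embed,
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
```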

Ecosystems — Final Words

This time, though, TensorFlow has the superior ecosystem. Google has invested heavily to ensure there is a product in each key area of the end-to-end Deep Learning process, even if the quality of those products varies across the landscape. Even so, TensorFlow’s deep integration with Google Cloud and TFX makes the end-to-end development process fast and orderly, and the ease of porting models to Google Coral devices gives TensorFlow a landslide win in several industries.

TensorFlow is the winner of the PyTorch vs TensorFlow discussion in this section.

What is the best tool for my project? PyTorch or TensorFlow?

Photo by Matt Walsh on Unsplash

As you might expect, there is no single right answer in the PyTorch versus TensorFlow discussion; it is only reasonable to say that one framework is preferable to the other for a given use case. We’ve gathered all of the ideas into flow charts, each suited to a distinct area of interest, to help you determine which framework is ideal for you.

You may have arrived with a slew of questions in your head.

What if I’m a Researcher?
What if I’m a Student?
What if I’m Looking for a Career Change?
What if I’m a Hobbyist?
What if I’m a Total Beginner?

Well… it depends on what you want to accomplish and, more importantly, what tools you have available to learn with. We may never fully agree, since anyone with a preconceived preference will find it hard to change their answer to this question (the same is true for “fans” of PyTorch and TensorFlow 😉). But we can all agree that knowing how to program is essential. And, after all, what we learn programming in one language carries over when we use another, right? The same happens with frameworks: the key is to understand Deep Learning rather than the syntactic details of one framework, and then to apply that knowledge to whichever framework is popular at the time or most accessible to us.

As you can see, the PyTorch vs TensorFlow debate is a nuanced one whose landscape is constantly changing, and outdated information makes grasping this terrain even more challenging. In 2022, both PyTorch and TensorFlow are fairly mature frameworks with substantial overlap in their core Deep Learning functionality. Today, the practical considerations around each framework, such as model availability, deployment time, and the associated ecosystem, matter more than their technical differences.

PyTorch is well-known for its OOP (Object-Oriented Programming) approach; it is the newer framework, has a highly active community, and is more Python-friendly. For example, when developing a custom model or dataset, you will almost certainly write a new class that inherits from PyTorch’s base classes and override its methods. Personally, I am not fond of OOP: although it gives the code some structure, it makes implementations substantially longer in terms of lines of code.
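
A minimal sketch of that idiom, with arbitrary layer sizes:

```python
import torch
from torch import nn

class TinyNet(nn.Module):
    """A custom model is defined by subclassing nn.Module."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(16, 32)
        self.fc2 = nn.Linear(32, 2)

    def forward(self, x):
        # The forward pass is just an overridden method, plain Python throughout
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet()
print(model(torch.randn(4, 16)).shape)  # torch.Size([4, 2])
```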

TensorFlow, on the other hand, is a very powerful and mature deep learning toolkit with great visualization features and a variety of choices for high-level model creation. It supports mobile platforms and provides production-ready deployment options.

You are not making a mistake in selecting either framework, as both offer strong documentation, a plethora of learning resources, and vibrant communities. While PyTorch has become the de facto research framework following its rapid adoption by the research community, and TensorFlow remains the legacy industry framework, there are clearly use cases for each in both areas.

Hopefully, I’ve been able to direct you through and provide a fair comparison of the complex PyTorch versus TensorFlow landscape!

Follow me on Medium & Twitter


PyTorch vs TensorFlow 2022: Which Deep Learning Framework Should You Use? was originally published in Towards AI on Medium.

 


Published via Towards AI
