GPT-3: A Data Scientist in the Making

Last Updated on January 6, 2023 by Editorial Team

Last Updated on June 23, 2021 by Editorial Team

Author(s): Shubham Saboo

Natural Language Processing

Autopilot exploratory data analysis in pandas by leveraging the capabilities of the world’s most sophisticated language model, GPT-3…

“Errors using inadequate data are much less than those using no data at all” - Charles Babbage

Pre-Requisites

I have collected the dots in the form of articles. Please go through the articles below, in the order listed, to connect the dots and understand the key tech stack behind the intelligent Kube Bot:

  1. FastAPI - The Spiffy Way Beyond Flask!
  2. Streamlit - Revolutionizing Data App Creation
  3. A Brief Introduction to GPT-3

Introduction to Pandas

Pandas is a fast, powerful, and easy-to-use open-source data analysis and manipulation tool built on top of the Python programming language. It is widely accepted among the Python community and is used in many other packages, frameworks, and modules. Pandas is an extremely flexible framework and has a wide range of use-cases for preparing the data for machine learning and deep learning models.

“Torture the data to the right extent and it will confess to anything” - Ronald Coase

Installing pandas

Pandas is available as a Python package on PyPI and can be installed with either pip (pip install pandas) or conda (conda install pandas), depending on the Python environment. Pandas is popular enough to have its own conventional alias, pd, so once it is installed it is usually imported as follows:

import pandas as pd

What kind of data can pandas handle?

If you work with tabular data, such as data in spreadsheets or databases, pandas is the right tool for you. With Pandas, you can explore, clean, and process your data. In pandas, a data table is called a DataFrame.

Fig: Illustration of Pandas Dataframe
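
As a minimal sketch of the explore, clean, and process steps described above, the snippet below builds a small DataFrame by hand. The column names and values are made up for illustration and are not taken from the original application.

```python
import pandas as pd

# Hypothetical sample data: each dict key becomes a column name,
# and each list holds that column's values.
df = pd.DataFrame(
    {
        "city": ["Pune", "Oslo", "Lima"],
        "temp_c": [31.0, None, 19.5],  # None is stored as a missing value (NaN)
    }
)

# Explore: count the missing values in each column.
missing_per_column = df.isna().sum()

# Clean: drop the rows that contain missing values.
cleaned = df.dropna()

# Process: derive a new column from an existing one.
cleaned = cleaned.assign(temp_f=cleaned["temp_c"] * 9 / 5 + 32)

print(df.shape)
print(missing_per_column)
print(cleaned)
```

Each step returns a new object instead of modifying df in place, which keeps the original table available for comparison.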

How to read and write tabular data with pandas?

Pandas supports many file formats and data sources out of the box (such as CSV, Excel, SQL, JSON, and Parquet). Importing data from these sources is straightforward with the functions that share the read_* prefix. Similarly, the to_* methods export data to the respective formats.

Fig: Illustration of import and export sources inΒ pandas
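
The read_* and to_* pairing can be sketched as follows. To keep the example self-contained it reads from and writes to in-memory text instead of files on disk; the sample data is hypothetical.

```python
import io
import json

import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file path.
csv_text = "name,age\nAda,36\nAlan,41\n"

# Import: read_csv accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Export: to_json returns a string when no path is given.
json_text = df.to_json(orient="records")

# Import again from the exported JSON to complete the round trip.
round_trip = pd.read_json(io.StringIO(json_text))

print(df)
print(json.loads(json_text))
print(round_trip)
```

Passing a file path in place of the io.StringIO objects gives the same behavior for data stored on disk.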

Application walkthrough

Now I will walk you through the GPT-3 powered pandas assistant application step by step:

While creating any GPT-3 application, the first and foremost thing to consider is the design and content of the training prompt. Prompt design is the most significant process in priming the GPT-3 model to give a favorable and contextual response.

As a rule of thumb, while designing the training prompt you should aim for a zero-shot response from the model. If that isn’t possible, move forward with a few examples rather than providing it with an entire corpus. The standard flow for training prompt design should look like: Zero-Shot → Few Shots → Corpus-based Priming.

I used the following structure for the pandas assistant’s training prompt:

  • Description: An initial description of the context about what the pandas assistant is supposed to do and adding a line or two about its functionality.
  • Natural Language (English): This component includes a minimal one-liner description of the task that will be performed by the pandas assistant. It helps GPT-3 to understand the context in order to generate proper pandas code in Python.
  • Pandas Code: This component includes the pandas code corresponding to the English description provided as an input to the GPT-3 model.

Input → Natural Language; Output → Pandas Code
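
The three-part structure above can be sketched as a small prompt builder. This is an illustration under stated assumptions, not the author's actual prompt: the description text, the example pairs, the "English:" and "Pandas Code:" labels, and the build_prompt helper are all hypothetical. The sketch only assembles the prompt text and does not call the GPT-3 API.

```python
# Hypothetical description line (assumption, not the original prompt).
DESCRIPTION = (
    "The pandas assistant translates plain English instructions "
    "into pandas code in Python."
)

# Hypothetical few-shot pairs: (natural language, pandas code).
EXAMPLES = [
    ("Import the pandas library", "import pandas as pd"),
    ("Read the file data.csv into a dataframe named df", "df = pd.read_csv('data.csv')"),
]


def build_prompt(description, examples, query):
    """Assemble description, example pairs, and the new query into one prompt."""
    parts = [description.strip(), ""]
    for english, code in examples:
        parts.append(f"English: {english}")
        parts.append(f"Pandas Code: {code}")
        parts.append("")
    # The prompt ends on an open "Pandas Code:" label for the model to complete.
    parts.append(f"English: {query}")
    parts.append("Pandas Code:")
    return "\n".join(parts)


# Zero-shot: description and query only.
zero_shot = build_prompt(DESCRIPTION, [], "Show the first five rows of df")

# Few-shot: the same query, primed with the example pairs.
few_shot = build_prompt(DESCRIPTION, EXAMPLES, "Show the first five rows of df")

print(few_shot)
```

Starting with an empty example list and adding pairs only when the zero-shot output is unsatisfactory mirrors the Zero-Shot → Few Shots flow described earlier.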

Fig: Streamlit powered UI (All in Python)

Fig: The magic of FastAPI → On-the-fly API documentation

Let’s see an example in action to truly understand the power of GPT-3 in generating pandas code from plain English. In the example below, we generate pandas code by providing minimal instructions to the AI pandas assistant.

If you would like to learn more or want me to write more on this subject, feel free to reach out.


GPT-3: A Data Scientist in the Making was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.

Published via Towards AI
