How To Make a Career in GenAI In 2024
Last Updated on December 30, 2023 by Editorial Team
Author(s): Sudhanshu Sharma
Originally published on Towards AI.
I serve as the Principal Data Scientist at a prominent healthcare firm, where I lead a small team dedicated to addressing patient needs. Over the past 11 years in the field of data science, I've witnessed significant transformations. The industry has evolved from relying on tools like SAS and R to placing a spotlight on data visualization tools like Tableau and Power BI. Black-box algorithms such as XGBoost emerged as the preferred solution for the majority of classification and regression problems. Later, Python gained momentum and surpassed all programming languages, including Java, in popularity around 2018–19. The advent of more powerful personal computers paved the way for the gradual acceptance of deep learning-based methods. The introduction of attention mechanisms has notably altered our approach to working with deep learning algorithms, leading to a revolution in the realms of computer vision and natural language processing (NLP).
In 2023, we witnessed the substantial transformation of AI, marking it as the "year of AI." This evolution became tangible and accessible to the general public through experiences like ChatGPT. To me, this emerging trend stands out as the most significant for the foreseeable future. Professionals who embrace and navigate this wave are poised to reap immense benefits in the coming years.
I'm crafting this blog post for individuals aspiring to build a career in the GenAI field. Whether you're already working as an analyst and seeking to elevate your skills or starting from scratch, this post aims to provide guidance and insights to help you navigate and thrive in the dynamic and evolving landscape of GenAI.
Here are 11 pillars for building expertise in GenAI:
1. Basics of Python – Python serves as a prominent programming language for working with large language models (LLMs) due to its versatility, extensive libraries, and community support. Major language models like GPT-3 and BERT often come with Python APIs, making it easy to integrate them into various applications. So, Python is the MOST important prerequisite for venturing into the GenAI world as a developer.
Introduction to Python for Data Science – Analytics Vidhya
Master the basics of Python with a detailed introduction to Python for data science analysts. Expand your skillset by…
courses.analyticsvidhya.com
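To make the "Python APIs" point concrete, here is a minimal sketch of what preparing a request for a chat-completions-style LLM API looks like from Python. The payload shape follows the common convention popularized by OpenAI's API, but the model name and fields here are illustrative, not tied to any specific provider:

```python
import json

def build_chat_payload(model, user_message, temperature=0.7):
    """Assemble the JSON body for a chat-completions-style LLM API call."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "temperature": temperature,
    }

# A real application would POST this body to the provider's HTTP endpoint.
payload = build_chat_payload("gpt-3.5-turbo", "Summarize attention in one line.")
print(json.dumps(payload, indent=2))
```

Everything beyond this — authentication, retries, streaming — is ordinary Python plumbing, which is exactly why the language is the first prerequisite.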
2. Deep learning fundamentals (with or without the maths) – The major topics to focus on from an LLM point of view are the MP neuron, the perceptron, the sigmoid neuron, feedforward neural networks (FFNNs), backpropagation, the various types of gradient descent, activation functions, word representations such as word2vec, RNNs, GRUs, and LSTMs.
CS6910/CS7015: Deep Learning
Mitesh M. Khapra Homepage
www.cse.iitm.ac.in
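Two of the fundamentals above — the sigmoid neuron and gradient descent — fit in a few lines of plain Python. This is a toy single-input neuron trained on one example, just to show the chain-rule update that backpropagation generalizes to deep networks:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """Sigmoid neuron: y_hat = sigma(w*x + b)."""
    return sigmoid(w * x + b)

def sgd_step(w, b, x, y, lr=0.5):
    """One gradient-descent step on squared error for a single (x, y) pair."""
    y_hat = forward(w, b, x)
    # Chain rule: dL/dw = 2*(y_hat - y) * y_hat*(1 - y_hat) * x
    grad_common = 2 * (y_hat - y) * y_hat * (1 - y_hat)
    return w - lr * grad_common * x, b - lr * grad_common

w, b = 0.0, 0.0
for _ in range(200):
    w, b = sgd_step(w, b, x=1.0, y=1.0)
print(round(forward(w, b, 1.0), 3))  # prediction has moved from 0.5 toward the target 1.0
```

Backpropagation is this same chain-rule bookkeeping applied layer by layer, which is why these "small" topics matter so much.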
3. Attention models framework – The key idea behind attention models is to enable the model to dynamically focus on relevant parts of the input sequence, giving more attention to certain elements while ignoring others. This is especially useful in tasks involving sequential data, such as natural language processing, where understanding the context and relationships between words is crucial.
Attention models serve as a foundational component for Large Language Models (LLMs) because they address the challenges associated with processing and understanding sequences of information, such as language. LLMs, like GPT (Generative Pre-trained Transformer) models, leverage attention mechanisms to capture long-range dependencies and contextual relationships within input sequences, making them more effective at handling natural language tasks in a way that was not possible earlier with RNNs and LSTMs.
Large Language Models – Deep dive into Transformers
This is Part 1 of a course on LLMs as taught by AI4Bharat, IIT Madras' Prof. Mitesh Khapra
courses.ai4bharat.org
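The mechanism described above is small enough to write out. This is scaled dot-product attention, the core operation of the transformer, in NumPy (single head, no masking, random toy inputs):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # subtract max for stability
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # (seq_q, seq_k) similarity scores
    weights = softmax(scores, axis=-1)  # each query's weights over keys sum to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape, w.sum(axis=-1))
```

Each output row is a weighted mixture of the value vectors — that weighting, recomputed for every token, is what lets the model "focus on relevant parts of the input."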
4. One deep learning framework, preferably PyTorch – PyTorch's dynamic computational graph, ease of use, strong community support, and integration with key libraries make it an essential tool for developing, training, and deploying Large Language Models in natural language processing tasks.
Welcome to PyTorch Tutorials – PyTorch Tutorials 2.2.0+cu121 documentation
Exploring TorchRec sharding This tutorial covers the sharding schemes of embedding tables by using EmbeddingPlanner and…
pytorch.org
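The whole framework revolves around one training loop. This minimal sketch fits a toy linear model, but the zero_grad → forward → backward → step pattern is exactly the loop used to train LLMs, just with billions of parameters:

```python
import torch

# Tiny supervised task: learn y = 2x + 1 from four points.
model = torch.nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
xs = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
ys = 2 * xs + 1

for _ in range(500):
    opt.zero_grad()                                       # clear old gradients
    loss = torch.nn.functional.mse_loss(model(xs), ys)    # forward pass
    loss.backward()                                       # autograd fills .grad
    opt.step()                                            # update parameters

print(loss.item())  # near zero after training
```

PyTorch's dynamic graph means `backward()` differentiates whatever Python code actually ran, which is why experimentation is so fast.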
5. NLP fundamentals – This covers the basics of NLP, such as tokenization, stemming, lemmatization, POS tagging, named entity recognition (NER), and text representations like bag-of-words (BOW) and word2vec.
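Two of those basics, tokenization and bag-of-words, can be sketched in plain Python (a deliberately simple regex tokenizer, not a production one):

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase tokenization on runs of letters/digits."""
    return re.findall(r"[a-z0-9]+", text.lower())

def bag_of_words(docs):
    """Map each document to term counts over a shared, sorted vocabulary."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts.get(tok, 0) for tok in vocab])
    return vocab, vectors

vocab, vectors = bag_of_words(["the cat sat", "the cat and the dog"])
print(vocab)    # ['and', 'cat', 'dog', 'sat', 'the']
print(vectors)  # [[0, 1, 0, 1, 1], [1, 1, 1, 0, 2]]
```

BOW discards word order entirely — seeing why that is a problem motivates word2vec and, ultimately, attention.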
6. LLM basic concepts – Large Language Models (LLMs) are foundational machine learning models that use deep learning algorithms to process and understand natural language. These models are trained on massive amounts of text data to learn patterns and relationships in the language. LLMs can perform many types of language tasks, such as translating languages, analyzing sentiment, and powering chatbot conversations. Learning the basics of the transformer architecture, which is the core of LLMs, is imperative for any professional.
Large Language Models – Deep dive into Transformers
This is Part 2 of a course on LLMs as taught by AI4Bharat, IIT Madras' Prof. Mitesh Khapra
courses.ai4bharat.org
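One basic concept worth internalizing is that an LLM generates text one token at a time by repeatedly predicting the next token. In a real LLM a transformer produces those next-token scores; in this toy sketch a bigram count table stands in for the model, but the greedy decoding loop is the same:

```python
import numpy as np

# Toy "language model": bigram counts learned from one sentence,
# standing in for the logits a transformer would produce.
vocab = ["<s>", "the", "cat", "sat", "."]
counts = np.zeros((5, 5))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4)]:  # "<s> the cat sat ."
    counts[a, b] += 1

def generate(start_id, max_len=10):
    """Greedy decoding: repeatedly append the most likely next token."""
    ids = [start_id]
    while len(ids) < max_len:
        nxt = int(counts[ids[-1]].argmax())  # pick highest-scoring successor
        ids.append(nxt)
        if vocab[nxt] == ".":                # stop at end-of-sentence
            break
    return " ".join(vocab[i] for i in ids)

print(generate(0))  # -> "<s> the cat sat ."
```

Swap the count table for a transformer and greedy argmax for temperature sampling, and this is essentially how ChatGPT produces its replies.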
7. Reinforcement learning– LLMs have revolutionized natural language understanding by processing vast amounts of text data. When integrated with reinforcement learning, LLMs enhance their capabilities beyond language tasks. Reinforcement learning enables LLMs to optimize their performance by learning from interactions with an environment, receiving feedback, and adjusting their language generation strategies accordingly. Though I don't prefer to recommend a paid course this course is the best available course right now.
Reinforcement Learning
Master the Concepts of Reinforcement Learning. Implement a complete RL solution and understand how to apply AI tools to…
www.coursera.org
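The core RL loop — act, observe reward, update a value estimate — fits in a short script. This is tabular Q-learning on a hypothetical four-state corridor (reward for reaching the right end), far simpler than the RLHF used on LLMs but built on the same learn-from-feedback principle:

```python
import random

random.seed(0)
# States 0..3 in a corridor; reaching state 3 gives reward 1 and ends the episode.
# Actions: 0 = left, 1 = right.
Q = [[0.0, 0.0] for _ in range(4)]
alpha, gamma = 0.5, 0.9

def step(s, a):
    s2 = max(0, min(3, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == 3 else 0.0), s2 == 3

for _ in range(300):
    s = 0
    for _ in range(20):
        a = random.randrange(2)  # explore randomly; Q-learning is off-policy
        s2, r, done = step(s, a)
        # Bellman update: move Q toward reward + discounted best future value.
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2
        if done:
            break

policy = [max((0, 1), key=lambda a: Q[s][a]) for s in range(3)]
print(policy)  # greedy policy: go right in every non-terminal state
```

RLHF replaces this toy table with the LLM's parameters and the corridor reward with a learned human-preference reward model, but the feedback-driven update is the same idea.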
8. LLM Deep-dive– Several topics within Large Language Models (LLMs) warrant exploration, including:
1. Prompt Engineering: Delving into the art and strategy of designing and crafting effective prompts to elicit desired responses from language models. This practice has gained prominence with the rise of large language models such as OpenAI's GPT-3, which are capable of generating human-like text based on given prompts.
generative-ai-for-beginners/04-prompt-engineering-fundamentals at main ·…
12 Lessons, Get Started Building with Generative AI https://microsoft.github.io/generative-ai-for-beginners/ …
github.com
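A common prompt-engineering pattern is few-shot prompting: show the model worked examples before the real query. Here is a minimal, illustrative prompt builder (the task and examples are made up for demonstration):

```python
def build_few_shot_prompt(task, examples, query):
    """Assemble a few-shot prompt: instruction, worked examples, then the query."""
    lines = [task, ""]
    for inp, out in examples:
        lines += [f"Input: {inp}", f"Output: {out}", ""]
    lines += [f"Input: {query}", "Output:"]  # model completes after "Output:"
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify the sentiment as positive or negative.",
    [("I loved this movie!", "positive"), ("Terrible service.", "negative")],
    "The food was amazing.",
)
print(prompt)
```

The trailing "Output:" is deliberate — ending the prompt where the answer should begin steers the model toward the format the examples established.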
2. Parameter-Efficient Fine-Tuning (PEFT): Understanding techniques and methodologies that optimize model performance with fewer parameters, enhancing efficiency.
3. LoRA (Low-Rank Adaptation): Exploring how freezing the pretrained weights and training only small low-rank update matrices lets large models be fine-tuned at a fraction of the memory and compute cost.
4. QLoRA (Quantized LoRA): Investigating how combining 4-bit quantization of the frozen base model with LoRA adapters further reduces the memory required for fine-tuning.
5. Various LLMs (Open Source vs. GPTs): Comparing and contrasting different LLMs, particularly examining the distinctions between open-source models and those developed by entities like OpenAI, such as the Generative Pre-trained Transformers (GPTs).
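The LoRA idea from the list above can be sketched in a few lines of NumPy: keep the pretrained weight W frozen and learn a low-rank update B·A. The dimensions here are arbitrary toy values, not those of any real model:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 512   # model (hidden) dimension
r = 8     # LoRA rank, with r << d

W = rng.normal(size=(d, d))          # frozen pretrained weight (never updated)
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-initialized

def lora_forward(x, alpha=16):
    """h = W x + (alpha / r) * B A x — only A and B receive gradients."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d)
full_params = W.size
lora_params = A.size + B.size
print(f"trainable params: {lora_params} vs full fine-tuning: {full_params}")
```

Two details worth noticing: zero-initializing B means the adapted model starts out exactly equal to the pretrained one, and the trainable parameter count (2·r·d) is a tiny fraction of d², which is the entire point of the method.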
9. Famous and fundamental research papers in this field:
Attention Is All You Need – https://arxiv.org/abs/1706.03762
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning (AdaLoRA) – https://arxiv.org/abs/2303.10512
Parameter-Efficient Methods for Pre-trained Language Models (Delta Tuning) – https://arxiv.org/abs/2203.06904
LoRA: Low-Rank Adaptation of Large Language Models – https://arxiv.org/abs/2106.09685
Pre-train, Prompt, and Predict: A Survey of Prompting Methods – https://arxiv.org/abs/2107.13586
10. LangChain – LangChain is a framework for developing applications powered by language models. It enables applications that are 1) context-aware, connecting a language model to sources of context, and 2) reasoning, relying on a language model to reason. Here is a full playlist of videos on LangChain from YouTube.
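The core idea LangChain packages — compose a prompt template, a model call, and an output parser into one pipeline — can be sketched in plain Python. This is not the LangChain API itself; a stub function stands in for the real LLM call:

```python
def prompt_template(question, context):
    """Fill a template that grounds the model in retrieved context."""
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

def fake_llm(prompt):
    # Stub model for illustration: a real chain would call an LLM API here.
    return "Paris." if "capital of France" in prompt else "I don't know."

def output_parser(text):
    """Normalize the raw completion into a clean answer string."""
    return text.strip().rstrip(".")

def chain(question, context):
    # template -> model -> parser, each step feeding the next
    return output_parser(fake_llm(prompt_template(question, context)))

print(chain("What is the capital of France?", "France's capital is Paris."))
```

In LangChain these three stages map onto prompt templates, model wrappers, and output parsers; the framework's value is standardizing this composition plus the surrounding plumbing (retrieval, memory, tool calls).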
11. Vector Databases – Vector databases are specifically designed to store and retrieve vector data efficiently, which is important for developing high-performance LLM apps. I took a good course on vector databases on Udemy:
https://www.udemy.com/course/vector-db/
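What a vector database does at its core — store embeddings, return the nearest ones to a query — can be shown in a few lines of NumPy. Real systems add approximate indexes (HNSW, IVF) and persistence; this toy store uses exact cosine similarity and made-up 3-dimensional "embeddings":

```python
import numpy as np

class TinyVectorStore:
    """Minimal in-memory vector store with cosine-similarity search."""

    def __init__(self):
        self.texts, self.vecs = [], []

    def add(self, text, vec):
        v = np.asarray(vec, dtype=float)
        self.vecs.append(v / np.linalg.norm(v))  # normalize once at insert time
        self.texts.append(text)

    def search(self, vec, k=1):
        q = np.asarray(vec, dtype=float)
        q = q / np.linalg.norm(q)
        sims = np.stack(self.vecs) @ q           # cosine similarity via dot product
        top = np.argsort(-sims)[:k]              # indices of the k best matches
        return [(self.texts[i], float(sims[i])) for i in top]

store = TinyVectorStore()
store.add("a note about cats", [1.0, 0.1, 0.0])
store.add("a note about finance", [0.0, 0.2, 1.0])
print(store.search([0.9, 0.2, 0.1], k=1))  # nearest: the cats note
```

Retrieval-augmented generation (RAG) is exactly this search step followed by stuffing the retrieved texts into the LLM's prompt, which is why vector databases matter for LLM apps.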
Upon finishing the aforementioned courses, individuals can effectively apply these concepts to develop end-to-end LLM applications and participate in LLM competitions on Kaggle. Furthermore, they have the opportunity to share links to their work on platforms like LinkedIn, thereby establishing a noteworthy personal brand in the field of Generative Artificial Intelligence (GenAI).
The duration for covering all these activities can vary for each candidate, ranging from 2 to 6 months. The timeframe largely depends on the depth of exploration and the extent to which an individual chooses to delve into the subject matter.
Additionally, try to follow popular AI contributors on YouTube, LinkedIn, and X. Some of the popular ones I follow are:
Umar Jamil
I'm a Machine Learning Engineer from Milan, Italy currently living in China, teaching complex deep learning and machine…
www.youtube.com
Alejandro AO – Software & Ai
I help you become a software developer and code AI applications.
www.youtube.com
Whispering AI
"Greetings! I'm a machine learning engineer with a passion for innovation and creativity. With my extensive technicalβ¦
www.youtube.com
Venelin Valkov
Misusings of AI, Stats & Programming
www.youtube.com
For any queries or guidance on these topics, feel free to reach out to me on LinkedIn:
https://www.linkedin.com/in/sudhanshu-sharma-4a54291a0/
Published via Towards AI