
Building an Agentic RAG Pipeline: Gemini + ChromaDB + Kaggle → ResuMeme AI

Author(s): Anupama Garani

Originally published on Towards AI.

“Your resume says ‘team player.’ Your formatting says ‘I give up.’” — ResuMeme.AI

It started during a WiDS Career Catalyst volunteering session.

This wasn’t a client. This wasn’t a test case. This was someone trying to rewrite her story — and she asked me for help.

A young woman just beginning her journey in data science booked a time with me to discuss career strategy.

She had curiosity. She was committed. She’d done the courses, the projects, the hard work.

She sent me her resume ahead of time.

And while it listed everything she’d done — the tools, the internships, the certifications —

Something was missing.

Not in her skills, but in the story.

There were no metrics. No voice. No way to capture who she really was or what made her stand out.

And that’s when it hit me:

It’s not that people aren’t qualified.

It’s that we’ve never been taught how to show it — especially on a PDF scanned by algorithms.

If I could build something that helped people find their voice, reframe their impact, and yes, even laugh through the process…
I would.

So I built ResuMeme.AI

A full-stack GenAI resume assistant, powered by Gemini and orchestrated entirely inside a Kaggle notebook.
It doesn’t just tell you what’s wrong — it shows you what’s possible.

Here’s what it does:

You Give:

  • A resume feedback dictionary (parsed or LLM-generated)

You Get:

  • ✅ A full resume evaluation (score, strengths, gaps)
  • 💼 Real job matches based on RAG + embeddings
  • 🧼 Cleaned, formatted, ATS-ready resume
  • 📊 Simulated ATS layout score
  • 📝 PDF output
  • 🖼 A meme that roasts your job-hunting aura

Sample Meme Output

“Your resume says ‘team player.’ But your font choice says ‘menace.’”
— ResuMeme.AI

All outputs are grounded in real resume structure, actual job data, and prompt control — no hallucinated fluff.

It’s not a chatbot. It’s a career glow-up pipeline.
And it all runs inside a single notebook.

Let’s dive in.

I used a very badly formatted, outdated resume as the input for the Kaggle notebook.

It’s a two-page resume.

Input

Resume, pages 1 and 2

The Workflow

ResuMeme.AI workflow

This system uses:

  • Gemini JSON Mode
  • Zero-shot prompting
  • Chain-of-Thought + few-shot prompting
  • LangGraph-style agentic routing
  • SERP API to pull live job listings
  • ChromaDB for embeddings
  • Function calling to extract skills
  • WeasyPrint for HTML-to-PDF export
  • Gemini’s image API for memes

All of it lives inside one Kaggle notebook — no frontend needed.

Agents Involved:

  • 🧠 ResumeCriticAgent → scores your resume like a hater
  • 🧼 FormatterAgent → fixes formatting messes
  • 👀 ATSVisionAgent → gives layout feedback
  • 🧭 RAGMatcherAgent → finds real job matches
  • 🎭 MemeGeneratorAgent → creates a custom meme

Real Outputs (What You See):

  • Resume tone + score + gaps
  • ATS layout feedback
  • Top job matches + fit %
  • What skills you’re missing
  • An improved version of your resume
  • A meme to keep you humble
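
Before diving into each agent, here is roughly how they chain together. Each agent is effectively a function that reads a shared scorecard dict and writes its results back, so the "routing" is just conditional sequencing inside the notebook. The sketch below is illustrative; the function names are not the notebook's exact ones.

from typing import Callable

# Illustrative LangGraph-style routing: each agent takes the shared scorecard
# and returns it enriched, so downstream agents can build on upstream results.
Agent = Callable[[dict], dict]

def run_pipeline(resume_text: str, agents: list[Agent]) -> dict:
    scorecard: dict = {"resume_text": resume_text}
    for agent in agents:
        scorecard = agent(scorecard)
    return scorecard

# Hypothetical usage once the agents below are defined as functions:
# scorecard = run_pipeline(resume_text, [resume_critic_agent, formatter_agent,
#                                        ats_vision_agent, rag_matcher_agent,
#                                        meme_generator_agent])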

🧠 Inside the Agent: ResumeCriticAgent

Purpose:
This agent performs three types of LLM-based reasoning to generate a comprehensive resume evaluation.

Steps:

  • We first read the PDF using the PyPDF2 package:
# Import required libraries
import os
import PyPDF2
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import json
import re
from IPython.display import display, HTML, Image
# import google.generativeai as genai

print("Libraries imported successfully!")

def read_pdf_text(file_path):
    num_pages = 0
    full_path = os.path.join("/kaggle/input", file_path)

    # Let's store the contents in a text string
    text = ""
    with open(full_path, 'rb') as file:
        pdf_reader = PyPDF2.PdfReader(file)
        num_pages = len(pdf_reader.pages)
        for page in pdf_reader.pages:
            text += page.extract_text()

    if not text.strip():
        print("Warning: No text extracted from PDF!")
    else:
        print(f"Successfully read PDF - extracted {len(text)} characters")

    # Collapse newlines and double spaces into single spaces
    cleaned = text.replace('\n', ' ').replace('  ', ' ')
    return cleaned.strip(), num_pages

# Call the function to read the PDF
resume_text, pno = read_pdf_text("anu-resume/Anupama Garani Sheshagiri Resume.docx.pdf")
resume_text[:2000]
pno

After cleaning the extra spaces and carriage returns, we pass the input to zero-shot prompting to get a high-level understanding of the issues with the resume.

  • Zero-Shot Prompting: Runs a no-context Gemini prompt for raw scoring

def zero_shot_prompt(resume_text_cleaned):
    resume_prompt = f"""
You are ResumeCritic, a world-class GenAI career agent. Show all the errors with the resume.
Also fix each section.
Resume:
\"\"\"
{resume_text_cleaned}
\"\"\"
"""

    # `client` is the google-genai client initialized earlier in the notebook
    response = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=resume_prompt)

    return response.text

zero_shot_response = zero_shot_prompt(resume_text)
career_scorecard['zero_shot_feedback'] = zero_shot_response
Markdown(zero_shot_response)
  • Chain-of-Thought Prompting: Prompts Gemini to reason step by step before returning a judgment. Chain-of-thought prompting forces the model to think deeper by saying “Think step by step” and providing a few scenarios. In my case, I provided the following positive and negative examples:
 You are ResumeCritic, a GenAI resume evaluation agent trained to think step-by-step.

You will evaluate a candidate's resume, structured into sections:
- Summary
- Bullet Points
- Skills (based on target job title)
- Tools (based on target job title)
- Projects
- Job Type
- Target Title
---

🧪 For each section, think step by step and follow this process:
1. Carefully read the provided content
2. Apply the scoring rules and best practices
3. Compare against the positive and negative examples
4. Provide a final score (1-10) for the section
5. Give specific feedback - identify exact lines needing improvement and explain why
6. Suggest concrete updates for each issue identified

Be thorough in your analysis and detailed in your feedback.

---

🎯 Scoring Rules:

Based on the best practices for summary,
- **Summary**
- +2 if under 3 lines with 5+ job-aligned keywords
- -1 for buzzwords like "passionate", "dynamic"
- Best Practice: "Keep it factual, job-targeted, and metric-aligned"
- Keep it to 1 line summary

✅ GOOD EXAMPLES:
- "ML engineer with 5+ years experience building recommendation systems that increased revenue by 23% across SaaS platforms."
- "Data scientist specializing in NLP, predictive modeling, and A/B testing with proven 35% accuracy improvements for Fortune 500 clients."
- "Machine learning developer who reduced customer churn by 18% using clustering, Python, and AWS at scale for fintech applications."
- "Results-driven data engineer leveraging Spark, Airflow, and SQL to process 5TB daily data, enabling 40% faster decision-making."
- "Analytics specialist with expertise in TensorFlow, scikit-learn, and dashboard creation that increased marketing ROI by 27%."

❌ BAD EXAMPLES:
- "Passionate data scientist with a dynamic approach to solving complex business problems and a proven track record of success."
- "Hardworking and detail-oriented data professional seeking to leverage my skills and knowledge in a challenging role with opportunity for growth."
- "Innovative and creative thinker with extensive experience in data analysis and a strong background in mathematics and statistics, eager to make an impact."
- "Team player with excellent communication skills who is passionate about data science and machine learning, looking for new opportunities."
- "Detail-oriented data scientist with experience in Python, R, and data analysis with strong analytical and problem-solving skills."

evaluate {summary_text}

Output:

  • Score (1–10)
  • Tone analysis
  • Strengths
  • Gaps
  • Suggestions

In order to store the output in a structured format, I defined a TypedDict schema and constrained the model to use it:

import typing
from typing import List, Literal

class SectionScore(typing.TypedDict):
    score: float
    feedback: List[str]
    updates: List[str]

class ResumeEvaluation(typing.TypedDict):
    summary: SectionScore
    experience: SectionScore
    skills: SectionScore
    tools: SectionScore
    projects: SectionScore
    overall_score: float
    seniority_match: Literal["Under-leveled", "Proper", "Over-leveled"]
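
To actually enforce this schema, the evaluation call passes the TypedDict as a response schema in JSON mode, mirroring the pattern used by the other agents below. The sketch here is illustrative: cot_prompt stands in for the chain-of-thought prompt shown above, and the scorecard key name is an assumption.

# Enforce the ResumeEvaluation schema via Gemini JSON mode. `client`, `types`,
# and `career_scorecard` come from earlier cells; `cot_prompt` is illustrative.
evaluation_response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=cot_prompt,
    config=types.GenerateContentConfig(
        temperature=0,
        response_mime_type="application/json",
        response_schema=ResumeEvaluation,
    ),
)

resume_evaluation = json.loads(evaluation_response.text)
career_scorecard['evaluation'] = resume_evaluation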

Result:
All three evaluation outputs are merged and transformed into a structured JSON scorecard used by the downstream agents.

🧼 Inside the Agent: ResumeFormatterAgent

🧾 Purpose:
Clean up and reformat your resume to be ATS-compliant and easier for hiring managers to scan.

💡 Prompt Techniques Used:

  • Few-shot prompting using examples of “bad” and “good” resume blocks
  • HTML or markdown-style formatting for structured output

📤 Output:

  • Cleaned resume (text or HTML)
  • Removed fluff
  • Rewritten headers
  • Unified layout

🔥 Why it’s cool:
No drag-and-drop builder, no template dependency. This agent restructures your resume like a savvy AI resume consultant.
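
The notebook's exact formatter prompt isn't reproduced in this post, but a minimal few-shot sketch of the idea (a bad block, a good block, and HTML output) might look like this:

# Illustrative few-shot formatting prompt; the real FormatterAgent prompt has
# more examples and stricter layout rules.
formatter_prompt = f"""
You are ResumeFormatter. Rewrite the resume below into clean, ATS-friendly HTML
with standard section headers (Summary, Experience, Skills, Projects, Education).

❌ BAD:
RESPONSIBILITIES:- was involved in making dashboards & reports for mgmt

✅ GOOD:
<li>Built 3 executive dashboards, saving 10+ hours of manual reporting per month</li>

Resume:
\"\"\"
{resume_text}
\"\"\"
Return only the HTML.
"""

formatted_resume_html = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=formatter_prompt,
).text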

👀 Inside the Agent: ATSVisionAgent

🧠 Purpose:
Simulates layout scoring of your resume as if processed by a real ATS system.

💡 Prompt Techniques Used:

  • Gemini prompt that scores based on layout (headings, spacing, section order)
  • Future-ready for Gemini Vision API

# `image_client`, `ats_vision_prompt`, and `img` are defined earlier in the notebook
ats_vision_response = image_client.models.generate_content(
    model="gemini-2.0-flash-exp",
    config=types.GenerateContentConfig(
        temperature=0,
        response_mime_type="application/json",
        response_schema=ATSVisionLayoutFeedback
    ),
    contents=[ats_vision_prompt, img])

📤 Output:

  • ATS Layout Score (1–10)
  • Specific layout issues (spacing, ordering, readability)
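
The ATSVisionLayoutFeedback schema used in the call above isn't shown in the post; based on the outputs listed here, it is roughly a TypedDict along these lines (the field names are an assumption):

# Illustrative reconstruction of the ATSVisionAgent response schema; the
# notebook's actual field names may differ.
class ATSVisionLayoutFeedback(typing.TypedDict):
    layout_score: float        # ATS layout score (1-10)
    layout_issues: list[str]   # spacing, section ordering, readability problems
    suggestions: list[str]     # concrete fixes for each issue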

🔥 Why it’s cool:
No actual vision model is needed — yet. But the system is future-proof and simulates how an ATS views your visual layout.

🧭 Inside the Agent: RAGMatcherAgent

🧲 Purpose:
Find real jobs, analyze your fit, extract skill gaps, and generate custom job search filters.

RAG Agent

💡 Tech & Prompt Techniques:

  • Function calling: Extract job titles, companies, skills
  • Google SERP API to get job listings
  • SentenceTransformers for resume & job embeddings
  • Cosine similarity scoring
  • Gemini prompt to suggest improvements
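
The function-calling step listed above (pulling job titles, companies, and skills out of free text) can be declared along these lines; the schema below is a sketch, not the notebook's exact declaration.

# Illustrative function declaration for entity extraction; field names are assumptions.
extract_entities = types.FunctionDeclaration(
    name="extract_resume_entities",
    description="Extract job titles, companies, skills, and location from a resume.",
    parameters=types.Schema(
        type=types.Type.OBJECT,
        properties={
            "job_titles": types.Schema(type=types.Type.ARRAY, items=types.Schema(type=types.Type.STRING)),
            "companies": types.Schema(type=types.Type.ARRAY, items=types.Schema(type=types.Type.STRING)),
            "skills": types.Schema(type=types.Type.ARRAY, items=types.Schema(type=types.Type.STRING)),
            "location": types.Schema(type=types.Type.STRING),
        },
    ),
)

entity_response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=f"Extract the entities from this resume:\n{resume_text}",
    config=types.GenerateContentConfig(
        tools=[types.Tool(function_declarations=[extract_entities])],
    ),
)

# The model replies with a structured function call rather than free text.
entities = entity_response.function_calls[0].args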

I first used an embedding approach to match data science jobs against the resume. Here is the heatmap that was generated:

Heatmap for job matches with the resume
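
The matching behind that heatmap is straightforward; a minimal sketch with SentenceTransformers and cosine similarity (the candidate title list here is illustrative) looks like this:

from sentence_transformers import SentenceTransformer, util

# Embed the resume and a set of candidate job titles, then score each title by
# cosine similarity against the resume embedding.
model = SentenceTransformer("all-MiniLM-L6-v2")
candidate_titles = ["Data Scientist", "Data Analyst", "ML Engineer", "Data Engineer"]

resume_embedding = model.encode(resume_text, convert_to_tensor=True)
title_embeddings = model.encode(candidate_titles, convert_to_tensor=True)

scores = util.cos_sim(resume_embedding, title_embeddings)[0]
for title, score in sorted(zip(candidate_titles, scores.tolist()), key=lambda x: -x[1]):
    print(f"{title}: {score:.2f}")

# The best-scoring title becomes the target role used in the job search below.
target_role = candidate_titles[int(scores.argmax())]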

The “Data Scientist” title has the highest similarity score (0.84), so I use it to pull live job listings via the SERP API:

from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()

from serpapi import GoogleSearch
import os

serp_api_key = user_secrets.get_secret("SERPAPI_KEY")

def search_real_job_links(role="Data Scientist", location="Austin, TX"):
    params = {
        "engine": "google",
        "q": f'site:greenhouse.io OR site:lever.co "{role}" "{location}"',
        "location": location,
        "hl": "en",
        "gl": "us",
        "api_key": serp_api_key
    }

    search = GoogleSearch(params)
    results = search.get_dict()

    jobs = []
    for result in results.get("organic_results", []):
        if "title" in result and "link" in result:
            jobs.append({
                "title": result["title"],
                "link": result["link"],
                "snippet": result.get("snippet", "")
            })

    return jobs

jobs = search_real_job_links(target_role, career_scorecard['resume_entity_mapped']['location'])
job_links = []
# print(jobs)
for job in jobs[:5]:
    # organic results don't carry a company_name field, hence the None in the output below
    print("✅", job.get("title"), "-", job.get("company_name"))
    print("Link:", job.get("link", "N/A"))
    print("=" * 80)
    job_links.append(job)

✅ Senior Data Scientist - Austin, Texas, United States - None
Link: https://boards.greenhouse.io/roku/jobs/6726915
================================================================================

✅ Job Application for Data Scientist at Base Power Company - None
Link: https://job-boards.greenhouse.io/basepowercompany/jobs/4551163008
================================================================================

✅ Jobs at IntegraFEC - None
Link: https://boards.greenhouse.io/integra
================================================================================

✅ Job Application for Data Scientist at YETI Test Events Job Board - None
Link: https://job-boards.greenhouse.io/yetitestevents/jobs/4001110004
================================================================================

✅ Jobs at IntegraFEC - Internships - None
Link: https://job-boards.greenhouse.io/integrainterns/jobs/4522535008
================================================================================

I then used Gemini together with BeautifulSoup to retrieve details about each job, and stored everything in ChromaDB for RAG.
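
The scraping step itself isn't in the listing below; a rough sketch of what it could look like (fetching each job page, extracting the visible text, and letting Gemini condense it into a short description to embed) is:

import requests
from bs4 import BeautifulSoup

# Sketch of the retrieval step described above, not the notebook's exact code:
# fetch each posting, strip it to visible text, and summarize it with Gemini.
all_jobs = []
for job in job_links:
    page = requests.get(job["link"], timeout=15)
    soup = BeautifulSoup(page.text, "html.parser")
    page_text = soup.get_text(separator=" ", strip=True)[:8000]  # keep the prompt small

    summary = client.models.generate_content(
        model="gemini-2.0-flash",
        contents=f"Summarize this job posting (title, company, required skills):\n{page_text}",
    ).text
    all_jobs.append(f"{job['title']}\n{summary}")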

from google import genai
from google.genai import types
from google.api_core import retry
from chromadb import Documents, EmbeddingFunction, Embeddings

# Define a helper to retry when the per-minute quota is reached.
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

class GeminiEmbeddingFunction(EmbeddingFunction):
    # Specify whether to generate embeddings for documents or for queries
    document_mode = True

    @retry.Retry(predicate=is_retriable)
    def __call__(self, input: Documents) -> Embeddings:
        if self.document_mode:
            embedding_task = "retrieval_document"
        else:
            embedding_task = "retrieval_query"

        response = client.models.embed_content(
            model="models/text-embedding-004",
            contents=input,
            config=types.EmbedContentConfig(
                task_type=embedding_task,
            ),
        )
        return [e.values for e in response.embeddings]


import chromadb

DB_NAME = "googlejobsdb"

embed_fn = GeminiEmbeddingFunction()
embed_fn.document_mode = True

chroma_client = chromadb.Client()
db = chroma_client.get_or_create_collection(name=DB_NAME, embedding_function=embed_fn)
documents = all_jobs

# Step 5: Store all the retrieved jobs in ChromaDB
db.add(documents=documents, ids=[str(i) for i in range(len(documents))])
db.count()

I then implemented RAG over the stored job documents and the resume with the following code:

# Define the schema
class JobEntry(typing.TypedDict):
    title: str
    match_percentage: int
    skills_present: list[str]
    skills_missing: list[str]
    match_analysis: str
    improvement_suggestions: list[str]

class JobMatchResponse(typing.TypedDict):
    jobs: list[JobEntry]
    overall_recommendation: str

# Switch the embedding function to query mode before searching
embed_fn.document_mode = False

# Search the Chroma DB using the specified query.
query = "Show me the closest job matches for my resume"

result = db.query(query_texts=[query], n_results=5)
[all_passages] = result["documents"]

all_passages

query_oneline = query.replace("\n", " ")

job_match_prompt = f"""
Given a question by job seeker : {query_oneline} and the resume {resume_text}
Generate a detailed analysis of how well this candidate's resume matches each Google job. For each job:
1. Calculate a match percentage (0-100%) based on alignment of skills, experience, location, and education
2. Identify skills from the job that are present in the resume
3. Identify important skills from the job that are missing in the resume
4. Explain why this role might be a good or poor fit
5. Suggest specific improvements to make the resume more competitive

Format your response as JSON:
{{
  "jobs": [
    {{
      "title": "Job title",
      "match_percentage": number,
      "skills_present": ["skill1", "skill2"...],
      "skills_missing": ["skill1", "skill2"...],
      "match_analysis": "analysis of fit",
      "improvement_suggestions": ["suggestion1", "suggestion2"...]
    }},
    ...
  ],
  "overall_recommendation": "overall career advice"
}}
"""

# Add the retrieved documents to the prompt.
for passage in all_passages:
    passage_oneline = passage.replace("\n", " ")
    job_match_prompt += f"PASSAGE: {passage_oneline}\n"

# Now call the model
answer = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=job_match_prompt,
    config=types.GenerateContentConfig(
        response_mime_type="application/json",
        response_schema=JobMatchResponse
    )
)

Markdown(answer.text)
data = answer.text
print(data)
Most similar matches based on embeddings

📤 Output:

  • Top 5 job matches
  • Resume-job fit %
  • Strengths aligned
  • Missing skills
  • Suggested improvements
  • Job search URL

🔥 Why it’s cool:
It’s a personalized job-hunting RAG agent that works without a proprietary job board. It adapts to your resume, not the other way around.

🎭 Inside the Agent: MemeGeneratorAgent

🖼 Purpose:
Generate a custom meme that reflects your resume’s tone, score, and career vibe.

💡 Prompt Techniques Used:

  • Emotion-aware meme prompt with embedded tone
  • Style tag (savage, wholesome, motivational)
  • Optional retry logic for image generation

📤 Output:

  • Meme image (.png/.jpeg)
  • Meme caption (in bold markdown)
  • Style tag (used in PDF report)
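
The meme call follows the same image-generation pattern as the ATSVisionAgent client; a hedged sketch (the prompt wording, style handling, and output parsing are illustrative, not the notebook's exact code) could look like this:

# Illustrative meme-generation call; the notebook's exact prompt and style
# handling may differ.
meme_style = "savage"  # could also be "wholesome" or "motivational"
meme_prompt = (
    f"Create a {meme_style} meme image about a job seeker's resume. "
    "Include a short, bold caption that roasts their formatting choices."
)

meme_response = image_client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=meme_prompt,
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],
    ),
)

# Save the first image part returned by the model.
for part in meme_response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("resumeme_meme.png", "wb") as f:
            f.write(part.inline_data.data)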

💡 View the AI-Generated Resume Scorecard

You can view the full structured HTML output generated by ResuMeme.AI here:

Scorecard report, pages 1–3

This scorecard includes:
– Section-wise evaluation of your resume
– Embedded metrics, formatting feedback, and keywords
– Generated by Gemini + custom evaluation logic
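
The PDF version of this report comes from the HTML via WeasyPrint, which takes only a couple of lines; report_html below is a stand-in name for the generated scorecard HTML.

from weasyprint import HTML

# Render the generated scorecard HTML to a PDF report. `report_html` stands in
# for the HTML string produced by the pipeline.
HTML(string=report_html).write_pdf("resumeme_scorecard.pdf")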

And. The. Meme

Meme generated by ResuMeme.AI

🔥 Why it’s cool:
It’s not just for laughs — the meme actually reflects your evaluation. The humor softens the critique and makes it memorable.

🔑 Resume best practices that work

These came from sitting with resumes that were close… but just not getting callbacks. Some were mine. Some were friends’. Some belonged to people like Tina, who showed up ready and just needed the right nudge:

1. Quantify Everything

📊 Don’t say “Worked on dashboards” —
Say “Built 3 dashboards used weekly by execs, saving 10+ hours/month.”

2. Use the “X by Y that resulted in Z” Structure

🧱 Accomplished X by doing Y, which led to Z
✅ “Reduced churn by 15% by personalizing onboarding emails, resulting in $120K ARR boost.”

3. Keep Your Summary to 2–3 Lines

📏 No adjectives. No fluff. No life story.
✅ “Data Scientist with 3 years’ experience in fraud detection and cloud deployments. Skilled in Python, AWS, and LLM fine-tuning.”

4. Action Verbs Only

⚙️ Optimized, launched, deployed, scaled.
Avoid: helped, participated, involved in

5. Core Competencies = Sentences, Not Multi-Column Lists

🛠️ Group your skills by how you use them.
✅ “Skilled in cloud orchestration using AWS Lambda and API Gateway for real-time data pipelines.”

6. Cut Weak Experience Sections

✂️ Tutor roles? Internships from 5 years ago? Remove or consolidate into one bullet.

7. Don’t Let One Role Eat Half the Page

🧹 Substitute teacher? 2 bullets max. Focus on transferable skills.

8. Projects Should Be 2 Lines Max

🔍 One line for the what, one for the impact.
✅ “Built RAG-based chatbot for housing search. Increased match relevance by 25% using Llama3 + FAISS vector store.”

9. Education Should Be One Line

🎓 Clean and tight:

University of X — B.S., CS, 2022

10. Aim for One Page

🧼 Unless you’re a CTO or PhD with patents, 1 page > 2 pages.
ResuMeme.AI will judge your whitespace choices. And so will recruiters.

Why I Built It

I was tired of boring AI projects.
And I was tired of watching people blindly apply to jobs with no idea if their resume even made sense.

This project gave me everything I wanted:

  • Structure
  • Motivation
  • Vibes
  • Feedback
  • A laugh

Ready to Run It?

Let the AI judge you.
Fix your resume.
Find a job.
And laugh a little.

Launch the Notebook on Kaggle

Built in Kaggle.
Powered by Gemini.
Styled by your chaos.

This project was created as part of the Google GenAI Hackathon on Kaggle, where over 5,600 participants worldwide explored the cutting edge of agentic workflows, Gemini APIs, and RAG systems.

ResuMeme.AI emerged as a full-stack, multi-agent system that doesn’t just analyze resumes — it turns them into insight, action… and laughter. Built entirely within a Kaggle notebook, using the techniques taught across Days 1–4 of the bootcamp (from JSON mode to function calling and LangGraph), this system proves that practical AI can be both technically sharp and surprisingly human.

Author Bio

Anupama Garani is an AI Product Strategist and Builder obsessed with agentic workflows, retrieval, and building AI products that actually feel good to use.
She loves mixing serious AI pipelines with a little edge, a little chaos, and a lot of practical value.


Published via Towards AI
