Originally published on Towards AI.

Design and Implementation of an LLM-Powered Intelligent Tutoring System for Computer Science Education

Overview:

In this article, I present CS Tutor, an AI-powered tutoring assistant that my teammate and I, both Computer Science students, built for fellow Computer Science students as the final project of our Cloud Computing course. CS Tutor aims to ease studying by helping students clarify complex concepts. The initial version focuses on three core CS topics: Algorithms and Data Structures, Computer Networks, and Operating Systems; other courses can easily be added in later versions. The architecture combines custom prompt engineering with a solid system design, containerized via Docker, and the project was validated through internal user testing with classmates. This article covers the technical details of the project, the development process, the challenges we faced, and advice in case you want to build something similar. You can find the code on GitHub.

Problem Definition and Educational Motivation:

The idea came from our own needs. All of us use AI in our daily lives, but wouldn't it be better if we could personalize it for the courses we take and for our own background? Some Computer Science concepts are tough to grasp (e.g., memory management, semaphores, the TCP handshake), and students may miss lectures, or attend them and still find some concepts unclear. CS Tutor fills this gap by answering questions and by generating multiple-choice quizzes through LLM prompts, based on the questions you have asked and the PDF summaries you have generated. In-context learning, i.e., domain-specific knowledge adaptation, is the feature that makes CS Tutor special. Keeping in mind that all of these resources are free, we consider the overall project a success. Best of all, once the structure is in place, it can be specialized for your own field or even for other grade levels, such as high school. During testing we used our own lecture slides, so we could easily check whether the system worked correctly.

System Architecture and Components:

The system is designed as a modular, containerized web application that uses microservices to handle natural language interaction, caching, database management, and asynchronous task processing. The core components are:

  1. Frontend: User interaction goes through Django [1] templates that present a chat-based interface. The user is welcomed by a Login page, which can easily be switched to a Sign-up page. After logging in, the user lands on the main 'Lectures' page, where they can choose a lecture; selecting one opens a chat screen specific to that lecture.
Lectures Page [2]

2. Backend: The main backend is a Django-based web application packaged in a Docker [3] container. It receives and routes user queries, formats prompts for LLM requests, handles user session state (caching), and manages authentication for a better user experience.

3. LLM Integration (Ollama): The system integrates with Ollama [4]; we chose Gemma 3 1B [5] as the model because it suits our hardware resources and is one of the easiest models to use. The model's responsibilities are:
- conceptual explanations of Computer Science topics
- generating personalized quizzes
- summarizing uploaded documents and answering questions about them
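As an illustration of this integration, here is a minimal sketch of how the backend might call the local Ollama server through its HTTP API. The function names `build_prompt` and `ask_ollama` are mine, and the actual prompts the project uses differ:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_prompt(topic: str, question: str) -> str:
    """Wrap the student's question in a domain-specific instruction so the
    model answers as a tutor for the chosen lecture (in-context learning)."""
    return (
        f"You are a tutor for the Computer Science topic '{topic}'. "
        f"Explain clearly, at an undergraduate level.\n\nQuestion: {question}"
    )

def ask_ollama(prompt: str, model: str = "gemma3:1b") -> str:
    """Send a non-streaming generation request to the local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

The `stream: false` flag makes Ollama return the whole answer in one JSON object, which keeps the backend logic simple at the cost of perceived latency.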

4. Asynchronous Task Management (Celery): A Celery [6] service runs tasks such as quiz generation and PDF summarization in parallel, off the request path.

Main chat screen [7]

5. Database (PostgreSQL): Stores user information, user chats, cached summaries, and quizzes. [8]

6. Cache (Redis): A per-user cache stores previously asked questions and generated quizzes. If the user asks a similar question (judged by semantic similarity), the response is pulled from the cache, which improves response time and memory usage. We used a Jaccard-like keyword-overlap method with two functions: extract_keywords() and keyword_overlap(). extract_keywords() extracts keywords from both the current question and previous questions; keyword_overlap() computes the overlap between the two keyword sets. In effect, the system asks: "which previous question contains words similar to the current one?" If the similarity score exceeds 0.6, it counts as a cache hit and the earlier answer is reused. [9]

Sample code for the keyword-overlap calculation:

import hashlib
import logging
import time

from django.core.cache import cache  # Django cache framework, backed by Redis

logger = logging.getLogger(__name__)

def keyword_overlap(keywords1, keywords2):
    # Jaccard similarity between the two keyword sets
    if not keywords1 or not keywords2:
        return 0.0
    intersection = len(keywords1 & keywords2)
    union = len(keywords1 | keywords2)
    return intersection / union if union > 0 else 0.0

# Exact-match lookup from the surrounding Django view (user_input and
# request come from the view): normalize the question, hash it into a
# per-user cache key, and check Redis before calling the LLM.
start_time = time.time()
normalized_input = user_input.lower().strip()
cache_key = f'model_response:{request.user.id}:{hashlib.md5(normalized_input.encode()).hexdigest()}'
cached_response = cache.get(cache_key)
logger.info(f"Checked exact cache for {cache_key}: {'Hit' if cached_response else 'Miss'}, time: {time.time() - start_time:.3f}s")
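The snippet above shows keyword_overlap() but not extract_keywords(); here is one plausible sketch of the pair, with keyword_overlap() repeated so the snippet runs standalone. The stopword list and tokenization rules are my assumptions:

```python
import re

# Toy stopword list; the project's real list is presumably larger.
STOPWORDS = {"the", "a", "an", "is", "are", "what", "how", "of", "in", "to", "and"}

def extract_keywords(text):
    """Lowercase, tokenize, and drop stopwords and very short tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}

def keyword_overlap(keywords1, keywords2):
    """Jaccard ratio between two keyword sets, as in the snippet above."""
    if not keywords1 or not keywords2:
        return 0.0
    intersection = len(keywords1 & keywords2)
    union = len(keywords1 | keywords2)
    return intersection / union if union > 0 else 0.0

# A similarity above 0.6 counts as a cache hit in the article's scheme.
similar = keyword_overlap(extract_keywords("What is the TCP handshake?"),
                          extract_keywords("Explain the TCP handshake steps"))
# similar == 0.5 here: {"tcp", "handshake"} vs {"explain", "tcp",
# "handshake", "steps"} share 2 of 4 keywords, below the 0.6 threshold.
```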

7. Containerization: All of the above services are containerized with Docker, which keeps the services isolated, simplifies day-to-day use, and eases future deployment plans (e.g., moving to the cloud).

Workflow Overview: The user submits a question via the web UI -> the Django backend preprocesses the request -> if the query asks for a concept explanation, it is routed directly to the LLM; otherwise (PDF summarization, quiz) it is routed to Celery -> the Redis cache is checked before calling the LLM -> the response is returned and stored in PostgreSQL.
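The dispatch step above can be sketched as a single function. The command prefixes and the in-memory dict standing in for Redis are illustrative assumptions, not the project's actual routing rules:

```python
CACHE = {}  # stands in for Redis in this sketch

def handle_query(user_input):
    """Mirror the workflow: cache check -> route to LLM or Celery.
    Returns a (route, answer) pair for illustration."""
    key = user_input.lower().strip()
    if key in CACHE:                      # Redis lookup in the real system
        return ("cache", CACHE[key])
    if key.startswith(("/quiz", "/summarize")):
        return ("celery", "task queued")  # heavy jobs go to the async path
    answer = f"LLM answer to: {key}"      # placeholder for the Ollama call
    CACHE[key] = answer                   # also persisted to PostgreSQL
    return ("llm", answer)
```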

Hardware Requirements: No GPU is needed; the whole project was built without one.

System Architecture Design [10]

Technical Stack (requirements.txt):

- Django 5.1.2 ~ Python web framework

- asgiref 3.8.1, sqlparse 0.5.1 ~ Django dependencies

- whitenoise 6.8.2 ~ static file serving for production deployment

- python-decouple >=3.5 ~ managing environment variables

- dj-database-url 2.2.0 ~ database configuration

- requests 2.32.3, urllib3 2.3.0, idna 3.10, certifi 2025.1.31 ~ managing HTTP requests

- transformers ~ interacting with the LLM

- celery 5.3.6 ~ background tasks

- redis 5.0.1 & django-redis 5.4.0 ~ caching backend

- psycopg2-binary 2.9.9 ~ PostgreSQL adapter for Python

- PyPDF2 3.0.1, pdfplumber 0.10.4 ~ text extraction from PDFs

Evaluation and Challenges:

The most exhausting part to test was the PDF feature. We tested with a text-only PDF (no images) because the AI was initially confused by images: whenever a PDF contained an image, the system immediately threw errors, so I focused on that issue first. While trying one of my lecture slide decks, which was full of images and had almost no text, I also noticed that our libraries have no OCR support. After fixing the image issue, the initial version took roughly 5 minutes to process a PDF and generate a summary, so the next problem was optimization. Tuning the response time was a journey of trial and error; we did not know the best trade-off between response time and response quality. During these attempts we found that the pdfplumber [11] library performs better than PyPDF2 [12], so we switched. The final version takes 20–40 seconds, depending on PDF size. We managed to get it down to 10 seconds, but rejected that version because of poor response quality (e.g., summaries consisted of random words).
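A minimal sketch of pdfplumber-based extraction that tolerates image-only pages (the source of the crashes described above). The function names are mine, not the project's:

```python
def extract_pdf_text(path):
    """Pull text page by page with pdfplumber; extract_text() returns None
    for image-only pages, which the first version did not handle."""
    import pdfplumber  # imported lazily; third-party dependency

    pages = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            pages.append(page.extract_text())
    return join_pages(pages)

def join_pages(pages):
    """Drop None/empty pages (e.g., pure-image slides) before joining."""
    return "\n".join(p for p in pages if p)
```

Without OCR support, a slide deck that is all images still yields an empty string here; detecting that case early lets the app warn the user instead of summarizing nothing.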

Docker-compose file:

services:
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/app
    ports:
      - "8000:8000"
    environment:
      - PYTHONUNBUFFERED=1
      - DATABASE_URL=postgresql://myuser:mypassword@postgres:5432/mydb
    depends_on:
      - ollama
      - redis
      - postgres
    networks:
      - app-network

  ollama:
    image: ollama/ollama:latest
    command: serve
    volumes:
      - ollama-data:/root/.ollama
    ports:
      - "11434:11434"
    networks:
      - app-network

  redis:
    image: redis:latest
    command: redis-server --requirepass mysecretpassword --maxmemory 100mb --maxmemory-policy volatile-lru
    ports:
      - "6379:6379"
    networks:
      - app-network

  celery:
    build: .
    image: celery:latest
    command: celery -A ai_project worker --loglevel=info
    volumes:
      - .:/app
    depends_on:
      - redis
      - ollama
      - postgres
    environment:
      - PYTHONUNBUFFERED=1
      - DATABASE_URL=postgresql://myuser:mypassword@postgres:5432/mydb
    networks:
      - app-network

  postgres:
    image: postgres:latest
    environment:
      - POSTGRES_USER=myuser
      - POSTGRES_PASSWORD=mypassword
      - POSTGRES_DB=mydb
    volumes:
      - postgres-data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    networks:
      - app-network

volumes:
  ollama-data:
  postgres-data:

networks:
  app-network:
    driver: bridge

Future Work & Research Directions:

We had approximately 3 months to complete this project, but during those months we had other courses to focus on as well; since it was a university project, time was limited and the pressure real. With more time, we planned to add more courses, but the one feature we most want to build is a section that takes your syllabus as input and produces a personalized study plan; that would greatly enhance CS Tutor's functionality. We could also redesign the home page to look more like a blog, creating a more interactive environment for students.

The technical roadmap would be: knowledge tracing + student modeling -> embedding-based semantic search -> domain-specific fine-tuning with university Q&A pairs -> OCR integration + multimodal LLMs.
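For the embedding-based step, cache matching would compare dense vectors instead of keyword sets. A minimal cosine-similarity helper, shown here on hypothetical embedding vectors (a real system would obtain them from a sentence-embedding model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors; 1.0 means
    identical direction, 0.0 means orthogonal (unrelated)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Unlike keyword overlap, this would match paraphrases ("explain deadlocks" vs. "why do processes freeze?") whose surface words barely intersect.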

I can surely say that this project highly motivated us to make more projects in the AI field. This experience made me realize how much I love working with AI.

Broader Impact:

CS Tutor gives students easy access to resources beyond lecture hours. It promotes equality of opportunity for students who learn on their own, and technical terms can be picked up easily with natural-language support. Learning is personalized through content generated from the documents a student uploads and the questions they ask, so the system adapts to different learning styles. Long research sessions are shortened by the system's fast response time, and professors can use these features as well.

Accuracy checking and cited sources play an important role in case incorrect or incomplete information is produced. Protecting personal data (e.g., user chat history and PDF contents) is also critical.

References

[1] Django Software Foundation. Django Documentation. Available at: https://docs.djangoproject.com/

[2] Background Patterns. Available at: https://heropatterns.com/

[3] Docker Inc. Docker Documentation. Available at: https://docs.docker.com/

[4] Ollama. Ollama Documentation. Available at: https://ollama.com/

[5] Google. Gemma Model Card. Available at: https://ai.google.dev/gemma

[6] Celery Project. Celery Documentation. Available at: https://docs.celeryq.dev/

[7] CSS Cards Design. Available at: https://uiverse.io/ahmed150up/afraid-octopus-75

[8] PostgreSQL. PostgreSQL Documentation. Available at: https://www.postgresql.org/docs/

[9] Redis. Redis Documentation. Available at: https://redis.io/documentation

[10] Draw.io. Diagramming Tool. Available at: https://draw.io/

[11] Pdfplumber. PDFplumber Documentation. Available at: https://github.com/jsvine/pdfplumber

[12] PyPDF2. PyPDF2 Documentation. Available at: https://pypdf2.readthedocs.io/
