From Résumés(PDFs) to Clean Data: Layout-Aware Parsing with Tiny LLMs
Author(s): Florian June Originally published on Towards AI. Have you ever wondered how to parse a résumé, or had to work with résumés on the job? This article will give you some useful insights. Figure 1: Overview of the layout-aware, LLM-powered resume …
DecEx-RAG: A Paradigm Shift from Outcome to Process in Agentic RAG
Author(s): Florian June Originally published on Towards AI. Have you encountered Agentic RAG in your work or research? Today we will look at the progress of Agentic RAG. Figure 1: Illustration of the framework for DecEx-RAG, which demonstrates the process of search …
TURA: Unifying RAG and Agents to Revolutionize AI Search
Author(s): Florian June Originally published on Towards AI. AI Innovations and Insights 70 Standard RAG systems are starting to show their limits. Figure 1: Demonstration of TURA’s agentic capabilities. Given a query on July 31, 2025: (a) TURA autonomously utilizes a tool …
How to Turn RAG into an “Information Sieve” — AI Innovations and Insights 68
Author(s): Florian June Originally published on Towards AI. This is Chapter 68 of this insightful series! Figure 1: Motivation and Overview: Left: Compositional queries are hard to answer …
Model Context Protocol (MCP): Foundation for AI or a Looming Risk? — AI Innovations and Insights 37
Author(s): Florian June Originally published on Towards AI. Welcome to the 37th installment of this elegant series. Model Context Protocol (MCP) was introduced by Anthropic in 2024 to tackle the complexity and scalability challenges of manually connecting AI applications to external APIs. …
O1 Replication Journey Part 2: Let a Great Teacher Guide Students
Author(s): Florian June Originally published on Towards AI. In my view, any kind of learning boils down to two key elements: training data and training methods. For enhancing LLM reasoning …
AI Innovations and Insights 27: OCR Hinders RAG and RAGChecker
Author(s): Florian June Originally published on Towards AI. This article is the 27th in this mind-expanding series. Today, we will explore two enlightening topics in AI, which are: OCR Hinders …
AI Innovations and Insights 23: KAG, AlphaMath, and Offloading
Author(s): Florian June Originally published on Towards AI. This article is the 23rd in this compelling series. Today, we will explore three intriguing topics in AI, which are: KAG: Brilliant …
AI Innovations and Insights 16: ASSISTRAG
Author(s): Florian June Originally published on Towards AI. The Dual RAG Engine of Thinking and Memory. This article is the 16th in this promising series. Today, we will explore the …
Unveiling LLM-Enhanced Search Technologies
Author(s): Florian June Originally published on Towards AI. Principles, Key Features and Insights. In the internet age, the explosion of information has created a growing need for efficient content retrieval. …
Let AI Instantly Parse Heavy Documents: The Magic of MPLUG-DOCOWL2’s Efficient Compression
Author(s): Florian June Originally published on Towards AI. Today, let’s take a look at one of the latest developments in PDF Parsing and Document Intelligence. In our digital age, the …
Unlocking Key Technologies in Document Parsing
Author(s): Florian June Originally published on Towards AI. A Comprehensive Guide with Insights. A large number of documents (including technical documentation, historical records, academic publications, and legal files) …
Key Insights and Best Practices on Instruction Tuning
Author(s): Florian June Originally published on Towards AI. Recently, I’ve been involved in projects related to instruction tuning for large language models (LLMs). I felt it was time to summarize some …
Teaching RAG to “Remember”: How MemoRAG Enhances Question-Answering Through Memory
Author(s): Florian June Originally published on Towards AI. Underlying Principles, Source Code, and Insights. Existing RAG systems are limited in handling complex or ambiguous information needs that cannot be directly …
Demystifying PDF Parsing 05: Unifying Separate Tasks into a Small Model
Author(s): Florian June Originally published on Towards AI. Mechanics, Code, Insights on GOT, DLAFormer, and UNIT. This article is the fifth in the series. The previous articles introduced several mainstream …