Fine-Tuning DeepSeek-VL2 for Multimodal Instruction Following: A Comprehensive Technical Guide
Author(s): Ojasva Goyal. Originally published on Towards AI. Fine-tuning large-scale vision-language models with detailed error breakdowns and best practices. Unlocking advanced vision-language capabilities with parameter-efficient adaptation. (Image by Alex Shuper on Unsplash.) Introduction. DeepSeek-VL2 is a multimodal large language model (MLLM) capable …
CUDA vs cuDNN: The Dynamic Duo That Powers Your AI Dreams
Author(s): Ojasva Goyal. Originally published on Towards AI. The secret sauce has a name, or actually two names: CUDA and cuDNN. (Image by Kevin Ache on Unsplash.) The Superhero Origin Story. Picture this: It’s 2006, and NVIDIA realizes their graphics cards have …