Towards AI Tested Launchpad by Latitude.sh: A Container-based GPU Cloud for Inference and Fine-Tuning
Author(s): Towards AI Editorial Team
Originally published on Towards AI.
At Towards AI, we're particularly interested in how Latitude.sh's platform supports machine learning models, especially for inference and fine-tuning large language models (LLMs). Our team recently explored their latest offering, Launchpad: a container-based solution that provides stable, adaptable access to dedicated cutting-edge GPUs in the cloud, such as the H100. It was introduced to the public as a beta release in January this year. Launchpad's standout feature is its dedicated container-based GPUs, capable of handling the significant computational demands of AI workloads. This specialized offering is designed to provide a dependable, flexible, and high-performance environment for artificial intelligence projects.
With its seamless Docker integration, Launchpad is an excellent option for configuring and replicating development environments on servers. This flexible service is well suited to quickly deploying popular open-source LLMs, fine-tuning models on custom datasets, and hosting backend APIs or entire websites.
Launchpad is a promising step towards keeping artificial intelligence open, accessible, and transparent by providing developers with the tools to quickly adapt and deploy open-source AI models. We believe the service is a robust offering that gives users access to high-end hardware at fair prices, with options that include the NVIDIA H100 with 80GB of memory at $2.10/hr or the L40S with 48GB of memory at $1.32/hr. Each instance comes with 14 CPUs and 185GB of RAM. The deployment interface lets you configure the Docker image, model name, Docker command, network ports, persistent storage mounts, and custom environment variables (see the sketch below). Users can save configurations as blueprints for quick redeployment and scaling. The service's popularity within the community stems largely from this fair pricing, as the reviews on G2 attest.
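To make the shape of such a deployment concrete, here is a minimal sketch of the settings the interface captures, written as a plain Python dictionary. The field names, image tag, and instance labels are illustrative assumptions, not the platform's actual schema.

# Illustrative only: field names and values are placeholders, not Launchpad's real schema.
launchpad_blueprint = {
    "name": "llama2-inference",                # model / container name
    "docker_image": "myorg/llama2-api:latest", # hypothetical Docker image tag
    "command": "python serve.py",              # command the container runs
    "ports": [8080],                           # network ports to expose
    "persistent_storage": "/data",             # mount point for persistent storage
    "env": {                                   # custom environment variables
        "MODEL_NAME": "llama-2-7b-chat",
        "MAX_NEW_TOKENS": "512",
    },
    "instance": "L40S-48GB",                   # or "H100-80GB"
}

Saving a configuration like this as a blueprint is what allows the same environment to be redeployed or scaled with a couple of clicks.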
Hands-on Experience with Launchpad by Towards AI
After spending a few days with the platform, a few features stood out for us. While their dedicated server offerings with high-performance configurations are impressive (including instances with up to 8xH100 80GB GPUs), what really captured our interest is their new service, Launchpad. As previously mentioned, Launchpad simplifies server setup through Docker images, clearly aiming to facilitate the creation of production environments directly from an image. The setup process is impressively quick, with environments ready to go in seconds. Additionally, it's user-friendly, offering options to configure environment variables and choose between the two server sizes mentioned above (H100 80GB or L40S 48GB).
We first tested the official inference image from the Latitude team, which offers an inference API for the LLaMA 2 model. Spawning a LLaMA 2 endpoint in a matter of seconds, without writing any code, was a great experience. The endpoint receives POST requests with a prompt and returns the model's response (see the sketch below). We tested both instance sizes and observed low latency in receiving responses with each. While both types are highly effective, the H100 instances are better equipped to handle higher workloads or more powerful models. Since the image is ready to use, you can download it from the Docker library and customize it as needed. For instance, you can add support for the latest LLaMA model, implement simple authentication, or enable SSH access.
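As a rough idea of what calling such an endpoint looks like, here is a minimal Python sketch. The URL, port, path, and JSON field names are assumptions for illustration; substitute whatever your deployed container actually exposes.

import requests

# Placeholder URL and payload shape; adjust to your Launchpad deployment.
ENDPOINT = "http://<your-instance-ip>:8080/generate"

payload = {"prompt": "Explain what fine-tuning an LLM means in one sentence."}

# Send the prompt and print the model's response.
response = requests.post(ENDPOINT, json=payload, timeout=60)
response.raise_for_status()
print(response.json())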
Next, we tried switching to another image from their pre-defined library, designed for fine-tuning. This image arrives pre-installed with all the necessary packages, enabling you to start the fine-tuning process immediately. (Depending on your use case, it is also more than capable of training smaller models.) It includes Jupyter Notebook and SSH capabilities, allowing you to code directly in the notebook or SSH into the instance to run your scripts. We also tested one of our fine-tuning scripts, and the process started flawlessly. These pre-defined Docker images can save you considerable time, effort, and money. Additionally, you have the option to use your custom image to set up the server instance.
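For readers who want a sense of what running a fine-tuning job on such an instance involves, here is a minimal LoRA fine-tuning sketch, assuming the pre-installed image ships recent versions of transformers, peft, and datasets (our own scripts differ, and the model ID, dataset path, and hyperparameters below are placeholders).

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; gated model, requires access approval
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Wrap the base model with small trainable LoRA adapters instead of updating all weights.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Load and tokenize a custom dataset (placeholder path; expects a "text" field).
dataset = load_dataset("json", data_files="my_dataset.jsonl", split="train")
dataset = dataset.map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=512),
    remove_columns=dataset.column_names,
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=2,
                           num_train_epochs=1, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

A script along these lines can be run either from the bundled Jupyter Notebook or over SSH on the instance.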
Launchpad is ideal for fine-tuning tasks or creating API endpoints for serving LLMs. It is an excellent option for deploying applications with Docker, eliminating the hassle of configuring and replicating your development environment on a server. It is a flexible service that can host backend APIs (or entire websites) and provide the resources needed to fine-tune models on your custom datasets. With up to 1.5 TB of storage capacity, you won't have to worry about data storage constraints.
More About Latitude.sh
Latitude.sh, founded in 2001, offers a suite of cloud computing products across diverse industries. With customizable bare metal servers, robust APIs, and an extensive array of features, including 20 TB of free bandwidth per server, bandwidth pooling, DDoS protection, isolated IPv4/IPv6 addresses, rapid 5-second deployments, and 24/7 support, the company is pushing the boundaries of cloud services.
A Look at Latitude.sh's Three Key Cloud Infrastructure Products
Latitude.sh offers three specialized compute product services tailored to different workloads and use cases: Metal, Accelerate, and Launchpad. Their Metal product line provides bare metal instances suitable for a wide range of use cases, from general-purpose computing tasks to highly demanding computational workloads, such as deploying mission-critical enterprise applications. The Accelerate product provides dedicated instances equipped with the latest and most powerful GPUs, making it a top choice for resource-intensive machine learning workloads, including inference, fine-tuning, and model training. Last but not least, there is Launchpad, which is currently in public beta.
Metal instances are designed to simplify server deployment, management, and programming, and they come packed with features aimed at enhancing efficiency. Notable deployment features include rapid server provisioning with 5-second deployments, a rescue mode for server recovery, custom image deployment, out-of-band management, multiple hardware configurations, SSD/NVMe disks, and GPU instances. Post-deployment support includes DDoS protection, bandwidth monitoring, private networking, and an upcoming elastic IP feature.
Latitude.sh also provides a comprehensive suite of tools for programming your instances, including various APIs and SDKs for practical server management. Businesses using Latitude.sh also benefit from Terraform, an Infrastructure as Code (IaC) tool, to programmatically manage their Metal cloud infrastructure. This enables automated deployments and ensures consistent infrastructure across projects through code-based configuration and version control. You can read more about their features here.
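To illustrate the programmatic side, here is a minimal Python sketch of listing servers through a REST API. The base URL, endpoint path, auth header, and response shape are assumptions based on common REST conventions; consult Latitude.sh's API documentation for the actual schema.

import os
import requests

# Illustrative only: endpoint path, auth scheme, and response fields are assumptions.
API_BASE = "https://api.latitude.sh"
headers = {"Authorization": f"Bearer {os.environ['LATITUDE_API_TOKEN']}"}

# List existing servers so a deployment script can decide what to provision next.
resp = requests.get(f"{API_BASE}/servers", headers=headers, timeout=30)
resp.raise_for_status()
for server in resp.json().get("data", []):
    print(server.get("id"), server.get("attributes", {}).get("hostname"))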
Accelerate offers a variety of high-performance server designs tailored to the demands of machine learning and artificial intelligence projects. The lineup includes a broad range of GPUs, such as the H100, A100, L40S, and the Grace Hopper Superchip. The H100 80GB GPUs are available in small, medium, and large instances with 1, 4, and 8 GPUs, respectively, making them suitable for a wide array of computational tasks. Additional options include instances with 8xA100 80GB, 8xL40S 48GB, and a dual Grace Hopper Superchip. These powerful nodes excel at complex tasks such as simulations, graphic rendering, machine learning training, fine-tuning, and inference.
Launchpad is their new container-based product that provides a stable, adaptable, and high-performance environment for fine-tuning and deploying machine learning models. Its key strength is the availability of dedicated, container-based, cutting-edge GPUs in the cloud, making it ideal for the high computational requirements of artificial intelligence projects, particularly fine-tuning and inference of LLMs. The service offers a variety of containers to assist in the creation and deployment of machine learning models. These containers come with pre-defined environments that simplify server setup for tasks such as inference APIs, fine-tuning, and text-generation UIs, among others.
Final Thoughts
Overall, testing the platform was a refreshing experience compared with complex custom deployments and fine-tuning setups, showcasing how quickly we could move from an image to a production environment. Their products are affordable, suit various needs, and are backed by 24/7 assistance. Whether you require dedicated servers for resource-intensive tasks like training models from scratch or fine-tuning open-source large language models (LLMs), or you're looking to deploy your applications quickly with their newly released container-based service, Launchpad, Latitude.sh has you covered.
Join thousands of data leaders on the AI newsletter. Join over 80,000 subscribers and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI