Upbound open-sources Modelplane to optimize inference clusters
This post first appeared on SiliconANGLE.
Upbound Inc. today released Modelplane, a new open-source tool for managing artificial intelligence inference clusters.

San Francisco-based Upbound is backed by $69 million from Alphabet Inc.’s GV fund, Intel Capital and others. It’s best known as the creator of Crossplane, an open-source infrastructure management engine. Crossplane is an upgraded version of the Kubernetes control plane, a part of the framework that automates key tasks such as provisioning servers. The Kubernetes control plane is designed to manage container clusters. Crossplane, in contrast, can also coordinate other types of infrastructure. Additionally, the software includes extensibility features that enable developers to customize it for specific use cases.

Modelplane, the new open-source tool that Upbound debuted today, is a version of Crossplane optimized for AI inference workloads. One of the tasks that the tool promises to ease is spreading inference workloads across multiple clouds. In the past, that approach was difficult to implement because each cloud platform had to be managed separately. Modelplane eases the workflow by enabling developers to centrally configure infrastructure resources across multiple platforms. The tool automatically determines which workload should run on which cloud.

When the request volume processed by an AI model increases, Modelplane adds capacity by spinning up new replicas. Those are identical copies of the neural network deployed on different instances.

The servers that run an AI model often keep its weights in a remote storage system. When a user enters a prompt, the weights have to be loaded from the remote storage into the servers’ built-in memory, which slows down processing. Modelplane includes a distributed caching feature that stores weights on server clusters’ local storage to reduce response times.

According to Upbound, the tool doesn’t send user requests directly to inference servers but rather routes them through a gateway.
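To make the weight-caching idea above concrete, here is a minimal Python sketch of a cache-first loader. It is not Modelplane's implementation, which the article does not describe in detail. The `WeightCache` class, the file name `demo-model.bin` and the use of a local directory as a stand-in for remote storage are all assumptions made for illustration.

```python
from pathlib import Path
import shutil
import tempfile


class WeightCache:
    """Hypothetical cache-first loader: check local disk before remote storage."""

    def __init__(self, remote_dir: Path, cache_dir: Path) -> None:
        self.remote_dir = remote_dir  # stand-in for a remote storage system
        self.cache_dir = cache_dir    # stand-in for server-local storage
        self.remote_fetches = 0       # counts slow-path fetches, for illustration
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def load(self, model_name: str) -> bytes:
        local_path = self.cache_dir / model_name
        if not local_path.exists():
            # Cache miss: copy the weights from "remote" storage once.
            shutil.copyfile(self.remote_dir / model_name, local_path)
            self.remote_fetches += 1
        # Cache hit, or freshly populated: read from local storage.
        return local_path.read_bytes()


def demo() -> tuple[bytes, bytes, int]:
    """Load the same weights twice and report how often remote storage was hit."""
    with tempfile.TemporaryDirectory() as tmp:
        remote = Path(tmp) / "remote"
        remote.mkdir()
        (remote / "demo-model.bin").write_bytes(b"\x00\x01\x02")
        cache = WeightCache(remote, Path(tmp) / "cache")
        first = cache.load("demo-model.bin")
        second = cache.load("demo-model.bin")
        return first, second, cache.remote_fetches
```

In this sketch the second `load` call is served from local storage, so only one remote fetch occurs. A real distributed cache would also need eviction, integrity checks and sharing across nodes, none of which is modeled here.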
The gateway is a component that ensures prompts comply with cybersecurity and cost-efficiency requirements. Additionally, the gateway doubles as a disaster recovery tool: It can route requests to an external inference environment when there’s an outage.

“We’ve been watching Crossplane adopters build inference platforms across clusters and operate it at large scale, composing the clusters, the GPUs, the serving stacks and the routing into their own control planes,” Upbound founder and Chief Executive Officer Bassam Tabbara wrote in a blog post today. “We wanted to standardize those patterns, make them far easier to get started with, and contribute the result back to the community as open infrastructure.”

Modelplane is available on GitHub under an Apache 2.0 license.
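The gateway behavior described earlier in this article, a policy check followed by failover to an external environment during an outage, can be sketched in a few lines of Python. This is an illustrative assumption, not Modelplane's actual API: the `Gateway` class, the `BackendUnavailable` exception, the `max_prompt_chars` limit and both backend functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable

# A backend is modeled as any function that turns a prompt into a response.
Backend = Callable[[str], str]


class BackendUnavailable(Exception):
    """Raised by a backend to signal an outage (hypothetical)."""


@dataclass
class Gateway:
    primary: Backend
    fallback: Backend
    max_prompt_chars: int = 2000  # hypothetical stand-in for a cost policy

    def handle(self, prompt: str) -> str:
        # Policy check: reject prompts that exceed the configured limit.
        if len(prompt) > self.max_prompt_chars:
            raise ValueError("prompt exceeds the configured size limit")
        try:
            return self.primary(prompt)
        except BackendUnavailable:
            # Outage: route the request to the external environment instead.
            return self.fallback(prompt)


def failing_primary(prompt: str) -> str:
    """Simulates an inference cluster that is currently down."""
    raise BackendUnavailable("primary cluster is unreachable")


def external_backend(prompt: str) -> str:
    """Simulates an external inference environment."""
    return f"external: {prompt}"
```

Calling `Gateway(failing_primary, external_backend).handle("hello")` returns the fallback's response because the primary raises `BackendUnavailable`. A production gateway would add authentication, rate limiting and health checks, which this sketch leaves out.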