“The problem is, if you’ve got a roadblock on the other end of the wire, then Ultra Ethernet isn’t efficient at all,” Metz explained. “When you start to piece together how the data moves through buffers, both in and out of a network, you start to realize that you’re piling up problems if you don’t have an end-to-end solution.”
Storage.AI targets these post-network optimization points rather than competing with networking protocols. The initiative focuses on data-handling efficiency after packets reach their destinations, ensuring that investments in advanced networking translate into measurable application performance gains.
AI data typically resides on separate storage networks rather than on the high-performance fabrics connecting GPU clusters. File and Object over RDMA specifications within Storage.AI would enable storage protocols to operate directly over Ultra Ethernet and similar fabrics, eliminating the network traversal inefficiencies that force AI workloads across multiple network boundaries.
“Right now, the data is not on Ultra Ethernet, so we’re not using Ultra Ethernet at all to its maximum potential to be able to get the data inside a processor,” Metz noted.
Why AI workloads break traditional storage models
AI applications challenge assumptions about data access patterns that network engineers take for granted.
Metz noted that machine learning pipelines consist of distinct phases, including ingestion, preprocessing, training, checkpointing, archiving and inference. Each of these phases requires different data structures, block sizes and access methods. Current architectures force AI data through multiple network detours.
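To make the mismatch concrete, here is a minimal Python sketch of the pipeline phases Metz lists, each annotated with an illustrative I/O profile. The access patterns and block sizes are hypothetical placeholders chosen for illustration, not figures from the article or from any benchmark.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PhaseProfile:
    """Illustrative I/O characteristics for one ML pipeline phase."""
    access_pattern: str   # dominant access method (assumed, typical)
    block_size_kib: int   # typical request size in KiB (hypothetical)

# Hypothetical profiles for the six phases named in the article.
PIPELINE = {
    "ingestion":     PhaseProfile("sequential write", 1024),
    "preprocessing": PhaseProfile("mixed read/write", 128),
    "training":      PhaseProfile("random read", 256),
    "checkpointing": PhaseProfile("sequential write", 4096),
    "archiving":     PhaseProfile("sequential write", 8192),
    "inference":     PhaseProfile("random read", 64),
}

# A storage layer tuned for one phase (large sequential checkpoint
# writes) serves another (small random inference reads) poorly, which
# is why a single one-size-fits-all data path struggles.
distinct_patterns = {p.access_pattern for p in PIPELINE.values()}
print(sorted(distinct_patterns))
```

The point of the sketch is simply that no single tuning of block size and access method covers all six phases, so data ends up detouring through layers optimized for someone else's workload.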
