Big Data Cluster

Avoid Storage Bottlenecks and Improve Analytics

Don’t Let Legacy Storage Stop You from Delivering Results

Workloads that extract value from massive data sets with accelerated computing (HPC or AI/ML), while highly desirable, can suffer from storage bottlenecks and poor performance. Even all-flash deployments face added challenges when built on DAS or NAS. The Big Data Cluster removes those bottlenecks with a shared pool of NVMe over Fabrics (NVMe-oF) storage that enables jobs to run up to 10x faster, while an S3-compliant tier helps you control costs.
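
As an illustration of the S3-compliant tier, here is a minimal sketch using Python's boto3 client. The endpoint URL, bucket name, credentials, and file paths are placeholders; the example assumes only that the tier exposes a standard S3 API:

    import boto3

    # Connect to the cluster's S3-compliant tier. The endpoint URL and
    # credentials below are placeholders for your deployment's values.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.bigdata.example.com",  # hypothetical endpoint
        aws_access_key_id="ACCESS_KEY",
        aws_secret_access_key="SECRET_KEY",
    )

    # Park a cold artifact on the cost-optimized S3 tier...
    s3.upload_file("results/checkpoint.bin", "archive", "checkpoint.bin")

    # ...and pull it back when a job needs it again.
    s3.download_file("archive", "checkpoint.bin", "scratch/checkpoint.bin")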

Ideal Use Cases

  • Business Intelligence
  • Real-time HPC Analytics
  • ML Training
  • Predictive Analytics
  • Data Visualization

Relevant Industries

  • Oil & Gas
  • Financial Services
  • Life Sciences
  • Media & Entertainment
  • Aerospace & Defense

Powerful Compute, Optimized Storage

  • Software-defined WekaFS (NVMe over Fabrics)
  • Optional S3 Tier
  • Scalable with 6+ servers
  • Ideal for HPC & AI training
  • EDSFF enterprise SSD storage
  • AMD EPYC processors with PCIe 4.0 support

Inside the Big Data Cluster

Compute

AMD EPYC Processors

Networking

NVIDIA Networking Ethernet and InfiniBand

Storage

425TB of Capacity per Node

Software-defined storage using Weka.io file system (WekaFS), optimized for large datasets

High-speed, low-latency NVIDIA adapters

About Weka Software-Defined Storage

Weka.io’s key offering, WekaFS, is an alternative to parallel file systems such as IBM Spectrum Scale (GPFS) and Lustre. Legacy storage designs forced customers to deploy a different architecture for each class of workload; WekaFS was built from the ground up to address the diverse requirements of modern workloads in a single system. It lets clients pool all their data and manage it through one global namespace, dramatically simplifying storage administration so that Big Data Cluster users can access and manage data at scale and deliver better outcomes.
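
Once mounted on a compute node, WekaFS presents this global namespace as an ordinary POSIX path, so any job can work with pooled data using standard file I/O. A minimal Python sketch follows; the mount point, directory layout, and file names are hypothetical:

    from pathlib import Path

    # Mount point for the WekaFS global namespace (hypothetical; in practice
    # the Weka client presents the pooled filesystem as a POSIX path).
    NAMESPACE = Path("/mnt/weka")

    # Every node sees the same pooled data, so a training or analytics job
    # can enumerate a shared dataset with standard file I/O.
    shards = sorted((NAMESPACE / "datasets" / "train").glob("*.parquet"))
    print(f"{len(shards)} shards visible in the global namespace")

    # Read directly from shared storage; no per-node staging or copying.
    if shards:
        with open(shards[0], "rb") as f:
            header = f.read(4096)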