I'm designing a cluster for a small research institute. Since our computations require a large amount of memory, I'm looking for a solution that will give our applications access to the whole memory distributed across the different nodes. The access has to be "transparent", since we don't want to modify the programs we are using, so solutions like RDMA are excluded. For the same reason, transparent access to other resources at different nodes, such as GPGPUs, storage, I/O and CPUs, would also be desirable.
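To make the "transparent" requirement concrete, here is a minimal sketch (plain C, nothing vendor-specific, the size argument is only illustrative). On a single-system-image machine, whether built in hardware or by a software layer, an unmodified program simply allocates and touches memory, and the platform decides which board the pages physically live on. With RDMA, the application itself would have to register buffers and post transfers through libibverbs (ibv_reg_mr, ibv_post_send and so on), which is exactly the code change we want to avoid.

    /* alloc_touch.c - an unmodified program that just malloc()s a large
     * region and touches every page. On an aggregated (SSI) system the OS
     * may back this with memory from other boards; the program neither
     * knows nor cares. With RDMA, by contrast, the application would have
     * to register memory regions and manage remote keys itself.
     *
     * Build: cc -O2 alloc_touch.c -o alloc_touch
     * Usage: ./alloc_touch <GiB>
     */
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <GiB to allocate>\n", argv[0]);
            return 1;
        }
        size_t gib = strtoull(argv[1], NULL, 10);
        size_t bytes = gib << 30;

        char *buf = malloc(bytes);
        if (!buf) {
            perror("malloc");
            return 1;
        }

        /* Touch one byte per 4 KiB page so the pages are actually faulted
         * in and placed by the OS, locally or on a remote board. */
        for (size_t off = 0; off < bytes; off += 4096)
            buf[off] = 1;

        printf("allocated and touched %zu GiB\n", gib);
        free(buf);
        return 0;
    }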

I know there are hardware implementations that connect nodes directly via UPI links between CPUs, such as HPE Superdome and Atos BullSequana. There are also software solutions implementing virtualization for aggregation, like ScaleMP and TidalScale, which connect nodes over an ordinary Ethernet interconnect and add some AI-based memory-usage prediction to improve performance.
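As far as I understand, both classes of solution ultimately present the aggregated machine to an unmodified OS as a single system in which the other boards appear as additional NUMA nodes. Under that assumption (this is only a sketch, not vendor code), an ordinary libnuma query is enough to see what the OS has been given:

    /* numa_topology.c - print the NUMA topology the OS sees. On an
     * aggregated system (Superdome-class hardware or a vSMP/TidalScale-style
     * software layer) the boards of the other physical servers would be
     * expected to show up here as extra NUMA nodes, with no application
     * changes.
     *
     * Build: cc -O2 numa_topology.c -o numa_topology -lnuma
     */
    #include <stdio.h>
    #include <numa.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "NUMA is not available on this system\n");
            return 1;
        }

        int max_node = numa_max_node();
        printf("NUMA nodes visible to the OS: %d\n", max_node + 1);

        for (int node = 0; node <= max_node; node++) {
            long long free_bytes = 0;
            long long total = numa_node_size64(node, &free_bytes);
            printf("node %d: %lld MiB total, %lld MiB free\n",
                   node, total >> 20, free_bytes >> 20);
        }
        return 0;
    }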

A similar question, Alternative to ScaleMP?, was asked here some time ago, but it seems the market has changed drastically since then.

I have two questions:

  1. What is the performance difference between hardware and software solutions, especially in terms of memory-access latency? (A simple way to measure this on a candidate system is sketched after this list.)
  2. Are there any other hardware or software solutions currently available that provide the described functionality?
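Regarding question 1, a rough way to quantify the latency gap on any candidate system is a pointer-chasing microbenchmark: pin the thread to one NUMA node, allocate the buffer on another, and time dependent loads. The sketch below uses only standard libnuma calls (numa_run_on_node, numa_alloc_onnode); the node numbers are placeholders to adapt to the topology the system actually reports. The rough expectation (not a vendor figure) is on the order of a few hundred nanoseconds across a hardware UPI-class fabric versus microseconds or more when a software layer has to fetch a page over the network.

    /* numa_latency.c - pointer-chasing latency probe: run on one NUMA node,
     * allocate the chase buffer on another, and time dependent loads.
     * Comparing a local node against a node backed by another board (or,
     * with software aggregation, another physical server) gives the
     * latency difference asked about in question 1.
     *
     * Build: cc -O2 numa_latency.c -o numa_latency -lnuma
     * Usage: ./numa_latency <run_node> <mem_node>
     */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    #include <numa.h>

    #define N      (1u << 24)   /* 16M pointers = 128 MiB, well beyond the LLC */
    #define CHASES (1u << 22)   /* number of dependent loads to time */

    int main(int argc, char **argv)
    {
        if (argc != 3 || numa_available() < 0) {
            fprintf(stderr, "usage: %s <run_node> <mem_node> (requires libnuma)\n", argv[0]);
            return 1;
        }
        int run_node = atoi(argv[1]);
        int mem_node = atoi(argv[2]);

        if (numa_run_on_node(run_node) != 0) {
            perror("numa_run_on_node");
            return 1;
        }

        /* Allocate the chase buffer on the requested (possibly remote) node. */
        size_t *next = numa_alloc_onnode((size_t)N * sizeof(size_t), mem_node);
        if (!next) {
            perror("numa_alloc_onnode");
            return 1;
        }

        /* Build one random cycle (Sattolo's algorithm) so every load depends
         * on the previous one and the prefetcher cannot hide the latency. */
        for (size_t i = 0; i < N; i++)
            next[i] = i;
        srand(1);
        for (size_t i = N - 1; i > 0; i--) {
            size_t j = (size_t)rand() % i;
            size_t tmp = next[i]; next[i] = next[j]; next[j] = tmp;
        }

        struct timespec t0, t1;
        volatile size_t idx = 0;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (size_t i = 0; i < CHASES; i++)
            idx = next[idx];
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("run node %d, memory node %d: %.1f ns per dependent load\n",
               run_node, mem_node, ns / CHASES);

        numa_free(next, (size_t)N * sizeof(size_t));
        return 0;
    }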
Piotr M
  • "The access has to be "transparent", since we don't want to modify programs we are using, so solutions like RDMA are excluded." - Then I think you're out of luck as I don't believe such systems exist sorry. – Chopper3 Apr 09 '21 at 17:03
  • As far as I understand, the hardware and software solutions I mentioned in the question do have this property, so they do exist :) Those software SSI implementations introduce an abstraction layer and make the fact that the nodes are physically separate invisible to the OS. Am I wrong? – Piotr M Apr 09 '21 at 17:41
  • I genuinely don't believe they run unmodified code. – Chopper3 Apr 13 '21 at 07:25
  • In fact I've just checked all of those products you mentioned - none of them do what you want, and I really don't believe it can be done without major code modification. Now, what I do know works exactly as you need is RDMA, ideally over Infiniband rather than RoCE in my experience, but that absolutely requires the code to be rewritten. I've been building server environments for 31 years, and if such a thing existed where unmodified code could somehow use the memory of another machine, I'd have heard of it, as would the rest of the world, because it would be an enormous company/product. – Chopper3 Apr 13 '21 at 09:45
  • Thank you very much. I have no experience in server administration yet, but please help me be sure that the above products really don't provide the functionality I need, i.e. sharing memory and other resources without modifying the software we are using. As I read on their websites (see the next comments): – Piotr M Apr 13 '21 at 19:28
  • "The vSMP architecture provides cache coherency, shared I/O and the system interfaces (BIOS, ACPI), which are required by the OS. It is implemented in a completely transparent manner; no additional device drivers are required and no modifications to the OS or the applications are necessary. (...) Once loaded into the memory of each of the system boards, vSMP Foundation aggregates the compute, memory and I/O capabilities of each system and presents a unified virtual system to both the OS and the applications running above the OS." (www.scalemp.com/technology/versatile-smp-vsmp-architecture/) – Piotr M Apr 13 '21 at 19:28
  • "TidalScale’s software solution “glues” your commodity servers together so that they function as a single system. The software accomplishes this by aggregating the cores, memory, and I/O of multiple physical servers, virtualizing these resources, and then presenting them as a unified “software-defined server” to the operating system. All of this happens with no changes to applications or operating systems." (www.tidalscale.com) How should I understand it? What am I missing? Please, take a look, as this is very important for our design decisions. – Piotr M Apr 13 '21 at 19:29

0 Answers