I am trying to configure and test BeeGFS with RDMA as explained in:

https://community.mellanox.com/s/article/howto-configure-and-test-beegfs-with-rdma?t=1570613300675

My test configuration:

OS: Ubuntu 16.04 on both servers (kernel version 4.15.0-65-generic)
OFED: MLNX_OFED_LINUX-4.6-1.0.1.1
BeeGFS version: 1.7.3 (latest)
Adapter: ConnectX-3 VPI

Servers: Two similar server systems (128 GB RAM each). One acts as the BeeGFS server and the other as the BeeGFS client. In the example below, systems with 2x Intel Xeon CPU E5-2697v2 (Ivy Bridge) are used.

Everything works up until the point I try to rebuild the client. The rebuild process gives me two warnings:

CC [M]  /opt/beegfs/src/client/client_module_7/build/../source/common/net/sock/RDMASocket.o
/bin/sh: 1: [: 0005: unexpected operator

CC [M]  /opt/beegfs/src/client/client_module_7/build/../source/common/net/sock/ibv/IBVSocket.o
/bin/sh: 1: [: 0005: unexpected operator
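For context on the warning itself: on Ubuntu, /bin/sh is dash, and dash's `[` prints "[: X: unexpected operator" when it is handed a comparison operator it does not support, such as the bash-style `==`. Whether that is what the BeeGFS build does here is an assumption; the log does not show the failing command. A minimal sketch of the portable form, using the value 0005 from the warning purely as an illustration:

```shell
#!/bin/sh
# Illustrative only: 0005 is taken from the warning text; where the BeeGFS
# build actually compares it is not visible in the log.
ver=0005

# Portable string comparison: POSIX test uses a single "=".
if [ "$ver" = "0005" ]; then
    result=match
else
    result=nomatch
fi
echo "$result"

# A bash-style comparison such as  [ "$ver" == "0005" ]  is what dash
# rejects with "[: 0005: unexpected operator".
```

If this is the cause, the warning alone may be harmless, and the modprobe failure further down would be a separate problem.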

If I try to restart the client I receive an error:

root@optiplex2:~# systemctl status beegfs-client.service
● beegfs-client.service - Start BeeGFS Client
   Loaded: loaded (/lib/systemd/system/beegfs-client.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since wo 2019-10-09 10:07:35 CEST; 16s ago
  Process: 17984 ExecStop=/etc/init.d/beegfs-client stop (code=exited, status=0/SUCCESS)
  Process: 18007 ExecStart=/etc/init.d/beegfs-client start (code=exited, status=1/FAILURE)
 Main PID: 18007 (code=exited, status=1/FAILURE)

okt 09 10:07:18 optiplex2 beegfs-client[18007]: modprobe: ERROR: could not insert 'beegfs': Unknown symbol in module, or unknown parameter (see dmesg)
okt 09 10:07:18 optiplex2 beegfs-client[18007]: - BeeGFS module autobuild
okt 09 10:07:19 optiplex2 beegfs-client[18007]: Building beegfs client module
okt 09 10:07:22 optiplex2 beegfs-client[18007]: /bin/sh: 1: [: 0005: unexpected operator
okt 09 10:07:23 optiplex2 beegfs-client[18007]: /bin/sh: 1: [: 0005: unexpected operator
okt 09 10:07:35 optiplex2 beegfs-client[18007]: modprobe: ERROR: could not insert 'beegfs': Unknown symbol in module, or unknown parameter (see dmesg)
okt 09 10:07:35 optiplex2 systemd[1]: beegfs-client.service: Main process exited, code=exited, status=1/FAILURE
okt 09 10:07:35 optiplex2 systemd[1]: Failed to start Start BeeGFS Client.
okt 09 10:07:35 optiplex2 systemd[1]: beegfs-client.service: Unit entered failed state.
okt 09 10:07:35 optiplex2 systemd[1]: beegfs-client.service: Failed with result 'exit-code'.
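The modprobe error defers to dmesg for the actual cause. A minimal sketch for pulling out the relevant kernel messages, assuming the unresolved symbols are logged with the usual "Unknown symbol" or "disagrees about version" wording; the helper name `filter_unknown_symbols` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: keep only kernel log lines that mention beegfs
# together with an unresolved symbol, a rejected parameter, or a symbol
# version mismatch. Reads log lines on stdin.
filter_unknown_symbols() {
    grep -i 'beegfs' \
        | grep -Ei 'unknown symbol|unknown parameter|disagrees about version'
}

# Usage (needs root on most systems):
#   dmesg | filter_unknown_symbols
```

The symbol names that show up (for example InfiniBand-related ones) would indicate whether the module was built against the in-kernel headers instead of the OFED ones.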

Everything works if I do not use the OFED driver. In that case BeeGFS uses the TCP/IP stack instead of RDMA.

Any idea how to solve this? I have also tried older versions of the OFED driver, but they did not compile on my system.

Best,

Remco

Vinícius Ferrão
Ramon

1 Answer

Try this in your /etc/beegfs/beegfs-client-autobuild.conf:

buildArgs=-j8 BEEGFS_OPENTK_IBVERBS=1 OFED_INCLUDE_PATH=/usr/src/ofa_kernel/default/include/
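Before rebuilding, it may be worth confirming that the include path exists on the client, since its location depends on how MLNX_OFED was installed. A rough sketch; the helper name `check_ofed_include` is hypothetical, and the check for `rdma/ib_verbs.h` assumes the OFED kernel headers ship that file:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory looks like an
# OFED kernel include tree (assumption: it contains rdma/ib_verbs.h).
check_ofed_include() {
    [ -f "$1/rdma/ib_verbs.h" ]
}

# Usage (as root), with the path from the buildArgs line above:
#   check_ofed_include /usr/src/ofa_kernel/default/include \
#       && /etc/init.d/beegfs-client rebuild \
#       && systemctl restart beegfs-client
```

If the check fails, the headers are probably installed elsewhere and OFED_INCLUDE_PATH would need to point there instead.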
Howie