
We have several DOS applications (Clipper) that share dBase files on a file server. The applications run under XP. This has worked without problems for about two decades with NetWare and for years with Samba (member server).

Some weeks ago I upgraded openSUSE from 11.4 (samba-3.6.3) to 12.2 (samba-3.6.7) and changed the hardware (to an AMD E-450 with 6 GiB RAM). To make it worse (from a debugging point of view), the switch was replaced at about the same time (from 100 Mbit to a 48-port gigabit switch).

Since then (it is not clear after which change exactly, because the users don't tell us immediately...) a few users have faced severe problems with some of these DOS applications, and the problems are not precisely reproducible. This seems to be about access rights or (more probably) file locking. As far as we know, these applications do byte-range locking on the files. I do not know whether (and how) I can get this kind of debugging information from Samba. There are no general problems accessing these files. Oplocks are enabled (disabling them is unacceptable and does not solve the problem either).

Then I changed the server structure. Earlier, Samba ran on real hardware. I made the host OS a simple installation (just serving as a host for VMs) and put Samba into a VM, using the openSUSE 11.4 installation that had worked without problems before. The problems have not disappeared since. An upgrade of the Samba VM (to 12.2) seems to have made it even worse. Regular Windows share access does not seem to have been affected in any of these configurations. `ifconfig` shows that about one of every 4000 RX packets is dropped on the interface, which seems OK to me.
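
To watch that drop ratio over time instead of reading it off `ifconfig` by hand, something like the following could be used. This is only a minimal sketch: it assumes the usual Linux `/proc/net/dev` layout (receive counters first, with packets in the second column and drops in the fourth), and the `SAMPLE` text and its numbers are made up for illustration.

```python
def rx_drop_ratios(text):
    """Return {interface: dropped RX packets / RX packets} from /proc/net/dev text."""
    ratios = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, counters = line.split(":", 1)
        fields = counters.split()
        # Assumed receive column order: bytes packets errs drop ...
        rx_packets, rx_dropped = int(fields[1]), int(fields[3])
        if rx_packets:
            ratios[name.strip()] = rx_dropped / rx_packets
    return ratios


# Made-up sample in the assumed /proc/net/dev layout (not real measurements).
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:   1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 500000    4000    0    1    0     0          0         0   300000    3000    0    0    0     0       0          0
"""

if __name__ == "__main__":
    for iface, ratio in rx_drop_ratios(SAMPLE).items():
        print(f"{iface}: {ratio:.5%} of RX packets dropped")
```

On the server itself, the contents of `/proc/net/dev` would be passed in instead of `SAMPLE`, and comparing two readings taken some minutes apart shows whether drops are still accumulating.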

Any ideas, either about the real problem or about precise Samba debugging / tracing that would show me exactly what goes wrong in the communication between Samba and the XP clients?

Without better ideas I will probably try a different NIC first. Years ago that solved a (general, not DOS-related) Samba problem for me.

Hauke Laging
  • 1
    Could you describe the users' problems in more detail? Also, have you considered recording a continuous network trace and looking into the time ranges where users were reporting problems? If you specifically suspect locking, consider enabling debug logging for the locking debug class using `log level = 0 locking:10` in the `[global]` section of your `smb.conf`. – the-wabbit Feb 19 '13 at 13:27
  • 1
    BTW: losing one out of 4000 frames would be a lousy error rate, although the RX error counter would [increment on a whole number of occasions](http://www.novell.com/support/kb/doc.php?id=7007165) which are not necessarily network errors. Do you have a managed switch where you could look at the RMON counter values of the port connecting the server? – the-wabbit Feb 19 '13 at 13:43
  • @syneticon-dj The only detail we can see is a crash with "DOS error 5", which means a file access problem (probably issued by the Clipper runtime). The problem is not exactly reproducible, so my colleague who maintains these programs cannot tell me what exactly is happening at that point. Is it possible to limit the verbose logging to one user (or IP address)? Several users claim that performance has degraded, but mainly one has the crash problems. – Hauke Laging Feb 19 '13 at 13:47
  • @syneticon-dj We have a Netgear GS748T. I just searched the manual. "rmon" doesn't occur there. I consider this a managed switch, though (it supports VLAN, ports can be configured, there are mirror ports). I had a look with strace at samba. It seemed to me that the performance problems were due to the client (not that the users care...). – Hauke Laging Feb 19 '13 at 14:21
  • Look at the "port statistics" from the "Monitoring" tab in the switch web interface. Netgear has many switches which are called "smart managed", which is synonymous with "has a web interface which is doing something". The GS748T is one of them. At least it has statistics. You can't increase log levels just for one user, but if you have tcpdump network traces (which obviously can be filtered to cover only one IP address' communications) and a time frame to search for in the debug logs, you should be able to figure out what is wrong. – the-wabbit Feb 19 '13 at 14:59
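
Searching the debug logs for a reported time frame, as suggested in the last comment, could be sketched roughly as follows. This assumes each Samba log entry starts with a header of the form `[YYYY/MM/DD HH:MM:SS.micro, level]` followed by indented body lines; that layout is an assumption to check against the actual log files, and the `SAMPLE` lines (including file and function names) are invented placeholders, not real Samba output.

```python
import re
from datetime import datetime

# Assumed entry header: "[2013/02/19 13:27:30.200000, 10] ..." (verify against real logs).
HEADER = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})(?:\.\d+)?,\s*(\d+)\]")


def entries_in_window(lines, start, end):
    """Return all lines of log entries whose header timestamp lies in [start, end]."""
    selected = []
    keep = False
    for line in lines:
        match = HEADER.match(line)
        if match:
            # A new entry begins; decide once whether to keep it and its body lines.
            stamp = datetime.strptime(match.group(1), "%Y/%m/%d %H:%M:%S")
            keep = start <= stamp <= end
        if keep:
            selected.append(line.rstrip("\n"))
    return selected


# Invented placeholder entries in the assumed layout.
SAMPLE = """\
[2013/02/19 13:26:59.100000, 10] example_a.c:1(example_function_a)
  first placeholder entry
[2013/02/19 13:27:30.200000, 10] example_b.c:2(example_function_b)
  second placeholder entry
[2013/02/19 13:29:00.300000,  3] example_c.c:3(example_function_c)
  third placeholder entry
""".splitlines()

if __name__ == "__main__":
    window = (datetime(2013, 2, 19, 13, 27, 0), datetime(2013, 2, 19, 13, 28, 0))
    for line in entries_in_window(SAMPLE, *window):
        print(line)
```

The same time window can then be used to cut down the tcpdump capture for the affected client, so that log entries and packets can be read side by side.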

1 Answer


Make sure to isolate the error from the client perspective. You must understand what the client tries to do before you debug at the server side.

Note that DOS doesn't know anything about Oplocks - so I don't really see how they're affecting you with this particular issue. When a client locks a file with the standard DOS system call for this, it will get locked as a whole. A second client will then encounter the "Error 5" described.

Since this worked earlier, I suspect that the application does not use the standard locking mechanism but its own, whatever that is. This would mean that some other process locks the file. You can hunt for open files (locks) with `lsof`.
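
To get a feel for how a lock held by one process looks to another process on the server, here is a minimal sketch. It uses POSIX `fcntl` byte-range locks on Linux, not the DOS or SMB locking API, and whether Samba maps client locks onto such POSIX locks depends on its configuration, so treat it as an illustration only. The helper names (`try_lock`, `probe_from_child`, `demo`) are hypothetical.

```python
import fcntl
import os
import tempfile


def try_lock(path, start, length):
    """Try a non-blocking exclusive lock on bytes [start, start + length)."""
    with open(path, "r+b") as f:
        try:
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
        except OSError:
            return False  # another process holds a conflicting lock
        return True  # lock acquired; it is released when the file is closed


def probe_from_child(path, start, length):
    """Run try_lock in a forked child, since fcntl locks are owned per process."""
    pid = os.fork()
    if pid == 0:
        try:
            os._exit(0 if try_lock(path, start, length) else 1)
        except BaseException:
            os._exit(2)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def demo():
    """Hold a lock on bytes 0..9, then probe an overlapping and a disjoint range."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 100)
        path = tmp.name
    try:
        with open(path, "r+b") as holder:
            fcntl.lockf(holder, fcntl.LOCK_EX, 10, 0, os.SEEK_SET)
            overlapping = probe_from_child(path, 5, 10)
            disjoint = probe_from_child(path, 50, 10)
        return overlapping, disjoint
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

A second process is refused only where its range overlaps the held one, which is the difference between a byte-range lock and a lock on the whole file.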

Roman
  • 1
    DOS does not need to know about oplocks to use shared locks. The API for the latter (including byte-range locking) was introduced with the MS-DOS 3 `share.exe` utility, which is apparently the API [Clipper uses as well](http://groups.google.com/group/comp.lang.clipper/tree/browse_frm/month/1996-08/6eba0a2add7f7478?rnum=121&_done=%2Fgroup%2Fcomp.lang.clipper%2Fbrowse_frm%2Fmonth%2F1996-08%3F). Even the error code is suspicious, since *"sharing violation"* has its own (`net helpmsg 32`). – the-wabbit Feb 23 '13 at 13:19