My system reboots when I run the "date -s" command. I collected the log, and it says this:

Jan 18 13:27:46 watchdog[2421]: file /tmp/cmm/strobeWDT was not changed in 1 seconds.
Jan 18 13:27:46 watchdog[3303]: shutting down the system because of error 2
Jan 18 13:27:47 watchdog[2421]: stopping daemon (5.2)

My application, which runs on the system, writes to the file /tmp/cmm/strobeWDT. If it fails to write to the file periodically, the watchdog daemon reboots the system. That is understandable. But the system logs the above message and reboots only when I run "date -s" to set a new date.

Where is the problem?

I don't know whether the above info is enough for you to solve the problem. Kindly help.

Adding more info

I am getting the above log message from the code in file_stat.c (a file in the source code of the watchdog daemon):

file_stat.c

#if USE_SYSLOG
   /* do verbose logging */
   if (verbose && logtick && ticker == 1)
       syslog(LOG_INFO, "file %s was last changed at %s.", file->name, ctime(&buf.st_mtime));
#endif

   if (time(NULL) - buf.st_mtime > file->parameter.file.mtime) {
       /* file wasn't changed often enough */
#if USE_SYSLOG
       syslog(LOG_ERR, "file %s was not changed in %d seconds.", file->name, file->parameter.file.mtime);
#else          /* USE_SYSLOG */
       fprintf(stderr, "file %s was not changed in %d seconds.", file->name, file->parameter.file.mtime);
#endif            /* USE_SYSLOG */

I think the problem exists in this part of the code, because the error comes from here. parameter.file.mtime is configured in /etc/watchdog.conf as "change=1".
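To make the suspicion concrete, here is a minimal sketch of the same comparison pulled out into a standalone function. The names file_is_stale and stale_after_clock_step are hypothetical helpers invented for illustration and are not part of the watchdog source; only the comparison itself is taken from the excerpt above.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Same comparison as in the quoted excerpt: the file counts as stale when
 * the wall-clock time minus its mtime exceeds the configured limit.
 * (Hypothetical helper name, not from the watchdog source.) */
bool file_is_stale(time_t now, time_t mtime, int max_age_seconds)
{
    assert(max_age_seconds >= 0);
    return now - mtime > max_age_seconds;
}

/* Hypothetical scenario: the file was written at written_at, and the wall
 * clock is then stepped by jump_seconds before the next check runs. */
bool stale_after_clock_step(time_t written_at, long jump_seconds,
                            int max_age_seconds)
{
    time_t now = written_at + (time_t)jump_seconds;
    return file_is_stale(now, written_at, max_age_seconds);
}
```

With change=1, stale_after_clock_step(t, 3600, 1) is true even though the file was written an instant before the step, because the mtime was stamped with the old wall-clock time. A backward step of the same size makes the difference negative, so under this arithmetic the check alone would not fire.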

LinuxPenseur
  • Why are you manually running date anyway? Why not set the date with ntpdate at startup, and then use ntp to stay in sync? Ntp will slowly correct the time if there is a small amount of error. – Zoredache Jan 20 '11 at 08:55
  • Or, if you really want to use date, you could stop watchdog, set the date, then restart watchdog. – Zoredache Jan 20 '11 at 09:26

2 Answers

So you're setting the date when you run date -s. Are you setting the clock backwards or forwards? It sounds like the watchdog gets confused when you set the system date: the clock jumps, and then the watchdog says 'hey, time's up, let's reboot!'.

Solution: don't run date -s. Set up and run ntp, which gradually slews the clock rather than making it jump the way date -s does. Here's an ntp setup howto, for example.
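For a feel of what "slewing" means at the system-call level, here is a rough sketch using adjtime(), which asks the kernel to correct the clock gradually instead of stepping it. This only illustrates the idea and is not how ntp itself is implemented; make_delta and slew_clock are hypothetical helper names, the call needs root privileges, and the kernel applies the correction slowly, so it only suits small offsets.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/time.h>

/* Split an offset in seconds into whole seconds and microseconds.
 * (Hypothetical helper for illustration.) */
struct timeval make_delta(double offset_seconds)
{
    struct timeval delta;
    delta.tv_sec = (time_t)offset_seconds;  /* truncates toward zero */
    delta.tv_usec =
        (suseconds_t)((offset_seconds - (double)delta.tv_sec) * 1e6);
    return delta;
}

/* Ask the kernel to slew the clock by offset_seconds rather than step it.
 * Returns 0 on success, -1 on failure (for example, when not run as root). */
int slew_clock(double offset_seconds)
{
    struct timeval delta = make_delta(offset_seconds);
    if (adjtime(&delta, NULL) != 0) {
        perror("adjtime");
        return -1;
    }
    return 0;
}
```

Because the clock never jumps, the difference between the current time and the file's mtime stays small, so a freshness check like the one in file_stat.c should not see a sudden gap.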

Phil Hollenback
  • Thanks. But why does Linux provide an option like "date -s" to set the date if the watchdog gets confused by it? Is there any way to run date -s without confusing the watchdog? – LinuxPenseur Jan 20 '11 at 08:47
  • @LinuxPenseur, you could ask the question the other way too: why does Linux provide watchdog, since people can manually modify the time and confuse it? Both are tools. If you configure your system so that watchdog reboots when something is hung, it makes perfect sense that things get confused if you modify the system date/time. You could also ask why Linux even allows people to log in, since people can break things. – Zoredache Jan 20 '11 at 08:53
  • @LinuxPenseur, I really advise you use ntp and/or only run 'date -s' when you aren't using the watchdog. – Phil Hollenback Jan 20 '11 at 18:23

I'm not familiar with watchdog, but I want to give some ideas:
- Why do you want the watchdog to reboot the system if your app fails?
- Did you check whether date -s is aliased to some other command?
- What is your load average?

quanta
  • My application controls the entire system, so watchdog checks whether it is hung. If it is hung, watchdog resets the system. There is no alias for date -s. How do I find the load average? – LinuxPenseur Jan 20 '11 at 07:39