[SOLVED] Monit wont start, swap full, web services fail.


Postby moshari_3 » 08 Apr 2011 13:10

Hi all.

I have posted a 3 part problem because I think they are all related to the same cause which so far I can't find. :(
I am seeing this on 2 systems, 1 started 3 weeks ago & my personal one started just 2 days ago.

1st problem system specs:
IPC Hardware first : Intel P4 3.0Ghz, Mem = 2GB, HDD = 40GB, /dev/root = 8084mb, /dev/harddisk1/boot = 16mb, /dev/harddisk2/var/log = 67020mb, swap = 32764
My system specs:
IPC Hardware first : Intel Duo 2 2.6Ghz, Mem = 2GB, HDD = 80GB, /dev/root = 8084mb 1061 used, /dev/harddisk1/boot = 16mb 4 used, /dev/harddisk2/var/log = 67020mb 761mb used, swap = 32764 0 currently used.
Both systems have exactly the same updates & plugins, listed below:
IPC Ver 1.4.21 - AdvancedProxy 3.0.5 - UrlFilter 1.9.3 - CopFilter 0.85.2 - Monit 5.2.5 - P3Scan 2.3.2 - ProxSMTP 1.8 - HAVP 0.92a - Privoxy 3.0.17 - Frox 0.7.18 - SpamAssassin 3.3.1
ClamAV 0.97 - ClamAV 3rd Party Signature 0.50.1 - FProt 6.3.3 - Renattach 1.2.4 - Rules Du Jour 1.30

So my problem is that at least once every 12 hours the swap usage goes up to 97% - 100% used & all web services shutdown.
On the first system there is no connectivity at all (ping, ssh, dhcp etc.) and I have to press the reset button to get things working, but monit never starts.
I have not tried the local keyboard & screen yet though.

On my system the web services stop working but ping, ssh & local gui access to the ipc keep working.
In the Gui status page most services show as stopped.
Swap usage is at 97%.
Clearing the cache or restarting advanced proxy does not change anything.
Restarting URLFilter has the same disappointing result.
Restarting Copfilter does clear the swap usage back to 0%, but Monit does not start. To get the other services running again I have to restart ipc from the system shutdown menu, as I don't know how to start services from the terminal.

So for Monit I have checked the ownership & rights:
ls /var/log/copfilter/default/opt/monit/etc/ -l
This is what displayed which I believe is correct:
drwxr-xr-x 2 root root 4096 2011-04-08 18:08 init.d
-rwx------ 1 root root 0 2011-04-08 19:33 monitrc

I then tried to run it from the terminal (I followed instructions from another post)
copfilter_restartmonit
Nothing informative displayed:
mo:2345:respawn:/var/log/copfilter/default/opt/monit/default/bin/monit -I -c /var/opt/copfilter/default/monit/etc/monitrc
monit is not running <BR>
waiting 0 second(s) <BR>
monit is not running <BR>
starting monit <BR>
waiting 3 second(s) <BR>
monit is not running <BR>

So then I looked at the following again from another post.
tail -n 70 -f /var/log/messages
This is what displayed after I typed the restart command (I have no idea what it means):
April 8 21:02:08 HOME init: re-reading inittab
April 8 21:02:08 HOME init: re-reading inittab
April 8 21:02:08 Id "mo" respawning too fast: disabled for 5 minutes
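
For reference, that last line is written by init itself: it reports "respawning too fast" when an inittab entry (here the one with id "mo", the monit line printed by the restart command) keeps exiting right after it is started, and it then stops retrying for five minutes. A minimal sketch for finding such events in a log; the helper names `respawn_events` and `respawn_count` are made up for illustration, and the default path is simply the file tailed above:

```shell
# Hypothetical helper: list init "respawning too fast" events in a syslog file.
# $1 = log file (defaults to /var/log/messages)
respawn_events() {
    grep 'respawning too fast' "${1:-/var/log/messages}"
}

# Hypothetical helper: count how often one inittab id was disabled.
# $1 = inittab id (e.g. mo), $2 = log file (defaults to /var/log/messages)
respawn_count() {
    grep -c "Id \"$1\" respawning too fast" "${2:-/var/log/messages}"
}

# Usage (not executed here):
#   respawn_events
#   respawn_count mo
```

If the count keeps growing, the command in that inittab line is failing on every start, so its own configuration is the next thing to check.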

Unfortunately that is as far as I have gone.

I have no idea where or what to look for. :(

Any suggestions or help is greatly appreciated.

Thanks.
Last edited by moshari_3 on 14 Apr 2011 05:35, edited 2 times in total.
moshari_3
 
Posts: 149
Joined: 07 Apr 2010 03:43
Location: Australia

Re: Monit wont start, swap full, web services fail.

Postby Severus » 08 Apr 2011 16:56

As first aid, try the top command in your shell (or PuTTY) to find the service that is eating your swap and memory.
Second, browse your log files for any hint about the offending service or any other problems.
Post the results.
You said this occurs once every 12 hours? At what time? And is there any connection to a cron job?
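
If the interactive top display is hard to read or to post, a one-off snapshot sorted by memory can serve the same purpose. A minimal sketch, assuming a ps that accepts `-eo pid,rss,comm` (procps does; a stripped-down ps may not). `top_by_rss` is a made-up helper name. Note that RSS is resident memory rather than swap, so the result is only a first hint:

```shell
# Hypothetical helper: read "PID RSS COMMAND" lines on stdin (header first)
# and print the N largest entries by RSS (column 2, in KB).
# $1 = number of entries to show (defaults to 5)
top_by_rss() {
    sed 1d | sort -k2,2nr | head -n "${1:-5}"
}

# Usage (not executed here):
#   ps -eo pid,rss,comm | top_by_rss 5
```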
Another problem is your result
-rwx------ 1 root root 0 2011-04-08 19:33 monitrc
This shows an empty monitrc (size 0) with the wrong permissions (-rwx). It should be about 6000 bytes in size with -rw permissions, like mine:
-rw------- 1 root root 6722 2011-04-04 02:23 monitrc
Without a correct monitrc monit will not start. Please check your monitrc and replace it by the file from the copfilter package.
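
The two conditions described above (non-empty file, mode 600) can be checked in one go. A minimal sketch, assuming GNU stat for `stat -c %a`; `check_monitrc` is a hypothetical helper, and the expected mode of 600 simply mirrors the advice in this post:

```shell
# Hypothetical check: report whether a monitrc is non-empty and has mode 600.
# Assumes GNU stat. $1 = path to the monitrc file.
check_monitrc() {
    if [ ! -s "$1" ]; then
        echo "empty or missing: $1"
        return 1
    fi
    monitrc_mode=$(stat -c %a "$1") || return 1
    if [ "$monitrc_mode" != "600" ]; then
        echo "wrong permissions ($monitrc_mode, expected 600): $1"
        return 1
    fi
    echo "ok: $1"
}

# Usage (not executed here):
#   check_monitrc /var/log/copfilter/default/opt/monit/etc/monitrc
```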

In the meantime you may increase your swapfile as a workaround, as described in viewtopic.php?p=2353#p2353
Regards Severus
Severus
Site Admin
 
Posts: 454
Joined: 10 Dec 2009 07:01
Location: Nürnberg - Germany

Re: Monit wont start, swap full, web services fail.

Postby moshari_3 » 09 Apr 2011 05:19

Hi Severus,

Thanks so much for the help. :)
Everything I have checked or changed is on my personal ipc, I don't have remote access to the other one at the moment.

Sorry I should have picked up on the file 0 size for monitrc, it must have been too late & I was too tired.
Copied the replacement over and changed the rights, so now monit is working again.
Used chmod 600 /var/log/copfilter/default/opt/monit/etc/monitrc
-rw------- 1 root root 5649 2011-04-09 11:42 monitrc

So going through logs & system graphs etc., I found in the status > system graphs section that swap file usage went from 0% to 95%+ at about 02:35 local time this morning & yesterday morning.
Also RAM usage is staying below 40%, which is unusual; it normally sits at around 87%.
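
The swap percentage shown in the graphs can also be read on the command line, which makes it easier to log the exact moment it jumps. A minimal sketch, assuming the usual `SwapTotal:` and `SwapFree:` lines in /proc/meminfo; `swap_used_pct` is a made-up helper name:

```shell
# Hypothetical helper: print swap usage as a whole percentage.
# $1 = meminfo-style file (defaults to /proc/meminfo)
swap_used_pct() {
    awk '
        $1 == "SwapTotal:" { total = $2 }
        $1 == "SwapFree:"  { free  = $2 }
        END {
            if (total > 0) print int((total - free) * 100 / total)
            else           print 0
        }
    ' "${1:-/proc/meminfo}"
}

# Usage (not executed here):
#   echo "$(date '+%Y-%m-%d %H:%M:%S') swap $(swap_used_pct)%"
```

Running the usage line from a cron job every minute would give a timestamped record to compare against /var/log/messages.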

In the log files the only thing I found that looked odd was in the crondaily.log
_______________________________________________________________________________________________________________________________
[EST] 2011-04-07 23:55:00
backing up current databases... done
searching for updates...
updates downloaded...
checking for corrupted databases...no newer files available. Nothing updated!
restart of clamd suspended...
[EST] 2011-04-07 23:56:57
_______________________________________________________________________________________________________________________________
database secinfosh.hdb is corrupted! deleting...
[EST] 2011-04-08 23:55:00
backing up current databases... done
searching for updates...
updates downloaded...
checking for corrupted databases...no newer files available. Nothing updated!
restart of clamd suspended...
[EST] 2011-04-08 23:57:04
_______________________________________________________________________________________________________________________________

Using TOP the only thing I thought was high usage of swap was in fact spamd assuming I read the screen right.
I would upload the jpg snapshot I took but I couldn't work out how to from the FAQ.

I did read the How to increase your swap size post before posting but was reluctant to do it only as it was a workaround & not a fix.
It's very nicely worded and easy to follow.

If I need to I'll uninstall / reinstall copfilter, but that wouldn't find the main cause.
At the moment I am technically on vacation so I am happy to continue to try and find the real problem rather than going for the quick heavy handed fix.

Again thanks so far. :D

Re: Monit wont start, swap full, web services fail.

Postby Severus » 09 Apr 2011 06:03

OK. The crondaily.log is nothing to worry about. If there is nothing to update, a reload of clamd is not necessary, and if a downloaded database is found to be corrupted it is deleted. That's fine.
Anything remarkable in spamd.log? The best thing would be to run top immediately when the services shut down and memory and swap are being eaten up, to find the culprit. As I said in my previous post: is there any cron job starting at that time? Please also check your /var/log/messages file for the times of the crashes. There may be another hint to the culprit there.
Severus

Re: Monit wont start, swap full, web services fail.

Postby moshari_3 » 09 Apr 2011 09:39

Hi,

There are no other entries in crondaily.log for the last month except for what I already posted.
The spamd.log looks perfectly normal & no entries at all for that time of day.

The message log does show an fcron process that is actually running every 5 minutes. Below is a copy of each day at the time swap fills up.
I have included a couple of days before it started as a comparison but not much different. The swap problem first started on the 7th April.

Apr 4 02:35:00 HOME fcron[5063]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 5064)
Apr 4 02:35:00 HOME fcron[5068]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 5071)
Apr 4 02:35:00 HOME fcron[5065]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 5067)
Apr 4 02:35:00 HOME fcron[5074]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 5075)
Apr 4 02:35:02 HOME fcron[5063]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 4 02:35:02 HOME fcron[5065]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 4 02:35:02 HOME fcron[5074]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 4 02:35:08 HOME fcron[5068]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 4 02:35:26 HOME clamd[32381]: SelfCheck: Database status OK.


Apr 5 02:35:00 HOME fcron[28324]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 28325)
Apr 5 02:35:00 HOME fcron[28329]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 28332)
Apr 5 02:35:00 HOME fcron[28326]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 28328)
Apr 5 02:35:00 HOME fcron[28335]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 28336)
Apr 5 02:35:02 HOME fcron[28324]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 5 02:35:02 HOME fcron[28326]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 5 02:35:02 HOME fcron[28335]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 5 02:35:08 HOME fcron[28329]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 5 02:36:09 HOME clamd[32381]: SelfCheck: Database status OK.


Apr 6 02:35:00 HOME fcron[19234]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 19235)
Apr 6 02:35:00 HOME fcron[19239]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 19242)
Apr 6 02:35:00 HOME fcron[19236]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 19238)
Apr 6 02:35:00 HOME fcron[19245]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 19246)
Apr 6 02:35:02 HOME fcron[19234]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 6 02:35:02 HOME fcron[19236]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 6 02:35:02 HOME fcron[19245]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 6 02:35:08 HOME fcron[19239]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 6 02:35:21 HOME clamd[32381]: SelfCheck: Database status OK.


Apr 7 02:35:00 HOME fcron[8906]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 8907)
Apr 7 02:35:00 HOME fcron[8911]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 8914)
Apr 7 02:35:00 HOME fcron[8908]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 8910)
Apr 7 02:35:00 HOME fcron[8917]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 8918)
Apr 7 02:35:02 HOME fcron[8906]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 7 02:35:02 HOME fcron[8908]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 7 02:35:02 HOME fcron[8917]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 7 02:35:08 HOME fcron[8911]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 7 02:40:00 HOME fcron[9057]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 9058)
Apr 7 02:40:00 HOME fcron[9062]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 9065)
Apr 7 02:40:00 HOME fcron[9059]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 9061)
Apr 7 02:40:00 HOME fcron[9068]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 9069)
Apr 7 02:40:02 HOME fcron[9057]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 7 02:40:02 HOME fcron[9059]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 7 02:40:02 HOME fcron[9068]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 7 02:40:08 HOME fcron[9062]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 7 02:40:49 HOME clamd[32381]: SelfCheck: Database status OK.


Apr 8 02:35:00 HOME fcron[1744]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 1745)
Apr 8 02:35:00 HOME fcron[1749]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 1752)
Apr 8 02:35:00 HOME fcron[1746]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 1748)
Apr 8 02:35:00 HOME fcron[1755]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 1756)
Apr 8 02:35:02 HOME fcron[1744]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 8 02:35:02 HOME fcron[1746]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 8 02:35:02 HOME fcron[1755]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 8 02:35:08 HOME fcron[1749]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 8 02:39:17 HOME clamd[32381]: SelfCheck: Database status OK.


Apr 9 02:35:00 HOME fcron[23576]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 23577)
Apr 9 02:35:00 HOME fcron[23581]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 23584)
Apr 9 02:35:00 HOME fcron[23578]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 23580)
Apr 9 02:35:00 HOME fcron[23588]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 23589)
Apr 9 02:35:02 HOME fcron[23576]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 9 02:35:02 HOME fcron[23581]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 9 02:35:02 HOME fcron[23588]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 9 02:35:08 HOME fcron[23578]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 9 02:37:10 HOME init: Id "mo" respawning too fast: disabled for 5 minutes
Apr 9 02:40:00 HOME fcron[23637]: Job /usr/local/bin/timecheck > /dev/null 2>&1 started for user root (pid 23638)
Apr 9 02:40:00 HOME fcron[23642]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl started for user root (pid 23645)
Apr 9 02:40:00 HOME fcron[23639]: Job /usr/local/bin/makegraphs >/dev/null started for user root (pid 23641)
Apr 9 02:40:00 HOME fcron[23649]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 started for user root (pid 23650)
Apr 9 02:40:02 HOME fcron[23637]: Job /usr/local/bin/timecheck > /dev/null 2>&1 completed
Apr 9 02:40:02 HOME fcron[23642]: Job [ -f "/var/ipcop/red/active" ] && /usr/local/bin/setddns.pl completed
Apr 9 02:40:02 HOME fcron[23649]: Job /usr/local/bin/copfilter_cron >> /var/log/copfilter/default/opt/tools/var/log/copfilter_cron.log 2>&1 completed
Apr 9 02:40:08 HOME fcron[23639]: Job /usr/local/bin/makegraphs >/dev/null completed
Apr 9 02:40:50 HOME clamd[740]: SelfCheck: Database status OK.
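
One way to answer the question of which cron jobs start at the time swap fills up is to pull the job commands out of the log for that minute. A minimal sketch, assuming the fcron line format shown above; `jobs_started_at` is a hypothetical helper name:

```shell
# Hypothetical helper: list the distinct fcron jobs started during a given minute.
# $1 = time as HH:MM, $2 = log file (defaults to /var/log/messages)
jobs_started_at() {
    grep " $1:[0-9][0-9] " "${2:-/var/log/messages}" \
        | grep 'fcron\[' \
        | sed -n 's/.*: Job \(.*\) started for user.*/\1/p' \
        | sort -u
}

# Usage (not executed here):
#   jobs_started_at 02:35
```

Comparing the output for 02:35 with another minute such as 02:40 shows whether anything runs only at the time of the problem.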

Assuming it happens again tonight I will remember to get the TOP info before doing anything else.

Sorry I'm not very helpful I'm just not sure what I'm looking for.

Re: Monit wont start, swap full, web services fail.

Postby Severus » 09 Apr 2011 09:58

moshari_3 wrote:Sorry I'm not very helpful I'm just not sure what I'm looking for.

Same here! :D
Seems all ok. But crashing processes should be traceable.
Let's wait for your top results.
BTW are there any zombies in your process list when running top?
Severus

Re: Monit wont start, swap full, web services fail.

Postby moshari_3 » 09 Apr 2011 10:09

No Zombies :(

Like you said will just have to wait for the next event.

BTW how do I upload an image into a reply without placing it on another web host, or is inserting a link the only way?

thanks.

Re: Monit wont start, swap full, web services fail.

Postby karesmakro » 09 Apr 2011 12:38

To give yourself more room to react when CPU and swap usage hit 100%, you should increase your swapfile:
swapoff /swapfile
rm /swapfile
dd if=/dev/zero of=/swapfile bs=1024k count=1500
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile

where count is the size in MB (with bs=1024k, each block is 1 MB).
This should be enough to give you the chance to step in or to search for the process causing it!
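
After running the steps above, it may be worth confirming that the new file is really in use. A minimal sketch that looks the path up in /proc/swaps; `swap_active` is a made-up helper name:

```shell
# Hypothetical helper: succeed if a swap file or device is currently active.
# $1 = path to look for, $2 = swaps table (defaults to /proc/swaps)
swap_active() {
    awk -v f="$1" '
        NR > 1 && $1 == f { found = 1 }   # skip the header line
        END { exit found ? 0 : 1 }
    ' "${2:-/proc/swaps}"
}

# Usage (not executed here):
#   swap_active /swapfile && echo "swapfile is in use"
```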

Sorry, I don't have time for a closer look at the moment!
karesmakro
Site Admin
 
Posts: 1275
Joined: 09 Dec 2009 21:17

Re: Monit wont start, swap full, web services fail.

Postby moshari_3 » 10 Apr 2011 03:41

OK so now I am really confused. :?

It didn't happen.
Tasks 112 1 Running 111 Sleeping 0 Stopped 0 Zombie
RAM 46% CPU 1% Cache 19% Swap 0

The only change is that the empty monitrc file was replaced & the rights fixed for it.
I'll check the other IPC on Wednesday & let you know how it goes.

I guess I'll just have to watch closely for a while.

Thanks soooo much for all the help.
Hopefully I can put a solved on this by Friday.

Regards.

Re: Monit wont start, swap full, web services fail.

Postby Severus » 10 Apr 2011 03:47

Glad to hear! :D
Nevertheless it would do no harm to increase your swap file. The default swap is only 32 MB, which is very little.
A swap of 1 GB would absorb short spikes in RAM use.
Severus
