Hello Andreas,
thank you for the hint to start BaseX with the -z option.
Unfortunately, it did not change the BaseX server's behavior.
1. lsof reports lines like the following:
COMMAND   PID USER   FD  TYPE DEVICE SIZE/OFF  NODE NAME
java    23767 userXY 30u sock    0,6      0t0 92598 can't identify protocol
In the FD column, the trailing u means the descriptor is open for read and write access; in the TYPE column, sock stands for a socket of unknown domain,
see the lsof man page (1).
Note that TYPE has the value sock, in contrast to REG, which stands for a regular file; see also the attached image.
2. It is therefore very likely that what remains open is a socket, not a regular file.
3. The tcpdump capture, viewed in Wireshark, shows that the health check of the load balancer (mon and ldirectord behave the same way) sends a TCP RST (connection reset) segment to the BaseX database.
Each RST from mon or ldirectord seems to lead to a new socket that remains open ...
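The suspected pattern can be illustrated with a minimal Python sketch. It is not BaseX code; the function name `serve` and the outcome labels are made up for illustration. A server that releases every accepted socket on all paths, including a peer reset or an immediate disconnect, should not accumulate descriptors no matter how often a health check connects and drops the connection.

```python
import socket

def serve(listener, max_clients):
    """Accept max_clients connections; return one outcome string per client."""
    outcomes = []
    for _ in range(max_clients):
        conn, _addr = listener.accept()
        # The with-block closes the descriptor on every path, including errors.
        with conn:
            conn.settimeout(1.0)
            try:
                data = conn.recv(1024)
                outcomes.append("data" if data else "closed by peer")
            except ConnectionResetError:
                # The peer aborted the connection with an RST.
                outcomes.append("reset by peer")
            except socket.timeout:
                # The peer connected but never sent anything.
                outcomes.append("timed out")
    return outcomes
```

The `with conn:` block plays the role that try/finally around `Socket.close()`, or try-with-resources, would play in a Java server.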
My question is:
What can be done on the BaseX side to make it compatible with a load balancer? (Especially with ldirectord, which performs the health check in our production environment.)
With best regards
Andreas
(1) http://linux.die.net/man/8/lsof
Hello Andreas,
the only file I can think of that is opened for each client is the BaseX log file. Normally the log file should be opened only once by the server, with the clients merely referring to it. Still, please try starting basexserver with the "-z" option (http://docs.basex.org/wiki/Startup_Options), which suppresses logging.
-- Andreas
On 12.05.2012 at 00:41, Andreas Rulle wrote:
Hello Andreas,
I was able to reproduce the following error
- with the tool mon (1),
- which is often used together with the LVS load balancer (2), see (3):
java.net.SocketException: Too many open files
with the following configuration
watch basex
    service ping
        description Responses to ping
        interval 2s
        monitor tcp.monitor -p 1984
        period
It seems that BaseX does not close the sockets when it is monitored with tcp.monitor -p 1984. The BaseX server crashes once the ulimit -n limit is reached.
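To reproduce the load without mon, a probe along the following lines may be enough. This is a sketch under assumptions: the helper name `probe` is hypothetical, it does not reimplement tcp.monitor, and the SO_LINGER setting is only one common way for a client to make close() send an RST on Linux. How mon or ldirectord actually tear down the connection is not confirmed here.

```python
import socket
import struct

def probe(host, port, timeout=2.0, reset=True):
    """Connect to host:port and drop the connection right away.

    Returns True if the connection was accepted, False otherwise.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return False
    if reset:
        # SO_LINGER enabled with a zero timeout makes close() abort the
        # connection, so an RST is sent instead of the usual FIN exchange.
        # The two-int layout of struct linger is an assumption for Linux.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                        struct.pack("ii", 1, 0))
    sock.close()
    return True
```

Calling `probe("localhost", 1984)` in a loop every 2 seconds while watching the lsof count would show whether the number of descriptors grows with each probe.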
With best regards,
Andreas
(1) https://mon.wiki.kernel.org/index.php/Monitors
(2) http://www.linuxvirtualserver.org/
(3) http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.realserver_failure.ht...
On 11.05.2012 at 18:21, Andreas Rulle wrote:
Hello Andreas,
thank you very much for your valuable hints.
sudo lsof -p basex-ps-no
returns many lines like
java 11489 root 962u sock 0,6 0t0 10449645 can't identify protocol
and
(1) sudo lsof -p 11489 | wc -l
reports an increasing number of used "files". The increase of the number in (1) over a given time interval correlates with the number of requests that the LVS load balancer sends to the BaseX port 1984 in that interval. The LVS load balancer sends about 4 requests per minute.
At the time of this writing (1) has the value of 1076 ...
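Note that wc -l also counts the header line and entries such as the working directory or memory-mapped files, so counting only the rows whose TYPE is sock gives a sharper number. The following Python sketch does that on captured lsof output. The function name `count_by_type` is made up, and the sample is illustrative: only the 962u row is taken from the output above, the other rows and the path /opt/basex are hypothetical. Splitting on whitespace is a simplification that can misalign columns when a field is empty.

```python
from collections import Counter

def count_by_type(lsof_output):
    """Count lsof rows per TYPE column, skipping the header line."""
    counts = Counter()
    for line in lsof_output.splitlines():
        fields = line.split()
        # Columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        if len(fields) < 5 or fields[0] == "COMMAND":
            continue
        counts[fields[4]] += 1
    return counts

# Illustrative sample; only the 962u row comes from the real output.
sample = """\
COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF     NODE NAME
java    11489 root  cwd    DIR    8,1     4096   131073 /opt/basex
java    11489 root  961u  sock    0,6      0t0 10449640 can't identify protocol
java    11489 root  962u  sock    0,6      0t0 10449645 can't identify protocol
"""
print(count_by_type(sample)["sock"])  # prints 2
```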
The parameter keepalive has the value
KEEPALIVE = 600
but it does not seem to stop the increase of the value in (1).
This information suggests some workarounds:
- decrease the number of requests from the LVS load balancer,
- increase ulimit -n,
- restart the BaseX server before the number in (1) reaches ulimit -n.
But we would really prefer a solution that stops the number in (1) from increasing.
Almost identical load balancer settings work for a MySQL database without any problems. And without the load balancer, a BaseX instance has been running since April 10 without hitting ulimit -n = 1024.
Any hints on this are very welcome!
With kind regards,
Andreas
I just searched our mailing list, because there was a post about too many open files some time ago:
On typical Linux installations, the open file descriptor limit is 1024:
$ grep 'open files' /proc/self/limits
Max open files            1024                 1024                 files
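The same row can be read for any process, not just the current shell, which is useful for checking the limit the running server actually has. A minimal Python sketch, assuming a Linux /proc file system; the function names `parse_max_open_files` and `read_limits` are made up for illustration.

```python
def parse_max_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row; None means unlimited."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v)
                         for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")

def read_limits(pid="self"):
    """Read the open-file limits of a process from /proc/<pid>/limits."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())

print(parse_max_open_files("Max open files  1024  1024  files"))  # (1024, 1024)
```

Comparing the soft limit with the current descriptor count of the server process shows how much headroom is left before the "Too many open files" error.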
In Java, if a file-based object (FileWriter, FileReader, etc.) is not closed, the underlying file descriptor is not closed. Bug detectors (FindBugs) check for that intraprocedurally. I recommend you run both BaseX as well as this application through it.
If an open file-based object becomes unreachable, the finalizer will eventually close it - but it's possible to run out of open file descriptors simply due to unreachable, but not yet finalized objects. (Of course, if the object is leaked, it won't be closed ever.)
On Linux, use 'ps' to find the pid of the JVM process, then run ls -l /proc/<pid>/fd to see which file descriptors the process has open; or use the 'lsof' command.
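The /proc listing can also be read programmatically, for example to log the socket count at intervals. The following Python sketch assumes Linux and sufficient permissions to read the target process's fd directory; the function names `open_fd_targets` and `count_sockets` are hypothetical.

```python
import os

def open_fd_targets(pid="self"):
    """Map each open descriptor number of a process to its link target."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return targets

def count_sockets(targets):
    """Count descriptors whose link target has the form 'socket:[inode]'."""
    return sum(1 for t in targets.values() if t.startswith("socket:["))
```

With the server's pid, `count_sockets(open_fd_targets(pid))` gives the number of open sockets, which can be compared before and after a health check.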
- Godmar
Perhaps that helps to investigate the issue.
-- Nexoma GmbH Theodorus Weg 7 59755 Arnsberg
Tel. + 49 (0) 52 51 1613-0 Fax + 49 (0) 52 51 1613-99
mailto:andreas.rulle@nexoma.de
Managing Director: Guido Sauerland Registered office: Arnsberg Commercial register: Arnsberg, HRB 9365
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk