Hello everyone,
Once in a while, my server running BaseX will suddenly experience a big spike in write operations, the hard disk space will fill up to 100%, and the whole server will lock up. I can't say what initiates this, because the logs and metrics disprove every theory I can think of. (There's nothing unusual in the access logs or BaseX data logs, memory usage stays consistently at 45% and CPU usage at 5%, no user attempted anything like a huge bulk upload, etc.)
However, one thing that always happens during these incidents is that the Apache error log blows up to a crazy size like 1 GB, filled with the message:
PHP Notice: Trying to access array offset on value of type null in /[path]/client/session.php on line 164
This line is in the read method of the Session class from GitHub, and it indicates that the byte position passed to the buffer property is null. My error log probably fills up because read() is called by readString() in a while loop that never terminates. There might be extraordinary circumstances in which during construction, socket_connect returns "true" but the response is not valid or expected.
-Tamara
Hi Tamara,
Sorry for the late reply. It’s mainly because I have no definitive solution on how to resolve the issue.
My knowledge of PHP is limited, but I assume that the socket_recv function raises an error and returns false instead of the buffer size [1,2]. You could try to check the return value, maybe like this?
if ($this->bsize == false) { throw new BaseXException(socket_strerror(socket_last_error())); }
The error message might give us a clue as to what goes wrong.
Reading some more pages of the PHP socket documentation, it would probably be best to check the return type of all other socket functions as well, as all of them seem to return false instead of raising an exception.
Hope this helps. If you decide to revise the client code, I invite you to provide us with the new version! Christian
[1] https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/php/BaseXClien... [2] https://www.php.net/manual/de/function.socket-recv.php
On Wed, Jan 25, 2023 at 6:58 PM Tamara Marnell tmarnell@orbiscascade.org wrote:
Hello everyone,
Once in a while, my server running BaseX will suddenly experience a big spike in write operations, the hard disk space will fill up to 100%, and the whole server will lock up. I can't say what initiates this, because the logs and metrics disprove every theory I can think of. (There's nothing unusual in the access logs or BaseX data logs, memory usage stays consistently at 45% and CPU usage at 5%, no user attempted anything like a huge bulk upload, etc.)
However, one thing that always happens during these incidents is that the Apache error log blows up to a crazy size like 1 GB, filled with the message:
PHP Notice: Trying to access array offset on value of type null in /[path]/client/session.php on line 164
This line is in the read method of the Session class from GitHub, and it indicates that the byte position passed to the buffer property is null. My error log probably fills up because read() is called by readString() in a while loop that never terminates. There might be extraordinary circumstances in which during construction, socket_connect returns "true" but the response is not valid or expected.
-Tamara
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
Follow up 6 months later: The same pattern happened again yesterday. The error log blew up to 6 GB with the repeated PHP notice, and the production server stopped responding.
This time I could see in the logs that what triggered the event was a heap error, which caused BaseX to stop when it was in the middle of executing a query sent by the PHP client. The value of socket_recv would have been 0 and read() would have returned NULL. The while loop in readString() wouldn't terminate unless read() specifically returned an empty string, so it stayed open, trying to read from the closed connection again and again until the disk ran out of space and all server operations ceased.
I edited our copy of the PHP client files to check the result of socket_recv and throw exceptions when it's 0 or false. I tested on a development server by submitting a query I knew would take some time to execute on our front end, then running basexserverstop before it could complete. With no changes, the error log blew up to 20 MB within seconds. With the changes, the process stopped as soon as BaseX did and returned the proper exception message.
I submitted a PR to the PHP client repository. Though I don't know what caused the Java heap error, and BaseX could unexpectedly stop again, our server won't completely die if it does!
-Tamara
On Mon, Jan 30, 2023 at 10:58 PM Christian Grün christian.gruen@gmail.com wrote:
Hi Tamara,
Sorry for the late reply. It’s mainly because I have no definitive solution on how to resolve the issue.
My knowledge of PHP is limited, but I assume that the socket_recv function raises an error and returns false instead of the buffer size [1,2]. You could try to check the return value, maybe like this?
if ($this->bsize == false) { throw new BaseXException(socket_strerror(socket_last_error())); }
The error message might give us a clue as to what goes wrong.
Reading some more pages of the PHP socket documentation, it would probably be best to check the return type of all other socket functions as well, as all of them seem to return false instead of raising an exception.
Hope this helps. If you decide to revise the client code, I invite you to provide us with the new version! Christian
[1] https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/php/BaseXClien... [2] https://www.php.net/manual/de/function.socket-recv.php
On Wed, Jan 25, 2023 at 6:58 PM Tamara Marnell tmarnell@orbiscascade.org wrote:
Hello everyone,
Once in a while, my server running BaseX will suddenly experience a big
spike in write operations, the hard disk space will fill up to 100%, and the whole server will lock up. I can't say what initiates this, because the logs and metrics disprove every theory I can think of. (There's nothing unusual in the access logs or BaseX data logs, memory usage stays consistently at 45% and CPU usage at 5%, no user attempted anything like a huge bulk upload, etc.)
However, one thing that always happens during these incidents is that
the Apache error log blows up to a crazy size like 1 GB, filled with the message:
PHP Notice: Trying to access array offset on value of type null in
/[path]/client/session.php on line 164
This line is in the read method of the Session class from GitHub, and it
indicates that the byte position passed to the buffer property is null. My error log probably fills up because read() is called by readString() in a while loop that never terminates. There might be extraordinary circumstances in which during construction, socket_connect returns "true" but the response is not valid or expected.
-Tamara
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
Hi Tamara,
Thanks for introducing error handling in the PHP client in your pull request. I took this as an opportunity to revise some more lines of code.
The updated client raises an exception whenever something unexpected happens in any of the socket function calls, and it returns the error code and an associated information string in the exception message. Maybe it will help us to find out more about the heap error. In addition, the revised code contains a bug fix in the Query.php file.
All the best, Christian
On Thu, Jul 6, 2023 at 12:47 AM Tamara Marnell tmarnell@orbiscascade.org wrote:
Follow up 6 months later: The same pattern happened again yesterday. The error log blew up to 6 GB with the repeated PHP notice, and the production server stopped responding.
This time I could see in the logs that what triggered the event was a heap error, which caused BaseX to stop when it was in the middle of executing a query sent by the PHP client. The value of socket_recv would have been 0 and read() would have returned NULL. The while loop in readString() wouldn't terminate unless read() specifically returned an empty string, so it stayed open, trying to read from the closed connection again and again until the disk ran out of space and all server operations ceased.
I edited our copy of the PHP client files to check the result of socket_recv and throw exceptions when it's 0 or false. I tested on a development server by submitting a query I knew would take some time to execute on our front end, then running basexserverstop before it could complete. With no changes, the error log blew up to 20 MB within seconds. With the changes, the process stopped as soon as BaseX did and returned the proper exception message.
I submitted a PR to the PHP client repository. Though I don't know what caused the Java heap error, and BaseX could unexpectedly stop again, our server won't completely die if it does!
-Tamara
On Mon, Jan 30, 2023 at 10:58 PM Christian Grün christian.gruen@gmail.com wrote:
Hi Tamara,
Sorry for the late reply. It’s mainly because I have no definitive solution on how to resolve the issue.
My knowledge of PHP is limited, but I assume that the socket_recv function raises an error and returns false instead of the buffer size [1,2]. You could try to check the return value, maybe like this?
if ($this->bsize == false) { throw new BaseXException(socket_strerror(socket_last_error())); }
The error message might give us a clue as to what goes wrong.
Reading some more pages of the PHP socket documentation, it would probably be best to check the return type of all other socket functions as well, as all of them seem to return false instead of raising an exception.
Hope this helps. If you decide to revise the client code, I invite you to provide us with the new version! Christian
[1] https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/php/BaseXClien... [2] https://www.php.net/manual/de/function.socket-recv.php
On Wed, Jan 25, 2023 at 6:58 PM Tamara Marnell tmarnell@orbiscascade.org wrote:
Hello everyone,
Once in a while, my server running BaseX will suddenly experience a big spike in write operations, the hard disk space will fill up to 100%, and the whole server will lock up. I can't say what initiates this, because the logs and metrics disprove every theory I can think of. (There's nothing unusual in the access logs or BaseX data logs, memory usage stays consistently at 45% and CPU usage at 5%, no user attempted anything like a huge bulk upload, etc.)
However, one thing that always happens during these incidents is that the Apache error log blows up to a crazy size like 1 GB, filled with the message:
PHP Notice: Trying to access array offset on value of type null in /[path]/client/session.php on line 164
This line is in the read method of the Session class from GitHub, and it indicates that the byte position passed to the buffer property is null. My error log probably fills up because read() is called by readString() in a while loop that never terminates. There might be extraordinary circumstances in which during construction, socket_connect returns "true" but the response is not valid or expected.
-Tamara
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
Thanks, Christian! I tested the newest version, and the only issue is that without a specific check for a socket_recv() result of 0, the infinite loop will still happen on an unexpected socket closure. There won't be a socket error, but the "buffer" property will be NULL, so the while loop in readString() won't close. I added the check to our application for that condition.
if ($this->bsize === false) { throw $this->error("Read failed"); } else if ($this->bsize === 0) { throw $this->error("Connection closed unexpectedly"); }
The message printed by the error() method when the BaseX server stops during a query execution: "Connection closed unexpectedly: Success (0)"
-Tamara
On Thu, Jul 6, 2023 at 4:09 AM Christian Grün christian.gruen@gmail.com wrote:
Hi Tamara,
Thanks for introducing error handling in the PHP client in your pull request. I took this as an opportunity to revise some more lines of code.
The updated client raises an exception whenever something unexpected happens in any of the socket function calls, and it returns the error code and an associated information string in the exception message. Maybe it will help us to find out more about the heap error. In addition, the revised code contains a bug fix in the Query.php file.
All the best, Christian
On Thu, Jul 6, 2023 at 12:47 AM Tamara Marnell tmarnell@orbiscascade.org wrote:
Follow up 6 months later: The same pattern happened again yesterday. The
error log blew up to 6 GB with the repeated PHP notice, and the production server stopped responding.
This time I could see in the logs that what triggered the event was a
heap error, which caused BaseX to stop when it was in the middle of executing a query sent by the PHP client. The value of socket_recv would have been 0 and read() would have returned NULL. The while loop in readString() wouldn't terminate unless read() specifically returned an empty string, so it stayed open, trying to read from the closed connection again and again until the disk ran out of space and all server operations ceased.
I edited our copy of the PHP client files to check the result of
socket_recv and throw exceptions when it's 0 or false. I tested on a development server by submitting a query I knew would take some time to execute on our front end, then running basexserverstop before it could complete. With no changes, the error log blew up to 20 MB within seconds. With the changes, the process stopped as soon as BaseX did and returned the proper exception message.
I submitted a PR to the PHP client repository. Though I don't know what
caused the Java heap error, and BaseX could unexpectedly stop again, our server won't completely die if it does!
-Tamara
On Mon, Jan 30, 2023 at 10:58 PM Christian Grün <
christian.gruen@gmail.com> wrote:
Hi Tamara,
Sorry for the late reply. It’s mainly because I have no definitive solution on how to resolve the issue.
My knowledge of PHP is limited, but I assume that the socket_recv function raises an error and returns false instead of the buffer size [1,2]. You could try to check the return value, maybe like this?
if ($this->bsize == false) { throw new BaseXException(socket_strerror(socket_last_error())); }
The error message might give us a clue as to what goes wrong.
Reading some more pages of the PHP socket documentation, it would probably be best to check the return type of all other socket functions as well, as all of them seem to return false instead of raising an exception.
Hope this helps. If you decide to revise the client code, I invite you to provide us with the new version! Christian
[1]
https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/php/BaseXClien...
[2] https://www.php.net/manual/de/function.socket-recv.php
On Wed, Jan 25, 2023 at 6:58 PM Tamara Marnell tmarnell@orbiscascade.org wrote:
Hello everyone,
Once in a while, my server running BaseX will suddenly experience a
big spike in write operations, the hard disk space will fill up to 100%, and the whole server will lock up. I can't say what initiates this, because the logs and metrics disprove every theory I can think of. (There's nothing unusual in the access logs or BaseX data logs, memory usage stays consistently at 45% and CPU usage at 5%, no user attempted anything like a huge bulk upload, etc.)
However, one thing that always happens during these incidents is that
the Apache error log blows up to a crazy size like 1 GB, filled with the message:
PHP Notice: Trying to access array offset on value of type null in
/[path]/client/session.php on line 164
This line is in the read method of the Session class from GitHub, and
it indicates that the byte position passed to the buffer property is null. My error log probably fills up because read() is called by readString() in a while loop that never terminates. There might be extraordinary circumstances in which during construction, socket_connect returns "true" but the response is not valid or expected.
-Tamara
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
Tamara,
Thanks for indicating that the revised solution was incomplete. I’ve just added the second condition to the client. It’s a pity that the error message won’t tell us why the socket got closed yet. At least, the systems stay responsive.
All the best, Christian
On Tue, Jul 11, 2023 at 12:22 AM Tamara Marnell tmarnell@orbiscascade.org wrote:
Thanks, Christian! I tested the newest version, and the only issue is that without a specific check for a socket_recv() result of 0, the infinite loop will still happen on an unexpected socket closure. There won't be a socket error, but the "buffer" property will be NULL, so the while loop in readString() won't close. I added the check to our application for that condition.
if ($this->bsize === false) { throw $this->error("Read failed"); } else if ($this->bsize === 0) { throw $this->error("Connection closed unexpectedly"); }
The message printed by the error() method when the BaseX server stops during a query execution: "Connection closed unexpectedly: Success (0)"
-Tamara
On Thu, Jul 6, 2023 at 4:09 AM Christian Grün christian.gruen@gmail.com wrote:
Hi Tamara,
Thanks for introducing error handling in the PHP client in your pull request. I took this as an opportunity to revise some more lines of code.
The updated client raises an exception whenever something unexpected happens in any of the socket function calls, and it returns the error code and an associated information string in the exception message. Maybe it will help us to find out more about the heap error. In addition, the revised code contains a bug fix in the Query.php file.
All the best, Christian
On Thu, Jul 6, 2023 at 12:47 AM Tamara Marnell tmarnell@orbiscascade.org wrote:
Follow up 6 months later: The same pattern happened again yesterday. The error log blew up to 6 GB with the repeated PHP notice, and the production server stopped responding.
This time I could see in the logs that what triggered the event was a heap error, which caused BaseX to stop when it was in the middle of executing a query sent by the PHP client. The value of socket_recv would have been 0 and read() would have returned NULL. The while loop in readString() wouldn't terminate unless read() specifically returned an empty string, so it stayed open, trying to read from the closed connection again and again until the disk ran out of space and all server operations ceased.
I edited our copy of the PHP client files to check the result of socket_recv and throw exceptions when it's 0 or false. I tested on a development server by submitting a query I knew would take some time to execute on our front end, then running basexserverstop before it could complete. With no changes, the error log blew up to 20 MB within seconds. With the changes, the process stopped as soon as BaseX did and returned the proper exception message.
I submitted a PR to the PHP client repository. Though I don't know what caused the Java heap error, and BaseX could unexpectedly stop again, our server won't completely die if it does!
-Tamara
On Mon, Jan 30, 2023 at 10:58 PM Christian Grün christian.gruen@gmail.com wrote:
Hi Tamara,
Sorry for the late reply. It’s mainly because I have no definitive solution on how to resolve the issue.
My knowledge of PHP is limited, but I assume that the socket_recv function raises an error and returns false instead of the buffer size [1,2]. You could try to check the return value, maybe like this?
if ($this->bsize == false) { throw new BaseXException(socket_strerror(socket_last_error())); }
The error message might give us a clue as to what goes wrong.
Reading some more pages of the PHP socket documentation, it would probably be best to check the return type of all other socket functions as well, as all of them seem to return false instead of raising an exception.
Hope this helps. If you decide to revise the client code, I invite you to provide us with the new version! Christian
[1] https://github.com/BaseXdb/basex/blob/main/basex-api/src/main/php/BaseXClien... [2] https://www.php.net/manual/de/function.socket-recv.php
On Wed, Jan 25, 2023 at 6:58 PM Tamara Marnell tmarnell@orbiscascade.org wrote:
Hello everyone,
Once in a while, my server running BaseX will suddenly experience a big spike in write operations, the hard disk space will fill up to 100%, and the whole server will lock up. I can't say what initiates this, because the logs and metrics disprove every theory I can think of. (There's nothing unusual in the access logs or BaseX data logs, memory usage stays consistently at 45% and CPU usage at 5%, no user attempted anything like a huge bulk upload, etc.)
However, one thing that always happens during these incidents is that the Apache error log blows up to a crazy size like 1 GB, filled with the message:
PHP Notice: Trying to access array offset on value of type null in /[path]/client/session.php on line 164
This line is in the read method of the Session class from GitHub, and it indicates that the byte position passed to the buffer property is null. My error log probably fills up because read() is called by readString() in a while loop that never terminates. There might be extraordinary circumstances in which during construction, socket_connect returns "true" but the response is not valid or expected.
-Tamara
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
--
Tamara Marnell Program Manager, Systems Orbis Cascade Alliance (orbiscascade.org) Pronouns: she/her/hers
basex-talk@mailman.uni-konstanz.de