Language Change PR-staging RTC

Pending

avatar mbabker
mbabker
15 Feb 2018

Summary of Changes

  • Implements the Session MetadataManager object originally proposed in #19578
  • Implements a new Session Garbage Collection plugin
    • Can optionally trigger PHP's native session garbage collection task
    • Can optionally trigger Joomla's metadata cleanup task
  • Implements a new CLI script for use as a cron job for cleaning the metadata (well, actually this causes garbage collection on the session table in general, but it's a side effect of the incorrect dual purpose of the session table)

Testing Instructions

Apply patch, if necessary discover and install the new plugin. With the "Enable Session Garbage Collection" option enabled, during the onAfterRespond event Joomla will attempt to perform session garbage collection based on the odds defined in the plugin's parameters. With the "Enable Session Metadata Cleanup" option enabled, and a non-database session handler in use, during the onAfterRespond event Joomla will attempt to purge stale records from the session database table representing the optional session metadata (as explained a thousand times, this is done by design in the plugin for now because the session store and our metadata handling represent two different datasets and should be treated as such).

The odds calculation is the same as that used by the C level php_session_gc function, ported to userland PHP and using the plugin's configuration values versus the PHP runtime configuration.

The two cleanup operations purposefully do not share the same probability calculation. There is no harm in running both tasks in one request, but IMO there is no need to force it either.
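
As a rough illustration of the odds check described above, here is a minimal PHP sketch. The condition mirrors the one quoted later in this thread (`$probability > 0 && $random < $probability`). The function name `shouldRunGc`, the injectable `$roll` parameter, and the use of `mt_rand()` as the random source are assumptions made for illustration and are not taken from the plugin.

```php
<?php
/**
 * Decide whether a cleanup task should run on this request.
 *
 * $roll is a hypothetical parameter (a float in [0, 1)) that lets the
 * check be exercised deterministically; when omitted, a random roll is used.
 */
function shouldRunGc(int $probability, int $divisor, ?float $roll = null): bool
{
    // Assumption: mt_rand() stands in for whatever random source the plugin uses.
    $roll = $roll ?? mt_rand() / (mt_getrandmax() + 1);

    // Scale the roll to the range [0, divisor).
    $random = $divisor * $roll;

    // A probability of 0 disables the task entirely.
    return $probability > 0 && $random < $probability;
}

// With the default 1/100 configuration, roughly one request in a hundred matches.
var_dump(shouldRunGc(1, 100));
```

Each of the two tasks would call a check like this with its own roll, which is why they do not necessarily run in the same request.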

avatar mbabker mbabker - open - 15 Feb 2018
avatar mbabker mbabker - change - 15 Feb 2018
Status New Pending
avatar joomla-cms-bot joomla-cms-bot - change - 15 Feb 2018
Category SQL Administration com_admin Postgresql MS SQL Language & Strings CLI Installation Libraries Front End Plugins
avatar brianteeman
brianteeman - comment - 15 Feb 2018

thanks for this - i will test first thing in the morning

avatar csthomas
csthomas - comment - 15 Feb 2018

I like this PR; I'm only wondering if we need another CLI script to clean up the #__session table.

I would like to combine these two functions into one.
We can check if the session handler is set to the database and then call the appropriate method.

A little off topic: I am against the division of the database table #__session into two separate tables.

avatar mbabker
mbabker - comment - 15 Feb 2018

A little off topic: I am against the division of the database table #__session into two separate tables.

It needs to happen. It'd be like storing category and content data in one table. Though it might be related, it's two different types of data. In this case it's a little worse because some of the management tasks for that data have colliding/differing logic (i.e. we want the metadata to be frequently cleared, in part because of the common complaint of a wrong count for who's online, and we need to clear the metadata ourselves because it is a custom feature to Joomla, but we don't want Joomla always acting as the garbage collector for the database session handler because we are cleaning the metadata; that garbage collection should be rightfully deferred to a dedicated session task).

I'm only wondering if we need another CLI script to clean up the #__session table.

CLI tasks should be like web controllers and follow SRP. So yes, in the case of 3.x that means two files and in 4.0 two different Command subclasses.

avatar csthomas
csthomas - comment - 15 Feb 2018

My conclusion: if I am an "advanced" administrator and I want to use a cron job, then

  • when I set the session handler to database, I have to use cli/sessionGc.php
  • but when I change it to redis, I have to use cli/sessionMetadataGc.php.

Otherwise, I only use the plugin.

avatar mbabker
mbabker - comment - 15 Feb 2018

Almost.

Session GC:

  • If I want Joomla to perform session garbage collection, And want to use cron jobs to manage that, And I have session.gc_probability set to 0, And I do not have any other existing jobs purging expired data, Then I should create a cron job for cli/sessionGc.php and disable the appropriate plugin settings
  • If I want Joomla to perform session garbage collection, And want that to occur in the scope of HTTP requests, And I have session.gc_probability set to 0, And I do not have any other existing jobs purging expired data, Then I should use the plugin (default settings)
  • If I have session.gc_probability set to any non-zero value, Then I should disable the appropriate plugin settings and not create a cron job (PHP will eventually do this cleanup for me without any extra nudging)

Metadata GC:

  • If I want optional session metadata to be cleared by way of a cron job, Then I should create a cron job for cli/sessionMetadataGc.php and disable the appropriate plugin settings
  • If I want optional session metadata to be cleared in the scope of HTTP requests, Then I should use the plugin (default settings)

Note that the cron jobs don't make the database handler distinction; only the plugin does so for now.
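
The rules above can be summarised as a small decision helper. This is only a sketch: the function names and the return values 'cron', 'plugin' and 'none' are hypothetical, and treating "another job already purges expired data" as "do nothing" is an assumption, since the list only names it as a precondition.

```php
<?php
/**
 * Hypothetical helper encoding the session GC rules listed above.
 * Returns 'cron' (use cli/sessionGc.php and disable the plugin setting),
 * 'plugin' (keep the plugin's default settings) or 'none'.
 */
function recommendSessionGc(int $phpGcProbability, bool $otherPurgeJobExists, bool $preferCron): string
{
    // PHP's own collector is active, or (assumption) another job already purges expired data.
    if ($phpGcProbability !== 0 || $otherPurgeJobExists) {
        return 'none';
    }

    return $preferCron ? 'cron' : 'plugin';
}

/**
 * Hypothetical helper for the metadata rules: 'cron' means
 * cli/sessionMetadataGc.php, 'plugin' means the plugin's default settings.
 */
function recommendMetadataGc(bool $preferCron): string
{
    return $preferCron ? 'cron' : 'plugin';
}

// Example: shared host with session.gc_probability = 0 and no cron access.
echo recommendSessionGc(0, false, false), "\n";
echo recommendMetadataGc(false), "\n";
```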

avatar Gitjk
Gitjk - comment - 16 Feb 2018

Just enabled it on my live website. Will post the result after checking the size of my sessions table after two days or so...

avatar Quy
Quy - comment - 17 Feb 2018

Plugin enabled with default values. It is a public site with no registered users except for one super user account.

In the sessions table:
Before PR: 330
After PR overnight: 526

How can I tell if it is working?

avatar mbabker
mbabker - comment - 17 Feb 2018

Put a JLog::add() statement in there. Because it's happening after the response you can't do any other kind of var_dump or anything. At https://github.com/joomla/joomla-cms/pull/19687/files#diff-c62bf5ee641c0af7406f55c3f4e4f9d9R58 you'd want to know the values of $probability and $random and see if values were calculated to meet the condition to trigger garbage collection (which again, is the exact logic used in php_session_gc so unless there's a flaw in porting this code from C to PHP it should work the same).

avatar Gitjk
Gitjk - comment - 17 Feb 2018

I'm currently also wondering if the plugin works on my website. It's hosted on a shared server where I can't change the php settings set by the hoster.

Plugin Settings:
Enable Session Garbage Collection: Yes
Enable Session Metadata Cleanup: Yes
Probability: 1
Divisor: 100
Status: Enabled

Session settings in php.ini:
session.gc_probability = 0
session.gc_divisor = 1000
session.gc_maxlifetime = 1440

Session settings in Joomla's PHP information (directive, local value, master value):
session.gc_probability 0
session.gc_divisor 1000 1000
session.gc_maxlifetime 9000 1440

My impression is that the plugin settings are ignored. Theoretically (if I understand this stuff correctly) I should set the plugin's Probability value to 100 in order to ensure a cleanup after 9000 seconds. Is that correct?

avatar mbabker
mbabker - comment - 17 Feb 2018

The PHP runtime settings are not used by the plugin.

If you really want the collector to run on every request then yes set it to 100/100. That's honestly overkill though. PHP will not let you resume a dead session so you don't need the data to purge out right at 9001 seconds of inactivity.

If you really need that frequent of garbage collection, disable the plugin and enable the cron jobs, because you're looking for the jobs to run on a high frequency and predictable schedule. If you don't mind the entropy, use the plugin and tweak the odds configuration (try even a 10/100 config to see what happens).
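
To get a feel for what a given odds configuration means in practice, the chance that the collector runs at least once over a number of requests can be estimated as 1 - (1 - probability/divisor)^requests, assuming each request rolls independently. The helper name below is hypothetical and the sketch is not part of the plugin.

```php
<?php
/**
 * Chance (0.0 to 1.0) that cleanup runs at least once across $requests
 * independent requests, given the plugin-style probability/divisor pair.
 */
function chanceOfAtLeastOneRun(int $probability, int $divisor, int $requests): float
{
    // Chance that a single request does NOT trigger cleanup.
    $miss = 1.0 - $probability / $divisor;

    return 1.0 - $miss ** $requests;
}

// Compare the default 1/100 setting with the suggested 10/100 setting.
foreach ([[1, 100], [10, 100]] as [$p, $d]) {
    foreach ([10, 100, 500] as $n) {
        printf("%d/%d over %d requests: %.3f\n", $p, $d, $n, chanceOfAtLeastOneRun($p, $d, $n));
    }
}
```

With the default 1/100 setting, 100 requests give roughly a 63% chance of at least one run, so a site with steady traffic should see cleanup happen regularly without needing 100/100.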

avatar Gitjk
Gitjk - comment - 17 Feb 2018

If you really need that frequent of garbage collection, disable the plugin and enable the cron jobs

The website including a shop has been running on a "cheap" shared server for several years. In this case "cheap" also means no cron jobs. The problem is that the response time when adding new content becomes annoying when working in the backend if the session table has grown to several hundred megabytes. After approximately 24 hours with the Plugin's default settings, the session table contains 15000 session entries already. Will play with different settings... :-)

avatar Quy
Quy - comment - 17 Feb 2018

In the backend, I am getting this: 0 Call to undefined method Joomla\CMS\Session\Session::gc()

avatar csthomas
csthomas - comment - 17 Feb 2018

@Quy, get the latest staging with #19548 merged.

avatar Quy
Quy - comment - 17 Feb 2018

@csthomas Fixed. Thank you!

avatar Gitjk
Gitjk - comment - 17 Feb 2018

Just for information - I just happened to see that Viktor also released a plugin to solve the problem on the 16th of February after an earlier discussion in the Joomla forum about the 'who's online' plugin. (Didn't try that one yet).

https://joomla-extensions.kubik-rubik.de/downloads/esk-easy-session-killer/joomla-3
Update: Link should work now

avatar Quy
Quy - comment - 18 Feb 2018

Here are some entries in the log file (p=$probability, d=$divisor, r=$random):

p=1 d=100 r=8.2489134663862
p=1 d=100 r=57.493870875459
p=1 d=100 r=3.1171028886235
p=1 d=100 r=3.9074626884956
p=1 d=100 r=22.473637906575

Will $random ever be less than $probability so that the following code will be executed?
if ($probability > 0 && $random < $probability)

avatar mbabker
mbabker - comment - 18 Feb 2018

Will $random ever be less than $probability so that the following code will be executed?

@Quy yes, validation script can be found at https://gist.github.com/mbabker/6b2636033ced225d56c1c1001d40d4cd
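
For readers who want to try this kind of check locally, the following is a rough stand-in written for this thread. It is not the contents of the gist. It counts how many simulated requests pass before the condition first matches, using the same comparison as the code quoted above. The function name, the fixed seed and the use of `mt_rand()` are assumptions.

```php
<?php
/**
 * Simulate requests until the GC condition first matches and
 * return the number of requests that took.
 */
function requestsUntilGc(int $probability, int $divisor, int $maxRequests = 1000000): int
{
    for ($n = 1; $n <= $maxRequests; $n++) {
        // Assumption: a uniform roll in [0, divisor) stands in for the plugin's $random.
        $random = $divisor * (mt_rand() / (mt_getrandmax() + 1));

        if ($probability > 0 && $random < $probability) {
            return $n;
        }
    }

    throw new RuntimeException('Condition never matched within the request limit.');
}

mt_srand(19687); // fixed seed so repeated runs give the same sequence

$runs = [];
for ($i = 0; $i < 20; $i++) {
    $runs[] = requestsUntilGc(1, 100);
}

printf("min=%d max=%d mean=%.1f\n", min($runs), max($runs), array_sum($runs) / count($runs));
```

With a 1/100 configuration the first match takes 100 requests on average (divisor divided by probability), but individual runs vary widely, which is why a handful of log entries with no match is not evidence of a bug.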

avatar Quy
Quy - comment - 18 Feb 2018

@mbabker Thanks! I can confirm that old entries were deleted from the sessions table since last night. I will monitor it for a few more days before submitting a test result.

avatar csthomas
csthomas - comment - 18 Feb 2018

I have tested this item successfully on a93b027


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/19687.

avatar csthomas csthomas - test_item - 18 Feb 2018 - Tested successfully
avatar Gitjk
Gitjk - comment - 18 Feb 2018

Currently this doesn't work in my case. Perhaps I missed something. I've used mbabker's patch tester to install #19687 and enabled the plugin of course. Is there anything else which is required to make this work?
After 48 hours my session table has grown again to 91.5 mb with 35500 lines and no sessions are deleted.

avatar mbabker
mbabker - comment - 19 Feb 2018

To patch:

  • Deploy the current staging branch (any commit after #19548 is merged)
  • Install and enable plugin
  • Wait

This plugin is NOT designed to run the cleanup operation on all requests, as demonstrated by the validation script at https://gist.github.com/mbabker/6b2636033ced225d56c1c1001d40d4cd which emulates the default parameters. Running that in one continual processing script 15-20 times on my local computer, it reaches a matching condition at a minimum of 20 iterations and a maximum of 150 iterations (mind you this is one process, not spread across multiple processes).

The plugin is triggering our internal gc() method on the session API. If you are using the database handler (default configuration), the session cleanup operation won't happen until just as the session connection is closed (which should presumably be at a point well after the HTTP response is sent, and even after this plugin is triggered, because we have a defer condition in the handler to delay cleanup until closing the connection).

If it is not triggering then validate your configuration. Unfortunately I don't have the time to offer one-on-one consulting for this PR, all I can say is validate everything is turned on and if need be add some logging statements to the plugin so that you can actually check to determine if the right code paths are being executed (is the path for cleanup being reached, if so is the random probability meeting the matching criteria). If it's really not running at all in a 48 hour period this really screams to me as a configuration issue, not a code issue (what happens if you set the probability to 100/100, basically meaning it should always run).

avatar Quy
Quy - comment - 19 Feb 2018

@Gitjk I am using v3.8.5. I had to also download/install #19548.

After installing #19687, I manually updated plugins/system/sessiongc/sessiongc.php with the following to log garbage collection performed:

before line 73, insert:
JLog::add('enable_session_metadata_gc: p=' . $probability . ' d=' . $divisor . ' r=' . $random, JLog::ERROR, 'enable_session_metadata_gc');

before line 60, insert:
JLog::add('enable_session_gc: p=' . $probability . ' d=' . $divisor . ' r=' . $random, JLog::ERROR, 'enable_session_gc');

Under Extensions > Plugins > System - Debug > Logging, enable Log Almost Everything.

Using FTP, download everything.php in administrator/logs/.

In the file, you should see garbage collection entries like the following if/when performed:
2018-02-18T18:11:02+00:00 ERROR #.#.#.# enable_session_gc enable_session_gc: p=1 d=100 r=0.65329211682033

Check the sessions table.

avatar Gitjk
Gitjk - comment - 19 Feb 2018

@Gitjk I am using v3.8.5. I had to also download/install #19548.

I've been using my live J3.8.5 website to test this. I will check whether it works once I also install #19548 later.
At present I'm testing the plugin from Viktor (see link above), which seems to work out of the box.

Meanwhile I found that I could use a cron job on my 'cheap shared server', too. Didn't notice that option before. The hosting service uses 'LiveConfig' instead of the usual cPanel or Plesk. But as a non-programmer I need to find an example of how to implement the 'truncate session' as a cron job for the Joomla table, which I can copy.

Just curious (because I'm not a programmer) - Is there a certain reason why clearing the session table is based on a 'probability' instead of a period of time? I mean, neither my website nor the server is a gambling machine. :-)

avatar mbabker
mbabker - comment - 19 Feb 2018

Is there a certain reason why clearing the session table is based on a 'probability' instead of a period of time?

You get into another database read/write operation to store a last cleanup timestamp, or you end up with the arbitrary logic that exists today where it is triggered on a second that is divisible by 5. At least with probability-driven logic, you won't end up with several concurrent DELETE operations just because a timestamp was met.

If you do prefer time based logic, cron jobs are better for this because they can run on a set schedule.
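
The difference between the two triggers can be sketched as follows. The time-based check is modelled here as "timestamp divisible by 5", which is an assumption based on the description in this thread rather than a copy of the core code. The function names and the burst size are hypothetical.

```php
<?php
// Assumption: the existing time-based check is modelled as "timestamp divisible by 5".
function timeBasedTrigger(int $timestamp): bool
{
    return $timestamp % 5 === 0;
}

// Every request arriving in the same second sees the same timestamp,
// so either all of them trigger the cleanup or none do.
function countTimeBasedRuns(int $timestamp, int $concurrentRequests): int
{
    return timeBasedTrigger($timestamp) ? $concurrentRequests : 0;
}

// With the probability check each request rolls independently,
// so concurrent cleanups in one second are unlikely at low odds.
function countProbabilityRuns(int $concurrentRequests, int $probability, int $divisor): int
{
    $runs = 0;

    for ($i = 0; $i < $concurrentRequests; $i++) {
        $random = $divisor * (mt_rand() / (mt_getrandmax() + 1));

        if ($probability > 0 && $random < $probability) {
            $runs++;
        }
    }

    return $runs;
}

$requests = 15; // hypothetical burst of requests within one second

printf("time-based, matching second: %d cleanups\n", countTimeBasedRuns(1519000000, $requests));
printf("time-based, other second:    %d cleanups\n", countTimeBasedRuns(1519000001, $requests));
printf("probability 1/100:           %d cleanups\n", countProbabilityRuns($requests, 1, 100));
```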

avatar csthomas
csthomas - comment - 20 Feb 2018

I will be grateful for another success test and a quick merge.

avatar joomdonation
joomdonation - comment - 20 Feb 2018

Hopefully I can test it today, otherwise tomorrow (just came back from the new year holiday and have to clear the support queue first).

avatar Kubik-Rubik
Kubik-Rubik - comment - 20 Feb 2018

I have tested this item successfully on a93b027

Great work, @mbabker!

Please get this into the next version.

Another question: Shouldn't we set the default handler to PHP instead of Database for fresh installations?



avatar Kubik-Rubik Kubik-Rubik - test_item - 20 Feb 2018 - Tested successfully
avatar mbabker
mbabker - comment - 20 Feb 2018

Another question: Shouldn't we set the default handler to PHP instead of Database for fresh installations?

We can discuss this default setting as a part of 4.0. I wouldn't touch it now though.

avatar franz-wohlkoenig franz-wohlkoenig - change - 20 Feb 2018
Status Pending Ready to Commit
avatar franz-wohlkoenig
franz-wohlkoenig - comment - 20 Feb 2018

Ready to Commit after two successful tests.

avatar mbabker mbabker - change - 21 Feb 2018
Labels Added: PR-staging RTC
avatar joomdonation
joomdonation - comment - 21 Feb 2018

I looked at the PR today and have a few issues (or questions):

  1. Since we moved the logic to clean up session metadata to the plugin, I think the plugin should be published by default; otherwise, the session metadata would not be deleted unless users publish the plugin. When I apply the patch and use discover install, the plugin is unpublished and I had to publish it manually. Seems not right.

  2. I found a minor issue with the UI. Since you use the boolean filter for params, if someone chooses No, it is not shown in the selected state after saving (the button is not red).
    [Screenshot: no_selected]

  3. Since the method afterSessionStart is removed, a few lines of code were lost:

$session = \JFactory::getSession();

if ($session->isNew())
{
	$session->set('registry', new Registry);
	$session->set('user', new \JUser);
}

Could you please confirm that it is expected? I didn't see any strange behavior during testing, just want to make sure it doesn't cause any issues at all

  4. I think the option Enable Session Garbage Collection in the plugin should not be enabled by default. It should only be enabled for sites running on servers that have gc_probability set to 0. By default, I think we should leave this job to PHP / the server.

  5. About the default values of the Probability and Divisor parameters: currently they are set to 1/100, meaning there is a 1% chance the session metadata is cleaned up. In our current behavior, cleanup runs when the timestamp is divisible by 5 (about 20%). So maybe we should increase the default value of the Probability parameter (to 20?).

  6. Still not happy with not deleting session metadata when the Database Handler is used, but maybe it's just me.

avatar infograf768
infograf768 - comment - 21 Feb 2018

I have also an issue concerning the purpose of the plugin.
+PLG_SYSTEM_SESSIONGC_ENABLE_SESSION_GC_DESC="When enabled, this plugin will attempt to perform session garbage collection based on the odds calculated by the probability and divisor."
This is not understandable by non-specialists. Basic users (and I am one regarding that kind of stuff) will not know what it does and why, including the Options.

avatar brianteeman
brianteeman - comment - 21 Feb 2018

@infograf768 I had to google to find out what "divisor" was. I did higher level maths and had never come across it - even though it is a correct term - we always used the term "factor".

For the description I would suggest simplifying it to
"This plugin will clear session metadata on a regular basis"

I don't think it needs anything more than that.

avatar infograf768
infograf768 - comment - 21 Feb 2018

IMHO, wherever "Garbage collection" is used, the strings should be modified.

+PLG_SYSTEM_SESSIONGC_GC_DIVISOR_DESC="In combination with the probability field, these two fields are used to determine the odds of the garbage collection operation being triggered on a request. The probability is calculated by using probability/divisor, e.g. 1/100 means there is a 1% chance that the process runs on each request."
+PLG_SYSTEM_SESSIONGC_GC_PROBABILITY_DESC="In combination with the divisor field, these two fields are used to determine the odds of the garbage collection operation being triggered on a request."
PLG_SYSTEM_SESSIONGC_XML_DESCRIPTION="System Plugin that handles session garbage collection tasks."
avatar Kubik-Rubik
Kubik-Rubik - comment - 21 Feb 2018

@joomdonation

Regarding 6.) The session table is also cleaned with the option Enable Session Garbage Collection for the Database handler. Or what exactly do you mean? Since we have the Database handler as default setting, this option should also be enabled by default.

avatar brianteeman
brianteeman - comment - 21 Feb 2018

@infograf768 suggestions welcome - am away from computer today

avatar joomdonation
joomdonation - comment - 21 Feb 2018

@Kubik-Rubik I just think by default, Joomla should only care about clean up session metadata and leave Session Garbage Collection to PHP/Server, so the option Enable Session Garbage Collection should be off by default. That's why I think we should delete session metadata for Database Handler as well. (Suppose we plan to store that metadata in separate table in the future as Michael suggested, we need that same delete logic for Database Handler, too).

That's how we did it before 3.8.4. If anyone wants to improve performance, they can disable the metadata cleanup option and set up a cron job.

avatar mbabker
mbabker - comment - 21 Feb 2018

Replies to @joomdonation

When I apply the patch and use discover install, the plugin is unpublished and I had to publish it manually. Seems not right.

Joomla does not automatically enable plugins when installed via the extension manager. This is unrelated to this PR. Please check the SQL deltas to ensure the expected published state is in use for normal upgrades.

I found a minor issue with the UI. Since you use the boolean filter for params, if someone chooses No, it is not shown in the selected state after saving (the button is not red).

This is a bug unrelated to this PR. I'm going to take a hard nosed stance on this one, security shouldn't be compromised to have a button show the right color (and this field definition is correct and secure as far as ensuring data is properly stored/filtered/validated).

Since the method afterSessionStart is removed, a few lines of code were lost

They were not. Please check the class hierarchy, you'll find that the parent web application class has the same method and logic.

I think the option Enable Session Garbage Collection in the plugin should not be enabled by default. It should only be enabled for sites running on servers that have gc_probability set to 0.

We can not make this decision by any sane measure. The best we can do is add a custom form field detecting PHP runtime configuration and displaying a recommendation.

Currently they are set to 1/100, meaning there is a 1% chance the session metadata is cleaned up. In our current behavior, cleanup runs when the timestamp is divisible by 5 (about 20%). So maybe we should increase the default value of the Probability parameter (to 20?).

Personally I'd say no. The existing logic is based on a predictable factor (a timestamp, and because of the nature of this check all requests in a matching second would trigger that cleanup operation; so it might be a 20% probability now but in that 20% the operation could run 5-15 times easily depending on your site's traffic), this method is in line with the PHP internals and is a truly random process (which is less likely to have the overlap the existing code has). There isn't anything in this data that calls for such frequent cleanup operations, but if someone does want that, they can customize it.

Still not happy with not deleting session metadata when the Database Handler is used

Sorry, I'm trying to make the system right and that means pushing an arbitrary restriction to prove a point. Once this gets merged then merged up to 4.0 I can finish this once and for all.

avatar mbabker
mbabker - comment - 21 Feb 2018

Regarding text, it is all influenced by the PHP manual. Try using translations from there to help form an opinion, or if someone can come up with a way to discuss highly technical matters in a not technical way then feel free to propose it.

English: http://php.net/manual/en/session.configuration.php#ini.session.gc-probability
Spanish: http://php.net/manual/es/session.configuration.php#ini.session.gc-probability
French: http://php.net/manual/fr/session.configuration.php#ini.session.gc-probability
German: http://php.net/manual/de/session.configuration.php#ini.session.gc-probability

avatar infograf768
infograf768 - comment - 21 Feb 2018

if someone can come up with a way to discuss highly technical matters in a not technical way then feel free to propose it.

I will try as I do not think this plugin is the subject of discussion in its strings, but rather a basic explanation all users can understand without the knowledge of its underlying mechanism.

avatar mbabker
mbabker - comment - 21 Feb 2018

Part of the problem as I perceived it was explaining how the two numeric fields work to come up with a configuration and how that configuration actually translates to determining when the process should run. And to be honest I think the PHP docs cover it. But, the PHP docs are aimed at a level of user the Joomla UI isn't, so ya, what I have now is a lot more technical in nature as a result.

avatar infograf768
infograf768 - comment - 22 Feb 2018

This is my proposal. It can evidently be modified at will.
I modified the xml as numerator and denominator do not need descriptions and I added a spacer as we do in com_media Options.

TTs will be told not to translate 'Garbage Collection'

[Screenshot: 2018-02-22 at 11:31:06]

The specific descriptions

[Screenshot: 2018-02-22 at 11:36:30]

[Screenshot: 2018-02-22 at 11:30:30]

avatar fevangelou
fevangelou - comment - 22 Feb 2018

Shouldn't "Probability" be named to something easier for humans to understand? E.g. "Session refresh interval".

And better still, couldn't this be triggered on a time basis? E.g. "once a day", "weekly" etc.

avatar fevangelou
fevangelou - comment - 22 Feb 2018

As a sidenote, even a 1% setting on a site with immense traffic, will probably cause unexpected load. Sure, the devs can adjust in such cases and use a cron job, but I'm pretty sure good ol' sysadmins and cPanel/Plesk folks will start nagging, and for good reason.

I think @SniperSister was spot on @ #19585

Such decisions that inherently affect performance (and in return cause a bad rep for Joomla) shouldn't be taken lightly and in the name of technical superiority. WP is crap, but that hasn't stopped it from taking over 20% of the web.

avatar infograf768
infograf768 - comment - 22 Feb 2018

Shouldn't "Probability" be named to something easier for humans to understand? E.g. "Session refresh interval".

sure.

avatar mbabker
mbabker - comment - 22 Feb 2018

As a sidenote, even a 1% setting on a site with immense traffic, will probably cause unexpected load. Sure, the devs can adjust in such cases and use a cron job, but I'm pretty sure good ol' sysadmins and cPanel/Plesk folks will start nagging, and for good reason.

With the plugin set to a 1% probability, this should result in less load than the 3.8.3 behavior where the cleanup's DELETE FROM query was running on every even numbered second, or since 3.8.4 on a second which is divisible by 5. Also, the time based triggering resulted in concurrent runs; this approach shouldn't have that particular issue.

And better still, couldn't this be triggered on a time basis? E.g. "once a day", "weekly" etc.

If you're using the "Who's Online" module (or anything showing logged in user counts or lists), which apparently a lot of people do, you can't have the cleanup be that infrequent. If you aren't, and have the capability to do so, you can set up a cron job to do cleanup on a timed interval like this. As for defaulting to a time basis, Joomla's been doing that up to and through today, on a far too frequent basis, and that has also been a cause of performance issues. Maybe this improves things in the long run, maybe this is a bad idea, either way with this PR there are a lot more tools available to fine tune this aspect of things.

avatar GCLW
GCLW - comment - 22 Feb 2018

This is something you broke and this still is not a fix for it. Constantly blaming server configs and demanding people make cron jobs to clean up your session table is absurd. Hundreds of MB and tens of thousands of rows now make logging into sites take 20-30 seconds. This should have been reverted while a real fix is done. Over two dozen sites and four different hosting providers and only "1" site is not broken. Brian? This should have been reverted and then a "fix" should have been worked on, not leave sites broken.

avatar fevangelou
fevangelou - comment - 22 Feb 2018

It's true that the change in v3.8.5 will balloon backups and cause havoc among users and hosting companies.

avatar mbabker
mbabker - comment - 22 Feb 2018

This is something you broke and this still is not a fix for it.

If you have a fix for it then feel free to propose it. Reverting to the 3.8.3 behavior is not a valid option if you understand the architectural and performance issues with that behavior on a high traffic site.

If this is not working, please explain the issues you are having with this patch so they may be addressed, with something more than "revert until 'fixed'" as that conveys absolutely nothing useful.

It's true that the change in v3.8.5 will balloon backups

Personally I would suggest best practice is to exclude session data from backups when practical, but alas not all tools offer that so it is a concern to address and I do think this PR helps with that.

avatar brianteeman
brianteeman - comment - 22 Feb 2018

You shouldn't back up a session table as there is no value in it being restored

avatar fevangelou
fevangelou - comment - 22 Feb 2018

Your comment Brian is catalytic as always. Server wide backup solutions don't cherry pick tables. Ask "small" companies like cPanel or Plesk.

avatar mbabker
mbabker - comment - 22 Feb 2018

Which is why I phrased my comment the way I did. When you're working with a site specific backup tool/protocol, it's easier to exclude the data than when you're doing server backups (to be fair, I really hope it's possible to exclude /var/lib/php/sessions from that backup, but you're right that specific tables in specific databases can't be excluded arbitrarily).

Another thing to reiterate, in case it's not yet clear. We have the problems we do in our API today primarily because of the optional session metadata logging (which powers features like who's online or user status modules, but can also have a practical use when you need to reference session IDs in other ways). Prior to 3.8.4, the database table never ballooned even with PHP's native GC handler disabled because core had an arbitrary DELETE FROM #__session query that ran so frequently that it acted as a performance bottleneck on some sites. Yes, I changed the logic for when that query runs, and yes I see now that was a bit overzealous, and yes I still stand by that change because it exposed a lot of other problems.

With this PR, we break a reliance on PHP's native GC handling being enabled, so that stale data from ALL session stores can be purged (with loosely the same logic that PHP's configuration uses). So if you're using a non-standard file path that won't get cleaned by a default cron job installed in WHM/cPanel or Ubuntu's PHP packages, or using something like APCu or Memcached as your session store without other tools to ensure those data stores are cleared, that stale data will get purged.

With this PR, we break the reliance on HTTP requests to run these cleanup operations on an arbitrary timestamp based measurement. For advanced configurations, you can move the cleanup entirely to a cron based system and fire it as frequently (or infrequently) as you choose. For those who can't or don't want to do that, you in essence get the PHP behavior if the server had the right settings enabled.

With this PR, having #13322 in 4.0 gets a little easier. Because ultimately the root cause of this issue is the metadata. I want to make it possible to turn off the logging of that, because there are use cases where you don't want it or have no need for it (really, most of the joomla.org sites I have admin for have no need for these constant CRUD actions to the session table). If that goal can be realized then we've made a lot of progress in the right direction IMO.

avatar joomdonation
joomdonation - comment - 22 Feb 2018

Overall, this PR is an improvement, although there are a few things I don't agree with (or am not happy with), as mentioned in my earlier comment #19687 (comment)

If someone still has issues even with this PR applied, you can use this plugin https://joomla-extensions.kubik-rubik.de/downloads/esk-easy-session-killer/joomla-3 from @Kubik-Rubik. I checked it before, it looks good, and it is what I expected to be the default behavior of this PR.

avatar mbabker
mbabker - comment - 22 Feb 2018

As long as you choose to treat the optional metadata AND the real data as one entity, then Viktor's plugin is fine. You can't treat them as one and the same, especially as one piece of that puzzle is not always stored to the database (whereas the other is). You MUST split the data, and you MUST do it with a transitional period so it is clear why the system is designed the way it is and we aren't just making breaking changes because I feel like screwing over developers.

Otherwise, really, we can just go back to the 3.8.3 behavior and be done with it, and like I keep saying, just drop every non-database related session store. Because as I have said in frustration a lot of times it is quite clear nobody really cares that your server's session GC is turned off until you have a 20K row database.

avatar mbabker
mbabker - comment - 22 Feb 2018

One more reason why the metadata implementation is flawed by the way.

JSessionStorageDatabase::write() does not have INSERT or UPDATE logic. Only UPDATE. Why? Because it relies on the application to insert the record, and the application fatally errors if it can't do that. So what is supposed to be an optional component is a fatally blocking operation, even when you aren't using the database.

This is why I am so adamant about treating the two types of data storage as what they are, two different things. The code in Joomla today not only mandates this metadata be written, but API elements are broken if it is not.

avatar GCLW
GCLW - comment - 22 Feb 2018

Revert to the 3.8.3
We are talking hundreds of MB and tens of thousands of records.

While the update was made with good intentions, it created a catastrophic side effect. Revert it, and spend as much time as you need on a different solution. We all understand what your goal is, but what is happening right now should be considered site breaking, and is a lot more widespread than the handful of people pointing out the problem.

Why is my site taking so long to log in? Why are the visitor and admin logged-in numbers growing? Why is my hosting provider sending me emails now warning me of my site's database size? Is there a fix? Can't they change it back?

These are all questions people have been or will be asking.

avatar joomdonation
joomdonation - comment - 22 Feb 2018

I understand that real session data and session metadata are two different things and should be stored in separate tables as you mentioned. What I am trying to say here is:

  1. Real session data clean up should be a PHP internal / server job. So I expect Enable Session Garbage Collection to be set to No by default. Websites running on a server which doesn't do this clean up can turn on this option manually.

  2. Because of reason #1, I want the code that cleans up metadata to run for the Database Handler as well. That is needed for the Database Handler anyway once metadata is stored in a separate table in the future.

  3. The default value of Probability is 1. I am worried that it is too low and we will still get people complaining about wrong data in the #__session table (Viktor's plugin defaults to 10 and I think that might be a good value). The fact is that in the old Joomla version the cleanup ran at 20%, and now it only runs at 1% by default, which is a big difference.

Just my personal feeling, so if other users think that the default behavior is fine, we can go with it.

avatar joomdonation
joomdonation - comment - 22 Feb 2018

@GCLW Could you please apply this PR and check the result? It should solve the issue you are having (maybe you will have to increase the value of Probability to a higher number like 20 so that the #__session table will be cleaned up more often)

avatar mbabker
mbabker - comment - 22 Feb 2018

Real session data clean up should be a PHP internal / server job. So I expect Enable Session Garbage Collection to be set to No by default. Websites running on a server which doesn't do this clean up can turn on this option manually.

Because Joomla defaults to the database session handler, and because it appears that PHP is more frequently configured to not run GC than it is configured to do so, the default should be on.

Because of reason #1, I want the code that cleans up metadata to run for the Database Handler as well. That is needed for the Database Handler anyway once metadata is stored in a separate table in the future.

You can't delete one without the other. The purge metadata operation could, through some unfortunate series of events, inadvertently purge active session data. That is why you cannot reliably have that process run when the database handler is in use. When the data is moved, the exclusion can be removed.

The default value of Probability is 1. I am worried that it is too low and we will still get people complaining about wrong data in the #__session table (Viktor's plugin defaults to 10 and I think that might be a good value). The fact is that in the old Joomla version the cleanup ran at 20%, and now it only runs at 1% by default, which is a big difference.

This is in line with the default PHP configuration. It is still less frequent than what is happening now, but frequent enough that stale data should not persist in the database for long (unless you can really prove it's possible to go X thousand requests over a measurable timeframe without it being triggered once). Remember, all of this is still reliant on HTTP requests, so for a low traffic site a 1/100 probability does become problematic, but for a high traffic site the 20/100 probability becomes a performance issue and we are right back to where we started. The 10/100 probability is IMO too frequent.
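
For a rough sense of scale, here is a minimal sketch of how likely it is that a cleanup is never triggered over a given number of requests. It assumes every request rolls the odds independently; the function name is made up for illustration and is not part of Joomla.

```php
<?php
// Chance that a cleanup with odds $probability/$divisor is never triggered
// across $requests independent requests: (1 - p/d)^n.
// chanceOfNoCleanup() is a hypothetical name, not a Joomla API.
function chanceOfNoCleanup(int $probability, int $divisor, int $requests): float
{
    return pow(1 - $probability / $divisor, $requests);
}

// Print the chance of "no cleanup at all" for a few request counts at 1/100 odds
foreach ([100, 500, 1000] as $requests) {
    printf(
        "1/100 odds, %d requests: %.4f%% chance of no cleanup\n",
        $requests,
        100 * chanceOfNoCleanup(1, 100, $requests)
    );
}
```

Under that assumption, the chance of going without any cleanup shrinks quickly as the request count grows, which is why the low default mostly affects sites with very little traffic.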

avatar mbabker
mbabker - comment - 22 Feb 2018

You know what? I'm done fighting. @joomla/cms-maintainers someone else take over this PR, I'm sick and tired of being shot down every time I try to fix the real architecture and performance problems in Joomla so it stops getting labeled as an amateur hour hobby project; clearly I'm the only person who cares enough about that derogatory label to do something about it.

avatar brianteeman
brianteeman - comment - 22 Feb 2018

Reverting the code will NOT magically remove the records from your database!!

So instead of complaining, please take the time to test this PR. Subject to it being a successful test, of course, your problems will then be solved. Just demanding a revert will get nowhere and won't help your site.

avatar Kubik-Rubik
Kubik-Rubik - comment - 22 Feb 2018

@GCLW As @joomdonation said, you should test this PR. You will see that it solves the issue.

avatar brianteeman
brianteeman - comment - 22 Feb 2018

@Kubik-Rubik seems it is easier to demand others do stuff than to simply test it yourself :(

avatar GCLW
GCLW - comment - 22 Feb 2018

I don't have time to test right now. I have to work on solutions to keep those session database tables under control right now. Thank you.

avatar mbabker
mbabker - comment - 22 Feb 2018

Instead of working on your own solution, you do realize you could test this one to see if it fixes your problem, right? And either tell us it does, which helps fix the problem in core, or tell us it has problems so that the problems can be fixed.

avatar Quy Quy - test_item - 22 Feb 2018 - Tested successfully
avatar Quy
Quy - comment - 22 Feb 2018

I have tested this item successfully on 36c92de

Tested on an off season site so traffic is very low. Here is the stat for 2 days with PHP session handler and default plugin settings: day1=9 times, day2=10 times. Sessions table is cleared of stale records.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/19687.

avatar Kubik-Rubik Kubik-Rubik - test_item - 22 Feb 2018 - Tested successfully
avatar Kubik-Rubik
Kubik-Rubik - comment - 22 Feb 2018

I have tested this item successfully on 36c92de



avatar mbabker mbabker - change - 22 Feb 2018
Labels Added: Language Change
avatar mbabker
mbabker - comment - 22 Feb 2018

This now has the language tweaks from JM, with some extra changes of my own (basically the term "garbage collection" is fully gone).

avatar Quy
Quy - comment - 22 Feb 2018

Do we need to update/cross-reference to this plugin in the Session Handler tooltip under Global Configuration?

avatar mbabker
mbabker - comment - 22 Feb 2018

IMO the answer's no. The settings there only relate to the configuration of connecting to the session store. There are no settings there which directly correspond to the optional metadata or cleanup operations.

avatar joomdonation joomdonation - test_item - 23 Feb 2018 - Tested successfully
avatar joomdonation
joomdonation - comment - 23 Feb 2018

I have tested this item successfully on 79a6308

Deployed and tested it on our live site; it is now working as expected (before this, we had to set up a cron job to clean up session data).



avatar Gitjk Gitjk - test_item - 23 Feb 2018 - Tested successfully
avatar Gitjk
Gitjk - comment - 23 Feb 2018

I have tested this item successfully on 79a6308

Now it also works in my case, so I can say 'tested successfully'. However, I suppose many users will have to fine-tune the default settings of the plugin. On my website, approximately 99 percent of the sessions are due to crawler activity and, if I'm not mistaken, each page request from a bot triggers a new session. With the default setting, I still see a growing queue of sessions waiting for deletion (on average I currently see approximately 650 more sessions being added per hour than old sessions being deleted).

Would it make sense and be possible to limit the session time of bots?


avatar mbabker
mbabker - comment - 23 Feb 2018

Would it make sense and be possible to limit the session time of bots?

No. Trying to identify a bot visit and differentiate it from a human visit has too many variables to be done in a practical manner. Not to mention this would in essence be a static list in each release because identifiers (browser user agent strings and IP addresses primarily) are constantly changing.

avatar Gitjk
Gitjk - comment - 23 Feb 2018

When I added "I have tested this item successfully on 79a6308" above, I'm afraid I overlooked that the session table by default is not sorted by the time column. When I do sort the session table by the time column and afterwards look at the first and the last row, it reveals that the oldest sessions still don't get purged.

Current settings are:

php.ini settings (local value / master value):
session.gc_divisor: 1000 / 1000
session.gc_maxlifetime: 9000 / 1440
session.gc_probability: 0 / 0

Joomla session handler: Database

Plugin System - Session Data Purge
Enable Session Data Cleanup: Yes
Enable Session Metadata Cleanup: Yes
Probability: 50
Divisor: 100

Any idea about possible reasons why the plugin still seems to refuse to work in my case?
Did I understand correctly that it (normally) shouldn't be necessary to configure a cron job?

Of course I could simply use one of the other options (Viktor's plugin or a cron job) to avoid an ever growing session table. But I would like to understand why it's not working in my case. My understanding is that switching the session handler to PHP would not work, because of session.gc_probability = 0 in Debian.

(I hope the other successful tests were also done with session.gc_probability = 0) 😄


avatar mbabker
mbabker - comment - 23 Feb 2018

Without having access to your server to log what is actually happening when the cleanup code runs, it is impossible to say. In all cases, though, the cleanup code for the session basically deletes all records where the timestamp is earlier than the current Unix epoch timestamp minus the session lifetime in seconds (i.e. if the time is 1519420314 and the lifetime is 900 seconds, the query should delete all records whose timestamp is older than 1519419414).

The PHP runtime configuration settings are not taken into consideration in this plugin. It is driven entirely by the Joomla global configuration (for the lifetime) and the probability settings in the plugin params.
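
The cutoff arithmetic described above can be sketched as follows. This is only an illustration of the calculation: the helper name is hypothetical and the query in the comment shows the general shape, not the exact statement Joomla issues.

```php
<?php
// Illustrative only: sessionCutoff() is a made-up helper, not Joomla's API.
// Records with a timestamp older than the returned value count as expired.
function sessionCutoff(int $now, int $lifetimeSeconds): int
{
    return $now - $lifetimeSeconds;
}

$cutoff = sessionCutoff(1519420314, 900);
echo $cutoff, "\n"; // 1519419414

// The cleanup query would then be shaped roughly like:
//   DELETE FROM #__session WHERE time < 1519419414
```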

avatar ggppdk
ggppdk - comment - 23 Feb 2018

Did I understand correctly that it (normally) shouldn't be necessary to configure a cron job?

No, a cron job is required in your case, because you mention that you have:

Joomla session handler: Database

So you are required to set up a CRON job for cli/sessionGc.php
Please see description and at least the first 6-7 comments

avatar mbabker
mbabker - comment - 23 Feb 2018

You are NOT required to set up a cron job at all.

If you intend to use the plugin and have cleanup happen as part of web requests (default configuration):

  • If the session handler is the database, you MUST have the "Enable Session Data Cleanup" option set to yes (default configuration) otherwise the database will not be cleaned. This is the option that triggers the code path for JFactory::getSession()->gc();. The "Enable Session Metadata Cleanup" option in this case makes no difference as explained all over the place.

  • If the session handler is not the database, you MUST have the "Enable Session Metadata Cleanup" option set to yes (default configuration) otherwise the database will not be cleaned. This is the option that triggers the code path for clearing the optional session metadata, with the database handler check in place to ensure that metadata cleanup does not corrupt "real" session data (in addition to all the opinionated discussion about treating session data and extra metadata as separate things).
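
The two rules above can be summarised in a short sketch. It is not the plugin's actual code: the function name, the handler identifiers ('database', 'php') and the task labels are assumptions made for illustration.

```php
<?php
// Hypothetical helper, not the plugin's real code. It only restates which
// cleanup tasks are attempted during a web request for a given setup.
function cleanupTasksForRequest(string $handler, bool $dataCleanup, bool $metadataCleanup): array
{
    $tasks = [];

    if ($dataCleanup) {
        // Code path described above: JFactory::getSession()->gc()
        $tasks[] = 'session data';
    }

    if ($metadataCleanup && $handler !== 'database') {
        // Skipped for the database handler so "real" session rows are not purged
        $tasks[] = 'session metadata';
    }

    return $tasks;
}

print_r(cleanupTasksForRequest('database', true, true)); // only the session data task
print_r(cleanupTasksForRequest('php', true, true));      // both tasks
```

With the database handler, turning off "Enable Session Data Cleanup" leaves the list empty, which matches the warning that the table would then not be cleaned.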

avatar ggppdk
ggppdk - comment - 23 Feb 2018

@mbabker

I was talking about metadata cleanup and I was looking at this line

https://github.com/joomla/joomla-cms/pull/19687/files#diff-c62bf5ee641c0af7406f55c3f4e4f9d9R64

Maybe I'm too sleepless, but it looks like it is not executed in the database case regardless of the
"Enable Session Metadata Cleanup" setting,
because it has a logical AND, not an OR

avatar mbabker
mbabker - comment - 23 Feb 2018

The line evaluates to "if the session handler is not the database and the 'Enable Session Metadata Cleanup' option is enabled".

The two are different tasks.

avatar Gitjk
Gitjk - comment - 23 Feb 2018

Would somebody be so kind as to answer two noobish questions:

From cli/sessionMetadataGc.php:
 * This is a CRON script to delete expired optional session metadata which should be called from the command-line, not the
 * web. For example something like:
 * /usr/bin/php /path/to/site/cli/sessionMetadataGc.php

  1. Where does the plugin get the live path from?

...the optional session metadata

  2. What exactly is the optional session metadata? And is there 'optional' session metadata in J3.8.5?
    ('optional' sounds to me like a configuration option)
avatar infograf768
infograf768 - comment - 24 Feb 2018

Question: is it supposed to work for both admin and site users?
On my localhost, plugin set to 1/100 yesterday, database handler, I log in today and I see that I have 2 sessions.
screen shot 2018-02-24 at 07 27 43

screen shot 2018-02-24 at 07 42 20

avatar infograf768
infograf768 - comment - 24 Feb 2018

Hmm, after a while the double entry does not exist anymore.

avatar zero-24
zero-24 - comment - 24 Feb 2018

Where does the plugin get the live path from?

I don't get the question.

What exactly are optional session meta data. And are there 'optional' session metadata in J3.8.5?
('optional' sounds to me like a configuration option)

If Joomla does not take care of the session, we don't need that optional session metadata, as this is done by the system that takes care of the sessions :)

avatar joomdonation
joomdonation - comment - 24 Feb 2018

@Gitjk

Where does the plugin get the live path from?

I guess you want to find the path shown as /path/to/site so that you can set up a cron job? If so, you can get this zip file, unzip it, upload the extracted root.php to the root folder of your site (via FTP or cPanel), then open the URL https://domain.com/root.php. The path to the root folder of your site will be displayed (of course you need to replace https://domain.com with your site URL).

root.zip

What exactly are optional session meta data. And are there 'optional' session metadata in J3.8.5?

That session metadata is not optional at the moment (it might be optional in Joomla 4, @mbabker made a PR for it). When someone accesses your site, Joomla creates a record in the #__session table to store the session id, time (the time that the user accessed your site), userid, username, and client_id (site or administrator). That session metadata has been there for a long time; it is not something only introduced in 3.8.5.

Hope it helps answering your questions

avatar joomdonation
joomdonation - comment - 24 Feb 2018

When I do sort the session table by the time column and afterwards look at the first and the last row, it reveals that the oldest sessions still don't get purged.

@mbabker @csthomas Maybe it is because we currently store the time using a varchar data type, so the < operator in the delete command doesn't work properly (maybe the values are compared as strings, and that's why some session records are not deleted).
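
To show what a string comparison would mean for timestamps, here is a small PHP sketch. It does not touch a database, and whether the database actually compares the column as strings in this query is not verified here; the values are made up.

```php
<?php
// Illustration only: plain PHP values, no database involved.
$older = '999999999';   // 9 digits (a date in 2001)
$newer = '1519420314';  // 10 digits (Feb 2018)

// A string comparison goes character by character, so '9...' sorts after '1...'
var_dump(strcmp($older, $newer) < 0);   // bool(false): wrong order as strings

// A numeric comparison gives the expected order
var_dump((int) $older < (int) $newer);  // bool(true)

// With digit strings of equal length, both comparisons agree
var_dump(strcmp('1519419414', '1519420314') < 0); // bool(true)
```

Unix timestamps have been ten digits long since September 2001, so a string comparison would only misorder values of different lengths.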

@Gitjk If you can export data of the #__session table on your site and attach it here, I think people could help figure out why it doesn't work on your site

avatar mbabker
mbabker - comment - 24 Feb 2018

What exactly are optional session meta data. And are there 'optional' session metadata in J3.8.5?
('optional' sounds to me like a configuration option)

Session data in Joomla is made up of two components:

  • The "real" session data (this is where things like the reference to your logged in user record are stored, or your submission when you submit a form and hit errors, so that your screen gets refilled, etc.)
  • The "optional" session metadata (this is what lets core features such as the Who's Online module on the frontend or the various user status modules in the backend function)

The "optional" metadata is not required for a session to be valid (as far as what's in the database goes, you only need the session ID, the timestamp, and the data columns; the rest are all for this metadata stuff). But, there is a fundamental flaw in its implementation. Joomla fatally exits if this metadata record cannot be inserted, and if you are using the database session handler (default configuration) then it cannot work correctly without the metadata record being inserted (as in the handler only issues UPDATE queries to write updated data, it does not try to do an INSERT query if a row does not already exist).

Many sites just don't have a need for this metadata tracking, and aside from one core feature (the check if a record is checked out for editing by an authenticated user with an active session), it can be safely disabled with no repercussions, if the architecture for it is in the right state (which my PR for 4.0 fixes). As a result, this metadata tracking is in and of itself a performance hit because at a minimum it forces one SELECT query onto the session table per request, and up to three in total under the right circumstances (the SELECT query doesn't find a record for your session ID, an INSERT query to put it in the database, and a DELETE query for cleanup of expired data).

So, the plugin here has two different cleanup operations, and we have two different scripts for cron jobs supporting these cleanup operations.

"Enable Session Data Cleanup" plugin option and cli/sessionGc.php - This essentially runs the same cleanup mechanism that PHP core does when session.gc_probability and session.gc_divisor are set to the right values. This option/script isn't always needed even when session.gc_probability is set to 0; as an example, on cPanel based servers or with the default scripts installed in PHP packages on an Ubuntu server, there are cron jobs to clear the default filesystem storage path (or if using another backend you may already have tools in place to do this cleanup).

"Enable Session Metadata Cleanup" plugin option and cli/sessionMetadataGc.php - This is the trigger for clearing that optional metadata. As this part isn't part of sessions in the native PHP API, this cleanup operation does need to exist in some form.

So, for 3.x with this PR applied the architecture is moved as far as possible to be able to split up the handling of metadata and real data without any B/C breaks, with the benefit of being able to cut some of the performance impacting operations out of each request cycle. By the time we get around to 4.0, everything should be able to be split up and implemented in a way where the metadata can be turned off if so desired.

avatar Gitjk
Gitjk - comment - 24 Feb 2018

Thanks for the explanations!
@mbabker - sounds to me like it would make sense to create a new table for the session meta data in J4.0

avatar csthomas
csthomas - comment - 26 Feb 2018

Perhaps the debug option would be useful to add a log when running the delete query to determine the number of rows removed by the plugin.

avatar zero-24
zero-24 - comment - 26 Feb 2018

Perhaps the debug option would be useful to add a log when running the delete query to determine the number of rows removed by the plugin.

As this is already in RTC state, I think we can include / discuss such an option in a later PR, as it does not impact the feature / functionality itself. That way we can move this forward to be included in 3.8.6.

avatar zero-24
zero-24 - comment - 26 Feb 2018

I have just fixed the merge conflicts coming from the recaptcha PR :)

avatar csthomas
csthomas - comment - 26 Feb 2018

For me it is OK, I just thought that this PR is still not merged because there is some issue.

avatar rdeutz rdeutz - change - 26 Feb 2018
Status Ready to Commit Fixed in Code Base
Closed_Date 0000-00-00 00:00:00 2018-02-26 22:19:50
Closed_By rdeutz
avatar rdeutz rdeutz - close - 26 Feb 2018
avatar rdeutz rdeutz - merge - 26 Feb 2018
avatar sandewt
sandewt - comment - 4 Mar 2018

In the "Session Data Purge" plugin the probability input field is a variable (1, 2, 3, ...)
The probability is quantified as a number between 0 and 1. (Wikipedia, etc).
So the term probability is somewhat confusing.

Alternative: Take a fixed value for the probability (= 1).
One variable value for the divisor should be sufficient.
See the following examples:
E.g. 1):
probability = 1
divisor = 100
Result 1/100 x 100% = 1%
E.g. 2):
probability = 2
divisor = 200
Result 2/200 x 100% = 1%
E.g. 3):
probability = 1
divisor = 2
Result 1/2 x 100% = 50%
E.g. 4):
probability = 4
divisor = 8
Result 4/8 x 100% = 50%

You get the same result.

avatar Gitjk
Gitjk - comment - 4 Mar 2018

The probability is quantified as a number between 0 and 1. (Wikipedia, etc).

In this case 'probability' actually means 'percentage probability' as defined in php.
See here: http://php.net/manual/en/session.configuration.php#ini.session.gc-probability

avatar sandewt
sandewt - comment - 4 Mar 2018

The probability is quantified as a number between 0 and 1. (Wikipedia, etc).

The probability is always a number between 0 and 1, or expressed as a percentage between 0 and 100%.



avatar mbabker
mbabker - comment - 4 Mar 2018

Yes, mathematically there are several values you can configure the two fields with that have the same result (1/10 == 10/100, 1/25 == 4/100), that doesn't mean one of the configuration fields is redundant and can be hardcoded at a single value.
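
As a sketch of that equivalence, the check below follows the odds calculation quoted further down in this thread, with the random draw passed in as a parameter so the result is reproducible. The function name and the $draw parameter are made up for illustration.

```php
<?php
// Sketch of the odds check with an injectable draw. shouldRunCleanup() is a
// hypothetical name; $draw stands in for lcg_value(), a float between 0 and 1.
function shouldRunCleanup(int $probability, int $divisor, float $draw): bool
{
    $random = $divisor * $draw;

    return $probability > 0 && $random < $probability;
}

// 1/10 and 10/100 trigger on the same draws
foreach ([0.05, 0.099, 0.5, 0.9] as $draw) {
    var_dump(shouldRunCleanup(1, 10, $draw) === shouldRunCleanup(10, 100, $draw)); // bool(true)
}
```

Keeping both fields mirrors the pair of settings in PHP's runtime configuration, session.gc_probability and session.gc_divisor.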

avatar sandewt
sandewt - comment - 5 Mar 2018

The probability is always a number between 0 and 1, or expressed as a percentage between 0 and 100%.

@mbabker
I do not understand what you mean exactly.
The next example makes clear what I mean.
This means that one field for the probability should be adequate.

<?php
echo 'probability between 0 and 100% <br />';
echo '0% is never a hit <br />';
echo '50% every two times a hit<br />';
echo '100% is always a hit <br />';
echo '<br />';
$probability = 50; // between 0% - 100%
echo 'probability = ' . $probability . '%'; 
echo '<br />';
$random = 100 * lcg_value();
echo 'random = ' . $random;
echo '<br />';
if ($probability > $random)
{
	echo 'hit'; // equivalent with do $session->gc()
}
else
{
	echo 'no hit';
}	
?>

[EDIT] Yes, in php there is another definition. But this functionality is actually not used. Finally it does not benefit the user-friendliness.

avatar drewgg
drewgg - comment - 7 Mar 2018

@mbabker
Thank you for all your long, hard work trying to fix this design flaw while putting up with all the resulting flak (I've read through a number of your PRs on this topic). A couple of the sites I maintain are pretty high traffic, and by the time I discovered this issue a couple of days ago (an editor was having trouble logging in to the admin side) the sessions table for that site was 9 GB in size with 10 million rows 😲.

I implemented the patch here and am running cleanup using the cron method. Everything looks good!

The one question I am still unclear on:

You had stated the following (which applies to my configuration):

Session GC:
"If I want Joomla to perform session garbage collection, And want to use cron jobs to manage that, And I have session.gc_probability set to 0, And I do not have any other existing jobs purging expired data, Then I should create a cron job for cli/sessionGc.php and disable the appropriate plugin settings"

Metadata GC:
If I want optional session metadata to be cleared by way of a cron job, Then I should create a cron job for cli/sessionMetadataGc.php and disable the appropriate plugin settings

Does "And I do not have any other existing jobs purging expired data" include sessionMetadataGc.php among "other existing jobs"?

ie, If I am running sessionGc.php do I ever need to consider also running sessionMetadataGc.php?

My current understanding is: no, I don't need to run sessionMetadataGc.php with my current configuration.

avatar mbabker
mbabker - comment - 7 Mar 2018

If you're setting up cron jobs...

If you're using the database session handler, you only need one cron job for the sessionGc.php file. If you're using any other session handler, you'd need a cron job for the sessionMetadataGc.php file always, and the sessionGc.php file only if nothing else is already doing its job (if using the PHP handler, odds are you don't need this cron job as cPanel and Ubuntu have default jobs cleaning up this part of the filesystem (but you should still check to verify); if using one of the cache engines as a session handler you may need it depending on the cache engine configuration).
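
Restated as a small sketch, assuming the handler is identified by a plain string and that you know whether something else already purges expired session data. The helper, the 'database' identifier and the flag are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper that restates the rules above for cron-based cleanup.
// $otherJobPurgesData: true if e.g. a server cron job already clears session data.
function cronScriptsNeeded(string $handler, bool $otherJobPurgesData): array
{
    if ($handler === 'database') {
        // One job covers the whole #__session table
        return ['cli/sessionGc.php'];
    }

    // Any other handler always needs the metadata job
    $scripts = ['cli/sessionMetadataGc.php'];

    if (!$otherJobPurgesData) {
        $scripts[] = 'cli/sessionGc.php';
    }

    return $scripts;
}

print_r(cronScriptsNeeded('database', false));
print_r(cronScriptsNeeded('php', true));
```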

avatar drewgg
drewgg - comment - 7 Mar 2018

If you're using the database session handler, you only need one cron job for the sessionGc.php file.

I am. That answers it. Thank you!

avatar sandewt
sandewt - comment - 9 Mar 2018

[EDIT] Yes, in php there is another definition. But this functionality is actually not used.

@mbabker I try to understand the code, so the following.

It is possible to add the following functionality in htaccess, to clean stored garbage data.
php_value session.gc_probability 1
php_value session.gc_divisor 100

That's why I went looking for this functionality in the Joomla! core.
I found only this functionality:
ini_get('session.gc_maxlifetime')

And that the session table is finally emptied by the following function:
->delete($this->db->quoteName('#__session'))

Why then fill in the probability and the divisor, if session.gc_probability and session.gc_divisor are not used?

Or do I make a mistake?

avatar brianteeman
brianteeman - comment - 9 Mar 2018

It is possible to add the following functionality in htaccess, to clean stored garbage data.

Not all server setups will allow you to set a php_value in an htaccess

avatar sandewt
sandewt - comment - 9 Mar 2018

Thanks @brianteeman, that means that one input field is sufficient.

file: sessiongc.php (Joomla! core file)

$probability = $this->params->get('gc_probability', 1);
$divisor     = $this->params->get('gc_divisor', 100);
$random = $divisor * lcg_value();
if ($probability > 0 && $random < $probability)
{
	$session->gc();
}

My explanation:

$probability = $this->params->get('gc_probability', 1);
$random = 100 * lcg_value();
if ($probability > $random)
{
	$session->gc();
}
avatar mbabker
mbabker - comment - 9 Mar 2018

It is fine the way it is and is consistent with the PHP runtime configuration.

avatar sandewt
sandewt - comment - 9 Mar 2018

It is fine the way it is and is consistent with the PHP runtime configuration.

@mbabker, so be it, thanks for all the work you have done.

avatar kubwit
kubwit - comment - 12 Mar 2018

Sorry, I know it's closed. Just to add: I have a site that is not huge (763 sessions and 2061 pageviews over the last 6 days, per Google Analytics), and I truncated the session table 6 days ago because it had 250k+ entries and the backend had load times of about 30k ms in some cases. After truncating it, the issue disappeared. Now, after 6 days, it has 220k+ entries again and has started to be very slow again.
I hope it has the same cause. If not, and I can help in some way, let me know. Running 3.8.5 on PHP 7.
Thanks
