J3 Issue ?
avatar mbabker
mbabker
24 Oct 2018

While analyzing the graphs generated of the CSV export of the database @PhilETaylor provided in #22700, the two high memory spots seem to be the database (expected) and in all the calls to Joomla\CMS\Router\Route::link(). All combined, 96.4 MB is being used in these method calls; 3.63 MB is spent in calling Joomla\CMS\Uri\Uri::toString() (reasonable considering we're talking about 51,000 objects) and the other 92.8 MB is spent in Joomla\CMS\Router\AdministratorRouter::build(). Note all memory references are for the full request and are not specific to what's triggered just from the com_actionlogs controller.

At this point, 20.7 MB is spent in calling Joomla\CMS\Uri\Uri::setPath() and the other 72.1 MB in building URIs. Analyzing the calls to setPath(), 15.1 MB of its memory use calls from calling PHP's explode() function. This seems to imply that use of explode() may not be performant when generating large quantities of links.

The other large bulk of memory use is the 54 MB used in creating Joomla\CMS\Uri\Uri objects. Actually, just about all of the memory use from this code path leads into Joomla\Uri\UriHelper::parse_url(). joomla-framework/uri#20 was merged to the 2.0 branch there as a performance enhancement, this may be something to look at backporting (ensuring PHP 5.3 friendliness).

Graph for analysis: https://blackfire.io/profiles/e18dbcb1-f984-4206-99a8-8e5754f4d68a/graph

Note this also has #22752 in place

And pinging @Hackwar since he's one of the only ones crazy enough to look at any code when "routing" is mentioned ?

avatar mbabker mbabker - open - 24 Oct 2018
avatar joomla-cms-bot joomla-cms-bot - change - 24 Oct 2018
Labels Added: ?
avatar joomla-cms-bot joomla-cms-bot - labeled - 24 Oct 2018
avatar csthomas
csthomas - comment - 24 Oct 2018

For setPath can you test it:

diff --git a/libraries/vendor/joomla/uri/src/AbstractUri.php b/libraries/vendor/joomla/uri/src/AbstractUri.php
index 37616b264f..7b50c59eab 100644
--- a/libraries/vendor/joomla/uri/src/AbstractUri.php
+++ b/libraries/vendor/joomla/uri/src/AbstractUri.php
@@ -399,30 +399,54 @@ abstract class AbstractUri implements UriInterface
         */
        protected function cleanPath($path)
        {
-               $path = explode('/', preg_replace('#(/+)#', '/', $path));
+               $segments = explode('/', $path);
+               $last     = count($segments) - 1;
 
-               for ($i = 0, $n = \count($path); $i < $n; $i++)
+               $total  = 0;
+               $blocks = array();
+
+               foreach ($segments as $i => $segment)
                {
-                       if ($path[$i] == '.' || $path[$i] == '..')
+                       if ($segment === '')
                        {
-                               if (($path[$i] == '.') || ($path[$i] == '..' && $i == 1 && $path[0] == ''))
+                               if ($i === 0 || $i === $last)
                                {
-                                       unset($path[$i]);
-                                       $path = array_values($path);
-                                       $i--;
-                                       $n--;
+                                       // Keep the first and last slash
+                                       $blocks[] = '';
+                                       $total++;
                                }
-                               elseif ($path[$i] == '..' && ($i > 1 || ($i == 1 && $path[0] != '')))
+
+                               continue;
+                       }
+
+                       if ($segment === '.')
+                       {
+                               // Remove './'
+                               continue;
+                       }
+
+                       if ($segment === '..')
+                       {
+                               if ($total === 1 && $blocks[0] === '')
+                               {
+                                       // Replace /../foo.php with /foo.php
+                                       continue;
+                               }
+
+                               if ($total >= 1)
                                {
-                                       unset($path[$i]);
-                                       unset($path[$i - 1]);
-                                       $path = array_values($path);
-                                       $i -= 2;
-                                       $n -= 2;
+                                       // Replace foo/../boo.php with boo.php
+                                       unset($blocks[$total - 1]);
+                                       $total--;
+
+                                       continue;
                                }
                        }
+
+                       $blocks[] = $segment;
+                       $total++;
                }
 
-               return implode('/', $path);
+               return implode('/', $blocks);
        }
 }
avatar mbabker
mbabker - comment - 24 Oct 2018

Doesn't seem like it's helping. Graph comparing current staging (57988ca) to staging + your patch - https://blackfire.io/profiles/compare/5ec78580-43c2-47a0-ab15-c7655d9a4948/graph (the memory use is a little lower but I've also got a smaller export before hitting my 90 second timeout)

avatar andrepereiradasilva
andrepereiradasilva - comment - 24 Oct 2018

hi! saw this today, something like this would help?
at least, since it avoid the cicle in non relative urls, should be a lot more performant in most url circunstances, with little aditional cost in relative url cases.

protected function cleanPath($path)
{
	// Most urls are not relative, for those cases avoid expensive operations
	if (\strpos($path, './') === false)
	{
		// RFC 3986 incorrect paths...
		if (\strpos($path, '//') === false)
		{
			return \str_replace('//', '/', $path);
		}

		return $path;
	}

	// For relative paths translate paths
	$segments = \explode('/', $path);
	$blocks   = array();
	$last     = \count($segments) - 1;

	foreach ($segments as $i => $part)
	{
		// Ignore this segment, remove last segment from stack
		if ($part === '..' && $i !== 0)
		{
			\array_pop($blocks);
			continue;
		}

		// Ignore this segments
		if ($part === '.' || ($part === '' && $i !== 0 && $i !== $last))
		{
			continue;
		}

		$blocks[] = $part;
	}

	return \implode('/', $blocks);
}
avatar mbabker
mbabker - comment - 25 Oct 2018

Looks like that one made a dent in things - https://blackfire.io/profiles/compare/ccb771c5-ca70-43d0-878c-3219cc313a80/graph (memory use is higher with the patch applied but that's because the overall request is able to export more rows before hitting the timeout, if you look at the memory tab the entirety of the memory spent in Joomla\CMS\Uri\Uri::cleanPath() is wiped out).

avatar mbabker
mbabker - comment - 25 Oct 2018

And for what it's worth applying joomla-framework/uri#20 on top of that patch doesn't create a significantly different result (it does massively lower the memory used by Joomla\Uri\UriHelper::parse_url(), but that gain is offset by calls to parse_str() within Joomla\Uri\AbstractUri::parse()).

avatar csthomas
csthomas - comment - 25 Oct 2018

IMO memory usage by explode is not so important because it is the sum of all 'explode` calls.

I prepared one more example, which is much faster than the original version.

diff --git a/libraries/vendor/joomla/uri/src/AbstractUri.php b/libraries/vendor/joomla/uri/src/AbstractUri.php
index 37616b264f..c6d7a5cc2a 100644
--- a/libraries/vendor/joomla/uri/src/AbstractUri.php
+++ b/libraries/vendor/joomla/uri/src/AbstractUri.php
@@ -399,30 +399,64 @@ abstract class AbstractUri implements UriInterface
         */
        protected function cleanPath($path)
        {
-               $path = explode('/', preg_replace('#(/+)#', '/', $path));
+               if ($path !== '' && $path[0] === '/')
+               {
+                       // Keep the first slash
+                       $blocks = array('');
+                       $total = 1;
+               }
+               else
+               {
+                       $blocks = array();
+                       $total  = 0;
+               }
+
+               $block = strtok($path, '/');
 
-               for ($i = 0, $n = \count($path); $i < $n; $i++)
+               while ($block !== false)
                {
-                       if ($path[$i] == '.' || $path[$i] == '..')
+                       if ($block === '.')
+                       {
+                               // Remove './'
+                               $block = strtok('/');
+
+                               continue;
+                       }
+
+                       if ($block === '..')
                        {
-                               if (($path[$i] == '.') || ($path[$i] == '..' && $i == 1 && $path[0] == ''))
+                               if ($total === 1 && $blocks[0] === '')
                                {
-                                       unset($path[$i]);
-                                       $path = array_values($path);
-                                       $i--;
-                                       $n--;
+                                       // Replace /../foo.php with /foo.php
+                                       $block = strtok('/');
+
+                                       continue;
                                }
-                               elseif ($path[$i] == '..' && ($i > 1 || ($i == 1 && $path[0] != '')))
+
+                               if ($total >= 1)
                                {
-                                       unset($path[$i]);
-                                       unset($path[$i - 1]);
-                                       $path = array_values($path);
-                                       $i -= 2;
-                                       $n -= 2;
+                                       // Replace foo/../boo.php with boo.php
+                                       unset($blocks[$total - 1]);
+                                       $total--;
+
+                                       $block = strtok('/');
+
+                                       continue;
                                }
                        }
+
+                       $blocks[] = $block;
+                       $total++;
+
+                       $block = strtok('/');
+               }
+
+               if ($path[strlen($path) - 1] === '/')
+               {
+                       // Keep the last slash
+                       $blocks[] = '';
                }
 
-               return implode('/', $path);
+               return implode('/', $blocks);
        }
 }
avatar csthomas
csthomas - comment - 25 Oct 2018

I have created a script UriCleanPath.php to test a few version of AbstractUri::cleanPath().

php -f UriCleanPath.php 1 // test original version from J3
php -f UriCleanPath.php 2 // version with strpos, without explode
php -f UriCleanPath.php 3 // version with explode
php -f UriCleanPath.php 4 // version with strtok
avatar andrepereiradasilva
andrepereiradasilva - comment - 25 Oct 2018

Hi, for those tests seems the best one is 4 with additional strpos in the beggining of the method.

function cleanPath4($path)
{
	// Most urls are not relative, for those cases avoid expensive operations
	if (strpos($path, './') === false && strpos($path, '/.') === false)
	{
		return str_replace('//', '/', $path);
	}

	// [...]
}

Results are exactly the same
But don't know if there are other wild scenarios, besides the ones tested in your script

Total: 3.1942 Memory peak: 503776 Function memory peak: 496 [method 1]
Total: 1.6185 Memory peak: 503776 Function memory peak: 496 [method 2]
Total: 1.4870 Memory peak: 503776 Function memory peak: 1256 [method 3]
Total: 1.1736 Memory peak: 503776 Function memory peak: 544 [method 4]
Total: 0.8908 Memory peak: 503776 Function memory peak: 528 [method 4 + strpos]

avatar brianteeman brianteeman - change - 30 Oct 2018
Labels Added: J3 Issue
avatar brianteeman brianteeman - labeled - 30 Oct 2018
avatar brianteeman brianteeman - change - 30 Oct 2018
Labels Removed: J3 Issue
avatar brianteeman brianteeman - unlabeled - 30 Oct 2018
avatar brianteeman brianteeman - unlabeled - 30 Oct 2018
avatar brianteeman brianteeman - change - 30 Oct 2018
Labels Added: J3 Issue
avatar brianteeman brianteeman - labeled - 30 Oct 2018
avatar mbabker mbabker - change - 19 Feb 2019
Status New Closed
Closed_Date 0000-00-00 00:00:00 2019-02-19 02:01:13
Closed_By mbabker
avatar mbabker mbabker - close - 19 Feb 2019

Add a Comment

Login with GitHub to post a comment