? Pending

User tests: Successful: Unsuccessful:

avatar ggppdk
ggppdk
7 Oct 2017

Pull Request for Issue #18269

Summary of Changes

Fix 2 problems described in the above issue

  • Comma is a legitimate character in URLs it should be handled
  • Restoring multiple image URLs after handling should be comma + whitespace for browser to identify them

Testing Instructions

See cases in issue #18269

Expected result

Actual result

Documentation Changes Required

Votes

# of Users Experiencing Issue
1/1
Average Importance Score
5.00

avatar joomla-cms-bot joomla-cms-bot - change - 7 Oct 2017
Category Front End Plugins
avatar ggppdk ggppdk - open - 7 Oct 2017
avatar ggppdk ggppdk - change - 7 Oct 2017
Status New Pending
avatar ggppdk ggppdk - change - 7 Oct 2017
Labels Added: ?
avatar ggppdk
ggppdk - comment - 7 Oct 2017
// A whitespace space character must follow a comma when defining multiple images inside 'srcset'

Can someone confirm the above ?

avatar OlegKi
OlegKi - comment - 8 Oct 2017

@ggppdk Thank you for the pull request! The main problem, which I reported is solved by changing tow lines included in your PR. On the other side another line of the same code fragment can still produce wrong results. I described the problem in the comment including the pattern which can be easy used to reproduce the bug.

avatar franz-wohlkoenig
franz-wohlkoenig - comment - 8 Oct 2017

@OlegKi please mark your Test as successfully.

  • Please open Issue Tracker
  • Login with your github-Account
  • Click on blue "Test this"-Button above Authors-Picture
  • mark your Test as successfully
  • hit "submit test result"
avatar OlegKi OlegKi - test_item - 8 Oct 2017 - Tested successfully
avatar OlegKi
OlegKi - comment - 8 Oct 2017

I have tested this item successfully on 0eaf34a

Thank you for the pull request! The main problem, which I reported is solved by changing tow lines included in your PR. On the other side another line of the same code fragment can still produce wrong results. I described the problem in the comment including the pattern which can be easy used to reproduce the bug.


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/18273.

avatar OlegKi
OlegKi - comment - 8 Oct 2017

@ggppdk I reread the description of srcset in the HTML specification and I would say that the usage of

preg_split("/,\s+/u", $match[1])

would be incorrect for the common case. The problem is that allowed elements of srcset could be URL without a width descriptor (like 123w) or a pixel density descriptor (like 1.5x or 2x). In the case the HTML specification allows to use no whitespace before the URL. See

its value must consist of one or more image candidate strings, each separated from the next by a U+002C COMMA character (,). If an image candidate string contains no descriptors and no ASCII whitespace after the URL, the following image candidate string, if there is one, must begin with one or more ASCII whitespace.

In other words, the examples like

src="url1 1.5x,url2, url3 2x"

would be allowed value, but \s+, which you use, requires at least one whitespace.

Because URLs (and so the image candidate strings too) could not contain whitespaces (see here) then it would be probably better to split the value of srcset by whitespaces instead of commas (something like preg_split("/\s+/u", $match[1])). One will get the URLs, width descriptors (like 123w) or pixel density descriptors (like 1.5x or 2x) appended optional comma or URLs prepended with a descriptor (like 1.5x,url2). The descriptors are easy to parse and one could detect the special case 1.5x,url2. After all one will get pure image candidate strings (URLs) and pure descriptors. The full parsing could be a little longer, but the main idea is: start with spiting by whitespaces instead of splitting by commas.

Now one can use parse_url to parse URLs. One should apply the changes from the pull request #15300 only for URLs without scheme (http:, https:, ...) and other first parts of URL and only probably if the path part of URL isn't started with slash (/).

One more option would be to use RegEx ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))? from Appendix B of rfc3986, which will be suggeseted to parse URLs. One can easy modify the RegEx to parse full srcset. The only problem is that the Appendix B are removed from later redaction of rfc3986 and I don't know the exact reason of that.

P.S.: To tell the trust I don't full understand all cases addressed by the pull request #15300. Currently I just commented the block of code and srcset will by displayed correctly without any modifications. I never use related paths in srcset and thus I haven't any need any modifications of srcset. I think that it would be better to have no modifications of srcset as to have wrong modifications.

avatar ggppdk
ggppdk - comment - 10 Oct 2017

Maybe when i get some time
i will try to replace the code and use regular expression
or someone else can do this work

avatar ryandemmer
ryandemmer - comment - 13 Oct 2017

This might work:

if (strpos($buffer, 'srcset=') !== false) 
{
	$regex = '#\s+srcset="([^"]+)"#m';
			
	$buffer = preg_replace_callback(
		$regex,
		function ($match) use ($base, $protocols)
		{
			$data = array();
										
			preg_match_all('#((?:[^\s]+)\s*([\d\.]+[wx])?,*)#m', $match[1], $matches);
					
			foreach($matches[1] as $src) 
			{
				// trim spaces
				$src = trim($src);
					
				// remove comma seperator
				$src = trim($src, ",");
						
				$url = $src;
				$descriptor = '';
						
				// assign url and descriptor from match
				if (strpos(' ', $src) !== false) 
				{
					list($url, $descriptor) = explode(' ', $src);
				}

				// process url
				$url = preg_replace('#^(?!/|' . $protocols . '|\#|\')(.+)#', $base . '$1', trim($url));
						
				// join url and descriptor by space and trim
				$data[] = trim(implode(' ', array($url, $descriptor)));
			}

			return ' srcset="' . implode(", ", $data) . '"';
		},
		$buffer
	);	

	$this->checkBuffer($buffer);
}
avatar ggppdk
ggppdk - comment - 13 Oct 2017

@ryandemmer

thanks , can you make a PR ?
you have spent more time studing this than me
i will happily close this PR in favour of the new one

avatar ggppdk
ggppdk - comment - 13 Oct 2017

@ryandemmer

thanks , closing

avatar ggppdk ggppdk - change - 13 Oct 2017
Status Pending Closed
Closed_Date 0000-00-00 00:00:00 2017-10-13 11:44:36
Closed_By ggppdk
avatar ggppdk ggppdk - close - 13 Oct 2017

Add a Comment

Login with GitHub to post a comment