User tests: Successful: Unsuccessful:
Pull Request for Issue #18269
Fix 2 problems described in the above issue
See cases in issue #18269
Category | ⇒ | Front End Plugins |
Status | New | ⇒ | Pending |
Labels |
Added:
?
|
@ggppdk Thank you for the pull request! The main problem, which I reported is solved by changing tow lines included in your PR. On the other side another line of the same code fragment can still produce wrong results. I described the problem in the comment including the pattern which can be easy used to reproduce the bug.
@OlegKi please mark your Test as successfully.
I have tested this item
Thank you for the pull request! The main problem, which I reported is solved by changing tow lines included in your PR. On the other side another line of the same code fragment can still produce wrong results. I described the problem in the comment including the pattern which can be easy used to reproduce the bug.
@ggppdk I reread the description of srcset
in the HTML specification and I would say that the usage of
preg_split("/,\s+/u", $match[1])
would be incorrect for the common case. The problem is that allowed elements of srcset
could be URL without a width descriptor (like 123w
) or a pixel density descriptor (like 1.5x
or 2x
). In the case the HTML specification allows to use no whitespace before the URL. See
its value must consist of one or more image candidate strings, each separated from the next by a U+002C COMMA character (,). If an image candidate string contains no descriptors and no ASCII whitespace after the URL, the following image candidate string, if there is one, must begin with one or more ASCII whitespace.
In other words, the examples like
src="url1 1.5x,url2, url3 2x"
would be allowed value, but \s+
, which you use, requires at least one whitespace.
Because URLs (and so the image candidate strings too) could not contain whitespaces (see here) then it would be probably better to split the value of srcset
by whitespaces instead of commas (something like preg_split("/\s+/u", $match[1])
). One will get the URLs, width descriptors (like 123w
) or pixel density descriptors (like 1.5x
or 2x
) appended optional comma or URLs prepended with a descriptor (like 1.5x,url2
). The descriptors are easy to parse and one could detect the special case 1.5x,url2
. After all one will get pure image candidate strings (URLs) and pure descriptors. The full parsing could be a little longer, but the main idea is: start with spiting by whitespaces instead of splitting by commas.
Now one can use parse_url to parse URLs. One should apply the changes from the pull request #15300 only for URLs without scheme (http:
, https:
, ...) and other first parts of URL and only probably if the path
part of URL isn't started with slash (/
).
One more option would be to use RegEx ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
from Appendix B of rfc3986, which will be suggeseted to parse URLs. One can easy modify the RegEx to parse full srcset
. The only problem is that the Appendix B are removed from later redaction of rfc3986 and I don't know the exact reason of that.
P.S.: To tell the trust I don't full understand all cases addressed by the pull request #15300. Currently I just commented the block of code and srcset
will by displayed correctly without any modifications. I never use related paths in srcset
and thus I haven't any need any modifications of srcset
. I think that it would be better to have no modifications of srcset
as to have wrong modifications.
Maybe when i get some time
i will try to replace the code and use regular expression
or someone else can do this work
This might work:
if (strpos($buffer, 'srcset=') !== false)
{
$regex = '#\s+srcset="([^"]+)"#m';
$buffer = preg_replace_callback(
$regex,
function ($match) use ($base, $protocols)
{
$data = array();
preg_match_all('#((?:[^\s]+)\s*([\d\.]+[wx])?,*)#m', $match[1], $matches);
foreach($matches[1] as $src)
{
// trim spaces
$src = trim($src);
// remove comma seperator
$src = trim($src, ",");
$url = $src;
$descriptor = '';
// assign url and descriptor from match
if (strpos(' ', $src) !== false)
{
list($url, $descriptor) = explode(' ', $src);
}
// process url
$url = preg_replace('#^(?!/|' . $protocols . '|\#|\')(.+)#', $base . '$1', trim($url));
// join url and descriptor by space and trim
$data[] = trim(implode(' ', array($url, $descriptor)));
}
return ' srcset="' . implode(", ", $data) . '"';
},
$buffer
);
$this->checkBuffer($buffer);
}
thanks , can you make a PR ?
you have spent more time studing this than me
i will happily close this PR in favour of the new one
thanks , closing
Status | Pending | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2017-10-13 11:44:36 |
Closed_By | ⇒ | ggppdk |
// A whitespace space character must follow a comma when defining multiple images inside 'srcset'
Can someone confirm the above ?