No Code Attached Yet bug
avatar JTBlum
JTBlum
25 Jun 2020

Steps to reproduce the issue

$in = JApplicationCms::getInstance()->input;
die($in->getFloat('testFloat',0.0));

Navigate to the page and add the $_GET parameter testFloat=.25

Expected result

Blank page except for the text ".25"

Actual result

Blank page except for the text "25"

System information (as much as possible)

Joomla! Version Joomla! 3.6.5 through Joomla! 3.9.19 Stable [ Amani ] 2-June-2020 15:00 GMT

Additional comments

This has been reported before and was incorrectly closed in pull request 15736. Please review the actual content of the regular expression used in the PHP documentation, specifically the following:

Formally as of PHP 7.4.0 (previously, underscores have not been allowed):

LNUM          [0-9]+(_[0-9]+)*
DNUM          ([0-9]*(_[0-9]+)*[\.]{LNUM}) | ({LNUM}[\.][0-9]*(_[0-9]+)*)
EXPONENT_DNUM (({LNUM} | {DNUM}) [eE][+-]? {LNUM})

The following quote was used from an earlier version of PHP, which also shows that the change is in line with native PHP, but was used as justification to close it:

Formally:

LNUM          [0-9]+
DNUM          ([0-9]*[\.]{LNUM}) | ({LNUM}[\.][0-9]*)
EXPONENT_DNUM [+-]?(({LNUM} | {DNUM}) [eE][+-]? {LNUM})

If you didn't catch it, the left side of the DNUM regex allows for a floating point number to start with a dot in both cases.

avatar JTBlum JTBlum - open - 25 Jun 2020
avatar joomla-cms-bot joomla-cms-bot - change - 25 Jun 2020
Labels Added: ?
avatar joomla-cms-bot joomla-cms-bot - labeled - 25 Jun 2020
avatar Quy
Quy - comment - 25 Jun 2020

Please test PR #29780

avatar Quy Quy - change - 25 Jun 2020
Status New Closed
Closed_Date 0000-00-00 00:00:00 2020-06-25 17:40:44
Closed_By Quy
avatar Quy Quy - close - 25 Jun 2020
avatar SharkyKZ SharkyKZ - change - 9 Jul 2020
Status Closed New
Closed_Date 2020-06-25 17:40:44
Closed_By Quy
avatar SharkyKZ
SharkyKZ - comment - 9 Jul 2020

Reopening as #29780 was closed.

avatar SharkyKZ SharkyKZ - reopen - 9 Jul 2020
avatar JTBlum
JTBlum - comment - 9 Jul 2020

Given the discussion on #29780, I'll go into more detail here and propose another change to the RegEx as well as some changes to the unit tests.

  • $input is defined with the following: $input = '!"#$%&\'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_``' . 'abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“â' . '€�•–—˜™š›œžŸ¡¢£¤¥Â' . '¦Â§Â¨Â©ÂªÂ«Â¬Â­Â®Â¯Â°Â±Â²Â³Â´ÂµÂ¶Â·' . '¸¹º»¼½¾¿ÀÃ�ÂÃÄÅÆÇÈÉÊË' . 'ÃŒÃ�ÃŽÃ�Ã�ÑÒÓÔÕÖ×ØÙÚÛÜÃ�ÞÃ' . 'ŸÃ áâãäåæçèéêëìíîïÃ' . '°Ã±Ã²Ã³Ã´ÃµÃ¶Ã·Ã¸Ã¹ÃºÃ»Ã¼Ã½Ã¾Ã¿';

  • float_01: asserts $input → 123456789, which seems like it should return an error. There are better (less messy) examples in float_04 and the tests that follow, where I'll explain more.

  • double: Is actually a duplicate of float_01 with a different name

  • float_02: asserts 20.20 → 20.2, and should pass

  • float_03: asserts '-38.123' → -38.123, and should pass

  • float_04: asserts 'abc-12.456' → -12.456, which again seems in error. The string contains but does not represent a floating point value, which is really the best example of where I think the problem and/or misunderstanding is. I believe this this test should fail.

  • float_05: asserts '-abc12.456' → 12.456 (same situation as float_04)

  • float_06: asserts 'abc-12.456abc' → -12.456 (same situation as float_04+)

  • float_07: asserts 'abc-12 . 456' → -12 (same situation as float_04+)

  • float_08: asserts 'abc-12. 456' → -12 (same situation as float_04+)

  • float_09: asserts '' → 0, which may not may not work depending on how you want type-casting to behave

  • float_10 & float_11 should remain functional

  • float_12 contains asserts that 1.0 → 1.0, but within an array that also duplicates the tests from float_06 and float_08 (see above)

  • float_13 is another array test, the second element of which asserts 'abcdef-7E-10' → -7E-10, which is another float_04 situation.

I could go on, but all of this hinges on one assumption that could be mistaken: I don't think that strings containing floating point values in the middle of junk data should come back as anything but junk, but the many of the tests in this suite and one other of five sets disagree.

I originally proposed the following RegEx because I was thinking in terms of things that were already parsed as tokens:

$pattern = '/[-+]?[0-9]*(\.[0-9]+)?([eE][-+]?[0-9]+)?/';
                       ^

Knowing that we need to take into account potential junk data surrounding legitimate values I would say the change is still a valid one, but that the details of the unit tests above might in some cases return different results since they would now show decimal values instead of integers in some cases. If, on the other hand, numbers surrounded by junk data are not supposed to be parsed at all I would propose the following:

$pattern = '/^[-+]?[0-9]*(\.[0-9]+)?([eE][-+]?[0-9]+)?$/';
             ^          ^                             ^

Thank you for your patience.

avatar Fedik
Fedik - comment - 9 Jul 2020

what about explode the string by dot and check each part separately? (as int?)

$string = 'abc-12.456abc';
$f =  new Joomla\CMS\Filter\InputFilter;

$p = explode('.', $string , 2);
$r1 = $f->clean($p[0], 'int');
$r2 = $f->clean($p[1], 'int');
$r = floatval($r1 . '.' . $r2);

var_dump($r);

easy ?
except for $string = 'abc-.456abc'; ?

avatar JTBlum
JTBlum - comment - 9 Jul 2020

That doesn't take into account plus/minus signs, exponent notation, or integers that we want to interpret as floating-point values using the filter methods.

avatar Hackwar Hackwar - change - 20 Feb 2023
Labels Added: No Code Attached Yet bug
Removed: ?
avatar Hackwar Hackwar - labeled - 20 Feb 2023

Add a Comment

Login with GitHub to post a comment