? Success

User tests: Successful: Unsuccessful:

avatar smz
smz
26 May 2015

Description

This fixes regular expressions used for email validation (see: #6982)

Joomla didn't allow the ' char in user email addresses because JMailHelper::isEmailAddress() and JFormRuleEmail::test() used (Unicode "Right single quotation mark") instead of ' (ASCII apostrophe, 0x27) in their regexp.

According to RFC 3696, it is allowed to have apostrophes in email addresses as long as they are ASCII character (0x27) (Irish email example: o'sullivan@example.com).

Testing instructions

You can test this patch by trying to enter valid and/or invalid email addresses in "User Manager" and in "Contacts". In most cases, anyway, you will be actually testing JS validation (PHP validation has been tested by unit test).

Only expected cases accepted by JS and rejected by PHP are those of these kinds:

.user@example.com
user.@example.com
user..name@example.com

Additional Comments

Reference material for email addresses validation rules:

avatar smz smz - open - 26 May 2015
avatar smz
smz - comment - 26 May 2015

@infograf768, @dgt41, @Denitz
I closed #7005 as I wasn't able to resolve conflicts and this is the new PR handling email addresses validation.

Sorry for the confusion...

avatar smz smz - change - 26 May 2015
The description was changed
avatar joomla-cms-bot joomla-cms-bot - change - 26 May 2015
Labels Added: ?
avatar smz
smz - comment - 26 May 2015

To recap, the outstanding issues/decisions-to-be-taken are:

  1. Modify unit tests to allow the single quote in email addresses and instead disallow the "right single quotation mark" (which is Unicode U+2019 and shouldn't be there) Done!

  2. Decide if we want to fix JFormRuleEmail::test() to correctly treat email addresses without the domain-part (i.e. check only the local part). By design it should do that, but it doesn't and it didn't in original code. B/C issues if I fix that? Done!

  3. In JFormRuleEmail::test() we make less thorough checks compared to what we do in JMailHelper::isEmailAddress(): should we fix that? Fixed!

  4. As in #_users the email column is defined as varchar(100), but email addresses can be 254 characters long, we should decide if we want to extend that column to varchar(255) or we want to introduce further checking against an email address being longer than 100 chars (non standard, but probably more than enough)

  5. Decide if we should correctly treat IDNA domains also in html5fallback.js, which at this time doesn't. In case I think I'll need some help...

avatar zero-24 zero-24 - change - 26 May 2015
Category JavaScript Libraries
avatar zero-24 zero-24 - change - 26 May 2015
Status New Pending
avatar smz
smz - comment - 16 Jun 2015

@infograf768, @bakual, @wilsonge Do we want this bug fixed? If we want I think unit tests should be fixed beforehand...

avatar wilsonge
wilsonge - comment - 16 Jun 2015

I think the deal is are there any use valid cases of emails that used to pass that will now fail (as per that platform link I shared in the old one). I'm super super nervous with this of invalidating users that were previously valid which is going to cause all kinds of issues.

But to clarify I do want this bug fixed as well - I'm just very nervous :stuck_out_tongue:

avatar smz
smz - comment - 16 Jun 2015

I'm super super nervous with this of invalidating users that were previously valid which is going to cause all kinds of issues.

George, I can understand this, but it is not the usernames, it is their e-mail addresses and we are talking about invalid email addresses that would have never worked, while we prohibit valid email addresses...

avatar mbabker
mbabker - comment - 16 Jun 2015

Accepting non-RFC compliant e-mail addresses is a bug. Not fixing it means Joomla does not adhere to an internet standard. Fixing it may cause some users issues, but quite frankly, if their data was invalid before and it's invalid after and the only change is now it's rightfully flagged as invalid data, then it's a bug fix. Period.

avatar wilsonge
wilsonge - comment - 16 Jun 2015

ok so on that basis is there anything with return (boolean) filter_var($email, FILTER_VALIDATE_EMAIL); (from joomla/joomla-platform#1833) that block genuinely valid emails?

avatar smz
smz - comment - 16 Jun 2015

Yes: email addresses containing a single quote are not accepted while they should (see: #6982)

avatar Bakual
Bakual - comment - 16 Jun 2015

agreed with @mbabker. We should follow the RFCs as close as possible. Lets fix it :smile:

avatar smz
smz - comment - 19 Jun 2015

... and so... ? :smile:

avatar Bakual
Bakual - comment - 19 Jun 2015

Tests need to pass obviously (you can change them with this PR). And some user testing would be helpful as well.

avatar smz
smz - comment - 19 Jun 2015

OK, will add test units mods later on today within this PR (confirmed? no separate PR needed?) Thanks!

avatar wilsonge
wilsonge - comment - 19 Jun 2015

Yes tests in this PR please :)

avatar smz
smz - comment - 19 Jun 2015

OK, I'm on the right way:

o’reilly@there.com (which has Unicode "Right single quotation mark") was passing because I didn't test regexps with the u modifier (current code is just an hack that I intend to modify). This is a mistake that was already present and I must check in other places too.

Now, if I only could understand why joe<>bob@bob.come pass the test... :unamused:

avatar smz
smz - comment - 19 Jun 2015

Aaaaargh! The u modifier doesn't work for me... Any UTF8 regexp expert around??

avatar dgt41
dgt41 - comment - 19 Jun 2015

Not expert but can you paste here the regex and the failing text?

avatar smz
smz - comment - 19 Jun 2015

Here it is:

$regex_local_part = '^[a-zA-Z0-9.!#$%&\'*+\/=?^_{|}~-]+';

if (!preg_match(chr(1) . $regex_local_part . chr(1) . 'u', $local) || substr($local, -1) == '.' || $local[0] == '.' || preg_match('/\.\./', $local))
{
    return false;
}

It passes $local = 'o’reilly'; (beware the Unicode "Right single quotation mark" in it...) while it shouldn't

It also passes joe<>bob while it shouldn't

avatar smz
smz - comment - 19 Jun 2015

I think I know what's happening:

The regexp matches the string AFTER the "Right single quotation mark". Same for joe<>bob: it matches the string after the >.

The regexp misses an "End-of-String" condition:

$regex_local_part = '^[a-zA-Z0-9.!#$%&\'*+\/=?^_{|}~-]+\Z';
avatar smz
smz - comment - 19 Jun 2015

Sorry, meant to say that:

The regexp matches the string BEFORE the "Right single quotation mark". Same for joe<>bob: it matches the string before the <.

avatar smz
smz - comment - 19 Jun 2015

OK, that was it! :smile:

NOT READY FOR PRIME TIME YET (I have to fix the same in other places too...)

I think this will not make it for 3.4.2, but better for 3.5.0 in good shape than 3.4.2 in bad shape....

avatar brianteeman
brianteeman - comment - 19 Jun 2015

Not sure but this might help or confuse
https://tedclancy.wordpress.com/2015/06/03/which-unicode-character-should-represent-the-english-apostrophe-and-why-the-unicode-committee-is-very-wrong/

My takeaway is that there is more than one way to write an apostrophe
On 19 Jun 2015 17:47, "Sergio Manzi" notifications@github.com wrote:

OK, That's was it! [image: :smile:]

NOT READY FOR PRIME TIME YET (I have to fix the same in other places
too...)

I think this will not make it for 3.4.2, but better for 3.5.0 in good
shape than 3.4.2 in bad shape....


Reply to this email directly or view it on GitHub
#7041 (comment).

avatar smz
smz - comment - 20 Jun 2015

Interesting article, Brian, on which I totally agree. Last section says it all: "Common bloody sense... apostrophes are not closing quotation marks!"

In our context, anyway, the only allowable "apostrophes" are the good old ugly ASCII apostrophe, 0x27 and (if you want to call it an apostrophe...) the acute accent, 0x60. On this RFC 3696, paragraph 3 is very clear.

avatar smz smz - change - 20 Jun 2015
Title
Fixes email validation regular expressions
Fix for email validation rules
avatar smz smz - change - 20 Jun 2015
The description was changed
Title
Fixes email validation regular expressions
Fix for email validation rules
avatar smz smz - change - 20 Jun 2015
The description was changed
avatar smz smz - change - 20 Jun 2015
The description was changed
avatar smz
smz - comment - 20 Jun 2015

Ready for testing! Thanks!

avatar smz smz - change - 20 Jun 2015
The description was changed
avatar smz smz - change - 20 Jun 2015
The description was changed
avatar smz smz - change - 20 Jun 2015
The description was changed
avatar smz smz - change - 20 Jun 2015
Title
Fix for email validation rules
Fix for email addresses validation rules
avatar smz smz - change - 20 Jun 2015
Title
Fix for email validation rules
Fix for email addresses validation rules
avatar dgt41
dgt41 - comment - 22 Jun 2015

@test ok

avatar zero-24 zero-24 - alter_testresult - 24 Jun 2015 - dgt41: Tested successfully
avatar smz
smz - comment - 26 Jun 2015

@infograf768 JM, can you please review/test this?

avatar smz smz - change - 13 Jul 2015
Status Pending Closed
Closed_Date 0000-00-00 00:00:00 2015-07-13 08:10:45
Closed_By smz
avatar smz smz - close - 13 Jul 2015
avatar smz smz - close - 13 Jul 2015
avatar smz smz - change - 13 Jul 2015
Status Closed New
Closed_Date 2015-07-13 08:10:43
Closed_By smz
avatar smz smz - reopen - 13 Jul 2015
avatar smz smz - change - 13 Jul 2015
Status New Closed
Closed_Date 0000-00-00 00:00:00 2015-07-13 10:41:18
Closed_By smz
avatar smz smz - close - 13 Jul 2015

Add a Comment

Login with GitHub to post a comment