File uploaded and stored with a "safe" filename.
File is not uploaded. An error message about "This file type is not supported." is displayed.
Joomla 3.4.3, Linux, Apache, PHP 5.5
The makeSafe function strips out all non-english characters from the filename, leaving only "png" which does not qualify as a valid filename.
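The stripping can be reproduced outside Joomla. The following is a minimal sketch, assuming the same three patterns that are quoted for JFile::makeSafe further down in this thread; `stripLikeMakeSafe` is a hypothetical helper name used only for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Patterns as quoted later in this thread for JFile::makeSafe (no "u" modifier).
$regex = array('#(\.){2,}#', '#[^A-Za-z0-9\.\_\- ]#', '#^\.#');

// Hypothetical stand-in for JFile::makeSafe, for illustration only.
function stripLikeMakeSafe($file, array $regex)
{
    // Drop trailing dots, then apply the three patterns in order.
    return trim(preg_replace($regex, '', rtrim($file, '.')));
}

// Without UTF-8 mode every byte of a multi-byte character is outside the
// allowed class, so the whole base name disappears; the last pattern then
// strips the leading dot that is left over.
echo stripLikeMakeSafe('åäö.png', $regex), "\n";  // png
echo stripLikeMakeSafe('1åäö.jpg', $regex), "\n"; // 1.jpg
```

A name that keeps at least one ASCII character in front of the extension survives as a usable (if shortened) filename, which is why the failure only shows up for names made up entirely of non-ASCII characters.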
Could we do this the same way as in JFilterOutput::stringURLSafe() and use JFactory::getLanguage()->transliterate() on the filename before running the stripping regexp?
This needs to be language independent, so it should not require additional processing through a language file.
The regular expression used by JFile::makeSafe needs to use a PCRE modifier to allow UTF-8 characters, eg:
$regex = array('#(\.){2,}#', '#[^a-zA-Z0-9_\.\-~\p{L}\p{N}\s ]#u');
however this requires PCRE 5 support.
See http://php.net/manual/en/regexp.reference.unicode.php
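As a rough illustration of what the Unicode-aware pattern changes, here is a small sketch that uses the two patterns from the suggestion above. `stripUnicodeAware` is a hypothetical helper name, not a Joomla function, and the handling of invalid UTF-8 is an assumption about how a caller might want to react.

```php
<?php
// Patterns from the suggestion above; the trailing "u" switches PCRE to UTF-8 mode.
$regex = array('#(\.){2,}#', '#[^a-zA-Z0-9_\.\-~\p{L}\p{N}\s ]#u');

// Hypothetical helper for illustration only.
function stripUnicodeAware($file, array $regex)
{
    $result = preg_replace($regex, '', $file);

    // In "u" mode preg_replace() returns null if the subject is not valid UTF-8.
    return $result === null ? false : $result;
}

// Letters and digits of any script are kept, other symbols are removed.
var_dump(stripUnicodeAware('åäö.png', $regex));
var_dump(stripUnicodeAware('代替品.jpg', $regex));
var_dump(stripUnicodeAware('bad*name?.png', $regex));
```

Note that this keeps the non-ASCII characters in the stored filename rather than converting them, so the filesystem and the web server both have to cope with UTF-8 names.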
JCE already handles this quite well I think, so one might consider a variation of:
https://github.com/widgetfactory/jce/blob/master/components/com_jce/editor/libraries/classes/utility.php#L146-L215
I wouldn't make it too complicated.
I would use the JApplication::stringURLSafe() method, which takes into account the config setting for whether unicode aliases are allowed.
If the resulting filename doesn't contain any valid characters anymore, either show an error or just create a filename based on datetime.
Something along the lines of what I do in my own extension: https://github.com/Bakual/SermonSpeaker/blob/master/com_sermonspeaker/admin/controllers/file.json.php#L75-L88
// Make filename URL safe. Eg replaces ä with ae.
$file['name'] = JFilterOutput::stringURLSafe(JFile::stripExt($file['name'])) . '.' . $ext;
// Make the filename safe
$file['name'] = JFile::makeSafe($file['name']);
// Replace spaces in filename as long as makeSafe doesn't do this.
$file['name'] = str_replace(' ', '_', $file['name']);
// Check if filename has more chars than only underscores, making a new filename based on current date/time if not.
if (count_chars(JFile::stripExt($file['name']), 3) == '_')
{
$file['name'] = JFactory::getDate()->format("Y-m-d-H-i-s") . '.' . $ext;
}
JApplication::stringURLSafe() makes the string lowercase and removes the period character.
It still uses the language system, and URL-safe strings don't follow the same rules as filesystem-safe strings, so I wouldn't necessarily go that route.
I don't have an issue if it uses language functions. I'd say there is a high chance the special characters present in the filename are from the same language the user has active.
Also, lowercasing the filename isn't necessarily a bad thing when it comes to filenames that end up in URLs. It solved a lot of support requests in the case of my extension.
But it may indeed be the wrong approach for the core.
What about using JLanguageTransliterate::utf8_latin_to_ascii() directly?
So it seems that the issue with special characters was solved?
But wouldn't it be possible to allow spaces in uploaded filenames and convert each space to a dash?
I agree with @Bakual that a URL-safe name would be good. If we allow spaces in the uploaded filename and then convert it with stringURLSafe, the file stays readable in old browsers (for example, PDF files in IE7 on XP).
Status | New | ⇒ | Confirmed |
@JoomliC, what makes you think the issue was solved? I didn't find anything about this. If it isn't solved yet, I would vote for Fedik's solution to use JLanguageTransliterate::utf8_latin_to_ascii(). It works well. It might not be the perfect solution, but it is far better than just stripping all the special characters from the filename. In languages like German, French and Spanish (and surely many more), which rely heavily on accents, the filenames look pretty ugly when all the accented letters are simply stripped. It could be considered a temporary solution until a more in-depth one has been developed.
Of course after the transliteration the filename can and should still be "treated" by the makeSafe function.
@ZoFx me too, I found nothing about this, but when testing since 3.5 with åäö.jpg, it works and the file is renamed to aao.jpg (just tested again on Joomla 3.6.2, and it still works for me).
That's why I assumed this was fixed, but I didn't see where the change was applied...
What does not work yet is a filename like this: 代替品.jpg
This is why I proposed this in a comment: #9608 (comment) (I can do a PR for Asian, Arabic ... characters to generate a datetime-based filename, if that is accepted as a possible workaround?)
I would recommend using transliterator_transliterate for this:
http://php.net/manual/en/transliterator.transliterate.php
eg:
if (function_exists('transliterator_transliterate')) {
return transliterator_transliterate('Any-Latin; Latin-ASCII;', $subject);
} else {
return JLanguageTransliterate::utf8_latin_to_ascii($subject);
}
This does work on 代替品 resulting in dai ti pin
Sadly that is PHP 5.4 and above only.
@brianteeman I think this is why @ryandemmer had this:
if (function_exists('transliterator_transliterate')) {
;-)
@ryandemmer will you propose a PR to allow this? (That would be great!)
The changes should probably be made to JFile::makeSafe - https://github.com/joomla/joomla-cms/blob/staging/libraries/joomla/filesystem/file.php#L58-L66
eg:
public static function makeSafe($file)
{
    // Remove any trailing dots, as those aren't ever valid file names.
    $file = rtrim($file, '.');

    // Transliterate to ASCII, using the intl extension when it is available.
    if (function_exists('transliterator_transliterate')) {
        $transformed = transliterator_transliterate('Any-Latin; Latin-ASCII;', $file);
    } else {
        $transformed = JLanguageTransliterate::utf8_latin_to_ascii($file);
    }

    $regex = array('#(\.){2,}#', '#[^A-Za-z0-9\.\_\- ]#', '#^\.#');

    // Strip whatever is still not allowed from the transliterated name.
    if ($transformed !== false) {
        return trim(preg_replace($regex, '', $transformed));
    }

    // Transliteration failed: fall back to stripping the original name.
    return trim(preg_replace($regex, '', $file));
}
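The transliteration above can also be combined with the date-based fallback suggested earlier in the thread for names that end up empty. The following is only a sketch under stated assumptions, not the change proposed for core: `safeUploadName` and its `$now` parameter are hypothetical, and when the intl extension is missing the sketch simply skips transliteration instead of calling JLanguageTransliterate, so that it runs outside Joomla.

```php
<?php
// Hypothetical standalone helper, not part of Joomla.
function safeUploadName($name, $now = null)
{
    // Split on the last dot by hand; "." never occurs inside a UTF-8 multi-byte sequence.
    $dot  = strrpos($name, '.');
    $base = $dot === false ? $name : substr($name, 0, $dot);
    $ext  = $dot === false ? '' : substr($name, $dot + 1);

    // Transliterate the base name if the intl extension is available.
    if (function_exists('transliterator_transliterate')) {
        $transformed = transliterator_transliterate('Any-Latin; Latin-ASCII;', $base);

        if ($transformed !== false) {
            $base = $transformed;
        }
    }

    // Remove runs of dots and anything outside the allowed ASCII set.
    $base = preg_replace(array('#\.{2,}#', '#[^A-Za-z0-9._\- ]#'), '', $base);
    $base = trim($base, ' .');
    $ext  = preg_replace('#[^A-Za-z0-9]#', '', $ext);

    // Nothing usable left: build a name from the date and time instead.
    if ($base === '') {
        $base = date('Y-m-d-H-i-s', $now === null ? time() : $now);
    }

    return $ext === '' ? $base : $base . '.' . $ext;
}

echo safeUploadName('代替品.jpg'), "\n";
```

Handling the extension separately means a name whose base is stripped completely still keeps its file type, so the upload is not rejected as an unsupported file type.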
@JoomliC Wow, that's strange. So it might be dependent on the setup ... Unfortunately I'm already well beyond my knowledge when it comes to charset and Joomla. I leave this up to you to investigate and hopefully fix it. Please let me know if I can assist you with additional information.
Just an idea: could it be that your server setup somehow already transliterates the filename before handing it over to PHP, while my setup does not? I have only limited knowledge of PHP programming, so please forgive me if this is complete nonsense :-).
Original: 1öäü.jpg,
Upload: 1oau.jpg
Tested on:
Joomla! 3.7.0-beta1
macOS Sierra, 10.12.3
Firefox 50.1.0
PHP 7.0.4
MySQLi 5.5.53-0
Status | Confirmed | ⇒ | Information Required |
Yes, it is referenced a few lines above, but the present issue still has the "no code attached" label.
And the PR has the status "This branch is out-of-date with the base branch".
Status | Information Required | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2017-06-03 15:49:31 |
Closed_By | ⇒ | franz-wohlkoenig |
Closed_Date | 2017-06-03 15:49:31 | ⇒ | 2017-06-03 15:49:32 |
Closed_By | franz-wohlkoenig | ⇒ | joomla-cms-bot |
Set to "closed" on behalf of @franz-wohlkoenig by The JTracker Application at issues.joomla.org/joomla-cms/7841
closed as having PR #12049
I will update that PR; I just need to remove an unwanted dependency on some class (see the PR).
Issue confirmed
Another example is a file named 1åäö.jpg, which will be uploaded as 1.jpg.