Work on the new Custom Fields feature for Joomla 3.7 has highlighted the need to define a common "core supported" syntax for embedding codes within content that can be replaced dynamically using the content plugins. At present, the loadmodule plugin is the only content plugin in the core distribution that does this kind of tag replacement. Since the loadmodule plugin has been around for many years, it has been extensively copied by third-party developers and many, although not all, have stayed with, or close to, the informally defined syntax that loadmodule supports.
For example, to embed a module position called "myposition" into an article using the "mychrome" style, you would insert the following string into the article:
{loadposition myposition,mychrome}
The new Custom Fields feature requires a more sophisticated syntax and initially this was achieved by importing the Mustache (sic) library (https://mustache.github.io/). However, this has a notably different syntax from the one established by the loadmodule precedent and the question arose as to whether that was the right direction for the Joomla project to follow.
In my opinion, it would be better to try to stay close to the existing syntax established by loadmodule, adding only backwards-compatible extensions to the syntax to support the new custom fields feature.
Most third-party developers tend to follow "core standards", so the core distribution tends to set a precedent which then becomes a de facto, albeit often undocumented, standard. So it is important that the syntax we come up with meets some basic objectives:
This pull request is offered as a potential solution that meets these objectives. It comprises some additional classes in the Joomla string library and a refactoring of the loadmodule plugin to make use of it.
If you fancy testing it then please check that the loadmodule plugin behaves exactly as it did previously. In particular, please test with multi-byte characters to make sure I have that handled properly.
Please also try to create your own content plugins using the library and see how you get on.
The syntax supported by the library is described below and this description could be used as the basis for the documentation should the PR be accepted.
To use the parser you follow these steps in your content plugin code:
For example, the following code appears in the loadmodule plugin:
// Get a content parser.
$parser = new JStringParser;
// Register the loadposition token.
$parser->registerToken(
    'loadposition',
    function(JStringToken $token)
    {
        $tokenParams = $token->getParams();
        $position = trim($tokenParams[0]);
        $style = isset($tokenParams[1]) ? trim($tokenParams[1]) : $this->params->def('style', 'none');
        return addcslashes($this->_load($position, $style), '\\$');
    }
);
The callback function takes a JStringToken object which allows access to the token definition that was registered by the registerToken method as well as to the specific parameters associated with the token in the input.
Notice that the token parameters are assumed to be a comma-separated list, but the parser makes no further assumptions about the syntax of the parameters. The parameters are made available to the callback function through the $token->getParams() array.
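As a rough illustration of what "comma-separated, with no further assumptions" means in practice, here is a minimal sketch. The helper name splitTokenParams is hypothetical and is not part of the PR; it only mimics the behaviour described above, leaving trimming and interpretation to the callback.

```php
<?php
// Hypothetical helper (not part of the PR): split the raw text that follows
// the token name into a list of parameters, without interpreting them.
function splitTokenParams(string $raw, string $separator = ','): array
{
    // An empty parameter string yields no parameters at all.
    if ($raw === '') {
        return [];
    }

    return explode($separator, $raw);
}

// Parameters as they might appear in {loadposition myposition, mychrome}.
$params = splitTokenParams('myposition, mychrome');

// Trimming and defaulting are left to the callback, as in the loadmodule code.
$position = trim($params[0]);
$style    = isset($params[1]) ? trim($params[1]) : 'none';

echo $position . '|' . $style . PHP_EOL;

// A parameter containing an equals sign is passed through untouched.
$other = splitTokenParams('key=value');
echo $other[0] . PHP_EOL;
```

The point of the sketch is that any richer parameter syntax (such as key=value pairs) is a convention of the individual plugin, not of the parser.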
With this setup out of the way, the following code performs the actual content parsing and translation:
// Parse the content.
$article->text = $parser->translate($article->text);
The parser also supports a block syntax that is rich enough to support the custom fields extension. Here's an example:
<li>{field alias=something}{field-label}: {field-value}{/field}</li>
In this case we have a couple of simple tokens, field-label and field-value, surrounded by a begin-end pair of block tokens. The begin block token takes a single argument, which in this case contains an equals sign, although the parser does not attempt to understand it and will simply pass it to the callback function as the string "alias=something" in $token->getParams()[0].
Here is some pseudo-code that will handle the above syntax:
// Define a context variable.
$context = '';
// Get a content parser.
$parser = new JStringParser();
// Register the simple tokens.
$parser->registerToken(
    'field-label',
    function(JStringToken $token) use (&$context)
    {
        // Set $label to the label for the current field defined by $context.
        return $label;
    }
);
$parser->registerToken(
    'field-value',
    function(JStringToken $token) use (&$context)
    {
        // Set $value to the value of the current field defined by $context.
        return $value;
    }
);
// Register the block token (false as the third argument marks it as a block token).
$parser->registerToken(
    'field',
    function(JStringToken $token, $content) use (&$context)
    {
        $tokenParams = $token->getParams();
        // Set the context for any contained tokens.
        $context = $tokenParams[0];
        return $content;
    },
    false
);
// Parse the content.
$article->text = $parser->translate($article->text);
The first point to notice about the above code is that you indicate whether a token is a simple token or a block token by the third argument passed to the registerToken method. By default this is true, meaning that the token is a simple one. If you want to register a block token you must pass false as the third argument.
The second point to notice is that the callback function for the block token ("field") takes an additional argument. The first argument is a JStringToken as before, but the second argument will be passed the already processed string extracted from between the begin and end tokens. For example, suppose we have this content:
This is a field: {field article-id}Article ID{/field}
Then $content will be passed the string "Article ID" when the callback is called. However, if the string between the begin and end tokens contains other tokens, then these will already have been processed before the callback is called. So if we have this content:
This is a field: {field article-id}{field-label}: {field-value}{/field}
And if we assume the callback for the "field-label" token always returns the string "Label" and the callback for the "field-value" token always returns the string "Value", then $content will be passed the string "Label: Value" rather than "{field-label}: {field-value}".
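The "Label: Value" behaviour can be reproduced with a small stand-in. This is only a sketch under stated assumptions: translateSketch is a hypothetical function, it uses regular expressions for brevity (which the library itself tries to minimise), and it does not handle nested blocks of the same name.

```php
<?php
// Hypothetical stand-in for the parser, for illustration only.
function translateSketch(string $text, array $simple, array $blocks): string
{
    // Pass 1: replace simple tokens first, so that block content is
    // already processed by the time a block callback sees it.
    foreach ($simple as $name => $callback) {
        $pattern = '/\{' . preg_quote($name, '/') . '(?: ([^}]*))?\}/i';
        $text = preg_replace_callback(
            $pattern,
            function (array $m) use ($callback) {
                $params = isset($m[1]) ? explode(',', $m[1]) : [];
                return $callback($params);
            },
            $text
        );
    }

    // Pass 2: replace each begin/end pair, passing the processed content.
    foreach ($blocks as $name => $callback) {
        $quoted  = preg_quote($name, '/');
        $pattern = '/\{' . $quoted . '(?: ([^}]*))?\}(.*?)\{\/' . $quoted . '\}/is';
        $text = preg_replace_callback(
            $pattern,
            function (array $m) use ($callback) {
                $params = $m[1] !== '' ? explode(',', $m[1]) : [];
                return $callback($params, $m[2]);
            },
            $text
        );
    }

    return $text;
}

$seen = null;
$out  = translateSketch(
    'This is a field: {field article-id}{field-label}: {field-value}{/field}',
    [
        'field-label' => function (array $params) { return 'Label'; },
        'field-value' => function (array $params) { return 'Value'; },
    ],
    [
        'field' => function (array $params, string $content) use (&$seen) {
            $seen = $content; // Record what the block callback received.
            return $content;
        },
    ]
);

echo $seen . PHP_EOL; // Label: Value
echo $out . PHP_EOL;  // This is a field: Label: Value
```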
The third point to note is the use of the $context variable to pass context from the block token callback to any callbacks handling tokens within the block. In the above example, the {field-label} and {field-value} will presumably depend on the "article-id" parameter in the {field article-id} token, so the $context variable is used to pass that context between the callbacks. Although a simple string variable is shown in the example code above, the $context variable can be anything. For example, if you need to support nested block tokens it would make sense for $context to be a stack of contexts; perhaps an array which is pushed and popped appropriately. You may, of course, use multiple variables in the use clauses if you need to pass more context information around.
The parser supports the syntax defined by the following production rules:
list ::= string | string token list
token ::= simple | beginBlock list endBlock
simple ::= startOfToken name endOfToken | startOfToken name space params endOfToken
beginBlock ::= startOfToken name endOfToken | startOfToken name space params endOfToken
endBlock ::= startOfToken / name endOfToken
params ::= param | param paramSeparator params
string ::= any sequence of zero or more characters not including startOfToken
name ::= any sequence of at least one non-space character
param ::= any sequence of zero or more characters except paramSeparator and endOfToken
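To make the production rules concrete, here is a minimal sketch of how the text between startOfToken and endOfToken could be decomposed into a name and parameters. The function parseTokenBody and the array keys it returns are hypothetical, chosen for illustration; the sketch assumes a space between name and params and a comma as paramSeparator, as in the loadposition example.

```php
<?php
// Hypothetical helper: decompose the text found between startOfToken and
// endOfToken according to the simple / beginBlock / endBlock rules.
function parseTokenBody(string $body, string $paramSeparator = ','): array
{
    // endBlock ::= startOfToken / name endOfToken
    $isEnd = strncmp($body, '/', 1) === 0;

    if ($isEnd) {
        $body = substr($body, 1);
    }

    // The name runs up to the first space; anything after it is params.
    // Byte-wise search is safe for UTF-8 input because an ASCII space or
    // comma byte never occurs inside a multi-byte sequence.
    $spaceAt = strpos($body, ' ');

    if ($spaceAt === false) {
        return ['name' => $body, 'params' => [], 'end' => $isEnd];
    }

    return [
        'name'   => substr($body, 0, $spaceAt),
        'params' => explode($paramSeparator, substr($body, $spaceAt + 1)),
        'end'    => $isEnd,
    ];
}

$begin = parseTokenBody('loadposition myposition,mychrome');
$end   = parseTokenBody('/field');

echo $begin['name'] . ' => ' . implode(' | ', $begin['params']) . PHP_EOL;
echo $end['name'] . ($end['end'] ? ' (end block)' : '') . PHP_EOL;
```

Note that nothing in the token text itself distinguishes a simple token from a begin block token; the sketch returns the same shape for both, which is why the distinction has to come from how the token was registered.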
Category | ⇒ | Libraries Plugins Front End |
Status | New | ⇒ | Pending |
Category | Libraries Plugins Front End | ⇒ | Fields Front End Libraries Plugins |
Status | Pending | ⇒ | Discussion |
@brianteeman Yes, simple and blockToken have identical syntax definitions. The third argument passed to the registerToken method allows the parser to tell the difference.
I have revised the PR as follows:
To use the parser you follow these steps in your content plugin code:
For example, this code
// Get a content parser.
$parser = new JStringParser;
// Register the "mytoken" token.
$parser->register('mytoken', new JStringTokenSimple('contains my token'));
echo $parser->translate('This string {mytoken}.');
will output the string
This string contains my token.
IMPORTANT: Token names are case-insensitive.
You can, of course, register a variable to provide the replacement string:
$myString = 'Walrus';
echo (new JStringParser)
    ->register('character', new JStringTokenSimple($myString))
    ->translate('The time has come, the {character} said.')
;
// The time has come, the Walrus said.
NOTE: If you call register with a token name that is already registered, then your definition will replace the earlier one. You can unregister a token using the parser's unregister method.
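The registration rules (case-insensitive names, later definitions replacing earlier ones, and unregister) can be pictured with a small registry. This is a sketch only: TokenRegistrySketch is a hypothetical class, not the PR's implementation, it assumes ASCII token names, and it assumes that tokens nobody has registered are left untouched in the output.

```php
<?php
// Hypothetical registry illustrating the semantics; not the PR's class.
final class TokenRegistrySketch
{
    /** @var array<string, string> Replacements keyed by lower-cased name. */
    private $tokens = [];

    public function register(string $name, string $replacement): self
    {
        // Lower-casing the key makes names case-insensitive, and assigning
        // to an existing key replaces the earlier definition.
        $this->tokens[strtolower($name)] = $replacement;

        return $this;
    }

    public function unregister(string $name): self
    {
        unset($this->tokens[strtolower($name)]);

        return $this;
    }

    public function translate(string $text): string
    {
        return preg_replace_callback(
            '/\{([^\s{}]+)\}/',
            function (array $m) {
                $key = strtolower($m[1]);

                // Assumption: unknown tokens are left exactly as written.
                return array_key_exists($key, $this->tokens) ? $this->tokens[$key] : $m[0];
            },
            $text
        );
    }
}

$registry = (new TokenRegistrySketch)
    ->register('Character', 'Walrus')
    ->register('character', 'Carpenter'); // Replaces the earlier definition.

$first = $registry->translate('The {CHARACTER} and the {unknown}.');
echo $first . PHP_EOL; // The Carpenter and the {unknown}.

$registry->unregister('CHARACTER');
$second = $registry->translate('The {character}.');
echo $second . PHP_EOL; // The {character}.
```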
You can register a callback function that will be called whenever the token is encountered. The callback should return the string that will replace the token. The callback function takes a JStringToken object which allows access to the token definition provided by the register method.
echo (new JStringParser)
    ->register(
        'simple',
        (new JStringTokenSimple)->callback(
            function(JStringToken $token) {
                return '[' . strtoupper($token->getName()) . ']';
            }
        )
    )
    ->translate('This string contains a {simple} callback token.')
;
// This string contains a [SIMPLE] callback token.
You can pass parameters in the token and these are also available in the JStringToken object. For example, the following code appears in the loadmodule plugin:
// Get a content parser.
$parser = new JStringParser;
// Register the loadposition token.
// Syntax: {loadposition <module-position>[,<style>]}
$parser->register(
    'loadposition',
    (new JStringTokenSimple)->callback(
        function(JStringToken $token)
        {
            $tokenParams = $token->getParams();
            $position = trim($tokenParams[0]);
            $style = isset($tokenParams[1]) ? trim($tokenParams[1]) : $this->params->def('style', 'none');
            return addcslashes($this->_load($position, $style), '\\$');
        }
    )
);
// Parse the content.
$article->text = $parser->translate($article->text);
Notice that the token parameters are assumed to be a comma-separated list, but the parser makes no further assumptions about the syntax of the parameters. The parameters are made available to the callback function through the $token->getParams() method which returns an array.
You can also assign a JLayout object to process the data before rendering. In this example, a date-of-birth is formatted before being replaced into the output:
echo (new JStringParser)
    ->register(
        'dob',
        (new JStringTokenSimple(array('text' => '18 July 1918')))->layout(
            new JLayoutFile('plugins.user.profile.fields.dob')
        )
    )
    ->translate('Nelson Mandela was born on {dob}.')
;
// Nelson Mandela was born on 18 July 1918.
The value assigned to the token in the JStringTokenSimple constructor is passed as the third argument to the callback function, if there is one. The result is then passed to the layout's render method, if there is one, before being substituted into the content string.
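The order of operations described here (value, then callback, then layout) can be sketched as follows. Everything in this block is an assumption made for illustration: resolveReplacementSketch is a hypothetical function, the layout is modelled as a plain callable rather than a JLayout object, and the first two callback arguments are stubbed with null.

```php
<?php
// Hypothetical sketch of the order of operations; names are illustrative
// and are not taken from the PR.
function resolveReplacementSketch($value, ?callable $callback, ?callable $layoutRender): string
{
    $result = $value;

    if ($callback !== null) {
        // Assumption: the first two arguments (token object and block
        // content) are stubbed with null; only the third, the value,
        // matters for this illustration.
        $result = $callback(null, null, $value);
    }

    if ($layoutRender !== null) {
        // Stand-in for passing the result to the layout's render method.
        $result = $layoutRender($result);
    }

    return (string) $result;
}

$replacement = resolveReplacementSketch(
    '  18 July 1918 ',
    function ($token, $content, $value) {
        // The callback tidies the raw value first.
        return trim($value);
    },
    function ($displayData) {
        // The layout stand-in then wraps the result in markup.
        return '<time>' . htmlspecialchars($displayData) . '</time>';
    }
);

echo 'Nelson Mandela was born on ' . $replacement . '.' . PHP_EOL;
// Nelson Mandela was born on <time>18 July 1918</time>.
```

Both steps are optional in the sketch, so a token with neither a callback nor a layout is replaced by its value unchanged.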
The parser also supports a block syntax in which a begin block token is paired with an end block token. A token is defined as being a block token rather than a simple token by declaring it with JStringTokenBlock instead of JStringTokenSimple. The string between the begin and end tokens is passed to the callback function as the second argument. Here's an example:
echo (new JStringParser)
    ->register(
        'shout',
        (new JStringTokenBlock)->callback(
            function(JStringToken $token, $content) {
                return strtoupper($content);
            }
        )
    )
    ->translate('Using all capitals is {shout}known as shouting{/shout} and should be avoided.')
;
// Using all capitals is KNOWN AS SHOUTING and should be avoided.
The exact same comments about parameters, callbacks and layouts apply to block tokens as well as simple tokens.
Block tokens can be nested and may also include simple tokens. When nesting, the inner content will be translated before being made available to the outer token.
It is possible to implement simple loops by defining a new parser inside the callback function of another. For example, suppose we have an array of field data like this:
$fieldset = array(
    array(
        'label' => 'field1',
        'value' => 'val1',
    ),
    array(
        'label' => 'field2',
        'value' => 'val2',
    ),
    array(
        'label' => 'field3',
        'value' => 'val3',
    ),
);
Then the following code can be used to generate a list of field labels and values:
echo (new JStringParser)
    ->register(
        'fieldset',
        (new JStringTokenBlock($fieldset))->callback(
            function(JStringToken $token, $content, $value) {
                $return = '';
                $parser = new JStringParser;
                foreach ($value as $field)
                {
                    $parser
                        ->register('label', new JStringTokenSimple($field['label']))
                        ->register('value', new JStringTokenSimple($field['value']))
                    ;
                    $return .= $parser->translate($content);
                }
                return $return;
            }
        )
    )
    ->translate('<ol>{fieldset}<li><strong>{label}</strong>: {value}</li>{/fieldset}</ol>')
;
// <ol><li><strong>field1</strong>: val1</li><li><strong>field2</strong>: val2</li><li><strong>field3</strong>: val3</li></ol>
The parser supports the syntax defined by the following production rules:
list ::= string | string token list
token ::= simple | beginBlock list endBlock
simple ::= startOfToken name endOfToken | startOfToken name space params endOfToken
beginBlock ::= startOfToken name endOfToken | startOfToken name space params endOfToken
endBlock ::= startOfToken / name endOfToken
params ::= param | param paramSeparator params
string ::= any sequence of zero or more characters not including startOfToken
name ::= any sequence of at least one non-space character
param ::= any sequence of zero or more characters except paramSeparator and endOfToken
I really appreciate your work here @chrisdavenport but don't you think you are trying to reinvent the wheel?
What I would do:
But above all: avoid doing things in the Joomla way.
As always, I'm not criticizing your work. I respect the time you have invested in contributing this. What I am trying to do is avoid big mistakes in the direction Joomla is moving in.
I offer my help for anything you need including integrating everything I exposed here. Obviously I won't do it if nobody is listening up there in the PLT.
I looked at Twig and Mustache and a few others before embarking on writing my own. They are all fine packages in their own way, but I was looking for something that would support the really simple syntax that has become established for Joomla content plugins, remembering that this is syntax that ordinary users with absolutely no coding experience must be able to handle. I also wanted something that would minimise the use of regular expressions, since these can be rather slow and many content plugins are typically run over the content of a page before it gets sent to the browser.
My specific reasons for rejecting Twig were:
Of course, there is nothing to stop third-party developers including Twig, Mustache or any other template engine of choice in their own plugins.
Thanks for your feedback @roberto. I really appreciate the time you have spent thinking about this and the voice of your experience is always worth listening to. Personally, I'm not (yet) convinced that Twig is the right answer (although it might be for Joomla X), but I'm going to defer to others to make the decision as I'm obviously too closely involved to be objective.
As I've worked with Twig in both Grav and Gantry, I can say that it is fast -- people have benchmarked Gantry against other frameworks, including Joomla itself, and Gantry is generally almost as fast as, if not faster than, Joomla. But Twig is really more for templating than for content, and it compiles to PHP code to make it fast. This is also why I think it shouldn't be used for all content, as it defeats the purpose of storing articles in the database.
When it comes to the syntax, I actually like {{ and {% more than using a single { for everything -- also, parsing something that doesn't occur naturally in the text makes it faster to parse. Twig is pretty easy to learn and people seem to love it once they get it, but it does require some learning and coding skills to master.
Twig can easily be used by multiple plugins, but writing a token parser for your own tags means that everyone creating a new syntax needs to create a class that reads the tokens and generates PHP code based on them. It is a pretty involved task and needs some basic knowledge of how compilers work.
In summary: I don't think that Twig should be used for this purpose, even though I'd love to see Joomla using Twig as its primary templating language (instead of PHP files). I'm also not sure how you could use Twig without introducing better and more general models for articles, categories etc. which you could use to load arbitrary data from Joomla. Creating Twig TokenParsers for everything just doesn't feel like the right way to go...
If you want to see how Twig could be used in Joomla, please see: https://github.com/gantry/gantry5/blob/develop/engines/joomla/nucleus/particles/contentarray.html.twig which basically replaces most article modules in Joomla. But to make something like this work, you really need to redo all the models, as right now the models in Joomla work only in a single context (usually inside a single component).
Here are my models for Joomla articles allowing me to load and display Joomla articles from anywhere by using a simple API:
https://github.com/gantry/gantry5/tree/develop/src/platforms/joomla/Joomla/Content
It's documented (for Twig) here:
http://docs.gantry.org/gantry5/advanced/content-in-particles
@chrisdavenport I've meant to contact you on these classes; I think they'd be really useful for your services work.
Thanks for the fast reply @chrisdavenport and for taking the time to reply and discuss things.
The syntax that ordinary users would need to learn is quite complex. Whilst it might just be acceptable to have users cope with "{{" instead of "{", the syntax for passing variables is something I know would be quite beyond some of the users I encounter!
About {{ & { I don't think that's really an issue: you can keep B/C for those tags but introduce new ones that will always use {{. In fact, that is probably better because you know that {{ tags are always using Twig.
About passing variables I think that's because you haven't used Twig and you really think you need a custom token for everything. The main plan should be to write Entities that would be used internally by Twig. Let's take an example: Article twig entity which should be our future goal.
That class will only contain those methods that are publicly available for templates. So if you have a module that is displaying information about one article, its layout will receive an Article entity from which you can do whatever you want. Imagine that you need to get the author of the article. You could do something like:
<span class="article-author">{{ article.getAuthor().getName() }}</span>
or:
<span class="article-author">{{ article.author.name }}</span>
Because Twig already searches for getters automatically. What does that mean?
article.getAuthor() will retrieve a User Twig entity which already contains its own methods usable by templaters.
name db column to title. That can be done transparently because the entity will retrieve the new data and ensure that the getName() method still returns the right information.
It makes use of some scarily complex regular expressions. Take a look at the lexer here: https://github.com/twigphp/Twig/blob/v1.24.1/lib/Twig/Lexer.php#L42 To be fair, I haven't benchmarked it against my code, so maybe the difference isn't worth worrying about.
Twig is used everywhere and it has been available for years now. I don't think reliability is a real issue.
I didn't investigate further but it appears that Twig is intended to parse the content once and once only. I wasn't clear on how you could use it when multiple plugins are parsing the same content consecutively, which is what will happen if we have more than one content plugin using the same parser. We need each plugin to ignore tokens/variables it doesn't understand. Maybe there is a way to achieve that with Twig; I couldn't see it.
Plugins don't need to parse the same content recursively because plugins will load the tags, functions and filters into the main Twig environment and the content/template will be processed once.
@phproberto I've already implemented Article classes which can be used for this. See my links above...
Thanks @mahagr. I hope that helps to explain the behavior I'm trying to describe and that we don't need to register 123123 custom tags. We just need Twig entities that are passed to layouts, that allow retrieving one entity from another, etc., and that will serve as our API for templates.
Is anything happening with this RFC? It's been over a year.
Hey @brianteeman,
lately I feel rejected by the system, as my vision for Joomla clearly goes in the opposite direction to the decisions taken by the leadership teams. I decided to stop wasting my time "fighting" the system to contribute things nobody wants. Instead I release my own packages.
You can find what I suggested here in my Joomla-Twig package:
https://phproberto.github.io/joomla-twig/
100% unit tested, 100% based on plugins and with public docs.
Closing this as it has clearly been abandoned.
Status | Discussion | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2018-04-10 12:48:25 |
Closed_By | ⇒ | brianteeman |
Are these supposed to be the same?
This comment was created with the J!Tracker Application at issues.joomla.org/joomla-cms/11702.