One of the errors that is coming up quite often is a result of a user trying a failed update for a second time and running into this sql error
2022-02-25T07:30:46+00:00 INFO 80.244.198.253 update Ran query from file 4.0.0-2018-05-15. Query text: CREATE TABLE IF NOT EXISTS `#__workflows` ( `id` int NOT NULL AUTO_INCREMENT, .
2022-02-25T07:30:46+00:00 INFO 80.244.198.253 update Ran query from file 4.0.0-2018-05-15. Query text: INSERT INTO `#__workflows` (`id`, `asset_id`, `published`, `title`, `description.
2022-02-25T07:30:46+00:00 INFO 80.244.198.253 update JInstaller: :Install: Error SQL Duplicate entry '1' for key 'PRIMARY
There are several options available to us to resolve by amending the query "insert into ..."
In our specific case the most relevant would appear to be changing the query to "replace into ..."
When issuing a REPLACE statement, there are two possible outcomes for each issued command:
@richard67 thoughts? You know way more than me on sql
Labels |
Added:
No Code Attached Yet
|
Using a "replace" statement in MySQL or doing the equivalent in PostgreSQL is called "upsert". Unfortunately the minimum version requirements of J4 don't allow to use that because we still have to support older MySQL and MariaDB versions which do not support that.
It could be solved by changing the insert statements to be done record by record, and for each record they use a select statement as a data source, similar as we do when we insert into the menu table and select the extension ID from the extensions table and put the result together with literal values for the other columns.
That select statement would contain a condition which returns no result when there is already a record with that ID (or other PK which would be violated).
We have done something similar for inserting the pkg_search record when updating.
That solution would have 2 disadvantages:
The minimum mysql version supported is 5.6 and that does support replace https://dev.mysql.com/doc/refman/5.6/en/replace.html
there is also INSERT ON DUPLICATE KEY UPDATE and INSERT IGNORE
or am I missing something?
Well, then it was either MariaDB or PostgreSQL where there was some problem why we could not do that.
Another problem is that even when we fix the duplicate key issue, there will be other issues which we can't fix with SQL only, see my comments in and it is also worth a question if we should leave the people with the illusion that their site is fine when we fix some of the glitches after failed updates but other not so visible glitches are remaining.
MariaDB supports all the methods I described. It is postgres that does not and would require a more complicated solution.
However as 98% of all installs are on mysql/mariadb just changing those updates would be a massive improvement. As only a small % of updates hit this issue then there will not be very many running postgresql at all.
Regarding the other issues. I agree with you and its why I introduced the logging although it still says installation success at the end when the sql updates terminated :( but lets fix the issues we can fix now.
And no I wont complain about immutable sql files this time (especially if it is commented as you did with the child templates)
I will investigate again on weekend and report back. But I'm still not sure if we should do something with it because it will not solve other problems like indexes already existing and MySQL not supporting an IX NOT EXISTS for adding columns and so on.
found this for postgres upsert
https://www.postgresqltutorial.com/postgresql-upsert/
and for adding columns can we use a stored procedure? https://thispointer.com/mysql-add-column-if-not-exists/
We have to check since which version upsert is supported in PostgreSQL.
I think it could be sufficient just to use INSERT IGNORE at some places.
Regarding stored procedures: That could be a way, but the SQL using that would not be handled by the database checker.
As far as I can see we could use INSERT ... ON DUPLICATE KEY UPDATE
for MySQL / MariaDB and INSERT ... ON CONFLICT
in our SQL scripts, since these statements are supported with the minimum database server versions for J4.
The question is if we really need that updating of existing records or if it would not just be enough to do nothing on conflict, i.e. not insert or update anything when the record already exists, because in the cases handled here (new stuff e.g. for workflows added with an update SQL) we can assume that the existing records after a first, unsuccessful update attempt will not be modified and nobody will create new workflows stuff in backend manually before the 2nd update attempt.
Also a way could be just to delete the records in the tables for workflows before inserting the new records (but after the CREATE TABLE ... IF NOT EXISTS
).
For other stuff which we insert on updates, like e.g. records in the menu table or in the extensions table or the assets table, we do not use hard-coded IDs. In such a case, when we run the update SQL again, we insert again new items for the same thing but with a different ID. This will result in redundant records, and an update on a violated primary key will not help. We would need other conditions (or keys) to identify what "the same thing" means in a particular table, like e.g. the combination of type
, element
, folder
and client_id
is in the extensions table. Not in all cases we have a key for that (but could create one), and should use that key to identify if there is already a record or not.
Another way to solve this and other issues would be PR #36506 , but I am not sure if it is sufficient to just allow the SQL to fail or if we would need some kind of condition which failure we allow. The correct update of the schema version in the database after each successful run of an SQL update script which that PR adds is a necessary fix independently from that condition thing.
I think for the duplicate keys we can assume that nobody modified the records in the mean time, so we do not need to update them if we run the update SQL script a second time, we simply need to do nothing when the record with that PK already exists.
That means we can use INSERT IGNORE
with MySQL/MariaDB and ON CONFLICT DO NOTHING
with PostgreSQL.
For the latter see https://www.postgresqltutorial.com/postgresql-upsert/ .
I will prepare a PR in the next days.
If you are curious: See #36506 (comment) ... I have created a branch with the change from PR #36506 and the necessary changes on the old update SQL scripts plus the change from #37154 .
The changes can be seen here: 4.1-dev...richard67:test-pr-36506 .
It only needs some additional comments in the old update SQL scripts.
PR will come tomorrow, I think.
I just came to close it but if you want it to stay open for now I am happy with that also
Closing as having a pull request. Please test #37156 . If you have tested that with success, you can also mark a successful test result for #36506 . And in my PR #37156 please check section "RFC: What is worth a discussion - feedback is welcome" in the description and let me know your opinion.
Thanks in advance.
Status | New | ⇒ | Closed |
Closed_Date | 0000-00-00 00:00:00 | ⇒ | 2022-02-28 14:04:45 |
Closed_By | ⇒ | richard67 |
It would only be the inserts that include the primary incremental record id that would need changing so that should only be the ones for workflows and maybe finder