Bitbucket Data Center / BSERV-4701

Performance of deleting empty pull request rescope activities is very poor on MySQL

      STASH-4517 introduced a recurring background job that deletes empty rescope activities. Unfortunately, this delete query performs very poorly on MySQL when there are lots of records in the sta_activity_* tables. This can result in both an increase in deadlocks and extremely long delays on other SQL operations.

      Another action which can trigger these slow operations is deleting a repository, since that requires deleting all of the repository's pull requests (and their activities).
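
      The ticket does not include the exact SQL, but the slow pattern it describes is a bulk delete filtered through an IN subquery over the sta_activity_* tables. A minimal sketch, with purely illustrative table and column names (the real Stash schema may differ):

      -- Illustrative only: bulk delete of empty rescope activities via an IN subquery.
      -- MySQL tends to execute this shape poorly once the sta_activity_* tables are large.
      DELETE FROM sta_activity
      WHERE id IN (
          SELECT activity_id
          FROM sta_pr_rescope_activity   -- hypothetical child table name
          WHERE commits_added = 0
            AND commits_removed = 0
      );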


            Morgan Tocker added a comment -

            Excellent, thank you for clarifying!

            It is worth pointing out that point-in-time recovery uses the same binary logs, so in the absence of replication it is still important to make sure the backup is taken when no temporary tables are open. I believe that both MySQL Enterprise Backup and XtraBackup understand how to 'wait' for temp tables to close (the server exposes this metadata). It is possible that a user-scripted backup around mysqldump may not do this, though.
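
            (For illustration, and as an assumption rather than something stated above: on a replica, the counter that backup tools typically poll for this is Slave_open_temp_tables; Percona XtraBackup's --safe-slave-backup, for example, waits for it to reach zero before taking the backup.)

            -- Illustrative check on a replica: how many temporary tables the
            -- replication SQL thread currently has open.
            SHOW GLOBAL STATUS LIKE 'Slave_open_temp_tables';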


            Bryan Turner (Inactive) added a comment -

            Morgan,

            #5 was the solution implemented; attempting to drop and recreate foreign keys on every delete was impractical (and didn't produce a timing improvement). When Stash detects that it is running on MySQL, it now overrides Hibernate's standard TemporaryTableBulkIdStrategy behavior with a custom MultiTableBulkIdStrategy that performs a JOIN instead of using an IN clause.
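
            To make the contrast concrete, here is a rough sketch of the two delete shapes (identifiers are illustrative, not the SQL Hibernate actually generates; Hibernate's id tables are conventionally prefixed HT_):

            -- Default temporary-table strategy: filter the delete with an IN clause.
            DELETE FROM sta_activity
            WHERE id IN (SELECT id FROM HT_sta_activity);

            -- JOIN-based shape used by the custom strategy on MySQL.
            DELETE a
            FROM sta_activity a
            JOIN HT_sta_activity ht ON a.id = ht.id;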

            Stash currently does not support using a MySQL database with any form of clustering or replication enabled, but I appreciate your insight into potential pitfalls there. We'll definitely keep that in mind if such support is ever added.

            Best regards,
            Bryan Turner
            Atlassian Stash


            Morgan Tocker added a comment -

            Hi!

            I work on the MySQL Team @ Oracle. I have two suggestions for improving this fix:

            The first is that only MySQL 5.6+ can OPTIMIZE an IN subquery[1]. The description says that you arrived at (4), but for compatibility with earlier MySQL versions, may I suggest rewriting to use a join, similar to solution (5)?
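
            One way to check whether a given server actually optimizes the IN form is to EXPLAIN both variants and compare the plans (table names below are illustrative; EXPLAIN on a DELETE requires MySQL 5.6.3+, so on older versions EXPLAIN the equivalent SELECT instead):

            -- IN-subquery form: a DEPENDENT SUBQUERY in the plan indicates the
            -- subquery is being re-evaluated per row rather than optimized away.
            EXPLAIN DELETE FROM sta_activity
            WHERE id IN (SELECT id FROM activity_to_delete);

            -- JOIN form for comparison.
            EXPLAIN SELECT a.id
            FROM sta_activity a
            JOIN activity_to_delete atd ON a.id = atd.id;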

            The second suggestion I have is that there is some risk in using this query pattern with STATEMENT-BASED replication:

            1) CREATE TEMPORARY TABLE activity_to_delete SELECT ...;
            2) DELETE a FROM sta_activity a JOIN activity_to_delete atd on a.id = atd.id;

            The reason is that when MySQL has a replication slave, the slave cannot safely stop/crash/restart between steps 1 and 2. If that happens, replication will stop because statement (2) cannot be applied without the temporary table created in step 1. This then requires ops intervention to resolve, and in the worst case a re-image of the slave from the master!

            Using ROW-BASED replication avoids this problem, and also reduces the amount of locking that the MySQL server needs to apply for these statements (when used in conjunction with READ-COMMITTED, as Stash uses).
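
            For reference, a minimal way to check and switch the logging format (SET GLOBAL requires the SUPER privilege and only affects new sessions; to make the change persistent, also set binlog_format = ROW in my.cnf):

            -- Check the current binary log format.
            SHOW GLOBAL VARIABLES LIKE 'binlog_format';

            -- Switch new sessions to row-based logging (requires SUPER).
            SET GLOBAL binlog_format = 'ROW';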

            [1] As an aside, MySQL 5.6 may also not be able to OPTIMIZE this subquery in its current form. Explained here: http://mysqlserverteam.com/multi-table-trick/


            Brent P added a comment -

            Stash 3.3.1 has been released with this bug fix.


              dkordonski Dariusz Kordonski (Inactive)
              mheemskerk Michael Heemskerk (Inactive)
              Affected customers: 0
              Watchers: 8
