[JRASERVER-73874] Reindexing issues with large number of comments, worklogs, history overloads indexers

Type: Bug
Resolution: Fixed
Priority: Highest (View bug fix roadmap)
Fix Version/s: 9.0.0, 8.22.4, 8.20.12
Affects Version/s: 8.13.0, 8.20.0, 8.20.1, 8.13.21, 8.20.9
Component/s: Data Center - Index, Indexing
Labels:
- deltaIssueLimits
- fixedByDelta

Introduced in Version:
8.13
Support reference count:
1
Symptom Severity:
Severity 2 - Major
UIS:
3
Bug Fix Policy:
View Atlassian Server bug fix policy

Issue Summary

This is reproducible on Data Center: yes

Problem & solution description:
https://community.developer.atlassian.com/t/safeguards-in-jira-dc-index
https://community.atlassian.com/t5/Data-Center-articles/Introducing-Safeguards-for-Jira-index/ba-p/2037880#M433
https://confluence.atlassian.com/jirakb/jira-indexing-limits-stats-1125876330.html

Steps to Reproduce

Detailed description:
https://community.developer.atlassian.com/t/safeguards-in-jira-dc-index/57636/4

Expected Results

Index should not be overloaded. Actions on issues with large number of comments, worklogs & history should not affect node & cluster performance & stability.

Actual Results

Index is overloaded. When this happens it will affect the performance of all actions triggering re-indexing. Jira may start dropping indexing request which can affect crucial Jira DC components like index replication.

When the index is overloaded you may see the following exception thrown on failed indexing actions:

IndexException: Wait attempt timed out - waited  30000 milliseconds

The problem can be monitored with JIRA-STATS INDEXING-QUEUE logs.

Workaround

Archive issues with large number of related entities (comments, worklogs, history).

is related to

JRASERVER-45903 Adding a new worklog entry is slow due to a large change history

Closed

JRASERVER-66251 Reindexing large number of issues with a lot of change history can cause an OOME

Closed

JRASERVER-72612 Limit the number of comments in an issue

Closed

JRASERVER-73167 Ability To Import Issues without Automatic Indexing

Gathering Interest

relates to

JRASERVER-72045 IndexException: Wait attempt timed out - waited 30000 milliseconds caused by background indexing tasks

Closed

JRASERVER-71976 DBR replication Fails on issues with high count of worklog entries

Closed

JRASERVER-71980 nodereindex replication fails for single issues with 1000 or more worklog entries

Closed

JRASERVER-77874 Allow Jira admins to select which fields can be converted to the supported one so that they can be searchable via UpdatedBy function

Gathering Interest

is mentioned by: IDXREACT-100 Loading...

mentioned in: Page Loading...; Page Loading...; Page Loading...; Page Loading...; Page Loading...; Page Loading...

(3 relates to, 1 is mentioned by, 6 mentioned in)

Matt Doar added a comment - 05/Aug/2023 9:37 PM

That looks like building the Lexorank index
https://confluence.atlassian.com/jirakb/troubleshooting-lexorank-system-issues-779159221.html

Matt Doar added a comment - 05/Aug/2023 9:37 PM That looks like building the Lexorank index https://confluence.atlassian.com/jirakb/troubleshooting-lexorank-system-issues-779159221.html

Rusi Popov added a comment - 05/Aug/2023 11:41 AM

I commented in https://jira.atlassian.com/browse/JRASERVER-72045?attachmentOrder=desc
maybe this makes sense to be added here too:

The bottleneck was the database in our cases of Jira 8.20.x and 9.4 on 2 vCPU-machine with 32 G RAM and 25 indexing threads. For some reason, the indexing ran a query from Atlassian Greenhopper, that was very DB CPU-intensive and slow. For 12 hours the full indexing had completed 9% with that query:

SELECT "FIELD_ID","ID","ISSUE_ID","LOCK_HASH","LOCK_TIME","RANK","TYPE" FROM jira_db."AO_60DB71_LEXORANK" WHERE "FIELD_ID" = $1 AND "ISSUE_ID" = $2 AND "TYPE" = $3

with plan:

jira_rds=> explain SELECT DATA."FIELD_ID",DATA."CHECKED",DATA."DATA_ROW_ID",DATA."DECIMAL_NUMBER_STR",DATA."ISSUE_ID",DATA."ID",DATA."DECIMAL_NUMBER",DATA."DATE",DATA."SIMPLE_DATA",DATA."SELECTED_ID",DATA."USER_KEY",DATA."SIMPLE_NUMBER" FROM jira_db."AO_FC7135_DATA" DATA JOIN jira_db."AO_FC7135_DATA_ROW" DATAROW ON DATA."DATA_ROW_ID" = DATAROW."ID" WHERE DATAROW."FIELD_CONFIG_ID" = 13 AND DATAROW."ISSUE_ID" = 13 ORDER BY DATAROW."ORDER_POSITION" ASC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Sort (cost=677.91..677.92 rows=7 width=179)
Sort Key: datarow."ORDER_POSITION"
-> Nested Loop (cost=0.42..677.81 rows=7 width=179)
-> Seq Scan on "AO_FC7135_DATA_ROW" datarow (cost=0.00..667.93 rows=1 width=16)
Filter: (("FIELD_CONFIG_ID" = 13) AND ("ISSUE_ID" = 13))
-> Index Scan using index_ao_fc7135_dat1120928830 on "AO_FC7135_DATA" data (cost=0.42..9.81 rows=7 width=171)
Index Cond: ("DATA_ROW_ID" = datarow."ID")

created the index:

*create index "AO_60DB71_LEXORANK_FLD_ISS_TYP" ON "AO_60DB71_LEXORANK" ("FIELD_ID", "ISSUE_ID", "TYPE")

the load dropped and the plan now is:

QUERY PLAN
---------------------------------------------------------------------------------------------------------------
Index Scan using "AO_60DB71_LEXORANK_FLD_ISS_TYP" on "AO_60DB71_LEXORANK" (cost=0.43..8.45 rows=1 width=562)
Index Cond: (("FIELD_ID" = 1) AND ("ISSUE_ID" = 2) AND ("TYPE" = 3))

And also the query:

jira_rds=> explain SELECT DATA."DATE_TIME",DATA."ID",DATA."FIELD_ID",DATA."CHECKED",DATA."DATA_ROW_ID",
               DATA."DECIMAL_NUMBER_STR",DATA."ISSUE_ID",DATA."DECIMAL_NUMBER",DATA."SIMPLE_DATA",
               DATA."DATE",DATA."SELECTED_ID",DATA."USER_KEY",DATA."SIMPLE_NUMBER"
          FROM jira_db."AO_FC7135_DATA" DATA
               JOIN jira_db."AO_FC7135_DATA_ROW" DATAROW ON DATA."DATA_ROW_ID" = DATAROW."ID"
         WHERE DATAROW."FIELD_CONFIG_ID" = 1
           AND DATAROW."ISSUE_ID" = 2
         ORDER BY DATAROW."ORDER_POSITION" ASC;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Sort (cost=1005.27..1005.28 rows=6 width=203)
Sort Key: datarow."ORDER_POSITION"
-> Nested Loop (cost=0.42..1005.19 rows=6 width=203)
-> Seq Scan on "AO_FC7135_DATA_ROW" datarow (cost=0.00..996.04 rows=1 width=16)
Filter: (("FIELD_CONFIG_ID" = 1) AND ("ISSUE_ID" = 2))
-> Index Scan using index_ao_fc7135_dat1120928830 on "AO_FC7135_DATA" data (cost=0.42..9.08 rows=7 width=195)
Index Cond: ("DATA_ROW_ID" = datarow."ID")
(7 rows)

I added the index:

*create index "AO_FC7135_DATA_ROW_issue_field_pos_id" on "AO_FC7135_DATA_ROW" ("FIELD_CONFIG_ID", "ISSUE_ID", "ORDER_POSITION", "ID");

and the plan now is:

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.84..17.59 rows=6 width=203)
-> Index Only Scan using "AO_FC7135_DATA_ROW_issue_field_pos_id" on "AO_FC7135_DATA_ROW" datarow (cost=0.41..8.43 rows=1 width=16)
Index Cond: (("FIELD_CONFIG_ID" = 1) AND ("ISSUE_ID" = 2))
-> Index Scan using index_ao_fc7135_dat1120928830 on "AO_FC7135_DATA" data (cost=0.42..9.08 rows=7 width=195)
Index Cond: ("DATA_ROW_ID" = datarow."ID")

With these two database indexes, the full indexing of JIRA completed in 2.5 hours.

The slow full index in JIRA might also be caused by inefficient DB queries. Consider adding (these) DB indexes and only then upgrade machines and indexing threads.

Rusi Popov added a comment - 05/Aug/2023 11:41 AM I commented in https://jira.atlassian.com/browse/JRASERVER-72045?attachmentOrder=desc maybe this makes sense to be added here too: The bottleneck was the database in our cases of Jira 8.20.x and 9.4 on 2 vCPU-machine with 32 G RAM and 25 indexing threads. For some reason, the indexing ran a query from Atlassian Greenhopper, that was very DB CPU-intensive and slow. For 12 hours the full indexing had completed 9% with that query: SELECT "FIELD_ID" , "ID" , "ISSUE_ID" , "LOCK_HASH" , "LOCK_TIME" , "RANK" , "TYPE" FROM jira_db. "AO_60DB71_LEXORANK" WHERE "FIELD_ID" = $1 AND "ISSUE_ID" = $2 AND "TYPE" = $3 with plan: jira_rds=> explain SELECT DATA. "FIELD_ID" ,DATA. "CHECKED" ,DATA. "DATA_ROW_ID" ,DATA. "DECIMAL_NUMBER_STR" ,DATA. "ISSUE_ID" ,DATA. "ID" ,DATA. "DECIMAL_NUMBER" ,DATA. "DATE" ,DATA. "SIMPLE_DATA" ,DATA. "SELECTED_ID" ,DATA. "USER_KEY" ,DATA. "SIMPLE_NUMBER" FROM jira_db. "AO_FC7135_DATA" DATA JOIN jira_db. "AO_FC7135_DATA_ROW" DATAROW ON DATA. "DATA_ROW_ID" = DATAROW. "ID" WHERE DATAROW. "FIELD_CONFIG_ID" = 13 AND DATAROW. "ISSUE_ID" = 13 ORDER BY DATAROW. "ORDER_POSITION" ASC; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Sort (cost=677.91..677.92 rows=7 width=179) Sort Key: datarow. "ORDER_POSITION" -> Nested Loop (cost=0.42..677.81 rows=7 width=179) -> Seq Scan on "AO_FC7135_DATA_ROW" datarow (cost=0.00..667.93 rows=1 width=16) Filter: (( "FIELD_CONFIG_ID" = 13) AND ( "ISSUE_ID" = 13)) -> Index Scan using index_ao_fc7135_dat1120928830 on "AO_FC7135_DATA" data (cost=0.42..9.81 rows=7 width=171) Index Cond: ( "DATA_ROW_ID" = datarow. "ID" ) created the index: *create index "AO_60DB71_LEXORANK_FLD_ISS_TYP" ON "AO_60DB71_LEXORANK" ("FIELD_ID", "ISSUE_ID", "TYPE") the load dropped and the plan now is: QUERY PLAN --------------------------------------------------------------------------------------------------------------- Index Scan using "AO_60DB71_LEXORANK_FLD_ISS_TYP" on "AO_60DB71_LEXORANK" (cost=0.43..8.45 rows=1 width=562) Index Cond: (( "FIELD_ID" = 1) AND ( "ISSUE_ID" = 2) AND ( "TYPE" = 3)) And also the query: jira_rds=> explain SELECT DATA. "DATE_TIME" ,DATA. "ID" ,DATA. "FIELD_ID" ,DATA. "CHECKED" ,DATA. "DATA_ROW_ID" , DATA. "DECIMAL_NUMBER_STR" ,DATA. "ISSUE_ID" ,DATA. "DECIMAL_NUMBER" ,DATA. "SIMPLE_DATA" , DATA. "DATE" ,DATA. "SELECTED_ID" ,DATA. "USER_KEY" ,DATA. "SIMPLE_NUMBER" FROM jira_db. "AO_FC7135_DATA" DATA JOIN jira_db. "AO_FC7135_DATA_ROW" DATAROW ON DATA. "DATA_ROW_ID" = DATAROW. "ID" WHERE DATAROW. "FIELD_CONFIG_ID" = 1 AND DATAROW. "ISSUE_ID" = 2 ORDER BY DATAROW. "ORDER_POSITION" ASC; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------- Sort (cost=1005.27..1005.28 rows=6 width=203) Sort Key: datarow. "ORDER_POSITION" -> Nested Loop (cost=0.42..1005.19 rows=6 width=203) -> Seq Scan on "AO_FC7135_DATA_ROW" datarow (cost=0.00..996.04 rows=1 width=16) Filter: (( "FIELD_CONFIG_ID" = 1) AND ( "ISSUE_ID" = 2)) -> Index Scan using index_ao_fc7135_dat1120928830 on "AO_FC7135_DATA" data (cost=0.42..9.08 rows=7 width=195) Index Cond: ( "DATA_ROW_ID" = datarow. "ID" ) (7 rows) I added the index: *create index "AO_FC7135_DATA_ROW_issue_field_pos_id" on "AO_FC7135_DATA_ROW" ("FIELD_CONFIG_ID", "ISSUE_ID", "ORDER_POSITION", "ID"); and the plan now is: QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.84..17.59 rows=6 width=203) -> Index Only Scan using "AO_FC7135_DATA_ROW_issue_field_pos_id" on "AO_FC7135_DATA_ROW" datarow (cost=0.41..8.43 rows=1 width=16) Index Cond: (( "FIELD_CONFIG_ID" = 1) AND ( "ISSUE_ID" = 2)) -> Index Scan using index_ao_fc7135_dat1120928830 on "AO_FC7135_DATA" data (cost=0.42..9.08 rows=7 width=195) Index Cond: ( "DATA_ROW_ID" = datarow. "ID" ) With these two database indexes, the full indexing of JIRA completed in 2.5 hours. The slow full index in JIRA might also be caused by inefficient DB queries. Consider adding (these) DB indexes and only then upgrade machines and indexing threads.

Matt Doar added a comment - 14/Nov/2022 7:32 PM

Good to hear, thanks

Matt Doar added a comment - 14/Nov/2022 7:32 PM Good to hear, thanks

Maciej Swinarski (Inactive) added a comment - 01/Sep/2022 7:24 PM

Hi thony.lundin, this fix was backported to 8.20.12.

Maciej Swinarski (Inactive) added a comment - 01/Sep/2022 7:24 PM Hi thony.lundin , this fix was backported to 8.20.12.

Thony Lundin added a comment - 25/Jul/2022 1:49 PM

Any news on backporting to 8.20 LTS?

Thony Lundin added a comment - 25/Jul/2022 1:49 PM Any news on backporting to 8.20 LTS?

Kamil Cichy (Inactive) added a comment - 20/Jun/2022 10:22 AM

e6a44563da75 ,

We're considering backporting the safeguards to the 8.20.x LTS, but I cannot promise that yet.

Kamil Cichy (Inactive) added a comment - 20/Jun/2022 10:22 AM e6a44563da75 , We're considering backporting the safeguards to the 8.20.x LTS, but I cannot promise that yet.

Matt Doar added a comment - 08/Jun/2022 9:00 PM

https://community.developer.atlassian.com/t/safeguards-in-jira-dc-index/57636 is a useful link I had missed, thanks

Matt Doar added a comment - 08/Jun/2022 9:00 PM https://community.developer.atlassian.com/t/safeguards-in-jira-dc-index/57636 is a useful link I had missed, thanks

Matt Doar added a comment - 08/Jun/2022 9:00 PM

Sorry. yes 8.20.x for LTS

Matt Doar added a comment - 08/Jun/2022 9:00 PM Sorry. yes 8.20.x for LTS

Kamil Cichy (Inactive) added a comment - 08/Jun/2022 10:33 AM

What changes were made in 8.22 to fix this, in general?

The changes are described in detail in these three articles: https://community.developer.atlassian.com/t/safeguards-in-jira-dc-index/57636. Is there anything specific you'd like to know that's missing from these articles?

Are there plans to port those changes to an 8.22.x release?

I didn't get this one. These changes were introduced gradually in 8.22.2, 8.22.3 and 8.22.4.

Did you mean 8.20.x maybe?

Kamil Cichy (Inactive) added a comment - 08/Jun/2022 10:33 AM What changes were made in 8.22 to fix this, in general? The changes are described in detail in these three articles: https://community.developer.atlassian.com/t/safeguards-in-jira-dc-index/57636 . Is there anything specific you'd like to know that's missing from these articles? Are there plans to port those changes to an 8.22.x release? I didn't get this one. These changes were introduced gradually in 8.22.2, 8.22.3 and 8.22.4. Did you mean 8.20.x maybe?

Matt Doar added a comment - 07/Jun/2022 5:23 PM

1. What changes were made in 8.22 to fix this, in general?
2. Are there plans to port those changes to an 8.22.x release?

Matt Doar added a comment - 07/Jun/2022 5:23 PM 1. What changes were made in 8.22 to fix this, in general? 2. Are there plans to port those changes to an 8.22.x release?

Assignee:: Maciej Swinarski (Inactive)

Reporter:: Maciej Swinarski (Inactive)

Affected customers:: 1 This affects my team

Watchers:: 19 Start watching this issue

Created:: 01/Jun/2022 8:44 AM

Updated:: 09/Jul/2024 7:28 AM

Resolved:: 21/Jun/2022 9:39 AM

Details

Description

Issue Summary

Steps to Reproduce

Expected Results

Actual Results

Workaround

Attachments

Issue Links

Forms

Activity

Collapse comment: Matt Doar added a comment - 05/Aug/2023 9:37 PM

Expand comment: Matt Doar added a comment - 05/Aug/2023 9:37 PM

Collapse comment: Rusi Popov added a comment - 05/Aug/2023 11:41 AM

Expand comment: Rusi Popov added a comment - 05/Aug/2023 11:41 AM

Collapse comment: Matt Doar added a comment - 14/Nov/2022 7:32 PM

Expand comment: Matt Doar added a comment - 14/Nov/2022 7:32 PM

Collapse comment: Maciej Swinarski (Inactive) added a comment - 01/Sep/2022 7:24 PM

Expand comment: Maciej Swinarski (Inactive) added a comment - 01/Sep/2022 7:24 PM

Collapse comment: Thony Lundin added a comment - 25/Jul/2022 1:49 PM

Expand comment: Thony Lundin added a comment - 25/Jul/2022 1:49 PM

Collapse comment: Kamil Cichy (Inactive) added a comment - 20/Jun/2022 10:22 AM

Expand comment: Kamil Cichy (Inactive) added a comment - 20/Jun/2022 10:22 AM

Collapse comment: Matt Doar added a comment - 08/Jun/2022 9:00 PM

Expand comment: Matt Doar added a comment - 08/Jun/2022 9:00 PM

Collapse comment: Matt Doar added a comment - 08/Jun/2022 9:00 PM

Expand comment: Matt Doar added a comment - 08/Jun/2022 9:00 PM

Collapse comment: Kamil Cichy (Inactive) added a comment - 08/Jun/2022 10:33 AM

Expand comment: Kamil Cichy (Inactive) added a comment - 08/Jun/2022 10:33 AM

Collapse comment: Matt Doar added a comment - 07/Jun/2022 5:23 PM

Expand comment: Matt Doar added a comment - 07/Jun/2022 5:23 PM

People

Dates