[CONFSERVER-10030] db2: queries that use 'lower' do not use index because of case sensitivity

Type: Suggestion
Resolution: Won't Fix
Fix Version/s: None
Component/s: None
Labels:

Support reference count:
5
Feedback Policy:

We collect Confluence feedback from various sources, and we evaluate what we've collected when planning our product roadmap. To understand how this piece of feedback will be reviewed, see our Implementation of New Features Policy.

NOTE: This suggestion is for Confluence Server. Using Confluence Cloud? See the corresponding suggestion.

Most of the queries against CONTENT table use 'lower' db function for a title field. The goes for (space) key field in SPACES table.
We do it to make a case insensitive comparison, however it means that db2 does not use the existing indexes for those fields and performs a full table scan.
This results in a poor performance.

This can be applicable to other database too, but it is not verified. Also other databases may have global 'case sensitivity' flag.

- - Sort By Name
  - Sort By Date
  - Ascending
  - Descending
  - Thumbnails
  - List

lower-func-performance.patch
3 kB
03/Jul/2008 12:51 AM
lower-func-performance.patch
19 kB
03/Jul/2008 1:21 AM
lower-func-performance.patch
18 kB
28/Jan/2009 5:52 PM
lower-func-performance.patch
28 kB
21/Apr/2010 10:41 PM
lower-func-performance.patch
34 kB
02/Aug/2010 6:22 PM

is related to

CONFSERVER-12319 Adding users to a large group is slow with default (Hibernate) user management

Closed

CONFSERVER-9152 InnoDB vs MyISAM discrepancy with MySQL

Closed

relates to

CONFCLOUD-10030 db2: queries that use 'lower' do not use index because of case sensitivity

Closed

was cloned as

CONFSERVER-11577 Store pre-lowercased content titles for more efficient lookup

Closed

Form Name

Matt Ryall added a comment - 09/Jul/2012 3:46 AM

As per our announcement, the last major version of Confluence to support DB2 was Confluence 4.2. Confluence 4.3 and later will not support DB2.

Therefore this issue will not be fixed. See my comment above for a workaround in existing instances.

Matt Ryall added a comment - 09/Jul/2012 3:46 AM As per our announcement , the last major version of Confluence to support DB2 was Confluence 4.2. Confluence 4.3 and later will not support DB2. Therefore this issue will not be fixed. See my comment above for a workaround in existing instances.

Matt Ryall added a comment - 05/Jun/2012 11:53 PM

Adding a generated column and index manually can resolve this problem on DB2.

See the instructions on this page, under "DB2 for Linux or Windows": Creating a Lowercase Page Title Index.

Matt Ryall added a comment - 05/Jun/2012 11:53 PM Adding a generated column and index manually can resolve this problem on DB2. See the instructions on this page, under "DB2 for Linux or Windows": Creating a Lowercase Page Title Index .

Igor Minar added a comment - 02/Aug/2010 6:22 PM

Adding the latest version of the patch with removed lower() function calls. This one removes the calls also from all the Java classes (these were giving us a hard time in production as well).

Igor Minar added a comment - 02/Aug/2010 6:22 PM Adding the latest version of the patch with removed lower() function calls. This one removes the calls also from all the Java classes (these were giving us a hard time in production as well).

Igor Minar added a comment - 21/Apr/2010 10:41 PM

adding a patch against confluence 3.2 that removes all lower() calls and relies only on mysql case-insensitive collation

Igor Minar added a comment - 21/Apr/2010 10:41 PM adding a patch against confluence 3.2 that removes all lower() calls and relies only on mysql case-insensitive collation

Matt Ryall added a comment - 20/Aug/2009 5:24 AM

Confluence 3.0 includes some improved caching that alleviates this problem a lot for page lookups. We've also documented the instructions for creating lowercase indexes manually in several databases that support it (including DB2, thanks for the help – Patrick):

http://confluence.atlassian.com/display/DOC/Creating+a+Lowercase+Page+Title+Index

This issue will remain open, but should be much less of a problem if you can follow the instructions on that page.

Matt Ryall added a comment - 20/Aug/2009 5:24 AM Confluence 3.0 includes some improved caching that alleviates this problem a lot for page lookups. We've also documented the instructions for creating lowercase indexes manually in several databases that support it (including DB2, thanks for the help – Patrick): http://confluence.atlassian.com/display/DOC/Creating+a+Lowercase+Page+Title+Index This issue will remain open, but should be much less of a problem if you can follow the instructions on that page.

Igor Minar added a comment - 28/Jan/2009 5:52 PM

rebased version compatible with confluence 2.10.1

Igor Minar added a comment - 28/Jan/2009 5:52 PM rebased version compatible with confluence 2.10.1

Igor Minar added a comment - 11/Jul/2008 3:54 AM

Thanks Don. I'm not happy about the data redundancy, but if it means that you have only one db schema to support that is fast for everyone, then I think it's definitely worth it.

The benefit for us is that you'll do all your QA with the same schema as we use, minimizing the chance of something breaking for us in the future because of this change.

Igor Minar added a comment - 11/Jul/2008 3:54 AM Thanks Don. I'm not happy about the data redundancy, but if it means that you have only one db schema to support that is fast for everyone, then I think it's definitely worth it. The benefit for us is that you'll do all your QA with the same schema as we use, minimizing the chance of something breaking for us in the future because of this change. /i

Don Willis added a comment - 10/Jul/2008 7:51 AM

I completely agree that the use of lower in all these places is going to hurt performance for some databases and that Igor's approach is the most pragmatic one for us to take in fixing this. (This is despite the obvious ickiness of replicated data). I only suggested using the MySQL insensitivity because if we don't manage to prioritise this bug, then you're going to need to maintain whatever workaround you do, and I'd think that a simple workaround would be easier to maintain than one that is database agnostic.

Don Willis added a comment - 10/Jul/2008 7:51 AM I completely agree that the use of lower in all these places is going to hurt performance for some databases and that Igor's approach is the most pragmatic one for us to take in fixing this. (This is despite the obvious ickiness of replicated data). I only suggested using the MySQL insensitivity because if we don't manage to prioritise this bug, then you're going to need to maintain whatever workaround you do, and I'd think that a simple workaround would be easier to maintain than one that is database agnostic.

Igor Minar added a comment - 03/Jul/2008 4:17 PM

Don,

We took the risk and we deployed the patch to production yesterday. So far so good

The data was manually migrated using SQL, the queries should be in the comment section of the patch.

Thanks for the cluster update. I thought that just a simple serialization was used to push the objects over the wire to the second node, but I wasn't sure.

I'd prefer the solution proposed in the patch over changing the hibernate queries and exploiting MySQL insensitivity because the solution is the smallest common denominator for all of your supported databases, which would make it easier for you to test and support it.

Using this solution you could easily and safely provide this enhancements to all of your customers regardless of their database via an upgrade task (I can image that creating an upgrade task that would create custom indexes for all the supported databases would be much trickier).

Igor Minar added a comment - 03/Jul/2008 4:17 PM Don, We took the risk and we deployed the patch to production yesterday. So far so good The data was manually migrated using SQL, the queries should be in the comment section of the patch. Thanks for the cluster update. I thought that just a simple serialization was used to push the objects over the wire to the second node, but I wasn't sure. I'd prefer the solution proposed in the patch over changing the hibernate queries and exploiting MySQL insensitivity because the solution is the smallest common denominator for all of your supported databases, which would make it easier for you to test and support it. Using this solution you could easily and safely provide this enhancements to all of your customers regardless of their database via an upgrade task (I can image that creating an upgrade task that would create custom indexes for all the supported databases would be much trickier).

Deleted Account (Inactive) added a comment - 03/Jul/2008 12:58 PM

Hello,
I want to reiterate
1. DB2 9 for z/OS supports index on column expressions.
2. DB2 LUW supports generated columns with index on generated columns - where the database maintains the generated column and will automatically rewrite queries to use the generated column. In DB2 LUW 9.5, generated columns can be created as hidden columns (not returned for SELECT *).

So this problem is solvable in current releases of DB2 - but it requires user action to solve it.

This issue is a major problem for any site that has large tables. Basically, any customer that's going to have a large wiki installation - this issue is a show-stopper. The lower is used all over the place. Not just on content. The biggest issue is on the SPACES table - since it's used to look up content, blog posts, etc. So it's either a unique lookup on SPACES, or a table space scan to find 1 row.

We also saw this issue on LINKS.
select outgoingli0_.LINKID as LINKID, outgoingli0_.DESTPAGETITLE as DESTPAGE2_
, outgoingli0_.DESTSPACEKEY as DESTSPAC3_, outgoingli0_.CONTENTID as CONTENTID
, outgoingli0_.CREATOR as CREATOR, outgoingli0_.CREATIONDATE as CREATION6_
, outgoingli0_.LASTMODIFIER as LASTMODI7_, outgoingli0_.LASTMODDATE as LASTMODD8_
from LINKS outgoingli0_ CARD 1457006 NPAGES 57402
where (lower(outgoingli0_.DESTSPACEKEY)=? ) COLCARD 11320
and(lower(outgoingli0_.DESTPAGETITLE)=? ) COLCARD 356000

BLOGPOST Title also has the issue.

select blogpost0_.CONTENTID as CONTENTID, blogpost0_.SPACEID as SPACEID
, blogpost0_.TITLE as TITLE, blogpost0_.VERSION as VERSION
, blogpost0_.CREATOR as CREATOR, blogpost0_.CREATIONDATE as CREATION6_
, blogpost0_.LASTMODIFIER as LASTMODI7_, blogpost0_.LASTMODDATE as LASTMODD8_
, blogpost0_.VERSIONCOMMENT as VERSIONC9_, blogpost0_.PREVVER as PREVVER
, blogpost0_.CONTENT_STATUS as CONTENT11_
from CONTENT blogpost0_
, SPACES space1_
where blogpost0_.CONTENTTYPE='BLOGPOST'
and ((lower(space1_.SPACEKEY)=?
and blogpost0_.SPACEID=space1_.SPACEID)
and(lower(blogpost0_.TITLE)=? )
and(blogpost0_.CREATIONDATE>=? )
and(blogpost0_.CREATIONDATE<? )
and(blogpost0_.PREVVER is null )
and(blogpost0_.CONTENT_STATUS='current' ))

CONTENT_LABEL table.
select distinct label0_.LABELID as LABELID
, label0_.NAME as NAME, label0_.OWNER as OWNER
, label0_.NAMESPACE as NAMESPACE, label0_.CREATIONDATE as CREATION5_
, label0_.LASTMODDATE as LASTMODD6_
from LABEL label0_ CARD 22253 NPAGES 424
, CONTENT_LABEL labelling1_ CARD 73483 NPAGES 1711
where (lower(labelling1_.SPACEKEY)=? ) COLCARD 7813
AND(labelling1_.LABELID=label0_.LABELID ) COLCARD 22253/22253
AND(label0_.NAMESPACE=? ) COLCARD 4
order by label0_.NAME

ATTACHMENTS table.
SELECT attachmentid, title, contenttype, creator, creationdate
, lastmodifier, lastmoddate, filesize, attachment_comment
, attversion, prevver
FROM WIKI_RW.attachments CARD 403449
where pageid = ? COLCARD 94288
and lower(title) = ? COLCARD 253952
order by attversion desc with UR

Our wiki has a lot of content, a lot of attachements, a lot of spaces. Any big site - this issue is an absolute performance killer . The wiki quickly becomes useless.

Deleted Account (Inactive) added a comment - 03/Jul/2008 12:58 PM Hello, I want to reiterate 1. DB2 9 for z/OS supports index on column expressions. 2. DB2 LUW supports generated columns with index on generated columns - where the database maintains the generated column and will automatically rewrite queries to use the generated column. In DB2 LUW 9.5, generated columns can be created as hidden columns (not returned for SELECT *). So this problem is solvable in current releases of DB2 - but it requires user action to solve it. This issue is a major problem for any site that has large tables. Basically, any customer that's going to have a large wiki installation - this issue is a show-stopper. The lower is used all over the place. Not just on content. The biggest issue is on the SPACES table - since it's used to look up content, blog posts, etc. So it's either a unique lookup on SPACES, or a table space scan to find 1 row. We also saw this issue on LINKS. select outgoingli0_.LINKID as LINKID, outgoingli0_.DESTPAGETITLE as DESTPAGE2_ , outgoingli0_.DESTSPACEKEY as DESTSPAC3_, outgoingli0_.CONTENTID as CONTENTID , outgoingli0_.CREATOR as CREATOR, outgoingli0_.CREATIONDATE as CREATION6_ , outgoingli0_.LASTMODIFIER as LASTMODI7_, outgoingli0_.LASTMODDATE as LASTMODD8_ from LINKS outgoingli0_ CARD 1457006 NPAGES 57402 where (lower(outgoingli0_.DESTSPACEKEY)=? ) COLCARD 11320 and(lower(outgoingli0_.DESTPAGETITLE)=? ) COLCARD 356000 BLOGPOST Title also has the issue. select blogpost0_.CONTENTID as CONTENTID, blogpost0_.SPACEID as SPACEID , blogpost0_.TITLE as TITLE, blogpost0_.VERSION as VERSION , blogpost0_.CREATOR as CREATOR, blogpost0_.CREATIONDATE as CREATION6_ , blogpost0_.LASTMODIFIER as LASTMODI7_, blogpost0_.LASTMODDATE as LASTMODD8_ , blogpost0_.VERSIONCOMMENT as VERSIONC9_, blogpost0_.PREVVER as PREVVER , blogpost0_.CONTENT_STATUS as CONTENT11_ from CONTENT blogpost0_ , SPACES space1_ where blogpost0_.CONTENTTYPE='BLOGPOST' and ((lower(space1_.SPACEKEY)=? and blogpost0_.SPACEID=space1_.SPACEID) and(lower(blogpost0_.TITLE)=? ) and(blogpost0_.CREATIONDATE>=? ) and(blogpost0_.CREATIONDATE<? ) and(blogpost0_.PREVVER is null ) and(blogpost0_.CONTENT_STATUS='current' )) CONTENT_LABEL table. select distinct label0_.LABELID as LABELID , label0_.NAME as NAME, label0_.OWNER as OWNER , label0_.NAMESPACE as NAMESPACE, label0_.CREATIONDATE as CREATION5_ , label0_.LASTMODDATE as LASTMODD6_ from LABEL label0_ CARD 22253 NPAGES 424 , CONTENT_LABEL labelling1_ CARD 73483 NPAGES 1711 where (lower(labelling1_.SPACEKEY)=? ) COLCARD 7813 AND(labelling1_.LABELID=label0_.LABELID ) COLCARD 22253/22253 AND(label0_.NAMESPACE=? ) COLCARD 4 order by label0_.NAME ATTACHMENTS table. SELECT attachmentid, title, contenttype, creator, creationdate , lastmodifier, lastmoddate, filesize, attachment_comment , attversion, prevver FROM WIKI_RW.attachments CARD 403449 where pageid = ? COLCARD 94288 and lower(title) = ? COLCARD 253952 order by attversion desc with UR Our wiki has a lot of content, a lot of attachements, a lot of spaces. Any big site - this issue is an absolute performance killer . The wiki quickly becomes useless.

Assignee:: Matt Ryall

Reporter:: Anatoli

Votes:: 11 Vote for this issue

Watchers:: 9 Start watching this issue

Created:: 23/Nov/2007 12:58 AM

Updated:: 19/Sep/2019 5:22 AM

Resolved:: 09/Jul/2012 3:47 AM

Details

Description

Attachments

Attachments

Issue Links

Forms

Activity

Collapse comment: Matt Ryall added a comment - 09/Jul/2012 3:46 AM

Expand comment: Matt Ryall added a comment - 09/Jul/2012 3:46 AM

Collapse comment: Matt Ryall added a comment - 05/Jun/2012 11:53 PM

Expand comment: Matt Ryall added a comment - 05/Jun/2012 11:53 PM

Collapse comment: Igor Minar added a comment - 02/Aug/2010 6:22 PM

Expand comment: Igor Minar added a comment - 02/Aug/2010 6:22 PM

Collapse comment: Igor Minar added a comment - 21/Apr/2010 10:41 PM

Expand comment: Igor Minar added a comment - 21/Apr/2010 10:41 PM

Collapse comment: Matt Ryall added a comment - 20/Aug/2009 5:24 AM

Expand comment: Matt Ryall added a comment - 20/Aug/2009 5:24 AM

Collapse comment: Igor Minar added a comment - 28/Jan/2009 5:52 PM

Expand comment: Igor Minar added a comment - 28/Jan/2009 5:52 PM

Collapse comment: Igor Minar added a comment - 11/Jul/2008 3:54 AM

Expand comment: Igor Minar added a comment - 11/Jul/2008 3:54 AM

Collapse comment: Don Willis added a comment - 10/Jul/2008 7:51 AM

Expand comment: Don Willis added a comment - 10/Jul/2008 7:51 AM

Collapse comment: Igor Minar added a comment - 03/Jul/2008 4:17 PM

Expand comment: Igor Minar added a comment - 03/Jul/2008 4:17 PM

Collapse comment: Deleted Account (Inactive) added a comment - 03/Jul/2008 12:58 PM

Expand comment: Deleted Account (Inactive) added a comment - 03/Jul/2008 12:58 PM

People

Dates