Key: CONF-5494
Type: Improvement
Status: Open
Priority: Major
Assignee: Unassigned
Reporter: Dave Loeng [Atlassian]
Votes: 16
Watchers: 16
Confluence

Improve page load time when it contains lots of links

Created: 14/Feb/06 07:07 PM   Updated: Thursday 06:20 AM
Component/s: Linking, Performance, Permissions, Renderer
Affects Version/s: 2.2.8
Fix Version/s: None

Time Tracking:
Not Specified


Participants: Aaron Hamid, Chris Kiehl [Atlassian], Dave Loeng [Atlassian], Don Willis [Atlassian], Garnet R. Chaney, Joseph Benjamin, Matt Ryall [Atlassian], Scott Farquhar [Atlassian] and Tony Atkins
Since last comment: 15 weeks, 4 days ago
Labels:
Support reference count: 10


Description
By links, we mean links to other Confluence pages.

This is slow at the moment because for each link Confluence needs to hit the database to determine if the link points to an existing page or not.



Comments
Garnet R. Chaney added a comment - 14/Sep/07 11:35 PM
This is another example why the rendering process needs to be exposed to allow plugins to be inserted in it.

What about a page directive that says "Assume all links exist"? This could dramatically improve rendering time.

I am finding that a page with 1500 children, and containing links to all the children, takes over 3 minutes to render, despite the fact that the wikitext is only about 60K.

Another workaround might be to change the links to the other pages into external links back to the wiki. That might avoid the database lookup. Of course, you'd lose features like rename and "what pages link to this page".


Don Willis [Atlassian] added a comment - 24/Jan/08 05:47 PM
Most of the slowness is from checking whether the current user has permission to see the page or not. This involves checking both the Space Permissions and Page Restrictions. The Space Permission checks can be particularly costly and are not cached, even during the request.

Tony Atkins added a comment - 25/Apr/08 10:26 AM
So the next question based on that would be, why aren't at least the space permissions cached during the request?

This issue is starting to hit us hard: we have pages that end up checking permissions over 500 times, and caching only helps the second person to view the data within the refresh window. Like the above respondent, we see response times in the tens of seconds for some pages.


Don Willis [Atlassian] added a comment - 27/Apr/08 08:42 PM
Hi Tony,

There's no good reason not to cache them during the request. We just haven't implemented it yet. Thank you for voting for this issue.

Cheers,
Don
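
The request-scoped caching discussed in the comments above can be illustrated with a minimal Python sketch. It is not Confluence code: `RequestPermissionCache` and `slow_space_permission_check` are hypothetical names, and the real permission model (Space Permissions plus Page Restrictions) is reduced to one boolean per user and space. The only point is that a cache created per request and discarded afterwards turns hundreds of identical checks into one.

```python
from typing import Callable, Dict, List, Tuple


class RequestPermissionCache:
    """Memoizes permission checks for the lifetime of a single request."""

    def __init__(self, check: Callable[[str, str], bool]) -> None:
        self._check = check  # the expensive, uncached permission lookup
        self._results: Dict[Tuple[str, str], bool] = {}

    def can_view(self, user: str, space_key: str) -> bool:
        key = (user, space_key)
        if key not in self._results:
            # First time this request asks: do the real check once.
            self._results[key] = self._check(user, space_key)
        return self._results[key]


# Hypothetical stand-in for the costly space permission check.
calls: List[Tuple[str, str]] = []


def slow_space_permission_check(user: str, space_key: str) -> bool:
    calls.append((user, space_key))  # record each real lookup
    return space_key != "SECRET"


# One cache object per request; it is thrown away when the request ends.
cache = RequestPermissionCache(slow_space_permission_check)

# Rendering 500 links into the same space triggers a single real check.
visible = [cache.can_view("tony", "DOCS") for _ in range(500)]
```

Because the cache lives only as long as the request, a permission change is picked up on the next page view, which is presumably why this is safer than a longer-lived cache.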


Joseph Benjamin added a comment - 02/May/08 10:57 AM
There are several other issues related to this same problem as well. We are experiencing very slow load times on a Space with 1000 child pages.

It may make sense to combine all of these into one larger support issue? I know this problem has limited our ability to roll Confluence out to the rest of our users who want a separate page per customer.

Here are the other Issues I know of related to this:

https://support.atlassian.com/browse/CSP-16926
http://jira.atlassian.com/browse/CONF-10535
http://jira.atlassian.com/browse/CONF-9972
http://jira.atlassian.com/browse/CONF-4710

These are all related to slow performance on spaces with a lot of child pages.

thanks!


Aaron Hamid added a comment - 21/Nov/08 04:22 PM
What version(s) does this affect? We are on 2.8.2 with a fairly large install and are seeing some chronic performance problems that could be related. Our 'Space Permissions' cache is listed as 98% effective.

Don Willis [Atlassian] added a comment - 23/Nov/08 06:31 PM
Hi, this affects all versions of Confluence. It should be improved somewhat in Confluence 2.10 by CONF-13235.

Chris Kiehl [Atlassian] added a comment - 06/Jan/09 05:38 PM
We should implement an exists() method on the DAO and the PageManager. This is all the link renderer needs. Once that is done, we can optimize that method to not load whole page objects. And we could use the internal spacekey/title -> id cache. If there is an entry for a spacekey/title combination, we can just return true.
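
A minimal Python sketch of the exists() idea described above, using an in-memory SQLite database so that it runs on its own. Everything here is an assumption made for illustration: the `PageDao` class, the `content` table layout, and the `spacekey`/`title` column names are hypothetical and are not taken from the Confluence schema or API. It shows the two steps from the comment: answer from the spacekey/title -> id cache when possible, otherwise run a query that selects a constant instead of loading the whole page.

```python
import sqlite3
from typing import Dict, Tuple


class PageDao:
    """Hypothetical DAO exposing a cheap existence check for pages."""

    def __init__(self, conn: sqlite3.Connection,
                 id_cache: Dict[Tuple[str, str], int]) -> None:
        self._conn = conn
        self._id_cache = id_cache  # (spacekey, title) -> page id

    def exists(self, space_key: str, title: str) -> bool:
        # Cache hit: the page is known, no database round trip needed.
        if (space_key, title) in self._id_cache:
            return True
        # Cache miss: ask only whether a row exists, not for its body.
        row = self._conn.execute(
            "SELECT 1 FROM content WHERE spacekey = ? AND title = ? LIMIT 1",
            (space_key, title),
        ).fetchone()
        return row is not None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (id INTEGER PRIMARY KEY, spacekey TEXT, "
    "title TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO content (id, spacekey, title, body) VALUES (?, ?, ?, ?)",
    [(1, "DOCS", "Home", "..."), (2, "DOCS", "Install", "...")],
)

# Only "Home" is in the cache; "Install" has to be looked up in the table.
dao = PageDao(conn, id_cache={("DOCS", "Home"): 1})
home_exists = dao.exists("DOCS", "Home")
install_exists = dao.exists("DOCS", "Install")
missing_exists = dao.exists("DOCS", "Missing")
```

The sketch assumes the cache is kept in step with page deletions and renames; a stale entry would make a dead link render as a live one.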

Scott Farquhar [Atlassian] added a comment - 06/Jan/09 07:54 PM
Could we talk to the index for this?

Alternatively, could we do a two-pass system, where we find out all the links we want to link to, and ask for them in one database query, and parse the results? Should be faster than multiple db queries.


Matt Ryall [Atlassian] added a comment - 26/Jan/09 06:52 PM
The permissions checking no longer shows up in our profiling. Rather, the majority of the time in link rendering is spent determining whether the page exists.

We'll be implementing Chris's suggested fix for the next major release of Confluence.


Chris Kiehl [Atlassian] added a comment - 27/Jan/09 10:13 PM
@Scott:

The thing with the Lucene index in Confluence is that it is only updated once every minute. That way you might end up with links not being rendered just because the index queue is not flushed (for example, if you create a new page and link to it on some other page straight away).
Using just one query is a good idea as well, although we might run into problems with the query size if we try querying for 1500 pages at once (we could still break it down into batches though). And I think this change would mean changing the behaviour of the renderer quite a bit.
That's why we will first implement the solution that uses the in-memory cache and see if there really is a need for further optimization.
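
For comparison, the two-pass approach with batching mentioned in this thread might look roughly like the Python sketch below. It is only an illustration under stated assumptions: the `existing_pages` function, the `content` table, and its `spacekey`/`title` columns are hypothetical, SQLite stands in for the real database, and the batch size of 100 is an arbitrary choice rather than a measured limit. The first pass (collecting the link targets from the wiki text) is assumed to have happened already; the sketch covers the second pass, which resolves all targets in a handful of queries instead of one query per link.

```python
import sqlite3
from typing import Iterable, List, Set, Tuple

Link = Tuple[str, str]  # (space key, page title)


def existing_pages(conn: sqlite3.Connection, links: Iterable[Link],
                   batch_size: int = 100) -> Set[Link]:
    """Return the subset of links that point at existing pages."""
    wanted: List[Link] = list(dict.fromkeys(links))  # drop duplicates
    found: Set[Link] = set()
    for start in range(0, len(wanted), batch_size):
        batch = wanted[start:start + batch_size]
        # One placeholder pair per link keeps each query bounded in size.
        where = " OR ".join("(spacekey = ? AND title = ?)" for _ in batch)
        params = [value for link in batch for value in link]
        rows = conn.execute(
            "SELECT spacekey, title FROM content WHERE " + where, params
        )
        found.update((row[0], row[1]) for row in rows)
    return found


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (spacekey TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO content (spacekey, title) VALUES (?, ?)",
    [("DOCS", "Child %d" % i) for i in range(1500)],
)

# 1500 real targets plus one dead link, resolved in 16 queries, not 1501.
links = [("DOCS", "Child %d" % i) for i in range(1500)] + [("DOCS", "Ghost")]
found = existing_pages(conn, links)
```

The renderer would then consult `found` while emitting each link, which is the behavioural change to the rendering pipeline that the comment above is wary of.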


Matt Ryall [Atlassian] added a comment - 17/Mar/09 02:03 AM
I've tested with 500 child pages and either the {children} macro or a list of links on the parent page, using Confluence standalone with HSQL. There's no significant difference between 2.10.1 and 3.0. In both cases, primed page and permission caches help reduce the time from 14 seconds to 1.5 seconds.

I'm going to test further with a larger data set to see if there's some aspect of scale I'm not examining in my current test (like, say, having a huge number of rows in the CONTENT table). I'd like to be able to reproduce the problem in 2.10 – at the moment it seems like this is just a deficiency in the size of the cache.