Crowd Data Center / CWD-5098

Directory syncs with large cache changes can take a very long time to complete


Details

    Description

      Problem

      After an LDAP directory has already synced and users/groups/memberships have been cached to the database, subsequent syncs can take an extremely long time to complete if there is a large change in user management data. This can be caused by a misconfiguration (e.g. a directory that initially pulled in too many users and was later adjusted to pull in fewer), or simply by large changes on the LDAP server itself.

      This has the potential to tie up CPU and database resources and can cause performance issues in the application.

      Example/Steps to Reproduce

      1. Configure an LDAP directory with 50000 users and 10000 groups, with an average of 100 group memberships per person
      2. Allow the directory to be synced to the database
      3. Subsequently, edit the LDAP directory so that the user search filter now only pulls in 1000 users.

      This will result in the directory needing to delete 49000 users from cwd_user. To do that, it also needs to look up all of the memberships of each of the 49000 users from the cwd_membership table and issue DELETE statements to remove them. In the logs and the UI, there will be some indication that a massive number of users are being removed:

      2018-03-26 17:51:06,822 INFO [Caesium-1-3] [atlassian.crowd.directory.DbCachingRemoteChangeOperations] deleteCachedUsersByName deleting [ 49000 ] users
      

      This process will likely take a long time to complete.
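      To put the workload in perspective, a rough sizing query can be run against the Crowd schema. This is a hypothetical illustration: it assumes cwd_membership.child_id references cwd_user.id for user memberships, which is not shown in the logs above.

```sql
-- Hypothetical sizing query: count the membership rows that must also be
-- deleted for the users being removed from a given directory. At an
-- average of 100 memberships per user, deleting 49000 users means on the
-- order of 4,900,000 membership rows, each currently removed with its own
-- DELETE statement.
select count(*)
from cwd_membership m
join cwd_user u on u.id = m.child_id
where u.directory_id = ?;
```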

      Suggestions

      It would actually be quicker for Crowd to sync "from scratch" in this case instead of reconciling the cached dataset in the database. Working with a huge delta is expensive and taxing on the database. In other words, sync time can be improved in these cases by simply adding 1000 users from zero instead of deleting 49000 users out of 50000. This will require some pre-sync statistics gathering, so that the directory can decide whether to use the "from scratch" approach or the original incremental approach.
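      As a minimal sketch of the pre-sync statistics step (a hypothetical query, not something Crowd currently runs), the cached size could be compared with the size of the remote result set before choosing a strategy:

```sql
-- Hypothetical pre-sync check: count the users currently cached for the
-- directory. Comparing this with the number of users returned by the LDAP
-- search gives the size of the delta. If the delta is large relative to
-- the remote result (49000 deletions vs 1000 additions in the example
-- above), dropping the cached data and syncing from scratch is the
-- cheaper strategy.
select count(*) from cwd_user where directory_id = ?;
```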

      Additionally/alternatively, in the case of large deletes, we might be able to improve the way some of the changes are handled from the database perspective. For example, if the user to be removed is in 100 groups, the sync will first need to delete these 100 memberships. Currently, it will issue 100 individual DELETE statements using the primary key column of the cwd_membership table ("id"):

      2018-03-26 18:51:06,822 DEBUG [Caesium-1-1] [org.hibernate.SQL] logStatement delete from cwd_membership where id=?
      2018-03-26 18:51:06,822 TRACE [Caesium-1-1] [type.descriptor.sql.BasicBinder] bind binding parameter [1] as [BIGINT] - [294934]
      

      It should be more efficient to look up all of the membership rows associated with the user ID in one go, using the child_id column, and issue a single DELETE statement instead.
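      Concretely, the suggestion amounts to replacing the per-row deletes with a single statement keyed on the user. Column names are taken from the statements above; depending on the schema, an additional membership-type filter may be needed to restrict the delete to user (rather than nested-group) memberships.

```sql
-- Current behaviour: one DELETE per membership row, by primary key,
-- repeated ~100 times for each user being removed.
delete from cwd_membership where id=?;

-- Suggested: remove all of a user's memberships in one round trip,
-- keyed on the child_id column.
delete from cwd_membership where child_id=?;
```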


            People

              Assignee: Unassigned
              Reporter: Robert Chang (rchang)