Updating a user's attributes requires removing and then re-adding any attributes common to the old and new sets. In some circumstances the remove is flushed after the add, so the add reaches the database before the remove and causes a constraint violation.

            [CWD-1621] Updating user attributes causes database error

            shihab added a comment -

            A user can have many attributes. An attribute can have many values. To store a map of user attributes, a diff is needed to determine which attributes must be added and which must be replaced. This diff was resolved by removing any existing attribute matches and replacing them with the new attribute values. Basically, it comes down to a DELETE on matching attribute keys and INSERTs for the updated attribute values. Under normal circumstances, this is the preferred approach as it offers cleaner logic.

            When a user's attributes are updated concurrently under high load, two simultaneous transactions can each INSERT the same attribute in succession, which, under the read-committed isolation level, violates the uniqueness constraint on the attribute values. This was the error we were experiencing on extranet. The bamboo-svn user performs highly concurrent authentication operations on Crowd, and each authentication operation updates attributes on the user (e.g. the last authenticated stamp), hence the problem.
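
The interleaving described above can be sketched with a deliberately simplified, single-threaded model. This is only an illustration: the class `InsertCollisionDemo`, the method `collides`, and the schedule notation are hypothetical, and the model ignores row locking and commit visibility, which a real database under read-committed would apply.

```java
import java.util.HashSet;
import java.util.Set;

public class InsertCollisionDemo {

    /**
     * Replays DELETE-then-INSERT for two writers, A and B, against one row
     * that is protected by a unique constraint. Each character of the
     * schedule names the writer whose next statement runs.
     * Returns true if some INSERT hits the unique constraint.
     */
    public static boolean collides(String schedule) {
        Set<String> rows = new HashSet<>();
        rows.add("lastAuthenticated");          // the existing attribute row
        int[] statementsRun = new int[2];       // per-writer statement counter
        for (char c : schedule.toCharArray()) {
            int writer = c - 'A';
            if (statementsRun[writer]++ == 0) {
                rows.remove("lastAuthenticated");           // writer's DELETE
            } else if (!rows.add("lastAuthenticated")) {    // writer's INSERT
                return true;                                // duplicate row
            }
        }
        return false;
    }
}
```

With the serial schedule `"AABB"` each writer deletes and re-inserts in turn and nothing collides. With `"ABAB"` both DELETEs run before either INSERT, so the second INSERT finds the row already present.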

            In Crowd 1.x, we did not experience this problem because we let Hibernate manage our collections, at the cost of table-hops. In Crowd 2.0 we moved the multi-valued attributes from a 2-table model to a 1-table model. To be more "effective" with regard to collection mapping, we have implemented manual collection dirty-checking, moving away from the DELETE/INSERT approach to an UPDATE, with DELETE/INSERT only when absolutely required. This reduces the likelihood of INSERT collisions under high concurrent load.
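
A minimal sketch of what such dirty-checking might look like, assuming attributes are held as a map from key to a set of values. This is not the Crowd implementation: the class `AttributeDiff`, the method `plan`, and the rule of pairing a stale value with a new one to form an UPDATE are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class AttributeDiff {

    /**
     * Plans the row operations that turn the stored values of each incoming
     * key into the incoming values. Unchanged values produce no operation.
     * Keys absent from the incoming map are left untouched.
     */
    public static List<String> plan(Map<String, Set<String>> stored,
                                    Map<String, Set<String>> incoming) {
        List<String> ops = new ArrayList<>();
        // Sort keys so the plan is deterministic regardless of the map type.
        for (Map.Entry<String, Set<String>> e : new TreeMap<>(incoming).entrySet()) {
            String key = e.getKey();
            Set<String> oldValues = stored.getOrDefault(key, Collections.emptySet());

            TreeSet<String> stale = new TreeSet<>(oldValues);   // stored only
            stale.removeAll(e.getValue());
            TreeSet<String> fresh = new TreeSet<>(e.getValue()); // incoming only
            fresh.removeAll(oldValues);

            Iterator<String> s = stale.iterator();
            Iterator<String> f = fresh.iterator();
            // Reuse an existing row for a new value wherever possible.
            while (s.hasNext() && f.hasNext()) {
                ops.add("UPDATE " + key + ": " + s.next() + " -> " + f.next());
            }
            // Fall back to DELETE or INSERT only for the leftovers.
            while (s.hasNext()) {
                ops.add("DELETE " + key + ": " + s.next());
            }
            while (f.hasNext()) {
                ops.add("INSERT " + key + ": " + f.next());
            }
        }
        return ops;
    }
}
```

In the authentication case from this issue, a single-valued stamp that changes on every login would plan as one UPDATE rather than a DELETE followed by an INSERT, which is the situation where concurrent INSERTs collided.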

            20 concurrent threads, each with 200 iterations of authentication of the same user, successfully executed against Crowd hooked up to Postgres 8.3. Prior to this commit, 2 concurrent threads would cause INSERT collisions. See PrincipalAuthenticationLoadTest.

            This problem was not identified earlier because our JMeter load tests did not cover the specific case of hammering the exact same user under high load.

            PdZ (Inactive) added a comment - Thanks for the info Shihab.

            shihab added a comment -

            Dogfooding Crowd on extranet. Specifically, the bamboo-svn login was placing a high authentication load on the Crowd server. See Fisheye commit logs for more information on a test that can highlight the issue locally.

            PdZ (Inactive) added a comment - Shihab, can you let us know how this issue was discovered please.

            shihab added a comment -

            This did not resolve the problem. We are now investigating how to replicate high load (for authentication requests) against Postgres on our local machines.

            shihab added a comment -

            Fixed via flushing the Hibernate session after the remove, prior to performing the add. Doesn't feel good though.
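
The ordering hazard this change targets can be illustrated with a toy model in plain Java. It does not use Hibernate: `FlushOrderDemo`, `Table`, `Session`, and `replaceAttribute` are hypothetical names, and the rule that a flush runs queued inserts before queued deletes is a simplifying assumption standing in for a write-behind session that reorders work.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class FlushOrderDemo {

    /** Toy stand-in for a table with a unique constraint on the row key. */
    public static final class Table {
        private final Set<String> rows = new HashSet<>();

        public void insert(String key) {
            if (!rows.add(key)) {
                throw new IllegalStateException("unique constraint violated: " + key);
            }
        }

        public void delete(String key) {
            rows.remove(key);
        }

        public boolean contains(String key) {
            return rows.contains(key);
        }
    }

    /** Toy write-behind session: on flush, queued inserts run before queued deletes. */
    public static final class Session {
        private final Table table;
        private final List<String> pendingInserts = new ArrayList<>();
        private final List<String> pendingDeletes = new ArrayList<>();

        public Session(Table table) {
            this.table = table;
        }

        public void save(String key) {
            pendingInserts.add(key);
        }

        public void remove(String key) {
            pendingDeletes.add(key);
        }

        public void flush() {
            // Drain the queues first so a failed flush leaves nothing behind.
            List<String> inserts = new ArrayList<>(pendingInserts);
            List<String> deletes = new ArrayList<>(pendingDeletes);
            pendingInserts.clear();
            pendingDeletes.clear();
            inserts.forEach(table::insert);
            deletes.forEach(table::delete);
        }
    }

    /** Removes and re-adds a row; returns false if the flush hit the constraint. */
    public static boolean replaceAttribute(Table table, String key, boolean flushAfterRemove) {
        Session session = new Session(table);
        session.remove(key);
        if (flushAfterRemove) {
            session.flush();   // force the delete out before queuing the insert
        }
        session.save(key);
        try {
            session.flush();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

In this model, replacing an existing row without the intermediate flush fails, because the insert is executed while the old row is still present. Flushing after the remove lets the same sequence succeed.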

              shamid@atlassian.com shihab