Monday, April 25, 2011

Sql Server INSERT scope problem

This may have been asked before, but it's really hard to search for terms that limit the search results...

Take the following SQL snippet:

declare @source table (id int)
declare @target table(id int primary key, sourceId int)


set nocount on

insert into @target values (0,0)

insert into @source(id) values(1)
--insert into @source(id) values(2)

set nocount off

insert into @target select (select max(id)+1 from @target), s.id from @source s

select * from @target

This obviously executes without error, but now uncomment the second insert line and the following error occurs:

Msg 2627, Level 14, State 1, Line 15
Violation of PRIMARY KEY constraint 'PK__#7DB3CB72__7EA7EFAB'. Cannot insert duplicate key in object 'dbo.@target'.

I realise that the insert statement more than likely is effected against a snapshot of the @target table so (select max(id)+1 from @target) will always return a value of 1 - causing the violation error above...

Is there any way around this apart from resorting to a cursor?

From stackoverflow
  • You could use an identity column (that's exactly what they are meant for)

    declare @target table(id int IDENTITY(1,1), sourceId int)
    

    If your problem is that the select clause is "computed" before the insert is executed, there's afaik no way around this using a single SQL request

    I think it's by design ; For your insertion to avoid duplicates, the index id must be computed during the insert, not during the select. This is the exact purpose of the IDENTITY keyword.

    If you want to insert one select at a time, you must write separate requests (using cursors for example, but you'll lose atomicity, and will have to use proper locking keywords to avoid race conditions)

    Sean : Thanks for the answer, though I am looking for an answer to this particular scenario as that is what I am facing at present. I cannot adjust the table structures and require exactly what is expected from the above. Note that the seed value may not necessarily be one, depending on targets contents...
    Brann : @Sean: then you should rephrase your question so that you expose what your real problem is, not a simplification (which leads to the kind of simplistic answer you're getting now)
    Brann : @Sean : you may also want to fix your title so that it describes more accurately your question.
    Sean : @Brann : I get what you're saying but I thought that the very specific error in the above simple scenario would elicit an answer, as opposed to being offerd alternative ways of doing things. I kept it simple to highlight the issue. Any suggestion for a title?
    Brann : @Sean ; I edited my answer accordingly ; hope this helps. for a title, what about "SQL IDENTITY-like insertion behaviour without using the IDENTITY keyword" ?
    Sean : @Brann : it's as I thought then. I was wondering if there was a 'switch' (for want of a better word) which would allow reads of the uncommitted data to be used in subsequent inserts. I used a cursor as this was a once-off exercise. Always looking for new tips and tricks though.
  • The way you're determining your new PK value, is a race condition waiting to happen. If your DB is under high load, and multiple records are being inserted at the same time, you're going to get unexpected results.

    Why don't you just use an identity column , and let the database handle the assignment of a new primary Id ?

    Or, you can create some kind of meta-table, which holds a record for every table in your database, and this record contains the next value that should be used as a primary id in the table. Then, you must make sure that every time you create a new record, you also update the next-value in your meta-table (and you should make sure that you do the appropriate locking), but, I see no added value in this approach vs making use of identity columns.

    Sean : See comment from above. No worry about race conditions - this is to be done in a very controlled environment. And the PK is a composite key - the above sample is just that - a sample... it doesn't represent the whole scenario - just the section of the scenario causing me a problem. Thanks anyway.
  • Change your insert statement to the following:

    insert into @target select (select max(id) from @target) + (ROW_NUMBER() OVER(ORDER BY s.id)), s.id from @source s

    This should work for this specific case but I would be careful about generalizing it.

    kristof : +1. @Sean If you cannot modify the target to use identity, that solution should work for you. Just make sure you do not execute multiple statements at the same time.
    Sean : @Brian - if I had enough of a reputation I'd +1 your answer too. This is precisely the type of answer I was hoping for. Thanks.

0 comments:

Post a Comment