connect milos as sysdba ...: 2011

Let's say that you want to re-create all nonclustered PKs and make them clustered.
Here is stored procedure which will generate necessary DDLs.

Table that will store DDLs:



create table log_table 
(
 id integer identity,
 t text
);

Stored procedure:



IF OBJECT_ID ( 'sp_fk_pk', 'P' ) IS NOT NULL 
    DROP PROCEDURE sp_fk_pk;
GO

CREATE PROCEDURE sp_fk_pk
AS

-- Declare variables to store data returned by cursor
DECLARE 
 @childColumnList varchar(2000),
 @parentColumnList varchar(2000),
 @childSchema varchar(100),
 @childTable varchar(100),
 @fkName varchar(100),
 @parentSchema varchar(100),
 @parentTable varchar(100);

-- clean log table 
TRUNCATE TABLE log_table;


-- generate DDL to drop all foreign keys
INSERT INTO LOG_TABLE (t)
SELECT 
 'alter table ['+ SCHEMA_NAME(TB.schema_id)+'].['+OBJECT_NAME(F.PARENT_OBJECT_ID)+'] drop constraint ' + F.NAME  +';'
FROM SYS.FOREIGN_KEYS AS F
JOIN sys.tables AS TB ON F.parent_object_id = TB.object_id
JOIN sys.tables AS FT ON F.REFERENCED_OBJECT_ID = FT.object_id;



/**
  PRIMARY KEY 
 **/



-- declare the cursor to get all NONCLUSTERED indexes on primary keys
DECLARE dm_cursor_pk Cursor
FOR 
 select TABLE_SCHEMA, TABLE_NAME, name 
 from sys.indexes i, INFORMATION_SCHEMA.TABLE_CONSTRAINTS tc
 where name=tc.CONSTRAINT_NAME
 and type_desc='NONCLUSTERED';

-- open cursor
OPEN dm_cursor_pk 

-- iterate over recordset
FETCH NEXT from dm_cursor_pk 
INTO 
 @childSchema,
 @childTable,
 @fkName;
 
WHILE (@@FETCH_STATUS <> -1)
BEGIN
IF ( @@FETCH_STATUS <> -2 )

 SET @childColumnList = NULL;
 -- get list of columns that make PK
 select @childColumnList = coalesce(@childColumnList + ', ', '') + column_name
 from INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE
 where constraint_name=@fkName;
 
 -- insert DDL for index recreation into log table  
 INSERT INTO LOG_TABLE (T) VALUES ('create unique clustered index ' + @fkName + ' on [' + @childSchema + '].[' + @childTable + '](' + @childColumnList + ') with drop_existing;');

 -- iterate over recordset
 FETCH NEXT from dm_cursor_pk 
 INTO 
  @childSchema,
  @childTable,
  @fkName;   
END

-- close the cursor
CLOSE dm_cursor_pk

-- deallocate the space
DEALLOCATE dm_cursor_pk



/*
 FOREIGN KEYS CREATION 
 */
 
 
-- declare the cursor to get all foreign keys
DECLARE dm_cursor_fk Cursor
FOR 
SELECT 
 distinct(F.NAME)
FROM SYS.FOREIGN_KEYS AS F
JOIN SYS.FOREIGN_KEY_COLUMNS AS FC ON F.OBJECT_ID = FC.CONSTRAINT_OBJECT_ID
JOIN sys.tables AS TB ON F.parent_object_id = TB.object_id
JOIN sys.tables AS FT ON F.REFERENCED_OBJECT_ID = FT.object_id;


-- open cursor
OPEN dm_cursor_fk 

-- iterate over recordset
FETCH NEXT from dm_cursor_fk 
INTO 
 @fkName;
 
WHILE (@@FETCH_STATUS <> -1)
BEGIN
IF ( @@FETCH_STATUS <> -2 )
 SET @childColumnList = NULL;
 SET @parentColumnList = NULL;

 -- get list of columns that make FK
 SELECT 
  @childSchema = SCHEMA_NAME(TB.schema_id),
  @childTable = OBJECT_NAME(F.PARENT_OBJECT_ID),
  @childColumnList = coalesce(@childColumnList + ', ', '') + COL_NAME(FC.PARENT_OBJECT_ID,FC.PARENT_COLUMN_ID),
  @parentSchema = SCHEMA_NAME(FT.schema_id),
  @parentTable = OBJECT_NAME (F.REFERENCED_OBJECT_ID),
  @parentColumnList = coalesce(@parentColumnList + ', ', '') + COL_NAME(FC.REFERENCED_OBJECT_ID,FC.REFERENCED_COLUMN_ID)
 FROM (SELECT * FROM SYS.FOREIGN_KEYS WHERE NAME=@fkName) AS F
 JOIN SYS.FOREIGN_KEY_COLUMNS AS FC ON F.OBJECT_ID = FC.CONSTRAINT_OBJECT_ID
 JOIN sys.tables AS TB ON F.parent_object_id = TB.object_id
 JOIN sys.tables AS FT ON F.REFERENCED_OBJECT_ID = FT.object_id
 ORDER BY SCHEMA_NAME(TB.schema_id), OBJECT_NAME(F.PARENT_OBJECT_ID);

 --  generate DDL to recreate FK
 INSERT INTO LOG_TABLE (T) VALUES ('alter table [' + @childSchema + '].[' + @childTable + '] ' 
     + ' add constraint ' + @fkName + ' foreign key (' + @childColumnList + ') references ' + '[' + @parentSchema + '].[' + @parentTable + '] ' 
     + '(' + @parentColumnList  + '); ');
 -- iterate over recordset
 FETCH NEXT from dm_cursor_fk 
 INTO 
  @fkName;
   
END

-- close the cursor
CLOSE dm_cursor_fk

-- deallocate the space
DEALLOCATE dm_cursor_fk


GO

Invocation of the procedure:


exec sp_fk_pk

Check the DDLs:


select t from log_table order by id

Note: use generated DDLs at your own risk

Sample schema:


create table a (
  id integer not null,
  t varchar(10)
 );
 alter table a add constraint pk_a primary key nonclustered(id);
 
 create table b (
  id integer not null,
  a_id integer
 );
 alter table b add constraint pk_b primary key nonclustered(id);
 alter table b add constraint fk_b_a foreign key (a_id) references a (id);
 
 create table c (
  id integer not null,
  t varchar(10)
 );
 alter table c add constraint pk_c primary key nonclustered(id);
 
 create table d (
  id integer not null,
  id1 integer not null,
  t varchar(10)
 );
 alter table d add constraint pk_d primary key nonclustered(id,id1);
 
 
 create table e (
  id integer not null,
  d_id integer,
  d_id1 integer
 );
 alter table e add constraint pk_e primary key nonclustered(id);
 alter table e add constraint fk_e_d foreign key (d_id,d_id1) references d (id,id1);

Today I got complaints from ex-team leader about some stored procedure and "bad coding practice".

I'm going to elaborate problem a bit (it's classic in DB development). And it's real error, not bad coding practice ...

The stored procedure is part of API for a primitive queuing mechanism written in plpgsql for my ex company.
The queuing mechanism was written 5.5 years ago.

Problematic procedure contains code like this:

....

insert into table a (....)
select max(id) from a

....

So I did basic mistake: first insert into table, followed by immediate selection of max id.

Default isolation level for PostgreSQL is READ COMMITTED, so (theoreticaly) the same procedure invoked from the other transaction could commit between insert and select of the previous transaction moving MAX(ID) to a higher value.

Let's "construct" an example.

First, let's create two tables: table a will be main table which will store IDs and the second one, b, will store inserted ID (i.e. expected ID) and MAX(ID) (i.e. the ID we got).

create table a (id serial);
create table b (a_id integer, a_max_id integer);

Here is the store procedure:

create or replace function insert_into_a (_sleep_time numeric(12,4) )

  returns integer as

$body$

	declare

 		_id integer;

	begin

		-- I did small trick here: I prefetched nextval from a_id_seq (as I had to do 6 years ago! shame on me!) 

		-- to show the difference between what we excpected to get and what we got.

		select nextval('a_id_seq') into _id;



		-- ok, let's insert nextval

		insert into a values (_id);

	

		-- let's sleep for given number of seconds 

		perform pg_sleep(_sleep_time);



		-- insert generated id and max id

		insert into b select _id, max(id) from a;



		return _id;



	end;



$body$

  language plpgsql volatile

  cost 100;

The procedure invokes pg_sleep: handy procedure which can help you create race condition tests and emulate real execution.
pg_sleep accepts one numeric parameter which represents number of seconds. The parameter is double precision, so fractional-second delays can be specified.

OK, now we have tables and stored procedure: let's test the system.

First single transaction which sleeps 0 seconds:

milos=# select * from b;

 a_id | a_max_id 

------+----------

(0 rows)



milos=# begin;

BEGIN

milos=# select insert_into_a(0);

 insert_into_a 

---------------

             1

(1 row)



milos=# commit;

COMMIT

milos=# select * from b;

 a_id | a_max_id 

------+----------

    1 |        1

(1 row)

OK, expected result.

Now real action.

Transaction T1

milos=# begin;

BEGIN

milos=# select insert_into_a(100);

....

100 seconds: plenty of time to create second transaction which will screw our final result.

Transaction T2

milos=# begin;

BEGIN

milos=# select insert_into_a(0);

 insert_into_a 

---------------

             3

(1 row)



milos=# select * from b;

 a_id | a_max_id 

------+----------

    1 |        1

    3 |        3

(2 rows)



milos=# commit;

COMMIT

So far, so good.

Back to committed transaction T1:



 insert_into_a 

---------------

             2

(1 row)



milos=# select * from b;

 a_id | a_max_id 

------+----------

    1 |        1

    3 |        3

    2 |        3

(3 rows)

HA!
Last pair (2,3) clearly shows "bad coding practice" :)

So, the proper way is to prefetch the ID using nextval('a_id_seq') and than to re-use it using currval('a_id_seq').

OK, I agree that this example is fake: we achieved with pg_sleep something that cannot be done with real procedure, but still we have race condition, and sometimes it can produce error which can be hard to trace and find (maybe it didn't fail in 5.5 years but ....).

connect milos as sysdba ...

уторак, 6. децембар 2011.

Recreate primary keys as CLUSTERED (SQL Server 2008)

петак, 26. август 2011.

Why select max(id) is wrong with PostgreSQL

Основни подаци о мени