Sunday, October 30, 2011

Strip HTML using UDF in SQL Server 2005

I would strongly suggest to do this in the front end application (or) make use of CLR based function to do this job. But for simple well formed html string may be we can make use of the new XML datatype introduced in SQL Server 2005 as shown in the below example.


CREATE FUNCTION dbo.Strip_WellFormed_HTML

@inputString VARCHAR(MAX) 

RETURNS VARCHAR(MAX) 
AS
BEGIN
--Variable Declaration    
DECLARE @htmlContent XML
    DECLARE @parsedValue VARCHAR(MAX)
    
--Variable Initialization
SET @htmlContent = @inputString;
    
  WITH HTML(InnerText) AS
    (
        SELECT Html.Tag.query('.') FROM @htmlContent.nodes('/') AS Html(Tag)
    )
    SELECT @parsedValue = InnerText.value('.', 'VARCHAR(MAX)') FROM HTML
    RETURN @parsedValue
END
GO


Pasting of html tags in blogger seems to be a difficult task. So just check out the attached screenshots of the way I have tested it.



Strip HTML irrespective of whether it is Well formed or not:

I have used Bill McEvoy's script in the past for this purpose. He followed a very cool idea and the original article can be read here. If you wanted to directly see his script then check out here.


Thursday, October 27, 2011

GO - Batch Separator

GO is not a SQL Statement or SQL Command. It is just a Batch Separator used by SQL Client tools like SQL Server Management Studio, SQL CMD etc.,

Extract from MSDN

"GO is not a Transact-SQL statement; it is a command recognized by the sqlcmd and osql utilities and SQL Server Management Studio Code editor.
SQL Server utilities interpret GO as a signal that they should send the current batch of Transact-SQL statements to an instance of SQL Server. The current batch of statements is composed of all statements entered since the last GO, or since the start of the ad hoc session or script if this is the first GO."

In SSMS:

Though the default Batch Separator is GO we can change it as well! Just go to Tools > Options >Query Execution > SQL Server > General > Batch Separator.



Let's change our Batch Separator as "Done". Please note that it would take effect only from the next SQL Query window which we open. We can't use it directly in any of the already opened Query windows.



In a new query window lets try to use our Batch Separator "Done" to see it working.



In my personal experience changing the Batch Separator is fun as long as we do it in our local boxes or to confuse our friends :) 

I would strongly suggest we stick with the default Batch Separator GO while generating scripts for deployment (or) sharing with other developers. Reason being, if we generate the scripts with this new Batch Separator "Done" then the place where it is run would throw up errors! Because the destination box still has the Batch Separator as 'GO'.

Now lets revert the batch separator to 'GO' and then try the same script. If you had expected an error to be thrown you would be surprised!


It has taken 'DONE' as an Alias name :)

This is one another good reason to use proper alias names ourself in the scripts. Now if we add an alias name ourself and check the same script it would throw the syntax error as expected.



Apart from this since SQL Server 2005 onwards we can use a positive integer value after GO and make that batch to execute that many times. I have used this many times in my previous posts. If you are looking for an working example check out this.


In SQLCMD:


Lets use the same example of printing the current datetime using SQLCMD utility. Goto Command prompt and type SQLCMD. Check out the below screenshot where I have connected using Windows Authentication and used a batch separator for printing current datetime.


Now if we need to change the Batch Separator for this utility as well then we need to make use of -c for it as shown in the below screenshot.


The Batch Separator 'Done' will work only as long as this current connection is open. If you close this command prompt and reopen it again you could see that it has gone back to the default batch separator 'GO'.

To know complete list of options for SQLCMD check out this MSDN article.

Sunday, October 23, 2011

IDENTITY in SQL Server

If you haven't heard about $IDENTITY in SQL Server check out the initial part of the scripts explained below to know what it does.

CREATE TABLE IdentityTest
(
  Sno INT IDENTITY(1,2) PRIMARY KEY,
CreatedDate DATETIME
)
GO


INSERT INTO IdentityTest(CreatedDate) 
SELECT Getdate()
Go 10

1. How to display the LAST inserted IDENTITY Value into a table in the current scope and session?
SELECT SCOPE_IDENTITY()

Generally we would use SCOPE_IDENTITY() but since SQL Server 2005 there is a serious bug with this. The safe option to use is OUTPUT clause. Check out this KB article here.

2. What if we don't know the IDENTITY column name but wanted to display its column values? (OR) Without mentioning the IDENTITY column name how to list all the IDENTITY values in a table?
SELECT $IDENTITY FROM IdentityTest
GO

I have seen $IDENTITY working since SQL Server 2005 and above. Not very sure whether it used to work in earlier versions as I have never tried this prior to SQL 2005.

3. How to check the current Identity seed value?
DBCC CHECKIDENT (IdentityTest)
GO

Checking identity information: current identity value '19', current column value '19'.

So while a new row is inserted it would be
"Current Column Value" (19) + STEP Value mentioned in IDENTITY declaration (2)

4. What happens if the transaction fails?

BEGIN TRANSACTION
INSERT INTO IdentityTest(CreatedDate) SELECT Getdate()
ROLLBACK TRANSACTION

Though the transaction is rolled back the "current column value" would have been increased by now! Check out DBCC CHECKIDENT to verity it

DBCC CHECKIDENT (IdentityTest)
GO

Checking identity information: current identity value '21', current column value '21'.

Let's try to INSERT another row into the table.

INSERT INTO IdentityTest(CreatedDate) 
SELECT Getdate()
GO


SELECT $IDENTITY FROM IdentityTest
GO

We can see that after 19 the identity value inserted is 23.

5. Assume we don't want that gap and want to insert a row with identity value 21 manually. How to do it?
--Option1:
--Immediately after the transcation failed may be we could have RESEED it back.
DBCC CHECKIDENT ("dbo.IdentityTest", RESEED, 19);
GO


--Option 2:
SET IDENTITY_INSERT IdentityTest ON

--You need to mentioned the column list else the statement would fail
INSERT INTO IdentityTest (Sno, CreatedDate)
SELECT 21, Getdate()
GO

SET IDENTITY_INSERT IdentityTest OFF

6. How to list all IDENTITY columns in a database?

SELECT 
OBJECT_NAME(OBJECT_ID) AS [Table Name], 
[Name] AS [Identity Column Name],
[seed_value] AS [Seed Value],
[Increment_value] AS [Increment Value],
[Last_Value] AS [Last Value]
FROM 
sys.identity_columns

7. Is it possible to UPDATE a value of an IDENTITY column?

NO. We cannot update an IDENTITY column. If we try to do so it would throw up an error.

SET IDENTITY_INSERT IdentityTest ON


UPDATE IdentityTest SET Sno = 31
WHERE Sno = 21
GO


SET IDENTITY_INSERT IdentityTest OFF

The error message would be something like this:
Msg 8102, Level 16, State 1, Line 4
Cannot update identity column 'Sno'.

8. Is it possible to generate Identity values as NEGATIVE values?

YES it is possible. In the below example it would start from 1 and start decreasing by -1 for each record.

CREATE TABLE IdentityTest_Negative
(
  Sno INT IDENTITY(1,-1) PRIMARY KEY,
  CreatedDate DATETIME
)
GO


INSERT INTO IdentityTest_Negative(CreatedDate) 
 SELECT Getdate()
GO 10


SELECT * FROM IdentityTest_Negative

Related topic written in 2004 - Fetching Identity Value

Monday, October 17, 2011

SQL Server performance for NOT EXISTS vs NOT IN

Though "NOT EXISTS" and "NOT IN" sounds similar there is quite a lot of difference between them.  To start with check out the blog post by Mladen.

In continuation to what Mladen has already written I thought I would show the differences it makes on the Execution Plan and the IO / Time when we use NOT EXISTS or NOT IN in our queries. Let's see few of the differences between them.

Case 1: Lets use them in columns which are declared as NOT NULL

SET NOCOUNT ON
GO


CREATE TABLE PackageInformation
(
Sno INT IDENTITY(1,1) PRIMARY KEY,
PackageID INT NOT NULL,
PackageName VARCHAR(20)
)
GO

-- I am using the random records generator which I wrote few days back to populate data into this table.
-- generating 10000 records
INSERT INTO PackageInformation  (PackageID, PackageName)
SELECT CAST(RAND(CHECKSUM(NEWID())) * 10000 AS INT),
dbo.udf_StringGenerator('A', 20)
GO 10000


CREATE TABLE ChildTable
(
RID INT IDENTITY(1,1) PRIMARY KEY,
PackageID INT NOT NULL
)
GO

-- Lets take some 40% (approx 4000 records) from the PackageInformation table to populate this table
INSERT INTO ChildTable (PackageID)
  SELECT PackageID
  FROM dbo.PackageInformation
  TABLESAMPLE (40 PERCENT); -- This would work only on SQL Server version 2005 or above
GO

Let's write a query to list Package details from PackageInformation table which is not present in ChildTable.

--Clear out the cache (DONT TRY THIS IN PRODUCTION ENVIRONMENT)
DBCC FREEPROCCACHE
GO

SET STATISTICS IO ON
SET STATISTICS TIME  ON
GO

--Press Control + M to display the Actual Execution Plan of the queries
--Query1: Using NOT IN
SELECT PackageID, PackageName 
FROM dbo.PackageInformation 
WHERE PackageID NOT IN (SELECT PackageID FROM ChildTable)
GO

--Query2: Using NOT EXISTS
SELECT PackageID, PackageName 
FROM dbo.PackageInformation
WHERE NOT EXISTS
 (
   SELECT PackageID FROM ChildTable 
   WHERE ChildTable.PackageID = PackageInformation.PackageID
)
GO

SET STATISTICS IO OFF
SET STATISTICS TIME  OFF
GO

Result:

i) Query using NOT IN : Returned 4690 records
ii) Query using NOT EXISTS : Returned 4690 records

iii) Let's see the Actual execution plan for both the queries. 

iv) Let's also check on the logical reads and CPU time taken for these queries.

So when the column is declared as NOT NULL then both NOT IN and NOT EXISTS seems to perform the same way.

Case 2: Let's change the PackageID column as NULL

ALTER TABLE dbo.PackageInformation
ALTER COLUMN PackageID INT NULL
GO

ALTER TABLE dbo.ChildTable
ALTER COLUMN PackageID INT NULL
GO

--Lets insert some 100 null values into the PackageInformation Table
INSERT INTO PackageInformation 
SELECT NULL,
dbo.udf_StringGenerator('A', 5)
GO 100


Let's run the same query which we used in Case 1 to list Package details from PackageInformation table which is not present in ChildTable.

--Clear out the cache (DONT TRY THIS IN PRODUCTION ENVIRONMENT)
DBCC FREEPROCCACHE
GO

SET STATISTICS IO ON
SET STATISTICS TIME  ON
GO

--Query1: Using NOT IN
SELECT PackageID, PackageName 
FROM dbo.PackageInformation 
WHERE PackageID NOT IN (SELECT PackageID FROM ChildTable)
GO

--Query2: Using NOT EXISTS
SELECT PackageID, PackageName 
FROM dbo.PackageInformation
WHERE NOT EXISTS
 (
   SELECT PackageID FROM ChildTable 
   WHERE ChildTable.PackageID = PackageInformation.PackageID
)
GO

SET STATISTICS IO OFF
SET STATISTICS TIME  OFF
GO


Result:

i) Query using NOT IN : Returned 4690 records (It hasn't considered those 100 new NULL records which we added!!)
ii) Query using NOT EXISTS : Returned 4790 records

iii) Let's see the Actual execution plan for both the queries. 

iv) Let's also check on the logical reads and CPU time taken for these queries.

So when column is declared as NULL then NOT IN seems to generate a pretty complicated execution plan and does NUMEROUS number of logical reads more than NOT EXISTS. So the winner here is NOT EXISTS.

Case 3: Adding NULL values into ChildTable

INSERT INTO ChildTable (PackageID)
SELECT NULL

Result:

i) Query using NOT IN : Returned 0 records!
ii) Query using NOT EXISTS : Returned 4790 records

So if the Subquery returns even one NULL then NOT IN operator would not return any result which isn't right. So again the winner is NOT EXISTS.

I think it would be safe to say that we should use NOT EXISTS instead of NOT IN as it seems to work as expected by us in all the scenarios which we saw in this post.

--Cleanup
DROP TABLE ChildTable
GO
DROP TABLE PACKAGEINFORMATION
GO

Thursday, October 13, 2011

Timezone conversion UTC to CST with Daylight Savings

Converting UTC format date to Standard time of CST or EST is straight forward.

DECLARE @UTC_Date DATETIME
SET @UTC_Date = GETUTCDATE()
SELECT
@UTC_Date AS [UTC],
DATEADD(hh, -6, @UTC_Date) AS [CST - Standard Time],
DATEADD(hh, -5, @UTC_Date) AS [EST - Standard Time]

But if the given date falls under daylight saving then the above calculation won't work.

So how is Daylight saving calculated?

1. If the year <= 2006 then daylight saving is between:
2 am on First Sunday in April till 2 am on Last Sunday in October

2. If the year >= 2007 then daylight saving is between:
2 am on Second Sunday in March till 2 am on First Sunday in November

3. UTC to CST (Standard Time) = -6
4. UTC to CDT (Daylight Time) = -5

5. UTC to EST (Standard Time) = -5
6. UTC to EDT (DayLight Time) = -4

Solution - 1:

DECLARE @UTC_Date DATETIME
SET @UTC_Date = GETUTCDATE()


SELECT
@UTC_Date AS [UTC],
DATEADD(hh, -6, @UTC_Date) AS [CST - Standard Time],
DATEADD(hh,
CASE WHEN YEAR(@UTC_Date) <= 2006 THEN
                CASE WHEN
                      @UTC_Date >=  '4/' + CAST(ABS(8 - DATEPART(WEEKDAY,'4/1/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 1 AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' AND
                      @UTC_Date < '10/' + CAST(32 - DATEPART(WEEKDAY,'10/31/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR)) AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00'
                THEN -5 ELSE -6 END
              ELSE
                CASE WHEN
                      @UTC_Date >= '3/' + CAST(ABS(8 - DATEPART(WEEKDAY,'3/1/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 8 AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' AND
                      @UTC_Date <
                        '11/' + CAST(ABS(8 - DATEPART(WEEKDAY,'11/1/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 1 AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00'
                THEN -5 ELSE -6 END
              END
, @UTC_Date
) AS [CST/CDT - DayLight Time],
DATEADD(hh, -5, @UTC_Date) AS [EST - Standard Time],
DATEADD(hh,
CASE WHEN YEAR(@UTC_Date) <= 2006 THEN
                CASE WHEN
                      @UTC_Date >=  '4/' + CAST(ABS(8 - DATEPART(WEEKDAY,'4/1/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 1 AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' AND
                      @UTC_Date < '10/' + CAST(32 - DATEPART(WEEKDAY,'10/31/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR)) AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00'
                THEN -5 ELSE -6 END
              ELSE
                CASE WHEN
                      @UTC_Date >= '3/' + CAST(ABS(8 - DATEPART(WEEKDAY,'3/1/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 8 AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' AND
                      @UTC_Date <
                        '11/' + CAST(ABS(8 - DATEPART(WEEKDAY,'11/1/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 1 AS VARCHAR) +  '/'
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00'
                THEN -4 ELSE -5 END
              END
, @UTC_Date
) AS [EST/EDT - DayLight Time]
GO

Instead of calling the same set of code again and again we can make it as a User defined function and call it by passing UTC date, Offset values for standard time and daylight saving respectively.

Solution - 2

/* Parameter 1 = UTC Date
Parameter 2 = Offset value for Standard time
Parameter 3 = Offset value for Daylight saving time */

CREATE FUNCTION [dbo].[fn_GetDaylightSavingsTime]
(
@UTC_Date DATETIME,
@ST_Offset INT, -- CST = -6, EST = -5
@DT_Offset INT  -- CDT = -5, EDT = -4
)
RETURNS DATETIME
AS
BEGIN


RETURN 
DATEADD(hh, 
CASE WHEN YEAR(@UTC_Date) <= 2006 THEN  
                CASE WHEN 
                      @UTC_Date >=  '4/' + CAST(ABS(8 - DATEPART(WEEKDAY,'4/1/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 1 AS VARCHAR) +  '/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' AND 
                      @UTC_Date < '10/' + CAST(32 - DATEPART(WEEKDAY,'10/31/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR)) AS VARCHAR) +  '/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' 
                THEN @DT_Offset ELSE @ST_Offset END
              ELSE
                CASE WHEN 
                      @UTC_Date >= '3/' + CAST(ABS(8 - DATEPART(WEEKDAY,'3/1/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 8 AS VARCHAR) +  '/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' AND 
                      @UTC_Date < 
                        '11/' + CAST(ABS(8 - DATEPART(WEEKDAY,'11/1/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR))) % 7 + 1 AS VARCHAR) +  '/' 
+ CAST(YEAR(@UTC_Date) AS VARCHAR) + ' 2:00' 
                THEN @DT_Offset ELSE @ST_Offset END
              END
, @UTC_Date
)
END
GO

Usage:

DECLARE @UTC_Date DATETIME
DECLARE @ST_Offset INT, @DT_Offset INT


SET @UTC_Date = GETUTCDATE()
SET @ST_Offset = -6
SET @DT_Offset = -5


SELECT [dbo].[fn_GetDaylightSavingsTime](@UTC_Date, @ST_Offset, @DT_Offset)

Reference:

Tuesday, October 11, 2011

List all triggers in a database with its source code

Yesterday I saw a question in a forum asking for help in writing a script to list out all Triggers within a database. They also wanted to see the source code of each trigger.

Method 1: Which works in SQL Server 2000, SQL Server 2005 and SQL Server 2008.

But the problem with SYSCOMMENTS is it would truncate any text beyond 4000 characters. As the "Text" column which preserves the source code is declared as NVARCHAR(8000).

/*
Written By: Vadivel Mohanakrishnan
Date: Oct 10, 2011
URL: http://vadivel.blogspot.com
Purpose:  To list all triggers in the current database along with the actual text/source code. It also lists what of type of trigger it is. 


But for all future development we should avoid using SYSOBJECTS, SYSCOMMENTS as it might be removed in the future versions.
*/
SELECT
  SO2.Name AS [Table Name],
    SO1.name AS [Trigger Name],
  OBJECTPROPERTY( [so1].[id], 'ExecIsUpdateTrigger') AS [isUpdate],
    OBJECTPROPERTY( [so1].[id], 'ExecIsDeleteTrigger') AS [isDelete],
    OBJECTPROPERTY( [so1].[id], 'ExecIsInsertTrigger') AS [isInsert],
    OBJECTPROPERTY( [so1].[id], 'ExecIsAfterTrigger') AS [isAfter],
    OBJECTPROPERTY( [so1].[id], 'ExecIsInsteadOfTrigger') AS [isInsteadOf],
    OBJECTPROPERTY([so1].[id], 'ExecIsTriggerDisabled') AS [isDisabled],
    SC.Text AS [Source Code]
FROM
SYSOBJECTS SO1 
    Inner Join SYSOBJECTS SO2 On SO1.parent_obj = SO2.id
    Inner Join SYSCOMMENTS SC On SO1.id = SC.id
WHERE      
SO1.xtype = 'TR' And SO2.xtype = 'U'
ORDER BY 
SO2.Name, SO1.name
GO

Method 2: Works with SQL Server 2005 and above.

This method doesn't have the problem of text getting truncated beyond 4000 characters because the column "definition" of SYS.SQL_MODULES is declared as NVARCHAR(MAX).

/*
Written By: Vadivel Mohanakrishnan
Date: Oct 11, 2011
URL: http://vadivel.blogspot.com
Purpose: To list all triggers in the current database along with the actual text/source code. It also lists what of type of trigger it is.
*/
SELECT 
Object_name( TR.parent_id) AS [Table Name],
TR.[Name] AS [Trigger Name],
TR.is_disabled AS [isDisabled],
TR.is_Instead_of_trigger AS [isInsteadOfTrigger],
MO.definition AS [Source Code], 
TR.create_date AS [Created Date],
TR.modify_date AS [Modified Date]
FROM 
SYS.TRIGGERS TR INNER JOIN SYS.SQL_MODULES MO 
ON TR.Object_id = MO.Object_ID
WHERE 
TR.[Type] = 'TR' AND
TR.is_ms_shipped = 0 AND
TR.[parent_id] > 0
ORDER BY 
Object_name( TR.parent_id), TR.[Name]
GO

Saturday, October 08, 2011

Generate test (random) data for a table in SQL Server

The best product I have seen so far for generating random test records within a SQL Server database is Red Gate's Data Generator software. Its feature list is really impressive. Please be advised it is not a free software but it has a 14 days trial version to try it out.

If you are looking for a simple script to generate records for a standalone table may be try out the below scripts.

To Generate Random Integer Value:

--This will generate an Integer value lesser than 100000 
SELECT CAST(RAND(CHECKSUM(NEWID())) * 100000 AS INT)

To Generate Random date:

--This will create DOB whose age would be between 21 and 60 (21 + 39)
SELECT GETDATE() - ((21 * 365) + RAND()* (39 * 365))

To Generate Random Money value:

--This will generate Money value (rounded to 2 decimals) between 25000 to 30000 
SELECT Round(CAST(25000 + RAND(CHECKSUM(NEWID())) * 5000 AS Money),2)

Assume in a company the salary range is between 25000 to 100000. So we can generate random salary between that range like shown below:

/*
It's kind of self explanatory. 
Min value is 25000 which we need to place it before RAND.
Then MaxValue - MinValue should be used to multiply with the output of RAND(CHECKSUM(NEWID()))
*/
SELECT Round(CAST(25000 + RAND(CHECKSUM(NEWID())) * 75000 AS Money),2)

To Generate Random Strings (Only Alphabets or Alpha numeric or Passwords)

For this let's write an user defined function to return us a random string. It would take two parameters which are explained below:

1. First parameter: To decide whether we need a string based out of only Alphabets (OR) Alpha numeric (OR) Alpha numeric + Special characters (Example: For password).

2. Second parameter: To mention the string length which we need. If we think there might be a need to pass a number greater than 500 then we might need to change the size of @randomString and the RETURNS VARCHAR appropriately.

In order to generate Random strings we need to make use of RAND(). But RAND() cannot be used within an User defined function as of now. So as a work around we would create a View which would return us an RAND() so that we can make use of it within the UDF! But in SQL Server 2008 or above machine we can overcome this limitation by using CRYPT_GEN_RANDOM.

--This view is just a work around to make use of RAND function within a UDF

CREATE VIEW [dbo].[vw_Rand]
AS 
SELECT RAND() AS rnd 
GO

--UDF for Random string Generation

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Vadivel Mohanakrishnan
-- URL: http://vadivel.blogspot.com
-- Create date: Oct 8th, 2011
-- Description: Function to return random string. 
-- =============================================
CREATE FUNCTION udf_StringGenerator
(
-- A = only Alphabets,
-- AN = alpha numeric,
-- P = AN + special characters
@Type VARCHAR(2),
@MaxLength INT
)
--If @MaxLength > 500 then change the size here as well
RETURNS VARCHAR(500)
AS
BEGIN


        --If @MaxLength > 500 then change the size here as well
DECLARE @randomString VARCHAR(500)

       DECLARE @counter SMALLINT
DECLARE @Length INT
DECLARE @strPattern VARCHAR(150)
DECLARE @isType VARCHAR(2)
DECLARE @rand REAL

SET @isType = UPPER(@Type)

SELECT
  @strPattern = CASE
WHEN @isType = 'A' THEN 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
WHEN @isType = 'P' THEN 'ABCDEFGHIJKLMNOPQRST UVWXYZabcdefghijklmnopqrstuvwxyz0123456789 -=+&$'
WHEN @isType = 'AN' THEN 'ABCDEFGHIJKLMNOPQRST UVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
  END

SET @Length = LEN(@strPattern)
SET @randomString = ''
SET @counter = 1

WHILE @counter <= @MaxLength
BEGIN
SET @rand = (SELECT rnd FROM [dbo].[vw_Rand]) 
SET @randomString = @randomString + SUBSTRING(@strPattern, CONVERT(TINYINT, ((@Length - 1) * @rand + 1)), 1)
SET @counter = @counter + 1
END

RETURN @randomString
END
GO

It's time to test all the logic which we have discussed till now. Let's create a table with different data types and using the logic(s) explained above pump in random records into it.

CREATE TABLE tblEmployee 
(
EmpID INT,
EmpName VARCHAR(25),
DOB DATETIME,
Salary MONEY,
Password VARCHAR(8)
)
GO

--Usage: Lets populate some random 10000 records into this table
INSERT INTO tblEmployee 
SELECT CAST(RAND(CHECKSUM(NEWID())) * 10000 AS INT),
dbo.udf_StringGenerator('A', 25),
GETDATE() - ((21 * 365) + RAND()* (39 * 365)),
Round(CAST(25000 + RAND(CHECKSUM(NEWID())) * 1000000 AS Money),2),
dbo.udf_StringGenerator('P', 8)
Go 10000

--Check
SELECT * FROM tblEmployee 
GO

--Clean Up
DROP TABLE tblEmployee 
GO

Though I feel this should help us get started but if there is a real need we can extend this further as well. For example, we can write the logic to generate Integer values within a given range etc., 

Friday, October 07, 2011

How to find missing (GAPS) sequence numbers

Few years back I wrote a solution for the same problem here. Recently I read an article by Henning on the similar topic which was interesting and informative.

After that, I was wondering why not just make use of a NUMBER table and a NOT IN to do this. I am providing the test script as-is and I haven't compared the performance yet between any other solutions out there :)

--Lets set up a sample table and pump in few thousand records with gaps in it
SET NOCOUNT ON;
GO

CREATE TABLE tblFindGaps 
(
    ID INT NOT NULL IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
    [Name] VARCHAR(50) NULL
)
GO

INSERT INTO tblFindGaps (Name) VALUES ('a')
GO 10000

-- Delete some rows to create gaps in the identity column
DELETE tblFindGaps WHERE ID IN (23, 64, 3392, 44)
DELETE tblFindGaps WHERE ID BETWEEN 5000 AND 6000
GO

Solution:

Lets create a table to hold numbers from 1 till the MAX available number already present in tblFindGaps table.


-- Number table 
Create table tblNumbers
(
iValue INT iDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[dummy] char(1)
)

--Find the maximum value of the sequence column of your table
SELECT MAX(ID) from tblFindGaps
GO

INSERT INTO tblNumbers ([dummy]) VALUES ('a')
GO 10000 -- Assign the MAX value here

--Query to list the missing sequence numbers
SELECT  *
FROM 
(
SELECT ROW_NUMBER() OVER(ORDER BY iValue) AS Number FROM tblNumbers
) temp
where Number not in 
(
SELECT [ID] FROM tblFindGaps
)
GO

Thursday, October 06, 2011

How to truncate all tables in a SQL Server database?

1. For those who are adamant to do it via scripts: :)


Few years back I have written a t-sql script to do this. I have disabled the constraints before truncating the data and once it is done have enabled it back. But I guess in the latest versions of SQL Server we might need to drop & recreate the constraints (haven't tried it yet).

2. Generate Scripts:

If we need to return back to complete empty database with just table schema and other SQL Objects like Stored procedure, trigger, constraints etc., then it would be simple to script out the whole database. That way we can just drop and recreate the database anytime using those scripts.

3. Backup/Restore:

If we need to achieve what's mentioned in point 2 and along with that if we need few tables to have the master entries preserved so that our application can work properly! then why not take a Full backup of the database once everything is set in it. So that when ever we want to revert back to that state we can restore it back from that backup.