Thursday, November 17, 2011

A Request

I know "the combined Ohio, Indiana, and Northern Kentucky metro area" is an awkward phrase.  It cries out for abbreviation.

But please, don't call it "OINK". 

Monday, March 28, 2011

Towards a Print Statement that Appears in SQL Profiler –Part 1

The SQL Server PRINT statement can be very useful.  Some uses:
  1. Impromptu performance testing:  save the start time of a chunk of SQL in a variable, and at the end PRINT the difference between the current time and end time.
  2. Verifying during debugging that a variable used in a stored proc has the expected value at a certain point.
  3. Obtaining the value of a variable retrieved from the database during a proc so that the developer can do impromptu querying of other tables and verify their contents.
But the PRINT statement doesn’t echo to SQL Profiler.  This means that you can only do the above if you’re working with SSMS or otherwise have access to SQL Info Messages.  Oh, you can get a copy of the command you executed – “PRINT @ID” or whatever – but you won’t get visibility through Profiler to what @ID was. 

This is a problem if you’re trying to diagnose a stored procedure as an application executes it.  I started research a few days ago on how I might provide a PRINT to Profiler capability.

Let me say up front that I did not find a perfect solution.  But I can offer three partial ones.

1.  Use sp_trace_generateevent.  This is an extended stored procedure that lets you write a user-defined event to the trace log.  You can provide up to 256 chars of text and 8000 bytes of event data.  The characters appear in TextData and the 8000 bytes in BinaryData.   There are 10 reserved event Ids (82 to 91) that you can use for your own purposes. 

Here’s a stored proc that combines PRINT and using sp_trace_generateevent:
CREATE PROCEDURE PrintTrace1
@Text nvarchar(max) 
AS
BEGIN
DECLARE @UserData BINARY(8000) = 0
DECLARE @UserInfo NVARCHAR(256) = SUBSTRING(@Text,1,256)
PRINT   @Text
EXEC sp_trace_generateevent 82, @UserInfo, @UserData
END
GO

In order to see the trace output in Sql Server Profiler, you need to turn on the appropriate user defined event.  My code above uses user-configurable event 0  (id=82), so you need to turn it on:

image

Here’s what the test script looked like:

DECLARE @PrintText varchar(256) = 'This is a test 
This is a test
This is a test
This is a test
This is a test'
EXEC PrintTrace1 @PrintText

And here’s what the SQL Profiler output looks like:

image

This works just fine, as long as you only need to PRINT 256 chars.  But suppose you need more?  Suppose your app works by retrieving templated SQL from a database table, and you need to be able to see every line that’s about to be executed?  In that case, sp_trace_generateevent won’t help you.  We’ll discuss other options next time.

Sunday, March 06, 2011

Quick and Dirty Unpivoting with SQL Server’s XML Features

Let’s suppose you want to return some data in the following extremely unpivoted format:

SELECT  ID
   ,    ColumnName
   ,    Value
FROM    .. something ..


Your source table or query has (let’s say) about 30 different columns.  How can you quickly produce something in the format above?



Well, one way is a bunch of UNION statements, like so (using the AdventureWorks database):



select  ID = ProductId
    ,   ColumnName = 'Name'
    ,   Value = Name
from    Production.Product
union
select  ID = ProductId
    ,   ColumnName = 'ProductNumber'
    ,   Value = ProductNumber
from    Production.Product    
union
select  ID = ProductId
    ,   ColumnName = 'MakeFlag' 
    ,   Value = CONVERT(char(1), MakeFlag)
from    Production.Product


This works fine, if you have only a few columns you want to return in this format.  If you have 20-30, as with the Production.Product table, that’s a lot of UNION statements.  And copying the code to use for another table is basically a waste of time:  you’ll have to change so many things you might as well just start typing from scratch.



Another way is using the UNPIVOT statement.  It produces the same sort of result, and looks like this:



select  *
from    (
    Select  ProductId, 
            ProductLine, 
            Class, 
            Style 
    from Production.Product) as pv
UNPIVOT (Value for FieldName IN (ProductLine, Class, Style)) p


But all the fields you want to unpivot need to be of the exact same type.  For strings, that means the same type AND the same length.  (The ProductLine, Class, and Style fields above are are nchar(2)).  Plus, you need to list all the fields to which you want the UNPIVOT to apply.  If you have 20 or 30 fields, that’s a lot of fields to list.



So, what we need is a way to produce a list of this form, that can adapt to differing column types, can be applied to all fields or some fields in a table with a minimum of column-name-typing, and which can be copied-and-repurposed without requiring a complete retype. 



Here’s one way to do it, using XML.  In terms of performance, it is the slowest of the three options, but in terms of programmer time, it’s definitely the fastest and most versatile.



with A
as (
    --  We're going to return the product ID, plus an XML version of the 
    --  entire record.
	select  ProductId
	    ,   (
	        Select  * 
	        from    Production.Product 
	        where   ProductID = pp.ProductId
	        for xml auto, type) as X
	from    Production.Product pp
	)
,	B
as	(
    --  We're going to run an Xml query against the XML field, and transform it
    --  into a series of name-value pairs.  But X2 will still be a single XML
    --  field, associated with this product ID.
    select  ProductId
        ,   X.query(
		       'for $f in Production.Product/@*
		        return 
    				<product name="{ local-name($f) }" value="{ data($f) }" />
		    ') as X2
	from	A
	)
,	C
as	(
    --  We're going to run the Nodes function against the X2 field, splitting 
    --  our list of "product" elements into individual nodes.  We will then use
    -- the Value function to extract the name and value.  
	select	B.ProductId as ID
		,	norm.product.value('@name', 'varchar(max)') as Name
		,	norm.product.value('@value', 'varchar(max)') as Value
from	B
	cross apply B.X2.nodes('/product') as norm(product)
)
-- Select our results.
select	*
from	C


Note a couple of things:



1.  We’re using (and returning) all the fields from Production.Product, but we didn’t have to list the field names.



2.  We’re using the term “Product” in C and referring to the XML node Production.Product, but those names are arbitrary:  we could have formed the XML in step A in such a way that it had a generic name.



3.  If you wanted to return fewer columns, you have a couple of choices:  you can specify the particular columns you want in the subquery in step A, or you could filter them out at the final step.  (Since the column names are in the field Name, you could even join to sys.all_columns or some other list of column names, so as to filter by type or some other criteria.) 



4.  We are returning the Value column as varchar(max); this is because anything can be expressed as a varchar.  But if we were returning, say, all ints, we could specify in the value ()function that the data should be returned as an type int. 



5.  Nulls are automatically filtered out.



Here’s what the results look like, by the way:



Capture Normalization



Not bad, I think. 

Sunday, February 20, 2011

Doing Things with Strings: SQL, Xml, and String Manipulation, Part II

This series of posts is about using SQL Server’s Xml features to do string manipulations.  Part I talked about creating comma-separated (CSV) lists from SQL database data.  Part II, this post, will talk about parsing a comma-separated list.

Until Xml-typed parameters came along in SQL Server 2005, the only easy way to pass a set of information into a stored procedure was by passing a string parameter that represented serialized data.  The two most common forms were an Xml fragment, and a CSV string.  Parsing these CSV strings usually required a custom table-valued function that used T-SQL string manipulation operations to parse out the information and build the return table. 

This is a perfectly good solution, but there’s a faster one.  I’ll give you the code for it, right up front:

CREATE FUNCTION ParseCsv
(
    @Csv varchar(max)
)
RETURNS TABLE
AS
    RETURN 
    (
        -- We're being passed in a CSV string; we'll replace the commas with
        -- endtag/start tag pairs.
        WITH A
        AS  (
            SELECT  REPLACE(@Csv, ',', '</dummy><dummy>') as XmlFrag1
        )
        ,   B
        AS  (
            -- We're building an XML fragment out of the CSV, in fact.
            SELECT  '<dummy>' + XmlFrag1 + '</dummy>' as XmlFrag2
            FROM    A
        )
        ,   C
        AS  (
            -- Convert it actually to type XML
            SELECT  CONVERT(xml, XmlFrag2) as XmlFrag3
            FROM    B
        )
            -- For more on XmlFragment, see below.  Right now, we want the
            -- content of the fragment, which we're getting with the value()
            -- function, and converting it to "int". 
        SELECT  D.XmlFragment.value('.', 'int') as Value
        FROM    C
            -- When you want to access the Xml field in every record, and 
            -- apply the "nodes" function to it, you need to remember that
            -- nodes() is a *function*, and thus can be used with CROSS APPLY
            -- to run the function for every record in the record source (C).
            -- Hence, we run nodes() on C.XmlFrag3 for every record, and call
            -- the resulting set of data D, with the results of "nodes" being
            -- the field XmlFragment.  
            CROSS APPLY C.XmlFrag3.nodes('/dummy') as D(XmlFragment)
     )
     
  


You’ll want to test it, so here’s code (using the AdventureWorksLT database) that generates a CSV string. 



DECLARE @csv varchar(max)
WITH A
as (
    SELECT (
        SELECT  ProductID 
        FROM    SalesLT.Product
        FOR XML AUTO, ELEMENTS, TYPE
    ) as ProductXml
   )
   
,   B
AS  (
    SELECT A.ProductXml.query('data(*)') AS ProductSeries 
    FROM A
)
SELECT  @csv = REPLACE(CONVERT(varchar(max), ProductSeries),' ',',')
        FROM B
SELECT @csv
select * from ParseCsv(@csv)


And here’s the result you get from the test.  The first recordset is just the select on @Csv, the second one is the result of the function.



image 



In tests, this consistently ran in half the time of a T-SQL string manipulation solution, regardless of the number of values in the CSV.

Labels: , ,

Sunday, January 30, 2011

Useful Things to Know, #428

Contrary to a lot of what you’ll find said online, Linq to SQL seems to work fairly happily with tables from different SQL Server databases (but the same server) in the same model.  You just have to put the full name (including the database) into the Name property of the entities representing tables in the “foreign” database.  You can even build associations between tables from different databases:  Linq to SQL will take your word for it that those relationships exist.

A caveat: I have only tested this on VS 2008 (what I happen to be working with right now) and I haven’t attempted anything fancy with updates or inserts (since what I’m working on doesn’t require them).  But, since I found people saying emphatically that cross-database modeling couldn’t be done in Linq to SQL, I wanted to report that it can. 

Saturday, January 22, 2011

Doing Things With Strings: SQL, Xml, and String Manipulation, Part I

SQL Server 2005 introduced new features for handling and manipulating Xml. Generating Xml via T-SQL is much simpler than it used to be. Parsing Xml is much, much easier, and we have a an Xml data type to use, too. All of that is pretty well documented in the Books Online. So if you need to produce or consume Xml, that's where to look.

But what these posts will be about is using Xml and the Xml features to get some non-Xml things done. Specifically, I'm going to talk about some types of string manipulation that the Xml features make faster and easier.

Generating Comma-Separated Lists (CSV Strings)

Sometimes, when you're summarizing data, you want the values in a column to be aggregated into a comma-separated list. An example of this might be a list of your biggest customers, with the trade associations or buying groups they belong to listed off to one side.

As you know, there is no built-in aggregate to do this in SQL Server. Microsoft (and, I believe, others) have published a SQL CLR aggregate function that will do this for you.

But, thanks to the Xml features, this can be also be done with T-SQL code, without publishing assemblies or writing custom functions. I’m going to demonstrate it with a series of CTEs, so that each step in the transformation is separate, but you can combine all of them into one expression if you like.

Let’s suppose that we are querying the AdventureWorks2008 database, and we want a list of stores with a list of contacts for each store in the same row, in a comma-separated list.

We start by querying the Sales.Store table, and include a subquery of the Person.Contact table, using the Sales.StoreContact table to link between stores and contacts. Our subquery is going to return the names of the contacts as an Xml fragment.
select  CustomerId
    ,   Name
    ,   (
        Select  ContactName  = FirstName + ' ' + ISNULL(MiddleName + ' ','') + LastName
        from    Person.Contact pc
                    inner join
                Sales.StoreContact ssc
                    on    pc.ContactID = ssc.ContactID
        where   ssc.CustomerID = ss.CustomerID
        for     xml auto, elements, type
        ) as ContactList1
from    Sales.Store ss

This looks like this:


Capture


Note that we’re only returning one element in our fragments: the Contact Name element. This is the only text that the fragment contains.


Our next step transforms the ContactList1 Xml field into a space-delimited list of names. Because our data has spaces, we’re going to alter our subquery a little to replace the spaces in the data with a character unlikely to appear in a name. You’ll see why in a minute.

With A
as  (        
    Select    ContactName  = FirstName + ' ' + ISNULL(MiddleName + ' ','') + LastName
         ,    ssc.CustomerID   
    from      Person.Contact pc            
                  inner join                  
              Sales.StoreContact ssc              
                  on   pc.ContactID = ssc.ContactID    
 ),  B
 as  (    
     select  CustomerId        
         ,   Name
         ,   (                
                 select  ContactName = REPLACE(A.ContactName, ' ','|')
                 from    A
                 where   CustomerID = ss.CustomerID 
                 for     xml auto, elements, type 
             ) as ContactList1    
     from   Sales.Store ss
)
select  B.*
   ,    ContactList2 = B.ContactList1.query('data(*)')
from    B

This produces results like this:


Capture2


To convert this to a comma-separated list, we just need to do a conversion and some REPLACE statements:

With A
as  (
        Select    ContactName  = FirstName + ' ' + ISNULL(MiddleName + ' ','') + LastName
             ,    ssc.CustomerID
        from      Person.Contact pc
                        inner join
                  Sales.StoreContact ssc
                        on   pc.ContactID = ssc.ContactID
    )
,   B
as  (
    select  CustomerId
        ,   Name
        ,   (
                select  ContactName = REPLACE(A.ContactName, ' ','|') 
                from    A
                where   CustomerID = ss.CustomerID
                for     xml auto, elements, type
            ) as ContactList1
    from   Sales.Store ss
)
,   C
as  (
select  B.*
   ,    ContactList2 = B.ContactList1.query('data(*)')
from    B
)
select  C.CustomerID
    ,   C.Name
    ,   ContactList = REPLACE(
                        REPLACE(
                            CONVERT(varchar(max), C.ContactList2)
                        ,   ' '  -- convert spaces ...
                        ,   ', ' -- to commas (followed by spaces, if youlike)
                        )
                      , '|'  -- replace the pipe char ...
                      , ' '  -- with the original spaces!
                      )
from    C

… and the result looks like this:


capture3 Voilà: inline comma-separated lists!

Labels: , ,

Saturday, October 30, 2010

Useful Things to Know, #427

When it comes to XML, SQL Server won't always eat its own cooking.

What I mean by that is that it's quite possible to generate syntactically valid XML from valid SQL Server data, via SQL Server queries, that is semantically invalid.  SQL Server will deliver this data, but it won’t consume it or allow it to be assigned to an XML-type variable.  

Here's how.  Suppose you have the bad luck to have some non-printable characters in some of your text fields. XML only tolerates a few non-printable characters -- although I'd have to look it up, I think the ASCII characters for 9, 10, and 13 are the only ones it does tolerate.

So consider this SQL:

create table BadChars
(
TextData varchar(255)
)
insert into BadChars
values
  (char(4)) -- bad char
, (char(6)) -- bad char
, (char(10)) -- good non-printable char
, (char(13)) -- good non-printable char
, (char(33)) -- regular character

Let's retrieve this data as XML:

select *
from BadChars
for xml auto, root('BadCharTable')

This will run fine, and return you a chunk of XML in SSMS, like so:

<BadCharTable>
  <BadChars TextData="&#4;">
  <BadChars TextData="&#6;">
  <BadChars TextData="&#10;">
  <BadChars TextData="&#13;">
  <BadChars TextData="!">
<BadCharTable>

Just try doing anything with it, though:

declare @xml Xml
select @xml =
(select *
 from BadChars
 for Xml auto, root('BadCharTable')
)

The error you'll get is this:

Msg 9420, Level 16, State 1, Line 3
XML parsing: line 1, character 40, illegal xml character

That's because the XML it so happily generated a few moments ago is not valid, due to those non-printable characters, that even in escaped form are not allowed.

So SQL Server will return invalid XML in a result set, if the source data contains invalid chars. It just won't consume it, or allow it to be assigned to an XML variable.