Getting worker thread IDs using DMVs

In SQL 2000, the kpid column in sysprocesses was a convenient way to find out the actual worker thread ID (OS level thread identifier) for a given task. How does one do this using the DMVs?

Here is a quick way:

select R.Session_Id, Th.os_thread_id from sys.dm_exec_requests R
join sys.dm_exec_sessions S on R.session_id = S.session_id
join sys.dm_os_tasks T on R.session_id = T.session_id
join sys.dm_os_workers W on T.worker_address = W.worker_address
join sys.dm_os_threads Th on W.thread_address = Th.thread_address
where S.is_user_process = 1
order by Session_id

The above script will get you the session ID and thread ID for any tasks currently running, and will also filter out any non-user processes. BUT… my hand-written T-SQL looks so crowded and ugly. Let us format it with the T-SQL formatting script I shared a while back on my blog:

SELECT R.Session_Id,
       Th.os_thread_id
FROM   sys.dm_exec_requests AS R
       INNER JOIN
       sys.dm_exec_sessions AS S
       ON R.session_id = S.session_id
       INNER JOIN
       sys.dm_os_tasks AS T
       ON R.session_id = T.session_id
       INNER JOIN
       sys.dm_os_workers AS W
       ON T.worker_address = W.worker_address
       INNER JOIN
       sys.dm_os_threads AS Th
       ON W.thread_address = Th.thread_address
WHERE  S.is_user_process = 1;

Looks a lot easier to read! Now, by flipping the is_user_process flag in the WHERE clause to 0, you can view the OS thread IDs of system processes instead, as in the sketch below. In the next post, I will show you how to use this information to dig even deeper and view the operation of system threads like the lazy writer. Stay tuned!
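
A minimal sketch of that variant; I have added the command column from sys.dm_exec_requests so you can tell which background task (for example LAZY WRITER or CHECKPOINT) owns each thread:

SELECT R.session_id,
       Th.os_thread_id,
       R.command          -- identifies the background task
FROM   sys.dm_exec_requests AS R
       INNER JOIN sys.dm_exec_sessions AS S ON R.session_id = S.session_id
       INNER JOIN sys.dm_os_tasks AS T ON R.session_id = T.session_id
       INNER JOIN sys.dm_os_workers AS W ON T.worker_address = W.worker_address
       INNER JOIN sys.dm_os_threads AS Th ON W.thread_address = Th.thread_address
WHERE  S.is_user_process = 0
ORDER BY R.session_id;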

TechEd India 2013 – Bengaluru

I just delivered a session on ‘T-SQL Horrors: how NOT to code’ at TechEd India. The response has been great; to those of you who found this blog after attending my session, thank you very much for your interest and participation! What I appreciate the most is that people stayed till the very end, despite the session being at the close of a long day. Kudos, and I hope you enjoyed the session; do leave a comment or use the ‘Contact me’ link on this blog to send your feedback.

TechEd India @ Pune, here we come! All my Mumbaikar SQL fans and customers, do see if you can make it on the 25th and 26th at the Pune Marriott.

Here’s a photo (a bit too colorful and blurry, but passable) taken by my good friend and colleague Balmukund, who presented two great sessions: one on AlwaysOn AGs and the other on backup myths.

T-SQL formatter: a PowerShell script

EDIT 19 Aug 2022: This post has not been updated for a long time, and the link to the sample code is broken. I therefore recommend you take a look at Mala’s more recent blog post and associated script at https://www.sqlservercentral.com/articles/formatting-t-sql-scripts-using-scriptdom-and-powershell. If you really want to look at my original sample code, a member of the community updated my older script and shared it as a gist: https://gist.github.com/jheidt/5320803

This is a ‘quick-and-not-so-dirty’ version of a T-SQL formatter / ‘pretty printer’ in PowerShell. To use it, please install the SQL 2012 SP1 version of SQLDOM from the Feature Pack page. (If you already have the SQL 2012 client tools on your machine, you should be ready to go.) And you will need PowerShell, of course.

Here is a sample usage of this script:

tsqlpp.ps1 -Source c:\temp\testcases.sql -Target c:\temp\out.sql

Some usage tips:

  • Right-click and save the file to your machine; do NOT run it directly from this blog site
  • To execute the script, your PowerShell execution policy must be set to RemoteSigned
  • The parser defaults to 110 compatibility, i.e. SQL 2012 syntax. If you are working with older compatibility levels, feel free to edit the script to reference the 80, 90 or 100 versions of the parser
  • The generated / formatted script will NOT contain comments; you will lose any comments you had in the unformatted version

Please leave your comments, suggestions and asks for future versions of the script in the Comments section of this post. Thank you and I’m glad to be of help to you!

Disclaimer: This script is provided AS IS with no warranties, and confers no rights.

Named constraints: two sides to the story!

Background

Constraints in SQL Server are of the following types:

  • CHECK constraints
  • DEFAULT constraints
  • FOREIGN KEY constraints
  • NULLable constraints
  • UNIQUE constraints
  • PRIMARY KEY constraints

BTW, if you are wondering what a NULLable constraint is: it is the formal representation of a NULL / NOT NULL definition for a column. We don’t normally think of NULL definitions in that way, but that is how the parser represents them.

Anyway, constraints such as CHECK, DEFAULT, UNIQUE, PRIMARY KEY and FOREIGN KEY can be named, for example:

CREATE TABLE #TableWithNamedConstraints
(
    i int not null constraint PK_TableWithNamedConstraints primary key,
    j int constraint DF_j default 100,
    k int constraint CK_k check (k > 0)
)

The issue

While naming constraints is generally considered a good practice for base tables, for a temp table such as the one above it can be a real problem if multiple connections execute the same CREATE TABLE code at the same time. The problem arises from the fact that while the temp table name is uniquified, the constraint names are not. Under concurrent execution, you typically receive an error such as the one below:

Msg 2714, Level 16, State 5, Line 1
There is already an object named 'PK_TableWithNamedConstraints' in the database.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.

You can see the evidence of this in sys.objects:

select name from sys.objects
where parent_object_id = object_id('#TableWithNamedConstraints')
or object_id = object_id('#TableWithNamedConstraints')

The output shows that the table name is uniquified but the constraint names are not:

PK_TableWithNamedConstraints
DF_j
CK_k
#TableWithNamedConstraints_________…_____00000000001D

Workaround

This issue is described in this Connect bug as well. The workaround, therefore, is to not name constraints on temporary tables. For the example shown above, the more ‘robust’ version (for the temp table case, with the resulting system-generated names illustrated after it) would look like this:

CREATE TABLE #TableWithNamedConstraints
(
    i int not null primary key,
    j int default 100,
    k int check (k > 0)
)
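
If you re-run the earlier sys.objects query against this version, the constraint names are now system-generated and carry a uniquifying suffix, which is exactly what makes concurrent creation safe. The names below are illustrative only; the actual suffixes vary per connection:

PK__#TableWi__3BD0198E2D27B809
DF__#TableWit__j__3E52440B
CK__#TableWit__k__3F466844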

Conclusion

So the next time you have a CREATE TABLE with constraints, consider the two cases:

  • For base tables, you almost ALWAYS want to name them
  • For temporary tables, you almost NEVER want to name them

Hope this helps!

The Top 10 Issues uncovered by the SQL Server Risk Assessment Program (SQLRAP)

Firstly – Happy New Year, everyone! In my job as a Principal Premier Field Engineer at Microsoft Services, I am a regional lead for a proactive risk identification program called the SQLRAP. For over 7 years now we have evolved this from a manually conducted ‘Health Check’ offering to a highly automated and comprehensive Risk Assessment platform. I have the honor of sharing the top 10 most common patterns (a.k.a. Issues) we uncover as part of the program – you can read more about this at our MSPFE team blog.

“The operation has timed out” while using Power View in Excel 2013

This one is a quick tip. I recently upgraded to Office 2013 RTM and wanted to use the excellent Power View feature, which is now an integral part of Excel! However, after I set up my tables (actually, I used the Power Pivot data model), I was consistently presented with the message “The operation has timed out”, under the title “Power View error”, whenever I tried to open the Power View add-in.

If you encounter this, my tip to you is to reinstall the Silverlight runtime on your PC. I don’t know exactly why, but that was in some way causing the above issue. I hope it will save you a few hours of troubleshooting!

Microsoft Report Viewer 2012 update: a ‘gotcha’ to be aware of

Something I have been waiting on for a long time has finally been released: the ReportViewer 2012 redistributable RTM package is now available for download!

Deployment notes

FYI, ReportViewer is used by Management Studio (SSMS) and other utilities, as well as by custom applications which use it to render local RDLC reports, or within web applications to view remotely rendered reports.

SQL 2012 installation deploys ReportViewer if the Management Tools are selected for installation. The other shipping vehicle for the ReportViewer control is Visual Studio 2012. This blog post pertains mainly to the case where ReportViewer was installed through the normal SQL 2012 installer.

FYI, you can view the ReportViewer 2012 assembly version at C:\windows\assembly\GAC_MSIL\Microsoft.ReportViewer.WinForms\11.0.0.0__….. Right-clicking the assembly and viewing the Details tab will give you the version of the DLL.

Note the ‘gotcha’

Now, the updated runtime release deploys the equivalent of the SQL 2012 SP1 binaries, so you get the latest and greatest bits! It should deploy an 11.0.3010 version of the Microsoft.ReportViewer.WinForms.dll file (and of the other files as well).

Now, here’s the ‘note from the field’ detail which you can only get from me: if you just install SQL 2012 SP1 (without later running the above download), it does not seem to update the ReportViewer control. Normally this may not have much visible impact, but if you are like me, you will want to keep the runtime up to date, given the number of important fixes in such updated versions.

Test case

In my tests, just applying the SQL 2012 SP1 installation did NOT upgrade the runtime to the 11.0.3000 level; it was still at 11.0.2100. However, applying the updated runtime MSI above did upgrade the runtime to 11.0.3010.

Your checkpoint is that the version of the ReportViewer 2012 assembly under C:\windows\assembly\GAC_MSIL\Microsoft.ReportViewer.WinForms\11.0.0.0__….. should finally be 11.0.3010 or higher. (Repeat this check for the other controls, such as Microsoft.ReportViewer.WebForms, as well.)

Conclusion

So, in short: if you use ReportViewer, either indirectly (as in SSMS) or directly (through custom applications developed using Visual Studio 2012), it is highly recommended to update your RTM ReportViewer 11.0 runtime to the latest version using the MSI from the download link.

Extra T-SQL checks to complement SQL Upgrade Advisor

Background

More and more customers are looking to upgrade to SQL Server 2012, which offers exciting features like AlwaysOn Availability Groups, ColumnStore indexes and much more. As part of my job on the Microsoft Premier Field Engineering team, I am called upon by customers to assess their readiness to upgrade to the new version.

Normally, when we perform an upgrade readiness assessment, we also review the T-SQL code the customer is running. This is done to uncover breaking changes and other patterns which might cause poor performance or behavioral changes after the upgrade. For breaking changes, we normally recommend that customers use the publicly available SQL Server Upgrade Advisor (SSUA) tool. However, not all breaking changes can be detected by the current SSUA toolset.

Stuff outside of SSUA scope

Here are some examples of T-SQL breaking changes which cannot be uncovered by SSUA. Some of these are boundary scenarios, but I feel it is important to list them so that you are aware of the issues. Also, some of these are known to the SSUA team, but checks for them were not incorporated into SSUA due to ship timelines and priority ratings.

Issue: Conflicting locking hints
Description: A specific issue in SQL Server 2000 caused locking hints in the FROM clause of an UPDATE statement to be ignored. This bug was fixed in SQL 2005, which can surface as an upgrade issue if the locking hint in the FROM clause differs from the locking hint on the UPDATE target table.
Example: See this Connect bug for an example.
Recommendation: If locking hints are used at all, keep them consistent between the table source and the FROM clause of the UPDATE, as in the sketch below.
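
A minimal sketch of the pattern to look for, using a hypothetical dbo.Stock table; on SQL 2005 and later, differing hints on the two references of the target table are typically rejected as conflicting rather than silently ignored:

-- Risky pattern: the UPDATE target and the FROM reference carry different hints
UPDATE dbo.Stock WITH (TABLOCK)
SET    qty = qty + 1
FROM   dbo.Stock WITH (ROWLOCK)
WHERE  id = 42;

-- Consistent version: state the hint once, on the aliased table source
UPDATE S
SET    qty = qty + 1
FROM   dbo.Stock AS S WITH (ROWLOCK)
WHERE  id = 42;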

Issue: Objects in the FROM clause must have distinct exposed names
Description: In SQL 2000, identically named tables from different databases did not require an alias.
Example: This code works under compatibility level 80 but breaks at compatibility level 90 and above:

select *
from DB1.dbo.sometab
join DB2.dbo.sometab
on DB1.dbo.sometab.name = DB2.dbo.sometab.name

Recommendation: It is generally a good practice in any case to alias all tables in a JOIN; that is the recommended solution here as well (see the sketch below).
Additional Reading: You can read more about this at Nacho’s blog.
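
A sketch of the aliased rewrite, using the same hypothetical DB1 / DB2 tables:

select A.*, B.*
from DB1.dbo.sometab as A
join DB2.dbo.sometab as B
on A.name = B.name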

Issue: Subqueries are not supported in the GROUP BY clause
Description: The GROUP BY clause cannot contain a subquery in an expression that is used for the group by list. In earlier versions of SQL Server, this was allowed; in SQL Server 2008 and later, error 144 is returned.
Recommendation: Be aware of this breaking change and revise the query logic accordingly, for example as sketched below.
Additional Reading: There are some Connect articles which describe this. Read them here, here and here.
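
One possible rewrite, sketched against hypothetical dbo.Orders and dbo.Regions tables: evaluate the subquery in a derived table first, then group on the resulting column:

-- Fails with error 144 on SQL 2008 and later:
-- SELECT COUNT(*)
-- FROM dbo.Orders AS O
-- GROUP BY (SELECT R.region_name FROM dbo.Regions AS R
--           WHERE R.region_id = O.region_id);

-- Rewrite: compute the expression first, then group on it
SELECT D.region_name,
       COUNT(*) AS order_count
FROM   (SELECT O.order_id,
               (SELECT R.region_name
                FROM   dbo.Regions AS R
                WHERE  R.region_id = O.region_id) AS region_name
        FROM   dbo.Orders AS O) AS D
GROUP BY D.region_name;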

Issue: SETUSER statement usage
Description: SETUSER may not be supported in a future release of SQL Server.
Recommendation: We recommend that you use EXECUTE AS instead, as sketched below.
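
A minimal sketch of the replacement; ‘SomeUser’ is a hypothetical database user:

-- Old style: SETUSER 'SomeUser';
EXECUTE AS USER = 'SomeUser';
SELECT USER_NAME();   -- runs under the impersonated security context
REVERT;               -- switch back to the original context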

Issue: ORDER BY clauses in views
Description: The ORDER BY clause is not valid in views, inline functions, derived tables, and subqueries unless the TOP or OFFSET and FETCH clauses are also specified. Even then, the clause only determines the rows returned by TOP or OFFSET and FETCH; it does not guarantee ordered results when these constructs are queried, unless ORDER BY is also specified in the query itself.
Recommendation: Specify the ORDER BY clause only in the outermost query, not inside views (see the sketch below).
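
A sketch with a hypothetical view over dbo.Orders; the inner ORDER BY only qualifies the TOP, so ordered output must be requested by the outer query:

CREATE VIEW dbo.vRecentOrders
AS
SELECT TOP (100) order_id, order_date
FROM   dbo.Orders
ORDER BY order_date DESC;   -- feeds the TOP only; does NOT order SELECTs from the view
GO

SELECT order_id, order_date
FROM   dbo.vRecentOrders
ORDER BY order_date DESC;   -- ordering must be requested here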

Issue: FASTFIRSTROW table hint usage
Description: The usage of FASTFIRSTROW as a table hint is disallowed in SQL 2012. We recommend that hints be used only as a last resort by experienced developers and database administrators.
Recommendation: Evaluate the query hint OPTION (FAST 1) instead, as sketched below.
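
A sketch of the substitution, again using a hypothetical dbo.Orders table:

-- Disallowed on SQL 2012:
-- SELECT * FROM dbo.Orders WITH (FASTFIRSTROW);

-- Equivalent intent, still supported:
SELECT *
FROM   dbo.Orders
OPTION (FAST 1);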

Issue: COMPUTE clause is not allowed at database compatibility level 110
Description: The COMPUTE clause generates totals that appear as additional summary columns at the end of the result set. This clause is no longer supported in SQL Server 2012.
Recommendation: Use the ROLLUP operator instead, as sketched below.
Additional Info: A Connect request has been logged to include this check in SSUA. Vote for it!
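
A sketch of the rewrite against a hypothetical dbo.Sales (region, amount) table; note that ROLLUP returns the grand total as an extra row with a NULL grouping column, rather than as a separate summary result set:

-- Fails at compatibility level 110:
-- SELECT region, amount FROM dbo.Sales ORDER BY region COMPUTE SUM(amount);

-- Supported rewrite:
SELECT region,
       SUM(amount) AS total_amount
FROM   dbo.Sales
GROUP BY ROLLUP (region);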

Conclusion

So, now you know the most common patterns which you need to check for in addition to what SSUA already uncovers. However, let’s say you have a huge T-SQL code base and cannot do this manually. Our PFE team is working on some offerings: a SQL Server Upgrade Assessment, and a T-SQL Patterns and Practices Review, which among other things will also cover the above deprecation / upgrade checks. If you need more information, you can contact me for details.

Also, if you have any comments, or additional cases which are not detected by SSUA, please leave a feedback item in the Comments section below. I read each one of them!

Getting TaskUnzip to work with SQL Server Integration Services (SSIS) 2012

Background

It is a very common requirement in SSIS packages to extract the contents of a compressed ZIP file and then process those contents in a data flow. From what I can see, TaskUnzip appears to be quite a popular way to achieve this. So, as part of the BI Aviation project, I decided to use it for extracting the data from the ZIP files available in those datasets.

Using TaskUnzip

In my testing I used build 1.3.0.1, available for SQL 2008 (SSIS_TaskUnZip_1.3.0.1_SQL_2008.zip). I suspected it might not work on SQL 2012 as-is, but decided to give it a try. The setup instructions are included in the ZIP file, so I ran Install_SQL_2008.cmd to install the library. That failed, because the script references SQL 2008 paths and service names. So here is the corrected installation script:

net stop "SQL Server Integration Services 11.0"
gacutil.exe /u "ICSharpCode.SharpZipLib"
gacutil.exe /u TaskUnZip
copy "ICSharpCode.SharpZipLib.dll" "%ProgramFiles%Microsoft SQL Server110DTSTasks" /Y
copy "TaskUnZip.*" "%ProgramFiles%Microsoft SQL Server110DTSTasks" /Y
gacutil.exe /i "%ProgramFiles%Microsoft SQL Server110DTSTasksICSharpCode.SharpZipLib.dll"
gacutil.exe /i "%ProgramFiles%Microsoft SQL Server110DTSTasksTaskUnZip.dll"
net start "SQL Server Integration Services 11.0"
pause

The fixes change the references from 10.0 to 11.0 and from 100 to 110, to suit the SQL 2012 version. You have to make similar corrections in the x86 version of the CMD script, because designers like the VS2010-based SSIS package editor are still 32-bit:

net stop "SQL Server Integration Services 11.0"
gacutil.exe /u "ICSharpCode.SharpZipLib"
gacutil.exe /u TaskUnZip
copy "ICSharpCode.SharpZipLib.dll" "%ProgramFiles(x86)%Microsoft SQL Server110DTSTasks" /Y
copy "TaskUnZip.*" "%ProgramFiles(x86)%Microsoft SQL Server110DTSTasks" /Y
gacutil.exe /i "%ProgramFiles(x86)%Microsoft SQL Server110DTSTasksICSharpCode.SharpZipLib.dll"
gacutil.exe /i "%ProgramFiles(x86)%Microsoft SQL Server110DTSTasksTaskUnZip.dll"
net start "SQL Server Integration Services 11.0"
pause

At this stage, you would expect the TaskUnzip task to show up in the SSIS Toolbox, but on a fresh installation of SQL 2012 it will not!

Rebuilding TaskUnzip for SQL 2012

Start by downloading the sources (I used this link and clicked the ‘download’ link at the top) and opening the project in Visual Studio 2010 SP1. When you rebuild the project, you get the following errors:

The type or namespace name ‘Dts’ does not exist in the namespace ‘Microsoft.SqlServer’ (are you missing an assembly reference?)
The type or namespace name ‘Task’ could not be found (are you missing a using directive or an assembly reference?)

(errors truncated for brevity). This is due to the namespace changes in DTS in SQL 2012. You will also find that the references to the older DLLs are broken.

We also need to change the project’s target to the .NET Framework 4.0 runtime (read more about this here).

Then remove the old references and add Microsoft.SqlServer.Dts.Design and Microsoft.SqlServer.ManagedDTS (though the namespace names are the same, the DLLs they reference are different, hence you need to add them afresh).

After this is done, the DLL build should succeed.

Deploying the recompiled DLL

Next, when you uninstall the old DLL and install the newly built one, you might get the following error:

C:\2008>gacutil.exe /i "C:\Program Files\Microsoft SQL Server\110\DTS\Tasks\TaskUnZip.dll"
Microsoft (R) .NET Global Assembly Cache Utility.  Version 3.5.21022.8
Copyright (c) Microsoft Corporation.  All rights reserved.

Failure adding assembly to the cache:   This assembly is built by a runtime newer than the currently loaded runtime and cannot be loaded.

This is because the gacutil.exe which ships with the TaskUnzip library ZIP file is actually for framework version 3.5. To fix this, you need a gacutil for v4.0. You can find an updated gacutil at “c:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Bin\NETFX 4.0 Tools” (this path is on your VS.NET box). Copy these files to your deployment server, overwriting the version which shipped with TaskUnzip. More details on this step are here.

Conclusion

That’s it! By recompiling the DLL you can add it to the toolbox and then into your Control Flows in SSIS. We will be using this component to import our BI Aviation data, so stay tuned for more details on how that is done!

And in closing, please take a minute to leave your questions or comments in the space provided at the end of this post. I do appreciate your feedback!


Disclaimer: the mention of 3rd party or community products in these blog posts in no way constitutes a recommendation or advice. These are my personal opinions and do not reflect those of my employer.

Aviation BI Project – Part 1: Requirements

Background

One of my hobbies is following the commercial aviation scene. My job requires me to fly around a lot, and that only sparks my curiosity further about how the commercial aviation business works. What better way, then, than to use this hobby as motivation to test new features in SQL Server 2012? This hybrid interest, which I call Aviation BI, is the driver behind a series of blog posts in which I will take you through a typical lifecycle of a BI project:

  • Defining requirements
  • Solution design and technology selection
  • Cleansing and importing data into the data warehouse
  • Querying the data warehouse
  • Using a multidimensional / tabular model to obtain slice / dice insights
  • Visualizing those insights in eye-catching fashion

In the first blog post of this series, I will take you through what I (as an end user) see as possible requirements. We will also review the datasets available for consumption.

Requirements

Our requirements fall into a few categories, depending on the perspective we are talking about. A typical retail consumer would want to check some statistics:

  1. Given a particular sector, which airlines serve it?
  2. What is the on-time performance of each airline on that sector?
  3. Which airline has been providing the cheapest tickets on the sector?
  4. What is the average arrival delay for a particular airport? Is there a pattern based on time of day?
  5. What is the average load factor (how full is the aircraft) on the sector?

As an airline executive, I might want to look at:

  1. Which are the routes with the highest traffic?
  2. Which aircraft am I routinely having service difficulties with?

As an airport administrator, I might want to look at:

  1. For my airport, is the ramp time (the time taken to taxi from gate to runway) very high compared to other airports?
  2. What are the most common locations from which international passengers arrive to my airport?
  3. Which is the busiest time for arrivals and departures for my airport?

As an aviation analyst I may have the following queries:

  1. Which type of aircraft is used to make trans-Atlantic journeys?
  2. Which operators are still using DC-10 aircraft? (DC-10s were largely phased out by major operators in the late 1990s.)
  3. Did the Antonov 225 make any appearances in the US recently?
  4. What are the longest non-stop flights to the USA?

And so on… These perspectives are imaginary, but I suppose they are very typical of what people in those roles would ask to see. To make one of them concrete, a small query sketch follows.
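
Purely as an illustration of where this series is headed, here is how requirement 4 from the retail consumer list might eventually be answered, assuming a hypothetical warehouse fact table dw.FactOnTime with arrival delay and scheduled arrival columns:

-- Average arrival delay by airport and hour of day (hypothetical schema)
SELECT dest_airport,
       DATEPART(HOUR, scheduled_arrival) AS arrival_hour,
       AVG(arrival_delay_minutes)        AS avg_arrival_delay
FROM   dw.FactOnTime
GROUP BY dest_airport, DATEPART(HOUR, scheduled_arrival)
ORDER BY dest_airport, arrival_hour;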

Data sets available

There are some amazing datasets available in the public domain. For our purposes, we will be using the following:

Data set: BTS On-time Performance
Data points: origin and destination airports; departure delay; arrival delay; taxi times; flight distance; flight time
Data available: 1987 onwards
Granularity: flight
Row count: 146,090,545

Data set: Airline Origin and Destination Survey (DB1B) Market
Data points: number of coupons for this market¹; origin and destination airports; ticketing and operating carrier; market fare; number of passengers
Data available: 1993 onwards
Granularity: individual ‘market’
Row count: 346,513,372

Data set: Airline Origin and Destination Survey (DB1B) Ticket
Data points: number of coupons in itinerary; origin airport; round trip indicator; miles flown; ticket fare per person; reporting carrier
Data available: 1993 onwards
Granularity: individual ticket
Row count: 190,783,532

Data set: Airline Origin and Destination Survey (DB1B) Coupon
Data points: origin and destination airports; number of passengers included; fare class
Data available: 1993 onwards
Granularity: individual coupon
Row count: 584,874,480

Data set: T-100 Segment
Data points: unique carrier code; origin and destination airports; aircraft type; departures scheduled and performed; load factor; seats vs. passengers; freight and mail; distance; ramp-to-ramp time; air time
Data available: 1990 onwards
Granularity: aggregated flights (monthly)
Row count: 7,347,563

Data set: T-100 Market²
Data points: unique carrier code; origin and destination airports; service class; domestic / international indicator; passengers; freight and mail; distance
Data available: 1990 onwards
Row count: 5,994,306

Data set: FAA Service Difficulty Reporting (SDR)
Data available: 1994 onwards
Granularity: each service difficulty report

¹ A ‘market’ in DB1B terminology is a break in journey other than for changing planes.

² The difference between ‘market’ and ‘segment’ in T-100 is explained at this link. The T-100 reporting guide is also useful for understanding this dataset.

Next Steps

Over a series of blog posts, I will take you through how we:

  1. Cleanse and import this data into a data warehouse.
  2. Build some Analysis Services databases on top of the relational data; we will explore both multidimensional and tabular model databases for this step and see where each fits in.
  3. Consume this data through Reporting Services, Excel and Power View clients.
  4. Finally, perform an incremental load of ‘new’ data into the warehouse, follow it with incremental processing of the Analysis Services databases, and confirm that the reporting pieces pick up the new data.

All of this, towards answering the ‘customer’ requirements which we laid out earlier!

So if you have some comments on the scope of this ‘project’, I would LOVE to hear from you! Do leave a comment below if you liked this idea and would like to see more.