Monthly Archives: July 2010

The Dangers of Sub-queries!!!!

Ok, now that I have your attention, this really should be titled "the danger of not qualifying all objects," but that isn’t quite sensational enough to make you click through to here :)

Imagine, if you will, a developer expecting ten records to be inserted into a table, and the query has been running for almost an hour. A quick glance at sp_who2 shows that it is using a lot of IO, I mean a lot. The first thing that happens is a command decision from the boss to kill the query. I took a quick peek and found the plan still in the plan cache while we waited for this thing to roll back. As I looked at the plan I saw something really wrong. It looked like the table receiving the inserts wasn’t getting the ten records we thought but 169 million records, every record from the table in the select clause.

I look at the query and on the surface everything looks good. When I hit the check mark it compiles without error. Looking at the plan in detail I notice a little icon that tipped me off on where to look next.

[image: execution plan warning icon]

Oh, that’s bad. Warnings: No join predicate. Ouch. But this is a sub-query; how can I have no join predicate? Let’s take a look at an example.

IF OBJECT_ID('tempdb..#t1') IS NOT NULL
  DROP TABLE #t1
IF OBJECT_ID('tempdb..#t2') IS NOT NULL
  DROP TABLE #t2
GO
CREATE TABLE #t1 (
  t1id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)
  
ALTER TABLE #t1
 ADD   PRIMARY KEY ( t1id )
 
CREATE TABLE #t2 (
  t2id    INT   NOT NULL,
  t1name  VARCHAR(50)   NULL,
  t1place VARCHAR(50)   NULL)
  
ALTER TABLE #t2
 ADD   PRIMARY KEY ( t2id )
 
INSERT INTO #t1
SELECT 
  spid,
  loginame,
  hostname
FROM   
  master.dbo.sysprocesses
WHERE  hostname <> ''

INSERT INTO #t2
SELECT 
  spid,
  loginame,
  hostname
FROM   
  master.dbo.sysprocesses
WHERE  hostname <> ''

This gives us two tables with some data. We want to find all the records in #t1 that also exist in #t2, but instead of a join we will use an IN with a sub-query.

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 t1id
                FROM   
                 #t2)
 

Looks simple enough, and it will pass the compile test all day long even though t1id doesn’t exist in #t2. Since you can use columns from the top-level query in the sub-query, this is a perfectly valid piece of T-SQL.
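A quick way to see what the parser is doing: run the sub-query by itself. Standing alone it fails immediately, because t1id can only resolve against the outer table when the query is nested.

```sql
-- On its own this raises:
--   Msg 207, Level 16: Invalid column name 't1id'.
SELECT 
  t1id
FROM   
  #t2
```

Inside the IN clause, that same reference silently binds to #t1.t1id, so the typo never surfaces as an error.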

Our IN query, though, gives us this plan when we look at it.

[image: execution plan with a "No Join Predicate" warning]

And there you have it: a join without an ON clause, as far as the optimizer is concerned. By not qualifying all the columns in the sub-query we opened ourselves up to this error. This isn’t the optimizer’s fault! If we re-write this as it was meant to be:

SELECT 
  t1name,
  t1place
FROM   
  #t1
WHERE  t1id IN (SELECT 
                 t2id
                FROM   
                 #t2)
 

We get a plan that is more to our liking.

[image: corrected execution plan]

You could also re-write this as a join instead of using the IN clause, which would have avoided the problem as well. A scary way to learn that SQL Server will do exactly what you tell it to, even if you are wrong!
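For illustration, here is the join version of the same query (equivalent here because t2id is #t2’s primary key, so the join can’t multiply rows). Because both sides of the ON clause must be qualified, mistyping the column name fails at compile time instead of silently binding to the outer table.

```sql
SELECT 
  t1.t1name,
  t1.t1place
FROM   
  #t1 AS t1
  JOIN #t2 AS t2
    ON t1.t1id = t2.t2id
```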

Fusion-IO releases new 2.1 driver and firmware

And it is well worth the upgrade. I recently had the opportunity to interview David Flynn, CEO of Fusion-IO, and that will be coming up soon. I have been beta testing the 2.0 driver for quite some time and have been very happy with the performance and the reduction in required system memory (by half!). The 2.1 driver is the official release of the 2.x series and shows some gains even over the 2.0 driver I’ve been testing. I always do a little test run with HD Tach before diving into my other, more detailed tools, and right off the top the 2.1 driver is faster yet again than both the 1.27 and the 2.0 drivers. The blue is the 2.0, the red is the 2.1. I don’t know about you, but getting a performance bump from a firmware and driver upgrade is always a good thing!

 

[image: HD Tach results, blue is the 2.0 driver, red is the 2.1]

Creating UDL Files On The Fly

Many years ago, in the dark ages of DTS, I created a little app that would take two parameters and build a UDL file. This and some crafty VBScript allowed me to loop through a list of servers on the fly. I hadn’t thought about this code in almost ten years until I came across John Paul Cook’s blog post on using UDL files. I thought I’d just post up the code; it is basic but got the job done! I did clean it up a bit since it was written when .NET 1.0 was all the rage. The secret sauce is writing the file encoded as Unicode and doing binary writes to get all the little bits in the correct place. The UDL file format is picky that way. Enjoy!

using System;
using System.IO;
using System.Text;

namespace CreateUDL
{
    class Program
    {
        static void Main(string[] args)
        {
            if (args.Length < 2)
            {
                Console.WriteLine("Usage: CreateUDL <path> <servername>");
                return;
            }

            string path = args[0];
            string serverName = args[1];
            string fileName = path + serverName + ".udl";

            try
            {
                // FileMode.Create truncates any existing file so stale
                // bytes from a longer previous version can't survive.
                using (FileStream output = new FileStream(fileName, FileMode.Create))
                using (BinaryWriter writer = new BinaryWriter(output))
                {
                    // UDL files must be little-endian Unicode; write the
                    // UTF-16 byte order mark (0xFEFF, -257 as a short) first.
                    writer.Write((short)(-257));

                    WriteUnicode(writer, "[oledb]\r\n");
                    WriteUnicode(writer, "; Everything after this line is an OLE DB initstring\r\n");
                    WriteUnicode(writer,
                        "Provider=SQLOLEDB.1;Integrated Security=SSPI;" +
                        "Persist Security Info=False;" +
                        "Initial Catalog=master;Data Source=" + serverName + "\r\n");
                }
            }
            catch (Exception e)
            {
                Console.WriteLine(e.Message);
            }
        }

        static void WriteUnicode(BinaryWriter writer, string s)
        {
            // Write the raw UTF-16 bytes. BinaryWriter.Write(string) would
            // prefix a length value, which would corrupt the file.
            writer.Write(Encoding.Unicode.GetBytes(s));
        }
    }
}
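For reference, the file the program produces contains the UTF-16 byte order mark followed by this text (shown here with a hypothetical server named MYSERVER):

```
[oledb]
; Everything after this line is an OLE DB initstring
Provider=SQLOLEDB.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=master;Data Source=MYSERVER
```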



Building My New Home, SQL Server IO

In April of 2009 I decided it was time to blog about my SQL Server experiences in earnest. Steve Jones (twitter|blog) over at SQL Server Central was very supportive and gave me some space there. I’ve been playing with different blogging platforms over the last six months and have settled on WordPress. I’m also taking some advice from Brent Ozar (twitter|blog) and syndicating it out to the world. Again, Steve has been nice enough to just feed my new blog into my old blog space on SSC, so there won’t be any loss of old articles or comments on them. With that said, you can visit me at http://www.sqlserverio.com. I plan to grow it to be more than a blog, but who knows how that will turn out!

The PASS Session Selection Process, My Experience Part 3

In my last two posts I talked specifically about the process. This round I’m going to discuss the tools of the trade. Note: These are my views and opinions and not that of PASS in any way shape or form.

This was my first year on the program committee. From talking to other people on the committee, there have been several tools and methods used in the past to do the work of selecting abstracts. I can’t speak to the previous tools, just to this year’s. And it will change again next year. PASS is a dynamic, volunteer organization; things can change pretty quickly when they need to. I also have to say that Elena Sebastiano and Jeremiah Peschka were both very helpful and responsive to questions. Lance Harra was also on the Professional Development track team and was easy to work with. This isn’t Lance’s first time doing this, and for that I was grateful. He helped me stay focused and really helped guide the selection process. In all, the final selection process was pretty smooth in that regard.

The Tool
At first blush the web-based tool we were to use seemed pretty simple. Jeremiah did a training session and I felt like I could use it without much fuss. The tool is integrated into the main PASS website, which is based on DotNetNuke. Since they have to work inside the DNN framework, there are some limitations. It has a limited amount of space to display a lot of information, and in the inner panel you often have to scroll down and to the right.

The main page has all the sessions listed, but across multiple pages. I actually missed this my first night and only went through the first page of submissions. I thought there were only 20 or so submissions because I couldn’t see the page counter until I scrolled the frame all the way to the bottom.

The detail page was laid out pretty well; the first thing I would change, if possible, would be the column names. They looked just like that: camel-case column names. Secondly, the section to enter notes and set ratings was a little slow to use. You had to click on a rate button, fill out the form, and submit. This requires a server round trip every time. It does keep you from losing anything you have put into the form so far, though.

When you are done with the detail ratings you are back at the main page for your final ratings. Again, every button push was a server call, and it got tiring at times. This is just your final rating and the reason for rejecting or approving an abstract. You also have to set a final reason, which is ultimately used by the heads of the program committee to pick the session list.

This brings me to the selection of reasons an abstract was rejected. I have to say the list was limited and it was difficult to choose. There isn’t a “You were awesome but there weren’t enough slots” option in the drop-down. We have to put in a reason, so I tried to pick the most appropriate one I could.

In all, the tool was functional and allowed us to do the work. Again, this is the first year for this tool and I’m sure it will undergo some changes.

Odds and Ends

One of the things I thought was odd was the lack of knowledge sharing. I could see my partner’s totals for each submission but not any of the notes. Since we aren’t in the same room, let alone in the same state, it pretty much means out-of-band emails or phone calls to talk about abstracts. Also, as this was my first time doing this, it would have been nice to see why Lance had rated an abstract the way he did.

After talking with some of the submitters, it appears that they don’t get any feedback on why they were chosen or not, just what we picked in the reasons drop-down. I took notes on pretty much every abstract with the assumption that they would be fed back to the submitter, so if they chose to submit again next year they wouldn’t make some of the same mistakes.

Lastly, a speaker is limited in the number of sessions they can present. This guarantees that you don’t see the same three people the whole Summit. The problem is we don’t know if someone has been chosen more than the allotted number of times. If we pick someone who then gets pulled for another track, we have to depend on our alternate selection to fill the slot. We did some second-guessing on some folks with the assumption they would be gobbled up by other tracks. In hindsight, it would have been helpful to know that a person had put in, say, five submissions, and to what tracks, to make our choices a little better. Possibly prioritize the tracks and publish to the tracks downstream who is off the table. Maybe even allow the submitter to put a preference on their submissions, so we have just that little bit more information on what they would like to speak on as well.

The PASS Session Selection Process, My Experience Part 2

In part one I talked about the abstracts. While important, abstracts are only one part of a complicated selection dance. Since there are so many submissions and so few slots even the best abstracts may not be chosen.

The Selection Criteria

We were given a set of instructions and categories to rate things. There weren’t too many of them and they were subjective in nature. We rated abstracts, topics and the presenter with a final subjective rating to act as a catch all.

Abstracts
Covered in part one.

Topics
This was a little odd to me. The topic and abstract are clearly dependent on each other. I wouldn’t approve an abstract if the topic wasn’t appropriate. Conversely, I wouldn’t approve a topic if the abstract was weak. Also, don’t confuse the title of the abstract with the topic. The abstract could be the best written one in the world and not get approved if the topic was, say, how to knit doilies.

Presenter
This was probably the one thing that soaked up most of my time. It involved lots of footwork on my end. If I don’t know you, haven’t seen you present, or don’t have enough information, how do I rank you? This is where filling in your bio on the abstract submission is important. If you had spoken at a previous PASS Summit, there is a possibility I could see your past rankings. They range from 1 to 5; if you are in the high 3’s or 4’s, that works out well enough for me. If you haven’t spoken at the Summit, I would look at your online presence. Do you blog, tweet, or do other things to show you can communicate with the community? Also, I looked to see if you had presented at a regional or local level and tried to contact people I do know to ask how they thought you did. If you have spoken and I can find your slide decks, that also helped me out. Luckily, I have been involved with the community for a long time and have attended every PASS Summit since 2003.

Subjective Rating
I really don’t know how to deal with this one. I used it mostly to sum up my thoughts and put a final rating on the submission. To me, all of it is subjective; it’s my opinion whether the submission should move forward. This is like saying “I love the abstract, topic and speaker, but the sky is blue today so I’m giving it a 1”.

Final Thoughts

Personally, I’d like to see clear breakouts with instructions on how to use them. There are several fundamental criteria that would keep a session out right off the top; you answer them, and if they don’t tally up you move on to the next submission. There are exceptions to every rule, but in most cases I think it would work well.

My next, and probably last, post will cover the tools and processes that PASS makes available to do this job.

The PASS Session Selection Process, My Experience Part 1

Well, I have just finished up my part in the abstract selection process for the PASS Summit. This was my first year as a volunteer in this process, so I don’t have any specific previous experience to draw on. I do have some experiences that are similar. Some of you may know that I was actually a double major, Mass Communications and Theatre, in college. I have experience putting together competitions and also judging at several events. So, while I might not have PASS experience, I have had to judge others on their abstract writing and presentation skills. This process isn’t an easy one, and it is getting more difficult every year just due to the volume of submissions. To put it into perspective, there were 7 regular session Professional Development slots with 4 alternates and 47 submitters. We had more seasoned speakers than slots, for sure. I won’t know if the sessions I promoted will be selected or not. Unlike Steve, I didn’t get the impression my word was the final one. I understand why Steve isn’t happy about the selection process; I just assumed that my recommendations were just that, recommendations, and someone else would have the final say. There are other factors I have no control over. If a speaker is chosen by other tracks, there is no guarantee that the speaker will be available for the PD track.

Abstracts

Quite a number of the abstracts were well written with clear goals. A number of them, though, needed some revisions or additions. I went to the PASS website and looked at the abstract submittal guidelines and found them incomplete. If some of these abstracts had been submitted without the additional information, or not by a known speaker and leader in the community, I would have seen them as incomplete. Let’s break it down.

The Title:
Being witty is fun, but if I don’t know what you are actually talking about, the people seeing it in the program guide won’t either and may skip you over.

The Bio:
Even if you are well known, a bio just helps set that in stone. It also keeps me from doing two things: digging around the internet to see what you have done, or relying on my less-than-perfect memory about you. Without a bio you are trusting me to gather the information and set you apart from the other 40+ people wanting a slot.

The Abstract:
You don’t have to write War and Peace, but you do need to be descriptive and clear. If you can do that in the space of a tweet, great, but you have 1,000 words, so don’t be shy. Focus your topic. Don’t mix and match several things that may not even be related in the same presentation. If the first third applies to me but the other two-thirds don’t, I probably won’t go to the presentation at all. Again, funny is great, but if you don’t cut to the chase I don’t know what your topic is really about.

Session Prerequisites:
Unless this is a 100-level session, you will need to detail some prerequisites that your audience should have. A sense of humor or personality flaws aren’t prerequisites. One year of Analysis Services is.

Session Goals:
Right now we have three slots on the form. Most folks didn’t take full advantage of listing their learning points. The goals should be tangible, something you can repeat on your own after sitting through the session. Feeling better about yourself is something you get from therapy. Learning how to prioritize tasks is a goal. If there aren’t enough slots, then use your abstract space for that as well.

Level:
This seems to be one of the more difficult areas. It can be subjective. I would let your prerequisites and goals guide you. If there are a few specific prerequisites, that generally indicates something higher than novice. If it requires specific knowledge about a specific feature, that may be advanced or higher. If your goals are very specific about a feature or technique, that may also raise the session level.

Next post I’ll talk about the process and tools that we used to make these difficult decisions.