Category Archives: Programming

Finding SQL Server Installs Using Powershell

Old Dog, New Tricks

I’ve been writing tools for SQL Server for a lot of years. Some of these tools were never completely released. Some of them were just for me. Some of them overlapped other tools already on the market and free to everyone. Recently, I started updating my bag of tricks and tools. This seemed like a great time to get back into PowerShell. I decided to pull out a bit of C# code I cobbled together nine years ago as part of a tool to find SQL Server instances on a network. I never really got around to making it a “production” ready tool since there was already a most excellent one on the scene in the form of SQLPing from Chip Andrews. SQLPing is a fantastic scan tool and can scan for many more things than the method covered here.

Hello Operator?

When Microsoft implemented named instances with SQL Server 2000, they had to have a way to direct incoming traffic pointed at a single IP address to the correct TCP port for each instance. So, they built in a traffic cop. When SQL Server 2005 came around, it was pulled from the core network listener and put into its own service, the browser service. There was little documentation on how the browser worked. Luckily for poor sods like me, a network packet sniffing tool made it pretty easy to figure out how to talk to the browser. Once I figured out how to get the browser service to tell me what instances it knew about, it was trivial to implement. These days Microsoft is being much more open about these kinds of things and has actually released documentation on how the browser service and the SQL Server Resolution Protocol work.

The Basic Mechanics.

As most of you know, SQL Server’s default instance listens on TCP port 1433 and the browser service listens on UDP port 1434. Our goal is to send a UDP packet to port 1434. According to the docs we only need to send a single byte containing the number two. This prompts the listener to give us a list of instances and what port each is bound to. When I wrote my implementation it really was that simple. I dug around and figured out how to get PowerShell to send a UDP packet. I tested it and lo’ it worked, on my machine….

It’s Never So Easy.

When I tested it on my lab VM cluster with multiple nodes and multiple instances it would fail! I just didn’t get it. My C# code from the stone age worked just fine. My PowerShell code was hit or miss. I started troubleshooting the problem just as I had in the beginning. I fired up my network sniffer and watched the traffic flow back and forth. Again, I saw exactly what I expected, or more accurately what I wanted to see: a single packet with the hex number two in the first byte position. I ran the test several times over the next hour or so. Eventually, I just had to walk away from it. The next day I started over again. I read the documentation; it still said the same thing. I ran the test; still a two in the first byte position. Then I spotted it. The packet wasn’t one byte long. I went back and read the document again. It gives an upper boundary of 32 bytes but no lower limit. The packet I saw come through wasn’t one byte long, it was always more than that. Armed with that, I started big and worked my way down until I got errors. Now I know that a packet of three bytes always triggers a response: a two followed by two zeros.
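
Since the original tool was C#, here is a minimal C# sketch of the same exchange, not the production tool (the host name and timeout are placeholders): it sends the three-byte query to UDP 1434 and prints whatever instance list the browser service sends back.

using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

internal class BrowserQuery
{
    private static void Main()
    {
        //"MySqlHost" is a placeholder; point it at a machine running the SQL Server Browser service
        var udp = new UdpClient();
        udp.Client.ReceiveTimeout = 1000;          //don't hang forever on a dead host
        udp.Connect("MySqlHost", 1434);            //the browser service listens on UDP 1434
        byte[] query = { 0x02, 0x00, 0x00 };       //the two, padded with the two zeros that always get a response
        udp.Send(query, query.Length);

        var remote = new IPEndPoint(IPAddress.Any, 0);
        byte[] response = udp.Receive(ref remote); //blocks until the browser answers or the timeout hits
        //skip the 3 byte header and print the raw ";;" delimited instance list
        Console.WriteLine(Encoding.ASCII.GetString(response, 3, response.Length - 3));
        udp.Close();
    }
}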

Meet QuerySQLListener.

Here is the function I put together. It takes a server name and returns an array of strings with the fun bits in it.

function QuerySQLListener{
    [cmdletbinding(
        DefaultParameterSetName = '',
        ConfirmImpact = "low"
    )]
    Param(
        [Parameter(
            Mandatory = $True,
            Position = 0,
            ParameterSetName = '',
            ValueFromPipeline = $True)]
        [string]$Computer
    )
    Begin {
        $ErrorActionPreference = "SilentlyContinue"
        $Port = 1434
        $ConnectionTimeout = 1000
        $Responses  = @();
    }
    Process {
        $UDPClient = new-Object system.Net.Sockets.Udpclient
        $UDPClient.client.ReceiveTimeout = $ConnectionTimeout
        $IPAddress = [System.Net.Dns]::GetHostEntry($Computer).AddressList[0].IPAddressToString
        $UDPClient.Connect($IPAddress,$Port)
        $ToASCII = new-object system.text.asciiencoding
        $UDPPacket = 0x02,0x00,0x00
        Try {
            $UDPEndpoint = New-Object system.net.ipendpoint([system.net.ipaddress]::Any,0)
            $UDPClient.Client.Blocking = $True
            [void]$UDPClient.Send($UDPPacket,$UDPPacket.Length)
            $BytesReceived = $UDPClient.Receive([ref]$UDPEndpoint)
            [string]$Response = $ToASCII.GetString($BytesReceived)
            If ($Response) {
                #strip the 3 byte header and split the instance list on the ";;" delimiter
                $Response = $Response.Substring(3,$Response.Length-3).Replace(";;","~")
                $Response.Split("~") | ForEach {
                    $Responses += $_
                }
            }
            $UDPClient.Close()
        }
        Catch {
            $Error[0].ToString()
            $UDPClient.Close()
        }
    }
    End {
        return ,$Responses
    }
}

 

It Isn’t Perfect But It Works.

I’m sure there is a cleaner way to implement it, but I’m really just getting back into PowerShell after only tinkering with it several months ago. If you have any suggestions or improvements I’ll gladly take them!

Changing Directions

I See Dead Tech….

Knowing when a technology is dying is always a good skill to have. Like most of my generation we weren’t the first on the computer scene, but we lived through several of its more painful transitions. As a college student I was forced to learn antiquated technologies and languages. I had to take a semester of COBOL. I also had to take two years of assembler for the IBM 390 mainframe and another year of assembler for the x86 focused on the i386 when the Pentium was already on the market. Again and again I’ve been forced to invest time in dying technologies. Well, not any more!

Hard drives are dead LONG LIVE SOLID STATE!

I set the data on a delicate rinse cycle

I’m done with spinning disks. Since IBM invented them in nineteen and fifty seven they haven’t improved much over the years. They got smaller and faster yes but they never got sexier than the original. I mean, my mom was born in the fifties, I don’t want to be associated with something that old and way uncool. Wouldn’t you much rather have something at least invented in the modern age in your state of the art server?

Don’t you want the new hotness?

I mean seriously, isn’t this much cooler? I’m not building any new servers or desktop systems unless they are sporting flash drives. But don’t think this will last. You must stay vigilant, NAND flash won’t age like a fine wine either. There will be something new in a few years and you must be willing to spend whatever it takes to deploy the “solid state killer” when it comes out.

Tell Grandpa Relational is Soooo last century

The relational model was developed by Dr. E.F. Codd while at IBM in 1970, two years before I was born. Using some fancy math called tuple calculus he proved that the relational model was better at seeking data on these new “hard drives” that IBM had laying around. That later turned into the relational algebra that is used today. Holy cow! I hated algebra AND calculus in high school, why would I want to work with that crap now?

NoSQL Is The Future!

PhD’s, all neck ties and crazy gray hair.

Internet Scale, web 2.0 has a much better haircut.

In this new fast paced world of web 2.0 and databases that have to go all the way to Internet scale, the old crusty relational databases just can’t hang. Enter, NoSQL! I know that NoSQL covers a lot of different technologies, but one of the core things they do very well is scale up to millions of users, and I need to scale that high. They do this by side stepping things like relationships, transactions and verified writes to disk. This makes them blazingly fast! Plus, I don’t have to learn any SQL languages, I can stay with what I love best: JavaScript and JSON. Personally, I think MongoDB is the best of the bunch; they don’t have a ton of fancy PhDs, they are getting it done in the real world! Hey, they have a Success Engineer for crying out loud!!! Plus if you are using Ruby, Python, Erlang or any other real Web 2.0 language it just works out of the box. Don’t flame me about your NoSQL solution and why it is better, I just don’t care. I’m gearing up to hit all the major NoSQL conferences this year and canceling all my SQL Server related stuff. So long PASS Summit, no more hanging out with people obsessed with outdated skills.

Head in the CLOUD

Racks and Racks of Spaghetti photo by: Andrew McKaskill

Do you want to manage this?

Or this?

With all that said, I probably won’t be building too many more servers anyway. There is a new way of getting your data and servers without the hassle of buying hardware and securing it, THE CLOUD!

“Cloud computing is computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services. Parallels to this concept can be drawn with the electricity grid where end-users consume power resources without any necessary understanding of the component devices in the grid required to provide the service.” http://en.wikipedia.org/wiki/Cloud_computing

Now that’s what I’m talking about! I just plug in my code and out comes money. I don’t need to know how it all works on the back end. I’m all about convenient, on-demand network access to a shared pool of configurable computing resources. You know, kind of like when I was at college and sent my program to a sysadmin to get a time slice on the mainframe. I don’t need to know the details just run my program. Heck, I can even have a private cloud connected to other public and private clouds to make up The Intercloud(tm). Now that is sexy!

To my new ends I will be closing this blog and starting up NoSQLServerNoIOTheCloud.com to document my new journey. I’ll only be posting once a year though, on April 1st.

See you next year!

Fast File Copy With Managed Code: UBCopy update!

If you have been following my trials with working with C# and files you know it hasn’t been a bed of roses. I ran into a roadblock when I was trying to build a high performance file copy tool in the CLR. I eventually found a solution. It works but it isn’t very clean. So, I did a little more work, removed a broken bit and simplified another bit to make it a little more stable. I’m not done yet. I know there are a couple of places I can clean the code up a bit more and a way to speed things up a bit more at the cost of using more memory, but that is for another blog post.

Logging, to the rescue

The first major change was adding a logging facility to the UBCopy program via log4net. I had stayed away from adding it, fearing too much overhead, but after some tweaking it seems pretty darn fast. This allowed me to do some tight timings and log debug information to a log file to track long running copies without much fuss. It also allows me to turn off all console messaging; since this is used in an automated fashion, if something fails I’ve got the log file to look at.
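
If you want to see the shape of it, here is a minimal sketch of the pattern (the CopyWorker and LoggingSketch names are made up for illustration, and it assumes a <log4net> section in app.config for the appender setup): one static logger per class and a cached IsDebugEnabled flag so the debug string building is skipped entirely when debug logging is turned off.

using log4net;
using log4net.Config;

namespace UBCopy
{
    internal static class CopyWorker
    {
        //one static logger per class, resolved once
        private static readonly ILog Log = LogManager.GetLogger(typeof(CopyWorker));
        //cache the level check so the hot copy loop skips the debug string building entirely
        private static readonly bool IsDebugEnabled = Log.IsDebugEnabled;

        internal static void ReportWrite(long byteswritten)
        {
            if (IsDebugEnabled)
            {
                Log.Debug("Written Unbuffered : " + byteswritten);
            }
        }
    }

    internal static class LoggingSketch
    {
        private static void Main()
        {
            //load appenders and levels from the <log4net> section of app.config
            //(assumed to exist; BasicConfigurator.Configure() would dump to the console instead)
            XmlConfigurator.Configure();
            CopyWorker.ReportWrite(8 * 1024 * 1024);
        }
    }
}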

Please don’t lock my file

There are a few places where the target file being written to could potentially be locked. Right now I have to create the file and set the file length, then close it before opening it for unbuffered writes. I haven’t done anything about this gap yet. The second one is when I am writing the last block to the file. Since you have to write in page aligned multiples, I was closing the file, reopening it in a buffered mode and flushing the last partial buffer to disk. I haven’t fixed these up yet but I am working on a solution. I haven’t encountered a problem yet where the file gets locked during this process, but it is still a potential threat that must be dealt with. Even though the file stream is seekable and writeable you can’t issue a SetLength command on it. I plan on looking deeper into the FileStream code to see what they are doing when you issue a SetLength. If I can eliminate the open, close, and reopen routine it will speed things up a bit and keep anyone from grabbing my file handle.
Here’s an example of what I’m talking about; after the set file size is done there is a very small window of opportunity here.

//open output file set length to prevent growth and file fragmentation and close it.
            //We do this to prevent file fragmentation and make the write as fast as possible.
            try
            {
                _outfile = new FileStream(_outputfile, FileMode.Create, FileAccess.Write, FileShare.None, 8,
                                          FileOptions.WriteThrough);
                //set file size to minimum of one buffer to cut down on fragmentation
                _outfile.SetLength(_infilesize > CopyBufferSize ? _infilesize : CopyBufferSize);
                _outfile.Close();
                _outfile.Dispose();
            }
            catch (Exception e)
            {
                throw;
            }
            //open file for write unbuffered
            try
            {
                _outfile = new FileStream(_outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
                                          FileOptions.WriteThrough | FileFlagNoBuffering);

            }
            catch (Exception e)
            {
                throw;
            }

Unbuffered, mostly

Since unbuffered writes prevent you from writing data that isn’t memory page aligned (usually 4096 bytes), you get an error when writing to the file. To get around this, all writes are the same size, period; this really only affects the last write, which may have a partially full buffer. It will write down the full buffer, including invalid data. Luckily, an easy fix is to reset the end of file pointer back to the correct location and everything is happy again. This allows me to eliminate an additional bit of code. Here is the old code.

//open file for write buffered We do this so we can write the tail of the file
//it is a cludge but hey you get what you get in C#
outfile = new FileStream(outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
             FileOptions.WriteThrough);
//go to the right position in the file
outfile.Seek(infilesize - bytesRead1, 0);
//flush the last buffer syncronus and buffered.
outfile.Write(Buffer1, 0, bytesRead1);

And now the new, faster and better way!

//close the file handle that was using unbuffered and write through and move the EOF pointer.
Log.Debug("Close Write File Unbuffered");
_outfile.Close();
_outfile.Dispose();

try
{
    if (IsDebugEnabled)
    {
        Log.Debug("Open File Set Length");
    }
    _outfile = new FileStream(_outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
                                FileOptions.WriteThrough);
    _outfile.SetLength(_infilesize);
    _outfile.Close();
    _outfile.Dispose();
}
catch (Exception e)
{
    if (IsDebugEnabled)
    {
        Log.Debug("Failed to open for write set length");
        Log.Debug(e);
    }
    throw;
}

Buffered or Unbuffered

If the file was smaller than the size of the buffer I was just copying the file buffered. This was a check and a route to a different code path for copying files. Not a huge deal, but just another place for problems to creep up over time. So, I removed it; when I changed the way unbuffered multi-threaded writes were happening I was able to get rid of the buffered copy routine altogether. Now, it will initially set the file size to a minimum of a single buffer, write the data, and then reset the file length to the original size.

You want a MOVE but get a COPY

There was also a bug where, on small files, UBCopy would do a copy instead of a move in some cases. There was an error in the code that prevented the delete of the source file after the copy to the new location, due to a lock that was being held by the program. Not the end of the world, it just wasn’t working like it should 100% of the time.

So, here is the new AsyncUnbuffCopy routine. Calling it is super easy. If you want to use the code as-is you will need to download log4net and include it in your project.

//
// AsyncUnbuffCopy.cs
//
// Authors:
//  Wesley D. Brown <wes@planetarydb.com>
//
// Copyright (C) 2010 SQLServerIO (http://www.SQLServerIO.com)
//
// Permission is hereby granted, free of charge, to any person obtaining
// a copy of this software and associated documentation files (the
// "Software"), to deal in the Software without restriction, including
// without limitation the rights to use, copy, modify, merge, publish,
// distribute, sublicense, and/or sell copies of the Software, and to
// permit persons to whom the Software is furnished to do so, subject to
// the following conditions:
//
// The above copyright notice and this permission notice shall be
// included in all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
// EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
// MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
// NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
// LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
// OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
// WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
//

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using log4net;

namespace UBCopy
{
    internal class AsyncUnbuffCopy
    {
        private static readonly ILog Log = LogManager.GetLogger(typeof(AsyncUnbuffCopy));
        private static readonly bool IsDebugEnabled = Log.IsDebugEnabled;

        //file names
        private static string _inputfile;
        private static string _outputfile;

        //checksum holders
        private static string _infilechecksum;
        private static string _outfilechecksum;

        //show write progress
        private static bool _reportprogress;

        //cursor position
        private static int _origRow;
        private static int _origCol;

        //number of chunks to copy
        private static int _numchunks;

        //track read state and read failed state
        private static bool _readfailed;

        //syncronization object
        private static readonly object Locker1 = new object();

        //buffer size
        public static int CopyBufferSize;
        private static long _infilesize;

        //buffer read
        public static byte[] Buffer1;
        private static int _bytesRead1;

        //buffer overlap
        public static byte[] Buffer2;
        private static bool _buffer2Dirty;
        private static int _bytesRead2;

        //buffer write
        public static byte[] Buffer3;

        //total bytes read
        private static long _totalbytesread;
        private static long _totalbyteswritten;

        //filestreams
        private static FileStream _infile;
        private static FileStream _outfile;

        //secret sauce for unbuffered IO
        const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

        private static void AsyncReadFile()
        {
            //open input file
            try
            {
                _infile = new FileStream(_inputfile, FileMode.Open, FileAccess.Read, FileShare.None,
CopyBufferSize, FileFlagNoBuffering);
            }
            catch (Exception e)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Failed to open for read");
                    Log.Debug(e);
                }
                throw;
            }
            //if we have data read it
            while (_totalbytesread < _infilesize)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Read _buffer2Dirty    : " + _buffer2Dirty);
                }
                _bytesRead1 = _infile.Read(Buffer1, 0, CopyBufferSize);
                Monitor.Enter(Locker1);
                try
                {
                    while (_buffer2Dirty) Monitor.Wait(Locker1);
                    Buffer.BlockCopy(Buffer1, 0, Buffer2, 0, _bytesRead1);
                    _buffer2Dirty = true;
                    _bytesRead2 = _bytesRead1;
                    _totalbytesread = _totalbytesread + _bytesRead1;
                    Monitor.PulseAll(Locker1);
                    if (IsDebugEnabled)
                    {

                        Log.Debug("Read       : " + _totalbytesread);
                    }
                }
                catch (Exception e)
                {
                    Log.Fatal("Read Failed.");
                    Log.Fatal(e);
                    _readfailed = true;
                    throw;
                }
                finally { Monitor.Exit(Locker1); }
            }
            //clean up open handle
            _infile.Close();
            _infile.Dispose();
        }

        private static void AsyncWriteFile()
        {
            //open output file set length to prevent growth and file fragmentation and close it.
            //We do this to prevent file fragmentation and make the write as fast as possible.
            try
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Open File Set Length");
                }
                _outfile = new FileStream(_outputfile, FileMode.Create, FileAccess.Write, FileShare.None, 8,
                                          FileOptions.WriteThrough);

                //set file size to minimum of one buffer to cut down on fragmentation
                _outfile.SetLength(_infilesize > CopyBufferSize ? _infilesize : CopyBufferSize);

                _outfile.Close();
                _outfile.Dispose();
            }
            catch (Exception e)
            {
                Log.Fatal("Failed to open for write set length");
                Log.Fatal(e);
                throw;
            }

            //open file for write unbuffered
            try
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Open File Write Unbuffered");
                }
                _outfile = new FileStream(_outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
                                          FileOptions.WriteThrough | FileFlagNoBuffering);

            }
            catch (Exception e)
            {
                Log.Fatal("Failed to open for write unbuffered");
                Log.Fatal(e);
                throw;
            }

            var pctinc = 0.0;
            var progress = pctinc;

            //progress stuff
            if (_reportprogress)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Report Progress : True");
                }
                pctinc = 100.00 / _numchunks;
            }
            if (IsDebugEnabled)
            {
                Log.Debug("While Write _totalbyteswritten          : " + _totalbyteswritten);
                Log.Debug("While Write _infilesize - CopyBufferSize: " + (_infilesize - CopyBufferSize));
            }
            while ((_totalbyteswritten < _infilesize) && !_readfailed)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Write Unbuffered _buffer2Dirty    : " + _buffer2Dirty);
                }
                lock (Locker1)
                {
                    if (IsDebugEnabled)
                    {
                        Log.Debug("Write Unbuffered Lock");
                    }
                    while (!_buffer2Dirty) Monitor.Wait(Locker1);
                    if (IsDebugEnabled)
                    {
                        Log.Debug("Write Unbuffered _buffer2Dirty    : " + _buffer2Dirty);
                    }
                    Buffer.BlockCopy(Buffer2, 0, Buffer3, 0, _bytesRead2);
                    _buffer2Dirty = false;
                    if (IsDebugEnabled)
                    {
                        Log.Debug("Write Unbuffered _buffer2Dirty    : " + _buffer2Dirty);
                    }
                    _totalbyteswritten = _totalbyteswritten + CopyBufferSize;
                    if (IsDebugEnabled)
                    {
                        Log.Debug("Written Unbuffered : " + _totalbyteswritten);
                    }
                    Monitor.PulseAll(Locker1);
                    //fancy dan in place percent update on each write.

                    if (_reportprogress && !IsDebugEnabled)
                    {
                        Console.SetCursorPosition(_origCol, _origRow);
                        if (progress < 101 - pctinc)
                        {
                            progress = progress + pctinc;
                            Console.Write("%{0}", Math.Round(progress, 0));
                        }
                    }
                }
                try
                {
                    _outfile.Write(Buffer3, 0, CopyBufferSize);
                }
                catch (Exception e)
                {
                    Log.Fatal("Write Unbuffered Failed");
                    Log.Fatal(e);
                    throw;
                }
            }

            //close the file handle that was using unbuffered and write through
            Log.Debug("Close Write File Unbuffered");
            _outfile.Close();
            _outfile.Dispose();

            try
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Open File Set Length");
                }
                _outfile = new FileStream(_outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
                                          FileOptions.WriteThrough);
                _outfile.SetLength(_infilesize);
                _outfile.Close();
                _outfile.Dispose();
            }
            catch (Exception e)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Failed to open for write set length");
                    Log.Debug(e);
                }
                throw;
            }
        }

        public static int AsyncCopyFileUnbuffered(string inputfile, string outputfile, bool overwrite,
bool movefile, bool checksum, int buffersize, bool reportprogress)
        {
            if (IsDebugEnabled)
            {
                Log.Debug("inputfile      : " + inputfile);
                Log.Debug("outputfile     : " + outputfile);
                Log.Debug("overwrite      : " + overwrite);
                Log.Debug("movefile       : " + movefile);
                Log.Debug("checksum       : " + checksum);
                Log.Debug("buffersize     : " + buffersize);
                Log.Debug("reportprogress : " + reportprogress);
            }
            //report write progress
            _reportprogress = reportprogress;

            //set file name globals
            _inputfile = inputfile;
            _outputfile = outputfile;

            //setup single buffer size, remember this will be x3.
            CopyBufferSize = buffersize * 1024 * 1024;

            //buffer read
            Buffer1 = new byte[CopyBufferSize];

            //buffer overlap
            Buffer2 = new byte[CopyBufferSize];

            //buffer write
            Buffer3 = new byte[CopyBufferSize];

            //clear all flags and handles
            _totalbytesread = 0;
            _totalbyteswritten = 0;
            _bytesRead1 = 0;
            _buffer2Dirty = false;

            //if the overwrite flag is set to false check to see if the file is there.
            if (File.Exists(outputfile) && !overwrite)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Destination File Exists!");
                }
                Console.WriteLine("Destination File Exists!");
                return 0;
            }

            //create the directory if it doesn't exist
            if (!Directory.Exists(outputfile))
            {
                try
                {
                    // ReSharper disable AssignNullToNotNullAttribute
                    Directory.CreateDirectory(Path.GetDirectoryName(outputfile));
                    // ReSharper restore AssignNullToNotNullAttribute
                }
                catch (Exception e)
                {
                    Log.Fatal("Create Directory Failed.");
                    Log.Fatal(e);
                    Console.WriteLine("Create Directory Failed.");
                    Console.WriteLine(e.Message);
                    throw;
                }
            }

            //get input file size for later use
            var inputFileInfo = new FileInfo(_inputfile);
            _infilesize = inputFileInfo.Length;

            //get number of buffer sized chunks used to correctly display percent complete.
            _numchunks = (int)(_infilesize / CopyBufferSize) > 0 ? (int)(_infilesize / CopyBufferSize) : 1;

            if (IsDebugEnabled)
            {
                Log.Debug("File Copy Started");
            }
            Console.WriteLine("File Copy Started");

            //create read thread and start it.
            var readfile = new Thread(AsyncReadFile) { Name = "ReadThread", IsBackground = true };
            readfile.Start();

            if (IsDebugEnabled)
            {
                //debug show if we are an even multiple of the file size
                Log.Debug("Number of Chunks: " + _numchunks);
            }

            //create write thread and start it.
            var writefile = new Thread(AsyncWriteFile) { Name = "WriteThread", IsBackground = true };
            writefile.Start();

            if (_reportprogress)
            {
                //set fancy cursor position
                _origRow = Console.CursorTop;
                _origCol = Console.CursorLeft;
            }

            //wait for threads to finish
            readfile.Join();
            writefile.Join();

            //leave a blank line for the progress indicator
            if (_reportprogress)
                Console.WriteLine();

            if (IsDebugEnabled)
            {
                Log.Debug("File Copy Done");
            }

            Console.WriteLine("File Copy Done");

            if (checksum)
            {
                if (IsDebugEnabled)
                {
                    Log.Debug("Checksum Source File Started");
                }
                Console.WriteLine("Checksum Source File Started");
                //create checksum read file thread and start it.
                var checksumreadfile = new Thread(GetMD5HashFromInputFile) {
Name = "checksumreadfile", IsBackground = true };
                checksumreadfile.Start();

                if (IsDebugEnabled)
                {
                    Log.Debug("Checksum Destination File Started");
                }
                Console.WriteLine("Checksum Destination File Started");
                //create checksum write file thread and start it.
                var checksumwritefile = new Thread(GetMD5HashFromOutputFile) {
Name = "checksumwritefile", IsBackground = true };
                checksumwritefile.Start();

                //hang out until the checksums are done.
                checksumreadfile.Join();
                checksumwritefile.Join();

                if (_infilechecksum.Equals(_outfilechecksum))
                {
                    if (IsDebugEnabled)
                    {
                        Log.Debug("Checksum Verified");
                    }
                    Console.WriteLine("Checksum Verified");
                }
                else
                {
                    if (IsDebugEnabled)
                    {
                        Log.Debug("Checksum Failed");
                        Log.DebugFormat("Input File Checksum : {0}", _infilechecksum);
                        Log.DebugFormat("Output File Checksum: {0}", _outfilechecksum);
                    }
                    Console.WriteLine("Checksum Failed");
                    Console.WriteLine("Input File Checksum : {0}", _infilechecksum);
                    Console.WriteLine("Output File Checksum: {0}", _outfilechecksum);
                }
            }

            if (movefile && File.Exists(inputfile) && File.Exists(outputfile))
                try
                {
                    File.Delete(inputfile);
                }
                catch (IOException ioex)
                {
                    if (IsDebugEnabled)
                    {
                        Log.Error("File in use or locked cannot move file.");
                        Log.Error(ioex);
                    }
                    Console.WriteLine("File in use or locked");
                    Console.WriteLine(ioex.Message);
                }
                catch (Exception ex)
                {
                    if (IsDebugEnabled)
                    {
                        Log.Error("File Failed to Delete");
                        Log.Error(ex);
                    }
                    Console.WriteLine("File Failed to Delete");
                    Console.WriteLine(ex.Message);
                }
            return 1;
        }

        //hash input file
        public static void GetMD5HashFromInputFile()
        {
            var fs = new FileStream(_inputfile, FileMode.Open, FileAccess.Read, FileShare.None,
CopyBufferSize);
            MD5 md5 = new MD5CryptoServiceProvider();
            byte[] retVal = md5.ComputeHash(fs);
            fs.Close();

            var sb = new StringBuilder();
            for (var i = 0; i < retVal.Length; i++)
            {
                sb.Append(retVal[i].ToString("x2"));
            }
            _infilechecksum = sb.ToString();
        }

        //hash output file
        public static void GetMD5HashFromOutputFile()
        {
            var fs = new FileStream(_outputfile, FileMode.Open, FileAccess.Read, FileShare.None,
CopyBufferSize);
            MD5 md5 = new MD5CryptoServiceProvider();
            byte[] retVal = md5.ComputeHash(fs);
            fs.Close();

            var sb = new StringBuilder();
            for (var i = 0; i < retVal.Length; i++)
            {
                sb.Append(retVal[i].ToString("x2"));
            }
            _outfilechecksum = sb.ToString();
        }
    }
}
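
As mentioned above, calling it is straightforward. Here is a hypothetical call site (the paths are placeholders, and the Program class is not part of UBCopy itself); the arguments follow the AsyncCopyFileUnbuffered signature above:

namespace UBCopy
{
    internal class Program
    {
        private static void Main()
        {
            //paths are placeholders; buffersize is in MB and is multiplied by 1024 * 1024 inside the routine
            AsyncUnbuffCopy.AsyncCopyFileUnbuffered(
                @"D:\Backups\BigDatabase.bak",                //inputfile
                @"\\FileServer\SQLBackups\BigDatabase.bak",   //outputfile
                true,                                         //overwrite the destination if it already exists
                false,                                        //movefile: false = copy only, keep the source
                true,                                         //checksum source and destination with MD5 when done
                8,                                            //buffersize in MB
                true);                                        //reportprogress: show the in-place percent indicator
        }
    }
}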

SQLMeetings.com Is Live!

You heard right, I’ve finished with the 1.0 release. It isn’t pretty but it works. If you need a primer check out my last post. If you need a list just fill out the form and I’ll get you taken care of.

I’d like to thank Scott Stauffer (blog|Twitter) for helping me out with some tools.

I’d like to thank Sean McCown for providing me with an MSDN subscription so I could finish this project! If you aren’t following the MidnightDBAs (blog|Twitter) you should!

If you would like to test things out then please sign up to testlist@sqlmeetings.com

Let’s take a look at the features:

  • Create custom list <mylist>@sqlmeetings.com.
  • Send to list via email or web site.
  • Only list owners can send emails to list.
  • View number of replies and who has replied.
  • Manage subscriptions and list owners.
  • See who has unsubscribed and how.
  • Automatic bounce detection and auto unsubscribe from list.
  • Automatic “out of office” detection and ignore.
  • Export subscriber list.
  • Subscribe or unsubscribe via email or web.

Let’s take a quick tour.

On the front page you can see a list of recent emails that have gone out.

You can update your account information.

You can manage your subscriptions.

You can email your list.

Check to see if anyone has replied to a mailing.

Manage your subscribers.

The goal is to keep it as simple as possible. I hope I have done that. There are some features I would still like to have just for managing my own lists. If you have any thoughts please leave me a comment; who knows, if you use SQL Meetings I’ll probably add anything you want that makes it useful to you.

Version 1.1~1.4
Cleanup and bug fixes. I’m a believer that great is the enemy of good. This is good, so I’m launching it. That doesn’t mean it won’t have bugs, and there are a few outstanding code cleanup tasks I have already scheduled. Remember, ASP.NET isn’t my day job at all; this was and is a learning experience for me.

Version 1.5 features (possible):
Posterous friendly emails.
Tweet with a link to the message from @SQLMeetings.
Re-email a sent message: you send out a mail and want to send that same mail again.

Version 2.0 features (possible):
LinkedIn integration.
Facebook integration.
Schedule reminders using a generic template; you may not have your details hammered out yet but still want to send out a reminder, say, one week before the meeting.

Giving Back, SQLMeetings.com Is Going Live Soon

I’ve been pretty quiet since the PASS Summit, and with good reason. Every year we have a chapter leader meeting. Every year, there is a laundry list of things that chapters would like PASS HQ to do for them. Time and again I’ve watched other people in the community step up and build something to fill a void in the PASS structure. In the early days SQLServerCentral.com hosted websites for chapters until PASS HQ got the infrastructure in place. We lean on other tools like Google Groups or meetup.com to get other things done as well, but those aren’t controlled by PASS or PASS HQ. It always strikes me as odd that infrastructure related items are always on the list and are always backlogged. We are a professional organization of technology people and deal with stuff like this every single day.

So, I have decided to help out. I’m starting up SQLMeetings.com. The goals are pretty simple to start with.

Provide an easy way to email your user group.
Provide an easy way for your users to RSVP.
Provide an easy way for group leaders to manage lists.
Provide an easy way for group leaders to track RSVP’s.

Provide an easy way to email your user group.
Sounds pretty straightforward: just fire up Outlook and BCC your group in the email field. For a long time that is basically what I did. It was a pain to manage email changes, RSVPs, and bounced email. That led me to set up an email list server and move the list there as a read-only list. That was better; it provided bounce management, but it was still hard to get people onto the list. Luckily, it used a SQL Server back end and I wrote an integration point with our then DotNetNuke website. If you signed up via DNN it automatically added you to the email list. If your email ever bounced you were deactivated from the list. You couldn’t change your email though, or RSVP easily either. Now that we have moved off DNN I’ve lost the signup integration point and have fallen back to telling people to subscribe using cactuss_meetings@wesworld.net again.

Provide an easy way for users to RSVP.
I’ve got nothing on this one. I’ve tried using meetup.com, but it pushes your users to another website away from your own, which is just another barrier to an easy RSVP. Basically, I get emails from people saying they will be there, I pad the numbers and add some fudge in, and I order the food.

Provide an easy way for group leaders to manage lists.
If you have ever used a traditional list server, you know everything is done via email with commands embedded in the body of the mail. It isn’t the slickest of user experiences. I was just editing the list server tables by hand, being a SQL Server expert and all.

Provide an easy way for group leaders to track RSVP’s.
You like to see how your RSVPs stack up against actual attendance and watch the growth of your group over time. Again, I did this with a high tech piece of kit: pen and paper.

This is my goal for the “1.0” feature set.
User groups can have a personalized email address.
Lists are <email>@sqlmeetings.com. For example, my local UG would be cactuss@sqlmeetings.com. This account is used to receive, process, and send all emails.
List owners are the only people allowed to email the list for distribution. You can have multiple list owners so one person isn’t stuck sending out the email.
List management is handled two ways. If you want to subscribe via email you send an email with subscribe in the subject to cactuss@sqlmeetings.com and it handles the rest. If you are a list owner you can manage the list via the web. Things like adding users, marking users as list owners, and deactivating users are done via the web.
To RSVP the only thing you have to do is hit the reply button. The list server tracks which users have replied to which email. As a list owner you can look at the response rate via the web site.

The list server part is done. I wrote a Windows service that handles processing the emails. The web UI will be done by the end of the week (keep your fingers crossed). I am horrible at web stuff and have asked a couple of other folks to help out. This first iteration is beta stuff to flesh out functionality.

This is where you come in. Do you need something like this? If you want to use it just drop me an email at admin (at) wesworld (dot) net or hit me up on Twitter @WesBrownSQL. I need some folks to test out the base functions and start suggesting things for 2.0, like Twitter integration and posting to Posterous.

Oh, did I mention this is free? It is something I needed for my UG and figured others would like it too.

At The End of the IO Road With C#? Pave New Road!

Not being one for letting a problem get the best of me, I took another look at the asynchronous overlapped IO problem. If you read my last post on the subject, you know I’ve done a lot of work on this already. None of the things I said last time have changed at all. If you want to do asynchronous and un-buffered IO in C# using the native file stream calls you can’t… So, I rolled my own. The kicker is, I don’t use any unmanaged code to do this. No call to VirtualAlloc() or anything else using DLL imports. Oh, and the speed is spectacular.

The Goal

My ultimate goal was to build a routine that would do un-buffered asynchronous IO. That means I don’t want the OS doing any buffering or funny stuff with the IOs I issue. That goes for reads and writes. SQL Server uses this method to harden writes to the disk, and it also performs well with excellent predictability. If you have ever used Windows to do a regular copy you will see it eating up memory to buffer both reads and writes. If you copy the same file a couple of times you will notice that the first time it runs at about the speed you expect, but the second time it may run twice as fast. This is all Windows, buffering as much data as it can and holding on to that buffer. That’s great for smaller files, but if you are pushing around multi-gigabyte files it is a disaster. As the system becomes starved for memory it pages, then starts throttling back. Your 100MB/sec copy is now crawling along at 20MB/sec.

Where we left off..

I had settled on a simple routine that would allow me to do un-buffered reads from a file and write to a buffered file either on disk or across the network.

internal class UnBufferedFileCopy
{
	public static int CopyBufferSize = 8 * 1024 * 1024;
	public static byte[] Buffer = new byte[CopyBufferSize];
	const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

	public static int CopyFileUnbuffered(string inputfile, string outputfile)
	 {
		var infile = new FileStream(inputfile, FileMode.Open, FileAccess.Read
		, FileShare.None, 8, FileFlagNoBuffering | FileOptions.SequentialScan);
		var outfile = new FileStream(outputfile, FileMode.Create, FileAccess.Write
		, FileShare.None, 8, FileOptions.WriteThrough);

		int bytesRead;
		while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
		{
			outfile.Write(Buffer, 0, bytesRead);
		}

		outfile.Close();
		outfile.Dispose();
		infile.Close();
		infile.Dispose();
		return 1;
	}
}

There are two problems with this routine. First off, only the read from the source is truly un-buffered. C# offers the write through flag and I thought that would be enough. I fired up Process Monitor and watched the IO issued on writes, and it wasn’t buffer sized requests; it was always broken up into 64k chunks. So, the read request would fetch say 16MB of data and pass that to the write request, which would then break that up into chunks. This wasn’t the behavior I was going for! Doing some additional research I found adding the no buffering flag to the write through flag gave me the results I was after, almost. You can’t fully do un-buffered writes. Synchronous or asynchronous doesn’t matter. To do an un-buffered write, the buffer you build from the byte array must be page aligned in memory and every write must be a multiple of the page size. Again, this just isn’t possible in managed code. So, I investigated a horrible kludge of a solution. I do un-buffered writes until I get to the last block of data. Then I close and reopen the file in a buffered mode and write the last block. It isn’t pretty but it works. It also means that I can’t use write through and un-buffered on a file smaller than the buffer size. Not a huge deal, but something to be aware of if you are doing a lot of small files. If you are going the small file route the first routine will probably be OK.

internal class UnBufferedFileCopy
{
	public static int CopyBufferSize = 8 * 1024 * 1024;
	public static byte[] Buffer1 = new byte[CopyBufferSize];
	const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

	public static int CopyFileUnbuffered(string inputfile, string outputfile)
	{
		//grab the source file size so we can pre-size the target and know when to stop
		long infilesize = new FileInfo(inputfile).Length;

		var infile = new FileStream(inputfile, FileMode.Open, FileAccess.Read
			, FileShare.None, 8, FileFlagNoBuffering | FileOptions.SequentialScan);
		//open output file set length to prevent growth and file fragmentation and close it.
		//We have to do it this way so we can do unbuffered writes to it later
		var outfile = new FileStream(outputfile, FileMode.Create, FileAccess.Write
			, FileShare.None, 8, FileOptions.WriteThrough);
		outfile.SetLength(infilesize);
		outfile.Dispose();

		//open file for write unbuffered
		outfile = new FileStream(outputfile, FileMode.Open, FileAccess.Write
			, FileShare.None, 8, FileOptions.WriteThrough | FileFlagNoBuffering);
		long totalbyteswritten = 0;
		int bytesRead1 = 0;
		//hold back one buffer
		while (totalbyteswritten < infilesize - CopyBufferSize)
		{
			bytesRead1 = infile.Read(Buffer1, 0, CopyBufferSize);
			totalbyteswritten = totalbyteswritten + CopyBufferSize;
			outfile.Write(Buffer1, 0, bytesRead1);
		}

		//pull in the tail of the file that the loop above held back
		bytesRead1 = infile.Read(Buffer1, 0, CopyBufferSize);

		//close the file handle that was using unbuffered and write through
		outfile.Dispose();

		//open file for write buffered We do this so we can write the tail of the file
		//it is a cludge but hey you get what you get in C#
		outfile = new FileStream(outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
		FileOptions.WriteThrough);

		//go to the right position in the file
		outfile.Seek(infilesize - bytesRead1, 0);
		//flush the last buffer syncronus and buffered.
		outfile.Write(Buffer1, 0, bytesRead1);

		outfile.Dispose();
		infile.Dispose();
		return 1;
	}
}

This is as close as you can get to fully un-buffered IO on both the read and write side of things. There is a lot going on, but it is still synchronous all the way. If you look at Performance Monitor it will show you a sawtooth pattern as you read then write, since you are only ever doing one or the other. Using this to copy a file across the LAN to another server never got better than 75MB/sec throughput. Not horrible, but a long way from the 105MB/sec I get from something like FastCopy or TeraCopy. Heck, it’s not even close to the theoretical 125MB/sec a gigabit connection could support. That leaves the last piece of the puzzle, going asynchronous.

Threading in C#, To Produce or Consume?

We know that using the asynchronous file IO built into C# isn’t an option. That doesn’t mean we can’t pattern something of our own after it. I’ve done quite a bit of threading in C#. It isn’t as difficult as C/C++, but you can still blow your foot off. It adds a whole other level of complexity to your code. This is where a little thought and design on paper and using a flow chart can help you out quite a bit. Also, it’s good to research design patterns and multi-threading. A lot of smart people have tackled these problems and have developed well designed solutions. Our particular problem is a classic producer consumer pattern, a simple one at that. We have a producer, the read thread, putting data in a buffer. We have a consumer, the write thread, that takes that data and writes it to disk. My first priority is to model this as simply as possible. I’m not worried about multiple readers or writers. I am concerned with locking and blocking. Keeping the time something has to be locked to a minimum is going to be key. That led me to a simple solution: one read thread and the buffer it reads into, one write thread and the buffer it reads from, and one intermediate buffer to pass data between them. Basically, an overlap buffer that is the same size as the read and write buffer. To give you a better visual example before showing you the code, here are a couple of flow charts.

Read File
http://www.lucidchart.com/documents/view/4cac057f-d81c-472e-9764-52c00afcbe04

Write File
http://www.lucidchart.com/documents/view/4cac0726-dd14-46a6-8d44-53710afcbe04

There are a few things you need to be aware of. There is no guarantee of order on thread execution. That is why I’m using a lock object and a semaphore flag to let me know if the buffer is actually available to be written or read from. Keep the lock scope small. The lock can be a bottleneck and basically drop you back into a synchronous mode. Watch for deadlocks. With the lock and the semaphore flag in play, if your ordering is wrong you can get into a deadlock between the two threads where they just sit and spin waiting for either the lock or the flag to clear. At this point I’m confident I don’t have any race or deadlocking situations.

Here is a simplified sample, I’m serious this is as small a sample as I could code up.

internal class AsyncUnbuffCopy
{
	//file names
	private static string _inputfile;
	private static string _outputfile;
	//syncronization object
	private static readonly object Locker1 = new object();
	//buffer size
	public static int CopyBufferSize;
	private static long _infilesize;
	//buffer read
	public static byte[] Buffer1;
	private static int _bytesRead1;
	//buffer overlap
	public static byte[] Buffer2;
	private static bool _buffer2Dirty;
	private static int _bytesRead2;
	//buffer write
	public static byte[] Buffer3;
	//total bytes read
	private static long _totalbytesread;
	private static long _totalbyteswritten;
	//filestreams
	private static FileStream _infile;
	private static FileStream _outfile;
	//secret sauce for unbuffered IO
	const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

	private static void AsyncReadFile()
	{
		//open input file
		_infile = new FileStream(_inputfile, FileMode.Open, FileAccess.Read, FileShare.None, CopyBufferSize,
		FileFlagNoBuffering);
		//if we have data read it
		while (_totalbytesread < _infilesize)
		{
			_bytesRead1 = _infile.Read(Buffer1, 0, CopyBufferSize);
			lock (Locker1)
			{
				while (_buffer2Dirty)Monitor.Wait(Locker1);
				Buffer.BlockCopy(Buffer1, 0, Buffer2, 0, _bytesRead1);
				_buffer2Dirty = true;
				Monitor.PulseAll(Locker1);
				_bytesRead2 = _bytesRead1;
				_totalbytesread = _totalbytesread + _bytesRead1;
			}
		}
		//clean up open handle
		_infile.Close();
		_infile.Dispose();
	}

	private static void AsyncWriteFile()
	{
		//open output file set length to prevent growth and file fragmentation and close it.
		//We have to do it this way so we can do unbuffered writes to it later
		_outfile = new FileStream(_outputfile, FileMode.Create, FileAccess.Write, FileShare.None, 8,
		FileOptions.WriteThrough);
		_outfile.SetLength(_infilesize);
		_outfile.Close();
		_outfile.Dispose();
		//open file for write unbuffered
		_outfile = new FileStream(_outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
		FileOptions.WriteThrough | FileFlagNoBuffering);
		while (_totalbyteswritten < _infilesize - CopyBufferSize)
		{
			lock (Locker1)
			{
				while (!_buffer2Dirty) Monitor.Wait(Locker1);

				Buffer.BlockCopy(Buffer2, 0, Buffer3, 0, _bytesRead2);
				_buffer2Dirty = false;
				Monitor.PulseAll(Locker1);
				_totalbyteswritten = _totalbyteswritten + CopyBufferSize;
			}
			_outfile.Write(Buffer3, 0, CopyBufferSize);
		}
		//close the file handle that was using unbuffered and write through
		_outfile.Close();
		_outfile.Dispose();
		lock (Locker1)
		{
			while (!_buffer2Dirty) Monitor.Wait(Locker1);
			//open file for write buffered We do this so we can write the tail of the file
			//it is a cludge but hey you get what you get in C#
			_outfile = new FileStream(_outputfile, FileMode.Open, FileAccess.Write, FileShare.None, 8,
			FileOptions.WriteThrough);
			//this should always be true but I haven't run all the edge cases yet
			if (_buffer2Dirty)
			{
				//go to the right position in the file
				_outfile.Seek(_infilesize - _bytesRead2, 0);
				//flush the last buffer syncronus and buffered.
				_outfile.Write(Buffer2, 0, _bytesRead2);
			}
		}
		//close the file handle that was using unbuffered and write through
		_outfile.Close();
		_outfile.Dispose();
	}
	
	public static int AsyncCopyFileUnbuffered(string inputfile, string outputfile, int buffersize)
	{
		//set file name globals
		_inputfile = inputfile;
		_outputfile = outputfile;
		//setup single buffer size, remember this will be x3.
		CopyBufferSize = buffersize * 1024 * 1024;
		//buffer read
		Buffer1 = new byte[CopyBufferSize];
		//buffer overlap
		Buffer2 = new byte[CopyBufferSize];
		//buffer write
		Buffer3 = new byte[CopyBufferSize];
		//get input file size for later use
		var f = new FileInfo(_inputfile);
		long s1 = f.Length;
		_infilesize = s1;

		//create read thread and start it.
		var readfile = new Thread(AsyncReadFile) { Name = "ReadThread", IsBackground = true };
		readfile.Start();

		//create write thread and start it.
		var writefile = new Thread(AsyncWriteFile) { Name = "WriteThread", IsBackground = true };
		writefile.Start();

		//wait for threads to finish
		readfile.Join();
		writefile.Join();
		Console.WriteLine();
		return 1;
	}
}
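
For completeness, here is a hypothetical call to the simplified sample above (placeholder paths; the last argument is the size of each of the three buffers in megabytes):

internal class Program
{
	private static void Main()
	{
		//copy with three 8MB buffers (read, overlap, write) as described above
		AsyncUnbuffCopy.AsyncCopyFileUnbuffered(@"D:\Backups\BigDatabase.bak", @"E:\Staging\BigDatabase.bak", 8);
	}
}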

As you can see, we have gotten progressively more complex with each pass until we have finally arrived at my goal. With zero unmanaged code and only one undocumented flag, I’ve built a C# program that actually does fast IO like the low level big boys. To handle the small file issue I just drop back to my old copy routine to move those files along. You can see a working sample at http://github.com/SQLServerIO/UBCopy. It also has MD5 verification built in as well.

So, how well does it work?

FastCopy 1.99r4
TotalRead = 1493.6 MB
TotalWrite = 1493.6 MB
TotalFiles = 1 (0)
TotalTime= 15.25 sec
TransRate= 97.94 MB/s
FileRate = 0.07 files/s

UBCopy 1.5.2.1851 — Managed Code
File Copy Started
%100
File Copy Done
File Size MB : 1493.62
Elapsed Seconds : 15.26
Megabytes/sec : 102.63
Done.

I think it will do just fine.

At The End of the IO Road With C#

Previously I’ve written about doing fun IO stuff in C#. I found out that some of my old tricks still worked in C# but….

Having done a lot of C++, I knew about async IO, buffered and un-buffered, and could have made unmanaged code calls to open or create the file and pass the handle back; but just like it sounds, it is kind of a pain to set up, and if you are going down that path you might as well code it all up in C++ anyway.

I was mostly right. I have been working on a file sync tool for managing all my SQL Server backup files. Naturally, I wanted it to be as fast as humanly possible. Wanting that speed and getting it from the CLR are two completely different things. I know how to do asynchronous IO, and with a little trick, you can do un-buffered IO as well. The really crappy part is you can’t do both in the CLR.

From my previous post, you know that SQL Server does asynchronous, un-buffered IO on reads and writes. The CLR allows you to do asynchronous reads with a fun bit of coding and a callback structure. I took this code from one of the best papers on C# and IO: Sequential File Programming Patterns and Performance with .NET. I made some minor changes and cleaned up the code a bit.

internal class AsyncFileCopy
    {
        // globals
        private const int Buffers = 8; // number of outstanding requests
        private const int BufferSize = 8*1024*1024; // request size, eight megabytes
        public static FileStream Source; // source file stream
        public static FileStream Target; // target file stream
        public static long TotalBytes; // total bytes to process    
        public static long BytesRead; // bytes read so far    
        public static long BytesWritten; // bytes written so far
        public static long Pending; // number of I/O's in flight
        public static Object WriteCountMutex = new Object[0]; // mutex to protect count
        // Array of buffers and async results.  
        public static AsyncRequestState[] Request = new AsyncRequestState[Buffers];

        public static void AsyncBufferedFileCopy(string inputfile, string outputfile)
        {
            Source = new FileStream(inputfile, // open source file
                                    FileMode.Open, // for read
                                    FileAccess.Read, //
                                    FileShare.Read, // allow other readers
                                    BufferSize, // buffer size
                                    FileOptions.Asynchronous); // use async
            Target = new FileStream(outputfile, // create target file
                                    FileMode.Create, // overwrite if it exists
                                    FileAccess.Write, // will write the file
                                    FileShare.None, // exclusive access
                                    BufferSize, // buffer size
                                    FileOptions.Asynchronous); // buffered async
            TotalBytes = Source.Length; // Size of source file
            Target.SetLength(TotalBytes); //Set target file length to avoid file growth
            var writeCompleteCallback = new AsyncCallback(WriteCompleteCallback);
            for (int i = 0; i < Buffers; i++) Request[i] = new AsyncRequestState(i);
            // launch initial async reads
            for (int i = 0; i < Buffers; i++)
            {
                // no callback on reads.                     
                Request[i].ReadAsyncResult = Source.BeginRead(Request[i].Buffer, 0, BufferSize, null, i);
                Request[i].ReadLaunched.Set(); // say that read is launched
            }
            // wait for the reads to complete in order, process buffer and then write it. 
            for (int i = 0; (BytesRead < TotalBytes); i = (i + 1)%Buffers)
            {
                Request[i].ReadLaunched.WaitOne(); // wait for flag that says buffer is reading
                int bytes = Source.EndRead(Request[i].ReadAsyncResult); // wait for read complete
                BytesRead += bytes; // process the buffer <your code goes here>
                Interlocked.Increment(ref Pending); // track the write we are about to issue
                Target.BeginWrite(Request[i].Buffer, 0, bytes, writeCompleteCallback, i); // write it
            } // end of reader loop
            while (Pending > 0) Thread.Sleep(10); // wait for all the writes to complete                 
            Source.Close();
            Target.Close(); // close the files                     
        }

        // the nested AsyncRequestState class below holds each IO request buffer and result

        // Asynchronous Callback completes writes and issues next read
        public static void WriteCompleteCallback(IAsyncResult ar)
        {
            lock (WriteCountMutex)
            {
                // protect the shared variables
                int i = Convert.ToInt32(ar.AsyncState); // get request index
                Target.EndWrite(ar); // mark the write complete
                Interlocked.Decrement(ref Pending); // one less write in flight
                BytesWritten += BufferSize; // advance bytes written
                Request[i].BufferOffset += Buffers*BufferSize; // stride to next slot 
                if (Request[i].BufferOffset < TotalBytes)
                {
                    // if not all read, issue next read
                    Source.Position = Request[i].BufferOffset; // issue read at that offset
                    Request[i].ReadAsyncResult = Source.BeginRead(Request[i].Buffer, 0, BufferSize, null, i);
                    Request[i].ReadLaunched.Set();
                }
            }
        }

        #region Nested type: AsyncRequestState

        public class AsyncRequestState
        {
            // data that tracks each async request
            public byte[] Buffer; // IO buffer to hold read/write data
            public long BufferOffset; // buffer strides thru file BUFFERS*BUFFER_SIZE
            public IAsyncResult ReadAsyncResult; // handle for read requests to EndRead() on.
            public AutoResetEvent ReadLaunched; // Event signals start of read 

            public AsyncRequestState(int i)
            {
                // constructor    
                BufferOffset = i*BufferSize; // offset in file where buffer reads/writes
                ReadLaunched = new AutoResetEvent(false); // semaphore says reading (not writing)
                Buffer = new byte[BufferSize]; // allocates the buffer
            }
        }

        #endregion
    }

The fun bit about this code is that you don't need to spawn your own threads to do the work. All of it happens from a single calling thread, and the async work happens in the background. I also make sure to grow the target file up front to keep the copy from dropping back into synchronous mode on file growth.
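Calling it is a single, synchronous-looking call from that one thread. A minimal usage sketch; the file paths here are just placeholders:

// Minimal usage sketch; paths are placeholders.
internal static class AsyncCopyDemo
{
	private static void Main()
	{
		// one call from one thread; the eight overlapped requests above do the rest
		AsyncFileCopy.AsyncBufferedFileCopy(@"C:\temp\source.bak", @"C:\temp\target.bak");
	}
}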

This next bit is the un-buffered stuff.

internal class UnBufferedFileCopy
{
    public static int CopyBufferSize = 8 * 1024 * 1024;

    public static byte[] Buffer = new byte[CopyBufferSize];

    const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

    public static int CopyFileUnbuffered(string inputfile, string outputfile)
    {
        var infile = new FileStream(inputfile, FileMode.Open, FileAccess.Read,
                                    FileShare.None, 8, FileFlagNoBuffering | FileOptions.SequentialScan);
        var outfile = new FileStream(outputfile, FileMode.Create, FileAccess.Write,
                                     FileShare.None, 8, FileOptions.WriteThrough);

        int bytesRead;
        while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
        {
            outfile.Write(Buffer, 0, bytesRead);
        }

        outfile.Close();
        outfile.Dispose();
        infile.Close();
        infile.Dispose();
        return 1;
    }
}

Since this is a synchronous copy I'm not worried about extending the file for performance, though there is still the fragmentation issue to think about. Without the async plumbing the code is a bit cleaner. The secret sauce here is defining your own file option and passing it in.

const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000;

I hear you asking now, where did this thing come from? Well, that is simple: it is a regular flag you can pass in C or C++ when you create a file handle. I got curious about what the CLR was actually doing in the background. It has to make a call to the OS at some point, and that means unmanaged code.

internal class UnmanagedFileCopy
{
    public static int CopyBufferSize = 8 * 1024 * 1024;

    public static byte[] Buffer = new byte[CopyBufferSize];

    private const int FILE_FLAG_NO_BUFFERING = unchecked(0x20000000);
    private const int FILE_FLAG_OVERLAPPED = unchecked(0x40000000);
    private const int FILE_FLAG_SEQUENTIAL_SCAN = unchecked(0x08000000);
    private const int FILE_FLAG_WRITE_THROUGH = unchecked((int)0x80000000);
    private const int FILE_FLAG_NONE = unchecked(0x00000000);

    public static FileStream infile;
    public static SafeFileHandle inhandle;
    public static FileStream outfile;
    public static SafeFileHandle outhandle;

    [DllImport("KERNEL32", SetLastError = true, CharSet = CharSet.Auto, BestFitMapping = false)]
    private static extern SafeFileHandle CreateFile(String fileName,
                                                    int desiredAccess,
                                                    FileShare shareMode,
                                                    IntPtr securityAttrs,
                                                    FileMode creationDisposition,
                                                    int flagsAndAttributes,
                                                    IntPtr templateFile);

    public static void CopyUnmanaged(string inputfile, string outputfile)
    {
        outhandle = CreateFile(outputfile,
                   (int)FileAccess.Write,
                   (int)FileShare.None,
                   IntPtr.Zero,
                   FileMode.Create,
                   FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
                   IntPtr.Zero);

        inhandle = CreateFile(inputfile,
                                  (int)FileAccess.Read,
                                  (int)FileShare.None,
                                  IntPtr.Zero,
                                  FileMode.Open,
                                  FILE_FLAG_NO_BUFFERING | FILE_FLAG_SEQUENTIAL_SCAN,
                                  IntPtr.Zero);

        outfile = new FileStream(outhandle, FileAccess.Write, 8, false);
        infile = new FileStream(inhandle, FileAccess.Read, 8, false);

        int bytesRead;
        while ((bytesRead = infile.Read(Buffer, 0, CopyBufferSize)) != 0)
        {
            outfile.Write(Buffer, 0, bytesRead);
        }

        outfile.Close();
        outfile.Dispose();
        outhandle.Close();
        outhandle.Dispose();
        infile.Close();
        infile.Dispose();
        inhandle.Close();
        inhandle.Dispose();
    }
}

If I were building my own unmanaged calls, this would be it. When you profile the managed code for object creates/destroys you see that it is making calls to SafeFileHandle. Being the curious guy I am, I did a little more digging. For those of you who don't know, there is an open source implementation of the Common Language Runtime called Mono. That means you can download the source code and take a look at how things are done. Poking around in the FileStream and associated code I saw that it had all the file flags in the code, but un-buffered was commented out… Now I had a mystery on my hands. I tried to implement asynchronous un-buffered IO using all unmanaged code calls and couldn't do it. There is a fundamental difference between a byte array in the CLR and what I can set up in native C++. One of the things you have to be able to do if you want asynchronous un-buffered IO is sector-align all reads and writes, including the in-memory buffers. You can't do that in C#. You have to allocate an unmanaged segment of memory and handle the reads and writes through that buffer. At the end of the day, you have written all the C++ you need to do the file copy and wrapped it in a managed code loop.
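To make the alignment problem concrete, here is a minimal sketch of the kind of thing native code has to do before it can issue an un-buffered read or write: over-allocate a raw buffer and round the pointer up to a sector boundary. This is not part of UBCopy; the caller would pass in the volume's sector size, commonly 512 bytes. You can't do this to a managed byte array because the GC owns its address.

using System;
using System.Runtime.InteropServices;

// Minimal sketch of sector-aligned buffer allocation; not part of UBCopy.
internal static class AlignedBufferSketch
{
	public static IntPtr Allocate(int size, int sectorSize, out IntPtr raw)
	{
		// over-allocate unmanaged memory so there is room to slide forward to an aligned address
		raw = Marshal.AllocHGlobal(size + sectorSize);
		long aligned = (raw.ToInt64() + sectorSize - 1) & ~(long)(sectorSize - 1);
		return new IntPtr(aligned); // this is the pointer you would hand to ReadFile/WriteFile
	}

	public static void Free(IntPtr raw)
	{
		Marshal.FreeHGlobal(raw); // always free the original pointer, not the aligned one
	}
}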

So, you can do asynchronous OR un-buffered, but not both. From Sequential File Programming Patterns and Performance with .NET:

the FileStream class does a fine job. Most applications do not need or want un-buffered IO. But, some applications like database systems and file copy utilities want the performance and control un-buffered IO offers.

And that is a real shame; I'd love to write some high-performance IO stuff in C#. I settled on doing un-buffered IO since these copies run from a SQL Server, which will always be under some kind of memory pressure, to the file server. If I could do both asynchronous and un-buffered I could get close to wire speed, around 105 to 115 megabytes a second. Just doing un-buffered gets me around 80 megabytes per second. Not horrible, but not the best.

What happens when Windows is the step-child? Adventures in Ruby on Rails.

Like many of you, I've heard the developer community going on about Rails for quite a while now. It wasn't until recently that I had any reason to dip into that world. Over at Nitosphere the website is all run on Rails. We got an inexpensive web host and it was pretty easy to get it up and running. Like most shared web hosts, it is all Linux/open source based. We have now grown to the point that hosting our own server would be cheap enough and give us complete control over the box. As a Microsoft ISV I thought it would be nice to have our new box be a Windows box. It would also be nice to hook into SQL Server instead of MySQL. After a little digging I found that Microsoft is sponsoring IronRuby, an implementation of Ruby that runs on the .NET platform. Unfortunately, it isn't completely compatible with one of the packages we need to run the website. So, back to Ruby. There is also a gem to run Rails apps against SQL Server, but it isn't compatible with some of the stuff on our website either. Finally, I fell back to ODBC to connect to SQL Server. Everything wired up but there was still an incompatibility issue. I'll keep trying to work it out, but our fallback was MySQL.

You will need:

Source Control:

If you plan on getting the source for anything you will need either Git or Subversion.

Git for Windows:

http://code.google.com/p/msysgit/

http://msysgit.googlecode.com/files/Git-1.7.0.2-preview20100309.exe

Subversion clients:

http://www.sliksvn.com/en/download/ basic client, 32-bit or 64-bit

Some folks prefer TortoiseSVN:

http://tortoisesvn.tigris.org/

 

Ruby core:

http://rubyinstaller.org/

rubyinstaller-1.8.7-p249-rc2.exe

Ruby 1.8.7 has the most compatibility with existing gems. Get the latest installer if the one linked here isn't current. There are also installers for 1.8.6 and 1.9.1. Again, check to see if they will support the gems you need to get your site up and running!

http://rubyforge.org/frs/download.php/66888/devkit-3.4.5r3-20091110.7z

Install the DevKit if you would like to compile gems instead of manually downloading them. If you don't install the DevKit, some gems will fail to install since they can't compile to a native extension. To get around that you can also use --platform=mswin32 when you install a gem.

Example: gem install fastercsv --platform=mswin32 will fetch the precompiled Windows gem if it exists. For more information on the DevKit check out this link: http://www.akitaonrails.com/2008/7/26/still-playing-with-ruby-on-windows

 

Databases:

Sqlite

http://www.sqlite.org/download.html

http://www.sqlite.org/sqlite-3_6_23.zip

http://www.sqlite.org/sqlitedll-3_6_23.zip

When you create an application, Ruby on Rails defaults to SQLite3. If you want to use SQLite3 you will need to download the DLLs and the command-line executable.

MySQL

http://www.mysql.com/downloads/mysql/5.1.html 

Since I couldn't get our site to talk to SQL Server we are staying on MySQL for now. You can use the latest installer, 64-bit or 32-bit, but you must have the 32-bit library for Ruby to work properly. The libmySQL.dll from the 5.0.15 install did the trick for me.

http://downloads.mysql.com/archives.php?p=mysql-5.0&v=5.0.15 

SQL Server

If you would like to try the SQL Server adapter just do a gem install activerecord-sqlserver-adapter to get it.

 

HTTP Ruby Server:

I went with Mongrel. It may not be the best, but it was pretty easy to set up and get running. I am running version 1.1.5 right now since that seems to work best with mongrel_service, which you will need if you don't want to stay logged into your web server with a DOS prompt running. I opted for the latest beta of mongrel_service since it cut out some dependencies and seems pretty stable at the moment. As always, adding --pre gets the latest beta gem. Also, --include-dependencies will grab everything the gem needs to run, including Mongrel.

 

Gems specific to my install:

Substruct uses RMagick for thumbnail generation, which requires ImageMagick to do the actual work.

http://www.imagemagick.org/script/binary-releases.php?ImageMagick=dv31jd0gev1d3lk182a4pma8i6#windows

http://www.imagemagick.org/download/binaries/ImageMagick-6.6.0-7-Q8-windows-dll.exe

RedCloth implements the Textile markup language for Ruby. If you didn't install the DevKit, don't forget to add --platform=mswin32 to your gem install commands.

http://rubyforge.org/frs/?group_id=216&release_id=36337

 

My installation steps:

Install Ruby

Make sure c:\ruby\bin (or wherever you installed it) is in the path. I recommend a path with no spaces, so no c:\program files.

Extract the DevKit to the c:\ruby directory.

Extract the SQLite exe and DLLs to c:\ruby\bin.

Extract libmySQL.dll from the 32-bit 5.0.15 archive.

Open a command prompt with administrator privileges.

Issue these commands:

gem update --system

gem install rails --no-ri --no-rdoc

gem install sqlite3-ruby --no-ri --no-rdoc

gem install mysql --no-ri --no-rdoc

gem install mongrel_service --no-ri --no-rdoc --platform mswin32 --include-dependencies --pre

 

After that, install any gems you need for your Rails app. Make sure to test that your app works in production mode with Mongrel before anything else; there will be some kinks to work out, I'm sure. Once you are happy that everything is running as expected you can install your Mongrel service.

mongrel_rails service::install -N MyAppsServiceName -c c:\app\myapp -p 3000 -e production

-N is the service name. -c is the directory the app will be served from. -p is the port number it will listen on. -e is the mode it will run in, like development or production. I chose a few high ports, 3000 to 3008, for my services to run on.

You can always remove a service if something is wrong.

mongrel_rails service::remove -N MyAppsServiceName

 

Setting up IIS7:

Install the Application Request Routing 2.0 and URL Rewrite plug-ins using the Web Platform Installer:

http://www.microsoft.com/web/Downloads/platform.aspx

Once that is done you will need to create a new web farm.

(screenshot: create farm)

Next you will need to add at least one server entry. You may want to edit your hosts file and add additional aliases to your IP address so you can run multiple copies of Mongrel to service all requests. The recommendation is one per CPU/core.

(screenshot: add server)

In the last step the wizard asks if you want to add the routing rules. The answer is yes.

(screenshot: add rules)

You can confirm the routing rules are in place.

(screenshot: edit inbound rule)

Make sure you have a website in IIS running and listening on port 80. Without this there is nothing for IIS to route to your new server farm.

 

If you have any questions post them up. I’m not a Rails expert but I have just been through the pain of Rails on Windows!

Sometimes, you have to fix it yourself

The Problem

SQL Server is a huge product with lots of moving parts. Bugs happen. Microsoft has a place, Connect, to voice your issues or problems. They allow you to vote on the issue and then decide when or if it will get fixed. I've used Connect when I hit a bug and I have voted on items that were important to me. Recently I hit a bug in sp_createstats. I use this system stored procedure to generate statistics in an automated process I've got that manages statistics. I added a new vendor database to the system and on the first run hit "Column 'DAYSOPEN' in table 'dbo.TBL_OPPORTUNITY' cannot be used in an index or statistics or as a partition key because it is non-deterministic." Well, we all know you can't create stats on a non-deterministic computed column! I quickly went to the Connect site and someone else had already entered it. The downside was it had so few votes it was only slated to go into the next cumulative update/service pack. When I hit this issue they hadn't yet announced Service Pack 4. I already had this procedure coded into my routines and really didn't want to rewrite them to get past this one problem.

The Solution

!!WARNING!!

Doing what I am about to describe could break at a later date or randomly kill baby kittens.

Since it is a system stored procedure I am loath to make any changes to it directly. There are ways to modify some system stored procedures, but they involve the installation CD and creativity. With that door closed there was only one avenue open to me: create my own system stored procedure with the fix in it. There is a problem with this solution as well; if it gets dropped due to a service pack or an upgrade, anything calling it will break. The first thing I did was to see if the procedure text was available by executing sp_helptext sp_createstats. Luckily it was! Now all I had to do was figure out where it was broken. The procedure is pretty simple and uses some cursors to loop through all the objects and create column statistics where they don't exist.

declare ms_crs_cnames cursor local for select c.name from sys.columns c  
     where c.object_id = @table_id  
     and (type_name(c.system_type_id) not in ('xml'))  
     and c.name not in (select col_name from #colpostab where col_pos = 1)  
     and ((c.name in (select col_name from #colpostab)) or (@indexonly <> 'INDEXONLY'))
    -- populate temporary table of all (column, index position) tuples for this table  

It was pretty easy to spot. They weren't checking to see if the column was computed, so I added a line to the WHERE clause.

and c.is_computed = 0

That's it. One little check to see if it is a computed column. Now that I had it fixed, I created a new procedure named sp_createstats_fixed in the master database. Just creating it in master doesn't make it act like the original procedure or make it a system stored procedure. For that I had to execute EXECUTE sp_MS_marksystemobject 'sp_createstats_fixed'. This is an undocumented stored procedure and could change or go away at any time. The only way to unmark it in SQL Server 2005 is to drop the procedure and recreate it. Now it acts just like the old procedure. Next I had to replace all references to the old proc with the new one. I made an entry in our bug tracking system about the change so we would have a record of what I did and why.

Conclusions

This wasn't the most elegant solution, and it could break later. The upside is it only took me about 30 minutes to fix and deploy versus the hours of re-coding and testing that I would have had to do otherwise. Do I think you should go around creating your own system stored procedures? Not at all. I don't recommend you put anything in the master database, period. If the problem had been more complex I would have redone the original routines to exclude the broken procedure. This time it just happened to be a very quick fix to a non-critical part of our system.

Out Of My Comfort Zone: Building a Web App

If you read Fundamentals of Storage Systems – Stripe Size, Block Size, and IO Patterns you know I built a little web tool to help you with sizing and estimating your RAID array's performance. This is way out of my area of expertise. Luckily for me I like a challenge and had a guiding hand from some friends. I haven't done any web programming since I wrote a photo album mod for the Snitz! forum package in 2002, using classic ASP. I still get thank-you emails from folks who have been running it for years. Needless to say, my skills are a little rusty. My first instinct was to fire up Visual Studio 2008 and build an ASP.NET page tied to a back-end SQL Server 2008 database. Well, ASP.NET is just different enough that I was struggling to do the most basic things. So, I thought it would be good to "get back to basics". A very good friend of mine has been a professional web developer for the same company since about the time my photo album came out. He lives and breathes web technology. The problem was he doesn't do ASP.NET at all. Everything he does is standards-compliant HTML and JavaScript. I told him about my spreadsheet of calculations and my desire to turn it into a web page. Joe quickly begged me to leave him alone. Once I calmed him down, he did a little sample page to get me on the right track. I took a look at it and thought it would be easy to goof around in this JavaScript stuff. I was wrong. What a convoluted world web developers live in.

I hacked on it the weekend before Christmas and was pretty happy with my handiwork, right up until I showed it to some people. Comments like "Kill it with fire!" were common. I explained what it was and that took some of the hostility out of the feedback. The next Monday I asked the two lead web heads at work to look at it. Once Ben had washed his eyes out with bleach he told me in no uncertain terms that he could not fix it. My powers of persuasion, and a threat to never approve another schema change, helped bring him around. He explained several technologies to me I wasn't aware of, like CSS. Ben showed me quickly how to format the page so people's heads wouldn't burst into flames when they saw it. He also spoke of things like Ajax and JSON. Being the clueless data guy that I am, I turned to my trusty friend yet again, Google.

Quick Definition List

JavaScript, the underpinning of the modern dynamic web 2.0 world.

Ajax (Asynchronous JavaScript and XML), a group of interrelated web technologies that allow for dynamic web design.

JSON (JavaScript Object Notation), a lightweight data-interchange format; a way to structure data, like XML, implementing name/value pairs and ordered lists.

jQuery, a JavaScript library used to traverse HTML elements and build dynamic content easily.

JavaScript Implementation

What struck me was just how much had changed, yet stayed the same. One of the things I've always hated is that no two browsers render the page the same way. If it worked in Firefox it didn't work in IE. The other thing I didn't realize for quite a while was that any of the browsers would happily run the world's worst code. I'm not talking poorly written code, I mean flat wrong. They will let any old JavaScript run. JavaScript is defined as a loosely typed language. The way it is implemented, it goes from loosely to sloppy very quickly. Don't get me wrong here. The language specification, ECMAScript, is solid. The implementations are very poor in most browsers and that is where the problems developing for JavaScript really come in. I'm not saying JavaScript is slow, quite the opposite in fact. It is just difficult to make sure you are actually writing correct code. Having worked with Visual Studio for quite some time, I've become spoiled with having my IDE tell me when the syntax is wrong before I hit the compile button. So, I was back to Google again looking for tools to help me out.

The Tools

I quickly settled on Aptana Studio 2.0 to do the actual code work. I also used Notepad++, which is by far the best text editor in the world. Aptana Studio let me see errors in real time and use the preview function to look at page renders without having to keep browser windows open. I did look at Microsoft's new Expression Suite 3.0. It is nice. The designer is slick, but in typical Microsoft fashion it mangles some parts of the code that I then have to fix by hand. Aptana Studio also has a built-in JavaScript minimizer that shrinks the size of your JavaScript files for quicker load times. Aptana Studio doesn't enforce the use of line terminators, though, and when I compacted my JavaScript it wouldn't load. I did some additional digging and found an awesome lint tool for validating JavaScript, JSLint. My favorite thing about it is the quote and link on the front page: "Warning: JSLint will hurt your feelings." I knew I had found the right tool. After several passes I got my code to validate and minimize without any errors. I quickly looked for other lint-based tools and found one for JSON as well, JSONLint.

The Work

Over the course of a few weeks I tinkered with the page, adding features, correcting mistakes, and cleaning up the layout. The biggest thing that I learned is just how fast JavaScript can be. Even after I did the initial page I had no doubt a database would be involved at some point. I had compiled a small database of hard drives and their characteristics and wanted to add that to the page. The problem was I would basically have to redo the page in ASP.NET or some other dynamic language to pull the data from the database. So, I cheated. As a proof of concept I built a JSON structure with a sample of the data to load the drop-down list box and all the variables dynamically. I was stunned at just how fast the page was to respond. I hacked together some T-SQL that would generate the JSON object from my database, all 1200 entries. I cleaned it up a bit, made sure that JSONLint validated it, ran the minimizer on it, and plugged it in. Holy cow, it worked! It was also still very fast. I just couldn't believe it. The only drawback is having to update the JSON if I add new drives to the database.

 

What I’ve Learned

The main takeaways for me on this project are simple: web development stinks. Now I know why web folks have a hard time with T-SQL; they already have to be experts in five or six different technologies across two or more platforms. I used to introduce Joe as "my friend who does JavaScript", now I know he is a hard-core developer! What I built is simple and works. I think the layout is tolerable. If you have any suggestions or feedback just drop me a note.