I posted earlier an overview of the SQLCLR component of the SQL Server 2005 and showed how to implement a simple user-defined function using Visual Studio 2008 and C#. This post explains the importance of .NET Framework to SQLCLR by adding more functionality to SQL Server 2005.

Introduction:

Using T-SQL, we can create a user-defined function, stored procedures, and triggers, but we can’t do add a user-defined aggregate function. An aggregate function operates against a collection of values, but returns only a single value, for example, the sum(), avg(), and count() functions. Each of those functions, work on all the rows in a specified column, but only returns one value like the sum, the average, or the number of rows.

But what if I needed to create my own aggregate function that does a special job, using T-SQL there is no direct way. Still, we can use cursors and loop for each row, which may result in complex code with a bad performance. Fortunately, SQLCLR provides us the ability to implement user-defined aggregate.

Implementation:

The following will show how to implement a simple aggregate function using C# and Visual Studio 2008.

The aggregate function that was chosen for this example is a simple COUNTNULL aggregate function that returns the number of NULL rows in a specified column, of course, you can do it easily in T-SQL, but this is just a simple example.

Inside your Database project in Visual Studio, add a new item and select Aggregate, and a new class is added to the project. Each aggregate function needs to be in a separate struct, notice this struct is marked as [Serializable] and Each aggregate function needs to be in a separate struct, notice this struct is marked as [Serializable] and [SqlUserDefinedAggregate(Format.Native)].

SQL Server 2005 needs to store an effective representation of the aggregate on disk, so the aggregate should be Serializable, and the SqlUserDefinedAggregate attribute indicates that this class or struct is a user-defined aggregate and the property “Format.Native” tells the .NET Framework to handle the serialization process using a simple algorithm, and it works only with value types. That’s why the aggregate is defined as a struct.

You can also specify the property name in the SqlUserDefinedAggregate attribute to the name of your Aggregate function “The name that is called in SQL Server 2005”.
[SqlUserDefinedAggregate (Format.Native,Name = “CountNulls”)]

Now, inside the class, we have to implement four methods in order for the aggregate to work. There are init, accumulate, merge, and terminate.

Init() method is used in order to initialize any needed resources or variables and is called only once when the aggregate is called.

Accumulate() method is the actual implementation of the aggregate function and is called once for each row.

Merge() method is used because sometimes, if the column to aggregate contains a large number of data, the SQLCLR may split those rows into groups and aggregate each group individually and then merge the results of those group again, then the Merge() method is called.

Terminate() method is used free any used resources and to return the result of this aggregate function.

Below is the code for the CountNulls aggregate function with some comments: 

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using Microsoft.SqlServer.Server;
[Serializable]
[SqlUserDefinedAggregate(Format.Native,Name = "CountNulls")]
public struct CountNulls
{
    private int Counter;  // field to count the not null rows
    public void Init()
    {
        Counter = 0;      // initialization
    }
    public void Accumulate(object Value)
    {
        if (Value == DBNull.Value)  // count just the rows that are not equal to NULL
            Counter++;
    }
    public void Merge(CountNulls Group)
    {
        this.Counter += Group.Counter; // when a merge is needed the counter of other groups should be added
    }
    public SqlString Terminate()
    {
        return new SqlString(Counter.ToString()); //returning the results
    }
}

After you compile this class, you have to deploy it to make available to be used inside SQL Server 2005

Finally, go to SQL Server Management Studio and use your aggregate function as follows:

dbo.CountNulls(EndDate)from HumanResources.EmployeeDepartmentHistory

Summary:

Each class or struct should be public and dedicated to only one aggregate function.

Serialization is required so that SQLCLR could store the representation of the aggregate on disk.

Note that this aggregate function can only take one parameter, we can’t implement aggregation with more than one parameter in SQL Server 2005, but we can with SQL Server 2008, and I’ll post about it later.