Skip to content

Back to Basics – Keep it Simple and Develop Your Sense of Smell – From Linq To CSV

February 5, 2010

I was working with a friend recently on a side thing they were doing. They wanted to create an "Export" function for some small bit of data and start it from their website.

  • You’d hit a URL after logging in
  • Some data would come out of a database
  • You’d get a .CSV file downloaded
  • You could open it in Excel or whatever.

I spoke to my friend and they said it was cool to share their code for this post. This post isn’t meant to be a WTF or OMG look at that code, as is it meant to talk about some of the underlying issues. There’s few things going on here and it’s not all their fault, but it smells.

  • They are using a Page when a IHttpHandler will do.
    • Not a huge deal, but there’s overhead in making a Page, and they’re not using any of the things that make a Page a Page. The call to ClearContents is an attempt at telling the Page to back off. It’s easier to just not have a page.
  • They’re "thrashing" a bit in the Page_Load, basically "programming by coincidence."
    • They’re not 100% sure of what needs to be done, so they’re doing everything until it works. Even setting defaults many times or calling methods that set properties and they setting those properties again.
    • This means a few things. First, HTTP is subtle. Second, the Response APIs are confusing (less so in .NET 4) and it’s easy to do the same thing in two ways.
  • They’re not using using() or IDisposable
    • They’re cleaning up MemoryStreams and StreamWriters, but if an exception happens, things’ll get cleaned up whenever. It’s not tight in a "cleaning up after yourself" deterministic (when it needs to be) kind of way.
    • They’re calling Flush() when it’s not really needed, again programming by coincidence. "I think this needs to be done and it doesn’t break…"
  • Old school data access
    • Not bad, pre se, but it could be easier to write and easier to read. DataAdapters, DataSets, are a hard way to do data access once you’ve used Linq to SQL, EF or NHibernate.
  • Re-Throwing an Exception via "throw ex"
    • When you want to re-throw an exception, ALWAYS just throw; or you’ll lose your current call stack and it’ll be hard to debug your code.
  • Not reusable at all
    • Now, reuse isn’t the end-all, but it’s nice. If this programmer wants different kinds of exports, they’ll need to extract a lot from the spaghetti.

Here’s what they came up with first.

Note that this code is not ideal. This is the before. Don’t use it. 

using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using System.Configuration;
using System.Data.SqlClient;
using System.Data;

namespace FooFoo
{
public partial class Download : System.Web.UI.Page
{
protected void Page_Load(object sender, EventArgs e)
{
System.IO.MemoryStream ms = CreateMemoryFile();

byte[] byteArray = ms.ToArray();
ms.Flush();
ms.Close();

Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.Cookies.Clear();
Response.Cache.SetCacheability(HttpCacheability.Private);
Response.CacheControl = "private";
Response.Charset = System.Text.UTF8Encoding.UTF8.WebName;
Response.ContentEncoding = System.Text.UTF8Encoding.UTF8;
Response.AppendHeader("Pragma", "cache");
Response.AppendHeader("Expires", "60");
Response.ContentType = "text/comma-separated-values";
Response.AddHeader("Content-Disposition", "attachment; filename=FooFoo.csv");
Response.AddHeader("Content-Length", byteArray.Length.ToString());
Response.BinaryWrite(byteArray);
}

public System.IO.MemoryStream CreateMemoryFile()
{
MemoryStream ReturnStream = new MemoryStream();

try
{
string strConn = ConfigurationManager.ConnectionStrings["FooFooConnectionString"].ToString();
SqlConnection conn = new SqlConnection(strConn);
SqlDataAdapter da = new SqlDataAdapter("SELECT * FROM [FooFoo] ORDER BY id ASC", conn);
DataSet ds = new DataSet();
da.Fill(ds, "FooFoo");
DataTable dt = ds.Tables["FooFoo"];

//Create a streamwriter to write to the memory stream
StreamWriter sw = new StreamWriter(ReturnStream);

int iColCount = dt.Columns.Count;

for (int i = 0; i < iColCount; i++)
{
sw.Write(dt.Columns[i]);
if (i < iColCount - 1)
{
sw.Write(",");
}
}

sw.WriteLine();
int intRows = dt.Rows.Count;

// Now write all the rows.
foreach (DataRow dr in dt.Rows)
{
for (int i = 0; i < iColCount; i++)
{

if (!Convert.IsDBNull(dr[i]))
{
string str = String.Format("\"{0:c}\"", dr[i].ToString()).Replace("\r\n", " ");
sw.Write(str);
}
else
{
sw.Write("");
}

if (i < iColCount - 1)
{
sw.Write(",");
}
}
sw.WriteLine();
}

sw.Flush();
sw.Close();
}
catch (Exception Ex)
{
throw Ex;
}
return ReturnStream;
}

}
}

I don’t claim to be a good programmer, but I do OK. I went over my concerns with my friend, and suggested first an HttpHandler. I started with Phil’s basic abstract HttpHandler (based on my super basic HttpHandler boilerplate). I could have certainly done by just implementing IHttpHandler, but I like this way. They’re about the same # of lines of code. The important part is in HandleRequest (or ProcessRequest).

namespace FooFoo
{
public class ExportHandler : BaseHttpHandler
{
public override void HandleRequest(HttpContext context)
{
context.Response.AddHeader("Content-Disposition", "attachment; filename=FooFoo.csv");
context.Response.Cache.SetCacheability(HttpCacheability.Private);

var dc = new FooFooDataContext();
string result = dc.FooFooTable.ToCsv();
context.Response.Write(result);
}

public override bool RequiresAuthentication
{
get { return true; }
}

public override string ContentMimeType
{
get { return "text/comma-separated-values"; }
}

public override bool ValidateParameters(HttpContext context)
{
return true;
}
}
}

At the time I wrote this, I was writing how I wished the code would look. I didn’t have a "ToCsv()" method, but I was showing my friend how I though the could should be separated. Even better if there was a clean DAL (Data Access Layer) and Business Layer along with a service for turning things into CSV, but this isn’t too bad. ToCsv() in this example is a theoretical extension method to take an IEnumerable of something and output it as CSV. I started writing it, but then decided to Google with Bing, and found a decent direction to start with at Mike Hadlow’s blog. He didn’t include all the code in the post, but it saved me some typing, so thanks Mike!

namespace FooFoo
{
public static class LinqToCSV
{
public static string ToCsv<T>(this IEnumerable<T> items)
where T : class
{
var csvBuilder = new StringBuilder();
var properties = typeof(T).GetProperties();
foreach (T item in items)
{
string line = string.Join(",",properties.Select(p => p.GetValue(item, null).ToCsvValue()).ToArray());
csvBuilder.AppendLine(line);
}
return csvBuilder.ToString();
}

private static string ToCsvValue<T>(this T item)
{
if(item == null) return "\"\"";

if (item is string)
{
return string.Format("\"{0}\"", item.ToString().Replace("\"", "\\\""));
}
double dummy;
if (double.TryParse(item.ToString(), out dummy))
{
return string.Format("{0}", item);
}
return string.Format("\"{0}\"", item);
}
}
}

This creates an extension method that lets me call something.toCsv() on anything IEnumerable. It’ll spin through the properties (yes, I know that could have been a LINQ statement also, but I like a nice ForEach sometimes. Feel free to fix this up in the comments!😉 ) and build up the Comma Separated Values.

At some point, it really should format Dates as {0:u} but as of now, it works identically as the before code and attempts to rectify most of the issues brought up. Of course, one could take something like this as far and make it as robust as they like.

As Mike points out, you can also do little projections to control the output:

string csv = things.Select(t => new { Id = t.Id, Name = t.Name, Date = t.Date, Child = t.Child.Name }).AsCsv();

Your thoughts, Dear Reader?

Related Links


© 2009 Scott Hanselman. All rights reserved.


Comments are closed.

%d bloggers like this: