Problem
The following is a set of classes that read metadata from Sql Server tables and populates local objects. It works, but I feel like there’s too much duplication happening and too many places changes need to be made if something changes.
On the database side, the relevant metadata is collected into 4 tables with an additional Description
column (I should not that I’m not able to use extended properties to store the description field and just read directly from the sys
tables, hence the separate table approach). I won’t list those in detail here, but the TableData
class shown below mirrors the structure.
Firstly, there are 4 classes to represent a single record in each of the 4 tables, DatabaseData
, TableData
, Schema Data
and ColumnData.
I’ll only list the code for TableData
here, the others are structured in the same way (just with the relevant members and constructor arguments):
public class TableData
{
public readonly int DatabaseID;
public readonly int SchemaID;
public readonly int TableID;
public readonly string TableName;
public readonly DateTime CreatedDate;
public readonly DateTime ModifiedDate;
public readonly string Description;
public TableData(int DatabaseID, int SchemaID, int TableID, string TableName, DateTime CreatedDate, DateTime ModifiedDate, string Description)
{
this.DatabaseID = DatabaseID;
this.SchemaID = SchemaID;
this.TableID = TableID;
this.TableName = TableName;
this.CreatedDate = CreatedDate;
this.ModifiedDate = ModifiedDate;
this.Description = Description;
}
}
These objects can be constructed through the DataBuilder
static class. I’m not entirely convinced this is worth anything at the moment since it’s not currently adding any additional value versus constructing the objects directly:
public static class DataBuilder
{
public static TableData CreateTableData(int DatabaseID, int SchemaID, int TableID, string Name, DateTime CreatedDate, DateTime ModifiedDate, string Description)
{
return new TableData(DatabaseID, SchemaID, TableID, Name, CreatedDate, ModifiedDate, Description);
}
// ... equivalent builders for ColumnData, SchemaData, DatabaseData
}
The main attraction here is the DataFetcher
classes, which do the actual work of getting data from the server and creating collections of ColumnData
/TableData
/etc objects. There are a few different ways data will be fetched, so I wrote an abstract class SqlDataFetcher
for the common functionality:
public abstract class SqlDataFetcher
{
public readonly string ConnectionString;
public SqlDataFetcher(string ConnectionString) {
this.ConnectionString = ConnectionString;
}
public abstract HashSet<DatabaseData> GetDatabases();
public abstract HashSet<SchemaData> GetSchemas(DatabaseData Database);
public abstract HashSet<TableData> GetTables(DatabaseData database, SchemaData schema);
public abstract HashSet<ColumnData> GetColumns(DatabaseData database, SchemaData schema, TableData table);
protected DataTable GetData(string database, string commandText)
{
DataTable data = new DataTable();
using (SqlConnection connection = new SqlConnection(ConnectionString))
{
connection.Open();
connection.ChangeDatabase(database);
data.Load((
new SqlCommand()
{
Connection = connection,
CommandText = commandText,
CommandTimeout = 10
}).ExecuteReader());
connection.Close();
}
return data;
}
}
The first concrete implementation of SqlDataFetcher
reads data from the metadata tables:
public class MetadataFetcher : SqlDataFetcher
{
public MetadataFetcher(string ConnectionString) : base(ConnectionString) { }
public override HashSet<DatabaseData> GetDatabases()
{
HashSet<DatabaseData> databases = new HashSet<DatabaseData>();
DataTable data = this.GetData(
"DataDictionary",
"select DatabaseID, Name, Collation, CreatedDate, Description from dbo.DatabaseData"
);
foreach (DataRow row in data.Rows)
{
databases.Add(DataBuilder.CreateDatabaseData(
row.Field<Int32>("DatabaseID"),
row.Field<String>("Name"),
row.Field<String>("Collation"),
row.Field<DateTime>("CreatedDate"),
row.Field<String>("Description")
));
}
return databases;
}
// same logic for these ...
public override HashSet<SchemaData> GetSchemas(DatabaseData Database) ...
public override HashSet<TableData> GetTables(DatabaseData database, SchemaData schema) ...
public override HashSet<ColumnData> GetColumns(DatabaseData database, SchemaData schema, TableData table) ...
}
Finally, the various SqlDataFetcher
s (there will be others – one that reads directly from the sys
tables, for eg) can be constructed via methods the SqlDataFetcherBuilder
static class, with variants provided for connecting via either Trusted Connection or UserID/Password:
public static class SqlDataFetcherBuilder
{
// Creates a new MetadataFetcher object. Uses Integrated Security (Trusted Connection) to connect.
public static MetadataFetcher CreateMetadataFetcher(string Server, string Database)
{
return new MetadataFetcher(BuildConnectionString(Server, Database));
}
// Creates a new MetadataFetcher object. Uses a username and password to connect.
public static SysDataFetcher CreateMetadataFetcher(string Server, string Database, string UserID, string Password)
{
return new MetadataFetcher(BuildConnectionString(Server, Database, UserID, Password));
}
// Builds a connection string from Server and Database.
private static SqlConnectionStringBuilder BuildConnectionString(string Server, string Database)
{
SqlConnectionStringBuilder connectionString = new SqlConnectionStringBuilder();
connectionString.Add("Server", Server);
connectionString.Add("Database", Database);
connectionString.Add("Trusted_Connection", "Yes");
return connectionString;
}
// Builds a connection string from Server, Database, User ID and Password.
private static SqlConnectionStringBuilder BuildConnectionString(string Server, string Database, string UserID, string Password)
{
SqlConnectionStringBuilder connectionString = new SqlConnectionStringBuilder();
connectionString.Add("Server", Server);
connectionString.Add("Database", Database);
connectionString.Add("User ID", UserID);
connectionString.Add("Password", Password);
return connectionString;
}
}
Known / suspected issues
-
As mentioned at the start, feels like a lot of common factors. If I add/rename/remove a column in one of my tables on Sql Server, there’s currently 7 places I’d need to update the code
-
Queries hard-coded in (actually, this is just temporary, with the intent to change them to stored procedures / views on the server)
-
Not sure where the value is in the DataBuilder class
-
Absolutely no exception handling at the moment
Also note, all my code is documented, but I stripped them out here to keep the question relatively short.
How can I improve this?
Solution
public readonly int DatabaseID;
public readonly int SchemaID;
public readonly int TableID;
public readonly string TableName;
public readonly DateTime CreatedDate;
public readonly DateTime ModifiedDate;
public readonly string Description;
I like that they’re readonly
. Very very much. But fields should not be public – these should be properties; you could write them as public auto-properties with a private setter, like so:
public int DatabaseID { get; private set; }
public int SchemaID { get; private set; }
public int TableID { get; private set; }
//...
Your builder classes feel like static helpers with mixed concerns, and seem to confuse the terms builder and factory – two creational patterns that differ in several ways; I invite you to read up on them both, and to look into the abstract factory pattern to unlock the usefulness and power of factories – because I don’t quite see the use for the DataBuilder
(and SqlDataFetcherBuilder
for that matter) either.
The SqlDataFetcherBuilder
class has mixed concerns and is doing more than what its name suggests, and leaves me wondering why connection strings aren’t just stored in application settings, under a <ConnectionStrings>
section.
The SqlDataFetcher
abstract class smells, particularly the abstract
members: I don’t see what class other than MetadataFetcher
will need to implement these abstract methods.
I don’t mean to be harsh, but the whole code looks over-engineered with good intentions, but you seem to have written factories and abstractions for the sake of writing them more than for solving an actual problem: all the static
-ness is only moving the object instantiation outside of your other classes, without buying you anything to help testability or, as you’ve noted, maintainability:
If I add/rename/remove a column in one of my tables on Sql Server, there’s currently 7 places I’d need to update the code
The main problem is that your code looks like it was written for .net 2.0 – if you can use LINQ to SQL, or better, Entity Framework, most of that boilerplate could be simply removed – you would only have a class for each of your entities, and then a DbContext
implementation… and your queries could be written like this:
using (var context = new MetadataDbContext(connectionString))
{
return context.DatabaseDatas.ToList();
}
Where DatabaseDatas
maps to a table of the same name, with rows mapped to an entity type called DatabaseData
, which is the only place you’d have to go and modify something if the mapped SQL Server table ever needs to change. No more magic strings and flaky row.Field<T>("column")
calls!