I need to create some objects that will allow me to access columnar data by header name. The access looks something like this:
class ParentObj {
    string[] columnNames;
    ChildObj[] Children;
}

class ChildObj {
    Dictionary<string, string> Items;
}
All the child dictionaries will have the same set of keys, matching the column list from the parent.
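As a rough sketch of how this shape gets populated and queried (the field visibility, sample data, and `Main` method are my own assumptions for illustration, not part of the original design):

```csharp
using System;
using System.Collections.Generic;

class ParentObj {
    public string[] ColumnNames;
    public ChildObj[] Children;
}

class ChildObj {
    public Dictionary<string, string> Items;
}

class Demo {
    static void Main() {
        var parent = new ParentObj {
            ColumnNames = new[] { "Name", "City" },
            Children = new ChildObj[1]
        };
        // Each child dictionary reuses the parent's column names as its keys.
        parent.Children[0] = new ChildObj {
            Items = new Dictionary<string, string> {
                { parent.ColumnNames[0], "Alice" },
                { parent.ColumnNames[1], "Lisbon" }
            }
        };
        // Access columnar data by header name:
        Console.WriteLine(parent.Children[0].Items["City"]); // prints "Lisbon"
    }
}
```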
Now, this particular object is going to be used on resource-constrained computers and could contain large quantities of data. My big concern was whether I should actually use separate dictionaries, or just store the values in an array and look up the column number from a single shared dictionary.
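For comparison, the alternative design stores each row as a plain array and keeps one dictionary mapping column name to array index. This is my own sketch of that approach (class and member names are hypothetical):

```csharp
using System;
using System.Collections.Generic;

class IndexedTable {
    // One shared dictionary: column name -> position in each row array.
    public Dictionary<string, int> ColumnIndex = new Dictionary<string, int>();
    public List<string[]> Rows = new List<string[]>();

    public IndexedTable(params string[] columnNames) {
        for (int i = 0; i < columnNames.Length; i++)
            ColumnIndex[columnNames[i]] = i;
    }

    // Two-step lookup: column name -> index, then index -> value.
    public string Get(int row, string column) {
        return Rows[row][ColumnIndex[column]];
    }
}

class Demo2 {
    static void Main() {
        var table = new IndexedTable("Name", "City");
        table.Rows.Add(new[] { "Alice", "Lisbon" });
        Console.WriteLine(table.Get(0, "City")); // prints "Lisbon"
    }
}
```

The trade-off is exactly the one weighed below: this saves the per-child dictionary overhead at the cost of managing the index lookup yourself.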
After trying both methods with very large random data, I found that the separate dictionaries increased memory usage by only about 10 MB. Certainly not worth the added complexity of managing my own index-lookup implementation.
My next question was whether the CLR was wasting a bunch of memory on all those duplicate keys. CLR Profiler to the rescue!
I modified my test program so it could either use the same keys in all child objects or use random keys. After making all the strings the same size, it was clear that .NET does not keep a separate copy of each duplicate string:
Object memory when all keys are the same:
Object memory when the keys are random:
So, in conclusion: using many dictionary instances with the same set of keys does not result in a significant amount of duplicate data (or wasted memory).
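One way to see why, without a profiler: a `Dictionary<string, string>` stores a reference to each key, not a copy of its characters. When every child dictionary is built from the same string instances (here, the parent's column list), all of them point at the same objects on the heap. A small sketch of my own to demonstrate this:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class KeySharingDemo {
    static void Main() {
        string[] columns = { "Name", "City" };

        // Two child-style dictionaries built from the same key instances.
        var a = new Dictionary<string, string>();
        var b = new Dictionary<string, string>();
        foreach (var col in columns) {
            a[col] = "value-a";
            b[col] = "value-b";
        }

        foreach (var col in columns) {
            string keyA = a.Keys.First(k => k == col);
            string keyB = b.Keys.First(k => k == col);
            // Both dictionaries hold references to the very same string object,
            // so the character data exists only once on the heap.
            Console.WriteLine(ReferenceEquals(keyA, keyB)); // True
        }
    }
}
```

Note this works because the keys are the same references; two distinct but equal strings built at runtime would not automatically be merged into one object.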