.NET Nugget 24: Finding which resource strings you haven't provided a localized override for

As I recently mentioned, I’ve been working on an application that uses localization because the end users are Spanish speaking. The best I can do in Spanish is ask where the bathroom is, ask if I can sharpen my pencil… oh, and use the phrase, “No gum in my class!” Luckily, the business analyst I’m working with is a native Spanish speaker, and she provides my translations for me.

As I write the application I pull the strings that need to be localized out into a resource file (the refactoring in CodeRush makes this dead simple), but to begin with I generally only put them in the default resource file. Localization works like this: when a value is pulled from a resource file, the .NET Framework inspects the CurrentUICulture property of the thread and looks for that resource in the resource file for that specific culture. If it doesn’t find it there, it falls back to the default resource file and pulls the value from there.

For example, let’s say I have two resource files: somefile.LocalizedText.resx and somefile.LocalizedText.es-mx.resx. The one with es-mx in the name holds the Spanish resource values (es-mx is a culture code where es = Spanish and mx = Mexico). Whenever I ask for a resource value from the compiled somefile.LocalizedText class, the framework inspects the CurrentUICulture on the thread, and if that culture is es-mx it looks in the es-mx satellite assembly for the value. If it can’t find it there, it falls back to the default resources compiled into the main assembly and pulls the value from somefile.LocalizedText. It’s all pretty automatic; all I have to do is make sure there is a value in both resource files for each string that needs to be localized.
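To make the fallback order a little more concrete, here is a minimal sketch that walks the parent chain of a culture, which is the order in which the framework probes for resources before it lands on the defaults. The class name CultureFallbackDemo, the method FallbackChain, and the "(default resources)" label are made up for illustration; only CultureInfo and its Parent property come from the framework. It also shows a step I glossed over above: a specific culture like es-MX has a parent culture that gets checked before the defaults.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CultureFallbackDemo
{
    // Hypothetical helper: lists the cultures that would be probed for a
    // resource, most specific first, ending with the default resources.
    public static List<string> FallbackChain(string cultureName)
    {
        var chain = new List<string>();
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);

        // The invariant culture has an empty name and sits at the top of
        // every parent chain, so this loop always terminates.
        while (!string.IsNullOrEmpty(culture.Name))
        {
            chain.Add(culture.Name);
            culture = culture.Parent;
        }

        // Placeholder label for the resources compiled into the main assembly.
        chain.Add("(default resources)");
        return chain;
    }

    public static void Main()
    {
        foreach (string step in FallbackChain("es-MX"))
        {
            Console.WriteLine(step);
        }
    }
}
```

The exact intermediate cultures depend on the culture data available on the machine, so treat the printed chain as informational rather than guaranteed.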

When I’m done with a section of code I usually end up with the default resource file having lots of strings in it that the Spanish resource file does not. I then pull out the strings that are missing from the Spanish file and send them off to the business analyst for translation. Once she sends the translations back, I add them to the Spanish resource file and test it out. This leads to the question: which strings do I have in one file but not in the other? I spent way too long one day looking through the files and searching for the ones I needed to add, so the next time I took a few moments and came up with the following code:

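using System;
using System.Linq;
using System.Xml.Linq;

// Load both resource files (.resx files are plain XML).
XDocument baseDoc = XDocument.Load(@"somefile.LocalizedText.resx");
XDocument spanishDoc = XDocument.Load(@"somefile.LocalizedText.es-mx.resx");
// Select the name of every <data> entry in each file.
var englishDataItems = from d in baseDoc.Descendants("data")
                       select d.Attribute("name").Value;
var spanishDataItems = from d in spanishDoc.Descendants("data")
                       select d.Attribute("name").Value;
// Names that exist in the base file but not in the Spanish file.
var missingDataItemsFromSpanish = englishDataItems.Except(spanishDataItems);
// The queries only run here, as the results are enumerated.
foreach (string x in missingDataItemsFromSpanish)
{
    Console.WriteLine(x);
}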

This code outputs the name of every resource in the base file that isn’t overridden in the localized file. It first loads the contents of each resource file into an XDocument. It then sets up two LINQ queries that select the resource names in each document, and a third query that finds the names that exist in the English data items but not the Spanish. Finally, it outputs the results of that last query. Due to the awesomeness of LINQ, none of the queries actually execute until I start looping through the results to output them.
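If the deferred execution part sounds abstract, here is a small sketch of it. It uses two tiny in-memory documents in place of real .resx files, and the resource names (Greeting, Farewell, Thanks) and the class DeferredExecutionDemo are made up for the demo. The point is that an entry added to the base document after the query is defined still shows up, because nothing is evaluated until the results are enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class DeferredExecutionDemo
{
    public static List<string> Run()
    {
        // In-memory stand-ins for the two resource files (made-up names).
        XDocument baseDoc = XDocument.Parse(
            "<root><data name='Greeting'/><data name='Farewell'/></root>");
        XDocument spanishDoc = XDocument.Parse(
            "<root><data name='Greeting'/></root>");

        var baseNames = baseDoc.Descendants("data")
                               .Select(d => d.Attribute("name").Value);
        var spanishNames = spanishDoc.Descendants("data")
                                     .Select(d => d.Attribute("name").Value);

        // Only a description of the work so far; nothing has been read yet.
        var missing = baseNames.Except(spanishNames);

        // Add an entry AFTER the query was defined.
        baseDoc.Root.Add(new XElement("data", new XAttribute("name", "Thanks")));

        // ToList() enumerates the query, so it sees the late addition too.
        return missing.ToList();
    }

    public static void Main()
    {
        // Prints Farewell, then Thanks.
        foreach (string name in Run())
        {
            Console.WriteLine(name);
        }
    }
}
```

The flip side is that enumerating the query twice does the work twice, so if the results are needed more than once it is worth calling ToList() and keeping the list.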

I did this pretty quickly using Snippet Compiler to get what I needed, but it’s on my list to someday turn this into a PowerShell script so I can just point it at any two resource files.
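Until that script exists, one way to get the "point it at any two files" behavior is to wrap the same idea in a method that takes the two paths. This is only a sketch: ResxDiff and FindMissing are names I made up, and it assumes every data element has a unique name attribute and a value child, which holds for ordinary string resources but is worth checking for anything else. It also returns the default text alongside each name, since that is what the translator actually needs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ResxDiff
{
    // Hypothetical helper: returns name -> default text for every <data>
    // entry found in the base file but not in the localized file.
    public static Dictionary<string, string> FindMissing(
        string basePath, string localizedPath)
    {
        // Names already present in the localized file.
        var localizedNames = new HashSet<string>(
            XDocument.Load(localizedPath)
                     .Descendants("data")
                     .Select(d => (string)d.Attribute("name")));

        // Assumes names are unique; ToDictionary throws on duplicates.
        return XDocument.Load(basePath)
                        .Descendants("data")
                        .Where(d => !localizedNames.Contains((string)d.Attribute("name")))
                        .ToDictionary(
                            d => (string)d.Attribute("name"),
                            d => (string)d.Element("value") ?? string.Empty);
    }

    public static void Main(string[] args)
    {
        if (args.Length != 2)
        {
            Console.Error.WriteLine("usage: ResxDiff <base.resx> <localized.resx>");
            return;
        }

        // One tab-separated line per missing resource: name, default text.
        foreach (var pair in FindMissing(args[0], args[1]))
        {
            Console.WriteLine(pair.Key + "\t" + pair.Value);
        }
    }
}
```

Swapping the two arguments answers the opposite question: which names exist only in the localized file, which usually means a resource was renamed or deleted from the base file.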