One of the great benefits I have from working in our Neuchâtel office is my proximity to a great many talented members of Autodesk’s Worldwide Localization team, who are responsible for translating most of our products into various languages. Over the last few months, I’ve been working even more closely than usual with that team, mainly looking at ways Autodesk might broaden the ability to localize our software.
A couple of hot topics are of particular interest, these days, in the world of localization: machine translation and crowdsourcing. The implementation I’ll be showing over the next few posts actually hits on both of these points with a view – quite specifically – to enabling the automatic translation (and eventually crowdsourced editing) of tooltips in AutoCAD-based products.
In the last few posts, we looked at the ability to read and modify AutoCAD’s tooltips. This post extends that implementation to hook in a machine translation engine – specifically Microsoft’s Bing Translator – to provide the ability to auto-translate AutoCAD’s tooltips into one of 35 different languages.
Now for a few words on machine translation. Apparently the real innovation in this domain, in recent years, has been the understanding that statistical methods such as Statistical Machine Translation (SMT) can generate accurate translations much more effectively than rule-based approaches such as natural language processing. Autodesk makes quite heavy use of SMT – for instance we use it widely as a productivity tool for our translators, as you’ll hear Mirko Plitt talk about on YouTube – and we even have our own SMT engine with its own API.
To make SMT work well, you need to have “translation memories” derived from sets of your company’s (or domain’s) content translated into various languages for comparison. For instance, we have our user documentation in various languages, which has been used to “train” our engine (although that implies some intelligence – SMT isn’t about rules as much as it is about statistics… in spite of being quite clever, in terms of the implementation and the quality of the results, the translation process is actually rather dumb). This means our engine supports the languages we currently translate our software into, but has little or no support for those that we don’t. That is one of the reasons I turned to Microsoft’s implementation, which gives the ability to translate English into 35 different languages (even if it’s likely to handle Autodesk terminology less well than our own implementation). At some point I’ll no doubt end up with a hybrid approach, which uses Autodesk’s engine for languages we support and Microsoft’s for others. But that’s for another day.
Another reason for going with Microsoft, at the moment, is the recent brouhaha around Google’s plan to start charging for the use of the Google Translate API (although at least they’re not simply shutting it down, as was originally announced). Microsoft has taken the opportunity to score a (seemingly rare, these days) PR win over Google by reiterating their commitment to providing a free translation API.
Anyway, enough talk – now for some code. :-) Here’s the C# code that hooks Bing Translator into our tooltip modification approach:
using Autodesk.AutoCAD.ApplicationServices;
using Autodesk.AutoCAD.EditorInput;
using Autodesk.AutoCAD.Runtime;
using Autodesk.Windows;
using System.Runtime.Serialization;
using System.Collections.Generic;
using System.Windows.Documents;
using System.Windows.Controls;
using System.Windows;
using System.Text;
using System.Linq;
using System.Net;
using System.Xml;
using System.IO;
using System;
[assembly: ExtensionApplication(typeof(TranslateTooltips.Commands))]
namespace TranslateTooltips
{
public class Commands : IExtensionApplication
{
// Keep track of currently translated items
static List<string> _handled = null;
// Our source and target languages
static string _srcLang = "en";
static string _trgLang = "";
public void Initialize()
{
HijackTooltips();
}
public void Terminate()
{
}
[CommandMethod("ADNPLUGINS", "TRANSTIPS", CommandFlags.Modal)]
public static void ChooseTranslationLanguage()
{
Document doc =
Autodesk.AutoCAD.ApplicationServices.Application.
DocumentManager.MdiActiveDocument;
Editor ed = doc.Editor;
// Get the list of language codes and their corresponding
// names
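List<string> codes = GetLanguageCodes();
// A null list means the request failed (the error has
// already been reported), so there is nothing to choose from
if (codes == null)
return;
string[] names = GetLanguageNames(codes);
// Make sure we have as many names as languages supported
if (names != null && codes.Count == names.Length)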
{
// Ask the user to select a target language
string lang =
ChooseLanguage(ed, codes, names, _trgLang, false);
// If the language code returned is empty or the same
// as the source, turn off translations
if (lang == "" || lang == _srcLang)
{
ed.WriteMessage(
"\nTooltip translation is turned off."
);
_trgLang = "";
}
else if (lang != null)
{
// Otherwise get the name corresponding to a language
// code
string name =
names[
codes.FindIndex(0, x => x == lang)
];
// Print it to the user
ed.WriteMessage(
"\nTooltips will be translated into {0}.\n", name
);
// Set the new target language
_trgLang = lang;
}
}
}
private static string ChooseLanguage(
Editor ed, List<string> codes, string[] names,
string lang, bool source
)
{
// First option (0) is to unselect
ed.WriteMessage("\n0 None");
// The others (1..n) are the languages
// available on the server
for (int i = 0; i < names.Length; i++)
{
ed.WriteMessage("\n{0} {1}", i + 1, names[i]);
}
// Adjust the prompt based on whether selecting
// a source or target language
PromptIntegerOptions pio =
new PromptIntegerOptions(
String.Format(
"\nEnter number of {0} language to select: ",
source ? "source" : "target"
)
);
// Add each of the codes as hidden keywords, which
// allows the user to also select the language using
// the language code (good for scripting on startup,
// to avoid having to hard code a number)
foreach (string code in codes)
{
pio.Keywords.Add(code, code, code, false, true);
}
// Set the bounds and the default value
pio.LowerLimit = 0;
pio.UpperLimit = names.Length;
if (codes.Contains(lang))
{
pio.DefaultValue =
codes.FindIndex(0, x => lang == x) + 1;
}
else
{
pio.DefaultValue = 0;
}
pio.UseDefaultValue = true;
// Get the selection
PromptIntegerResult pir = ed.GetInteger(pio);
string resLang = null;
if (pir.Status == PromptStatus.Keyword)
{
// The code was entered as a string
if (!codes.Contains(pir.StringResult))
{
ed.WriteMessage(
"\nNot a valid language code."
);
resLang = null;
}
else
{
resLang = pir.StringResult;
}
}
else if (pir.Status == PromptStatus.OK)
{
// A number was selected
if (pir.Value == 0)
{
// A blank string indicates none
resLang = "";
}
else
{
// Otherwise we return the corresponding
// code
resLang = codes[pir.Value - 1];
}
}
return resLang;
}
public static void HijackTooltips()
{
// No document access is needed here, which matters because
// Initialize() can run before any drawing is active
_handled = new List<string>();
// Respond to an event fired when any tooltip is
// displayed inside AutoCAD
Autodesk.Windows.ComponentManager.ToolTipOpened +=
(s, e) =>
{
if (!String.IsNullOrEmpty(_trgLang))
{
// The outer object is of an Autodesk.Internal
// class, hence subject to change
Autodesk.Internal.Windows.ToolTip tt =
s as Autodesk.Internal.Windows.ToolTip;
if (tt != null)
{
if (tt.Content is RibbonToolTip)
{
// Enhanced tooltips
RibbonToolTip rtt = (RibbonToolTip)tt.Content;
rtt.Content =
TranslateIfString(
rtt.Content, rtt.Command
);
TranslateObjectContent(
rtt.Content, rtt.Command
);
// Translate any expanded content
// (adding a suffix to the ID to
// distinguish from the basic content)
rtt.ExpandedContent =
TranslateIfString(
rtt.ExpandedContent, rtt.Command + "-x"
);
TranslateObjectContent(
rtt.ExpandedContent, rtt.Command + "-x"
);
}
else if (tt.Content is UriKey)
{
// This is called once for tooltips that
// need to be resolved by the system
// Here we close the current tooltip and
// move the cursor to 0,0 and back again,
// to cause the resolved tooltip to be
// displayed, which will call this event
// again following a different path
tt.Close();
System.Drawing.Point pt =
System.Windows.Forms.Cursor.Position;
System.Windows.Forms.Cursor.Position =
System.Drawing.Point.Empty;
System.Windows.Forms.Application.DoEvents();
System.Windows.Forms.Cursor.Position = pt;
}
else
{
// A basic, string-only tooltip
tt.Content = TranslateIfString(tt.Content, null);
}
}
}
};
}
private static object TranslateIfString(
object obj, string id
)
{
// If the object passed in is a string,
// return its translation to the caller
object ret = obj;
if (obj is string)
{
string trans =
TranslateContent((string)obj, id);
if (!String.IsNullOrEmpty(trans))
{
ret = trans;
MarkAsTranslated(id);
}
}
return ret;
}
private static void TranslateObjectContent(
object obj, string id
)
{
// Translate more complex objects and their
// contents
if (obj != null)
{
if (obj is TextBlock)
{
// Translate TextBlocks
TextBlock tb = (TextBlock)obj;
TranslateTextBlock(tb, id);
}
else if (obj is StackPanel)
{
// And also handle StackPanels of content
StackPanel sp = (StackPanel)obj;
TranslateStackPanel(sp, id);
}
}
}
private static void TranslateTextBlock(
TextBlock tb, string id
)
{
// Translate a TextBlock
string trans =
TranslateContent(tb.Text, id);
if (!String.IsNullOrEmpty(trans))
{
tb.Text = trans;
MarkAsTranslated(id);
}
}
private static void TranslateStackPanel(
StackPanel sp, string id
)
{
// Translate a StackPanel of content
TextBlock tb;
foreach (UIElement elem in sp.Children)
{
tb = elem as TextBlock;
if (tb != null)
{
TranslateTextBlock(tb, id);
}
else
{
FlowDocumentScrollViewer sv =
elem as FlowDocumentScrollViewer;
if (sv != null)
{
TranslateFlowDocumentScrollViewer(
sv, id
);
}
}
}
}
private static void TranslateFlowDocumentScrollViewer(
FlowDocumentScrollViewer sv, string id
)
{
// Translate a FlowDocumentScrollViewer, which
// hosts content such as bullet-lists in
// certain tooltips (e.g. for HATCH)
int n = 0;
Block b = sv.Document.Blocks.FirstBlock;
while (b != null)
{
List l = b as List;
if (l != null)
{
ListItem li = l.ListItems.FirstListItem;
while (li != null)
{
Block b2 = li.Blocks.FirstBlock;
while (b2 != null)
{
Paragraph p = b2 as Paragraph;
if (p != null)
{
Inline i = p.Inlines.FirstInline;
while (i != null)
{
string contents =
i.ContentStart.GetTextInRun(
LogicalDirection.Forward
);
// We need to suffix the IDs to
// keep them distinct
string trans =
TranslateContent(
contents, id + n.ToString()
);
if (!String.IsNullOrEmpty(trans))
{
i.ContentStart.DeleteTextInRun(
contents.Length
);
i.ContentStart.InsertTextInRun(trans);
MarkAsTranslated(id + n.ToString());
}
n++;
i = i.NextInline;
}
}
b2 = b2.NextBlock;
}
li = li.NextListItem;
}
}
b = b.NextBlock;
}
}
private static void MarkAsTranslated(string id)
{
// Mark an item as having been translated
if (!String.IsNullOrEmpty(id) && !_handled.Contains(id))
_handled.Add(id);
}
private static void UnmarkAsTranslated(string id)
{
// Remove an item from the list of marked items
if (!String.IsNullOrEmpty(id) && _handled.Contains(id))
_handled.Remove(id);
}
private static bool AlreadyTranslated(string id)
{
// Check the list, to see whether an item has been
// translated
return _handled.Contains(id);
}
private static string TranslateContent(
string contents, string id
)
{
string res = contents;
if (!AlreadyTranslated(id))
{
res =
GetTranslatedText(_srcLang, _trgLang, contents);
if (String.IsNullOrEmpty(res))
{
res = contents;
}
}
return res;
}
// Replace the following string with the AppId you receive
// from the Bing Developer Center
const string AppId =
"Kean's Application Id – please get your own :-)";
private static string GetTranslatedText(
string from, string to, string content
)
{
// Translate a string from one language to another
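// The text is escaped so that spaces, ampersands and other
// reserved characters cannot corrupt the query string
string uri =
"http://api.microsofttranslator.com/v2/Http.svc/" +
"Translate?appId=" + AppId +
"&text=" + System.Uri.EscapeDataString(content) +
"&from=" + from + "&to=" + to;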
// Create the request
HttpWebRequest request =
(HttpWebRequest)WebRequest.Create(uri);
string output = null;
WebResponse response = null;
try
{
// Get the response
response = request.GetResponse();
using (Stream strm = response.GetResponseStream())
{
// Extract the results string
DataContractSerializer dcs =
new DataContractSerializer(typeof(string));
output = (string)dcs.ReadObject(strm);
}
}
catch (WebException e)
{
ProcessWebException(
e, "\nFailed to translate text."
);
}
finally
{
if (response != null)
{
response.Close();
response = null;
}
}
return output;
}
private static List<string> GetLanguageCodes()
{
// Get the list of language codes supported
string uri =
"http://api.microsofttranslator.com/v2/Http.svc/" +
"GetLanguagesForTranslate?appId=" + AppId;
// Create the request
HttpWebRequest request =
(HttpWebRequest)WebRequest.Create(uri);
WebResponse response = null;
List<String> codes = null;
try
{
// Get the response
response = request.GetResponse();
using (Stream stream = response.GetResponseStream())
{
// Extract the list of language codes
DataContractSerializer dcs =
new DataContractSerializer(typeof(List<String>));
codes = (List<String>)dcs.ReadObject(stream);
}
}
catch (WebException e)
{
ProcessWebException(
e, "\nFailed to get target translation languages."
);
}
finally
{
if (response != null)
{
response.Close();
response = null;
}
}
return codes;
}
public static string[] GetLanguageNames(List<string> codes)
{
string uri =
"http://api.microsofttranslator.com/v2/Http.svc/" +
"GetLanguageNames?appId=" + AppId + "&locale=en";
// Create the request
HttpWebRequest req =
(HttpWebRequest)WebRequest.Create(uri);
req.ContentType = "text/xml";
req.Method = "POST";
// Encode the list of language codes
DataContractSerializer dcs =
new DataContractSerializer(typeof(string[]));
WebResponse response = null;
try
{
// Send the codes from inside the try block, as
// GetRequestStream() can itself throw a WebException
using (Stream stream = req.GetRequestStream())
{
dcs.WriteObject(stream, codes.ToArray());
}
// Get the response
response = req.GetResponse();
using (Stream stream = response.GetResponseStream())
{
// Extract the list of language names
string[] results = (string[])dcs.ReadObject(stream);
string[] names =
results.Select(x => x.ToString()).ToArray();
return names;
}
}
catch (WebException e)
{
ProcessWebException(
e, "\nFailed to get target language."
);
}
finally
{
if (response != null)
{
response.Close();
response = null;
}
}
return null;
}
private static void ProcessWebException(
WebException e, string message
)
{
// Provide information regarding an exception
Document doc =
Autodesk.AutoCAD.ApplicationServices.Application.
DocumentManager.MdiActiveDocument;
Editor ed = doc.Editor;
ed.WriteMessage("{0}: {1}", message, e.ToString());
// Obtain detailed error information
// e.Response is null when no response was received at all
// (a timeout or DNS failure, for instance), so check it
// before trying to read from it
string strResponse = string.Empty;
using (
HttpWebResponse response =
e.Response as HttpWebResponse
)
{
if (response != null)
{
using (
StreamReader sr =
new StreamReader(
response.GetResponseStream(),
System.Text.Encoding.ASCII
)
)
{
strResponse = sr.ReadToEnd();
}
}
}
// Print it to the user (e.Status is the WebExceptionStatus,
// not the numeric HTTP status code)
ed.WriteMessage(
"\nStatus={0}, error message={1}",
e.Status, strResponse
);
}
}
}
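One detail of the Translate request is worth illustrating on its own: tooltip text regularly contains spaces, ampersands and other characters that are not safe inside a query string. Here is a minimal sketch of building the request URI with each parameter escaped. The TranslateUri class and its Build method are hypothetical names used for illustration; only the endpoint and the parameter names come from the code above.

```csharp
using System;

public static class TranslateUri
{
    // Endpoint taken from the listing above
    const string Base =
        "http://api.microsofttranslator.com/v2/Http.svc/Translate";

    // Build the request URI, escaping every parameter value so that
    // reserved characters stay inside the value they belong to
    public static string Build(
        string appId, string text, string from, string to)
    {
        return Base +
            "?appId=" + Uri.EscapeDataString(appId) +
            "&text=" + Uri.EscapeDataString(text) +
            "&from=" + Uri.EscapeDataString(from) +
            "&to=" + Uri.EscapeDataString(to);
    }
}
```

Without the escaping, a tooltip such as "Trim & Extend" would be cut short at the ampersand, because the server would read the rest as a separate parameter.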
I’ve done my best to document the above code, so won’t now go into the details. I tried hard to support the various types of tooltip I could find in the product, but please do post a comment if you find something that doesn’t work for you. Please note that this implementation is for English versions of AutoCAD and also requires AutoCAD to be restarted when you change languages (as modified tooltips keep their values – unless we store the original text, we can’t retranslate). Both of these will hopefully be addressed in the next post in this series.
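To make the restart limitation concrete, here is one way the original text might be kept around so that a tooltip could be retranslated after a language change. This is only a sketch under my own assumptions: OriginalTextStore, Remember and SourceFor are hypothetical names, and the IDs are assumed to be the same ones passed around in the code above (the command name plus any suffix).

```csharp
using System;
using System.Collections.Generic;

public class OriginalTextStore
{
    // Maps a tooltip ID to the untranslated text first seen for it
    private readonly Dictionary<string, string> _originals =
        new Dictionary<string, string>();

    // Record the source text the first time an ID is seen;
    // later calls for the same ID are ignored
    public void Remember(string id, string text)
    {
        if (!String.IsNullOrEmpty(id) && !_originals.ContainsKey(id))
            _originals[id] = text;
    }

    // Return the text to send for translation: the stored original
    // if there is one, otherwise whatever the tooltip shows now
    public string SourceFor(string id, string current)
    {
        string orig;
        if (!String.IsNullOrEmpty(id) &&
            _originals.TryGetValue(id, out orig))
            return orig;
        return current;
    }
}
```

With something like this in place, the translation call would be given SourceFor(id, currentText) rather than the tooltip's current contents, so switching from Thai to French would translate the English original rather than the Thai text. Tooltips without an ID (the basic, string-only ones) would still need another approach.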
Here’s what happens when we run the TRANSTIPS command to turn on tooltip translation. Firstly we get presented with the list of languages the Bing server tells us it supports:
Command: TRANSTIPS
0 None
1 Arabic
2 Bulgarian
3 Catalan
4 Chinese Simplified
5 Chinese Traditional
6 Czech
7 Danish
8 Dutch
9 English
10 Estonian
11 Finnish
12 French
13 German
14 Greek
15 Haitian Creole
16 Hebrew
17 Hungarian
18 Indonesian
19 Italian
20 Japanese
21 Korean
22 Latvian
23 Lithuanian
24 Norwegian
25 Polish
26 Portuguese
27 Romanian
28 Russian
29 Slovak
30 Slovenian
31 Spanish
32 Swedish
33 Thai
34 Turkish
35 Ukrainian
36 Vietnamese
Enter number of target language to select <0>:
We can select Thai (for instance) either by its number (33) or by its language code (th), if we know it. I added this capability to make it more straightforward for startup scripts to select the translation language via the TRANSTIPS command without relying on the languages being presented with the same numbers (something we have no control over).
Enter number of target language to select <0>: 33
Tooltips will be translated into Thai.
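The number-or-code selection can be sketched independently of AutoCAD's prompt classes. The LanguagePicker class and its Resolve method below are hypothetical and simply mirror the rules used by the command: 0 means none, 1..n picks from the list, and anything else is treated as a language code.

```csharp
using System;
using System.Collections.Generic;

public static class LanguagePicker
{
    // Returns "" for "None", a language code for a valid choice,
    // or null when the input matches nothing in the list
    public static string Resolve(string input, IList<string> codes)
    {
        int n;
        if (int.TryParse(input, out n))
        {
            // Numeric input: 0 turns translation off, 1..n selects
            if (n == 0)
                return "";
            return (n >= 1 && n <= codes.Count) ? codes[n - 1] : null;
        }
        // Otherwise the input must be one of the codes themselves
        return codes.Contains(input) ? input : null;
    }
}
```

A script that enters th keeps working even if the server later inserts a new language ahead of Thai, whereas one that enters 33 would silently select something else.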
Once selected, we see tooltips from various parts of the product get translated into Thai.
From the ribbon:
From a toolbar:
From the status bar:
From dialogs:
While there are some limitations with this implementation – many of which I hope to address in future posts – I have been really impressed with the responsiveness of the translation service. There really is no noticeable – or at least annoying – lag when hooking up machine translation in this way. Although I fully admit I didn’t test it on a 56kbps dial-up connection. :-)
In the next post – in this series, at least – I’ll add local caching of translations in XML files, which will hopefully help with live language-switching, give greater responsiveness for repeated translation requests, and make it possible to enable in-product editing/approval (and eventually crowdsourcing).
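As a rough idea of what such a cache could look like, here is a minimal sketch that keeps translations in a dictionary and saves them to an XML file. It is not the implementation from the next post: the TranslationCache class, the element and attribute names, and the idea of one file per target language are all assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public class TranslationCache
{
    // Maps a tooltip ID to its translated text
    private readonly Dictionary<string, string> _items =
        new Dictionary<string, string>();

    public bool TryGet(string id, out string text)
    {
        return _items.TryGetValue(id, out text);
    }

    public void Set(string id, string text)
    {
        _items[id] = text;
    }

    // Write every entry as <item id="...">text</item>
    public void Save(string path)
    {
        new XElement("translations",
            _items.Select(kv =>
                new XElement("item",
                    new XAttribute("id", kv.Key), kv.Value))
        ).Save(path);
    }

    // Read a file written by Save(), skipping items without an id
    public static TranslationCache Load(string path)
    {
        var cache = new TranslationCache();
        foreach (XElement e in XElement.Load(path).Elements("item"))
        {
            string id = (string)e.Attribute("id");
            if (id != null)
                cache.Set(id, e.Value);
        }
        return cache;
    }
}
```

Checking TryGet before calling the web service would avoid repeated requests for the same tooltip, and because the file is plain XML the cached strings could be corrected by hand, which is the starting point for editing and approval.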
If any of you are interested in testing this out for your own language and letting me know how you find the results, I’d certainly appreciate the feedback. At some point I see this evolving into a Plugin of the Month, but I’d certainly like to hear what this blog’s readers think about it, in the meantime.
Update
Our localization team's SMT engine is now being maintained for internal use only, so I've removed the (now broken) links to it from the above post. We are working towards another way of providing a "preferred" SMT service for Autodesk-centric content: more on that, in due course.