This is an interesting topic – and one that I’m far from being expert in – so it would be great if readers could submit comments with additional information.
Intellectual property protection is a major concern for software developers, and issues that are seen today with .NET languages have been troubling AutoCAD developers since the introduction of AutoLISP.
So, what are these issues?
As a professional software developer, if you ship source-code to your customers there is substantial risk of it being borrowed or stolen for use in other unlicensed situations. This is true if you ship the actual source code used to build your modules, or if the “compiled” modules are actually not fully compiled, but (for example) stored in a CPU-independent, intermediate language that can quite easily be decompiled and have source code reconstituted or reverse-engineered (albeit without comments and usually without the original symbol names).
ObjectARX
This is much less of an issue with languages that are compiled to CPU-specific executable or machine code, such as for ObjectARX (and before it, ADS): the output from a C++ compiler does not, for instance, included any source code, unless you choose to ship debugging-related files such as PDBs (files which include information about function prototypes – information that is considered by some software providers as intellectual property in and of itself).
LISP
Intellectual property protection was historically a big issue with AutoLISP: in its original incarnation (i.e. before the integration of Visual LISP) LISP code was always interpreted by AutoCAD. Interpreted languages do not have a compilation phase – the code is “made sense of” by the interpreter at runtime. There were, however, things that could be done to obfuscate and protect the code before distribution. Most developers used these two tools on their code before shipping it:
• The Kelvinator – this nifty tool did a few things to obfuscate AutoLISP code:
Removed unnecessary whitespace
Stripped out comments
Mangled “internal” symbols (those not exposed as external variables or function names)
The Kelvinator was apparently written by Kelvin R. Throop, along with a number of other AutoLISP utilities. I wasn't around at the time (it was written in the late 80s and I joined in the mid 90s), so without ever having met Kelvin I do wonder whether he really exists, or whether this is just another use of the pseudonym used in a number of Science Fiction stories over the 20 or so years preceding the Kelvinator's development. This theory seems to be confirmed by this memo in the Autodesk File on John Walker's Fourmilab site. But if the real Kelvin R. Throop could step forward (or if someone else could confirm his existence or identity), then I'd willingly present my humble apologies. :-)
In terms of its functionality the Kelvinator was really quite effective: the only step that could be effectively reversed was the whitespace removal – by any LISP pretty printer. Getting meaningful comments & function/variable names back was nigh on impossible (although you could work out what certain variables were used for in a particular algorithm pretty easily).
• Protect.exe
This tool used very lightweight encryption to stop LISP files from being read by plain ASCII tools. From memory, I think a bit was reversed in each ASCII character – it was that level of encryption – which also meant that unprotection tools were also available (including one of the “c” versions of R13, if I remember correctly, which would print protected LISP as plain text to the AutoCAD text window :-).
Most professional developers at the time would Kelvinate and then Protect their LISP code before distributing it. Other options were to use LISP compilation – an early compiler was available pre-R13 versions, and Vital LISP (which later became Visual LISP) provided compilation capabilities.
Visual LISP changed the game for LISP developers: it allowed them to properly compile LISP into a protected format. Visual LISP supports two modes of compilation: each LISP file can be compiled to a format called FAS, which can be loaded directly into AutoCAD, and these can in turn be packaged into VLX modules. Originally Visual LISP allowed compilation to ARX, but these were essentially copies of the VL runtime with the compiled code stored as a resource, and clearly it’s more efficient to not force distribution of the runtime when it's in any case a standard component.
Over the years I've not come across anyone raising concerns about the security of the FAS or VLX formats, which leads me to believe they're a "secure" way to protect and distribute LISP modules (which ultimately means the effort required to make sense of the underlying logic of the application would be too high for it to be considered a useful practice for those developers unscrupulous enough to otherwise attempt it).
VB6
The VB6 IDE allows compilation to executable code – albeit code that requires a runtime component – and, once again, to the best of my knowledge this code is considered "secure” (see above note).
VBA
The VBA component embedded in AutoCAD allows password protection of DVB files (see the Protection tab on the Project Properties dialog in the VBA IDE). This password protection uses standard encryption to lock away source code from prying eyes. I understand that cracking tools are available for protected VBA modules, but there is no secret back-door – my team occasionally gets asked to help access source code stored in protected DVB files (as the person working on the code left the company without telling anyone the password for the module – or at least that’s what we’re told :-). Ultimately there’s nothing we can do to help in these cases, unfortunately. The cracking tools that are available seem to work on a brute force principle – they will cycle through possible passwords until one works – so if you want to really protect your code I’d recommend using a nice, long password.
.NET
So – life is basically good for LISP, ObjectARX, VBA and VB6 applications… what about .NET?
One of the major concerns shared by many .NET developers is around code security. Managed code gets compiled into assemblies that contain Microsoft Intermediate Language (MSIL) - basically CPU-independent instructions - in addition to metadata describing types, members and references to code in other assemblies. At runtime the MSIL gets converted to CPU-specific instructions by a just-in-time (JIT) compiler.
These assemblies are quite easy to disassemble into something resembling the original source code (albeit – once again – without comments or meaningful variable names, but internal function names and types get maintained).
Just as the Kelvinator was available for AutoLISP, there are obfuscation tools available for .NET languages. Visual Studio ships with one called the Dotfuscator Community Edition: this tools works on .NET assemblies, reducing the ease with which someone could reverse-engineer the source code.
At a basic level the tool strips namespace and member names - replacing them with symbols such as "a", "b", "c", ... - but it can only do so much to hide the logic of the code's execution, especially if there is no great dependency on functions and sub-routines (the more linear the code, the easier it is to understand, in my brief experience of analysing obfuscated code). That said, the Dotfuscator tool does have a number of options I haven't explored in depth, so it may well provide additional, helpful capabilities.
From taking a cursory look into what information is available by default in .NET assemblies, it's become clear to me that this should be an area of concern for anyone serious about protecting their source code investment. I strongly recommend that any professional software developer working with .NET spend time investigating obfuscation technologies for .NET assemblies. This guide, although a little dated, does seem to provide a fairly good background to .NET obfuscation, as well as presenting a number of different vendors' offerings.
In my next post I'm going to take a look at the Reflector tool - a very useful tool that can be used to disassemble .NET assemblies - and discuss how it might be used to help further one's understanding of .NET development.