Obfuscation protects your source code from potential security threats while keeping the application's functionality in place.
Obfuscation is a process that converts your source code and data into an equivalent form that is difficult to reverse-engineer with decompiler tools, without changing the application's functionality. This article discusses what obfuscation is, what its benefits are, and how you can use obfuscating tools in .NET, Java, and even JavaScript to protect your code from potential threats.
Note that obfuscation is not restricted to your source code only. You can use obfuscation techniques to hide your data as well. Obfuscation can actually be implemented at three levels: control flow, layout (objects, literals, variables, and method names), and data. In this article, we will restrict our discussion to code obfuscation only.
What Is Obfuscation?
Literally speaking, obfuscation is a process that makes something harder to understand. According to Wikipedia, "Obfuscation is the concealment of meaning in communication, making communication confusing, intentionally ambiguous, and more difficult to interpret."
Code obfuscation involves converting your source code into a form that becomes difficult or impossible to reverse-engineer. Reverse-engineering is a process in which the program's structure, function, and operations are analyzed. It involves a detailed step-by-step analysis of the inner workings of an application.
Code Obfuscation
Fine, but what is obfuscation of source code? Why do we require it? Obfuscation of code is typically done to hinder reverse-engineering of the executables so as to manage risks that stem from any unauthorized access to an application's source code. In essence, it is a practice that protects your intellectual property and trade secrets; you obfuscate your code to keep it from being exposed and vulnerable to potential security threats. The MSDN states, "Obfuscation is a technique that provides for seamless renaming of symbols in assemblies as well as other tricks to foil decompilers. Decompilers are tools that convert the machine or intermediate code into their high-level language counterparts. When it is properly applied, obfuscation can increase the protection against decompilation by many orders of magnitude, while leaving the application intact. Obfuscation is commonly used in Java environments and for years has been helping companies protect the intellectual property in their Java-based products."
Code obfuscation scrambles the symbols, code, and data of a program, making it much harder to reverse-engineer while preserving the application's functionality. Wikipedia states, "Obfuscated code is source code in a computer programming language that has been made difficult to understand. Programmers may deliberately obfuscate code to conceal its purpose, to deter reverse engineering, or as a puzzle or recreational challenge for readers. Programs known as obfuscators also exist, which transform human-readable code into obfuscated code using various techniques."
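To make the idea of "scrambling the symbols" concrete, here is a minimal hand-written sketch of symbol renaming. The function and variable names are hypothetical, and the renamed version was produced by hand for illustration; it is not the output of any particular obfuscator.

```javascript
// Original, readable version (names are hypothetical, for illustration only).
function computeDiscountedPrice(listPrice, discountPercent) {
  const discountAmount = listPrice * (discountPercent / 100);
  return listPrice - discountAmount;
}

// Hand-renamed equivalent: same logic, but the identifiers no longer
// reveal intent. A real obfuscator would automate this step.
function a(b, c) {
  const d = b * (c / 100);
  return b - d;
}

// Both versions return the same result for the same inputs.
console.log(computeDiscountedPrice(200, 25) === a(200, 25)); // prints true
```

The behavior is unchanged; only the clues a human reader relies on are gone.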
Benefits of Code Obfuscation at a Glance
The major benefits of obfuscation of source code are these (although there are other benefits as well):
- Protection of intellectual property
- Improved performance
- Reduced security threats
- Reduced size of the application's executable
- No specific hardware required
- No network delays
Obfuscation and Application Performance
Obfuscating tools generally improve performance of the executables by compaction, which trims unused classes, methods, instance members, and metadata information from the bytecode or Microsoft Intermediate Language (MSIL) code. This reduces the size of the executables and hence enhances performance. Compacted executables load faster and consume less memory and resources when executing. Pruning and renaming are further obfuscation steps that can shrink executable sizes and boost application performance, because unused code is dropped and long identifiers are replaced with short ones.
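The pruning step can be pictured as a reachability check over the program's call graph. The following is a toy sketch, assuming a program is modeled as a plain object that maps each function name to the names it calls; the names and the model are hypothetical, and real tools work on bytecode or MSIL and must also account for code reached through reflection.

```javascript
// Toy model of a program: each function name maps to the names it calls.
// All names here are hypothetical.
const callGraph = {
  main: ["parseInput", "render"],
  parseInput: ["trim"],
  render: [],
  trim: [],
  legacyExport: ["trim"], // never called from main
  debugDump: [],          // never called from main
};

// Return the set of functions reachable from the given entry point.
function reachableFrom(graph, entry) {
  const seen = new Set();
  const stack = [entry];
  while (stack.length > 0) {
    const name = stack.pop();
    if (seen.has(name)) continue;
    seen.add(name);
    for (const callee of graph[name] || []) stack.push(callee);
  }
  return seen;
}

// Keep only reachable functions; everything else is a pruning candidate.
function prune(graph, entry) {
  const keep = reachableFrom(graph, entry);
  return Object.keys(graph).filter((name) => keep.has(name));
}

console.log(prune(callGraph, "main"));
// prints [ 'main', 'parseInput', 'render', 'trim' ]
```

Here legacyExport and debugDump are never reached from main, so a compactor could drop them from the shipped executable.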
Why Should You Obfuscate Your Java and .NET Code?
Programs written in C# are compiled to an intermediate language, Microsoft Intermediate Language (MSIL), which runs on Microsoft .NET's Common Language Runtime (CLR). At runtime, the Just-in-Time (JIT) compiler converts the MSIL to machine language instructions. Similarly, programs written in Java are compiled to an intermediate code called bytecode, which runs on the Java Virtual Machine (JVM); here too, a JIT compiler converts the intermediate bytecode to machine code at runtime. The libraries of these languages provide support for reflection, which means you can reflect on the types and retrieve information at runtime seamlessly. Reflection is defined as the process by which a program can inspect metadata information dynamically using the reflection API.
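Reflection is easiest to appreciate with a small example. The sketch below uses JavaScript's own runtime introspection as an analogy; the Java and .NET reflection APIs differ in detail, and the LicenseChecker class is a hypothetical example.

```javascript
// A hypothetical class; the names are illustrative only.
class LicenseChecker {
  constructor(key) {
    this.key = key;
  }
  isValid() {
    return this.key.length === 16;
  }
}

// List the method names declared on a class, discovered at runtime.
function describeMethods(cls) {
  return Object.getOwnPropertyNames(cls.prototype).filter(
    (name) => name !== "constructor"
  );
}

console.log(LicenseChecker.name);             // prints LicenseChecker
console.log(describeMethods(LicenseChecker)); // prints [ 'isValid' ]
```

The type and member names survive into the running program, which is exactly the kind of metadata a decompiler can use.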
Intermediate Code Is Vulnerable to Security Threats
The JIT compilation technology supported by both .NET and Java is the reason that applications developed in these technologies are portable. The "portability" of such applications stems from the fact that the intermediate code that is generated by these technologies can be deployed on any platform that has a supporting runtime. As an example, the MSIL code generated by a .NET compiler on the Windows operating system can also run on the Linux operating system under the Mono Framework's runtime. However, the fact that Java and .NET technologies generate intermediate code on compilation (which later is converted to "machine code" using the JIT) poses a potential threat to code security. The reason is that such intermediate code is easier to reverse-engineer: it retains much of the program's structure and metadata, such as type and member names, rather than being reduced to machine code instructions. When code written in these languages is so easily exposed, how do you protect your intellectual property? Here is where obfuscation comes in.
Wikipedia states, "Obfuscating code to prevent reverse engineering is typically done to manage risks that stem from unauthorized access to source code. These risks include loss of intellectual property, ease of probing for application vulnerabilities and loss of revenue that can result when applications are reverse engineered, modified to circumvent metering or usage control and then recompiled. Obfuscating code is, therefore, also a compensating control to manage these risks. The risk is greater in computing environments such as Java and Microsoft's .NET which take advantage of just-in-time compilation technology that allow developers to deploy an application as intermediate code rather than code which has been compiled into machine language before being deployed."
Although you cannot prevent decompilation completely, you can apply a few techniques--like symbol renaming and manipulating the control flow structures--to make your code harder to understand. You can then implement your own custom class loaders that can load your assemblies or class files (intermediate code) on the fly and execute them in the context of the JVM or the CLR. You can also apply string encryption algorithms and incremental obfuscation techniques to protect your source code from prying eyes.
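As a rough illustration of hiding string literals, the sketch below stores a string as XOR-encoded numbers and decodes it only at runtime. The single-byte key and the host name are hypothetical, and XOR with a fixed key is a stand-in for whatever scheme a real tool applies; because the key ships with the program, this deters casual inspection rather than a determined attacker.

```javascript
// Hypothetical single-byte key; real tools use stronger schemes.
const KEY = 0x5a;

// Encode a string as an array of numbers by XOR-ing each UTF-16 code unit.
function encode(text) {
  const out = [];
  for (let i = 0; i < text.length; i++) {
    out.push(text.charCodeAt(i) ^ KEY);
  }
  return out;
}

// Reverse the encoding: XOR is its own inverse when the key is the same.
function decode(codes) {
  return String.fromCharCode(...codes.map((c) => c ^ KEY));
}

// The shipped program would contain only the numbers, not the literal.
const hidden = encode("license-server.example.com");
console.log(decode(hidden)); // prints license-server.example.com
```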
The MSDN states, "Control flow is a powerful obfuscation technique, the goal of which is to hide the intent of a sequence of instructions without changing the logic. More importantly, it is used to remove the clues that decompilers look for in order to faithfully reproduce high-level source code statements, such as if-then-else statements and loops. In fact, this technique tends to break decompilers."
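One common form of control flow obfuscation is control-flow flattening, sketched below by hand. This is an illustrative example of the general idea, not necessarily the specific transformation the MSDN article describes; the function names are hypothetical.

```javascript
// Straightforward version: sum the integers from 1 to n.
function sumTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}

// Flattened version: the loop is replaced by a dispatcher that jumps
// between numbered states, hiding the original loop structure.
function sumToFlattened(n) {
  let state = 0;
  let total;
  let i;
  while (true) {
    switch (state) {
      case 0: // initialization
        total = 0;
        i = 1;
        state = 2;
        break;
      case 2: // loop condition
        state = i <= n ? 1 : 3;
        break;
      case 1: // loop body and increment
        total += i;
        i++;
        state = 2;
        break;
      case 3: // exit
        return total;
    }
  }
}

console.log(sumTo(10), sumToFlattened(10)); // prints 55 55
```

The logic is identical, but a decompiler no longer sees a recognizable for loop to reconstruct.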
Downsides of Obfuscation
Obfuscation is an important technology to protect your intellectual property, but it has downsides too. Here are some of them:
- Obfuscated code can break code that depends on serialization.
- There can be conflicts with the reflection APIs.
- Debugging obfuscated executables can be difficult.
- Exception stack trace information is often lost in obfuscated code.
- Obfuscated code impacts portability.
The biggest problem with obfuscated code is debugging. Debugger tools step through the source code of a program using breakpoints while the program is executing. In this regard, Wikipedia states, "Obfuscated code is extremely difficult to debug. Variable names will no longer make sense, and the structure of the code itself will likely be modified beyond recognition. This fact generally forces developers to maintain two builds: One with the original, unobfuscated source code that can be easily debugged, and another for release. While both builds should be tested to make sure they perform identically, the second build is generally reliably constructed from the first by an obfuscator. This limitation does not apply to intermediate language (e.g., Java, C#) obfuscators, which generally work on compiled assemblies rather than on source code."
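The reflection and serialization conflicts listed above come from the same root cause: code that looks a member up by its name stops working once the name changes. Here is a minimal sketch, with hypothetical object and method names and a hand-simulated rename.

```javascript
// Original object with a readable method name (hypothetical example).
const original = {
  sendInvoice(id) {
    return "invoice " + id + " sent";
  },
};

// The same object after a hand-simulated rename of sendInvoice to "a".
const renamed = {
  a(id) {
    return "invoice " + id + " sent";
  },
};

// Name-based dispatch, similar in spirit to a reflection lookup.
function invokeByName(target, methodName, arg) {
  const method = target[methodName];
  if (typeof method !== "function") {
    throw new Error("No method named " + methodName);
  }
  return method.call(target, arg);
}

console.log(invokeByName(original, "sendInvoice", 42)); // prints invoice 42 sent
try {
  invokeByName(renamed, "sendInvoice", 42);
} catch (err) {
  console.log(err.message); // prints No method named sendInvoice
}
```

This is why obfuscating tools commonly let you exclude specific names from renaming.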
To Obfuscate or Not to Obfuscate?
The big question is, what should you obfuscate? Should you obfuscate the entire application? You seldom need to obfuscate your entire application, especially if it's a Web application. You just need to obfuscate the assemblies that contain code critical to your application. As noted earlier, obfuscating assemblies can also improve the application's performance. You should protect your application with obfuscation if it contains critical information, such as licensing policies. Note that you need not obfuscate your ASP.NET code; the assemblies stay on the server, and the users of an ASP.NET application see only the HTML output that those assemblies generate.
Factors to Consider When Selecting Obfuscating Tools
Francis Mawutor Kugblenu of Blekinge Institute of Technology states, "There are many obfuscation tools available that apply various techniques to make Reverse Engineered Bytecode more difficult to understand. While obfuscators may not provide full security, they are effective in slowing down or preventing a successful Reverse Engineering. There are some obfuscators that go further by targeting specific decompilers and causing them to crash."
When selecting the right obfuscating tool to suit your needs, consider the following:
- Performance and security
- XML-based configuration
- Support for stack trace information
- Support for troubleshooting the obfuscated executables
- Deployment support
- Support for integration with the IDEs
.NET Obfuscating Tools
There are plenty of tools available to obfuscate .NET assemblies. Amongst them, the following are noteworthy:
- Dotfuscator is a very popular tool used widely for obfuscating .NET assemblies. Designed by PreEmptive Solutions, it is tightly integrated with Visual Studio and provides code safety and faster executables.
- Salamander .NET Obfuscator from Remotesoft is another popular obfuscating tool that provides a GUI environment for obfuscating .NET assemblies. The best part is that it can operate without changing your debug information; you can use your debugger as usual without any runtime issues.
- Skater .NET Obfuscator Light Edition is free; you may use it to protect your .NET applications for personal and even for commercial purposes.
- Desaware's open-source obfuscation tool comes free with the ebook titled Obfuscating .NET. The Desaware Web site states, "In this ebook, you'll learn about a technique called Obfuscation that can help you avoid this problem by removing unnecessary information from the metadata and scrambling some of the remaining information to make disassembled components fail to recompile. And you'll receive an in depth look at one particular approach to obfuscating your .NET assemblies, along with a link to download Desaware's new open source QND-Obfuscator at no additional cost! This obfuscator is suitable for most applications and is an extraordinary value given that commercial obfuscators run many hundreds (sometimes thousands) of dollars."
Java Obfuscating Tools
Let's look at some of the most widely used tools for obfuscating Java executables.
- Zelix KlassMaster is a Java bytecode obfuscation tool. It comes with a lot of powerful features like reduced size of the bytecode, name and flow obfuscation, name exclusion, support for stack trace translation, and change logs.
- DashO for Java is a code obfuscating tool from PreEmptive Solutions that can compact and obfuscate Java executables.
- CodeShield for Java is a command-line bytecode obfuscator that can protect your intellectual property by obfuscating Java class files. It also obfuscates the control flow of Java programs. Some of the key features of CodeShield include advanced code protection techniques, reduced size of executables, support for all Java Standard or Enterprise Editions, support for JVMs on any platform, and reliability and safety. The CodeShield Web site states, "CodeShield protects your intellectual property by foiling attempts at decompiling. Java class files lend themselves to easy decompilation and inspection by many decompiler products.... This product is designed to make your easily decompiled class files secure. Besides symbol name obfuscation, CodeShield even mangles the control-flow of your program."
- yGuard is a free Java bytecode obfuscating tool that you can use to protect your Java source code from potential threats by replacing the package, class, method, and field names with characters that are difficult to understand. It also reduces the size of the class files or jar files to a great extent.
- ProGuard is a free Java class file obfuscator and optimizer. It reduces the size of the Java bytecode after obfuscating it. It can detect and remove the unused classes, fields, methods, and attributes in your compiled Java code. It also optimizes your Java bytecode and removes unused instructions. Further, it can rename the classes, methods, and fields to protect your code.
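Several of these tools address the lost stack trace problem by keeping a record of the renames so that traces can be translated back. The sketch below shows the idea only; the mapping structure, the class names, and the trace text are all hypothetical and do not reflect the actual file format of any tool listed here.

```javascript
// Hypothetical rename map: obfuscated method name -> original name.
const renameMap = new Map([
  ["a.b.c", "com.example.billing.InvoiceService.sendInvoice"],
  ["a.d.e", "com.example.billing.TaxTable.lookup"],
]);

// Replace each "at <name>(" frame with its original name when known.
function translateTrace(trace, map) {
  return trace.replace(/at ([\w.$]+)\(/g, (match, name) =>
    map.has(name) ? "at " + map.get(name) + "(" : match
  );
}

// A made-up obfuscated stack trace for illustration.
const obfuscatedTrace = [
  "java.lang.IllegalStateException: rate missing",
  "    at a.d.e(Unknown Source)",
  "    at a.b.c(Unknown Source)",
].join("\n");

console.log(translateTrace(obfuscatedTrace, renameMap));
```

Frames whose names are not in the map are left untouched, so library frames that were never renamed pass through unchanged.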
JavaScript Obfuscating Tools
JavaScript is one of the most popular Web-scripting languages ever. However, JavaScript is not compiled ahead of time into a binary; rather, the source itself is downloaded and executed in the Web browser, where anyone can read it. These are some of the most popular obfuscating tools for protecting JavaScript code:
- Use Jasob to protect and optimize your JavaScript and CSS code. A free trial copy is available for download.
- ObfuscateJS is a command-line obfuscation tool for obfuscating and compressing your JavaScript code. ObfuscateJS compresses your JavaScript code by removing the whitespace characters and comments in your code.
- Thicket Obfuscator for JavaScript is another tool for protecting, compressing, and optimizing your JavaScript code. Some of the striking features of this tool include support for faster Web page loads; the option to encrypt string literals; output encoding in ASCII, European ASCII, and UNICODE formats; and support for both command-line and GUI interfaces. Download an evaluation version of this tool.
- Stunnix JavaScript Obfuscator is another great Web-based tool that protects, encrypts, and compresses your JavaScript source code. It can work with both client- and server-based JavaScript. Download a free evaluation copy.
- The easy-to-use JavaScript Obfuscator is a free tool for obfuscating JavaScript source code. It can be used to obfuscate JavaScript local variables and JavaScript function arguments. As an example, consider the following script block:
var myVariable="Welcome to MC Press Online!";
function DisplayMessage(message)
{
alert(message);
}
When you obfuscate the above code snippet using this free online tool, here is the resultant script:
var _0x35fe=["\x57\x65\x6C\x63\x6F\x6D\x65\x20\x74\x6F\x20\x4D\x43\x20\x50\x72\x65\x73\x73\x20\x4F\x6E\x6C\x69\x6E\x65\x21"];var myVariable=_0x35fe[0x0];function DisplayMessage(_0x3cd2x3){alert(_0x3cd2x3);} ;
Cool, isn't it? It's completely free. Try it!
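Keep in mind that hex-escaped string literals like these only hide text from a casual glance; to the JavaScript engine they are ordinary strings. The short sketch below shows the equivalence and a hypothetical helper, toHexEscapes, that produces the escaped form for characters with codes below 256. It is an illustration, not the tool's actual implementation.

```javascript
// In JavaScript source, "\x57" is just another way to write "W".
const escaped = "\x57\x65\x6C\x63\x6F\x6D\x65";
console.log(escaped === "Welcome"); // prints true

// Produce the escaped form of a string, as such tools might.
// Assumes every character code is below 256 (two hex digits).
function toHexEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const hex = text.charCodeAt(i).toString(16).toUpperCase();
    out += "\\x" + hex.padStart(2, "0");
  }
  return out;
}

console.log(toHexEscapes("MC Press")); // prints \x4D\x43\x20\x50\x72\x65\x73\x73
```

Anyone can recover the original text by evaluating the literal, so treat this kind of encoding as a speed bump rather than real protection.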
Suggested Readings
Here are links to some good references on this topic for further reading:
- http://en.wikipedia.org/wiki/Obfuscated_code
- http://msdn.microsoft.com/en-us/magazine/cc164058.aspx
- http://www.dmoz.org/Computers/Programming/Languages/Java/Development_Tools/Obfuscators/
- http://msdn.microsoft.com/en-us/library/ms227240.aspx
- http://www.cs.princeton.edu/~boaz/Papers/obf_informal.html
Protect Your Code
Obfuscation is a technology that protects your source code from potential security threats while at the same time keeping the application's functionality in place. Obfuscators generally strip out all the debug information from your code and rename packages, classes, methods, literals, etc. to names that are hard to interpret or understand easily. In this article, we explored obfuscation and its benefits and examined the features of some of the widely used .NET, Java, and JavaScript obfuscating tools.
MC Press Online