this post was submitted on 10 Mar 2024
No Stupid Questions (Developer Edition)
Compiled binaries can be decompiled back into source code. It's not perfect by any means, but I was very surprised how well it worked the first time I decompiled a .NET application. With this as your base, you can then make changes and recompile a new binary. This glosses over a lot of detail, and there are other routes too, like obtaining a leaked copy of the source code.
Yeah, it's particularly easy with Java and C#, as they don't compile all the way to machine code, but rather just to an intermediate representation (bytecode).
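To make that concrete, here's a rough sketch of how much source-level information bytecode tends to keep. It uses Python's own bytecode as a stand-in, which is an assumption on my part: Java and C# have different formats, but the idea is similar. The function `apply_discount` is made up for illustration.

```python
import dis


def apply_discount(price, rate):
    # Hypothetical example function; any function would do.
    discounted = price * (1 - rate)
    return round(discounted, 2)


code = apply_discount.__code__

# The compiled code object still carries the names from the source.
print(code.co_name)      # name of the function
print(code.co_varnames)  # argument and local variable names
print(code.co_names)     # global names the function refers to

# Human-readable listing of the bytecode instructions.
dis.dis(apply_discount)
```

With names like these still in the compiled form, a decompiler has a lot to work with, which is roughly why the output for bytecode languages can look close to the original.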
The reason this works well for certain applications and not others comes down to programming language / framework and compilation optimization.
If the application was compiled directly into an optimized native executable, it can still be decompiled, but the output won't be very human-readable: variable names, comments, and much of the original structure are usually gone. Programmers have to delve in and manually trace the code paths to figure out how it works. Fun fact: this is how a lot of the retro game decompilation projects work. Teams of volunteers go through the unreadable decompilations and work together to figure them out.
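A tiny illustration of why optimization gets in the way, again using CPython as a stand-in (an assumption, since native compilers go much further than this). CPython folds constant arithmetic at compile time, so the compiled code only holds the result and the original expression can't be recovered from it. The variable name `seconds_per_day` is just an example.

```python
import dis

# Source with an expression a human wrote for readability.
source = "seconds_per_day = 60 * 60 * 24"
code = compile(source, "<example>", "exec")

# The folded result is stored as a constant in the compiled code.
print(code.co_consts)

# The instruction listing loads the folded value instead of multiplying.
dis.dis(code)
```

Note that the variable name still survives here. An optimized, stripped native binary usually loses that too, on top of inlining and reordering code.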
Dotnet and Java based applications are easier, because they usually aren't compiled directly into machine-executable binaries (and even when they are, some setups still ship the intermediate code alongside). Instead, both are compiled to an intermediate language (IL) that is lower-level than the source but still keeps a lot of it, such as class and method names, and that IL is then run by a runtime. Dotnet's IL is called Common Intermediate Language (CIL) and Java's is called Java bytecode. This sounds weird, but it's kinda cool, because it lets people create new languages without having to write a compiler that targets machine code. They just need a compiler that produces the intermediate language, and the existing runtime can take it from there.
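Here's a toy sketch of that "many languages, one runtime" idea. Everything in it is invented for illustration: the IL is just a list of `(opcode, operand)` tuples, and the two "languages" are postfix and Lisp-style prefix arithmetic. Real CIL and Java bytecode are far richer, but the shape is the same.

```python
# Toy stack-based IL: a list of (opcode, operand) tuples.
OPS = {"+": "ADD", "*": "MUL"}


def compile_postfix(source):
    """Front end 1: postfix notation, e.g. '2 3 + 4 *'."""
    il = []
    for token in source.split():
        if token in OPS:
            il.append((OPS[token], None))
        else:
            il.append(("PUSH", int(token)))
    return il


def compile_prefix(source):
    """Front end 2: Lisp-style prefix notation, e.g. '(* (+ 2 3) 4)'."""
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()
    il = []

    def emit_expr(pos):
        # Emit IL for the expression at tokens[pos]; return the next position.
        if tokens[pos] != "(":
            il.append(("PUSH", int(tokens[pos])))
            return pos + 1
        op = tokens[pos + 1]
        pos = emit_expr(pos + 2)  # left operand
        pos = emit_expr(pos)      # right operand
        il.append((OPS[op], None))
        return pos + 1            # skip the closing ")"

    emit_expr(0)
    return il


def run(il):
    """The shared runtime: it only understands the IL, not either language."""
    stack = []
    for opcode, operand in il:
        if opcode == "PUSH":
            stack.append(operand)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if opcode == "ADD" else left * right)
    return stack.pop()


il_a = compile_postfix("2 3 + 4 *")
il_b = compile_prefix("(* (+ 2 3) 4)")
print(il_a == il_b)  # both front ends produce the same IL
print(run(il_a))
```

Each front end only has to know how to emit the IL. The `run` function plays the role of the runtime and never needs to change when a new language is added.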
That's because .NET (by default) compiles to IL, which the JIT later compiles to machine code at run time.
Once it's compiled to machine code, you're unlikely to get back anything close to the source. Usually you just get assembly.
Are the tools involved typically called decompilers, or would you happen to know the different names they may go by? Trying to make sure I have some solid terms to guide my own research. Thanks for the response!
Yep, decompiler is the correct term