Reverse Engineering a Binary 1
DISCLAIMER
With this paper I am not encouraging anyone to hack, destroy or steal anything. You must comply with the law, and you take full responsibility if you misuse this knowledge. With great power comes great responsibility. Reverse engineering is not always legal, so check the EULA and the laws in your country.
THE CODE
In this paper we are going to go over the reverse engineering of a simple compiled C++ binary; I have included the source code below. This program reads the user's input and compares it against the integer 2512. If it matches you get the printout “Correct!”, and if you're wrong you get “Incorrect…”:
#include <iostream>

int main() {
    using namespace std;
    int i;
    cout << "Please enter your code: ";
    cin >> i;
    if ( i == 2512 )
    {
        cout << "Correct!\n";
    } else {
        cout << "Incorrect...\n";
    }
    return 0;
}
Next we will want to compile this source into a binary, which we will then hack away at:
$ g++ -o sample sample.c++
Let's go ahead and have a look at this compiled file and see if we can read it:
$ head -n 1 sample
ELF####>#`@@?#@8@%"#@@@@@?#?##8#8#@8#@####@@4
4
`h#?# ####`#`?#?##T#T#@T#@DD#P?td#
##
What the heck is this? Well, this is now a binary file: it holds machine code and structured binary data rather than text, and as you can see it is not readable by a human.
REVERSING
Let's imagine for a moment that you wrote this program years ago and have forgotten the password. Even worse, you lost the source code, so you can't check it there. What can you do? Well, you may consider reverse engineering your program to find the password, and below we will do just that.
The first thing I will do is gain more information about the binary I am trying to reverse:
$ file sample
sample: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not
stripped
What we see above is that this is a 64-bit Linux ELF binary which is not stripped (stripping a binary removes its symbols, which makes the disassembly much harder to follow).
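Tools like file recognize a format largely from the "magic" bytes at the start of the file. For ELF these are 0x7f followed by the letters E, L and F, which is why the head output earlier began with "ELF". Here is a minimal sketch of that single check in C++; the function names are my own, and the real file command does far more than this:

```cpp
#include <array>
#include <cstddef>
#include <fstream>
#include <string>

// The four bytes every ELF file begins with: 0x7f followed by "ELF".
constexpr std::array<unsigned char, 4> kElfMagic = {0x7f, 'E', 'L', 'F'};

// Returns true if `data` starts with the ELF magic number.
// (Hypothetical helper name, for illustration only.)
bool has_elf_magic(const unsigned char* data, std::size_t size) {
    if (size < kElfMagic.size()) return false;
    for (std::size_t i = 0; i < kElfMagic.size(); ++i) {
        if (data[i] != kElfMagic[i]) return false;
    }
    return true;
}

// Reads the first four bytes of the file at `path` and checks them.
// Returns false if the file cannot be opened or is shorter than four bytes.
bool file_is_elf(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    unsigned char header[4] = {0, 0, 0, 0};
    in.read(reinterpret_cast<char*>(header), sizeof header);
    return in.gcount() == 4 && has_elf_magic(header, 4);
}
```

Calling file_is_elf("sample") on our compiled binary should return true, while a plain text file such as the source code should return false.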
Next we should check to see if there are any interesting ASCII elements in this binary:
$ strings sample
/lib64/ld-linux-x86-64.so.2
CyIk
libstdc++.so.6
__gmon_start__
_Jv_RegisterClasses
_ZNSt8ios_base4InitD1Ev
__gxx_personality_v0
_ZSt3cin
_ZNSirsERi
_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
_ZSt4cout
_ZNSt8ios_base4InitC1Ev
libm.so.6
libgcc_s.so.1
libc.so.6
__cxa_atexit
__libc_start_main
GLIBC_2.2.5
CXXABI_1.3
GLIBCXX_3.4
fff.
fffff.
l$ L
t$(L
|$0H
Please enter your code:
Correct!
Incorrect...
Alright, so it looks like this program will ask us to enter a “code” of some sort, and depending on our answer we get a “Correct!” or an “Incorrect…” (of course we know this is true, just play along). The “Code” itself does not appear among the ASCII strings, because it is compiled in as an integer rather than stored as text (in other programs it sometimes would be).
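To make it clearer what strings is doing, here is a rough C++ sketch of the idea: scan the bytes and keep every run of printable characters that is at least four long (four is the default minimum in strings). The function name printable_runs is my own, and this is a simplification, not the actual implementation of the tool:

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Collects every run of printable ASCII characters in `bytes` that is
// at least `min_len` characters long. Simplified: the real tool has
// more options (encodings, section selection, offsets).
std::vector<std::string> printable_runs(const std::string& bytes,
                                        std::size_t min_len = 4) {
    std::vector<std::string> runs;
    std::string current;
    for (unsigned char c : bytes) {
        if (std::isprint(c)) {
            current.push_back(static_cast<char>(c));
        } else {
            // A non-printable byte ends the current run.
            if (current.size() >= min_len) runs.push_back(current);
            current.clear();
        }
    }
    // Do not lose a run that reaches the end of the input.
    if (current.size() >= min_len) runs.push_back(current);
    return runs;
}
```

Run over the contents of our binary, a scan like this would pick up “Correct!” and “Incorrect...” because they are stored as text, but never 2512, which only exists as the bytes of an integer inside an instruction.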
Alright, now comes the fun part. Let's use GDB to open this binary file:
$ gdb -q sample
Reading symbols from /sample...done.
(gdb)
As we can see, the symbols were loaded successfully. This is because the binary has not been stripped of them.
Alright, let's get some useful data and disassemble the main function of this binary into assembly. This would be a good time to refresh your assembly knowledge: https://nessy.info/basics_of_assembler.pdf
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400844 <main+0>:push %rbp
0x0000000000400845 <main+1>:mov %rsp,%rbp
0x0000000000400848 <main+4>:sub $0x10,%rsp
0x000000000040084c <main+8>:mov $0x4009ec,%esi
0x0000000000400851 <main+13>:mov $0x601180,%edi
0x0000000000400856 <main+18>:callq 0x400730 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x000000000040085b <main+23>:lea -0x4(%rbp),%rax
0x000000000040085f <main+27>:mov %rax,%rsi
0x0000000000400862 <main+30>:mov $0x601060,%edi
0x0000000000400867 <main+35>:callq 0x400740 <_ZNSirsERi@plt>
0x000000000040086c <main+40>:mov -0x4(%rbp),%eax
0x000000000040086f <main+43>:cmp $0x9d0,%eax
0x0000000000400874 <main+48>:jne 0x400887 <main+67>
0x0000000000400876 <main+50>:mov $0x400a05,%esi
0x000000000040087b <main+55>:mov $0x601180,%edi
0x0000000000400880 <main+60>:callq 0x400730 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400885 <main+65>:jmp 0x400896 <main+82>
0x0000000000400887 <main+67>:mov $0x400a0f,%esi
0x000000000040088c <main+72>:mov $0x601180,%edi
0x0000000000400891 <main+77>:callq 0x400730 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400896 <main+82>:mov $0x0,%eax
0x000000000040089b <main+87>:leaveq
0x000000000040089c <main+88>:retq
End of assembler dump.
Let's go ahead and look at a few of the lines above which can help us reverse this “Code”. As I look through the assembly I notice a CMP (the compare instruction). Thinking back to how this binary works, I can see how a comparison would be needed in order to validate my “Code”.
Looking below, the immediate value $0x9d0 is compared with the contents of %eax (a 32-bit register). One line earlier, %eax was loaded from -0x4(%rbp), the same stack slot whose address was passed to the cin call, so %eax holds the number we typed:
0x000000000040086f <main+43>:cmp $0x9d0,%eax
Right after this comparison there is a jump (JNE, jump if not equal): if the two values are not equal, we jump to 0x400887. Following this jump, we land near the end of the function, where an iostream function is called and then the program ends. We can assume that if our “Code” does not match, the jump is taken and we end up at this line, where the “Incorrect…” message is displayed:
0x0000000000400874 <main+48>:jne 0x400887 <main+67>
###
0x0000000000400887 <main+67>:mov $0x400a0f,%esi
0x000000000040088c <main+72>:mov $0x601180,%edi
0x0000000000400891 <main+77>:callq 0x400730 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400896 <main+82>:mov $0x0,%eax
0x000000000040089b <main+87>:leaveq
0x000000000040089c <main+88>:retq
End of assembler dump.
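One way to sanity-check which string each mov loads is to do a little address arithmetic. The sketch below assumes the compiler stored the three string literals back to back, in source order, each followed by its terminating NUL byte. That is an assumption about the layout, not something the disassembly guarantees, and the names used here are my own:

```cpp
#include <cstddef>

// Size of a string literal in bytes, including its terminating NUL.
template <std::size_t N>
constexpr unsigned long literal_size(const char (&)[N]) {
    return N;
}

// Address loaded into %esi before the first call at <main+8>.
constexpr unsigned long kPromptAddr = 0x4009ec;

// Assumed layout: the next literal starts right after the previous one.
constexpr unsigned long kSecondAddr =
    kPromptAddr + literal_size("Please enter your code: ");
constexpr unsigned long kThirdAddr =
    kSecondAddr + literal_size("Correct!\n");
```

Under that assumption kSecondAddr works out to 0x400a05 and kThirdAddr to 0x400a0f, which are exactly the two other addresses loaded into %esi in main. That fits 0x400a0f being the “Incorrect…” string and 0x400a05 being “Correct!”.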
So if the two values compared by CMP are equal, the JNE will not be taken and we will move on to the next lines:
0x0000000000400876 <main+50>:mov $0x400a05,%esi
0x000000000040087b <main+55>:mov $0x601180,%edi
0x0000000000400880 <main+60>:callq 0x400730 <_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc@plt>
0x0000000000400885 <main+65>:jmp 0x400896 <main+82>
###
0x0000000000400896 <main+82>:mov $0x0,%eax
0x000000000040089b <main+87>:leaveq
0x000000000040089c <main+88>:retq
End of assembler dump.
Well, this looks promising. By falling through we call another iostream function, which we can assume prints the message “Correct!”. After printing this message we take a jump (JMP, an unconditional jump) down to 0x400896, where main sets its return value to 0 and returns.
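Putting the two paths together, the logic we have read out of the disassembly can be written back as C++. This is only my reconstruction of the branch structure, with a made-up function name and the messages returned instead of printed so the logic is easy to test. It is not the compiler's output, and it assumes the two strings are the ones we guessed:

```cpp
#include <string>

// Reconstruction of the check in main, following the assembly order:
//   cmp $0x9d0,%eax
//   jne <main+67>
std::string check_code(int entered) {
    if (entered != 0x9d0) {
        // jne taken: skip ahead to <main+67>
        return "Incorrect...\n";
    }
    // jne not taken: fall through to <main+50>
    return "Correct!\n";
}
```

Note that the compiler turned the source's "if equal" test into a "jump if not equal", so the failure path is the jump and the success path is the fall-through.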
At this point we conclude that we only need to convert the hexadecimal value 0x9d0 to decimal and we will have our “Code”. Using the shell's printf command with the %d (signed decimal integer) format, we can get the decimal form of this value:
$ printf '%d\n' 0x9d0
2512
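The same conversion can be done in C++ if you prefer. This small sketch uses std::stoi with base 16, and the helper name hex_to_int is my own:

```cpp
#include <string>

// Parses a hexadecimal string such as "0x9d0" or "9d0" into an int.
// By hand: 9*16^2 + 13*16^1 + 0*16^0 = 2304 + 208 + 0 = 2512.
int hex_to_int(const std::string& text) {
    // With base 16, std::stoi accepts an optional "0x" prefix.
    return std::stoi(text, nullptr, 16);
}
```

Calling hex_to_int("0x9d0") gives 2512, matching the printf result.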
This looks very promising, let's give it a try:
$ ./sample
Please enter your code: 2512
Correct!
Congratulations! You have reversed this program.