Reverse Engineer Android Applications
An introductory guide to reverse engineering Android applications

  Apr 08, 2017
  android, apk, apktool, reverse-engineer, java

As digital information becomes more valuable, companies are spending a lot of money to secure their data. Protecting mobile applications is even more crucial now that apps exist for almost everything, from booking a cab to buying a house. The goal of this article is to walk you through the very first step in hacking a mobile application: understanding its internal structure by reverse engineering its apk.

Android application package

Before we can jump into reversing an apk, we should understand how an Android app is built and what it is made of. The process of generating an apk is fairly straightforward and can be summed up nicely in the following diagram:



The result of the above process is a zip file with the .apk extension. When you unzip it, you will find the following files and directories:

$ unzip some-app.apk -d some-dir/
$ ls -la some-dir/
    total 20880
    drwxr-xr-x  13 <username>  staff   442B Apr 25 01:57 ./
    drwxr-xr-x  48 <username>  staff   1.6K Apr 25 01:57 ../
    -rw-r--r--   1 <username>  staff    39K Jan  1  2009 AndroidManifest.xml
    drwxr-xr-x   6 <username>  staff   204B Apr 25 01:57 META-INF/
    -rw-r--r--   1 <username>  staff    53B Jan  1  2009 android-support-multidex.version.txt
    drwxr-xr-x   3 <username>  staff   102B Apr 25 01:57 assets/
    -rw-r--r--   1 <username>  staff   934B Jan  1  2009 build-data.properties
    -rw-r--r--   1 <username>  staff   6.0M Jan  1  2009 classes.dex
    drwxr-xr-x   3 <username>  staff   102B Apr 25 01:57 com/
    drwxr-xr-x   3 <username>  staff   102B Apr 25 01:57 lib/
    drwxr-xr-x   3 <username>  staff   102B Apr 25 01:57 org/
    drwxr-xr-x  51 <username>  staff   1.7K Apr 25 01:57 res/
    -rw-r--r--   1 <username>  staff   4.1M Jan  1  2009 resources.arsc

File or directory, followed by its content and description:

  • AndroidManifest.xml: a binary XML file describing the name, version, access rights, referenced libraries… It can be converted into readable plaintext XML with tools such as AXMLPrinter2, apktool, or androguard.
  • META-INF: contains the manifest file MANIFEST.MF, the certificate of the application CERT.RSA, and the list of resource hashes CERT.SF.
  • assets: contains the application's assets, which can be retrieved by AssetManager.
  • classes.dex: compiled classes in the dex file format, understandable by the Dalvik VM and the Android Runtime (ART).
  • lib: contains compiled native shared libraries for different CPU architectures: armeabi, armeabi-v7a, arm64-v8a, x86, x86_64, mips.
  • res: contains resources not compiled into resources.arsc.
  • resources.arsc: contains precompiled resources, such as binary XML for layouts, styles…
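
The same layout can be inspected without extracting the archive. The following is a minimal sketch, assuming Info-ZIP's unzip is installed (its -Z1 option prints one entry name per line). The helper names list_abis and count_dex are made up for this illustration and are not part of any tool.

```shell
# List the CPU architectures (ABIs) for which native libraries are shipped.
# Reads APK entry names, one per line, on stdin.
list_abis() {
  awk -F/ '$1 == "lib" && NF >= 3 { print $2 }' | sort -u
}

# Count the dex files at the root of the archive.
# A count greater than one indicates a multidex application.
count_dex() {
  grep -c -E '^classes[0-9]*\.dex$'
}
```

Both helpers read the entry list from a pipe:

$ unzip -Z1 some-app.apk | list_abis
$ unzip -Z1 some-app.apk | count_dex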

Obtaining the target

There are a few ways to obtain the target application apk:

  • Using Android Debug Bridge (a.k.a. adb) to find the package on a connected device and pull its apk:
$ adb shell pm list packages | grep "google"
$ adb pull /data/app/com.google.android.music-1/base.apk
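
The install path under /data/app differs between devices and Android versions, so it is usually safer to ask the package manager for it with pm path than to type it by hand. The following is a minimal sketch, assuming a device is connected with USB debugging enabled. strip_package_prefix and pull_apk are hypothetical helper names introduced for this example.

```shell
# Strip the "package:" prefix (and any trailing carriage return) from
# the output of `pm path` or `pm list packages`, read on stdin.
strip_package_prefix() {
  tr -d '\r' | sed -n 's/^package://p'
}

# Pull every apk file that belongs to the given package name.
# Assumes the on-device paths contain no whitespace.
pull_apk() {
  pkg=$1
  for path in $(adb shell pm path "$pkg" | strip_package_prefix); do
    adb pull "$path"
  done
}
```

With a device attached, the call would look like this:

$ pull_apk com.google.android.music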

Application general information

Once we have the APK, we can easily obtain general information about it with aapt, a tool bundled with the Android SDK:

$ <android-sdk-path>/build-tools/<version>/aapt dump badging <sample.apk>

The output contains helpful information such as the package name, version name, version code…
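
When only one value is needed, it can be cut out of the first line of the badging output, which has the form package: name='…' versionCode='…' versionName='…'. The following is a minimal sketch. badging_field is a hypothetical helper name, and the sed expression assumes that single-quoted format.

```shell
# Print one field from the "package:" line of `aapt dump badging` output.
# $1 is the field name (name, versionCode or versionName); input is stdin.
badging_field() {
  sed -n "s/^package:.* $1='\([^']*\)'.*/\1/p"
}
```

For example, to print only the version name:

$ <android-sdk-path>/build-tools/<version>/aapt dump badging <sample.apk> | badging_field versionName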

Reversing the target

Prerequisite

The following tools are used in this section:

  • apktool
  • dex2jar
  • jd-gui, or any other Java decompiler

Configuration & resource files

First, we will use apktool to decode the apk and obtain a readable AndroidManifest.xml, the assets and the resource XML files. It will also produce a set of smali files (a human-readable disassembly of the Dalvik bytecode), together with the original application certificate and resource hashes (kept in the original directory). If the target application makes use of shared native libraries, there will be a folder called lib containing all *.so artifacts for the different CPU architectures.

$ apktool d some-app.apk -o </some/output/dir>
    I: Using Apktool 2.2.2 on some-app.apk
    I: Loading resource table...
    I: Decoding AndroidManifest.xml with resources...
    I: Loading resource table from file: /Users/<username>/Library/apktool/framework/1.apk
    I: Regular manifest package...
    I: Decoding file-resources...
    I: Decoding values */* XMLs...
    I: Baksmaling classes.dex...
    I: Copying assets and libs...
    I: Copying unknown files...
    I: Copying original files...
$ ls -la some-app
    drwxr-xr-x   10 <username>  staff   340B Apr 25 01:38 ./
    drwxr-xr-x   47 <username>  staff   1.6K Apr 25 01:38 ../
    -rw-r--r--    1 <username>  staff    25K Apr 25 01:38 AndroidManifest.xml
    -rw-r--r--    1 <username>  staff   739B Apr 25 01:38 apktool.yml
    drwxr-xr-x    3 <username>  staff   102B Apr 25 01:38 assets/
    drwxr-xr-x    3 <username>  staff   102B Apr 25 01:38 lib/
    drwxr-xr-x    4 <username>  staff   136B Apr 25 01:38 original/
    drwxr-xr-x  200 <username>  staff   6.6K Apr 25 01:38 res/
    drwxr-xr-x  818 <username>  staff    27K Apr 25 01:38 smali/
    drwxr-xr-x    6 <username>  staff   204B Apr 25 01:38 unknown/

From here, we can learn a good deal about the application's layout structure, as well as the configuration of its receivers and content providers, if there are any.
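
As an illustration, the receivers and content providers declared in the decoded manifest can be listed with a short filter. The following is a minimal sketch, assuming each element sits on its own line, which is how apktool typically writes the file. list_components is a hypothetical helper name.

```shell
# Print "<tag> <class name>" for every receiver and provider element
# found in a decoded AndroidManifest.xml read on stdin.
list_components() {
  sed -E -n 's/.*<(receiver|provider)[^>]*android:name="([^"]*)".*/\1 \2/p'
}
```

It reads the decoded manifest from stdin:

$ list_components < some-app/AndroidManifest.xml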

Java source code

To learn more about the application logic, we need to look at the Java part of it. Hence, we use dex2jar to convert the previously obtained classes.dex file, which is in Dalvik bytecode format, to Java bytecode format. If the application is a multidex application, we need to convert all of its classes.dex files and give them to the decompiler together so that every class can be parsed properly.

$ d2j-dex2jar -o some-app.jar classes.dex
    dex2jar classes.dex -> some-app.jar
$ ls -la
    -rw-------   1 <username>  staff   6.7M Apr 25 02:07 some-app.jar
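
For a multidex application, the conversion can be repeated for each dex file. The following is a minimal sketch, assuming the apk was unzipped into some-dir/ as shown earlier and that d2j-dex2jar is on the PATH. jar_name_for and convert_all_dex are hypothetical helper names, and the output naming scheme is only one possible choice.

```shell
# Map a dex file to a jar name, e.g. classes2.dex -> some-app-classes2.jar
jar_name_for() {
  app=$1
  dex=$(basename "$2" .dex)
  printf '%s-%s.jar\n' "$app" "$dex"
}

# Convert every classes*.dex file found in a directory into its own jar.
convert_all_dex() {
  app=$1
  dir=$2
  for dex in "$dir"/classes*.dex; do
    [ -e "$dex" ] || continue   # skip the literal pattern when nothing matches
    d2j-dex2jar -o "$(jar_name_for "$app" "$dex")" "$dex"
  done
}
```

The resulting jars can then be opened together in the decompiler:

$ convert_all_dex some-app some-dir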

Using jd-gui or any other Java decompiler, you can then view the readable Java source files.



:beer: Cheers :beer:

Dang Chien
Software engineer, solution architect and Agile practitioner.