Which app has over 2.5 billion active users, over 5 billion downloads, and is the most popular app in over 100 countries?
Hint: check the article title.
Yes, that’s right. WhatsApp is the most popular messaging service in the world. According to Mark Zuckerberg, over 100 billion messages are sent over WhatsApp every day.
With such almost-astronomical traffic, one can’t help but wonder how WhatsApp works: its system design, server architecture, and technology. How does it handle so many concurrent users and messages? What kind of frameworks and programming languages enable that kind of scale? How do they keep all that data secure? So many questions!
In this article, we are going to take a deep dive into WhatsApp’s architecture and system design. We’ll answer all the above-mentioned questions and more.
If you’ve ever wondered about the top dog in the chat app world, keep reading.
Disclaimer: We scoured the internet to collect every resource on WhatsApp architecture design and have compiled and summarized it here. To the best of our knowledge, this information is accurate. However, as companies do update their tech stack frequently, this information is subject to change.
Let’s start with the frontend and work our way to the hardware on the backend.
The first part of the WhatsApp system design that a user interacts with is the mobile or web app. WhatsApp supports nearly all platforms. It has an iOS app, Android app, desktop app, web app, and Windows Phone app. Up until 2017, you could even use WhatsApp on a BlackBerry.
With so many supported platforms, you may have guessed that WhatsApp would be a hybrid app. But, in fact, it’s not. They actually built a native app for each platform. Here's a list of all the supported platforms with the front-end language(s) that were used to build each one:
In addition to the programming language itself, another important technology that WhatsApp uses on the frontend is an SQLite database.
SQLite is a stand-alone, self-contained, relational database that is meant to be embedded into applications, which means it lives on your device. WhatsApp uses it to store conversations. Since it would be a waste of resources to download all the messages from the cloud every time you open the app, WhatsApp chooses to store the messages locally. In fact, WhatsApp’s servers only hold messages until they are delivered, at which point they are removed from the server.
WhatsApp uses a highly modified version of XMPP on an Ejabberd server (more on that later) to communicate with the clients.
The XMPP client opens an SSL socket to the WhatsApp servers. All sent messages are queued on the servers until the recipient’s client opens or reconnects to this socket to retrieve them. Once a message is successfully retrieved by the client, a success status is sent back to the WhatsApp server. The server then forwards this status to the original sender, letting them know that the message was received by adding the “checkmark” icon next to the successfully sent message.
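The queue-then-acknowledge flow described above can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions, not WhatsApp’s implementation: the `RelayServer` class and its method names are hypothetical, and retrieval and the success status are collapsed into a single step.

```python
from collections import defaultdict, deque


class RelayServer:
    """Toy store-and-forward relay that queues messages for offline users."""

    def __init__(self):
        self.pending = defaultdict(deque)  # recipient -> queued (msg_id, sender, body)
        self.receipts = defaultdict(list)  # sender -> msg_ids confirmed as delivered

    def send(self, msg_id, sender, recipient, body):
        # The server only queues the message; nothing is delivered yet.
        self.pending[recipient].append((msg_id, sender, body))

    def connect(self, recipient):
        # On (re)connect, the client drains everything queued for it,
        # and the queue entry is removed from the server.
        delivered = list(self.pending.pop(recipient, []))
        for msg_id, sender, _body in delivered:
            # The success status is forwarded to the original sender.
            self.receipts[sender].append(msg_id)
        return delivered


server = RelayServer()
server.send("m1", "alice", "bob", "hi Bob")
receipts_while_offline = list(server.receipts["alice"])  # Bob is offline: []
inbox = server.connect("bob")                            # Bob opens the app
print(inbox)                       # [('m1', 'alice', 'hi Bob')]
print(server.receipts["alice"])    # ['m1']
```

In a real deployment the acknowledgement would travel back over the socket as its own message, so a dropped connection between delivery and acknowledgement would leave the message queued for a retry.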
Keep in mind that, while XMPP is one of the most popular messaging protocols for chat apps, it is definitely not the only option for choosing a messaging protocol.
WhatsApp uses end-to-end encryption. Ideally, this means that only the original sender and the true recipient of the message can read the message in plain text.
When you send a message, it gets encrypted using a specific encryption protocol (more on that next). WhatsApp then stores this encrypted message on their servers until it’s delivered to the recipient. Upon delivery, the recipient's device decrypts the message back into a readable, plaintext message using a unique cryptographic key. Across this entire process, WhatsApp never knows the content of your message.
WhatsApp’s encryption technology is the Signal Protocol, which was developed by Open Whisper Systems to be a modern, open-source, strong encryption protocol for asynchronous messaging systems.
While end-to-end encryption may make you feel safe in theory, in practice it isn’t as privacy-protecting as one would hope.
Let’s move on to the backend.
To the best of our knowledge, the current WhatsApp back-end system design looks like this:
Let’s explore some of the more interesting aspects of WhatsApp’s back-end architecture:
WhatsApp's choice of programming language is in large part what allows it to work on such a colossal scale.
Erlang is a functional programming language that is oriented towards building concurrent, scalable, and reliable systems.
It uses a process-based model called the “actor model” in which small, isolated processes communicate with each other through messages. These processes can create new processes, send messages and modify their state in response to receiving messages.
Its process-based property gives Erlang its extremely high concurrency, scalability, and reliability.
These processes can also communicate with processes outside of the core on which they run. This makes it easy to scale the system horizontally (by adding more machines) or vertically (by adding more cores). Lastly, since the processes can communicate with each other and, more importantly, restart each other, it’s easy to build self-healing systems. If a bug crashes a process, another process can restart it.
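To make the “processes that restart each other” idea more concrete, here is a minimal, single-threaded Python sketch of a supervised actor. It is not Erlang and it ignores real concurrency; `CounterActor` and `Supervisor` are hypothetical names, and the supervisor’s `inbox` stands in for the actor’s mailbox.

```python
from collections import deque


class CounterActor:
    """An isolated 'process': private state that changes only via messages."""

    def __init__(self):
        self.count = 0  # private state, never shared directly

    def handle(self, message):
        if message == "crash":
            raise RuntimeError("simulated bug")
        self.count += message


class Supervisor:
    """Delivers messages one by one and restarts the child if it crashes."""

    def __init__(self, factory):
        self.factory = factory
        self.child = factory()
        self.inbox = deque()
        self.restarts = 0

    def send(self, message):
        self.inbox.append(message)

    def run(self):
        while self.inbox:
            message = self.inbox.popleft()
            try:
                self.child.handle(message)
            except Exception:
                # "Let it crash": replace the failed actor with a fresh one.
                self.child = self.factory()
                self.restarts += 1


sup = Supervisor(CounterActor)
for msg in (1, 2, "crash", 5):
    sup.send(msg)
sup.run()
print(sup.restarts, sup.child.count)  # 1 5
```

The state accumulated before the crash (1 + 2) is lost, which is the point: the restarted actor begins from a known-good state instead of trying to repair a corrupted one.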
An interesting technical choice by WhatsApp's founders was picking FreeBSD as an operating system instead of a more widely used system (like Linux).
Brian Acton, one of the cofounders of WhatsApp, said this in an interview with Wired about the decision:
“Linux is a beast of complexity. FreeBSD has the advantage of being a single distribution with an extraordinarily good ports collection.”
Also, when it comes to raw performance, especially in regards to system load per packet, no other operating system can beat FreeBSD.
However, when it comes down to it, the real reason that they decided to use FreeBSD is probably because both co-founders had a long history of working with it at Yahoo!.
Ejabberd is an open-source XMPP server that is written in Erlang. WhatsApp uses a modified version of XMPP as its protocol for handling message delivery. Even the Ejabberd server that WhatsApp uses is heavily customized to optimize for server performance.
What’s the purpose of Ejabberd?
Well, it handles the message routing, deliverability, and general instant messaging aspects of the app. Features of Ejabberd include:
To store data and temporary messages, WhatsApp uses an Erlang-based, distributed DBMS (Database Management System) called Mnesia.
This DBMS provides benefits that many traditional databases don’t such as:
Mnesia is also written in Erlang and ships with Erlang/OTP. This in itself is a benefit because there are no data structure differences between Erlang in the application and Erlang in the DBMS. Coding is, therefore, quicker and more explicit.
BEAM, short for “Bogdan’s Erlang Abstract Machine”, is the virtual machine that executes compiled Erlang code (the compiler turns Erlang source into BEAM bytecode). The BEAM is designed specifically for highly concurrent applications, which makes it a good fit for WhatsApp’s use case. BEAM’s secret sauce is light-weight processes that don’t share memory and are managed by schedulers. These schedulers can manage millions of processes across multiple cores. This makes BEAM highly scalable and resistant to failures, such as those caused by high traffic loads, system updates, and network outages.
BEAM is so crucial to the WhatsApp system design that the WhatsApp team has published many patches and fixes to the core source code.
YAWS (Yet Another Web Server) is an Erlang-based web server that's ideal for dynamic content. WhatsApp uses YAWS for storing multimedia data. YAWS itself uses HTML5 WebSockets that simplify two-way communication by establishing a reliable and fast connection between the server and the app. Through the use of this technology, WhatsApp is able to send and receive multimedia data across billions of devices—in near real time.
In 2017, three years after being acquired by Facebook, WhatsApp was taken off of IBM SoftLayer’s cloud and brought into Facebook’s proprietary data centers.
What we do know is that in 2014 WhatsApp required around 550 servers and over 11,000 cores that ran Erlang. We also know that WhatsApp’s user base was "only" around half a billion in 2014 compared to the more than 2 billion users it reached in 2020.
So, with that data in mind, we'll let you imagine how many servers and cores WhatsApp now requires. We imagine it's a lot.
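For a rough sense of scale, the figures above can be extrapolated naively. The sketch below assumes that server and core counts grow linearly with the user base, which is almost certainly wrong in practice (hardware got faster and usage per user changed), so treat the output as an order-of-magnitude guess rather than a real number.

```python
# Figures quoted in the article.
servers_2014 = 550
cores_2014 = 11_000
users_2014 = 0.5e9   # "around half a billion"
users_2020 = 2.0e9   # "more than 2 billion"

# Assumption: resources scale linearly with the number of users.
scale = users_2020 / users_2014
servers_2020 = servers_2014 * scale
cores_2020 = cores_2014 * scale

print(f"scale factor: {scale:.0f}x")      # scale factor: 4x
print(f"servers: ~{servers_2020:,.0f}")   # servers: ~2,200
print(f"cores:   ~{cores_2020:,.0f}")     # cores:   ~44,000
```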
The easiest way to get a full understanding of WhatsApp’s architecture design is, of course, through a WhatsApp architecture diagram.
Starting from the left side we have multiple different clients (mobile and web apps), each of which hosts a local SQLite database for storing conversations.
The clients use HTTP WebSockets to send and retrieve multimedia data like images and videos from the YAWS web server. But, as you can see, XMPP is used to actually send those files and other messages to other users.
When an XMPP message is sent, it goes through the series of steps depicted above. First, it gets sent to WhatsApp’s custom Ejabberd server which runs on BEAM and FreeBSD. The Ejabberd server saves the message in a Mnesia database table where it gets put into a queue. When the receiving user opens the app, thereby reconnecting to the socket, the message in the queue gets routed through the Ejabberd server and delivered to the recipient.
Once successful delivery can be confirmed, the message gets deleted from the Mnesia database.
Conclusion
While we don’t know the exact specifications of WhatsApp’s technical architecture and system design, we can get a good idea based on the technologies that WhatsApp employs. We hope this article, exploring the WhatsApp architecture design, has answered your burning questions. Now that you've gained an understanding of how the WhatsApp server works, learned what the WhatsApp tech stack looks like, and even scanned a WhatsApp architecture diagram...maybe you're feeling empowered to take on a chat app project of your own.
If you’re ready to give WhatsApp a run for their money, sign up to our developer dashboard and start building your chat app for free.
But keep in mind that many of the technologies in the WhatsApp technology stack were specifically chosen for their ability to scale and handle extremely high concurrency.
If you’re trying to build a dating or telemedicine app (or anything that doesn’t need almost the entire world to be online at the same time), you may not need the amount of scale that WhatsApp does.
In other words, the WhatsApp tech stack, while perfect for WhatsApp, may not be the best solution for you. To learn about the ideal architecture and tech stack for a chat app, head to this article.
If you still have questions about what IS right for you, feel free to talk to our experts before you start building your own chat app.
About the Author
Cosette Cressler is a passionate content marketer specializing in SaaS, technology, careers, productivity, entrepreneurship and self-development. She helps grow businesses of all sizes by creating consistent, digestible content that captures attention and drives action.
Some months ago I was scrolling through my WhatsApp chats when an idea suddenly came to me: why not extract my WhatsApp database and perform some data analysis on it? A lot of interesting metadata about how my contacts and I use WhatsApp could be extracted.
It was not an easy process, since you need to somehow copy that database from your phone to your computer, then understand it, and once you have understood how it is structured, think about what useful information can be extracted and how to present it. It has taken me a lot of time, but now I feel very proud of the results, and I am really excited about sharing them with you.
In this post I will explain:
If you want to try it yourself, you will need a couple of things:
Are you ready to join me? Let’s go!
If you have an Android phone, I am sorry to tell you that you need a rooted device. The WhatsApp database is stored in a location of the filesystem with restricted access, so you will need special permissions to be able to get it. There are some alternative ways to get it, but in the end they all require root access.
In case you have your phone rooted, you have to install adb following these steps, and then connect your mobile phone with a USB cable and run the commands inside your working directory:
adb root
adb shell "sqlite3 /data/data/com.whatsapp/databases/wa.db \".backup '/data/data/com.whatsapp/databases/wa.db.bak'\""
adb pull /data/data/com.whatsapp/databases/msgstore.db
adb pull /data/data/com.whatsapp/databases/wa.db.bak wa.db
sqlite3 msgstore.db .dump > whatsapp.sql
sqlite3 wa.db .dump >> whatsapp.sql
sqlite3 whatsapp.db < whatsapp.sql
What we are doing here is getting root permissions, copying the two databases where the WhatsApp data is stored, and joining them into a single one.
With a jailbroken iPhone the process might be easier (it would probably look similar to the one explained above for Android), but since my iPhone is not jailbroken (and neither are most iPhones out there), we will extract the database from a backup instead.
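If you prefer to do the joining step from Python instead of the sqlite3 command line, the following is one possible sketch using only the standard library. It takes a different route from the dump-and-reload commands above: it attaches each source file and copies its tables, skipping any table name that already exists in the target. The function name `merge_databases` and the demo table names are assumptions made for illustration; copying with `CREATE TABLE ... AS SELECT` also drops indexes and constraints, which is usually fine for analysis.

```python
import os
import sqlite3
import tempfile


def merge_databases(sources, target):
    """Copy every user table from each source file into one target file."""
    copied = []
    out = sqlite3.connect(target)
    try:
        for path in sources:
            out.execute("ATTACH DATABASE ? AS src", (path,))
            existing = {name for (name,) in out.execute(
                "SELECT name FROM main.sqlite_master WHERE type = 'table'")}
            tables = [name for (name,) in out.execute(
                "SELECT name FROM src.sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
            for name in tables:
                if name in existing:
                    continue  # e.g. a bookkeeping table present in both files
                out.execute(
                    f'CREATE TABLE main."{name}" AS SELECT * FROM src."{name}"')
                copied.append(name)
            out.commit()
            out.execute("DETACH DATABASE src")
    finally:
        out.close()
    return copied


# Demo with two throwaway files standing in for msgstore.db and wa.db.
workdir = tempfile.mkdtemp()
paths = {n: os.path.join(workdir, n) for n in ("msgstore.db", "wa.db", "whatsapp.db")}
for file_name, table in (("msgstore.db", "messages"), ("wa.db", "wa_contacts")):
    con = sqlite3.connect(paths[file_name])
    con.execute(f"CREATE TABLE {table} (x TEXT)")
    con.execute(f"INSERT INTO {table} VALUES ('demo')")
    con.commit()
    con.close()

copied = merge_databases([paths["msgstore.db"], paths["wa.db"]], paths["whatsapp.db"])
print(copied)  # ['messages', 'wa_contacts']
```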
First of all, you have to connect your iPhone to your computer and back it up.
You can find detailed instructions for Mac and Windows on the official Apple website. Then, we will extract the WhatsApp database from the backup using an open-source tool called imobax. If you have a Mac, you can directly download the executable file from here. If you are using Windows, then you will need to compile it yourself using gcc and make (instructions on how to install it here): just download the repository to your computer and run the command “make” inside the folder.
If you have used a Mac to make the backup, it will be stored in ~/Library/Application Support/MobileSync/Backup. If you did it in Windows, the directory will be either %userprofile%\Apple\MobileSync\Backup or %appdata%\Apple Computer\MobileSync\Backup. Inside those directories there will be a folder with your backup, and inside all the backed up files, including our database. The problem is that the files have random alphanumeric names. It is there where imobax will help us, by telling us which file is the database.
./imobax -l <backup location> | grep ChatStorage.sqlite | awk '{print $1}'
The command above will give you the name of the database. Just search for it in the backup folder, copy it to your working directory and rename it as you wish, for example, whatsapp.db.
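As a side note, the file names inside an iOS backup are not truly random: to the best of my knowledge they are the SHA-1 hash of the string "domain-relative path". The Python sketch below computes that name for the WhatsApp database. The domain string used here is an assumption on my part (it is the one commonly reported for WhatsApp’s shared container), and the two-character sub-folder layout is what recent backups appear to use, so verify both against your own backup.

```python
import hashlib
from pathlib import Path


def backup_file_id(domain: str, relative_path: str) -> str:
    """SHA-1 of '<domain>-<relative path>', used as the file name in the backup."""
    return hashlib.sha1(f"{domain}-{relative_path}".encode("utf-8")).hexdigest()


def backup_file_path(backup_dir: str, file_id: str) -> Path:
    # Assumption: files are sharded into sub-folders named after the
    # first two hex characters of the ID.
    return Path(backup_dir) / file_id[:2] / file_id


file_id = backup_file_id(
    "AppDomainGroup-group.net.whatsapp.WhatsApp.shared",  # assumed domain
    "ChatStorage.sqlite",
)
print(file_id)
print(backup_file_path("<backup location>", file_id))
```

If the computed name does not exist in your backup, fall back to the imobax listing, which reads the names from the backup’s own manifest.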
The WhatsApp database is full of tables, some of them with myriads of columns and rows. Most of them will not be relevant for us, so let’s see which tables and columns we are going to work with.
To make our SQL queries easier later, it is a good idea to create two views in our database now: one for the private, individual messages (let’s call it friends_messages) and another for the messages from group chats (group_messages).
This will not only simplify the SQL queries, but also allow us to use the same query for iPhone and Android.
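As an illustration of what such views might look like on Android, here is a sketch run against a tiny in-memory stand-in for the database. It assumes a `messages` table with the columns `key_remote_jid`, `key_from_me`, `data` and `timestamp` (in milliseconds), and relies on group chat IDs ending in `@g.us`. The view column names and the sample rows are made up for the example; real views would probably also join the contact names.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE messages (key_remote_jid TEXT, key_from_me INTEGER,
                       data TEXT, timestamp INTEGER);

-- Made-up sample rows: two private messages and one group message.
INSERT INTO messages VALUES
  ('34600000001@s.whatsapp.net',  1, 'hello',    1600000000000),
  ('34600000001@s.whatsapp.net',  0, 'hi there', 1600000060000),
  ('34600000001-1500000000@g.us', 0, 'group hi', 1600000120000);

-- One-to-one chats: the chat ID is a contact, not a group.
CREATE VIEW friends_messages AS
SELECT key_remote_jid AS friend_jid,
       key_from_me    AS from_me,
       data           AS text,
       datetime(timestamp / 1000, 'unixepoch') AS sent_at
FROM messages
WHERE key_remote_jid LIKE '%@s.whatsapp.net';

-- Group chats: the chat ID ends in '@g.us'.
CREATE VIEW group_messages AS
SELECT key_remote_jid AS group_jid,
       key_from_me    AS from_me,
       data           AS text,
       datetime(timestamp / 1000, 'unixepoch') AS sent_at
FROM messages
WHERE key_remote_jid LIKE '%@g.us';
""")

friends_count = con.execute("SELECT COUNT(*) FROM friends_messages").fetchone()[0]
groups_count = con.execute("SELECT COUNT(*) FROM group_messages").fetchone()[0]
print(friends_count, groups_count)  # 2 1
```

Because later queries only touch the views, an iPhone database can expose the same view names over its own tables and reuse those queries unchanged.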
For my experiment, I have used Redash to create a dashboard to visualize the KPIs that we will explore afterwards. This open-source tool can be easily deployed with Docker following this guide. It goes without saying that you are free to use any other visualization tool which supports SQLite, like PowerBI, Tableau, etc.
Once we have imported the database with the views from the previous step, the next step is to create one query to get the list of contacts and another to get the list of groups. These are needed to show a dropdown list for filtering those KPIs which are about a single contact or group. In Redash, this is done by creating those two SQL queries, and then adding a parameter of type “Query Based Dropdown List” on the corresponding KPIs, which can be referred to as “{{ parameter_name }}” in the SQL query.
Now we are ready to get our hands dirty and write some SQL queries to extract some interesting KPIs about how we use WhatsApp. The actual number of KPIs that can be extracted is limited only by your imagination and creativity, so you are welcome to create your own. Here are some of the ones I created:
This query can be visualized with a vertical bar chart, with the column “friend_name” on the X-axis and “number_of_messages” on the Y-axis.
Top 20 contacts with whom I have talked the most during the last 30 days (names at the bottom are cropped for privacy reasons)
This query can be visualized with a line chart, with the column “day” on the X-axis and “ma_mom” on the Y-axis.
Pay attention to the first lines of the query, where a moving average of 30 days is calculated. Without it the chart would look really sharp and noisy, hence applying this filter.
It is also important to remark that {{ friend_name }} is a variable, so Redash (or the corresponding visualization tool) will replace it with the selected contact.
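A moving average like the one described can be written with an SQLite window function (available from SQLite 3.25 onwards). The sketch below is not the query from the dashboard: it runs on a hypothetical `daily` table with one row per day, uses a 3-day window so the numbers are easy to check by hand, and names the result `ma`.

```python
import sqlite3

WINDOW_DAYS = 3  # the post uses 30; 3 keeps the example readable

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE daily (day TEXT, messages INTEGER)")
con.executemany(
    "INSERT INTO daily VALUES (?, ?)",
    [("2021-01-01", 10), ("2021-01-02", 0), ("2021-01-03", 20), ("2021-01-04", 30)],
)

# Average of the current row and the (WINDOW_DAYS - 1) rows before it.
rows = con.execute(f"""
    SELECT day,
           AVG(messages) OVER (
               ORDER BY day
               ROWS BETWEEN {WINDOW_DAYS - 1} PRECEDING AND CURRENT ROW
           ) AS ma
    FROM daily
    ORDER BY day
""").fetchall()

for day, ma in rows:
    print(day, round(ma, 2))
```

Note that a row-based window only equals a day-based window if every day has a row, so days without messages need to be present with a count of 0.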
Number of messages per day exchanged with a contact over time. The value has been smoothed with a 30-day moving average.
For this KPI, we are grouping the messages by hour of the day and day of the week, using the SQLite function strftime(), and then counting the total number of messages exchanged in each hour of each day of the week. Then we can plot it in a pivot table.
This heat map shows the number of messages exchanged with a friend in each hour of each day of the week
In this query, we are calculating the average length of the messages sent to us by each of our contacts, getting the top 10, and then plotting it in a vertical bar chart.
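The grouping by weekday and hour with strftime() can be sketched as follows. The `msgs` table and its `ts` column (Unix time in seconds) are stand-ins for the real message view, and the three sample timestamps are made up. In SQLite, `%w` returns the day of the week with 0 meaning Sunday.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE msgs (ts INTEGER)")  # Unix time in seconds
con.executemany(
    "INSERT INTO msgs VALUES (?)",
    [
        (1609495200,),  # Friday 2021-01-01 10:00 UTC
        (1609497000,),  # Friday 2021-01-01 10:30 UTC
        (1609581600,),  # Saturday 2021-01-02 10:00 UTC
    ],
)

heatmap = con.execute("""
    SELECT strftime('%w', ts, 'unixepoch') AS weekday,  -- 0 = Sunday
           strftime('%H', ts, 'unixepoch') AS hour,
           COUNT(*) AS messages
    FROM msgs
    GROUP BY weekday, hour
    ORDER BY weekday, hour
""").fetchall()

print(heatmap)  # [('5', '10', 2), ('6', '10', 1)]
```

The values are computed in UTC here; adding the `'localtime'` modifier after `'unixepoch'` shifts them to the time zone of the machine running the query.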
However, it is quite normal to write long texts split in several messages, so this KPI is not very accurate. I tried to calculate it taking this fact into account, but very complex algorithms would be needed, which unfortunately cannot be implemented using sqlite.
In a similar way to KPI #2, here we are filtering with a 30-day moving average the number of messages sent per day by each of the group members, including myself (“Me”). Since we are working with groups, the view “group_messages” is used instead of “friends_messages”, and a new variable is defined in the dashboard (“group_name”), so the user can choose the group for which they want to see this chart.
Evolution of the participation rate (number of messages per day filtered with a 30-day moving average) of each group participant, including myself
This was a huge post, in which I have covered a long process: extracting the database from our phone, cleaning the data, doing some analysis with SQL and then visualizing those results in a chart.
I had a lot of fun doing it, so I invite you to try it yourselves and come up with some new KPIs. Feel free to post a comment about your impressions, findings or improvements. I will be happy to hear from you!
Just note that different operating systems store different types of WhatsApp artifacts, and if a researcher can extract certain types of WhatsApp data from one device, this does not mean at all that similar types of data can be extracted from another device. For example, if a system unit running Windows is seized, then WhatsApp chats will probably not be found on its drives (the exception is backup copies of iOS devices that can be found on the same drives). When seizing laptops and mobile devices, there will be some peculiarities. Let's talk about this in more detail.
In order to extract WhatsApp artifacts from an Android device, the researcher must have root access ('root') on the device under investigation, or be able to otherwise extract a physical memory dump of the device or its file system (for example, by exploiting software vulnerabilities specific to a particular mobile device).
Application files are located in the phone's memory in the partition where user data is stored. Typically, this partition is named ‘userdata’. Subdirectories and files of the program are located along the path: ‘/data/data/com.whatsapp/’.
The main files that contain WhatsApp forensic artifacts in Android OS are databases 'wa.db' and 'msgstore.db' .
Database ‘wa.db’ contains the complete WhatsApp user contact list, including phone number, display name, timestamps and any other information provided during WhatsApp registration. File ‘wa.db’ is located along the path: ‘/data/data/com.whatsapp/databases/’ and has the following structure:
The most interesting tables in the database 'wa.db' for the researcher are:
Table structure
| Field name | Meaning |
|---|---|
| _id | sequence number of the record (in the SQL table) |
| jid | WhatsApp contact ID, written in the format [phone number]@s.whatsapp.net |
| is_whatsapp_user | contains '1' if the contact is an actual WhatsApp user, '0' otherwise |
| status | contains the text displayed in contact status |
| status_timestamp | contains timestamp in Unix Epoch Time (ms) format |
| number | phone number associated with contact |
| raw_contact_id | contact number |
| display_name | contact display name |
| phone_type | phone type |
| phone_label | label associated with contact number |
| unseen_msg_count | number of messages sent by the contact but not read by the recipient |
| photo_ts | contains a timestamp in Unix Epoch Time format |
| thumb_ts | contains a timestamp in Unix Epoch Time format |
| photo_id_timestamp | contains timestamp in Unix Epoch Time (ms) format |
| given_name | field value is the same as 'display_name' for each contact |
| wa_name | Whatsapp contact name (displays the name in the contact's profile) |
| sort_name | Contact name used in sort operations |
| nickname | WhatsApp nickname of the contact (displays the nickname specified in the contact's profile) |
| company | company (displays the company listed in the contact profile) |
| title | Title (Madam/Mr. ; displays the title configured in the contact profile) |
| offset | offset |
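Several of the fields above hold Unix Epoch Time in milliseconds. A minimal Python helper for turning such a value into a readable UTC timestamp could look like this (the function name `from_unix_ms` is just an illustrative choice):

```python
from datetime import datetime, timedelta, timezone

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_unix_ms(ms: int) -> datetime:
    """Convert Unix Epoch Time in milliseconds to an aware UTC datetime."""
    # Integer milliseconds are added exactly, avoiding float rounding.
    return UNIX_EPOCH + timedelta(milliseconds=ms)


print(from_unix_ms(1484041029757).isoformat())
# 2017-01-10T09:37:09.757000+00:00
```

Fields documented as plain Unix Epoch Time (without "ms") would need `timedelta(seconds=...)` instead; a value that decodes to a date around 1970 is a hint that the wrong unit was assumed.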
The database 'msgstore.db' contains information about transferred messages, such as contact number, message text, message status, timestamps, information about transferred files included in messages, etc. File ‘msgstore.db’ is located along the path: ‘/data/data/com.whatsapp/databases/’ and has the following structure:
The most interesting tables in file ‘msgstore.db’ for the researcher are:
Table structure
| Field name | Meaning |
|---|---|
| _id | sequence number of the record (in the SQL table) |
| key_remote_jid | WhatsApp Communication partner ID |
| key_from_me | message direction: '0' - incoming, '1' - outgoing |
| key_id | unique message identifier |
| status | message status: '0' - delivered, '4' - waiting on the server, '5' - received at destination, '6' - control message, '13' - message opened by the recipient (read) |
| need_push | is '2' if it is a broadcast message, otherwise '0' |
| data | message text (when 'media_wa_type' is '0') |
| timestamp | contains a timestamp in Unix Epoch Time (ms) format, the value is taken from the device clock |
| media_url | contains the URL of the file being transferred (when the 'media_wa_type' parameter is '1', '2', '3') |
| media_mime_type | MIME type of the transferred file (when the 'media_wa_type' parameter is '1', '2', '3') |
| media_wa_type | message type: '0' - text, '1' - graphic file, '2' - audio file, '3' - video file, '4' - contact card, '5' - location data |
| media_size | transfer file size (when parameter 'media_wa_type' is '1', '2', '3') |
| media_name | transfer file name (when parameter 'media_wa_type' is '1', '2', '3') |
| media_caption | Contains the words 'audio', 'video' for the corresponding values of the parameter 'media_wa_type' (when the parameter 'media_wa_type' is '1', '3') |
| media_hash | base64 encoded hash of the transmitted file calculated using the SHA-256 algorithm (when the 'media_wa_type' parameter is '1', '2', '3') |
| media_duration | duration in seconds for the media file (when 'media_wa_type' is '1', '2', '3') |
| origin | is '2' if it is a broadcast message, otherwise '0' |
| latitude | location data: latitude (when 'media_wa_type' is set to '5') |
| longitude | geodata: longitude (when 'media_wa_type' is '5') |
| thumb_image | service information |
| remote_resource | Sender ID (only for group chats) |
| received_timestamp | time of receipt, contains a timestamp in Unix Epoch Time (ms) format, the value is taken from the device clock (when the 'key_from_me' parameter is '0', '-1' or another value) |
| send_timestamp | not used, usually set to ‘-1’ |
| receipt_server_timestamp | time received by the central server, contains a timestamp in the Unix Epoch Time (ms) format, the value is taken from the device clock (when the 'key_from_me' parameter has '1', '-1' or another value) |
| receipt_device_timestamp | the time the message was received by another subscriber, contains a timestamp in the Unix Epoch Time (ms) format, the value is taken from the device clock (when the 'key_from_me' parameter has '1', '-1' or another value) |
| read_device_timestamp | message opening (reading) time, contains a timestamp in the Unix Epoch Time (ms) format, the value is taken from the device clock |
| played_device_timestamp | message playback time, contains a timestamp in Unix Epoch Time (ms) format, the value is taken from the device clock |
| raw_data | Transfer file thumbnail (when 'media_wa_type' is '1' or '3') |
| recipient_count | number of recipients (for broadcast messages) |
| participant_hash | is used when sending messages with geodata |
| starred | not used |
| quoted_row_id | unknown, usually '0' |
| mentioned_jids | not used |
| multicast_id | not used |
| offset | offset |
This list of fields is not exhaustive.
For different versions of WhatsApp, some of the fields may or may not be present. Additionally, the fields 'media_enc_hash' , 'edit_version' , 'payment_transaction_id' , etc. may be present.
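The numeric codes in the table above are easier to read once mapped to labels. The following Python sketch does that for a single row represented as a dictionary. The code-to-label mappings are copied from the table; the `describe_message` helper and the sample row are hypothetical, and codes not listed in the table are reported as unknown rather than guessed.

```python
# Code tables taken from the field descriptions above.
STATUS = {
    0: "delivered",
    4: "waiting on the server",
    5: "received at destination",
    6: "control message",
    13: "opened by the recipient (read)",
}
MEDIA_TYPE = {
    0: "text",
    1: "graphic file",
    2: "audio file",
    3: "video file",
    4: "contact card",
    5: "location data",
}


def describe_message(row: dict) -> str:
    """Build a one-line, human-readable summary of a 'messages' row."""
    direction = "outgoing" if row["key_from_me"] == 1 else "incoming"
    kind = MEDIA_TYPE.get(row["media_wa_type"], f"unknown type {row['media_wa_type']}")
    status = STATUS.get(row["status"], f"unknown status {row['status']}")
    return f"{direction} {kind} ({status})"


# Made-up sample row for illustration.
sample = {"key_from_me": 1, "media_wa_type": 0, "status": 13}
print(describe_message(sample))  # outgoing text (opened by the recipient (read))
```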
Also, when examining WhatsApp on an Android mobile device, you should pay attention to the following files:
You also need to pay attention to the following directories:
OPUS format files. Whatsapp log files:
Log snippet
2017-01-10 09:37:09.757 LL_I D [524:WhatsApp Worker #1] missedcallnotification/init count:0 timestamp:0
2017-01-10 09:37:09.758 LL_I D [524:WhatsApp Worker #1] missedcallnotification/update cancel true
2017-01-10 09:37:09.768 LL_I D [1:main] app-init/load-me
2017-01-10 09:37:09.772 LL_I D [1:main] password file missing or unreadable
2017-01-10 09:37:09.782 LL_I D [1:main] statistics Text Messages: 59 sent, 82 received / Media Messages: 1 sent (0 bytes), 0 received (9850158 bytes) / Offline Messages: 81 received (19522 msec average delay) / Message Service: 116075 bytes sent, 211729 bytes received / Voip Calls: 1 outgoing calls, 0 incoming calls, 2492 bytes sent, 1530 bytes received / Google Drive: 0 bytes sent, 0 bytes received / Roaming: 1524 bytes sent, 1826 bytes received / Total Data: 118567 bytes sent, 10063417 bytes received
2017-01-10 09:37:09.785 LL_I D [1:main] media-state-manager/refresh-media-state/writable-media
2017-01-10 09:37:09.806 LL_I D [1:main] app-init/initialize/timer/stop: 24
2017-01-10 09:37:09.811 LL_I D [1:main] msgstore/checkhealth
2017-01-10 09:37:09.817 LL_I D [1:main] msgstore/checkhealth/journal/delete false
2017-01-10 09:37:09.818 LL_I D [1:main] msgstore/checkhealth/back/delete false
2017-01-10 09:37:09.818 LL_I D [1:main] msgstore/checkdb/data/data/com.whatsapp/databases/msgstore.db
2017-01-10 09:37:09.819 LL_I D [1:main] msgstore/checkdb/list _jobqueue-WhatsAppJobManager 16384 drw=011
2017-01-10 09:37:09.820 LL_I D [1:main] msgstore/checkdb/list _jobqueue-WhatsAppJobManager-journal 21032 drw=011
2017-01-10 09:37:09.820 LL_I D [1:main] msgstore/checkdb/list axolotl.db 184320 drw=011
2017-01-10 09:37:09.821 LL_I D [1:main] msgstore/checkdb/list axolotl.db-wal 436752 drw=011
2017-01-10 09:37:09.821 LL_I D [1:main] msgstore/checkdb/list axolotl.db-shm 32768 drw=011
2017-01-10 09:37:09.822 LL_I D [1:main] msgstore/checkdb/list msgstore.db 540672 drw=011
2017-01-10 09:37:09.823 LL_I D [1:main] msgstore/checkdb/list msgstore.db-wal 0 drw=011
2017-01-10 09:37:09.823 LL_I D [1:main] msgstore/checkdb/list msgstore.db-shm 32768 drw=011
2017-01-10 09:37:09.824 LL_I D [1:main] msgstore/checkdb/list wa.db 69632 drw=011
2017-01-10 09:37:09.825 LL_I D [1:main] msgstore/checkdb/list wa.db-wal 428512 drw=011
2017-01-10 09:37:09.825 LL_I D [1:main] msgstore/checkdb/list wa.db-shm 32768 drw=011
2017-01-10 09:37:09.826 LL_I D [1:main] msgstore/checkdb/list chatsettings.db 4096 drw=011
2017-01-10 09:37:09.826 LL_I D [1:main] msgstore/checkdb/list chatsettings.db-wal 70072 drw=011
2017-01-10 09:37:09.827 LL_I D [1:main] msgstore/checkdb/list chatsettings.db-shm 32768 drw=011
2017-01-10 09:37:09.838 LL_I D [1:main] msgstore/checkdb/version 1
2017-01-10 09:37:09.839 LL_I D [1:main] msgstore/canquery
2017-01-10 09:37:09.846 LL_I D [1:main] msgstore/canquery/count 1
2017-01-10 09:37:09.847 LL_I D [1:main] msgstore/canquery/timer/stop: 8
2017-01-10 09:37:09.847 LL_I D [1:main] msgstore/canquery 517 | time spent:8
2017-01-10 09:37:09.848 LL_I D [529:WhatsApp Worker #3] media-state-manager/refresh-media-state/internal-storage available:1,345,622,016 total:5,687,922,688
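Log lines in this layout can be split into fields with a regular expression. The Python sketch below is based only on the snippet above, so the field names (`tag`, `level`, `thread_id`, `thread`, `message`) are my own guesses at what each part means, and other WhatsApp versions may format their logs differently.

```python
import re
from datetime import datetime

# timestamp, tag, one-letter level, [thread id:thread name], free-text message
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<tag>\S+) (?P<level>[A-Z]) ?"
    r"\[(?P<thread_id>\d+):(?P<thread>[^\]]+)\] "
    r"(?P<message>.*)$"
)


def parse_log_line(line: str):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["ts"] = datetime.strptime(entry["ts"], "%Y-%m-%d %H:%M:%S.%f")
    entry["thread_id"] = int(entry["thread_id"])
    return entry


entry = parse_log_line("2017-01-10 09:37:09.768 LL_I D [1:main] app-init/load-me")
print(entry["ts"], entry["thread"], entry["message"])
```

Returning None for non-matching lines makes it easy to spot entries that were wrapped or truncated when the log was exported.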
Some models of Android mobile devices may store WhatsApp artifacts in a different location. This is due to the change in the application data storage space by the system software of the mobile device. So, for example, in Xiaomi mobile devices there is a function to create a second workspace (“SecondSpace”). When this function is activated, the location of the data is changed. So, if in a regular mobile device running the Android OS, user data is stored in the directory ‘/data/user/0/’ (which is a reference to the usual ‘/data/data/’ ), then in the second workspace, application data is stored in the directory ‘/data/user/10/’ . That is, for example, the location of the file 'wa.db' :
whatsapp/databases/wa.db' (which is equivalent to '/data/data/com.whatsapp/databases/wa.db');
Unlike Android OS, in iOS, WhatsApp data is included in the backup copy (iTunes backup). Therefore, extracting data from this application does not require extracting the file system or creating a physical memory dump of the device under investigation. Most of the relevant information is contained in the database ‘ChatStorage.sqlite’, which is located along the path: ‘/private/var/mobile/Applications/group.net.whatsapp.WhatsApp.shared/’ (in some programs this path appears as ‘AppDomainGroup-group.net.whatsapp.WhatsApp.shared’).
Structure ‘ChatStorage.sqlite’ :
The most informative tables in the database 'ChatStorage.sqlite' are 'ZWAMESSAGE' and 'ZWAMEDIAITEM'.
Table layout ‘ZWAMESSAGE’ :
Table structure ‘ZWAMESSAGE’
| Field name | Meaning |
|---|---|
| Z_PK | sequence number of the record (in the SQL table) |
| Z_ENT | table identifier, value is '9' |
| Z_OPT | unknown, usually contains values from '1' to '6' |
| ZCHILDMESSAGESDELIVEREDCOUNT | unknown, usually contains the value '0' |
| ZCHILDMESSAGESPLAYEDCOUNT | unknown, usually '0' |
| ZCHILDMESSAGESREADCOUNT | unknown, usually '0' |
| ZDATAITEMVERSION | unknown, usually contains value '3', probably text message pointer |
| ZDOCID | unknown |
| ZENCRETRYCOUNT | unknown, usually '0' |
| ZFILTEREDRECIPIENTCOUNT | unknown, usually contains the values '0', '2', '256' |
| ZISFROMME | message direction: '0' - incoming, '1' - outgoing |
| ZMESSAGEERRORSTATUS | message transfer status. If the message is sent/received, it has the value '0' |
| ZMESSAGETYPE | message type to be transmitted |
| ZSORT | unknown |
| ZSPOTLIGHTSTATUS | unknown |
| ZSTARRED | unknown, not used |
| ZCHATSESSION | unknown |
|  | unknown, not used |
| ZLASTSESSION | unknown |
| ZMEDIAITEM | unknown |
| ZMESSAGEINFO | unknown |
| ZPARENTMESSAGE | unknown, not used |
| ZMESSAGEDATE | timestamp in OS X Epoch Time format |
| ZSENTDATE | the time the message was sent in OS X Epoch Time format |
| ZFROMJID | WhatsApp Sender ID |
| ZMEDIASECTIONID | contains the year and month the media file was sent |
| ZPHASH | unknown, not used |
| ZPUSHNAME | name of the contact who sent the media file in UTF-8 format |
| ZSTANZAID | unique message identifier |
| ZTEXT | message text |
| ZTOJID | WhatsApp Recipient ID |
| OFFSET | offset |
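The fields above can be read with any SQLite client. The following is a minimal Python sketch, assuming you work on a copy of 'ChatStorage.sqlite' and that the columns behave as described in the table. The helper names `mac_time_to_datetime` and `read_messages` are hypothetical and chosen only for illustration. OS X Epoch Time counts seconds from 1 January 2001 (UTC), so the timestamps need converting before they are readable.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# OS X Epoch Time counts seconds from 2001-01-01 00:00:00 UTC.
MAC_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def mac_time_to_datetime(seconds):
    """Convert an OS X Epoch Time value to a timezone-aware UTC datetime."""
    return MAC_EPOCH + timedelta(seconds=seconds)


def read_messages(conn):
    """Return messages as dicts, oldest first (hypothetical helper).

    Only columns documented in the ZWAMESSAGE table above are used.
    """
    rows = conn.execute(
        "SELECT ZISFROMME, ZMESSAGEDATE, ZFROMJID, ZTOJID, ZTEXT "
        "FROM ZWAMESSAGE ORDER BY ZMESSAGEDATE"
    )
    messages = []
    for is_from_me, date, from_jid, to_jid, text in rows:
        outgoing = is_from_me == 1
        messages.append({
            "direction": "outgoing" if outgoing else "incoming",
            "when": mac_time_to_datetime(date) if date is not None else None,
            # For outgoing messages the other party is the recipient.
            "peer": to_jid if outgoing else from_jid,
            "text": text,
        })
    return messages
```

A typical call would be `read_messages(sqlite3.connect("ChatStorage.sqlite"))`, run against a working copy rather than the original evidence file.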
Table appearance ‘ZWAMEDIAITEM’ :
Table structure ‘ZWAMEDIAITEM’
| Field name | Meaning |
|---|---|
| Z_PK | record number (in SQL table) |
| Z_ENT | table identifier, value is '8' |
| Z_OPT | unknown, usually contains values from '1' to '3' |
| ZCLOUDSTATUS | contains the value '4' if the file is loaded. |
| ZFILESIZE | contains the file length (in bytes) for uploaded files |
| ZMEDIAORIGIN | unknown, usually '0' |
| ZMOVIEDURATION | duration of the media file, for pdf files it can contain the number of pages of the document |
| ZMESSAGE | contains a serial number (number differs from the one shown in the 'Z_PK' column) |
| ZASPECTRATIO | aspect ratio, not used, usually set to '0' |
| | unknown, usually '0' |
| ZLATITUDE | width in pixels |
| ZLONGITUDE | height in pixels |
| ZMEDIAURLDATE | timestamp in OS X Epoch Time format |
| | author (for documents, may contain file name) |
| ZCOLLECTIONNAME | not used |
| ZMEDIALOCALPATH | file name (with path) in the file system of the device |
| ZMEDIAURL | The URL where the media file was located. If the file was transferred from one subscriber to another, it was encrypted, and its extension will be indicated as the extension of the transferred file - .enc |
| ZTHUMBNAILLOCALPATH | path to the file thumbnail in the device file system |
| ZTITLE | file header |
| ZVCARDNAME | hash of the media file, when transferring a file to a group, it may contain the sender ID |
| ZVCARDSTRING | contains information about the type of file being transferred (for example, image/jpeg), when transferring a file to a group, it may contain the recipient identifier |
| ZXMPPTHUMBPATH | path to file thumbnail in device file system |
| ZMEDIAKEY | unknown, probably contains the key to decrypt the encrypted file. |
| ZMETADATA | metadata of the transmitted message |
| Offset | offset |
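As a rough illustration of how the 'ZWAMEDIAITEM' fields can be used, the sketch below lists media records and flags those whose URL ends in '.enc', which the table above associates with encrypted transfers. It assumes a working copy of the database. The function name `list_media` and the dictionary keys are hypothetical.

```python
import sqlite3


def list_media(conn):
    """List media records from ZWAMEDIAITEM (hypothetical helper).

    Uses only the columns documented in the table above.
    """
    rows = conn.execute(
        "SELECT Z_PK, ZFILESIZE, ZMEDIALOCALPATH, ZMEDIAURL "
        "FROM ZWAMEDIAITEM ORDER BY Z_PK"
    )
    items = []
    for pk, size, local_path, url in rows:
        items.append({
            "record": pk,
            "size_bytes": size,
            "local_path": local_path,
            "url": url,
            # Per the table, transferred files carry the .enc extension.
            "encrypted_transfer": bool(url) and url.lower().endswith(".enc"),
        })
    return items
```

Records with a populated `local_path` point to files that may still exist in the device file system, so they are usually the first ones worth checking.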
Other interesting tables in the 'ChatStorage.sqlite' database are:
Also, when examining WhatsApp on an iOS mobile device, pay attention to the following files:
net.whatsapp.WhatsApp.shared/ .
The contents of the ‘group.net.whatsapp.WhatsApp.shared.plist’ file
You also need to pay attention to the following directories:
whatsapp.WhatsApp/Documents/’. Contains the program operation log (file ‘calls.log’) and backup copies of the program operation logs (file ‘calls.backup.log’).
On Windows, WhatsApp artifacts can be found in several places. First of all, these are the directories containing the executable and auxiliary files of the program (for Windows 8/10):
The directory ‘C:\Users\%User profile%\AppData\Local\WhatsApp\’ contains the log file ‘SquirrelSetup.log’, which contains information about checking for updates and installing the program.
The directory ‘C:\Users\%User profile%\AppData\Roaming\WhatsApp\’ contains several subdirectories:
File ‘main-process.log’ contains information about the operation of WhatsApp.
Subdirectory 'databases' contains a file 'Databases.db' , but this file does not contain any information about chats or contacts.
The most interesting from a forensic point of view are the files located in the directory ‘Cache’ . These are mainly files named ‘f_*******’ (where * is a number from 0 to 9) containing encrypted multimedia files and documents, but there are also unencrypted files among them. Of particular interest are the files 'data_0' , 'data_1' , 'data_2' , 'data_3' , located in the same subdirectory. Files 'data_0' , 'data_1' , 'data_3' contain external links to the transmitted encrypted multimedia files and documents.
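To triage a copy of the ‘Cache’ directory, a short script can separate the ‘f_*’ files from the ‘data_0’ to ‘data_3’ files. This is only a sketch: the function name `classify_cache` is hypothetical, and the pattern accepts hexadecimal digits after ‘f_’ as an assumption, which is slightly broader than the 0 to 9 range described above.

```python
import re
from pathlib import Path

# Assumption: names are "f_" followed by digits; hex digits are also accepted.
CACHE_FILE = re.compile(r"^f_[0-9a-fA-F]+$")
DATA_FILES = {"data_0", "data_1", "data_2", "data_3"}


def classify_cache(cache_dir):
    """Group file names in a Cache directory copy (hypothetical helper)."""
    result = {"media": [], "data": [], "other": []}
    for entry in sorted(Path(cache_dir).iterdir()):
        if not entry.is_file():
            continue
        if CACHE_FILE.match(entry.name):
            result["media"].append(entry.name)
        elif entry.name in DATA_FILES:
            result["data"].append(entry.name)
        else:
            result["other"].append(entry.name)
    return result
```

The "media" group corresponds to the possibly encrypted multimedia files and documents, and the "data" group to the files that hold links, images and avatars.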
Example of information contained in file 'data_1'
File 'data_3' may also contain image files.
File 'data_2' contains contact avatars (they can be retrieved by searching for file headers).
Avatars contained in the file ‘data_2’ :
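Retrieving images by searching for file headers is often called carving. The sketch below shows the idea for JPEG data only, on the assumption that the avatars are stored as JPEG images, which the text above does not state. The function name `carve_jpegs` is hypothetical. The approach is deliberately naive: it cuts at the first end marker, so images with embedded thumbnails may come out truncated.

```python
JPEG_START = b"\xff\xd8\xff"  # JPEG start-of-image marker plus next marker byte
JPEG_END = b"\xff\xd9"        # JPEG end-of-image marker


def carve_jpegs(blob):
    """Return byte strings that look like JPEG images inside blob.

    Naive signature search (hypothetical helper): each result runs from a
    start marker to the first end marker that follows it.
    """
    images = []
    pos = 0
    while True:
        start = blob.find(JPEG_START, pos)
        if start == -1:
            break
        end = blob.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)
        images.append(blob[start:end])
        pos = end
    return images
```

For a copy of 'data_2', the call would be `carve_jpegs(open("data_2", "rb").read())`, after which each result can be written to its own file for review.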
Thus, chats themselves cannot be found in the computer's memory, but you can find:
On macOS, you can find the same types of WhatsApp artifacts as on Windows.
Program files are located in the following directories:
In the following articles in this series:
Decryption of WhatsApp encrypted databases
An article that will provide information on how to generate a WhatsApp encryption key and give practical examples showing how to decrypt encrypted WhatsApp databases.
Retrieve WhatsApp data from cloud storage
An article that will explain what WhatsApp data is stored in the clouds and describe methods for extracting this data from cloud storage.
WhatsApp data extraction: practical examples
An article that will describe, step by step, which programs to use and how to extract WhatsApp data from various devices.
One of the reasons WhatsApp has become one of the most popular messaging services is its strong security. It encrypts messages end to end, so the only people who can read them are the sender and the recipient, unless someone else can unlock the sender's or recipient's phone.
But sometimes even the phone owner cannot access their phone because of a technical failure. If you can't access your own phone, can you still read encrypted WhatsApp messages?
In September 2012, WhatsApp introduced data encryption as a security feature. This step was taken to prevent the session hijacking and packet sniffing that often happened in the past. WhatsApp uses the crypt2, crypt5, crypt7, crypt8 and crypt12 formats to encrypt its database files. This means that breaking into the database files to read chat messages has become almost impossible.
But there are tricks you can use to decrypt the database without keys and supporting files. You can use this method to access your conversations.
The trick below works for reading encrypted WhatsApp messages on Android devices. Before you start, make a copy of your WhatsApp database so that you do not damage the original file.
To do this, open a file browser on your Android device and create a new folder on the SD card. Then navigate to this location on your SD card: /WhatsApp/Databases/msgstore.db.crypt. Copy the msgstore.db.crypt files to the new folder you just created.
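If the phone storage is mounted on a computer, or the Databases folder has already been pulled from the device, the same copy step can be scripted. This is a minimal sketch under that assumption. The function name `copy_backups` and the file name pattern are hypothetical, chosen to match names such as msgstore.db.crypt and msgstore.db.crypt12.

```python
import shutil
from pathlib import Path


def copy_backups(source_dir, work_dir):
    """Copy WhatsApp backup files into a working folder (hypothetical helper).

    The originals are only read, never modified.
    """
    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    copied = []
    # Assumed pattern: msgstore*.db.crypt followed by an optional version.
    for src in sorted(Path(source_dir).glob("msgstore*.db.crypt*")):
        if src.is_file():
            shutil.copy2(src, work / src.name)  # copy2 keeps timestamps
            copied.append(src.name)
    return copied
```

Working on the copies means a failed decryption attempt cannot damage the only backup you have.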
WhatsApp encrypts all data in .crypt5/7/8/12 format. But on a rooted Android phone, you can easily decrypt and read these encrypted messages with WhatsApp Viewer.
Locate the backup file of your WhatsApp messages, such as msgstore.db.crypt12, in the device storage under /WhatsApp/Databases.
Find the key file, which contains the decryption key for the encrypted backup, at /data/data/com.whatsapp/files/key.
Download and install WhatsApp Viewer on your computer. Open WhatsApp Viewer and navigate to File > Decrypt .crypt12.
Now you need to load the database file and the key file. Click on the "..." button next to the database file field to import it and do the same for the key file. After that, click OK to decrypt the database file.
When you see the message "Database has been decrypted to msgstore.decrypted.db", decryption is complete. You will find a file called "msgstore.decrypted.db" in the folder where you stored the database file and the key file.
Launch WhatsApp Viewer again and click File > Open. Click on the "..." button to import the msgstore.decrypted.db file and click OK.
You can now select a mobile phone number in the right panel and view its chats in the left panel. You can export them in .txt / .html / .json format if you like.
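Because the menu entry to use depends on the backup format, it can help to check the crypt version from the file name before starting. The following is a small sketch of that check. The function name `crypt_version` is hypothetical, and returning 0 for an unnumbered .crypt file is just a convention of this example.

```python
import re

# Matches a trailing ".crypt" with an optional version number.
CRYPT_SUFFIX = re.compile(r"\.crypt(\d*)$")


def crypt_version(filename):
    """Return the crypt version from a backup file name (hypothetical helper).

    Returns None when the name does not end in .crypt or .cryptN, and 0 as
    a placeholder for the unnumbered .crypt format.
    """
    match = CRYPT_SUFFIX.search(filename)
    if match is None:
        return None
    digits = match.group(1)
    return int(digits) if digits else 0
```

For example, `crypt_version("msgstore.db.crypt12")` gives 12, which corresponds to the File > Decrypt .crypt12 entry used in the steps above.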
To decrypt the database into something human-readable, you can also use one of the decryption apps available on the Google Play store. The recommended application is Omni-crypt, which can decrypt the WhatsApp database without root. Note that to decrypt a database newer than crypt6, you will need WhatsApp-Key-DB-Extractor to extract the encryption key.
com. Open the WhatsApp-Key-DB-Extractor folder and find the file named WhatsAppKeyDBExtract.sh . Right-click on it and select Properties.
On the Permissions tab, select the Allow executing file as a program check box.
After that, run the WhatsAppKeyDBExtract.sh file in Terminal on your Mac.
When you are prompted to unlock your device and confirm the backup operation, open your Android phone and tap BACKUP MY DATA.
Wait for WhatsAppKeyDBExtract to restore WhatsApp and press Enter when finished to exit the Terminal.
Now open Omni-crypt on your Android phone. Click on ENABLE CRYPT BACKUP 6-12 and then click on WHATSAPP database DECRYPTION.
Now open the WhatsApp-Key-DB-Extractor folder and navigate to the extracted folder. Here you can see the 'msgstore.db' and 'wa.db' files. 'msgstore.db' stores all messages along with attachments, while 'wa.db' stores all information related to contacts.
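Once 'msgstore.db' and 'wa.db' are available in decrypted form, they are ordinary SQLite files. The sketch below lists the tables in such a file together with their row counts, without assuming any particular WhatsApp schema. The function name `list_tables` is hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for a SQLite file (hypothetical helper).

    Note: sqlite3.connect creates an empty file if db_path does not exist,
    so check the path first when that matters.
    """
    conn = sqlite3.connect(db_path)
    try:
        names = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        counts = {}
        for (name,) in names:
            # Quote the identifier so unusual table names are handled safely.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Running `list_tables("msgstore.db")` gives a quick overview of where the bulk of the data sits before you open the file in a viewer.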
That's all about how to read encrypted WhatsApp messages. These steps are a bit complicated for normal users. If you are looking for a way to read deleted WhatsApp messages, Tenorshare UltData WhatsApp Recovery offers you an easy way to recover WhatsApp messages and contacts from Android without root.
Step 1 After you have downloaded and installed the software, open it to reach the main interface.
Step 2 Then log in and enable USB debugging on your Android phone so that it connects properly.
Step 3 Now scan the device and review the data that is listed. Choose what you want to recover.
Step 4 Finally, save the files to your computer or device and take a look at them.