Following on from data acquisition, the next step is to conduct the actual forensic analysis. There are many analyses available; the most common quick ones are outlined below.
1. Deletion Analysis
This is one of the most common analyses, required in almost every kind of case. It can normally be achieved easily by leveraging the built-in functionality of forensic software. Depending on the custodian's OS version, the storage device type and the forensic software used, the high-level results, such as the number of files recovered, can vary considerably. Deletion analysis may also be of limited use in some situations, for example on SSDs (where TRIM typically wipes deleted blocks) or on certain Linux file systems.
Deletion analysis is also possible in mobile forensics, but it is subject to the level of data access available to the examiner and to the specific mobile device model. A minimal sketch of the underlying idea follows.
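As an illustration of what forensic tools do under the hood, here is a minimal sketch using The Sleuth Kit's Python bindings (pytsk3) to enumerate deleted directory entries in a raw disk image. The image path is a hypothetical placeholder, and commercial suites handle this far more robustly (recovering content, handling partitions, and so on):

```python
# Minimal sketch: enumerate deleted directory entries in a disk image
# using pytsk3 (The Sleuth Kit bindings). The image path is hypothetical.
import pytsk3

IMAGE_PATH = "evidence/custodian01.dd"  # hypothetical raw (dd) image

img = pytsk3.Img_Info(IMAGE_PATH)
fs = pytsk3.FS_Info(img)  # assumes the file system starts at offset 0

def walk(directory, parent="/"):
    for entry in directory:
        name = entry.info.name.name.decode("utf-8", "replace")
        if name in (".", ".."):
            continue
        meta = entry.info.meta
        # Unallocated metadata usually indicates a deleted file.
        if meta and meta.flags & pytsk3.TSK_FS_META_FLAG_UNALLOC:
            print(f"deleted: {parent}{name} ({meta.size} bytes)")
        # Recurse into allocated sub-directories.
        if (meta and meta.type == pytsk3.TSK_FS_META_TYPE_DIR
                and meta.flags & pytsk3.TSK_FS_META_FLAG_ALLOC):
            try:
                walk(entry.as_directory(), f"{parent}{name}/")
            except OSError:
                pass  # some entries cannot be opened

walk(fs.open_dir(path="/"))
```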
2. Signature Analysis
One of the most common ways to hide data files from scanning is to alter the file extension, for example disguising an Excel file as a text file by changing the extension from xlsx to txt. This can affect file extraction (if it relies on file type) and the subsequent keyword search in e-Discovery, or any other downstream forensic review process. However, the extension is not the only way to identify a file's type: most file formats begin with a file header, or signature, that tells the system what type of file it is. Signature analysis confirms whether the file header/signature matches the extension and identifies the file's likely true identity.
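To make this concrete, here is a minimal sketch of a signature check in Python, comparing a file's leading "magic" bytes against its extension. The signature table here is tiny and illustrative; real forensic tools carry thousands of signatures, and the example file name is hypothetical:

```python
# Minimal sketch of a signature check: compare a file's leading "magic"
# bytes against its extension.
import os

SIGNATURES = {
    b"\x50\x4b\x03\x04": {".xlsx", ".docx", ".pptx", ".zip"},  # ZIP container
    b"\x25\x50\x44\x46": {".pdf"},                             # "%PDF"
    b"\xff\xd8\xff":     {".jpg", ".jpeg"},                    # JPEG
    b"\xd0\xcf\x11\xe0": {".xls", ".doc", ".ppt"},             # legacy Office (OLE2)
}

def check_signature(path):
    """Return a mismatch message, or None if header and extension agree."""
    ext = os.path.splitext(path)[1].lower()
    with open(path, "rb") as f:
        header = f.read(8)
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            if ext not in extensions:
                return f"{path}: header suggests {extensions}, extension is {ext}"
            return None
    return None  # unknown signature; a real tool would flag this too

# An xlsx renamed to txt would be reported as a mismatch.
print(check_signature("report.txt"))  # hypothetical file
```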
3. Hash Analysis
In general computer usage, files may be duplicated for backup purposes, or may be known to carry no risk because they are in fact standard system files. Cryptographic hash functions can help identify both. According to Wikipedia, "a cryptographic hash function is a hash function which is considered practically impossible to invert, that is, to recreate the input data from its hash value alone." MD5 is one of the most commonly used hash functions for data integrity verification. If two files have the same hash value, they are generally confirmed and accepted to be identical in content. For the zero-risk files, we can leverage the National Software Reference Library (NSRL), a project which provides a Reference Data Set (RDS) of files from most known and traceable software applications. By comparing file hashes with each other and against the NSRL list, the review population can be reduced effectively.
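A minimal sketch of this workflow in Python is shown below: MD5 each file, drop exact duplicates, and drop files whose hash appears in a known-good list. The extraction directory and the one-hash-per-line NSRL export file are hypothetical placeholders; the real RDS is distributed in its own formats:

```python
# Minimal sketch of hash analysis: MD5 each file, drop exact duplicates,
# and drop files whose hash appears in a known-good (NSRL-style) list.
import hashlib
from pathlib import Path

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not exhaust memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest().upper()

# Hypothetical export of NSRL RDS hashes, one MD5 per line.
known_good = set(Path("nsrl_md5.txt").read_text().split())

seen = set()
review_population = []
for path in Path("extracted_files").rglob("*"):  # hypothetical extraction dir
    if not path.is_file():
        continue
    digest = md5_of(path)
    if digest in known_good:
        continue  # standard application/system file: zero risk
    if digest in seen:
        continue  # duplicate content already in the review set
    seen.add(digest)
    review_population.append(path)

print(f"{len(review_population)} unique, non-NSRL files remain for review")
```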
4. Keyword Search
There are a number of ways to perform analytics on the acquired data, and keyword search is probably the most common. The basic idea is similar to searching in Google: input a keyword and review the search results. There are plenty of ways to run a keyword search, such as running it within the forensic software, or extracting the files and running Windows search over them. The most effective, traceable and auditable way is to load the data in scope into an e-Discovery platform for search and review. When loading data for search and filtering, note that not all data normally has to be loaded: advanced analytics and filtering processes, such as filtering by file type or date, or applying analytics to user deletion activity, can trim down the data size before loading, so that the subsequent keyword search identifies a high-risk data population for review.
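As a rough illustration of the core idea, here is a minimal sketch of a keyword search over a folder of extracted text, printing each hit with a short context snippet. The directory and search terms are hypothetical; in practice this runs on an e-Discovery platform with proper text extraction, indexing and audit trails:

```python
# Minimal sketch of a keyword search across extracted text files,
# reporting each hit with a short context snippet.
import re
from pathlib import Path

KEYWORDS = ["invoice", "kickback", "offshore"]  # hypothetical search terms
pattern = re.compile("|".join(map(re.escape, KEYWORDS)), re.IGNORECASE)

for path in Path("extracted_text").rglob("*.txt"):  # hypothetical text dump
    text = path.read_text(errors="replace")
    for match in pattern.finditer(text):
        start = max(match.start() - 30, 0)
        snippet = text[start:match.end() + 30].replace("\n", " ")
        print(f"{path}: '{match.group()}' ...{snippet}...")
```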
Please note that the above is only a quick overview of the most common tasks for general investigation purposes. There are in fact many more analyses available for deep-dive investigation. I will share more on this in the near future, with some real-life examples.
Previous Step | Next Step