Following on from data acquisition, the next step is to conduct the actual forensic analysis. There are many analyses available; the most common quick ones are outlined below.
1. Deletion Analysis
This is one of the most common analyses, required in almost every kind of case. It can normally be achieved easily by leveraging the built-in functionality of forensic software. Depending on the custodian's OS version, the storage device type and the forensic software used, the high-level results, such as the number of files recovered, can vary considerably. Deletion analysis may also be of limited use in some situations, for example on SSDs (where TRIM typically wipes deleted blocks) or on certain Linux file systems.
Deletion analysis is also possible in mobile forensics, but it is subject to the level of data access available to the examiner and to the specific mobile device model. A minimal sketch of the underlying idea follows.
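As an illustration of what forensic tools do under the hood, here is a minimal sketch using The Sleuth Kit's Python bindings (pytsk3) to enumerate deleted directory entries in a raw disk image. The image path is a hypothetical placeholder, and commercial suites handle this far more robustly (recovering content, handling partitions, and so on):

```python
# Minimal sketch: enumerate deleted directory entries in a disk image
# using pytsk3 (The Sleuth Kit bindings). The image path is hypothetical.
import pytsk3

IMAGE_PATH = "evidence/custodian01.dd"  # hypothetical raw (dd) image

img = pytsk3.Img_Info(IMAGE_PATH)
fs = pytsk3.FS_Info(img)  # assumes the file system starts at offset 0

def walk(directory, parent="/"):
    for entry in directory:
        name = entry.info.name.name.decode("utf-8", "replace")
        if name in (".", ".."):
            continue
        meta = entry.info.meta
        # Unallocated metadata usually indicates a deleted file.
        if meta and meta.flags & pytsk3.TSK_FS_META_FLAG_UNALLOC:
            print(f"deleted: {parent}{name} ({meta.size} bytes)")
        # Recurse into allocated sub-directories.
        if (meta and meta.type == pytsk3.TSK_FS_META_TYPE_DIR
                and meta.flags & pytsk3.TSK_FS_META_FLAG_ALLOC):
            try:
                walk(entry.as_directory(), f"{parent}{name}/")
            except OSError:
                pass  # some entries cannot be opened

walk(fs.open_dir(path="/"))
```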
2. Signature Analysis
One of the most common ways to hide data files from scanning is to alter the file extension, for example disguising an Excel file as a text file by changing the extension from xlsx to txt. This can affect file extraction (if it relies on file type) and the subsequent keyword search in e-Discovery, or any other downstream forensic review process. However, the extension is not the only way to identify a file's type: most file formats begin with a file header, or signature, that tells the system what type of file it is. Signature analysis confirms whether the file header/signature matches the extension and identifies the file's likely true identity.
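To make this concrete, here is a minimal sketch of a signature check in Python, comparing a file's leading "magic" bytes against its extension. The signature table here is tiny and illustrative; real forensic tools carry thousands of signatures, and the example file name is hypothetical:

```python
# Minimal sketch of a signature check: compare a file's leading "magic"
# bytes against its extension.
import os

SIGNATURES = {
    b"\x50\x4b\x03\x04": {".xlsx", ".docx", ".pptx", ".zip"},  # ZIP container
    b"\x25\x50\x44\x46": {".pdf"},                             # "%PDF"
    b"\xff\xd8\xff":     {".jpg", ".jpeg"},                    # JPEG
    b"\xd0\xcf\x11\xe0": {".xls", ".doc", ".ppt"},             # legacy Office (OLE2)
}

def check_signature(path):
    """Return a mismatch message, or None if header and extension agree."""
    ext = os.path.splitext(path)[1].lower()
    with open(path, "rb") as f:
        header = f.read(8)
    for magic, extensions in SIGNATURES.items():
        if header.startswith(magic):
            if ext not in extensions:
                return f"{path}: header suggests {extensions}, extension is {ext}"
            return None
    return None  # unknown signature; a real tool would flag this too

# An xlsx renamed to txt would be reported as a mismatch.
print(check_signature("report.txt"))  # hypothetical file
```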
3. Hash Analysis
In general computer usage, files may be duplicated for backup purposes, or may be known to carry no risk because they are in fact standard system files. Cryptographic hash functions can help identify both. According to Wikipedia, "a cryptographic hash function is a hash function which is considered practically impossible to invert, that is, to recreate the input data from its hash value alone." MD5 is one of the most commonly used hash functions for data integrity verification. If two files have the same hash value, they are generally confirmed and accepted to be identical in content. For the zero-risk files, we can leverage the National Software Reference Library (NSRL), a project which provides a Reference Data Set (RDS) of files from most known and traceable software applications. By comparing file hashes with each other and against the NSRL list, the review population can be reduced effectively.
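A minimal sketch of this workflow in Python is shown below: MD5 each file, drop exact duplicates, and drop files whose hash appears in a known-good list. The extraction directory and the one-hash-per-line NSRL export file are hypothetical placeholders; the real RDS is distributed in its own formats:

```python
# Minimal sketch of hash analysis: MD5 each file, drop exact duplicates,
# and drop files whose hash appears in a known-good (NSRL-style) list.
import hashlib
from pathlib import Path

def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not exhaust memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest().upper()

# Hypothetical export of NSRL RDS hashes, one MD5 per line.
known_good = set(Path("nsrl_md5.txt").read_text().split())

seen = set()
review_population = []
for path in Path("extracted_files").rglob("*"):  # hypothetical extraction dir
    if not path.is_file():
        continue
    digest = md5_of(path)
    if digest in known_good:
        continue  # standard application/system file: zero risk
    if digest in seen:
        continue  # duplicate content already in the review set
    seen.add(digest)
    review_population.append(path)

print(f"{len(review_population)} unique, non-NSRL files remain for review")
```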
4. Keyword Search
There are a number of ways to perform analytics on the acquired data, and keyword search is probably the most common. The basic idea is similar to searching in Google: input a keyword and review the search results. There are plenty of ways to run a keyword search, such as running it within the forensic software, or extracting the files and running Windows search over them. The most effective, traceable and auditable way is to load the data in scope into an e-Discovery platform for search and review. When loading data for search and filtering, note that not all data normally has to be loaded: advanced analytics and filtering processes, such as filtering by file type or date, or applying analytics to user deletion activity, can trim down the data size before loading, so that the subsequent keyword search identifies a high-risk data population for review.
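As a rough illustration of the core idea, here is a minimal sketch of a keyword search over a folder of extracted text, printing each hit with a short context snippet. The directory and search terms are hypothetical; in practice this runs on an e-Discovery platform with proper text extraction, indexing and audit trails:

```python
# Minimal sketch of a keyword search across extracted text files,
# reporting each hit with a short context snippet.
import re
from pathlib import Path

KEYWORDS = ["invoice", "kickback", "offshore"]  # hypothetical search terms
pattern = re.compile("|".join(map(re.escape, KEYWORDS)), re.IGNORECASE)

for path in Path("extracted_text").rglob("*.txt"):  # hypothetical text dump
    text = path.read_text(errors="replace")
    for match in pattern.finditer(text):
        start = max(match.start() - 30, 0)
        snippet = text[start:match.end() + 30].replace("\n", " ")
        print(f"{path}: '{match.group()}' ...{snippet}...")
```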
Please note that the above is only a quick overview of the most common tasks for general investigation purposes. There are in fact many more analyses available for deep-dive investigation. I will share more on this in the near future, with some real-life examples.
Previous Step | Next Step