Sunday, January 10, 2016

Data Analytics - Analytics Cycle II: Analytics Rules

Following up on the previous post, we now come to step 2 of the Analytics Cycle, Analytics Rules, which begins once all the ETL work is done.

Analytics rules are the core of the cycle: they define how the analysis will be carried out and what it will cover.  The ultimate goal is to analyze the prepared data set and produce insights.  For example, in customer analytics, rules are set to classify profitable and non-profitable customers into different segments, or to track trends in customer behaviour for prediction purposes.  In my experience, analytics rules can be classified into two major types: 1. ongoing alert-based; and 2. look-back review-based.

Ongoing alert-based rules are usually applied in transaction monitoring.  An alert is triggered by pre-set criteria, for example any transaction amount larger than 1M.  Examples include data screening in forensic investigations, AML transaction monitoring systems, staff / customer behaviour monitoring, spam email filtering, etc.
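To make the idea concrete, here is a minimal sketch of an alert-based rule in Python.  It only illustrates the "amount larger than 1M" example above; the Transaction class, the find_alerts function and the sample records are hypothetical names and made-up data, not part of any real monitoring system.

```python
from dataclasses import dataclass

# Pre-set criterion from the example above: flag amounts larger than 1M.
ALERT_THRESHOLD = 1_000_000


@dataclass
class Transaction:
    # Hypothetical record layout, assumed for illustration only.
    txn_id: str
    amount: float


def find_alerts(transactions, threshold=ALERT_THRESHOLD):
    """Return the transactions whose amount is strictly above the threshold."""
    return [t for t in transactions if t.amount > threshold]


# Made-up sample data.
transactions = [
    Transaction("T1", 250_000),
    Transaction("T2", 1_500_000),
    Transaction("T3", 1_000_000),  # exactly 1M is not "larger than 1M"
]

alerts = find_alerts(transactions)
for t in alerts:
    print(f"ALERT: transaction {t.txn_id} amount {t.amount:,.0f}")
```

In a real system the rule would run continuously against incoming transactions and would usually combine several criteria, but the principle of comparing each record against pre-set conditions is the same.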

Look-back review-based rules review the historical data and then produce suggestions, insights and fact-findings.  Examples include revenue optimization, predictive modelling, business review, assisted review in e-Discovery, etc.
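As a rough illustration of a look-back rule, the sketch below fits a straight trend line to historical figures by ordinary least squares and extends it one period forward.  The fit_trend function and the monthly revenue numbers are assumptions made up for this example; real predictive modelling would normally use richer models and real data.

```python
def fit_trend(values):
    """Fit y = slope * x + intercept by least squares, with x = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up historical monthly revenue figures.
monthly_revenue = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]

slope, intercept = fit_trend(monthly_revenue)
# Extend the fitted line to the next (unseen) month.
next_month = slope * len(monthly_revenue) + intercept
print(f"trend per month: {slope:.1f}, next month estimate: {next_month:.1f}")
```

The point is only that a look-back rule works on data that has already been collected and turns it into a finding, here a trend and a simple projection.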

Regardless of the type of analytics rules, this step is carried out using various mathematical methodologies, ranging from simple statistics with data visualization (such as bar charts, pie charts, trend lines, etc.) to advanced modelling techniques (such as back-propagation, self-organizing maps, regression analysis, probability, etc.).

There are plenty of tools that can help in implementing analytics rules.  Below are some that I have experience with:

1.      Database Management Systems (DBMS) and related tools, such as SQL Server, MySQL, Oracle, MS Access and Excel, allow us to conduct analytics through SQL programming or built-in functions and stored procedures.  The major advantage is that a DBMS is very flexible, and we can freely run any analysis we want at the raw data level.  The disadvantage is that it requires in-depth programming skills and a certain amount of IT knowledge before any analytics skills come into play.

2.      Data Visualization Applications, such as Tableau, QlikView and i2 Analyst's Notebook, allow you to create attractive layouts of the data for presentation and to gain an overview and insights from various visual effects.  Outliers in the data set can be easily identified, and the major advantage is that these tools can be operated with some simple scripting or just drag and drop.  The disadvantage is that too much data can overwhelm the chart: for example, it can become a mess when presenting 100,000+ entities in a network diagram, or when working with a data set with too many dimensions.

3.      Predictive Modelling Platforms, such as Viscovery SOMine and the Assisted Review functions in Relativity, allow you to leverage advanced mathematical formulas without requiring a detailed understanding of how the equations work.  The major advantage is that the user can operate the tool simply by identifying and setting up the training set.  The disadvantage is that it is hard to tell whether the tool is working correctly, as the underlying theory is likely to be a black box for most ordinary users at the application level.
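To give a flavour of the first option in the list, the DBMS approach, here is a small sketch that uses SQL to segment customers into profitable and non-profitable groups, as in the customer analytics example earlier.  It uses Python's built-in sqlite3 module with an in-memory database purely for convenience; the orders table, its columns and the figures are all hypothetical.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table layout and made-up figures, for illustration only.
conn.execute("CREATE TABLE orders (customer TEXT, revenue REAL, cost REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("A", 100, 60),
        ("A", 50, 40),
        ("B", 30, 45),
        ("C", 20, 20),
    ],
)

# The analytics rule: a customer is 'profitable' when total profit is above zero.
rows = conn.execute(
    """
    SELECT customer,
           SUM(revenue - cost) AS profit,
           CASE WHEN SUM(revenue - cost) > 0
                THEN 'profitable'
                ELSE 'non-profitable'
           END AS segment
    FROM orders
    GROUP BY customer
    ORDER BY customer
    """
).fetchall()

for customer, profit, segment in rows:
    print(customer, profit, segment)

conn.close()
```

The same query shape would carry over to SQL Server, MySQL or Oracle with little change, which is the flexibility mentioned above, although it does assume some comfort with SQL.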
