Originally posted by Jonathan Mac Data Mining is simply getting the data you want from one or more databases and it's perfectly legitimate, not associated with hacks or illegal activity in any way. I do it every day at work.
Morning Jonathan, you are correct that Data Mining is a generic term for an analytical process that looks for correlations or patterns in large data sets for the purpose of data or knowledge discovery. This approach is perfectly legal and a normal everyday activity. However, there is an area of Systems Engineering referred to as Information Assurance (IA), which is the practice of assuring information and managing risks related to the use, processing, storage, and transmission of information or data, and the systems and processes used for those purposes.
Let's say that you are designing an entire system architecture for a bank. Consider a couple of scenarios.
- Tellers need to know whether an account has sufficient funds so the customer standing in front of them can make a withdrawal. Should the system show the teller that the little old lady at the counter has $7,381,936.12 in her checking account? Or should the teller just receive an indication that the account has sufficient funds to cover her withdrawal of $300.00?
- Marketing may need a list of customers who have more than, say, $100K in their accounts, so it can send them information about a new type of high-interest savings account. Does marketing need to know the exact balance, the account numbers, and whether there are any restrictions on the account(s)? And do you want the marketing guy copying the information onto a thumb drive and walking out the door with it?
Both are examples of legitimate operations, with reasonable controls placed on the interactions with the database to prevent possible abuse. Within IA, you want to create an overall system architecture and design that facilitates legitimate normal operations while enabling you to detect and potentially deter malicious ones. It ties into a lot of other areas, like identity management and access control (letting legitimate users have access, but denying the rogue process initiated by an unknown user), or facilitating a legitimate SQL query while denying a SQL injection attempt.
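To make the two bank scenarios a bit more concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table layout, the account IDs, and the function names (has_sufficient_funds, marketing_list) are all my own assumptions for illustration, not how any real bank system works. The idea is simply that each role gets an interface that returns the least it needs, and that user input is passed as a bound parameter rather than pasted into the SQL text.

```python
import sqlite3

def make_demo_db():
    """Build an in-memory bank table (hypothetical schema, demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, owner TEXT, balance_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?, ?)",
        [
            ("A-100", "Little Old Lady", 738193612),  # $7,381,936.12
            ("A-200", "Student", 4250),               # $42.50
        ],
    )
    return conn

def has_sufficient_funds(conn, account_id, amount_cents):
    """Teller view: answer yes/no only, never return the balance itself."""
    row = conn.execute(
        # Both values are bound parameters, so they are treated as data, not SQL.
        "SELECT balance_cents >= ? FROM accounts WHERE id = ?",
        (amount_cents, account_id),
    ).fetchone()
    return bool(row[0]) if row else False

def marketing_list(conn, threshold_cents):
    """Marketing view: owner names above the threshold, no balances or account IDs."""
    rows = conn.execute(
        "SELECT owner FROM accounts WHERE balance_cents > ? ORDER BY owner",
        (threshold_cents,),
    ).fetchall()
    return [owner for (owner,) in rows]

conn = make_demo_db()
print(has_sufficient_funds(conn, "A-100", 30000))   # $300.00 withdrawal
print(marketing_list(conn, 10_000_000))             # more than $100K
# A classic injection string is just an unknown account ID here.
print(has_sufficient_funds(conn, "A-200' OR '1'='1", 30000))
```

In a real deployment you would more likely enforce this in the database itself (views, roles, stored procedures) rather than trusting application code, but the shape of the interface is the same.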
If you think about the Sony intrusion, it turns out every Sony employee had access to the film library, both released and to-be-released films. The location was just hidden (security through obscurity). The intruders made off with all of the new to-be-released films. They were not detected until they took down the entire corporate system one morning. They also extracted and published everyone's salaries, emails, etc. So, there are operations that need to take place: some you want to promote, while others you would rather not happen (detect and prevent).
Going back to the original example, I was just pointing out that even in the most innocent public database interaction, you can legitimately extract or infer additional information, intentionally or unintentionally. Sorry about the wall of text.....
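P.S. For anyone curious what "making inferences" can look like, here is a small sketch built on the teller scenario above. It assumes a hypothetical yes/no "sufficient funds" check (make_oracle and infer_balance are names I made up) and shows that repeated, individually legitimate queries can narrow down the exact balance by binary search, even though the balance is never displayed.

```python
def make_oracle(secret_balance_cents):
    """Return a yes/no 'sufficient funds' check that hides the balance (hypothetical)."""
    def has_sufficient_funds(amount_cents):
        return secret_balance_cents >= amount_cents
    return has_sufficient_funds

def infer_balance(check, upper_bound_cents):
    """Recover the balance by binary search, using only yes/no answers."""
    # Invariant: check(lo) is True and the balance is at most hi.
    lo, hi = 0, upper_bound_cents
    queries = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        queries += 1
        if check(mid):
            lo = mid       # balance is at least mid
        else:
            hi = mid - 1   # balance is below mid
    return lo, queries

check = make_oracle(738193612)  # $7,381,936.12, never shown to the caller
balance, queries = infer_balance(check, 10**10)
print(balance, queries)
```

A few dozen queries are enough here, which is why the detection side matters: logging and rate-limiting that kind of repeated probing is one way to spot it.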