Data Mining-Concept And Techniques
Posted on Monday, March 1, 2010 at 8:25 PM by web research
INTRODUCTION
Data mining is the process of analyzing data from different perspectives and summarizing it into useful information - information that can be used to increase revenue, cut costs, or both. It allows users to analyze data from many different dimensions or angles, categorize it and summarize the relationships identified. It is the process of finding correlations or patterns among dozens of fields in large relational databases.
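As a rough illustration of finding correlations among fields, the sketch below computes the Pearson correlation for every pair of numeric fields in a small table. The records and field names (ad_spend, units_sold, returns) are made up for this example and are not taken from any real database; a production system would work on far more rows and fields.

```python
from itertools import combinations
from math import sqrt

# Hypothetical records: each dict is one row with made-up numeric fields.
records = [
    {"ad_spend": 10, "units_sold": 110, "returns": 5},
    {"ad_spend": 20, "units_sold": 205, "returns": 4},
    {"ad_spend": 30, "units_sold": 290, "returns": 6},
    {"ad_spend": 40, "units_sold": 410, "returns": 5},
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def field_correlations(rows):
    """Return {(field_a, field_b): r} for every pair of fields."""
    fields = sorted(rows[0])
    columns = {f: [row[f] for row in rows] for f in fields}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(fields, 2)
    }

correlations = field_correlations(records)
# Print the field pairs, strongest relationship first.
for pair, r in sorted(correlations.items(), key=lambda kv: -abs(kv[1])):
    print(pair, round(r, 3))
```

In this toy data the strongest relationship is between ad_spend and units_sold. A high correlation only flags a pattern worth investigating; it does not by itself show that one field drives the other.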
DATA, INFORMATION AND KNOWLEDGE
Data: Data are any facts, numbers, or text that can be processed by a computer. Organizations are accumulating large amounts of data in different formats and different databases. There are three types of data:
Operational or transactional data: This includes sales, cost, inventory, payroll, and accounting.
Non-operational data: Industry sales data, forecast data, and macroeconomic data are considered non-operational.
Metadata: Data about the data itself. This includes the logical database design and data dictionary definitions.
Information: Patterns, associations, or relationships among all of the above types of data provide information.
Knowledge: Information can be converted into knowledge about historical patterns and future trends. For example, a manufacturer or retailer could determine which items are most susceptible to promotional efforts.
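One simple way to picture the retailer example is to compare each item's average sales with and without a promotion. The sketch below does this on invented transactional data; the items, the on_promotion flag, and the "lift" ratio are assumptions made for illustration, not a method described in this post. A real analysis would also need to account for seasonality, price changes, and sample size.

```python
from collections import defaultdict

# Hypothetical transactional data: (item, on_promotion, units_sold).
sales = [
    ("soda", False, 100), ("soda", True, 180),
    ("soda", False, 120), ("soda", True, 150),
    ("bread", False, 200), ("bread", True, 210),
    ("bread", False, 190), ("bread", True, 219),
]

def promotion_lift(rows):
    """Return {item: mean promoted units / mean regular units}."""
    # For each item keep [sum of units, number of rows] per promotion flag.
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for item, promoted, units in rows:
        bucket = totals[item][promoted]
        bucket[0] += units
        bucket[1] += 1
    lift = {}
    for item, by_flag in totals.items():
        promo_sum, promo_n = by_flag[True]
        base_sum, base_n = by_flag[False]
        # Skip items that lack either promoted or regular observations.
        if promo_n and base_n and base_sum:
            lift[item] = (promo_sum / promo_n) / (base_sum / base_n)
    return lift

lift = promotion_lift(sales)
# Items ranked from most to least responsive to promotion.
ranked = sorted(lift, key=lift.get, reverse=True)
print(ranked)
```

Here soda sells about 1.5 times as much when promoted while bread barely changes, so in this made-up data soda would be the item more susceptible to promotional efforts.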
FOUNDATIONS OF DATA MINING
Data mining techniques are the result of a long process of research and product development. This evolution began when business data was first stored on computers, continued with improvements in data access, and more recently, generated technologies that allow users to navigate through their data in real time. Data mining takes this evolutionary process beyond retrospective data access and navigation to prospective and proactive information delivery. Data mining is ready for application in the business community because it is supported by three technologies that are now sufficiently mature:
* Massive data collection
* Powerful multiprocessor computers
* Data mining algorithms