Data Mining in the Weblog
Dr. Teh Ying Wah
Faculty of Computer Science and Information Technology, University of Malaya

Introduction

In a data warehouse environment, sales managers must deal with very large data sets of sales items, because globalised marketing is both a current and a future trend. To make globalisation possible, sales managers throughout the world must be able to log on to the system. On average, users tolerate a wait of at most 8 seconds, which is the limit of people’s ability to keep their attention focused while waiting. A reasonable response time is therefore critical for a company that is pursuing globalisation. Indexes have emerged as one of the techniques for handling very large data volumes and fast response time requirements in the data warehouse environment.

Table 2 shows a training data set with four data attributes and two classes.

Table 2: Training Data Set

Fig. 2 shows how the data mining technique works with the training data set.

Fig. 2: Decision Tree Model

Literature Review

Current research in query processing techniques covers either automatic or non-automatic selection of query processing techniques (Table 1). Neither approach, however, is suitable for a data warehouse. There are too many parameters to select in data warehouse performance tuning. Microsoft’s AutoAdmin and the Microsoft SQL Server 2000 tuning wizard use the optimiser’s estimated cost for all SQL statements. The Microsoft SQL Server 2000 tuning wizard is not open-source software, so its existing code cannot be changed. Therefore, this research proposes data mining techniques as an intelligent way to handle query processing techniques.

Evaluation

The evaluated test data is based on the web log file of the Transaction Processing Performance Council (TPC) benchmark. Table 3 shows the performance of the TPC-H sample web log file.

Table 3: Performance TPC-H Web Log

Data Mining Techniques in Indexes

The access weblog file of a high-priority user (such as a manager) keeps track of that user’s decision-support queries from time T1 to time T19, as shown in Fig. 1.

Fig. 1: Weblog

Conclusion

Query response times improve greatly after data mining models are applied to indexes.
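Illustrative sketch

The decision tree model of Fig. 2 is trained on a data set with four attributes and two classes. The contents of Table 2 are not reproduced in this transcript, so the following is only a minimal sketch of how such a tree could be built with an entropy-based (ID3-style) split. The attribute names, the records, and the class labels "index" and "scan" are hypothetical and are not taken from the presentation.

```python
import math
from collections import Counter

# Hypothetical training records: four attributes plus a class label.
# Neither the attribute names nor the values come from Table 2.
ATTRIBUTES = ["table_size", "selectivity", "has_join", "has_group_by"]
TRAINING = [
    ("large", "high", "yes", "no", "index"),
    ("large", "high", "no", "no", "index"),
    ("large", "low", "yes", "yes", "scan"),
    ("large", "low", "no", "no", "scan"),
    ("small", "high", "yes", "no", "scan"),
    ("small", "high", "no", "yes", "scan"),
    ("small", "low", "yes", "no", "scan"),
    ("small", "low", "no", "yes", "scan"),
]


def entropy(rows):
    """Shannon entropy of the class labels (last field of each row)."""
    counts = Counter(row[-1] for row in rows)
    total = len(rows)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(rows, col):
    """Entropy reduction obtained by splitting the rows on column `col`."""
    total = len(rows)
    remainder = 0.0
    for value in {row[col] for row in rows}:
        subset = [row for row in rows if row[col] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(rows) - remainder


def build_tree(rows, cols):
    """Recursively split on the column with the highest information gain."""
    labels = [row[-1] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not cols:
        return majority  # leaf: a class label
    best = max(cols, key=lambda c: information_gain(rows, c))
    remaining = [c for c in cols if c != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining)
    return {"attribute": best, "branches": branches, "default": majority}


def classify(tree, record):
    """Walk the tree; unseen attribute values fall back to the majority class."""
    while isinstance(tree, dict):
        value = record[tree["attribute"]]
        tree = tree["branches"].get(value, tree["default"])
    return tree


tree = build_tree(TRAINING, list(range(len(ATTRIBUTES))))
print(classify(tree, ("large", "high", "no", "yes")))
```

In this toy data set only large tables queried with high selectivity are labelled "index", so the resulting tree needs just two of the four attributes. A real training set drawn from a weblog would determine its own splits.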

