In the Name of God
Data Mining and Knowledge Discovery

Mohamad Taghipour

Degrees held: associate's, bachelor's and master's degrees in statistics, and a doctorate in industrial engineering

Head of the Industrial Engineering Department, Aba non-profit institute

Distinguished lecturer at the Azad and Payame Noor universities

Mohamad_taghipour@yahoo.com

www.drmohamadtaghipour.ir

www.mtaghipour.tk

09123944126





Contribution of the paper

  • This paper presents the results of the rigorous evaluation of the Integrated Knowledge Discovery and Data Mining (IKDDM) process model and compares it to the CRISP-DM process model.

  • Results of statistical tests confirm that the IKDDM leads to more effective and efficient implementation of the knowledge discovery process.



Phases of KDDM process


Limitations of previous KDDM models

1. Checklist-oriented description and lack of tool support

2. Fragmented design

3. Absence of an integrated view

4. Conspicuous lack of support for tasks of the business understanding phase


IKDDM: overcoming the limitations of existing KDDM models

  • All the identified limitations in previously proposed KDDM process models were used as design requirements in creating this new KDDM process model.


Development of an integrated view

  • The IKDDM model was designed by explicating the numerous dependencies that exist between the various tasks of the KDDM process.

  • Some of the dependencies can be regarded as intra-phase dependencies, as they exist between the tasks of the same phase.

  • For example, there is a dependency between the Data Mining objective and business objective of the business understanding phase as the former utilizes the latter as its input.

  • Other dependencies can be classified as inter-phase dependencies as they exist between tasks of different phases.
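The distinction between intra-phase and inter-phase dependencies can be illustrated with a small sketch. Only the business objective and Data Mining objective tasks (and their dependency) come from the slides; the other task names, phase names and dependencies below are hypothetical placeholders, and this is not the IKDDM model's actual task list.

```python
# Hypothetical task-to-phase mapping. Only the two business understanding
# tasks are named in the slides; the other entries are assumptions.
PHASE_OF = {
    "business objective": "business understanding",
    "data mining objective": "business understanding",
    "data collection": "data understanding",   # assumed task
    "model selection": "modeling",             # assumed task
}

# Each pair (producer, consumer) means the consumer task uses the
# producer task's output as its input.
DEPENDENCIES = [
    ("business objective", "data mining objective"),  # from the slides
    ("data mining objective", "data collection"),     # assumed
    ("data mining objective", "model selection"),     # assumed
]


def classify(producer: str, consumer: str) -> str:
    """Return 'intra-phase' if both tasks share a phase, else 'inter-phase'."""
    same_phase = PHASE_OF[producer] == PHASE_OF[consumer]
    return "intra-phase" if same_phase else "inter-phase"


labels = {dep: classify(*dep) for dep in DEPENDENCIES}
for (producer, consumer), kind in labels.items():
    print(f"{producer} -> {consumer}: {kind}")
```

Keeping the phase of each task in one mapping means a dependency never needs to be labelled by hand: its type follows from the phases of its two endpoints.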


Measurement instrument used for assessing the quality of IKDDM and CRISP-DM process models

  • Observational (through case studies and field studies).

  • Analytical (through static analysis, architecture analysis, optimization and dynamic analysis).

    A static analysis helps in evaluation of a design artifact on the basis of static or desired qualities.

  • Experimental (through controlled experiments and simulation).

  • Testing (through functional or black box and structural or white box testing).

  • Descriptive (through informed arguments and scenario construction).


The model incorporates the same dimensions as Seddon’s model:

  • Perceived ease of use (PEOU)

  • Perceived usefulness (PU)

  • User satisfaction (US)

It replaces the Information Quality dimension of the original model with a perceived semantic quality (PSQ) construct.


  • For the users of a conceptual model, Information Quality is the perceived semantic quality of the model, i.e. how valid and complete it is with respect to (their perception of) the problem domain.

  • Validity means that all information conveyed by the model is correct and relevant to the problem.

  • Completeness means that the model contains all information about the domain that is considered correct and relevant.

  • In this study, the conceptual model is the KDDM process model.


Analytical testing of IKDDM and CRISP-DM


Methodology for performing the analytical testing (using SPSS software v. 15)

1. Identified and recruited 42 study participants and randomly divided them into two groups.

2. Presented the first group of users with a test questionnaire that posed Data Mining tasks as multiple-choice questions.

  • Provided them with the documentation of the CRISP-DM process model to assist in answering the questions (i.e. in executing tasks of a Data Mining project).

  • Presented the second group of users with the same test questionnaire, but with the documentation of the IKDDM process model to assist in answering the questions.


3. After the completion of the test questionnaire, recorded each participant’s perception of the static qualities of the artifact used (i.e. the CRISP-DM or the IKDDM process model) through a set of survey questions (refer to Table 3).

4. Recorded each participant’s gender, role/designation, number of years of experience in Data Mining, and time taken to complete the test. A numeric id was used to link the respondent’s test to the survey. No identifying details, such as the participant’s name or the organization with which the individual is affiliated, were recorded.


5. Tested for statistical differences in the quality of the two models, as perceived by the users. The independent means t-test and the Mann–Whitney procedure were used to test the differences between the two groups (IKDDM versus CRISP-DM).
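The random split in step 1 could be sketched as follows. The study itself used SPSS and the slides do not describe how the randomization was done, so the function name `assign_groups`, the use of numeric ids 1 to 42 and the fixed seed are assumptions made only to keep the sketch reproducible.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Shuffle the numeric ids and split them into two equal halves."""
    ids = list(participant_ids)
    rng = random.Random(seed)  # seed is an assumption, used for repeatability
    rng.shuffle(ids)
    half = len(ids) // 2
    return {"CRISP-DM": sorted(ids[:half]), "IKDDM": sorted(ids[half:])}


# Hypothetical numeric ids 1..42; no names or organizations are stored.
groups = assign_groups(range(1, 43), seed=0)
print(len(groups["CRISP-DM"]), len(groups["IKDDM"]))
```

Working with numeric ids only mirrors step 4, where the id is the sole link between a participant's test and survey.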


Independent means t-test (for comparing performance of IKDDM model versus CRISP-DM model)

  • The null hypothesis states that the experimental manipulation has no effect on the subjects, so the sample means are expected to be identical or very similar.
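A minimal sketch of the test statistic, using only the Python standard library, is given below. The slides do not say whether the equal-variance or unequal-variance form was read from the SPSS output, so the pooled-variance (Student) form is an assumption, and the scores are invented for illustration rather than taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(a), len(b)
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical test-questionnaire scores, not data from the study.
ikddm_scores = [8, 9, 7, 9, 8]
crisp_scores = [6, 5, 7, 6, 6]

t_stat, dof = independent_t(ikddm_scores, crisp_scores)
print(f"t = {t_stat:.3f}, df = {dof}")
```

Under the null hypothesis the two sample means are close and t stays near zero; a large absolute t relative to the t distribution with the given degrees of freedom is evidence against it.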


Mann–Whitney test (for comparing the difference in the groups’ perception of the static qualities of KDDM process models)

  • The static qualities of the KDDM process model employed by the users to execute the Data Mining tasks (in the test questionnaire) were assessed through a set of survey questions with 7-point Likert-scale options, ranging from Strongly Agree to Strongly Disagree.
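The mean ranks and the U statistic that this test reports can be computed as in the sketch below. The coding of the Likert options (7 for Strongly Agree down to 1 for Strongly Disagree) and the sample scores are assumptions for illustration; the study's own figures came from SPSS.

```python
def mean_ranks_and_u(a, b):
    """Rank the pooled scores (ties share the average rank) and return
    (mean rank of a, mean rank of b, smaller U statistic)."""
    pooled = sorted(a + b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Values at positions i..j-1 hold 1-based ranks i+1..j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sum_a = sum(rank_of[x] for x in a)
    rank_sum_b = sum(rank_of[x] for x in b)
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = rank_sum_b - len(b) * (len(b) + 1) / 2
    return rank_sum_a / len(a), rank_sum_b / len(b), min(u_a, u_b)


# Hypothetical survey scores, assuming 7 = Strongly Agree, 1 = Strongly Disagree.
ikddm_survey = [7, 6, 6, 5]
crisp_survey = [4, 3, 5, 2]

result = mean_ranks_and_u(ikddm_survey, crisp_survey)
print(result)
```

Because the test works on ranks rather than raw values, it suits ordinal Likert responses, where the distance between adjacent options is not assumed to be equal.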


Pilot test of test questionnaire and survey

  • Four users with expertise in Data Mining participated in the pilot test.

  • On the basis of feedback received from the users the test questionnaire was slightly revised, and a final version was created for use in the actual evaluation.


Assessment of artifact by users with experience in Data Mining

  • The 42 participants were randomly assigned to one of two groups, CRISP-DM (N = 21) or IKDDM (N = 21), and were asked to use the documentation of the assigned KDDM process model to complete the Data Mining tasks.


The following information was recorded for each participant:

  • Date on which data was collected from the individual.

  • Participant’s Gender.

  • Participant’s Role/Title.

  • Participant’s number of years of Data Mining experience.

  • Start Time for the test.

  • End Time for the test.
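One way to hold these fields, together with the numeric id described in the methodology, is a small record type. The class name, field names and sample values below are hypothetical; the slides list only which items were recorded, not how they were stored.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class ParticipantRecord:
    participant_id: int      # numeric id linking test and survey; no name kept
    collected_on: date
    gender: str
    role: str
    years_experience: float
    start_time: datetime
    end_time: datetime

    @property
    def minutes_taken(self) -> float:
        """Time taken to complete the test, derived from start and end."""
        return (self.end_time - self.start_time).total_seconds() / 60


# Hypothetical values for illustration only.
record = ParticipantRecord(
    participant_id=1,
    collected_on=date(2010, 1, 15),
    gender="F",
    role="Data analyst",
    years_experience=3.0,
    start_time=datetime(2010, 1, 15, 10, 0),
    end_time=datetime(2010, 1, 15, 10, 45),
)
print(record.minutes_taken)
```

Deriving the time taken from the start and end times, rather than storing it separately, keeps the three values from drifting out of agreement.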


Assessment of validity of the measurement instrument

  • Recommended threshold for composite reliabilities = 0.70

  • Minimum threshold for Cronbach’s alpha = 0.70

  • Lower-bound threshold for average variance extracted (AVE) = 0.50
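As an illustration of how one of these thresholds is applied, the sketch below computes Cronbach's alpha for a single construct from item-level responses. The function name and the response data are invented for the example; the study's reliability figures were obtained with its own software and are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(column) for column in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical responses: 3 survey items answered by 5 respondents.
responses = [
    [7, 6, 5, 3, 2],
    [6, 6, 4, 3, 1],
    [7, 5, 5, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}, meets 0.70 threshold: {alpha >= 0.70}")
```

Alpha rises when the items of a construct vary together across respondents, which is why it is read as a measure of internal consistency.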



Results of independent means t-test: performance of CRISP-DMeval versus IKDDMeval on the test questionnaire


Discussion of results of independent means t-test


Results of Mann–Whitney test


Results of Mann–Whitney test to assess difference between groups on individual constructs

  • Results for perceived ease of use

  • Results for user satisfaction

  • Results for perceived usefulness

  • Results for perceived semantic quality


Results for perceived ease of use

  • The Mann–Whitney test is highly significant (p < 0.001) for the perceived ease of use scores of the two groups (refer to Table 13).

  • The IKDDM group scored higher: for the survey scores representing perceived ease of use, the mean rank is 30.98 in the IKDDM group versus 12.02 in the CRISP group.
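The reported mean ranks can be related to the Mann–Whitney U statistic with the usual rank-sum identities. This is a worked sketch rather than a figure from the slides: it assumes both groups have 21 participants, as stated in the assessment slide, and the U values are derived from the rounded mean ranks, so they are approximate.

```latex
\begin{align*}
\bar{R}_{\text{IKDDM}} + \bar{R}_{\text{CRISP}} &= N + 1 = 43
  \quad (n_1 = n_2 = 21,\ N = 42), \qquad 30.98 + 12.02 = 43.00 \\
U_{\text{CRISP}} &= n\,\bar{R}_{\text{CRISP}} - \frac{n(n+1)}{2}
  \approx 21 \times 12.02 - 231 = 21.42 \\
U_{\text{IKDDM}} &= n_1 n_2 - U_{\text{CRISP}} \approx 441 - 21.42 = 419.58
\end{align*}
```

A smaller U of about 21 out of a possible 441 means that very few CRISP responses outrank IKDDM responses, which is consistent with the small p-value.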


Results for user satisfaction

  • The Mann–Whitney test is highly significant (p < 0.001) for the user satisfaction scores of the two groups (refer to Table 13).

  • The IKDDM group scored higher: for the survey scores representing user satisfaction, the mean rank is 30.67 in the IKDDM group versus 12.63 in the CRISP group.


Results for perceived usefulness

  • The Mann–Whitney test is highly significant (p < 0.001) for the perceived usefulness scores of the two groups (refer to Table 13).

  • The IKDDM group scored higher: for the survey scores representing perceived usefulness, the mean rank is 31.48 in the IKDDM group versus 11.52 in the CRISP group.


Results for perceived semantic quality

  • The Mann–Whitney test is highly significant (p < 0.001) for the perceived semantic quality scores of the two groups (refer to Table 13).

  • The IKDDM group scored higher: for the survey scores representing perceived semantic quality, the mean rank is 29.60 in the IKDDM group versus 13.40 in the CRISP group.


  • The results of the Mann–Whitney test on the overall survey scores representing the quality of the process models indicate that a significant difference existed between the CRISP and IKDDM models.

  • The test results clearly indicate that the IKDDM model outperformed the CRISP model by a highly significant margin (p < 0.001).

  • This is an important result: it signifies that users rated the effectiveness and efficacy of the IKDDM model much higher than those of the CRISP model.


  • The results of the Mann–Whitney test across the four constructs also indicated that the IKDDM group and the CRISP group differed significantly in their perceptions of ease of use, usefulness, semantic quality and levels of user satisfaction of the model they employed to execute tasks in Data Mining.


  • The IKDDM group reported significantly higher levels of perceived ease of use, perceived usefulness, semantic quality and user satisfaction than the CRISP group.

  • The results confirm that IKDDM is more effective and efficient than the CRISP model in executing tasks of the KDDM process.

  • The limitations of existing KDDM process models (such as use of only a checklist approach, lack of explicit support towards execution of tasks) as identified in this research are certainly also perceived as being problematic by the Data Mining users.

