
Purpose: To Discover and Predict Trends of Oregon Graduates from SOU

Who graduates and who doesn’t – why? Do some majors tend to have more of one gender than another – why? Does economic / cultural background influence choice of major – why? What do we see within age groupings – do certain age groupings gravitate toward certain majors and not others?


Presentation Transcript


  1. Purpose: to Discover and Predict Trends of Oregon Graduates from SOU:
     • Who graduates and who doesn’t – why?
     • Do some majors tend to have more of one gender than another – why?
     • Does economic / cultural background influence choice of major – why?
     • What do we see within age groupings – do certain age groupings gravitate toward certain majors and not others?
     • And finally, can we predict gender from the attributes of major, age grouping, county, economic status, transfer status, and year graduated?

  2. Data Modeling Tool Used: WEKA
     • Classification / Prediction: a WEKA decision tree (70% accuracy) predicted gender from the attributes major, age grouping, year of graduation, county, economic status, and transfer status.
     • Clustering (108 clusters): visually shows trend patterns for combinations of attributes.

  3. Data: 12,982 records of SOU graduates from 1990 to present
     • 10,491 for training
     • 2,491 for testing
     • Attributes:
       • PID
       • Year graduated {1990 – 2006}
       • Transfer student or not {Y,N}
       • County {36}
       • County economic status (Distressed) {1,2,3}
       • Age (discretized into 7 categories)
       • Major {1 - 297}
       • Gender {F,M}
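As a quick sanity check on the split above, the two subset sizes can be added up and turned into proportions. This is only a sketch of the arithmetic; the variable names are mine and are not part of the original data.

```python
# Record counts as reported on the slide.
TRAIN_RECORDS = 10_491
TEST_RECORDS = 2_491

# The two subsets should add up to the reported total of 12,982.
total_records = TRAIN_RECORDS + TEST_RECORDS

# Share of the data held out for testing versus used for training.
test_fraction = TEST_RECORDS / total_records
train_fraction = TRAIN_RECORDS / total_records

print(f"total={total_records}, train={train_fraction:.1%}, test={test_fraction:.1%}")
```

So the split works out to roughly 81% training and 19% testing.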

  4. Attribute: major (297)
     10,491 Training Set
     2,491 Test Set
     Total = 12,982
     Decision Tree splits: major, county, transfer, age, graduation year, gender
     @attribute MAJOR1_CODE {GSHU,COMS,HPOL,GSHF,GSIN,GSSM,MSBA,SEFR,SEMT,CSIS,MUCN,SEAT,SEMU,SES,SESP,EESP, SEBI,SEIS,SAAS,ABFA,ACA,AFE,ANTH,ANTP,ARLP,ARLT,ART,ARTH,ARTP,BA,BACH,BAHR,HR,BAMG, BAMK,BAMT,BAMU,BANM,BAOM,BAOP,BAPB,BAPH,BAPM,BARM,BASB,BCHP,BED,BIO,BIOH,BIOP,BMTP, BMUP,BOTC,BUSP,CBIS,CCJ,CHAC,CHBA,CHBI,CHEM,CHEP,CHPA,CIM,COJO,CJOP,CMHR,CMM,COMM, COMP,COBR,COTE,CRIM,CRIP,CRM,CS,CSG,CSIA,CSIN,CSMA,CSP,CSPS,ECD,ECEL,ECON,ECOP,ECTL, ED1P,ED2P,ED3P,ED4P,ED5P,ED6P,ED7P,ED8P,ED9P,EDEC,EDMS,EDP,EDUC,EE,EECI,EECT,EEHE,EEEC, EEHC,EEHL,EERE,EESB,EESL,ESP,EESU,EETS,EIAL,ELED,ELMS,EMAT,EMBE,EMBI,EMCH,EMDR,EMFR, EMGE,EMHE,EMIS,EMLA,BAAC,EMMT,EMMU,EMPE,EMPH,EMS,EMSP,EMSS,ENG,ENGL,ENGP,ENGR, ENGW,ES,ESB,ESC,ESG,ESGR,ESSP,FPA,GEGP,GEOG,GEOL,GEOP,GSBE,GSSS,HISP,HIST,HPAT,HPE, HPHP,HPP,HPPE,HPHS,HSP,HUM,INDP,INTD,INTP,INTS,LAFP,LAGP,LANC,LANF,LANG,LANS,LASP,MACS,MAP, MATH,MBA,MECI,MEEC,MERE,MESP,MHAT,MHBE,MHBI,MHCH,MHDR,MHFR,MHGE,MHHE,MHHP, MHIS,MHLA,MHMT,MHMU,MHPE,MHPH,MHS,MHSP,MHSS,MIM,MIMP,MMC,MMST,MSSP,MTAT,MTBE,MTBI, MTCH,MTDR,MTFR,MTGE,MTHE,MTHH,MTHP,MTIS,MTLA,MTMT,MTMU,MTPE,MTPH,MTRE,MTS,MTSE, MTSP,MTSS,MUIN,MUPF,MUS,MUSP,NAAM,NURP,NURS,PCHM,PCJO,PDEM,PDEN,PDHY,PEGR,PHR, PHYA,PHYP,PHYS,PLAW,PMED,PMET,POLP,POLS,POPT,POTH,PPAS,PPHA,PPTH,PRAM,PSY,PSY2,PSY3, PSY4,PSY5,PSY6,PSY7,PSYA,PSYC,PSYP,PVET,SCI,SCIP,SCTL,SEBE,SED,SEHC,SEHE,SEHL,SEHU,SELA, SEPE,SERE,SESB,SESL,SESM,SESS,SESU,SETS,SOC,SOLP,SPAN,SSCD,SSCI,SSCR,SSHS,SSPD,SSPS, TAFA,TBFA,TEAC,THAR,THEA,THEP,UNDL}
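The `@attribute MAJOR1_CODE {...}` line above is a nominal attribute declaration in WEKA's ARFF file format. A minimal sketch of how such a header could be generated is shown here. The relation name `sou_graduates` and the attribute names `TRANSFER` and `GENDER` are hypothetical (only `MAJOR1_CODE` appears on the slide), and the major list is cut to its first three codes purely for illustration.

```python
def arff_nominal(name, values):
    """Return one ARFF nominal attribute line, e.g. '@attribute GENDER {F,M}'."""
    return "@attribute {} {{{}}}".format(name, ",".join(values))


def arff_header(relation, attributes):
    """Build an ARFF header from (name, values) pairs, ending at the @data marker."""
    lines = ["@relation " + relation]
    lines += [arff_nominal(name, values) for name, values in attributes]
    lines.append("@data")
    return "\n".join(lines)


# Illustrative only: attribute names other than MAJOR1_CODE are assumptions,
# and the real MAJOR1_CODE declaration lists all of the major codes.
header = arff_header(
    "sou_graduates",
    [
        ("MAJOR1_CODE", ["GSHU", "COMS", "HPOL"]),
        ("TRANSFER", ["Y", "N"]),
        ("GENDER", ["F", "M"]),
    ],
)
print(header)
```

The data rows would follow the `@data` line, one comma-separated record per line.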

  5. Test data: decision tree (DT)

  6. Findings:
     • Could predict F/M for majors with 70% accuracy (and by following the decision tree you can trace the branching to see how the attributes are classified and how they relate).
     • But clustering revealed other interesting patterns, especially socio-economic ones.

  7. To discover socio-economic correlations I added 1 attribute not in the original data:
     Added Distressed_County attribute (economic status):
     • 1. Non Distressed
     • 2. Distressed
     • 3. Severely Distressed
     And discretized the Age attribute into 6 classifications (birth years, with ages in parentheses):
     • 1909 – 1939 (67-97)
     • 1940 – 1949 (57-66)
     • 1950 – 1959 (47-56)
     • 1960 – 1969 (37-46)
     • 1970 – 1979 (27-36)
     • 1980 – 1986 (20-26)
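The age discretization above can be sketched as a lookup from birth year to one of the six bins. The bin boundaries are taken from the slide. Numbering the groups 1 to 6 in slide order and using 2006 as the reference year for the ages are my assumptions (2006 does reproduce the age ranges shown in parentheses).

```python
# Birth-year bins exactly as listed on the slide, oldest first.
BIRTH_YEAR_BINS = [
    (1909, 1939),
    (1940, 1949),
    (1950, 1959),
    (1960, 1969),
    (1970, 1979),
    (1980, 1986),
]

# Assumed reference year; it reproduces the ages shown on the slide.
REFERENCE_YEAR = 2006


def age_group(birth_year):
    """Return the 1-based bin number for a birth year, or None if outside all bins."""
    for number, (first, last) in enumerate(BIRTH_YEAR_BINS, start=1):
        if first <= birth_year <= last:
            return number
    return None


def age_range(group):
    """Return (youngest, oldest) age in the reference year for a 1-based bin number."""
    first, last = BIRTH_YEAR_BINS[group - 1]
    return (REFERENCE_YEAR - last, REFERENCE_YEAR - first)


print(age_group(1985), age_range(6))
```

Birth years outside 1909 to 1986 return `None` here rather than being forced into a bin.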

  8. I based the Distressed attribute on Oregon counties' economic health:
     • http://www.gonorthwest.com/Oregon/Oregon-cities.htm
     • 3 = Severely Distressed (all are rural)
     • 2 = Distressed (non-metro, except Marion)
     • 1 = Not Distressed

  9. Map of Counties (socio-economic)
     http://www.answers.com/topic/list-of-counties-in-oregon
     1. Red: distressed (rural)
     2. Yellow: non-metro (except Marion)
     3. Blue: not distressed

  10. County Economic Ranking

  11. County Economic Ranking

  12. County Economic Ranking

  13. Most interesting finding: From 1990 to 2006, the number of graduates is far greater from non-distressed counties. However, the ratio of graduates to non-graduates within each grouping differs sharply from one grouping to the next. Compared with the ratio for students from non-distressed counties, a predominant trend emerges: students from distressed, and especially from severely distressed, counties who make it to SOU graduate.
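The distinction between raw counts and within-group ratios can be sketched as follows. The counts below are made-up toy numbers chosen only to show the shape of the comparison; they are not taken from the SOU data.

```python
def graduation_rates(counts):
    """Map each group to graduates / (graduates + non-graduates)."""
    rates = {}
    for group, (graduates, non_graduates) in counts.items():
        total = graduates + non_graduates
        rates[group] = graduates / total if total else None
    return rates


# Hypothetical (graduates, non-graduates) per county grouping, NOT real figures.
toy_counts = {
    "not distressed": (800, 400),
    "distressed": (90, 20),
    "severely distressed": (45, 5),
}

toy_rates = graduation_rates(toy_counts)
for group, rate in toy_rates.items():
    print(f"{group}: {toy_counts[group][0]} graduates, rate {rate:.0%}")
```

In this toy example the non-distressed group has by far the most graduates, yet the lowest rate, which is the pattern the slide describes.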

  14. Speculating on the Reason: Financial Motivation (Education = Increased Income)

  15. Non-transfer students were the predominant graduates. Jackson, Jefferson, Josephine and Klamath represented transfer students. It looks like graduates coming from a distance know they want to attend SOU right out of high school.

  16. Classified by major and transfer/non-transfer: There was no indication of any particular major being the motivation; however, our (state) tuition is relatively low, which is a possible motivator.

  17. Other trends that were noted: Male (right) / Female (left) ratio is about the same per economic stratum

  18. The age groupings by gender are fairly equal. Graduates tend to be older students.
      Top to bottom age:
      • 1980 – 1986 (20-26)
      • 1970 – 1979 (27-36)
      • 1960 – 1969 (37-46)
      • 1950 – 1959 (47-56)
      • 1940 – 1949 (57-66)
      • 1909 – 1939 (67-97)

  19. Majors were the first split in the decision tree. Clustering showed general trends: males tended to be ‘sparse’ as English graduates, and female graduates were ‘sparse’ in all years within the CS programming track (82% M / 17% F). The same held in CSIS (79% M / 21% F) and in the rest, categorized as ‘general CS’ (92% M / 8% F), for a total across all tracks of 81% M / 19% F.
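To illustrate what a first split on major means, here is a sketch of a one-attribute rule that predicts the majority gender seen for each major. This is a deliberately simplified stand-in, not the WEKA decision tree used in the presentation, and the records are toy data rather than SOU records.

```python
from collections import Counter, defaultdict


def fit_major_rule(records):
    """Learn the majority gender per major, plus an overall default for unseen majors."""
    by_major = defaultdict(Counter)
    overall = Counter()
    for major, gender in records:
        by_major[major][gender] += 1
        overall[gender] += 1
    rule = {major: counts.most_common(1)[0][0] for major, counts in by_major.items()}
    default = overall.most_common(1)[0][0]
    return rule, default


def predict(rule, default, major):
    """Predict gender for a major, falling back to the default for unknown majors."""
    return rule.get(major, default)


def accuracy(rule, default, records):
    """Fraction of (major, gender) records whose gender is predicted correctly."""
    hits = sum(predict(rule, default, major) == gender for major, gender in records)
    return hits / len(records)


# Toy (major code, gender) records, invented for illustration only.
toy_train = [("CS", "M")] * 4 + [("CS", "F")] + [("ENG", "F")] * 3 + [("ENG", "M")]
toy_test = [("CS", "M"), ("ENG", "F"), ("CS", "F"), ("MATH", "M")]

rule, default = fit_major_rule(toy_train)
print(rule, default, accuracy(rule, default, toy_test))
```

A real decision tree would go on to split each major's records further on the other attributes, such as county or age grouping.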

  20. 108 clusters clearly show the disparity of graduates from certain severely economically distressed counties

  21. Age Groupings and Counties
      Top to bottom age:
      • 1980 – 1986 (20-26)
      • 1970 – 1979 (27-36)
      • 1960 – 1969 (37-46)
      • 1950 – 1959 (47-56)
      • 1940 – 1949 (57-66)
      • 1909 – 1939 (67-97)

  22. Jitter pulled back to show our near neighbors (bottom): Douglas, Jackson, Josephine, Klamath

  23. Age Groupings of Near Neighbor Graduates
      Left to right age:
      • 1909 – 1939 (67-97)
      • 1940 – 1949 (57-66)
      • 1950 – 1959 (47-56)
      • 1960 – 1969 (37-46)
      • 1970 – 1979 (27-36)
      • 1980 – 1986 (20-26)

  24. Near Neighbor is the x axis; Transfer yes/no is the y axis

  25. (Distressed) Josephine County produced one female CSIN major graduate (in the year 2000). This is not definitive: I was clicking on instances to see what I could find and could have missed another female from this county.

  26. Gender is 50/50 for Near Neighbor Graduates

  27. Listing by County, Number of Graduates
      Max-min number of graduates: 4854 Jackson (near neighbor), 581 Multnomah (distant), (Wheeler 0, Gilliam 2 distant), Lake 39, Harney 25, Grant 9, Malheur 22, Crook 20
