
Building matrices and normalization

To normalize co-occurrences, you first need to build a matrix with units (words, cited authors, etc.) in the columns and document numbers in the rows. BibExcel fills the matrix with numbers, and you can then calculate Salton’s or the Jaccard index.


  1. Building matrices and normalization

  2. To normalize co-occurrences, you first need to build a matrix with units (words, cited authors, etc.) in the columns and document numbers in the rows. BibExcel fills the matrix with numbers, and you can then calculate Salton’s or the Jaccard index.

  3. Make a co-word analysis based on the ID field. Running Edit out-files/Convert Upper Lower Case/Good for reference strings on the out-file produces a low-file, which is easier to read than the out-file.

  4. Calculate frequencies on the low-file; the resulting cit-file looks like this:

  5. Select the most frequent units, down to a frequency of 20, sort them in Excel, and paste them into The List. Then select the low-file containing the ID words and run Analyze/Docs and units matrix/Make docnr+units matrix without zero row sum.

  6. The ma5-file now contains the matrix!

  7. To calculate Salton’s index, select the ma5-file and run Analyze/Docs and units/Calculate Salton cosine from a ma5-file.

  8. Answer Yes (Ja) to the first question, and No (Nej) to the second.

  9. The result is in the sal-file, which holds Salton index values multiplied by 1000 (convenient for some applications). Instead of Salton, you may choose Jaccard or Vladutz & Cook normalization and apply it to the ma5-file.
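To make the arithmetic behind these steps concrete, here is a minimal Python sketch of the same idea: build a document-by-unit matrix, drop rows with a zero sum, and normalize the co-occurrence counts. It is not BibExcel's implementation. The toy documents, the function names, and the rounding of the value multiplied by 1000 are all assumptions made for illustration, and the formulas are the standard cosine (Salton) and Jaccard definitions for binary occurrence data.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: document number -> units (e.g. ID keywords).
docs = {
    1: {"citation", "mapping"},
    2: {"citation", "mapping", "bibliometrics"},
    3: {"citation", "bibliometrics"},
    4: {"mapping"},
    5: {"peer review"},  # contains no selected unit, so its row sum is zero
}

# The selected units, playing the role of the entries pasted into The List.
units = ["bibliometrics", "citation", "mapping"]


def build_matrix(docs, units):
    """Rows are document numbers, columns are units; zero-sum rows are dropped."""
    matrix = {}
    for doc_nr, doc_units in docs.items():
        row = [1 if unit in doc_units else 0 for unit in units]
        if sum(row) > 0:
            matrix[doc_nr] = row
    return matrix


def normalize(matrix, units):
    """Return {(unit_a, unit_b): (salton, jaccard)} for every pair of units."""
    columns = list(zip(*matrix.values()))
    freq = [sum(column) for column in columns]
    result = {}
    for i, j in combinations(range(len(units)), 2):
        # Co-occurrence: number of documents that contain both units.
        co = sum(a * b for a, b in zip(columns[i], columns[j]))
        if freq[i] == 0 or freq[j] == 0:
            salton, jaccard = 0.0, 0.0
        else:
            salton = co / sqrt(freq[i] * freq[j])
            jaccard = co / (freq[i] + freq[j] - co)
        result[(units[i], units[j])] = (salton, jaccard)
    return result


matrix = build_matrix(docs, units)
pairs = normalize(matrix, units)

for (unit_a, unit_b), (salton, jaccard) in pairs.items():
    # Scaling by 1000 mirrors the sal-file convention; the rounding is assumed.
    print(unit_a, unit_b, round(1000 * salton), round(jaccard, 3))
```

In this toy example, "citation" and "mapping" each occur in three documents and co-occur in two, so the Salton value is 2/3 and the Jaccard value is 0.5.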
