The Disputed Federalist Papers: SVM Feature Selection via Concave Minimization

### The Disputed Federalist Papers: SVM Feature Selection via Concave Minimization

Glenn Fung and Olvi L. Mangasarian

CSNA 2002

June 13-16, 2002

Madison, Wisconsin

Outline of Talk

- Support Vector Machines (SVM) Introduction

- Standard Quadratic Programming Formulation

- 1-norm Linear SVMs

- SVM Feature Selection

- Successive Linearization Algorithm (SLA)

- The Disputed Federalist Papers

- Description of the Classification Problem

- Description of Previous Work

- Results

- Separating Hyperplane in Three Dimensions Only

- Classification Agrees with Previous Results

What is a Support Vector Machine?

- An optimally defined surface
- Typically nonlinear in the input space
- Linear in a higher dimensional space
- Implicitly defined by a kernel function

What are Support Vector Machines Used For?

- Classification
- Regression & Data Fitting
- Supervised & Unsupervised Learning

(Will concentrate on classification)

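Algebra of the Classification Problem: 2-Category Linearly Separable Case

- Given m points in n-dimensional space
- Represented by an m-by-n matrix A
- Membership of each point $A_i$ in class +1 or -1 specified by:
  - An m-by-m diagonal matrix D with +1 & -1 entries
- Separate by two bounding planes, $x^\top w = \gamma \pm 1$:

$$A_i w \ge \gamma + 1 \ \text{for } D_{ii} = +1, \qquad A_i w \le \gamma - 1 \ \text{for } D_{ii} = -1$$

- More succinctly:

$$D(Aw - e\gamma) \ge e,$$

where e is a vector of ones.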
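As a minimal sketch of the algebra on this slide, the snippet below builds the diagonal label matrix D from a label vector and checks the separability condition $D(Aw - e\gamma) \ge e$ for one candidate plane. The four data points and the plane `(w, gamma)` are invented for illustration and do not come from the talk.

```python
import numpy as np

# Toy data (hypothetical): m = 4 points in n = 2 dimensions, one row per point.
A = np.array([[2.0, 0.0],
              [3.0, 1.0],
              [-2.0, 0.0],
              [-3.0, 1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])  # class labels, +1 or -1
D = np.diag(d)                        # m-by-m diagonal matrix of labels
e = np.ones(len(d))                   # vector of ones

# Candidate separating plane x'w = gamma (hypothetical values).
w = np.array([0.5, 0.0])
gamma = 0.0

# Left-hand side of D(Aw - e*gamma) >= e, one entry per training point.
margins = D @ (A @ w - e * gamma)
separated = bool(np.all(margins >= e))

# Distance between the bounding planes x'w = gamma + 1 and x'w = gamma - 1.
margin_width = 2.0 / np.linalg.norm(w)
print(margins, separated, margin_width)
```

Every entry of `margins` being at least 1 means each point lies on the correct side of its bounding plane.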

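Support Vector Machines: Quadratic Programming Formulation

- Solve the following quadratic program:

$$\min_{w,\gamma,y}\ \nu e^\top y + \tfrac{1}{2} w^\top w \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ y \ge 0,$$

where $\nu > 0$ is the weight of the training error

- Maximize the margin by minimizing $\tfrac{1}{2}\|w\|_2^2$

Support Vector Machines: Linear Programming Formulation

- Use the 1-norm instead of the 2-norm:

$$\min_{w,\gamma,y}\ \nu e^\top y + \|w\|_1 \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ y \ge 0$$

- This is equivalent to the following linear program:

$$\min_{w,\gamma,y,v}\ \nu e^\top y + e^\top v \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ -v \le w \le v,\ y \ge 0$$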
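The 1-norm linear program on this slide can be handed to an off-the-shelf LP solver. The sketch below is one possible encoding, assuming SciPy's `linprog` is available and assuming the standard form with slack vector y and bounding vector v (minimize $\nu e^\top y + e^\top v$ subject to $D(Aw - e\gamma) + y \ge e$, $-v \le w \le v$, $y \ge 0$). The function name `one_norm_svm` and the toy data are hypothetical, not taken from the talk.

```python
import numpy as np
from scipy.optimize import linprog

def one_norm_svm(A, d, nu=1.0):
    """Sketch of a 1-norm linear SVM solved as a linear program."""
    m, n = A.shape
    D = np.diag(d)
    # Variable order: x = [w (n), gamma (1), y (m), v (n)].
    c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), np.ones(n)])
    # D(Aw - e*gamma) + y >= e  rewritten as  -DAw + d*gamma - y <= -e.
    margin = np.hstack([-D @ A, d.reshape(-1, 1), -np.eye(m), np.zeros((m, n))])
    # w - v <= 0 and -w - v <= 0 together encode -v <= w <= v.
    upper = np.hstack([np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])
    lower = np.hstack([-np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])
    A_ub = np.vstack([margin, upper, lower])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    # w and gamma are free; y and v are nonnegative.
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:n], res.x[n]

# Toy data (hypothetical): only the first of three features separates the classes.
A = np.array([[2.0, 0.0, 0.0], [3.0, 1.0, -1.0], [2.5, -1.0, 2.0],
              [-2.0, 0.0, 0.0], [-3.0, 1.0, 1.0], [-2.5, -1.0, -2.0]])
d = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
w, gamma = one_norm_svm(A, d, nu=1.0)
print(np.round(w, 4), round(float(gamma), 4))
```

On this toy data the 1-norm objective already drives the weights of the two uninformative features to zero, which is the sparsity property that motivates using the 1-norm for feature selection.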

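Feature Selection and SVMs

- Use the step function to suppress components of the normal $w$ to the separating hyperplane:

$$\min_{w,\gamma,y,v}\ \nu e^\top y + e^\top v_* \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ -v \le w \le v,\ y \ge 0$$

Where:

$$(v_*)_i = \begin{cases} 1 & \text{if } v_i > 0 \\ 0 & \text{if } v_i = 0 \end{cases}$$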

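SVM Formulation with Feature Selection

- For $v \ge 0$, we use the approximation of the step vector $v_*$ by the concave exponential:

$$v_* \approx e - \varepsilon^{-\alpha v}, \quad \alpha > 0$$

- Here $\varepsilon$ is the base of natural logarithms. This leads to:

$$\min_{w,\gamma,y,v}\ \nu e^\top y + e^\top (e - \varepsilon^{-\alpha v}) \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ -v \le w \le v,\ y \ge 0$$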
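To see why a concave exponential can stand in for the step function, the short sketch below compares the two on a nonnegative vector. It assumes the approximation has the form $1 - \varepsilon^{-\alpha t}$ for $t \ge 0$; the value `alpha=5.0` and the sample vector are arbitrary illustrative choices, not parameters from the talk.

```python
import math

def step(t):
    """Step function on the nonnegative reals: 1 if t > 0, else 0."""
    return 1.0 if t > 0 else 0.0

def concave_exp(t, alpha=5.0):
    """Smooth concave surrogate for the step function: 1 - exp(-alpha * t)."""
    return 1.0 - math.exp(-alpha * t)

# Hypothetical vector of absolute weights; two of the three are nonzero.
v = [0.0, 0.5, 2.0]

exact_count = sum(step(t) for t in v)            # number of features in use
approx_count = sum(concave_exp(t) for t in v)    # smooth estimate of that number
print(exact_count, round(approx_count, 4))
```

Larger values of `alpha` make the surrogate hug the step function more tightly, at the price of a more sharply curved objective.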

Successive Linearization Algorithm (SLA) for Feature Selection

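- Choose the parameters $\nu, \alpha > 0$. Start with some $(w^0, \gamma^0, y^0, v^0)$. Having $(w^i, \gamma^i, y^i, v^i)$, determine the next iterate $(w^{i+1}, \gamma^{i+1}, y^{i+1}, v^{i+1})$ by solving the LP:

$$\min_{w,\gamma,y,v}\ \nu e^\top y + \alpha (\varepsilon^{-\alpha v^i})^\top (v - v^i) \quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ -v \le w \le v,\ y \ge 0$$

- Stop when:

$$\nu e^\top (y^{i+1} - y^i) + \alpha (\varepsilon^{-\alpha v^i})^\top (v^{i+1} - v^i) = 0$$

- Proposition: Algorithm terminates in a finite number of steps (typically 5 to 7) at a stationary point.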
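The successive linearization idea can be sketched as a loop that re-solves a linear program whose cost on v is the gradient of the concave exponential term at the previous iterate. The code below is a rough sketch under stated assumptions, not the authors' implementation: it assumes the objective $\nu e^\top y + e^\top(e - \varepsilon^{-\alpha v})$ with the constraints of the 1-norm SVM, uses SciPy's `linprog`, starts from the arbitrary point y = 0, v = 0, and stops when the linearized objective no longer changes. The function name, parameter values and toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def sla_feature_selection(A, d, nu=10.0, alpha=5.0, max_iter=20, tol=1e-6):
    """Sketch of successive linearization for concave feature selection."""
    m, n = A.shape
    D = np.diag(d)
    # Variable order: x = [w (n), gamma (1), y (m), v (n)].
    margin = np.hstack([-D @ A, d.reshape(-1, 1), -np.eye(m), np.zeros((m, n))])
    upper = np.hstack([np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])
    lower = np.hstack([-np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])
    A_ub = np.vstack([margin, upper, lower])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)

    y, v = np.zeros(m), np.zeros(n)  # arbitrary starting point
    for iteration in range(1, max_iter + 1):
        # Gradient of e'(e - exp(-alpha*v)) with respect to v at the current v.
        grad_v = alpha * np.exp(-alpha * v)
        c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), grad_v])
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        w, gamma = res.x[:n], res.x[n]
        y_new, v_new = res.x[n + 1:n + 1 + m], res.x[n + 1 + m:]
        # Change in the linearized objective between consecutive iterates.
        change = nu * np.sum(y_new - y) + grad_v @ (v_new - v)
        y, v = y_new, v_new
        if abs(change) < tol:
            break
    return w, gamma, iteration

# Toy data (hypothetical): only the first of three features separates the classes.
A = np.array([[2.0, 0.0, 0.0], [3.0, 1.0, -1.0], [2.5, -1.0, 2.0],
              [-2.0, 0.0, 0.0], [-3.0, 1.0, 1.0], [-2.5, -1.0, -2.0]])
d = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
w, gamma, iterations = sla_feature_selection(A, d)
print(np.round(w, 4), round(float(gamma), 4), iterations)
```

Each pass costs one LP solve, so the total work is the number of iterations times the cost of a single linear program. On this tiny example the first LP already isolates the informative feature, so the loop mainly illustrates the stopping test.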

The Federalist Papers

- Written in 1787-1788 by Alexander Hamilton, John Jay and James Madison to persuade the citizens of New York to ratify the Constitution.
- Papers consisted of short essays, 900 to 3500 words in length.
- Authorship of 12 of those papers has been in dispute (Madison or Hamilton). These papers are referred to as the disputed Federalist papers.

Previous Work

- Mosteller and Wallace (1964)
  - Using statistical inference, determined the authorship of the 12 disputed papers.

- Bosch and Smith (1998)
  - Using linear programming techniques and the evaluation of every possible combination of one, two and three features, obtained a separating hyperplane using only three words.

Description of the Data

- For every paper:
  - Machine-readable text was created using a scanner.
  - Relative frequencies were computed for 70 words that Mosteller and Wallace identified as good candidates for author-attribution studies.
  - Each document is represented as a vector containing the 70 real numbers corresponding to the 70 word frequencies.

- The dataset consists of 118 papers:
  - 50 Madison papers
  - 56 Hamilton papers
  - 12 disputed papers
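The per-paper feature vector described above can be sketched in a few lines. This is only an illustration of the idea of relative word frequencies: the tokenization rule, the function name `relative_frequencies`, the three-word vocabulary (standing in for the 70-word list, which the slides do not reproduce) and the sample sentence are all assumptions, and the slides do not say how the original frequencies were scaled.

```python
import re
from collections import Counter

def relative_frequencies(text, vocabulary):
    """Return, for each vocabulary word, its count divided by the total word count."""
    tokens = re.findall(r"[a-z]+", text.lower())  # crude tokenizer (assumption)
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0 for _ in vocabulary]
    return [counts[word] / total for word in vocabulary]

# Hypothetical stand-in for the 70-word list.
vocabulary = ["to", "upon", "would"]
# Made-up sample sentence, not a quotation from the Federalist papers.
text = "Upon the whole, it would be better to rely upon the people."
freqs = relative_frequencies(text, vocabulary)
print(freqs)
```

Applying such a function to every paper with the full word list would yield one 70-dimensional row per document, which is the shape of the matrix A used in the SVM formulations.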

SLA Feature Selection for Classifying the Disputed Federalist Papers

- Apply the successive linearization algorithm to:
- Train on the 106 Federalist papers with known authors
- Find a classification hyperplane that uses as few words as possible

- Use the hyperplane to classify the 12 disputed papers

Hyperplane Classifier Using 3 Words

- A hyperplane depending on three words was found:
  0.5368 to + 24.6634 upon + 2.9532 would = 66.6159

- All disputed papers ended up on the Madison side of the plane
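Applying the three-word hyperplane amounts to one weighted sum and one comparison, as sketched below. Two things here are assumptions rather than statements from the slides: that scores above the threshold fall on the Hamilton side (chosen because "upon" carries by far the largest weight and is commonly cited as a Hamilton marker), and the scale of the input frequencies. The two input dictionaries are invented solely to exercise the function and are not measurements of any paper.

```python
# Coefficients and threshold as given on the slide.
WEIGHTS = {"to": 0.5368, "upon": 24.6634, "would": 2.9532}
THRESHOLD = 66.6159

def score(freqs):
    """Weighted sum of the three word frequencies."""
    return sum(WEIGHTS[word] * freqs[word] for word in WEIGHTS)

def classify(freqs):
    # ASSUMPTION: scores above the threshold are labeled Hamilton; the slides
    # do not state which side of the plane belongs to which author.
    return "Hamilton" if score(freqs) > THRESHOLD else "Madison"

# Invented inputs (hypothetical scale), differing mainly in the use of "upon".
low_upon = {"to": 35.0, "upon": 0.2, "would": 5.0}
high_upon = {"to": 35.0, "upon": 3.0, "would": 7.0}

print(classify(low_upon), classify(high_upon))
```

Because only three coefficients are nonzero, a paper can be classified from three word frequencies instead of all 70.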

Results: 3D Plot of Resulting Hyperplane

Comparison with Previous Work & Conclusion

- Bosch and Smith (1998) calculated all the possible sets of one, two and three words to find a separating hyperplane. They solved 118,895 linear programs.
- Our SLA algorithm for feature selection required the solution of only 6 linear programs.
- Our classification of the disputed Federalist papers agrees with that of Mosteller-Wallace and Bosch-Smith.

More on SVMs:

- My web page:
www.cs.wisc.edu/~gfung

- Olvi Mangasarian web page:
www.cs.wisc.edu/~olvi
