NLP in Scala with Breeze and Epic - PowerPoint PPT Presentation
David Hall, UC Berkeley

    Presentation Transcript
    1. NLP in Scala with Breeze and Epic David Hall UC Berkeley

    2. ScalaNLP Ecosystem • Breeze ≈ Numpy/Scipy: Linear Algebra, Scientific Computing, Optimization • Epic ≈ PyStruct/NLTK: Natural Language Processing, Structured Prediction • Puck ≈ { }: Super-fast GPU parser for English

    3. Natural Language Processing [Figure: constituency parse tree with S, VP, NP, PP nodes over the sentences] Some fruit visionaries say the Fuji could someday tumble the Red Delicious from the top of America's apple heap. It certainly won’t get there on looks.

    4. Epic for NLP [Figure: constituency parse tree with S, VP, NP, PP nodes over the sentence] Some fruit visionaries say the Fuji could someday tumble the Red Delicious from the top of America's apple heap.

    5. Named Entity Recognition: Person, Organization, Location, Misc

    6. NER with Epic • import epic.models.NerSelector • val nerModel = NerSelector.loadNer("en").get • val tokens = epic.preprocess.tokenize("Almost 20 years ago, Bill Watterson walked away from \"Calvin & Hobbes.\"") • println(nerModel.bestSequence(tokens).render("O")) • Output: Almost 20 years ago , [PER:Bill Watterson] walked away from `` [LOC:Calvin & Hobbes] . '' (Not a location!)

    7. Annotate a bunch of data?

    8. Building an NER system • val data: IndexedSeq[Segmentation[Label, String]] = ??? • val system = SemiCRF.buildSimple(data, startLabel, outsideLabel) • println(system.bestSequence(tokens).render("O")) • Output: Almost 20 years ago , [PER:Bill Watterson] walked away from `` [MISC:Calvin & Hobbes] . ''

    9. Gazetteers http://en.wikipedia.org/wiki/List_of_newspaper_comic_strips_A%E2%80%93F

    10. Gazetteers

    11. Using your own gazetteer • val data: IndexedSeq[Segmentation[Label, String]] = ??? • val myGazetteer = ??? • val system = SemiCRF.buildSimple(data, startLabel, outsideLabel, gaz = myGazetteer)

    12. Gazetteer • Careful with gazetteers! • If a gazetteer is built from the training data, the system will use it (and only it) to make predictions! • So only known forms will be detected. • Still, gazetteers can be very useful…

    13. Semi-CR-What? • Semi-Markov Conditional Random Field • Don’t worry about the name.

    14. Semi-CRFs

    15. Semi-CRFs score(Chez Panisse) + score(Berkeley, CA) + score(- A bowl of) + score(Churchill-Brenneis Orchards) + score(Page mandarins and medjool dates)
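    The sum above can be sketched in a few lines of plain Scala: a semi-CRF scores a segmentation as the sum of independent per-segment scores. The `Segment` type and the toy `score` function here are illustrative stand-ins, not Epic's actual API.

    ```scala
    object SegmentationScore {
      case class Segment(label: String, words: Seq[String])

      // toy scorer (assumption): labeled segments score their length, "O" segments score 0
      def score(seg: Segment): Double =
        if (seg.label == "O") 0.0 else seg.words.length.toDouble

      // score of the whole segmentation = sum of per-segment scores
      def totalScore(segs: Seq[Segment]): Double = segs.map(score).sum

      def main(args: Array[String]): Unit = {
        val menuLine = Seq(
          Segment("ORG", Seq("Chez", "Panisse")),
          Segment("LOC", Seq("Berkeley,", "CA")),
          Segment("O", Seq("-", "A", "bowl", "of")))
        println(totalScore(menuLine)) // 2 + 2 + 0 = 4.0
      }
    }
    ```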

    16. Features score(Chez Panisse) = w(starts-with-Chez) + w(starts-with-C…) + w(ends-with-P…) + w(starts-sentence) + w(shape:Xxx Xxx) + w(two-words) + w(in-gazetteer)

    17. Building your own features
    val dsl = new WordFeaturizer.DSL[L](counts) with SurfaceFeaturizer.DSL
    import dsl._
    word(begin)          // word at the beginning of the span
    + word(end - 1)      // end of the span
    + word(begin - 1)    // before (gets things like Mr.)
    + word(end)          // one past the end
    + prefixes(begin)    // prefixes up to some length
    + suffixes(begin)
    + length(begin, end) // names tend to be 1-3 words
    + gazetteer(begin, end)

    18. Using your own featurizer • val data: IndexedSeq[Segmentation[Label, String]] = ??? • val myFeaturizer = ??? • val system = SemiCRF.buildSimple(data, startLabel, outsideLabel, featurizer = myFeaturizer)

    19. Features • So far, we’ve been able to do everything with (nearly) no math. • To understand more, we need to do some math.

    20. Machine Learning Primer • Training example (x, y) • x: sentence of some sort • y: labeled version • Goal: • want score(x, y) > score(x, y’), for all y’.
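    A minimal sketch of this goal with the linear model from the next slides, using plain Maps as sparse feature vectors. The weight values and feature names are made up for illustration; only the form score(x, y) = w dot f(x, y) comes from the slides.

    ```scala
    object LinearScore {
      type FeatureVec = Map[String, Double]

      // score(x, y) = w dot f(x, y), with missing features treated as 0
      def score(w: FeatureVec, f: FeatureVec): Double =
        f.map { case (feat, value) => w.getOrElse(feat, 0.0) * value }.sum

      def main(args: Array[String]): Unit = {
        val w      = Map("starts-with-C" -> 1.5, "in-gazetteer" -> 2.0, "lowercase" -> -1.0)
        val fGold  = Map("starts-with-C" -> 1.0, "in-gazetteer" -> 1.0) // f(x, y)
        val fWrong = Map("lowercase" -> 1.0)                            // f(x, y')
        // the goal: score for the gold label beats the wrong one
        println(score(w, fGold))  // 3.5
        println(score(w, fWrong)) // -1.0
      }
    }
    ```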

    21. Machine Learning Primer score(x, y) = wᵀ f(x, y)

    22. Machine Learning Primer score(x, y) = w.t * f(x, y)

    23. Machine Learning Primer score(x, y) = w dot f(x, y)

    24. Machine Learning Primer score(x, y) >= score(x, y’)

    25. Machine Learning Primer w dot f(x, y) >= w dot f(x, y’)

    26. Machine Learning Primer w dot f(x, y) >= w dot f(x, y’)

    27. Machine Learning Primer [Figure: geometric view of the update, showing the vectors w, f(x,y), f(x,y’), and w + f(x,y)] (w + f(x,y)) dot f(x,y) >= w dot f(x,y)
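    A quick numeric check of the inequality on this slide: after adding f(x,y) to w, the gold label's score cannot drop, because the score changes by exactly dot(f, f) >= 0. Plain Arrays stand in for the weight and feature vectors; the numbers are arbitrary.

    ```scala
    object PerceptronUpdateCheck {
      def dot(a: Array[Double], b: Array[Double]): Double =
        a.zip(b).map { case (x, y) => x * y }.sum

      def main(args: Array[String]): Unit = {
        val w = Array(0.5, -1.0, 0.25)
        val f = Array(1.0, 0.0, 2.0) // f(x, y) for the gold label
        val updated = w.zip(f).map { case (x, y) => x + y } // w + f(x, y)
        // (w + f(x,y)) dot f(x,y) >= w dot f(x,y), since the gap is dot(f, f)
        assert(dot(updated, f) >= dot(w, f))
        println(dot(updated, f) - dot(w, f)) // equals dot(f, f) = 5.0
      }
    }
    ```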

    28. The Perceptron
    val featureIndex: Index[Feature] = ???
    val labelIndex: Index[Label] = ???
    val weights = DenseVector.rand[Double](featureIndex.size)
    for ( epoch <- 0 until numEpochs; (x, y) <- data ) {
      val labelScores = DenseVector.tabulate(labelIndex.size) { yy =>
        val features = featuresFor(x, yy)
        weights.t * new FeatureVector(features) // or weights dot new FeatureVector(features)
      }
      …
    }

    29. The Perceptron (cont’d)
    val featureIndex: Index[Feature] = ???
    val labelIndex: Index[Label] = ???
    val weights = DenseVector.rand[Double](featureIndex.size)
    for ( epoch <- 0 until numEpochs; (x, y) <- data ) {
      val labelScores = ...
      val y_best = argmax(labelScores)
      if (y != y_best) {
        weights += new FeatureVector(featuresFor(x, y))
        weights -= new FeatureVector(featuresFor(x, y_best))
      }
    }

    30. Structured Perceptron • Can’t enumerate all segmentations! (on the order of L · 2ⁿ of them) • But dynamic programs exist to max or sum • … if the feature function has a nice form.
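    The "nice form" here is exactly the per-segment decomposition from slide 15, and the dynamic program is a semi-Markov Viterbi-style max: fill a table of best scores over prefixes in O(n · maxLen · L) time instead of enumerating exponentially many segmentations. This is a hedged pure-Scala sketch, not Epic's implementation; `segScore` is a toy stand-in for a learned segment scorer.

    ```scala
    object SemiMarkovViterbi {
      // best score over all segmentations of tokens [0, n), where a segment
      // [begin, end) with label `lab` contributes segScore(lab, begin, end)
      def best(n: Int, labels: Seq[String], maxLen: Int,
               segScore: (String, Int, Int) => Double): Double = {
        val table = Array.fill(n + 1)(Double.NegativeInfinity)
        table(0) = 0.0 // empty prefix
        for (end <- 1 to n; len <- 1 to math.min(maxLen, end); lab <- labels) {
          val begin = end - len
          val cand = table(begin) + segScore(lab, begin, end)
          if (cand > table(end)) table(end) = cand
        }
        table(n)
      }

      def main(args: Array[String]): Unit = {
        // toy scorer: "NAME" likes length-2 segments, "O" is neutral
        val segScore = (lab: String, b: Int, e: Int) =>
          if (lab == "NAME" && e - b == 2) 3.0
          else if (lab == "O") 0.0
          else -1.0
        // 6 tokens: best is three length-2 NAME segments, 3.0 each
        println(best(6, Seq("NAME", "O"), 3, segScore)) // 9.0
      }
    }
    ```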

    31. Structured Perceptron
    val featureIndex: Index[Feature] = ???
    val labelIndex: Index[Label] = ???
    val weights = DenseVector.rand[Double](featureIndex.size)
    for ( epoch <- 0 until numEpochs; (x, y) <- data ) {
      val y_best = bestStructure(weights, x)
      if (y != y_best) {
        weights += featuresFor(x, y)
        weights -= featuresFor(x, y_best)
      }
    }

    32. Constituency Parsing [Figure: constituency parse tree with S, VP, NP, PP nodes over the sentence] Some fruit visionaries say the Fuji could someday tumble the Red Delicious from the top of America's apple heap.

    33. Multilingual Parser “Berkeley” parser [Petrov & Klein, 2007]; Epic [Hall, Durrett, and Klein, 2014]

    34. Epic Pre-built Models
    • Parsing: English, Basque, French, German, Swedish, Polish, Korean (working on Arabic, Chinese, Spanish)
    • Part-of-Speech Tagging: English, Basque, French, German, Swedish, Polish
    • Named Entity Recognition: English
    • Sentence segmentation: English (ok support for others)
    • Tokenization: all of the above languages

    35. What Is Breeze? Dense Vectors, Matrices, Sparse Vectors, Counters, Matrix Decompositions

    36. What Is Breeze? Nonlinear Optimization, Probability Distributions

    37. Getting started
    libraryDependencies ++= Seq(
      "org.scalanlp" %% "breeze" % "0.8.1",
      // optional: native linear algebra libraries
      // bulky, but faster
      "org.scalanlp" %% "breeze-natives" % "0.8.1"
    )
    scalaVersion := "2.11.1"

    38. Linear Algebra • import breeze.linalg._ • val x = DenseVector.zeros[Int](5) // DenseVector(0, 0, 0, 0, 0) • val m = DenseMatrix.zeros[Int](5,5) • val r = DenseMatrix.rand(5,5) • m.t // transpose • x + x // addition • m * x // multiplication by vector • m * 3 // by scalar • m * m // by matrix • m :* m // element-wise mult, Matlab .*

    39. Return Type Selection • import breeze.linalg.{DenseVector => DV} • val dv = DV(1.0, 2.0) • val sv = SparseVector.zeros[Double](2) • dv + sv // res: DenseVector[Double] = DenseVector(1.0, 2.0) • sv + sv // res: SparseVector[Double] = SparseVector(1.0, 2.0)

    40. Return Type Selection • import breeze.linalg.{DenseVector => DV} • val dv = DV(1.0, 2.0) • val sv = SparseVector.zeros[Double](2) • (dv: Vector[Double]) + (sv: Vector[Double]) // static: Vector, dynamic: Dense // res: Vector[Double] = DenseVector(1.0, 2.0) • (sv: Vector[Double]) + (sv: Vector[Double]) // static: Vector, dynamic: Sparse // res: Vector[Double] = SparseVector()

    41. Linear Algebra: Slices
    • val m = DenseMatrix.zeros[Int](5, 5)
    • m(::, 1) // slice a column: DV(0, 0, 0, 0, 0)
    • m(4, ::) // slice a row
    • m(4, ::) := DV(1, 2, 3, 4, 5).t
    • m.toString:
      0 0 0 0 0
      0 0 0 0 0
      0 0 0 0 0
      0 0 0 0 0
      1 2 3 4 5

    42. Linear Algebra: Slices
    • m(0 to 1, 3 to 4).toString:
      0 0
      2 3
    • m(IndexedSeq(3, 1, 4, 2), IndexedSeq(4, 4, 3, 1)).toString:
      0 0 0 0
      0 0 0 0
      5 5 4 2
      0 0 0 0

    43. Universal Functions • import breeze.numerics._ • log(DenseVector(1.0, 2.0, 3.0, 4.0)) // DenseVector(0.0, 0.6931471805599453, // 1.0986122886681098, 1.3862943611198906) • exp(DenseMatrix( (1.0, 2.0), (3.0, 4.0))) • sin(Array(2.0, 3.0, 4.0, 42.0)) • // also cos, sqrt, asin, floor, round, digamma, trigamma, ...

    44. UFuncs: In-Place • import breeze.numerics._ • val v = DV(1.0, 2.0, 3.0, 4.0) • log.inPlace(v) // v == DenseVector(0.0, 0.6931471805599453, // 1.0986122886681098, 1.3862943611198906)

    45. UFuncs: Reductions • sum(DenseVector(1.0, 2.0, 3.0)) // 6.0 • sum(DenseVector(1, 2, 3)) // 6 • mean(DenseVector(1.0, 2.0, 3.0)) // 2.0 • mean(DenseMatrix( (1.0, 2.0, 3.0), (4.0, 5.0, 6.0))) // 3.5

    46. UFuncs: Reductions • val m = DenseMatrix((1.0, 3.0), (4.0, 4.0)) • sum(m) == 12.0 • mean(m) == 3.0 • sum(m(*, ::)) == DenseVector(4.0, 8.0) • sum(m(::, *)) == DenseMatrix((5.0, 7.0)) • mean(m(*, ::)) == DenseVector(2.0, 4.0) • mean(m(::, *)) == DenseMatrix((2.5, 3.5))

    47. Broadcasting • val dm = DenseMatrix((1.0, 2.0, 3.0), (4.0, 5.0, 6.0)) • sum(dm(::, *)) == DenseMatrix((5.0, 7.0, 9.0)) • dm(::, *) + DenseVector(3.0, 4.0) // DenseMatrix((4.0, 5.0, 6.0), (8.0, 9.0, 10.0)) • dm(::, *) := DenseVector(3.0, 4.0) // DenseMatrix((3.0, 3.0, 3.0), (4.0, 4.0, 4.0))

    48. UFuncs: Implementation • object log extends UFunc • implicit object logDouble extends log.Impl[Double, Double] { def apply(x: Double) = scala.math.log(x) } • log(1.0) // == 0.0 • // elsewhere: magic to automatically make this work: • log(DenseVector(1.0)) // == DenseVector(0.0) • log(DenseMatrix(1.0)) // == DenseMatrix(0.0) • log(SparseVector(1.0)) // == SparseVector()
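    The same pattern can be sketched in plain Scala without depending on Breeze: one `log` object whose behavior is selected by implicit `Impl` instances, so the same name works on scalars and, here, on Seqs. The names and the hand-written Seq lifting are illustrative, not Breeze's real internals (Breeze generates its collection implementations).

    ```scala
    object UFuncPattern {
      trait Impl[In, Out] { def apply(in: In): Out }

      // the single user-facing function object; behavior comes from the implicit
      object log {
        def apply[In, Out](in: In)(implicit impl: Impl[In, Out]): Out = impl(in)
      }

      // scalar implementation
      implicit val logDouble: Impl[Double, Double] =
        (x: Double) => math.log(x)

      // the "magic" lifting to collections, written out by hand for Seq
      implicit def logSeq(implicit scalar: Impl[Double, Double]): Impl[Seq[Double], Seq[Double]] =
        (xs: Seq[Double]) => xs.map(scalar(_))

      def main(args: Array[String]): Unit = {
        println(log(1.0))              // 0.0
        println(log(Seq(1.0, math.E))) // applied element-wise
      }
    }
    ```

    The payoff of this design is that adding support for a new container type is just one more implicit instance; call sites never change.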