
Programming for Linguists




  1. Programming for Linguists An Introduction to Python

  2. Contact • Claudia Peersman • claudia.peersman@ua.ac.be • Lange Winkelstraat 40, room L202 (2nd floor)

  3. Literature • “Think Python. How to Think Like a Computer Scientist” by Allen B. Downey, freely available at: http://greenteapress.com/thinkpython/thinkpython.html • “Natural Language Processing with Python. Analyzing Text with the Natural Language Toolkit” by Steven Bird, Ewan Klein, and Edward Loper, freely available at: http://www.nltk.org/book

  4. The Python programming language Part 1 • Formal vs. natural languages • The way of the program • Programming for linguists? • What is a program? • Debugging • Your first program

  5. Formal vs. natural languages • Natural Languages: spoken languages, e.g. English, Dutch, French… • not designed by people • evolved naturally • Formal Languages: designed by people for specific applications, e.g.: • in mathematics: notation which denotes relationships among numbers and symbols • in chemistry: represent the chemical structure of molecules

  6. Many features in common: • tokens, structure, syntax and semantics • A lot of differences:

  7. Some Examples • 5 + 5 = 10 • H2O • 5 + 5 = 1$0 ??? • Zz ??? • Illegal tokens $ and Zz • 5 +: 5 = 10 ??? • Legal tokens, but illegal structure +:

  8. The way of the program • Programming = the art of problem solving: • formulate problems • think creatively about possible solutions • express a solution clearly and accurately • trial and error

  9. Programming for linguists? • aim: handle large linguistic corpora • automatic frequency counts • distribution of linguistic features across different categories, corpora • look up context • existing tools are limited, cost money

  10. About Python… • open source • executed by an interpreter in two ways: • interactive mode • script mode

  11. interactive mode: • open the interpreter • >>> prompt = ready to begin • type a command • interpreter prints the results
      >>> 1 + 1
      2

  12. script mode: • open a new window in the interpreter • type a number of commands • save the program as a python script: e.g. test.py • the program is executed whenever you tell the interpreter to run it • the results are printed when the script is run

  13. Which mode to use? • interactive mode: • good for testing small parts of the program before you go on • does not save the program! • script mode: • put together all small parts of code in a sequence of instructions for the computer to execute • save your program • use it again in the future

  14. What is a program? • a sequence of instructions that specifies how to perform a computation • for linguists: the computation can also be e.g. looking up the context of words in a text, calculating average word lengths, sentence lengths, …
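
As a rough illustration of the kind of computation meant here, the sketch below calculates an average word length. The sample text and the variable names are made up for the example, and it is written with print() as a function (Python 3 style), whereas the slides use the Python 2 print statement.

```python
# Hypothetical sample text; any short string would do.
text = "The cat sat on the mat. It was a sunny day."

# Split on whitespace, then strip punctuation from the edges of each token.
words = [token.strip(".,;:!?") for token in text.split()]
words = [word for word in words if word]  # drop tokens that were only punctuation

# float() keeps the division exact in Python 2 as well as Python 3.
average_word_length = sum(len(word) for word in words) / float(len(words))
print(average_word_length)
```

Splitting on whitespace is only a crude approximation of tokenisation; it is enough to show the idea.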

  15. Some basic instructions • input: data you type, a text you load • output: display data on the screen, send data to a file • math: perform basic mathematical operations such as addition, subtraction, multiplication and division

  16. Some basic instructions (continued) • conditional execution: check for certain conditions and execute the appropriate instructions • repetition: perform some action repeatedly, usually with some variation • Programming = breaking a large, complex task into smaller and smaller subtasks until the subtasks are simple enough to be performed with one of these basic instructions
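
The basic instructions can be combined in a few lines. The following is a minimal sketch, assuming a made-up word list and an arbitrary threshold of five characters; it uses Python 3 style print(), not the Python 2 print statement of the slides.

```python
# Input: a hypothetical list of words (in practice this could come from a file).
words = ["colourless", "green", "ideas", "sleep", "furiously"]

long_words = 0
for word in words:                   # repetition: the body runs once per word
    if len(word) > 5:                # conditional execution: only for long words
        long_words = long_words + 1  # math: add one to the counter

print(long_words)                    # output: display the result on the screen
```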

  17. Debugging

  18. Bugs = programming errors • Debugging = process of tracking down programming errors • Three kinds of bugs: • syntax errors • runtime errors • semantic errors

  19. Syntax errors • refer to the structure of the program and the rules about that structure • if there is even a single syntax error in your code: • Python will display an error message • the execution of your program will quit immediately

  20. An example • parentheses: (1 + 2): correct syntax • 2): syntax error • Syntax errors are very common in the beginning. The more you practice and gain experience, the fewer mistakes you will make and the faster you will find them.

  21. Runtime errors • also called exceptions • do not appear until after the program has started to run • Python will display an error message • For example: you give the instruction to open a file, but you have typed in the wrong file name or wrong directory
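
The wrong-file-name case can be sketched as follows. The path is hypothetical and deliberately points at a file that does not exist; catching the exception with try/except is one possible way to react to it, shown here in Python 3 style.

```python
import os
import tempfile

# A hypothetical file name inside a fresh, empty temporary directory,
# so the file is guaranteed not to exist.
missing_path = os.path.join(tempfile.mkdtemp(), "no_such_corpus.txt")

try:
    corpus = open(missing_path)   # raises an exception at run time
except IOError as error:
    # The program has already started running when this error occurs.
    message = "Could not open file: " + str(error)
    print(message)
    opened = False
else:
    corpus.close()
    opened = True
```

Without the try/except, Python would stop the program and display the error message itself.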

  22. Semantic errors • The program will run perfectly, but it will not produce the results you wanted: the meaning of the program (semantics) is wrong • Tricky errors, because: • Python will not display an error message !! • you need to work backward looking at the output of the program and try to figure out what it is doing exactly

  23. An example • The Python file methods read() vs. readline() vs. readlines()
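
A small sketch of why mixing up these three methods leads to a semantic error rather than an error message. The file name and its contents are invented for the example, and the code uses Python 3 style print().

```python
import os
import tempfile

# Create a small hypothetical two-line text file to read from.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as handle:
    handle.write("first line\nsecond line\n")

with open(path) as handle:
    whole_text = handle.read()       # one string containing the entire file
with open(path) as handle:
    first_line = handle.readline()   # one string containing only the first line
with open(path) as handle:
    all_lines = handle.readlines()   # a list of strings, one per line

# len() runs without complaint on all three, but it counts different things:
print(len(whole_text))   # characters in the whole file
print(len(first_line))   # characters in the first line
print(len(all_lines))    # number of lines
```

If you meant to count lines but used read(), the program still runs and simply reports the wrong number.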

  24. Debugging is as important as programming itself: • do not only learn how to write a program • learn to write a program that works • learn to write a program that does what you want it to do • Always try out small pieces of code before you go on writing your program • Try out your code on short pieces of text, so that you can verify your results manually

  25. Your first program • open IDLE • The first program is usually called “Hello, world!” • In Python:
      >>> print "Hello, world!"
      or
      >>> print 'Hello, world!'
      • Mind the quotation marks!

  26. This is the print statement • The quotation marks mark the beginning and the end of the text to be displayed • The quotation marks do not appear in the result

  27. Why we teach Python: e.g. in Java:
      public class Hello {
          public static void main(String[] args) {
              System.out.println("Hello, World!");
          }
      }

  28. Make some mistakes • What happens if you: • leave out one of the quotation marks • replace " by ' or vice versa in one case • spell "print" wrong • double the quotation marks • double the quotation marks, but change the order

  29. By making mistakes on purpose you will: • learn which details are important in writing program code • learn to debug more efficiently, because you get to know what the error messages mean

  30. Try it yourselves • We will make time to try out new things as we proceed • Programming is a new way of thinking for linguists • If there is a problem or you have a question, do not hesitate to mention it immediately

  31. Values and Types • values = basic elements of a program • e.g. print "Hello, world!" • each value has a type: • integer • string • float

  32. Integer: whole numbers (no decimal point) • e.g. 105 • String: a sequence of characters enclosed in quotation marks • e.g. "Hello, World!" • Float: numbers with a decimal point • e.g. 10.5 • The interpreter can tell you the type of a value:
      >>> type(105)
      <type 'int'>
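
The same check can be done inside a script. This is a minimal sketch with three made-up values, written in Python 3 style; it prints only the name of each type, which is the same in Python 2 and Python 3.

```python
# Three hypothetical values that look similar but have different types.
values = [105, "105", 10.5]

type_names = [type(value).__name__ for value in values]
for value, name in zip(values, type_names):
    print(repr(value) + " is of type " + name)
```

Note that "105" in quotation marks is a string, not an integer.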

  33. Try to find out what the type is of the following values: • "Hello!" • 3.1415 • Dag Jan • "123" • "123.456"

  34. Try this: >>> print 123,456 • Floats are always written with a dot, never a comma • To which kind of error could this lead? • runtime error • syntax error • semantic error
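
To see what Python makes of the comma, the sketch below stores both spellings in variables and asks for their types. The variable names are invented for the example, and print() is used as a function (Python 3 style).

```python
with_comma = 123,456   # Python reads this as two separate integers
with_dot = 123.456     # Python reads this as one floating-point number

print(type(with_comma).__name__)
print(type(with_dot).__name__)
```

Python accepts both lines without an error message, which is exactly what makes this mistake hard to spot.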

  35. Variables • A name that refers to a value • An assignment statement creates new variables and assigns values to them • You can choose the name yourself • e.g.
      >>> text = "Everything except 'Hello, world!'"
      >>> age = 26
      >>> pi = 3.1415

  36. The variables now carry the values we assigned to them:
      >>> print text
      >>> print age
      >>> print pi
      • The interpreter can again tell you the type:
      >>> type(text)

  37. Variable names: • can be arbitrarily long • can contain both letters and numbers • have to begin with a letter (or an underscore) • can contain uppercase letters • are case sensitive! • If you use an illegal character in your name, you will get a syntax error message: • e.g. my name, live@

  38. You cannot choose a name that is a keyword in Python:
      and       del       from      not       while
      as        elif      global    or        with
      assert    else      if        pass      yield
      break     except    import    print
      class     exec      in        raise
      continue  finally   is        return
      def       for       lambda    try
      • Tip: try to choose names which describe what the variable is used for
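
Python can also tell you whether a name is reserved. The sketch below uses the standard-library keyword module; the candidate names are made up, and the exact keyword list depends on the Python version you run (in Python 3, for instance, print and exec are no longer keywords).

```python
import keyword

# Hypothetical candidate variable names to check.
candidates = ["class", "text", "lambda", "average_age"]

reserved = [name for name in candidates if keyword.iskeyword(name)]
for name in candidates:
    print("%s -> %s" % (name, keyword.iskeyword(name)))
```

The full list for your own interpreter is available as keyword.kwlist.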

  39. Statements • Units of code that the Python interpreter can execute • So far we have seen the print statement and the assignment statement • A program usually contains a series of statements that are executed in an order predetermined by the programmer

  40. e.g.
      >>> age1 = 20
      >>> age2 = 40
      >>> print age2
      40
      >>> average_age = (age1 + age2)/2
      >>> print average_age
      30

  41. You always have to assign a value to a variable before you can work with it • Variable names have to be spelled in the same way throughout the program • If you assign a new value to an existing variable, the old value is replaced, e.g.
      >>> age = 20
      >>> age = age + 20
      >>> print age
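
A short sketch of both points: reassigning a variable, and what happens when a name is misspelled. The misspelling agee is deliberate, and the try/except is only there so that the example keeps running; the code is in Python 3 style.

```python
age = 20
age = age + 20   # the right-hand side is computed first, then age gets the new value
print(age)       # 40

try:
    print(agee)  # deliberately misspelled: no value was ever assigned to agee
except NameError as error:
    name_error_message = str(error)
    print(name_error_message)
```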

  42. Operators and Operands • Operators = special symbols that represent computations • e.g. +, -, *, /, ** • Operands = the values the operator is applied to • e.g. 2 + 2 • Try 2/3

  43. When both operands are integers, the result is again an integer • If you want a floating-point result, you have to make one of the operands a floating-point number:
      >>> 2/3.0
      0.66666666666666663
      • You can also give a command at the beginning of your script: from __future__ import division
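
The difference can be made explicit in code. This sketch uses the // operator, which performs floor division in both Python 2 and Python 3, next to a division with a float operand; in Python 3 the plain / operator always returns a float, so the integer result described on this slide applies to Python 2.

```python
floor_result = 2 // 3    # floor division: the fractional part is discarded
float_result = 2 / 3.0   # one float operand gives a floating-point result

print(floor_result)      # 0
print(float_result)
```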

  44. Expressions • A combination of values, variables, and operators • Try:
      >>> x = 5
      >>> x + 1
      • Now make a script of it (File → New window) and run it (Run → Run module)

  45. In a script an expression all by itself does not print a result !!! • How can you modify the script so that it does produce a result ?

  46. Order of Operations • The order of evaluation depends on the rules of precedence • For mathematical operators, Python follows mathematical conventions: • Parentheses • Exponentiation • Multiplication and division • Addition and subtraction
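
A few made-up expressions show how these rules change the result; parentheses can always be used to force a different order. Python 3 style print() is assumed.

```python
without_parentheses = 2 + 3 * 4     # multiplication first: 2 + 12
with_parentheses = (2 + 3) * 4      # parentheses first: 5 * 4
power_first = 2 * 3 ** 2            # exponentiation first: 2 * 9
product_first = (2 * 3) ** 2        # parentheses first: 6 ** 2

print(without_parentheses)  # 14
print(with_parentheses)     # 20
print(power_first)          # 18
print(product_first)        # 36
```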

  47. String Operations • In general: no mathematical operations on strings, e.g. "hello" / "hi" → TypeError: unsupported operand type(s) for /: 'str' and 'str' • Except: the + and * operators

  48. Try: • "hello" + "hi" • "hello" * 2 • string + string = concatenation • string * int = repetition
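
The two string operators can be sketched with a few other made-up strings. The last two lines show why the type of a value matters: + means something different for strings than for integers. Python 3 style print() is assumed.

```python
joined = "good" + "bye"    # concatenation: the strings are glued together
repeated = "ha" * 3        # repetition: the string is repeated three times

digits_as_strings = "12" + "3"   # strings: concatenation
digits_as_numbers = 12 + 3       # integers: addition

print(joined)              # goodbye
print(repeated)            # hahaha
print(digits_as_strings)   # 123
print(digits_as_numbers)   # 15
```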

  49. Boolean Expressions • An expression that is either True or False, e.g. with the operator ==
      >>> 5 == 5
      True
      >>> 5 == 6
      False
      • True and False have the type <type 'bool'>; they are not strings

  50. Relational Operators
      x == y    x is equal to y
      x != y    x is not equal to y
      x > y     x is greater than y
      x < y     x is smaller than y
      x >= y    x is greater than or equal to y
      x <= y    x is smaller than or equal to y
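
All six operators can be tried at once. In this sketch x and y are given the arbitrary values 5 and 6, and each comparison produces a Boolean; Python 3 style print() is assumed.

```python
x = 5
y = 6

comparisons = [x == y, x != y, x > y, x < y, x >= y, x <= y]
print(comparisons)   # [False, True, False, True, False, True]

# The same operators work on other numbers, e.g. the lengths of two words.
longer = len("linguistics") > len("python")
print(longer)        # True
```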
