
Programming for Linguists

Programming for Linguists. An Introduction to Python. Contact. Claudia Peersman claudia.peersman@ua.ac.be Lange Winkelstraat 40, room L202 (2nd floor). Literature.


Presentation Transcript


  1. Programming for Linguists An Introduction to Python

  2. Contact • Claudia Peersman • claudia.peersman@ua.ac.be • Lange Winkelstraat 40, room L202 (2nd floor)

  3. Literature • “Think Python: How to Think Like a Computer Scientist” by Allen B. Downey, freely available at: http://greenteapress.com/thinkpython/thinkpython.html • “Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit” by Steven Bird, Ewan Klein, and Edward Loper, freely available at: http://www.nltk.org/book

  4. The Python programming language Part 1 • Formal vs. natural languages • The way of the program • Programming for linguists? • What is a program? • Debugging • Your first program

  5. Formal vs. natural languages • Natural Languages: spoken languages, e.g. English, Dutch, French… • not designed by people • evolved naturally • Formal Languages: designed by people for specific applications, e.g.: • in mathematics: notation which denotes relationships among numbers and symbols • in chemistry: represent the chemical structure of molecules

  6. Many features in common: • tokens, structure, syntax and semantics • A lot of differences:

  7. Some Examples • 5 + 5 = 10 • H2O • 5 + 5 = 1$0 ??? • Zz ??? • Illegal tokens $ and Zz • 5 +: 5 = 10 ??? • Legal tokens, but illegal structure +:

  8. The way of the program • Programming = the art of problem solving: • formulate problems • think creatively about possible solutions • express a solution clearly and accurately • trial and error

  9. Low-level vs. high-level languages • Low-level languages = “machine languages”: the only languages a computer can execute directly • High-level languages like Python, Perl, Java, C++ have to be translated into a low-level language before they can be executed, by: • compilers • interpreters

  10. An interpreter: • processes the program a little at a time • alternates between reading lines and performing computations • A compiler: • translates the high-level language completely first • once a program is compiled, it can be executed repeatedly without further translation

  11. Programming for linguists? • aim: handle large linguistic corpora • automatic frequency counts • distribution of linguistic features across different categories, corpora • look up context • existing tools are limited, cost money
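As a rough illustration of an automatic frequency count, here is a minimal sketch. The mini-corpus is invented for the example, tokens are simply split on whitespace, and `Counter` comes from Python's standard `collections` module; a real corpus would be loaded from a file and tokenized more carefully.

```python
from collections import Counter

# Hypothetical mini-corpus, invented for this example.
corpus = "the cat sat on the mat and the dog sat too"

# Split on whitespace to get the tokens, then count each distinct token.
tokens = corpus.split()
frequencies = Counter(tokens)

# Show the two most frequent tokens with their counts.
for word, count in frequencies.most_common(2):
    print("%s %d" % (word, count))
# prints:
# the 3
# sat 2
```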

  12. About Python… • high-level language • open source • executed by an interpreter in two ways: • interactive mode • script mode

  13. interactive mode: • open the interpreter • >>> prompt = ready to begin • type a command • interpreter prints the results
      >>> 1 + 1
      2

  14. script mode: • open a new window in the interpreter • type a number of commands • save the program as a python script: e.g. test.py • the program is executed whenever you tell the interpreter to run it • the results are printed when the script is run

  15. Which mode to use? • interactive mode: • good for testing small parts of the program before you go on • does not save the program! • script mode: • put together all small parts of code in a sequence of instructions for the computer to execute • save your program • use it again in the future

  16. What is a program? • a sequence of instructions that specifies how to perform a computation • for linguists: the computation can also be e.g. looking up the context of words in a text, calculating average word lengths, sentence lengths, …
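One of the computations mentioned here, the average word length, could be sketched as follows. The example sentence is an assumption chosen for illustration, and words are taken to be whitespace-separated.

```python
# Hypothetical example sentence; any text would do.
sentence = "Colorless green ideas sleep furiously"

# A "word" here is simply anything between spaces.
words = sentence.split()
word_lengths = [len(word) for word in words]

# Divide by a float so the result keeps its decimals in any Python version.
average_length = sum(word_lengths) / float(len(words))
print(average_length)  # 6.6
```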

  17. Some basic instructions • input: data you type, a text you load • output: display data on the screen, send data to a file • math: perform basic mathematical operations like +, -, ×, ÷

  18. conditional execution: check for certain conditions and execute the appropriate instructions • repetition: perform some action repeatedly, usually with some variation Programming = breaking a large, complex task into smaller and smaller subtasks until the subtasks are simple enough to be performed with one of these basic instructions
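Repetition and conditional execution often appear together. The sketch below, with an invented word list and an arbitrary length threshold of 5, repeats one action for every word and keeps only the words that meet the condition.

```python
# Hypothetical input data for the example.
words = ["a", "linguist", "writes", "programs"]

long_words = []
for word in words:        # repetition: the same action for every word
    if len(word) > 5:     # conditional execution: only for long words
        long_words.append(word)

print(long_words)  # ['linguist', 'writes', 'programs']
```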

  19. Debugging

  20. Bugs = programming errors • Debugging = process of tracking down programming errors • Three kinds of bugs: • syntax errors • runtime errors • semantic errors

  21. Syntax errors • refer to the structure of the program and the rules about that structure • if there is even a single syntax error in your code: • Python will display an error message • the execution of your program will quit immediately

  22. An example • parentheses: (1 + 2) is correct syntax; 2) is a syntax error • Syntax errors are very common in the beginning. The more you practice and gain experience, the fewer mistakes you will make and the faster you will find them.
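To see the difference without crashing a script, one can ask Python to check a piece of code before running it. This is only a sketch: `has_valid_syntax` is a hypothetical helper name, and it relies on the built-in `compile()` function raising `SyntaxError` for malformed expressions.

```python
def has_valid_syntax(source):
    """Return True if `source` is a syntactically valid Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(has_valid_syntax("(1 + 2)"))  # True
print(has_valid_syntax("2)"))       # False
```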

  23. Runtime errors • also called exceptions • do not appear until after the program has started to run • Python will display an error message • For example: you give the instruction to open a file, but you have typed in the wrong file name or wrong directory
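The file example could look like the sketch below. The path is deliberately made up so that opening it fails; `try` and `except` are used only to catch the exception and show its message instead of stopping the program.

```python
# Hypothetical path that is assumed not to exist.
missing_path = "no_such_directory/no_such_file.txt"

try:
    with open(missing_path) as handle:
        text = handle.read()
    error_message = None
except IOError as error:
    # The program has started running and only now hits the problem.
    text = ""
    error_message = str(error)

print(error_message)
```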

  24. Semantic errors • The program will run perfectly, but it will not produce the results you wanted: the meaning of the program (semantics) is wrong • Tricky errors, because: • Python will not display an error message !! • you need to work backward looking at the output of the program and try to figure out what it is doing exactly

  25. An example • the Python file methods read() vs. readline() vs. readlines(): each runs without an error message, but each returns something different
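A minimal sketch of how these three methods differ. To keep it self-contained, it reads from an in-memory text made with `io.StringIO` rather than from a real file; the two-line content is invented for the example.

```python
import io

# Hypothetical file contents: two short lines.
contents = u"first line\nsecond line\n"

whole_text = io.StringIO(contents).read()      # one string with everything
first_line = io.StringIO(contents).readline()  # only the first line
all_lines = io.StringIO(contents).readlines()  # a list with one string per line

# Semantic error in the making: both calls run fine, but only the second
# one counts lines. The first one counts characters.
print(len(whole_text))  # 23
print(len(all_lines))   # 2
```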

  26. Debugging is as important as programming itself: • not only learn how to write a program • learn to write a program that works • learn to write a program that does what you want it to do • Always try out small pieces of code before you go on with writing your program • Try out your code on short pieces of text, so that you can verify your results manually

  27. Your first program • open IDLE (desktop) • The first program is usually called “Hello, world!” • In Python: >>> print "Hello, world!" or >>> print 'Hello, world!' • Mind the quotation marks!

  28. This is the print statement • The quotation marks mark the beginning and the end of the text to be displayed • The quotation marks do not appear in the result

  29. Why we teach Python: e.g. in Java:
      public class Hello {
          public static void main(String[] args) {
              System.out.println("Hello, World!");
          }
      }

  30. Make some mistakes • What happens if you: • leave out one of the quotation marks • replace " by ' (or vice versa) at one end only • spell “print” wrong • double the quotation marks • double the quotation marks, but change the order

  31. By making mistakes on purpose you will: • learn which details are important in writing program code • learn to debug more efficiently, because you get to know what the error messages mean

  32. Try it yourselves • We will make time to try out new things as we proceed • Programming is a new way of thinking for linguists • If there is a problem or you have a question, do not hesitate to mention it immediately

  33. Values and Types • values = basic elements of a program • e.g. the text in print "Hello, world!" • each value has a type: • integer • string • float

  34. Integer: a whole number • e.g. 105 • String: a sequence of characters • e.g. "Hello, World!" • Float: a number with a decimal point • e.g. 10.5 • The interpreter can tell you the type of a value:
      >>> type(105)
      <type 'int'>
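The same check can be run on all three example values at once. In this sketch, `type(value).__name__` is used to get just the short type name, which avoids the difference between how Python 2 and Python 3 display a type.

```python
# The three example values from the slide.
values = (105, "Hello, World!", 10.5)

# Collect the short name of each value's type.
type_names = [type(value).__name__ for value in values]
print(type_names)  # ['int', 'str', 'float']
```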

  35. Try to find out what the type is of the following values: • "Hello!" • 3.1415 • Dag Jan • "123" • "123.456"

  36. Try this: >>> print 123,456 • Float types always have a dot, never a comma • To which kind of error could this lead? • runtime error • syntax error • semantic error
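A small sketch of what the comma does in an assignment. The variable names are invented for the example; the point is only that Python reads the comma as a separator between two integers, not as a decimal mark.

```python
number = 123,456      # comma: Python builds a pair of two integers
intended = 123.456    # dot: a single floating-point number

print(type(number).__name__)    # tuple
print(type(intended).__name__)  # float
```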

  37. Variables • A name that refers to a value • An assignment statement creates new variables and assigns values to them • You can choose the name yourself • e.g.
      >>> text = "Everything except 'Hello, world!'"
      >>> age = 26
      >>> pi = 3.1415

  38. The variables now carry the values we assigned to them:
      >>> print text
      >>> print age
      >>> print pi
      • The interpreter can again tell you the type:
      >>> type(text)

  39. Variable names: • can be arbitrarily long • can contain both letters and numbers • have to begin with a letter (or an underscore) • can contain uppercase letters • are case sensitive ! • If you use an illegal character in your name, you will get a syntax error message: • e.g. my name, live@

  40. You cannot choose a name that is a keyword in Python: and, as, assert, break, class, continue, def, del, elif, else, except, exec, finally, for, from, global, if, import, in, is, lambda, not, or, pass, print, raise, return, try, while, with, yield • Tip: try to choose names which describe what the variable is used for
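Candidate names can also be checked by the interpreter itself. The sketch below assumes Python 3, where strings have an `isidentifier()` method; `is_usable_name` and the candidate list are invented for the example. Note that the keyword list differs slightly between versions: in Python 3, `print` and `exec` are no longer keywords.

```python
import keyword

def is_usable_name(name):
    """True if `name` is well-formed and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)

# Hypothetical candidates: two good names and four problematic ones.
candidates = ["age", "average_age", "my name", "live@", "2nd_text", "class"]
usable = [name for name in candidates if is_usable_name(name)]

print(usable)  # ['age', 'average_age']
```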

  41. Statements • Units of code that the Python interpreter can execute • So far we have seen the print statement and the assignment statement • A program usually contains a series of statements that are executed in an order predetermined by the programmer

  42. e.g.
      >>> age1 = 20
      >>> age2 = 40
      >>> print age2
      40
      >>> average_age = (age1 + age2)/2
      >>> print average_age
      30

  43. You always have to assign a value to a variable before you can work with it • Variables have to be spelled in the same way throughout the program • If you assign a new value to an existing variable, the old value is replaced, e.g.
      >>> age = 20
      >>> age = age + 20
      >>> print age
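Both points can be tried out in a few lines. In this sketch `undefined_total` is a deliberately unassigned, hypothetical name; the `try` and `except` lines only catch the resulting error so that its message can be inspected.

```python
age = 20
age = age + 20   # the name age now refers to the new value
print(age)       # 40

# Using a name that was never assigned raises a NameError.
try:
    print(undefined_total)
except NameError as error:
    name_error = str(error)

print(name_error)
```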

  44. Operators and Operands • Operators = special symbols that represent computations • e.g. +, -, *, /, ** • Operands = the values the operator is applied to • e.g. 2 + 2 • Try 2/3

  45. When both operands are integers, the result is again an integer • If you want a floating-point result, you have to make one of the operands a floating-point number:
      >>> 2/3.0
      0.66666666666666663
      • you can also give a command at the beginning of your script: from __future__ import division
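The integer behaviour described here is what Python 2 does for `2/3`; in Python 3 the `/` operator always gives a float. The sketch below therefore uses `//`, which performs integer (floor) division in both versions, so the contrast is visible whichever interpreter is installed.

```python
integer_result = 2 // 3   # floor division: the fractional part is dropped
float_result = 2 / 3.0    # one float operand: floating-point result

print(integer_result)  # 0
print(float_result)
```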

  46. Expressions • A combination of values, variables, and operators • Try:
      >>> x = 5
      >>> x + 1
      • Now make a script of it (File → New window) and run it (Run → Run module)

  47. In a script an expression all by itself does not print a result !!! • How can you modify the script so that it does produce a result ?

  48. Order of Operations • The order of evaluation depends on the rules of precedence • For mathematical operators, Python follows mathematical conventions: • Parentheses • Exponentiation • Multiplication and division • Addition and subtraction
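A short worked example of these precedence rules. The numbers are arbitrary, and the comments trace the order in which Python evaluates each part.

```python
# Exponentiation first (4 ** 2 = 16), then multiplication (3 * 16 = 48),
# then addition (2 + 48 = 50).
without_parentheses = 2 + 3 * 4 ** 2

# Parentheses first (2 + 3 = 5), then exponentiation (16), then 5 * 16 = 80.
with_parentheses = (2 + 3) * 4 ** 2

print(without_parentheses)  # 50
print(with_parentheses)     # 80
```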

  49. String Operations • In general: no mathematical operations on strings, e.g. "hello" / "hi" → TypeError: unsupported operand type(s) for /: 'str' and 'str' • Except: the + and * operators

  50. Try: • "hello" + "hi" • "hello" * 2 • string + string = concatenation • string * int = repetition
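The two permitted string operators, and one that is not permitted, can be tried side by side. This is a minimal sketch; the `try` and `except` lines are there only so the TypeError can be shown without stopping the script.

```python
greeting = "hello" + "hi"   # concatenation
repeated = "hello" * 2      # repetition

print(greeting)  # hellohi
print(repeated)  # hellohello

# Division is not defined for strings and raises a TypeError.
try:
    "hello" / "hi"
except TypeError as error:
    division_error = str(error)

print(division_error)
```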
