LING/C SC 581: Advanced Computational Linguistics

Slides:



Advertisements
Similar presentations
Lab III – Linux at UMBC.
Advertisements

LING/C SC 581: Advanced Computational Linguistics Lecture Notes Jan 15 th.
LING 581: Advanced Computational Linguistics Lecture Notes January 19th.
LING 364: Introduction to Formal Semantics Lecture 1 January 12th.
CS – 600 Introduction to Computer Science Prof. Angela Guercio Spring 2008.
COMPSCI 101 S Principles of Programming Lecture 1 – Introduction.
Course A201: Introduction to Programming 09/09/2010.
COMP Introduction to Programming Yi Hong May 13, 2015.
LING 408/508: Programming for Linguists Lecture 3 August 31 st.
CST 229 Introduction to Grammars Dr. Sherry Yang Room 213 (503)
Course Overview Ted Baker  Andy Wang COP 5641 / CIS 4930.
Welcome to CS 115! Introduction to Programming. Class URL Write this down!
COP3502: Introduction to Computer Science Yashas Shankar.
Course Overview Ted Baker  Andy Wang COP 5641 / CIS 4930.
Introduction to ECE 2401 Data Structure Fall 2005 Chapter 0 Chen, Chang-Sheng
Computer Science 101 Spring 2000 Section E8TBA Registration Code 1693 Dr. Christopher Vickery.
General rules 1. Rule: 2. Rule: 3. Rule: 10. Rule: Ask questions ……………………. 11. Rule: I do not know your skill. If I tell you things you know, please stop.
Intro to CS ACO 101 Lab Rat. Academic Integrity What does that mean in programming? Log into Blackboard and take the test titled “Applied Computing Course.
8 January 2016Birkbeck College, U. London1 Introduction to Programming Lecturer: Steve Maybank Department of Computer Science and Information Systems
LING/C SC 581: Advanced Computational Linguistics Lecture Notes Jan 13 th.
Course Overview Ted Baker  Andy Wang COP 5641 / CIS 4930.
LING 408/508: Programming for Linguists Lecture 19 November 9 th.
Course Information CSE 2031 Fall Instructor U. T. Nguyen /new-yen/ Office: CSEB Office hours:  Tuesday,
Course Information CSE 2031 Fall Instructor U.T. Nguyen Office: CSE Home page:
Web Application Development Instructor: Matthew Schurr Please sign in on the sheet at the front of the room when you arrive.
Course Information EECS 2031 Fall Instructor Uyen Trang (U.T.) Nguyen Office: LAS Office hours: 
Introduction to Programming
Development Environment
Course Information EECS 2031 – Section A Fall 2017.
Computer Engineering Department Islamic University of Gaza
MET4750 Techniques for Earth System Modeling
Install external command line softwares
CSc 1302 Principles of Computer Science II
Introduction to Programming
Introduction to .NET Core
Intro to Computer Science II
Introduction to Programming
LING/C SC/PSYC 438/538 Lecture 2 Sandiway Fong.
Introduction to Programming
CSE1320 INTERMEDIATE PROGRAMMING
CSE1320 INTERMEDIATE PROGRAMMING
Week 1 Gates Introduction to Information Technology cosc 010 Week 1 Gates
LING/C SC/PSYC 438/538 Lecture 20 Sandiway Fong.
LING 408/508: Computational Techniques for Linguists
LING/C SC 581: Advanced Computational Linguistics
LING 388: Computers and Language
LING 581: Advanced Computational Linguistics
Welcome to CS 1010! Algorithmic Problem Solving.
LING 388: Computers and Language
Andy Wang Operating Systems COP 4610 / CGS 5765
CSC2310 Principles of Computer Programming
Andy Wang Operating Systems COP 4610 / CGS 5765
CSE1320 INTERMEDIATE PROGRAMMING
CSE1320 INTERMEDIATE PROGRAMMING
LING/C SC 581: Advanced Computational Linguistics
Welcome to CS220/MATH 320 – Applied Discrete Mathematics Fall 2018
Andy Wang Operating Systems COP 4610 / CGS 5765
Course page: CSE/Math 1560: Introduction to Computing for Mathematics and Statistics Winter 2011 Suprakash Datta.
Introduction to Programming
Course Information EECS 2031 Fall 2016.
LING 388: Computers and Language
Andy Wang Operating Systems COP 4610 / CGS 5765
LING 408/508: Computational Techniques for Linguists
LING/C SC 581: Advanced Computational Linguistics
BIT 115: Introduction To Programming
LING/C SC 581: Advanced Computational Linguistics
Computer Networks CNT5106C
Andy Wang Operating Systems COP 4610 / CGS 5765
Review of Previous Lesson
LING/C SC/PSYC 438/538 Lecture 3 Sandiway Fong.
Presentation transcript:

LING/C SC 581: Advanced Computational Linguistics Lecture Notes Jan 10th

Today's Lecture unfortunately will be short.. Homework 1: install Python 3 and nltk/nltk_data on your own computer unfortunate throat issues/coughing…

Administrivia No lectures on I'm away on business on Th 3/7 (Spring Recess) I'm away on business on Tu 1/29 Th 1/31 Tu 2/5 Th 2/7 (What to do: TBD) Guest lectures (more to be announced): 2/14 Tatjana Scheffler ….

Course Webpage for lecture slides and Panopto recordings: http://elmo.sbs.arizona.edu/~sandiway/#courses available from just before class time (afterwards, look again for corrections/updates) in .pptx (good for animations) and .pdf formats Meeting information Tues/Thurs 3:30-4:45pm. McClellandPark, Room 102. not guaranteed!

Course Objectives Follow-on course to LING/C SC/PSYC 538 Computational Linguistics: pre-requisite: 538 continue with selected material from the 538 textbook (J&M): 25 chapters, a lot of material not covered in 438/538 And gain more extensive experience with new stuff not in textbook dealing with natural language software packages Installation, input data formatting operation project exercises useful “real-world” computational experience abilities gained will be of value to employers

Computational Facilities Use your own laptop/desktop can also make use of the computers in this lab but you don’t have installation rights on these computers Platforms Windows is maybe possible but you really should run some variant of Unix… Linux (e.g. Ubuntu, separate bootable partition or via virtualization software if you use Windows) OSX Not quite Linux, some porting issues, especially with C programs, can use Virtual Box (Linux under OSX) Or Macports or Homebrew (no need for virtualization)

Grading Office hours (by appointment): Completion of all homework tasks will result in a satisfactory grade (A) Tasks typically should be completed before the corresponding class next week. email me your work (sandiway@email.arizona.edu). also be prepared to come up and present your work (if called upon). Office hours (by appointment): also after class

Syllabus Revisions to the syllabus Homeworks you may discuss questions with other students however, you must write it up yourself (in your own words, your own code etc.) cite (web) references and your classmates (in the case of discussion) Student Code of Academic Integrity: plagiarism etc. http://deanofstudents.arizona.edu/codeofacademicintegrity Revisions to the syllabus “the information contained in the course syllabus, other than the grade and absence policies, may be subject to change with reasonable advance notice, as deemed appropriate by the instructor.”

www.python.org

Windows 10: 32 or 64 bit system? Choose correct file: Look for System in Control Panel: Choose correct file: Windows x86-64 executable installer (for 64 bit systems) Windows x86 executable installer (for 32 bit systems)

Running the Python 3 installer

Finish installation, and start Python Four versions present: Python command line (2.7.x) IDLE Python (2.7.x) Python 3.6 IDLE Python 3.6

Python resources python.org

Why Python? Installed by default on OSX but you should install and use Python 3

Python on Ubuntu sudo apt install python3-pip

Python on Ubuntu

NLTK 3.2.5 Install See http://www.nltk.org/install.html Use pip3 (for python3) to install packages from the Python Package Index (PyPI) sudo pip3 install -U nltk updated my nltk 3.2.4 to 3.2.5

NLTK Data Install See http://www.nltk.org/data.html python3 If you get an SSL certificate error message, run: /Applications/Python 3.6/Install Certificates.command

Windows 10: setup Environment variable PATH should be set correctly to point to Python 3 install directory Type in search: Edit environment variables for your account

Windows 10: install nltk On the command line: pip3 install pyyaml nltk Package pyyaml must be used somewhere in nltk … Source: http://www.pitt.edu/~naraehan/python2/faq.html

Windows 10: install numpy and test nltk On the command line: pip3 install numpy (the chunking algorithm uses it) Let's test nltk: .word_tokenize() converts a string into words .pos_tag() does part-of-speech tagging .ne_chunk() does named entity recognition

Windows 10: test nltk .draw() takes a Tree object and draws it in a pop-up window

Windows 10: install nltk data Install corpus data (from inside Python) using nltk.download()

Windows 10: test nltk data There is a sample of the well-known Penn Treebank Wall Street Journal (WSJ) corpus included 3,914 parsed sentences 49,000+ parsed sentences in the full corpus

Example:

nltk: where is it installed?