Topics in Linguistics ENG 331


Topics in Linguistics ENG 331
Rania Al-Sabbagh, Department of English, Faculty of Al-Alsun (Languages), Ain Shams University
rsabbagh@alsun.asu.edu.eg
Week 11

Installation Prerequisites
We need to download and install the following before we start: Visual C++ 2015 Build Tools; Editra (a Python editor); Notepad++.

Corpus Compilation
It is always a good idea to look for a ready-made corpus, either from sources such as the LDC and ELRA or from individual researchers. Sometimes, however, you have to compile your own corpus. As you compile it, make sure that it meets the criteria of a well-designed corpus. Do you remember what those criteria are? In corpus and computational linguistics, corpus compilation is also referred to as corpus harvesting.

Resources for Corpus Harvesting: Print Books
Depending on your study, you may compile your corpus from print books, online written resources, or audiovisual resources. For print books, check the following for a machine-readable text version of the book: Project Gutenberg, Oxford, Internet Archive. If no such version exists, you may need to work from a scanned version of the book and use an Optical Character Recognition (OCR) program. OCR programs convert scanned images into text files. They are never 100% accurate, but they save a great deal of typing time, and many free online OCR tools are available.
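
Because OCR output is never fully accurate, it can help to check a short sample against a hand-typed version before trusting a whole book. The following is only a minimal sketch of that idea: the function name `ocr_similarity` and the sample strings are hypothetical, and the score is a character-level similarity from Python's standard `difflib` module, not an official OCR accuracy metric.

```python
from difflib import SequenceMatcher


def ocr_similarity(ocr_text, reference_text):
    """Return a character-level similarity score between 0.0 and 1.0.

    1.0 means the OCR output matches the hand-typed reference exactly.
    This is a rough indicator only (an assumption of this sketch),
    not a standard OCR evaluation measure.
    """
    matcher = SequenceMatcher(None, ocr_text, reference_text, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # Hypothetical sample: a hand-typed line and what an OCR tool might return.
    reference = "It was the best of times, it was the worst of times."
    ocr_output = "It was the bcst of tirnes, it was the worst of times."
    print(round(ocr_similarity(ocr_output, reference), 3))
```

A low score on the sample suggests that the scan quality or the OCR tool should be reconsidered before processing the rest of the book.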

Resources for Corpus Harvesting: Web as Corpus
When we compile data from online resources, we are using the “Web as Corpus”. The term was coined a few years ago, and an entire series of workshops and a special interest group (SIG) carry the same name. Software programs used to compile corpora from the Web are referred to as scrapers, spiders, or crawlers.
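
To make the idea of a scraper concrete, here is a minimal sketch using only Python's standard library. It is not the tool used in this course: the names `TextExtractor`, `html_to_text`, and `fetch_page` are hypothetical, and the sketch assumes a simple page where dropping `script` and `style` content is enough to get readable text. A real crawler would also need to follow links, respect each site's terms of use, and handle errors.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect the visible text of an HTML page, skipping scripts and styles."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self.chunks = []      # pieces of visible text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_page(url):
    """Download one page and decode it (falls back to UTF-8 if no charset is given)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Offline demonstration on a hypothetical page.
    sample = "<html><body><h1>News</h1><script>var x = 1;</script><p>Some text.</p></body></html>"
    print(html_to_text(sample))
```

To harvest a real page, one would pass the result of `fetch_page(url)` to `html_to_text` and save the output as a plain-text file in the corpus folder.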