Introduction to Xaira Part One: All about Xaira Andrew Hardie.

What is Xaira? XML Aware Indexing and Retrieval Architecture. The XML-aware version of SARA for the BNC corpus. Several programs, including the Index Toolkit and the Client.

How do you pronounce Xaira? Its designers pronounce it like "Sarah". We pronounce it like "Zirah". Other pronunciations may vary.

Why are we talking about it? Andrew and Richard have been beta-testers for Xaira for several years. Andrew wrote the help file.

What sort of program is Xaira? Xaira is an analysis program for indexed corpora. Searching indexed vs. non-indexed corpora: an indexed corpus is searched in two stages, indexing and then retrieval. Xaira does both.
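The two stages can be illustrated with a toy inverted index. This is only a minimal sketch of the general idea, not Xaira's actual index format or code; the function names `build_index` and `lookup` and the sample texts are hypothetical.

```python
import re
from collections import defaultdict

def build_index(texts):
    """Indexing: map each lower-cased word to (text_id, token_position) pairs."""
    index = defaultdict(list)
    for text_id, text in texts.items():
        for position, token in enumerate(re.findall(r"\w+", text.lower())):
            index[token].append((text_id, position))
    return index

def lookup(index, word):
    """Retrieval: find every hit from the index without rereading the texts."""
    return index.get(word.lower(), [])

# Hypothetical two-text corpus, for illustration only.
texts = {"a": "The cat sat on the mat", "b": "A cat is a cat"}
index = build_index(texts)
hits = lookup(index, "cat")
print(hits)
```

A non-indexed search has to scan every text for every query; with an index, the scan happens once at indexing time and each later query is a single lookup.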

Indexing

Retrieval

Xaira contains: the Indexer itself; Xaira-tools (an easy user interface for corpus set-up and using the indexer); and the Xaira client (a sophisticated corpus analysis system: wordlist, concordance, collocation, structured searching).
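For readers new to these terms, the sketch below shows what a wordlist and a key-word-in-context (KWIC) concordance are. It is a plain-Python illustration under simple assumptions (tokens are runs of word characters, matching is case-insensitive); it does not reproduce how the Xaira client computes them, and the names `wordlist` and `concordance` are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Split text into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text)

def wordlist(text):
    """Frequency list of lower-cased word forms."""
    return Counter(token.lower() for token in tokenize(text))

def concordance(text, node, width=3):
    """Return (left context, node word, right context) for each hit."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

sample = "The cat sat on the mat and the other cat slept"
freq = wordlist(sample)
kwic = concordance(sample, "cat", width=2)
for left, node, right in kwic:
    print(f"{left:>12} [{node}] {right}")
```

Collocation analysis builds on the same context windows, counting which words occur near the node word more often than chance would predict.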

Client, server? Why does Xaira describe itself as a client? Xaira splits the work between one program that you use to build the search (the client) and one program that actually looks in the index and finds the solutions (the server). But you can just use the client like any concordancer software: the user never deals directly with the server.

What is special about Xaira? Xaira is based on XML. XML is based on Unicode. Thus Xaira can be used with any language in any alphabet. Beyond that, Xaira has been specially designed to aid multilingual analysis, e.g. it allows Unicode keyboard setup for any language.

Do I need a Unicode corpus? Yes! (… but ASCII counts as valid UTF-8.) Both UTF-8 and UTF-16 are OK. (If in doubt, ask Andrew about variant text encodings.)
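If you are unsure whether a file is already valid UTF-8, a quick check outside Xaira can help before indexing. The sketch below uses only Python's standard `bytes.decode`; the helper name `is_valid_utf8` and the sample strings are hypothetical, and this is not part of Xaira itself.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# Pure ASCII is a subset of UTF-8, so it always passes.
ascii_bytes = "plain ASCII text".encode("ascii")

# A legacy single-byte encoding (Latin-1 here) usually fails on accented letters.
latin1_bytes = "café".encode("latin-1")

# UTF-16 is a different Unicode encoding: fine for a corpus, but it must be
# decoded as UTF-16, not as UTF-8.
utf16_bytes = "café".encode("utf-16")

print(is_valid_utf8(ascii_bytes))
print(is_valid_utf8(latin1_bytes))
print(utf16_bytes.decode("utf-16"))
```

In practice you would read the file in binary mode (`open(path, "rb").read()`) and pass the result to the check; a failure suggests the corpus needs converting from a legacy encoding first.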

Does my corpus need to be XML? No! Xaira can add basic XML to a corpus of plain-text files. Xaira can also upgrade SGML to XML. TEI XML is perfect for Xaira… but a warning: Xaira will reject ill-formed XML or SGML files.
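Because ill-formed files are rejected, it can be worth testing XML files for well-formedness before indexing. The sketch below uses Python's standard `xml.etree.ElementTree` parser for the check and also shows one possible way of wrapping plain text in minimal markup. The `<text>`/`<p>` skeleton and the function names are assumptions made for illustration; they are not the markup Xaira itself generates, and the check applies to XML only, not SGML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

def is_well_formed(xml_text: str) -> bool:
    """Return True if the string parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

def wrap_plain_text(text: str) -> str:
    """Wrap blank-line-separated paragraphs in a hypothetical <text><p> skeleton."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # escape() turns &, < and > into entities so the result stays well-formed.
    body = "".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return f"<text>{body}</text>"

good = "<text><p>ok</p></text>"
bad = "<text><p>unclosed paragraph</text>"
wrapped = wrap_plain_text("Fish & chips\n\nSecond paragraph")

print(is_well_formed(good))
print(is_well_formed(bad))
print(wrapped)
```

The escaping step matters: a bare `&` or `<` in plain text is the most common reason a naively wrapped file turns out ill-formed.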

First, index your corpus. Messages from the different tools appear here (you don't need to worry about them). Access the commands you need to set up and run the indexer from the Tools menu.

The Tools Menu: tools for preparing your corpus and its header; tools for telling Xaira how to handle the XML markup in your corpus; the indexer itself.

Scared? Using Xaira-tools to prepare a corpus manually can be a bit complex. Instructions: But don't despair – there is a wizard! File >> Index Wizard

The index wizard

Live Indexing!