Controlled Language in action for MT Johann Roturier May 2009.


Outline

Overview of CL
CL deployment
Lessons learnt
Impact of CL on MT
Future directions

Overview of Controlled Language

Strict use of Controlled Language (CL)
  – Subset of a natural language that uses a restricted grammar and a restricted vocabulary (technical domain)
  – Makes source content clearer and less ambiguous
  – Improves comprehensibility and (machine-)translatability of source content
  – Example: Caterpillar Technical English in the mid 1990s (142 rules and more than […] terms)
Loose use of Controlled Language
  – Used since 2006 for large structured documentation sets (authored in XMetal)
  – MT requirements in FR, DE, IT, BR, ES, ZH and JA (SYSTRAN)
  – Strict enforcement of spelling and grammar checks
  – Enforcement of corporate terminology once defined
  – Enforcement of specific CL rules
      – Minimize impact on authoring productivity

CL deployment

Use authoring application to implement CL rules for English
  – Language checker developed by acrolinx
  – Customized to deal with Symantec content
  – Based on pattern-matching approach (reactive approach)
  – Rules can be context-sensitive using XML document mark-up (DocBook)
  – Suggestions or help files are provided to users
acrolinx IQ is used to ensure that source content is compliant with
  – Grammar rules
  – Terminology (5000+ terms)
  – Rules based on corporate guidelines (20+)
      – Some rules deal specifically with tagging, such as the tagging of SW references
  – MT-specific rules (6)
acrolinx IQ is used to harvest, store and access source terminology
  – In-house panel working to further refine rule set and terminology lists
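To make the idea of context-sensitive pattern matching concrete, here is a minimal sketch of a checker that applies regular-expression rules to DocBook-style text but skips elements such as command names. It is not how acrolinx IQ works internally; the two rules, the rule ids and the list of skipped tags are hypothetical and chosen only for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules (id, pattern); these are NOT actual acrolinx or Symantec rules.
RULES = [
    ("avoid-please", re.compile(r"\bplease\b", re.IGNORECASE)),
    ("avoid-and-or", re.compile(r"\band/or\b", re.IGNORECASE)),
]

# Assumption: text inside these DocBook elements is literal and must not be checked.
SKIP_TAGS = {"command", "filename", "programlisting"}


def iter_checkable_text(elem):
    """Yield (enclosing tag, text) for all text outside the skipped elements."""
    if elem.tag in SKIP_TAGS:
        return
    if elem.text:
        yield elem.tag, elem.text
    for child in elem:
        yield from iter_checkable_text(child)
        # Text after a child element belongs to the parent's context.
        if child.tail:
            yield elem.tag, child.tail


def check_document(xml_text):
    """Return a list of (rule id, enclosing tag, matched text) findings."""
    root = ET.fromstring(xml_text)
    findings = []
    for tag, text in iter_checkable_text(root):
        for rule_id, pattern in RULES:
            for match in pattern.finditer(text):
                findings.append((rule_id, tag, match.group(0)))
    return findings


SAMPLE = (
    "<section><para>Please run <command>please.exe</command> "
    "and/or restart the computer.</para></section>"
)

if __name__ == "__main__":
    for finding in check_document(SAMPLE):
        print(finding)
```

In this sample the word inside the command element is left alone, while the same word in running text is flagged. That is the kind of mark-up-dependent behaviour the slide refers to.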

Lessons learnt during CL deployment

Strict implementation when there is:
  – New content
  – Little leverage
  – Time
Resources must be maintained
  – Eliminate false alarms
  – Adapt resources to deal with new content
  – Language-specific rules are best implemented as:
      – Pre-processing step
      – MT normalisation dictionaries
CL + MT is not always sufficient
  – Terminology work to update dictionaries ([…] entries per dictionary)
  – PE when a specific quality standard is required

Global content delivery workflow using CL and MT

Controlled Source Authoring (including Terminology Harvesting and Validation)
MT Tuning
Pre-translation using TM and MT
Post-Editing
Evaluation
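The pre-translation step can be pictured as a simple fallback: reuse a translation memory (TM) entry when the source segment has an exact match, otherwise send the segment to MT and mark it for post-editing. The sketch below assumes an in-memory dictionary as the TM and a placeholder function in place of a real MT engine. The names `pretranslate`, `Segment` and `fake_mt` are hypothetical and do not come from the presentation.

```python
from typing import Callable, Dict, List, NamedTuple


class Segment(NamedTuple):
    source: str
    target: str
    origin: str  # "TM" for an exact TM match, "MT" for machine translation


def pretranslate(
    sources: List[str],
    tm: Dict[str, str],
    mt: Callable[[str], str],
) -> List[Segment]:
    """Pre-translate segments: exact TM match first, MT as the fallback."""
    result = []
    for src in sources:
        if src in tm:
            result.append(Segment(src, tm[src], "TM"))
        else:
            result.append(Segment(src, mt(src), "MT"))
    return result


def fake_mt(text: str) -> str:
    """Stand-in for a real MT engine call (assumption for this sketch)."""
    return "[MT] " + text


if __name__ == "__main__":
    tm = {"Click OK.": "Cliquez sur OK."}
    for seg in pretranslate(["Click OK.", "Restart the computer."], tm, fake_mt):
        print(seg.origin, "|", seg.source, "->", seg.target)
```

Keeping the origin with each segment lets a later step route only the MT segments to post-editing.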

Importance of CL for (RB)MT

Why is it so difficult to use MT effectively?
  – While attempting to install SystemWorks 2005 the error message "Error An error occurred while performing the task. There is a problem with this Windows Installer package. A program run as part of the setup did not finish as expected. Contact your support personnel or package vendor" appears.
Controlling source content at the segment level
  – Reduces complexity and ambiguity during MT step
  – Reduces post-editing effort during PE step
Controlling source content at the sub-segment level (terminology)
  – New terms harvested and defined during authoring
      – Identified variants can be used by search engine (for help system)
      – Reduces MT tuning step
  – Approved translations are defined during MT tuning step

Impact of source compliance on MT quality

Source words   MT quality                    Evaluation type   acrocheck project score
1083           Excellent                     Human
               Good                          Human
               Medium                        Human
               Poor                          Human
               Greater than 0.6 GTM scores   Automatic
               Less than 0.6 GTM scores      Automatic         147
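For readers unfamiliar with the automatic rows of the table, the sketch below computes a simplified, unigram-only F-measure between an MT output and a reference translation and compares it with the 0.6 threshold. It is only in the spirit of GTM: the actual tool can also reward longer contiguous matches depending on its configuration, so these numbers should not be expected to reproduce the scores behind the table. The function names and example sentences are hypothetical.

```python
from collections import Counter


def unigram_f_measure(candidate: str, reference: str) -> float:
    """Harmonic mean of unigram precision and recall (simplified GTM-like score)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    # Clipped overlap: each reference word can be matched at most once.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def above_threshold(score: float, threshold: float = 0.6) -> bool:
    """True when the score falls in the 'greater than 0.6' bucket."""
    return score > threshold


if __name__ == "__main__":
    score = unigram_f_measure("install the product now", "install the software")
    print(round(score, 3), above_threshold(score))
```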

Impact of source compliance on comprehensibility

Future directions

Use automatic tagging and pre-processing
  – Tagging: New tags may be used to mark ambiguous terms and used to produce context-sensitive translations (Systran XSL)
  – Pre-processing: add, remove or move source words or phrases
Combine a CL approach with a semantic reuse approach
  – Increase 100% TM matches by leveraging TMs during authoring (CL through example)
Use CL approach to check MT output
  – Can repetitive MT errors be flagged? And possibly fixed automatically?
Use CL knowledge to deal with uncontrolled content
  – Short technical notes cannot always go through a full CL cycle; community-generated content is increasing (e.g. forums, chat)
  – Term variants and deprecated terms can be added to MT normalisation dictionary
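The last point, adding term variants and deprecated terms to a normalisation dictionary, can be illustrated as a small pre-processing step that rewrites source text before it reaches the MT engine. This is a sketch under stated assumptions: the dictionary entries are invented examples, and a real SYSTRAN normalisation dictionary has its own format that is not modelled here.

```python
import re

# Hypothetical entries: variant or deprecated term -> preferred term.
NORMALISATION = {
    "log-in": "login",
    "e-mail": "email",
    "AV": "antivirus",
}


def build_pattern(entries):
    """Compile one pattern matching any entry as a whole term."""
    # Longest entries first, so a longer variant wins over a shorter prefix.
    alternatives = sorted(entries, key=len, reverse=True)
    joined = "|".join(re.escape(term) for term in alternatives)
    # Lookarounds keep matches from starting or ending inside a longer word.
    return re.compile(r"(?<!\w)(" + joined + r")(?!\w)")


def normalise(text, entries=NORMALISATION):
    """Replace each variant in the text with its preferred term (case-sensitive)."""
    pattern = build_pattern(entries)
    return pattern.sub(lambda m: entries[m.group(1)], text)


if __name__ == "__main__":
    print(normalise("Check your e-mail after log-in."))
```

Matching is case-sensitive and anchored on word boundaries, so an abbreviation such as the hypothetical "AV" entry is not replaced inside a longer word.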

Thank You! Copyright © 2009 Symantec Corporation. All rights reserved.