News On The Go! How NewsHunt Reached 1 Crore Downloads? INDIAN LANGUAGES!!

Presentation transcript:


Landscape: Print Media: Local Languages Dominate!
[Chart: Readers (Millions); legend: Available on NewsHunt]
The need for local-language and local content is much bigger than the need for English and generic content.
India's English literacy is 8-10%.
Vernacular-language literacy is 65%.

NewsHunt Coverage across India

NewsHunt: Handsets and Platforms
Launching Shortly
Java: NOKIA S40, Samsung, SonyEricsson, LG, etc…
Symbian: NOKIA S60, Others…
Blackberry
iPhone
Android
From $40 Phone to Smart Phone, to Mobile Computers

Usage of News & Video Mobile Apps: Total Minutes

Unicode Character Encoding

U+0000 :: U+10FFFF requires 21 bits (marked x below)

Unicode              UTF-32
000000-10FFFF        00000000-000xxxxx-xxxxxxxx-xxxxxxxx
ए = 00-00-09-0F

Unicode              UTF-16
0000-FFFF            xxxxxxxx-xxxxxxxx
10000-10FFFF         110110yy-yyxxxxxx 110111xx-xxxxxxxx
ए = 09-0F

Unicode              UTF-8
00-7F                0xxxxxxx
80-7FF               110xxxxx 10xxxxxx
800-FFFF             1110xxxx 10xxxxxx 10xxxxxx
10000-1FFFFF         11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
200000-3FFFFFF*      111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
4000000-7FFFFFFF*    1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
ए = E0-A4-8F

U+090F = 00001001 00001111
UTF-8 = 11100000 10100100 10001111 = E0-A4-8F
UTF-16BE = 09 0F
UTF-16LE = 0F 09

UTF-16 is backward compatible with UCS-2.
UTF-8 is a variable-length, multi-byte format.
UTF-8 is backward compatible with 7-bit ASCII.
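The byte values quoted for ए (U+090F) can be checked with a short Python sketch. The helper names `utf8_encode` and `hex_bytes` are made up for this illustration and are not from the slides. The hand-written encoder only covers the one- to four-byte forms that apply up to U+10FFFF and, as a simplification, does not reject surrogate code points.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point using the UTF-8 bit patterns (up to U+10FFFF)."""
    if cp < 0x80:                      # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:                     # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:                   # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    if cp <= 0x10FFFF:                 # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        return bytes([0xF0 | (cp >> 18),
                      0x80 | ((cp >> 12) & 0x3F),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    raise ValueError("code point outside the Unicode range")


def hex_bytes(data: bytes) -> str:
    """Format bytes as hyphen-separated uppercase hex, e.g. E0-A4-8F."""
    return "-".join(f"{b:02X}" for b in data)


ch = "\u090F"  # DEVANAGARI LETTER E

# Compare the manual encoder with Python's built-in codecs.
encodings = {
    "UTF-8 (manual)": utf8_encode(ord(ch)),
    "UTF-8": ch.encode("utf-8"),
    "UTF-16BE": ch.encode("utf-16-be"),
    "UTF-16LE": ch.encode("utf-16-le"),
    "UTF-32BE": ch.encode("utf-32-be"),
}
for name, data in encodings.items():
    print(f"{name:15} {hex_bytes(data)}")
```

The explicit `-be` and `-le` codec names are used so that Python does not prepend a byte order mark, which keeps the output comparable with the values on the slide.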

Text Rendering

…
{U+0915, U+094D, U+0930, U+093F, U+0915, U+0947, U+091F}
{Letter K, Half Indicator, Letter R, I matra, Letter K, E matra, Letter T}

This sequence needs to be composed into a group of akshars:
{{0x915,0x94D,0x930,0x93F},{0x915,0x947},{0x91F}}
which is
{{code points forming kri},{code points forming ke},{code points forming T}}

Each akshar needs to be mapped to a visually ordered sequence of glyph indices within the font.
This glyph sequence is then drawn.
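The grouping step can be sketched in Python under a deliberately simplified rule, which is an assumption of this example and not necessarily the rule NewsHunt used: a combining mark (matra or half indicator) joins the current akshar, and a letter that follows the half indicator U+094D also joins it. The function name `split_akshars` is hypothetical. Mapping each akshar to visually ordered glyph indices depends on the font and is not shown.

```python
import unicodedata

VIRAMA = "\u094D"  # the "Half Indicator" in the slide's terminology


def split_akshars(text: str) -> list[str]:
    """Group code points into akshars using a simplified rule.

    A character joins the current akshar if it is a combining mark
    (Unicode category M*) or if the akshar so far ends with the virama.
    Real shaping engines handle many more cases than this sketch does.
    """
    akshars: list[str] = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if akshars and (is_mark or akshars[-1].endswith(VIRAMA)):
            akshars[-1] += ch
        else:
            akshars.append(ch)
    return akshars


# The code point sequence from the slide.
text = "\u0915\u094D\u0930\u093F\u0915\u0947\u091F"

for akshar in split_akshars(text):
    print([f"0x{ord(c):X}" for c in akshar])
```

For the slide's sequence this yields three groups, {0x915,0x94D,0x930,0x93F}, {0x915,0x947} and {0x91F}, matching the composition described above.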

Chandu Sohoni