Published by Tilde Jespersen. Modified over 6 years ago.
1
Language Technology and Big Data
Use Case: Media-based Business Intelligence
Kimmo Valtonen, CTO, M-Brain Oy
12/27/2018 Copyright © M-Brain 2012
2
Media-based Business Intelligence
With Kimmo: patents, which are pending, when do we get them? IPR!
3
Core value
Detection of business-relevant events in media
4
Big Data problem
Textual data from 70+ languages
5
Big Data problem: Volume
Global media data is the size of the Internet
6
Big Data problem: Variety
Media types vary widely in syntax, punctuation, lexicon, capitalisation, length...
7
Big Data problem: Velocity
For any well-known brand, new relevant media items may arrive at a rate of several per second
8
Example architecture: M-Brain
Language Technology has potential applications at all stages
9
Application areas
Machine Translation of content untranslated by humans
10
Application areas
Draft summarisation of content
11
Application areas
Support for recognition of events ("A acquired B")
Semantic enrichment
Sentiment recognition
Topic classification
Mood recognition
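To make the "A acquired B" event type concrete, here is a minimal sketch of a pattern-based extractor. It is an illustration only, not M-Brain's method: the names `AcquisitionEvent` and `find_acquisition` are hypothetical, the pattern handles English text with ASCII-capitalised company names, and a production system covering 70+ languages would need far more than a regular expression.

```python
import re
from typing import NamedTuple, Optional


class AcquisitionEvent(NamedTuple):
    """Hypothetical record for one detected 'A acquired B' event."""
    acquirer: str
    target: str


# Assumption: a company name is a run of capitalised tokens, e.g. "Acme Corp".
_NAME = r"[A-Z][\w&-]*(?:\s+[A-Z][\w&-]*)*"

# Assumption: a small, hand-picked set of English acquisition verbs.
_PATTERN = re.compile(
    rf"(?P<a>{_NAME})\s+(?:has\s+)?(?:acquired|acquires|bought|buys)\s+(?P<b>{_NAME})"
)


def find_acquisition(sentence: str) -> Optional[AcquisitionEvent]:
    """Return the first acquisition event found in the sentence, or None."""
    match = _PATTERN.search(sentence)
    if match is None:
        return None
    return AcquisitionEvent(match.group("a"), match.group("b"))


if __name__ == "__main__":
    print(find_acquisition("Acme Corp acquired Widget Oy for an undisclosed sum."))
    print(find_acquisition("Shares fell sharply today."))
```

A capitalised word directly before the acquirer (for example a sentence-initial "Yesterday") would be absorbed into the name, which is one reason real event recognition usually relies on named-entity recognition rather than surface patterns.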
12
Application areas
Normalization and cleaning of textual data
Relevance
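As a rough illustration of what normalization and cleaning can involve, the sketch below applies a few common steps using only the Python standard library. The function name `normalize_text` and the choice and order of steps are assumptions for this example; the slides do not describe the actual pipeline.

```python
import html
import re
import unicodedata


def normalize_text(raw: str) -> str:
    """Return a cleaned, canonical form of a raw media snippet."""
    text = html.unescape(raw)                   # decode entities such as &amp;
    text = unicodedata.normalize("NFKC", text)  # fold compatibility forms (e.g. non-breaking space)
    text = re.sub(r"<[^>]+>", " ", text)        # drop leftover HTML tags
    text = re.sub(r"\s+", " ", text)            # collapse runs of whitespace
    return text.strip()


if __name__ == "__main__":
    print(normalize_text("  M&amp;A <b>news</b>\u00a0 today "))
```

Steps like these are language-independent, which matters when the same cleaning has to behave consistently across many languages and media types.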
13
Things a buyer reflects on
Scalability
Need to cover "all" languages
Has to be "fast" / easy to parallelize
Needs to be consistent across languages
Precision vs. tuning needed
Extreme precision vs. use-case-optimal precision
Modifiability