Monday, July 24, 2016 1:00 pm ET Sandra Serkes, President & CEO


1 AutoClassifying Documents at Their Source: PowerHouse Behind the Firewall
Monday, July 24, 2016 1:00 pm ET
Sandra Serkes, President & CEO
Valora Technologies, Inc.

2 What is File Classification?
The creation of metadata (information about a file) to assist in storage, retention, analysis & data mining, risk control, forecasting, etc.
Commonly used file classification components
- File and folder naming
- Creation Date, Last Modified Date
- Author, Source, Custodian, Path
Emerging classification components
- Keywords, themes
- Content sensitivity
- Languages
- Intended Audience and/or Responsible Party
Most programs will let a user (or administrator) create metadata for classification purposes. Watch as I show you how easy, intuitive and practical this is!
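As a rough illustration of the commonly used components listed above, the following Python sketch reads file-system metadata for a single file. The function name `basic_file_metadata` and the dictionary keys are assumptions made for this example; they are not taken from the presentation or from any Valora product.

```python
from datetime import datetime, timezone
from pathlib import Path


def basic_file_metadata(path):
    """Collect simple, file-system-level classification components.

    Hypothetical helper for illustration only. Creation date is omitted
    because it is not reported consistently across operating systems.
    """
    p = Path(path)
    info = p.stat()
    return {
        "file_name": p.name,                # file naming
        "folder": p.parent.name,            # folder naming
        "path": str(p.resolve()),           # full path
        "extension": p.suffix.lower(),      # normalized extension
        "size_bytes": info.st_size,
        "last_modified": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }
```

Author, source and custodian are usually not available from the file system alone; they would have to come from the document's own properties or from the system it was collected from.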

3 A Better Way: AutoClassification!
AutoClassification does not rely on people to create their own metadata. Instead, a set of linguistic and pattern-matching rules creates the classification content (metadata) and schema (storage hierarchy).
AutoClassification is a much better option for
- Large volumes of data
- Consistent labelling and file storage
- People or circumstances where manual metadata creation is tedious, wasteful, time-consuming or impossible
- Sensitive information
- IG & Records scenarios with high scrutiny (litigation, investigatory, regulatory, etc.)
- Cost reduction
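To make the idea of rules producing both metadata and a storage location concrete, here is a minimal Python sketch of a first-match rule table. The rule patterns, document types, folder names and the `classify` function are all hypothetical examples; the presentation does not describe how PowerHouse rules are actually written.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    doc_type: str          # metadata value the rule assigns
    pattern: re.Pattern    # pattern that triggers the rule
    folder: str            # where the file should be stored


# Hypothetical rules, checked in order; the first match wins.
RULES = [
    Rule("Patent Application",
         re.compile(r"\bpatent application\b", re.I), "IP/Patents"),
    Rule("Contract",
         re.compile(r"\b(agreement|hereinafter)\b", re.I), "Legal/Contracts"),
    Rule("Invoice",
         re.compile(r"\binvoice\s+(no\.?|number|#)", re.I), "Finance/Invoices"),
]

# Assumed sensitivity check: a US Social Security number shaped string.
SENSITIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def classify(text):
    """Return metadata and a suggested folder for one document's text."""
    for rule in RULES:
        if rule.pattern.search(text):
            doc_type, folder = rule.doc_type, rule.folder
            break
    else:
        # No rule matched: route to a human review queue.
        doc_type, folder = "Unclassified", "Review"
    return {
        "doc_type": doc_type,
        "folder": folder,
        "sensitive": bool(SENSITIVE.search(text)),
    }
```

Because the same rules run on every file, two documents with the same content always receive the same label and folder, which is the consistency benefit listed above.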

4 What is AutoClassification?
Computer software that is custom-configured to a particular environment
Software contains recognition algorithms for
- Document Type
- Content analytics
- Indexing & Tagging
- Recommended locations & naming
“Middleware” that sits between current file locations (includes Archive) and intended file locations (includes deletion), providing an intelligent filter and metadata creation/enhancement for all content.
Valora’s AutoClassification engine is called

5 PowerHouse™ & BlackCat™ Architecture
BlackCat Presentation Layer
PowerHouse Platform Layer
- PH AutoProcessors (Pattern-Matching Algorithms & Rules)
- PH Quality Control User Interface (QCUI)
- PH Admin Console (Admin)
SQL Server Database Layer

6 PowerHouse Architecture With BlackCat
[Architecture diagram. Component labels: Administration Console; PowerHouse Configuration Editor; PowerHouse Quality Control; BlackCat End-users User Interface; SharePoint/O365; SharePoint; iManage; OpenText; Intake Transfer Agent; Export Transfer Agent; Relativity; Exchange; BlackCat Data Transfer; Filesystem; PowerHouse Processing Components; PowerHouse Controller; BlackCat Controller; BlackCat Web Server; Processing; BlackCat Document DataBase; BlackCat Filesystem Cache; PH Tracking DataBase; PH Temporary Filesystem Storage]

7 How does AutoClassification Work?
Files are analyzed by a sequence of processing and tagging
Processing (aka Intake) is the process of “ingesting” data into an analytics engine
- Creating OCR for scanned images
- Extracting text for native files & Speech to text for audio/video files
- Translating content to English
- Re-ordering or re-aligning pages
- Applying redactions
Tagging (aka Coding, Indexing, Sequencing) is the process of extracting key information and attributes about each document
- Document Type, Important Dates
- Key Names & Phrases
- Topics, Keywords & Themes
- File, Content and DocType attributes
- Relation to other documents (duplicate, related, attached, contradictory, etc.)
[Diagram labels: native, text, text, fielded data]
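The two stages described on this slide can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the `process` step here merely normalizes text that has already been extracted (real OCR, speech-to-text and translation are out of scope), and the function names, stopword list and exact-duplicate check via a SHA-256 hash are choices made for the example, not a description of PowerHouse.

```python
import hashlib
import re
from collections import Counter

DATE_RE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")
STOPWORDS = {"the", "and", "for", "this", "that", "with", "about"}


def process(raw_text):
    """Processing stage stand-in: collapse whitespace in extracted text."""
    return " ".join(raw_text.split())


def tag(text, top_n=3):
    """Tagging stage: pull out dates, frequent keywords and a fingerprint."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if len(w) > 2 and w not in STOPWORDS]
    return {
        "dates": DATE_RE.findall(text),
        "keywords": [w for w, _ in Counter(words).most_common(top_n)],
        # Case-insensitive fingerprint used to relate exact duplicates.
        "fingerprint": hashlib.sha256(text.lower().encode("utf-8")).hexdigest(),
    }


def run_pipeline(raw_docs):
    """Run processing then tagging; mark later copies as duplicates."""
    first_seen = {}   # fingerprint -> name of first document with it
    results = {}
    for name, raw in raw_docs.items():
        tags = tag(process(raw))
        tags["duplicate_of"] = first_seen.get(tags["fingerprint"])
        first_seen.setdefault(tags["fingerprint"], name)
        results[name] = tags
    return results
```

A hash only catches exact duplicates after normalization; the "related" and "contradictory" relations mentioned on the slide would need content comparison that this sketch does not attempt.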

8 Classifying a patent application with data mining (analytics)
[Annotated image of a patent application. Callouts:]
Date Format = US
DocType = Patent Application
Date = 10/18/2007
Author = Patent Authors, Author City, Author Country
Assignee = RIM
Tone = Neutral to slightly positive
Embedded Graphic with Title
Other Capturable Data Elements:
- Patent Number
- Filing Date
- Key Phrases & Terms
- Managing PTO
- Implied/Attached Docs
- Bar Code Present
- And many more...
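A few of the callouts above (DocType, Date, Date Format, Assignee) can be approximated with simple pattern matching, as in the Python sketch below. The field labels it searches for (such as `Assignee:`) and the function names are assumptions for illustration; the slide does not show the rules actually used, and real patent documents vary in layout.

```python
import re
from datetime import date


def parse_slash_date(token):
    """Infer the date format of an N/N/YYYY token.

    Returns (format_label, iso_date). The format is only decided when one
    of the first two numbers is greater than 12; otherwise it is ambiguous.
    """
    first, second, year = (int(x) for x in token.split("/"))
    try:
        if second > 12 and first <= 12:
            return "US", date(year, first, second).isoformat()
        if first > 12 and second <= 12:
            return "Day-first", date(year, second, first).isoformat()
    except ValueError:
        return "Invalid", None
    return "Ambiguous", None


def extract_patent_fields(text):
    """Pull a handful of fields out of already-extracted patent text."""
    fields = {}
    if re.search(r"\bpatent application\b", text, re.I):
        fields["doc_type"] = "Patent Application"
    m = re.search(r"\b\d{1,2}/\d{1,2}/\d{4}\b", text)
    if m:
        fields["date_format"], fields["date"] = parse_slash_date(m.group(0))
    # Assumed label; stops at the end of the line.
    m = re.search(r"Assignee:\s*(.+)", text)
    if m:
        fields["assignee"] = m.group(1).strip()
    return fields
```

With the slide's date, 10/18/2007, the second number exceeds 12, so the token can only be month/day/year, which matches the "Date Format = US" callout.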

9 Classifying an email with data mining (analytics)
[Annotated image of an email. Callouts:]
Author
Doc Type & Implied attachment range
Matter indicator & validation
Author Validation & Contact Info
Implied matter: Passaro ( )

10 Intake – PowerHouse – Output
[Workflow diagram. Labels: PH Web Portal; Hosted Repository; OCR/Text Extraction; Translation/Transcription; Unitization; Coding/Tagging; Rules/Disposition; Redaction; Exceptions; Shared Server; Poll Folder; Taxonomy]

11 PowerHouse Portal
Users drag & drop files into the portal for immediate, automatic loading into PowerHouse. PowerHouse responds with an automatic acknowledgement.

12 AutoClassification Live Demo
PLEASE NOTE: Time permits only a quick tour of BlackCat (and PowerHouse supporting it). If you would like to see a more in-depth presentation, please contact us at:

13 Typical Problems Valora Solves
Legal/Litigation/eDiscovery Problems
- Too many documents to review, cull & produce by hand
- Cost-effective alternative solutions to contract attorney & offshore labor “armies”
- Missing, poor, or ineffective metadata
- Re-unitization, organization, indexing & redacting of documents
- Bridging multi-language document populations to English
Records Management Problems
- Help automate defensible deletion efforts for IG
- Organize & control loose documents on shared drives, desktops, networks & devices
- Eliminate expensive and information-poor storage options
- Serve as automated intake for multiple content generation sources
Business Intelligence Problems
- Organize & control decades of contracts & agreements
- Provide brand integrity/protection data mining of public/private documents
- Forecast & trending of topics, people & locations over time
- Loose, shared files analysis & control
Health Care Problems
- Heavy expense & time converting hardcopy medical records to EMRs/EHRs
- Cannot keep up with fax server data collection
- Cost effective alternative solutions to “armies” of temp data entry coders

14 Who We Serve
Corporate Legal Departments with complex document/data/content management needs
- Litigation
- Risk Exposure
- Compliance
- Records
- Information Governance
Government Agencies with limited resources for document/data/content monitoring, analysis, management
- Investigations
Their Advisory Counsel
- The law firms, consultancies and service providers who support these entities

15 (this is Valora’s story, too)
Valora Technologies
- Bedford, MA software firm specializing in machine-assisted document processing capabilities (aka analytics)
- World experts in the automated analysis, indexing, mining and presentation of documents, data & content
- 20 staff, 200+ clients, 1,500,000+ pages every week
- Customers: corporate legal departments, government agencies, and their professional advisory colleagues (law firms & consultancies)
- Target market: those who wish to harness and profit from the 2.5 quintillion bytes of document & content data being created each day, aka “Big Data”
- Objective: to overtake traditional information repository creation (manual data entry), management, analysis (search, review) and workflow (retention, production, routing) with high quality, low cost, scalable technology & best practices in analytics.
- Provide cost competitive document analytics solutions in the United States
- Provide efficient, world-class, targeted solutions to data, document & content utilization problems
“The power of Big Data is the story about the ability to compete and win with few resources and limited dollars”
Forbes, March 2012 (this is Valora’s story, too)

16 Valora Technologies, Inc.
Thank You!
For More Information:
Valora Technologies, Inc.
101 Great Road, Suite 220
Bedford, MA 01730

