Tech Inside Extended Document Management System (EDMS)
Technological basis
- Linux-based: performance and stability
- DMS appliance using a containerized environment based on industry-leading Docker™ technology (deploy strategy)
- Can be run on virtualization platforms such as ESXi or Hyper-V, or on bare metal (virtualization overhead vs. ease of maintenance)
- Based on a set of professional, industry-leading technology components (Nuxeo, Postgres, Elasticsearch, Postfix, etc.)
- Unified deploy mechanism (shell scripts)
System overview
General considerations
DMS ... Step towards big-data-type processing:
- Every document version, every trashed document << all in-/outbound email incl. image attachments << OCR processing and indexing/search
3 main data stores:
- Elasticsearch: latency-critical; locally attached storage or fast iSCSI (not SMB or NFS)
- Postgres DB: local or network (preferably iSCSI)
- Document binaries: local or network
3 main CPU focus points:
- Data extraction processes (OCR, metadata extraction, preview processing): demand varies between startup phase and production phase
- Page serving, PDF operations
- Full repo searches
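The storage guidance above can be turned into a small pre-installation check. The following is a minimal sketch, not part of the DMS tooling: the function name, the store keys, and the storage-kind labels ("local", "iscsi", "nfs", "smb") are all assumptions made for illustration. Only the rules themselves come from the slide.

```python
from typing import Dict, List

# Hypothetical labels for storage kinds; not DMS configuration values.
NETWORK_KINDS = {"iscsi", "nfs", "smb"}
ALL_KINDS = NETWORK_KINDS | {"local"}


def check_storage_layout(layout: Dict[str, str]) -> List[str]:
    """Return human-readable findings for a planned storage layout.

    `layout` maps a data store ("elasticsearch", "postgres", "binaries")
    to the kind of storage it will be placed on.
    """
    findings: List[str] = []

    # Elasticsearch is latency-critical: local or fast iSCSI only.
    es = layout.get("elasticsearch")
    if es not in {"local", "iscsi"}:
        findings.append(
            f"elasticsearch: '{es}' is not suitable; use local storage or fast iSCSI"
        )

    # Postgres may be local or on the network, with iSCSI preferred.
    pg = layout.get("postgres")
    if pg not in ALL_KINDS:
        findings.append(f"postgres: unknown storage kind '{pg}'")
    elif pg in NETWORK_KINDS and pg != "iscsi":
        findings.append(f"postgres: '{pg}' is allowed, but iSCSI is preferred")

    # Document binaries may be local or on the network.
    binaries = layout.get("binaries")
    if binaries not in ALL_KINDS:
        findings.append(f"binaries: unknown storage kind '{binaries}'")

    return findings
```

An empty result means the layout matches the guidance; each returned string names the store that needs a second look.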
Hardware considerations
CPU:
- Startup phase: load depends largely on repo size (file count)
  - "sweepers" parse through all documents in the repo (often ~10^6)
  - attachments from all emails are transformed into physical documents, which adds significantly to the document count
  - document metadata is extracted
  - previews are rendered
  - text is extracted (incl. OCR)
- Production phase: same as above, but ~10^2 docs per day
- The DMS can be used during the startup phase, but performance may be noticeably lower.
➯ Sufficient CPU resources are crucial for timely results and user satisfaction; the flexibility of a virtualized environment is beneficial.
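The gap between startup and production load is easier to judge with a rough calculation. The sketch below assumes a fixed average processing time per document and a number of parallel workers; both figures are placeholders chosen for the arithmetic, not measured DMS values. Only the document counts (~10^6 at startup, ~10^2 per day in production) come from the slide.

```python
def estimate_processing_hours(doc_count: int,
                              seconds_per_doc: float,
                              workers: int) -> float:
    """Estimate wall-clock hours to process `doc_count` documents.

    Assumes work is spread evenly over `workers` parallel workers and
    that each document takes `seconds_per_doc` on average.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    total_seconds = doc_count * seconds_per_doc / workers
    return total_seconds / 3600.0


# Assumed figures: 2 s per document, 4 workers.
startup_hours = estimate_processing_hours(10**6, 2.0, 4)     # about 139 hours
production_hours = estimate_processing_hours(10**2, 2.0, 4)  # under a minute
```

Under these assumptions the startup backlog takes close to six days, while a normal production day's intake is processed in well under a minute. Doubling the workers during the startup phase halves the estimate, which is where the flexibility of a virtualized environment pays off.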
Hardware considerations
Storage:
- RAID is always a good idea; recommended level: 10
- RAID does not replace a daily backup!
- For environments with more than 30-40 users ➯ SSD
RAM:
- 16G (<10 users) < 48G (<50 users) < 128G (<200 users)
OCR appliance:
- Recommended for repos with >10^6 documents
- Takes significant load off the DMS ➯ more moderate requirements
- (a) Tesseract appliance; (b) OCRkit appliance
https://doc.practiceinsight.io/pages/viewpage.action?pageId=3997724
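The sizing rules above can be expressed as a small lookup. This is a sketch under stated assumptions: the function and field names are hypothetical, the SSD threshold is set at the lower end of the "30-40 users" range to err on the safe side, and no recommendation is returned for 200 or more users because the slide gives none.

```python
from typing import Dict, Optional


def recommend_ram_gb(users: int) -> Optional[int]:
    """Return the RAM tier in GB for a user count, or None if out of range."""
    if users < 10:
        return 16
    if users < 50:
        return 48
    if users < 200:
        return 128
    return None  # The slide gives no figure for 200 or more users.


def size_environment(users: int, documents: int) -> Dict[str, object]:
    """Combine the RAM, SSD and OCR-appliance rules of thumb."""
    return {
        "ram_gb": recommend_ram_gb(users),
        # Assumption: treat "more than 30-40 users" as more than 30.
        "ssd_recommended": users > 30,
        "ocr_appliance_recommended": documents > 10**6,
    }
```

For example, `size_environment(45, 2_500_000)` suggests 48 GB of RAM, SSD storage, and a separate OCR appliance.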
Basic Requirements
- Port 7999 outbound to ssh://support.practiceinsight.io
  - Client configuration repository, accessible by PI, Patrix, and the client machine
- Port 587 outbound to smtp://email-smtp.eu-west-1.amazonaws.com
  - Outbound communication of the service container health monitoring alarm messages
- For startup maintenance, ssh access (over VPN or not) to the DMS appliance is highly recommended
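Before installation it can help to confirm that both outbound connections are actually open from the appliance's network. The following is a minimal sketch using only the Python standard library; the function name and the injectable `connect` parameter are illustrative choices, and the check only verifies that a TCP connection can be opened, not that ssh or SMTP authentication succeeds.

```python
import socket
from typing import Callable, Dict, Iterable, Tuple

Target = Tuple[str, int]

# Hosts and ports as listed in the requirements above.
REQUIRED_OUTBOUND = (
    ("support.practiceinsight.io", 7999),
    ("email-smtp.eu-west-1.amazonaws.com", 587),
)


def check_outbound(targets: Iterable[Target] = REQUIRED_OUTBOUND,
                   timeout: float = 5.0,
                   connect: Callable = socket.create_connection
                   ) -> Dict[Target, bool]:
    """Try to open a TCP connection to each target and report success."""
    results: Dict[Target, bool] = {}
    for host, port in targets:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[(host, port)] = True
        except OSError:
            # Covers refused connections, timeouts and DNS failures.
            results[(host, port)] = False
    return results
```

Calling `check_outbound()` from the machine that will host the appliance returns a dictionary keyed by (host, port); a `False` entry usually points at a firewall rule that still needs to be opened.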
DMS startup process
- Pre-delivered DMS VM image based on a CentOS 7 minimal distribution (CentOS = "community version" of Red Hat)
- All relevant components pre-installed to start up a DMS environment quickly and easily (following deploy)
- Basic maintenance and monitoring tools also pre-installed
- Auto-deploy scripts pre-installed
Download image ➯ define IP settings for VM ➯ create ssh-key for client configuration repo access ➯ send key to Patrix ➯ pull default configuration ➯ configure & push to client config repo ➯ create empty repo (script) ➯ deploy DMS
https://doc.practiceinsight.io/display/DMS/DMS+Installation
Email flow options Rule based
DMS upgrade process
- By design, up- or downgrades don't touch data; only the DMS logic binaries (containers) are replaced
- Configure & push to client config repo ➯ deploy DMS
- The upgrade process generally takes <10 minutes
- DMS upgrades are separate from Patricia updates
- Taking a snapshot before an upgrade is suggested; remove the snapshot after the upgrade!
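The snapshot advice above implies an ordering: snapshot first, deploy second, remove the snapshot last. The sketch below illustrates that ordering only. The three callables are hypothetical stand-ins for whatever the hypervisor and the deploy scripts provide, and keeping the snapshot when the deploy fails is an assumption added here so that a rollback remains possible.

```python
from typing import Callable, List


def upgrade_with_snapshot(take_snapshot: Callable[[], str],
                          deploy: Callable[[], None],
                          remove_snapshot: Callable[[str], None]) -> List[str]:
    """Run an upgrade guarded by a VM snapshot and return the steps taken.

    All three callables are placeholders supplied by the caller; none of
    them is part of the DMS itself.
    """
    steps: List[str] = []

    snapshot_id = take_snapshot()
    steps.append(f"snapshot taken: {snapshot_id}")

    try:
        deploy()
    except Exception:
        # Keep the snapshot so the VM can be rolled back.
        steps.append("deploy failed; snapshot kept for rollback")
        return steps

    steps.append("deploy finished")

    # Snapshots left in place grow over time, so remove after success.
    remove_snapshot(snapshot_id)
    steps.append(f"snapshot removed: {snapshot_id}")
    return steps
```

The returned list doubles as a simple audit trail of what happened during the upgrade window.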
DMS upgrade parts
Client side components
- DocIntegrate: manages down-/upload of documents; maintains local copies and monitors the server connection
- NuxeoLib.dll (system library): connects Patricia natively to the DMS; Patricia speaks "DMSian"
- DocIntegrate Outlook (Outlook plugin): shows save status plus additional information relevant to each email
https://doc.practiceinsight.io/display/DMS/Windows+Client+Configuration
Environment configuration
In essence, there are 2 locations where you configure your environment:
- PAT_DMS_SETTINGS table in the Patricia database (system behaviour)
  https://doc.practiceinsight.io/display/DMS/PAT_DMS_SETTINGS+Configuration
- /Workspace/Paricia/Settings/settings.xml document (email templates and certain casebrowser settings)
  https://doc.practiceinsight.io/display/DMS/Email+Integration and subpages
Environment maintenance
The DMS comes with a number of diagnostic and maintenance endpoints.
CAUTION: You MUST know what you are doing!
https://doc.practiceinsight.io/display/DMS/Diagnosis+servlets
THANK YOU 🤓 https://doc. practiceinsight