Deploying disk deduplication for Hyper-V 3.0
Žigmund Maťašovský
Content
  Deduplication in Windows Server 2012
  Hyper-V and files
  Cooperation of Hyper-V and deduplication
  Performance
  "Eye-opener"
  Questions & example
What deduplication is
  Goal: use less storage space
  Method: ensure that identical content is stored only once (per volume)
  How it works:
    Post-process
    Based on variable block size
    Transparent to applications
    Selective compression
"Save space" in previous versions
  NTFS compression
    Works on single files
    Real compression; writes are CPU intensive
  Single instance store
    File-based (WDS)
  NTFS hard link
    File-based
    Not transparent to applications
Dedup mode
  Where dedup works:
    Source (RDC)
    Target
  When dedup works:
    Inline (NTFS compression; slower writes)
    Post-process
  Which object is deduplicated:
    File (SIS)
    Block (fixed size, cluster, ...)
    Chunk (variable block size, ...)
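The difference between file-level and fixed-size block-level deduplication can be illustrated with a small Python sketch. It is not how Windows implements either mode; the 4096-byte block size, the SHA-256 hash, and the function names are assumptions chosen for illustration. It measures how much of a changed file is still recognized as duplicate content.

```python
import hashlib
import random

def fixed_blocks(data: bytes, size: int = 4096):
    """Split data into fixed-size blocks (the last block may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def shared_fraction(a_parts, b_parts) -> float:
    """Fraction of b's parts whose hash already occurs among a's parts."""
    seen = {hashlib.sha256(p).digest() for p in a_parts}
    if not b_parts:
        return 0.0
    hits = sum(hashlib.sha256(p).digest() in seen for p in b_parts)
    return hits / len(b_parts)

def demo():
    rng = random.Random(0)
    original = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
    appended = original + b"tail"   # small change at the end of the file
    shifted = b"X" + original       # one byte inserted at the front
    return {
        # File-level: only byte-identical files are duplicates.
        "file_identical": shared_fraction([original], [original]),
        "file_appended": shared_fraction([original], [appended]),
        # Fixed-size blocks: survive an append, but not a shift.
        "block_appended": shared_fraction(fixed_blocks(original), fixed_blocks(appended)),
        "block_shifted": shared_fraction(fixed_blocks(original), fixed_blocks(shifted)),
    }
```

In this sketch a single inserted byte shifts every fixed-size block, so nothing is shared any more. That is the weakness that variable-size chunks are meant to avoid.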
Dedup architecture
  File system driver
  Dedup services:
    Data Deduplication Service
    Data Deduplication VSS
  Jobs
  Not supported:
    Removable devices
    CSV
    System disk
    Remote mounted devices
How dedup works
  Post-process
    Preserves latency and throughput of primary (on-the-fly) data access
    Flexibility in scheduling background jobs on cold data
    Optimize, delete, check/repair
    The optimize job needs no input/output operations on deduplicated files
  Based on variable block size = chunk
    Calculate a Rabin fingerprint hash over a sliding window
    Declare chunk boundaries
    Selective compression
    48 bytes of metadata per chunk
    Organized in a log structure
    Also uses a heuristic algorithm
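The sliding-window idea can be sketched in Python as follows. This is a simplified illustration, not the Windows implementation: it uses a plain polynomial rolling hash instead of a true Rabin fingerprint, and the window size, mask, and minimum and maximum chunk sizes are assumed values chosen to keep the demo small.

```python
WINDOW = 48             # sliding-window size in bytes (assumed)
BASE = 257              # polynomial base of the rolling hash (assumed)
MOD = (1 << 61) - 1     # large prime modulus
MASK = (1 << 10) - 1    # boundary when the low 10 bits of the hash are zero (assumed)
MIN_CHUNK = 256         # assumed lower bound on chunk size
MAX_CHUNK = 8192        # assumed upper bound on chunk size

def chunk_boundaries(data: bytes):
    """Yield (start, end) offsets of content-defined chunks."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window
    start = 0
    h = 0
    for i, byte in enumerate(data):
        pos = i - start               # position inside the current chunk
        if pos >= WINDOW:
            # Slide the window: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        length = pos + 1
        # Declare a boundary where the window hash matches the pattern,
        # or force one when the chunk grows too large.
        if (length >= MIN_CHUNK and (h & MASK) == 0) or length >= MAX_CHUNK:
            yield start, i + 1
            start = i + 1
            h = 0
    if start < len(data):
        yield start, len(data)
```

Because a boundary depends only on the last `WINDOW` bytes, inserting a byte near the start of a file changes the first chunk, while later boundaries fall on the same content again.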
How dedup works – detail I. Identify duplicate data:
How dedup works – detail II. Optimize target files:
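The two steps (identify duplicate data, then optimize the target files) can be modeled with a toy chunk store in Python. Everything here is an illustrative assumption: the class and method names are hypothetical, SHA-256 stands in for the real chunk hash, and fixed 8-byte pieces replace variable-size chunking to keep the sketch short.

```python
import hashlib

def split_chunks(data: bytes, size: int = 8):
    # Stand-in for variable-size chunking: fixed-size pieces keep the sketch short.
    return [data[i:i + size] for i in range(0, len(data), size)]

class ChunkStore:
    """Toy model: each unique chunk is stored once, files become lists of references."""

    def __init__(self):
        self.chunks = {}     # chunk hash -> chunk bytes
        self.refcount = {}   # chunk hash -> number of references from files
        self.files = {}      # file name -> ordered list of chunk hashes

    def optimize(self, name: str, data: bytes) -> None:
        """Replace file content with references into the chunk store."""
        recipe = []
        for chunk in split_chunks(data):
            key = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(key, chunk)   # store only if not seen before
            self.refcount[key] = self.refcount.get(key, 0) + 1
            recipe.append(key)
        self.files[name] = recipe

    def read(self, name: str) -> bytes:
        """Rebuild the original content from its chunk references."""
        return b"".join(self.chunks[k] for k in self.files[name])

    def delete(self, name: str) -> None:
        """Deleting a file only drops references; chunks stay until collected."""
        for key in self.files.pop(name):
            self.refcount[key] -= 1

    def garbage_collect(self) -> int:
        """Remove chunks that no file references; return how many were removed."""
        dead = [k for k, n in self.refcount.items() if n == 0]
        for k in dead:
            del self.chunks[k]
            del self.refcount[k]
        return len(dead)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())
```

In this model, deleting a file frees no space until `garbage_collect()` runs, which mirrors why a separate garbage collection job exists.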
How to manage dedup
  GUI
    Server Manager (RSAT)
  Command line
    Ddpcli.exe
    DdpEval.exe
  PowerShell:
    Disable-DedupVolume
    Enable-DedupVolume
    Get-DedupStatus
    Get-DedupVolume
    Set-DedupVolume
    Update-DedupStatus
    Get-DedupMetadata
    Measure-DedupFileMetadata
  Job:
    Get-DedupJob
    Start-DedupJob
    Stop-DedupJob
  Schedule:
    Get-DedupSchedule
    New-DedupSchedule
    Remove-DedupSchedule
    Set-DedupSchedule
Dedup – known problems
  How to create a file duplicate (creates only metadata)?
  Scrubbing
    Repair from redundant copy, hotspot
  Garbage collection
  Space for the deduplication process
  Event log
  Monitoring using SCOM 2012 (MP File Server 2012)
  Backup problem
Hyper-V & files
  Format:
    .vhd
    .vhdx
  Types:
    Fixed size - thin provisioning possible on Storage Spaces
    Dynamically expanding - needs compacting if freed space is to be reclaimed
    Differencing – read-only parent disk(s)
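How a differencing disk relates to its read-only parent can be shown with a conceptual Python model. It does not reflect the actual .vhd or .vhdx on-disk format; the `Disk` class, its methods, and the 4-byte block size are hypothetical and chosen only for readability.

```python
BLOCK_SIZE = 4  # bytes per block; tiny assumed value for readability

class Disk:
    """Toy virtual disk: stores only the blocks written to this file."""

    def __init__(self, size_blocks: int, parent=None):
        self.size_blocks = size_blocks
        self.parent = parent   # read-only parent disk, or None for a base disk
        self.blocks = {}       # block index -> bytes stored in this file

    def read_block(self, index: int) -> bytes:
        if not 0 <= index < self.size_blocks:
            raise IndexError("block index out of range")
        if index in self.blocks:
            return self.blocks[index]              # changed in this disk
        if self.parent is not None:
            return self.parent.read_block(index)   # fall through to the parent
        return bytes(BLOCK_SIZE)                   # unallocated block reads as zeros

    def write(self, offset: int, data: bytes) -> None:
        """Write bytes at a byte offset; each touched block is copied into this disk."""
        for i, byte in enumerate(data):
            index, pos = divmod(offset + i, BLOCK_SIZE)
            block = bytearray(self.read_block(index))
            block[pos] = byte
            self.blocks[index] = bytes(block)

    def stored_bytes(self) -> int:
        return len(self.blocks) * BLOCK_SIZE
```

In this model the parent never changes, and a one-byte write to the child stores one whole block in the child.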
Hyper-V & storage
  Local
  iSCSI
  SMB 3
  CSV
  Storage Spaces
Hyper-V & dedup I.
  Dedup on HOST
    Not recommended
    Because of the performance impact (for standard disks)
    Because deduplication requires that the file is not in use
  Dedup in GUEST
    Problem with compacting a dynamic .vhd
Hyper-V & dedup II.
  Approach for TESTING environments & LABS:
    VHD parent file (golden image)
      Deduplicated volume, high read performance
      Low-cost or old-model SSD
    Use differencing disks
      Non-deduplicated storage
      With high write IOPS / throughput -> (intel DS...)
      Over time the diff file grows too fast (changing one byte requires saving the whole file in the diff)
      Needs manually scripted management
    Separate SWAP disk (or switch off swap) – no dedup
    Separate data disk (MSSQL, Exchange, ...) – no dedup because of the performance impact
    Do not use BitLocker in GUEST
Hyper-V & dedup III.
  Possibilities:
    Use an iSCSI .vhd (fixed) file on a deduplicated volume
      On a "standard" volume
      On a Storage Spaces thin-provisioned volume
    .VHD on dedup storage
      DedupJob with saved-state VMs
    ...
Performance – capacity / resources
  Optimizing:
    Memory: 30 MB-1 GB (depends on chunk count)
    CPU: 30-40%
    Disk usage: median disk queue = 0
  Saved space (parent .VHD):
    Capacity: GB
    FreeSpace: 8.08 GB
    UsedSpace: GB
    UnoptimizedSize: GB
    SavedSpace: GB
    SavingsRate: 86 %
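How the reported values relate to each other can be sketched in Python. The sketch assumes that UnoptimizedSize is UsedSpace plus SavedSpace and that SavingsRate is SavedSpace as a percentage of UnoptimizedSize. The function names and the figures in the usage note are hypothetical, not the measurements from this slide.

```python
def unoptimized_size(used_space_gb: float, saved_space_gb: float) -> float:
    """Logical size of the data if nothing were deduplicated (assumed relation)."""
    return used_space_gb + saved_space_gb

def savings_rate(saved_space_gb: float, unoptimized_size_gb: float) -> float:
    """Saved space as a percentage of the unoptimized (logical) size."""
    if unoptimized_size_gb <= 0:
        raise ValueError("unoptimized size must be positive")
    return 100.0 * saved_space_gb / unoptimized_size_gb
```

With hypothetical figures of 7 GB used and 43 GB saved, the logical size is 50 GB, and `savings_rate(43, 50)` gives a rate of 86 percent.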
Performance I.
  My performance-test reality: 3 different virtual machines
    About 10% faster (VMM Library store) - read-only
    About 30% slower (WDS) – read-write
  Problems:
    Non-deterministic (partially heuristic algorithm); a repeated operation can give a different result
    Cache store chunk highly dependent on device IOPS characteristics
    No dedup file -> sequential operations possible
    Dedup -> chunks fragmented (partially compressed), average chunk size 80 KBytes
Performance – Data access
Eye-opener
  Dedup in Windows 8
    The "same" kernel as Windows Server 2012
    Licensing (?)
  Install using DISM
    dism /online /add-package /packagepath:.....
    dism /online /enable-feature /featurename:Dedup-Core /all
  Manage using PowerShell
    Import-Module Deduplication
    Enable-DedupVolume S:
    Set-DedupVolume S: -MinimumFileAgeDays 0
    Start-DedupJob S: -Type Optimization
    Get-DedupJob
    Get-DedupStatus
    Get-DedupVolume S: | FL
  Backup using wbadmin
Summary
  Do not use in production environments
  Good for labs / testing environments / notebooks
  Good for saving space on SSD (wait for the next version)
Questions and/or example
  Dedup volume S:
    Ddpeval S:\
    Enable-DedupVolume S:
    Get-DedupVolume S: | FL      (note MinimumFileAgeDays)
    Set-DedupVolume S: -MinimumFileAgeDays 0
    Start-DedupJob S: -Type Optimization
    Get-DedupJob
    Get-DedupVolume S: | FL
Thank you for your attention