Handling Large Data Sets Efficiently in MATLAB®


1 Handling Large Data Sets Efficiently in MATLAB® (Challenge the future)

2 Handling Large Data Sets is Like…

3 Memmapwhat?
- Problems in handling large data sets
- Strategies for handling large data sets
  - Maximizing available memory on your system
  - Minimizing required memory in MATLAB®

4 What are Large Data Sets?
- Data in MATLAB® represent physical quantities
- Large data:
  - Lots of quantities
  - Varying by time, space
  - High resolution
- Example: 5-10 TB of data per flight test
- Genomic sequencing data!
- Amazon

5 Causes of Out of Memory Error
- Required memory > available memory
  - Memory constraints on (32-bit) computer system
  - Memory usage characteristics of MATLAB
- Lack of understanding of memory constraints or requirements
  - ">> A = rand(6000,6000); B = svd(A); Why out of memory?"
  - "I have a 1 GB file but I have 3 GB of RAM. Why out of memory?"
- Mistakes
  - >> a = rand(10000); % needs 800 MB of storage
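
The storage figures quoted on this slide can be checked with simple arithmetic. The sketch below assumes full (non-sparse) double matrices at 8 bytes per element. The variable names are illustrative and do not come from the slides. It only counts the input arrays; a call such as svd needs additional working memory on top of that.

```matlab
% Rough storage estimate for full double-precision matrices.
% Assumption: 8 bytes per element, no sparsity, no per-array overhead.
bytesPerDouble = 8;

% A = rand(6000,6000) from the slide
bytesSvdInput = 6000 * 6000 * bytesPerDouble;

% a = rand(10000) from the slide (a 10000-by-10000 matrix)
bytesRand10k = 10000^2 * bytesPerDouble;

fprintf('rand(6000,6000): %.0f MB\n', bytesSvdInput / 1e6);
fprintf('rand(10000):     %.0f MB\n', bytesRand10k / 1e6);
```

The second line of output matches the 800 MB noted on the slide (using 1 MB = 10^6 bytes).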

6 Memory Constraints Causing Out of Memory Errors
[Diagram: nested memory limits, from outermost to innermost]
- Total system memory: RAM plus page file, shared by the memory requirements of all applications (MATLAB and other apps)
- MATLAB process virtual memory: the limit is OS dependent, e.g. 2 GB in Win2k
- Inside the MATLAB process: ML footprint and Windows DLLs, the MATLAB workspace (other ML variables), other fragments, and a contiguous free block
- New data requirements, e.g. zeros(1e9,1), have to fit in the contiguous free block

7 Memory Mapping Example
    % Default: map the whole file as uint8
    m = memmapfile('waferdata_uint8.bin')
    m.Data(1:20);

    % Specify format and field name
    m = memmapfile('waferdata_uint8.bin', ...
        'Format', {'uint8' [20 100] 'x'}, 'Repeat', 20*1000)
    A = m.Data;

    % Change the format on the fly
    m.Format = {'uint8' [1 4] 'headerbits'; ...
                'uint8' [4 9] 'middle'; ...
                'uint8' [7 1] 'tail'};
    A = m.Data;
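
The file waferdata_uint8.bin is not included with this transcript, so the following is a minimal self-contained sketch of the same idea. It assumes a small temporary file written with fwrite. The file contents and all variable names are made up for illustration.

```matlab
% Write 2000 bytes (values 0..255 repeating) to a temporary file.
fname = [tempname '.bin'];
bytesOut = uint8(mod(0:1999, 256));
fid = fopen(fname, 'w');
fwrite(fid, bytesOut, 'uint8');
fclose(fid);

% Default mapping: the whole file as one uint8 column vector.
m = memmapfile(fname);
first20 = m.Data(1:20);

% Structured mapping: one 20-by-100 uint8 block named 'x'.
m2 = memmapfile(fname, 'Format', {'uint8', [20 100], 'x'}, 'Repeat', 1);
block = m2.Data(1).x;      % filled column by column from the file

% Release the maps before deleting the file.
clear m m2
delete(fname);
```

Only the parts of the file that are indexed get read into memory, which is why mapping is attractive for files larger than the available RAM.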

8 Use Smallest Data Type
- Depends on intended actions
- Complicated math (e.g., linear algebra)
  - Doubles or singles, 8 or 4 bytes
  - For example: a = single(7)
- Simple arithmetic where the original data are integers
  - Integers, 1-4 bytes, e.g. a = int8(7)
  - Can be faster than doubles
  - Try with waferdata (Exercise 6)
- Sparse
  - Only the nonzero values and their indices are stored
  - >> a = sparse(2e9,1,pi);
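
The effect of the data type on storage can be inspected with whos. This is a small sketch with arbitrary sizes and variable names, not an excerpt from the exercises. The exact byte count of the sparse array depends on the MATLAB version and platform, so it is only compared against the full array.

```matlab
n = 1000;

d  = zeros(n);             % double: 8 bytes per element
s  = zeros(n, 'single');   % single: 4 bytes per element
i8 = zeros(n, 'int8');     % int8:   1 byte per element
sp = sparse(n, n);         % all-zero sparse matrix
sp(1, 1) = pi;             % one stored nonzero

wd = whos('d');   ws  = whos('s');
wi = whos('i8');  wsp = whos('sp');

fprintf('double: %d bytes\n', wd.bytes);
fprintf('single: %d bytes\n', ws.bytes);
fprintf('int8:   %d bytes\n', wi.bytes);
fprintf('sparse: %d bytes\n', wsp.bytes);
```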

9 Example
    % Filename
    tmpname = 'c:\temp\magweg\tmpmemmapfile.bin';

    % Make an empty file large enough for the matrix (*8 for double).
    % CreateEmptyFile is not a MATLAB built-in; it is a helper that is
    % not shown on the slide.
    CreateEmptyFile(tmpname, 3000*3000*8);

    % Open the memory-mapped file
    dataMatrix = memmapfile(tmpname, ...
        'Format', {'double', [3000,3000], 'x'}, 'Writable', true);

    % Random data
    dataMatrix.Data.x = rand(3000) + eye(3000);

    % Manipulate
    cov(dataMatrix.Data.x)
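
Because CreateEmptyFile is not shown, the sketch below replaces it with a plain fwrite of zeros. That replacement is an assumption about what the helper does, not the presenter's implementation. The matrix is also reduced to 300-by-300 and stored in a temporary file so the example runs quickly.

```matlab
n = 300;
tmpname = [tempname '.bin'];

% Assumed stand-in for CreateEmptyFile: write n*n zero-valued doubles.
fid = fopen(tmpname, 'w');
fwrite(fid, zeros(n*n, 1), 'double');
fclose(fid);
fileInfo = dir(tmpname);          % file size on disk, in bytes

% Map the file as a single writable n-by-n double matrix named 'x'.
dataMatrix = memmapfile(tmpname, ...
    'Format', {'double', [n n], 'x'}, 'Writable', true);

% Writes go through the map to the file on disk.
dataMatrix.Data(1).x = rand(n) + eye(n);

% The mapped data can be used like an ordinary matrix.
C = cov(dataMatrix.Data(1).x);

% Release the map before deleting the file.
clear dataMatrix
delete(tmpname);
```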

10 Also in R
- ff package
- bigmemory package
- "Handling Large Data Sets in R with Memory Mapped Pages of Binary Flat Files"

