Using a 3-dim DSR(Document Sender Receiver) matrix and

Communications Analytics: prediction and anomaly detection for emails, tweets, phone records and texts
Using a 3-dim DSR (Document, Sender, Receiver) matrix and 2-dim TD (Term, Doc) and UT (User, Term) matrixes.

The pSVD trick is to replace these massive relationship matrixes with small feature matrixes. Using just one feature, replace them with vectors, f = (fD, fT, fU, fS, fR) or f = (fD, fT, fU):
Replace DSR with fD, fS and fR.
Replace TD with fT and fD.
Replace UT with fU and fT.
[Figure: the DSR, TD and UT matrixes next to their one-feature vectors, and the corresponding feature matrixes for 2 features.]

Use GradientDescent + LineSearch to minimize the sum of square errors, sse, where sse is the sum over all nonblanks in TD, UT and DSR.

Should we train the User feature segments separately (train fU with UT only, and train fS and fR with DSR only), or train U with UT and DSR and then let fS = fR = fU? In the first layout, f = (fD, fT, fU, fS, fR). This will be called 3D f. Training the User feature segment just once gives f = (fD, fT, fU = fS = fR). This will be called 3DTU f.

We do the pTree conversions and train F in the CLOUD, then download the resulting F to users' personal devices for predictions and anomaly detection. The same setup should work for phone record Documents, tweet Documents (in the US Library of Congress), text Documents, etc.
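The GradientDescent + LineSearch step can be sketched for a single 2D matrix such as TD. This is a minimal illustration, not the authors' implementation: the function names, the dictionary layout of the nonblank cells, the all-ones starting point and the step-halving line search are all assumptions made here.

```python
# One-feature pSVD sketch: approximate cell (i, j) by f_row[i] * f_col[j],
# minimizing sse over the nonblank cells only. All names are hypothetical.

def sse(cells, f_row, f_col):
    """Sum of squared errors over the nonblank cells."""
    return sum((f_row[i] * f_col[j] - v) ** 2 for (i, j), v in cells.items())

def gradient(cells, f_row, f_col):
    """Partial derivatives of sse with respect to every feature value."""
    g_row = [0.0] * len(f_row)
    g_col = [0.0] * len(f_col)
    for (i, j), v in cells.items():
        err = f_row[i] * f_col[j] - v
        g_row[i] += 2 * err * f_col[j]
        g_col[j] += 2 * err * f_row[i]
    return g_row, g_col

def train(cells, n_rows, n_cols, rounds=200):
    """Gradient descent; each round halves the step until sse decreases."""
    f_row = [1.0] * n_rows
    f_col = [1.0] * n_cols
    for _ in range(rounds):
        g_row, g_col = gradient(cells, f_row, f_col)
        current = sse(cells, f_row, f_col)
        step = 1.0
        while step > 1e-12:
            new_row = [a - step * g for a, g in zip(f_row, g_row)]
            new_col = [a - step * g for a, g in zip(f_col, g_col)]
            if sse(cells, new_row, new_col) < current:
                f_row, f_col = new_row, new_col
                break
            step /= 2
        else:
            break  # no step size improved sse, so stop early
    return f_row, f_col

# Made-up 2 x 3 example; cells (0, 2) and (1, 1) are blank.
cells = {(0, 0): 2.0, (0, 1): 3.0, (1, 0): 4.0, (1, 2): 8.0}
f_row, f_col = train(cells, 2, 3)
print("sse after training:", sse(cells, f_row, f_col))
print("prediction for blank cell (0, 2):", f_row[0] * f_col[2])
```

Only the nonblank cells enter the error, so the blank cells are never penalized during training and their products f_row[i] * f_col[j] serve as the predictions.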

(Problem to solve: mechanism for SVD prediction of Sender?)
3DTU: Structure the relationship as a rotatable matrix, then create PTreeSets for each rotation (attach each entity table PTreeSet to its rotation).

Always treat an entity as an attribute of another entity if possible, rather than adding it as a new dimension of a matrix? E.g., treat Sender as a Document attribute instead of as the 3rd dimension of the matrix DSR. The reason: Sender is a candidate key for Doc (while Receiver is not).

Only provide a blank mask when there are blanks. pTrees might also be provided for D.ST (SendTime) and D.LN (Length).

[Figure: the Document table with attributes Sender, CT, LN, DR and DT, and the rotations TD, TU, UT and RD, each with its PTreeSet. Document attributes: pDS,0 pDS,1; pDCT,0 pDCT,1; pDLN,0 pDLN,1. DT: pDTT1,2 pDTT1,1 pDTT1,0 pDTT1,Mask, and the same for pDTT2 and pDTT3. TD: pTDD1,2 pTDD1,1 pTDD1,0 pTDD1,Mask, and the same for pTDD2. TU: pTUU1,2 pTUU1,1 pTUU1,0 pTUU1,Mask, and the same for pTUU2. UT: pUTT1,2 pUTT1,1 pUTT1,0 pUTT1,Mask, and the same for pUTT2 and pUTT3. DR and RD: pDRR1, pDRR2, pDRMask; pRDD1, pRDD2, pRDMask.]
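The names on this slide end in an index (2, 1, 0) or in Mask, which suggests one bit position per index plus a mask marking the nonblank cells. Under that reading, a column can be sketched as follows. This is an assumption-laden illustration: real pTrees are compressed tree structures, whereas the code below only builds the uncompressed bit vectors, and the function names are hypothetical.

```python
# Vertical bit-slice sketch for one column with blanks (None).
# slices[b][i] is bit b of row i; mask[i] is 1 where row i is nonblank.

def bit_slices(column, n_bits):
    mask = [0 if v is None else 1 for v in column]
    slices = {
        b: [0 if v is None else (v >> b) & 1 for v in column]
        for b in range(n_bits)
    }
    return slices, mask

def reconstruct(slices, mask):
    """Rebuild the original column from its bit slices and mask."""
    return [
        sum(slices[b][i] << b for b in slices) if mask[i] else None
        for i in range(len(mask))
    ]

def masked_sum(slices, mask):
    """Column sum computed from per-slice 1-counts, ignoring blanks."""
    return sum(
        (1 << b) * sum(bit & m for bit, m in zip(slices[b], mask))
        for b in slices
    )

column = [3, None, 5, 1]          # made-up values; row 1 is blank
slices, mask = bit_slices(column, 3)
print(slices, mask)
print(reconstruct(slices, mask), masked_sum(slices, mask))
```

The mask is what distinguishes a blank from a stored zero, which matches the slide's note that a blank mask is only needed when blanks occur.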

Here we try a comprehensive comparison of the 3 alternatives: 3D (DSR); 2D (DS, DR); DTU (2D). [em9 em10]

[Tables: four example data sets, each giving DT, UT (users u1, u2) and DSR (senders s1, s2; documents d1, d2; receivers), followed by result rows with the columns sseDTU, tDSU, T1, T2, T3, D1, D2, U1, U2, S1, S2, R1, R2, sse2D, t2D, sse3D, t3D.]

Comprehensive comparison of 3 alternatives [em11]: DTU; 2D (DT, UT, DS, DR); 3D (DTD, TDT, TUT, UUT, DDSR, SDSR, RDSR).

[Tables: four example panels, each giving DT, UT (users u1, u2) and DSR (senders s1, s2; documents d1, d2; receivers), the feature columns TD1, TD2, TD3, DT1, DT2, TU1, TU2, TU3, DSR1, DSR2, U1, U2, S1, S2, and result rows DTU, t2D and t3D under the headings e, sse and t. The value shown on the DTU row is 42.59, 16.44, 21.84 and 15.67 in the four panels respectively.]

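pSVD for Communication Analytics, f = (fD_TD, fT_TD, fT_UT, fU_UT, fS_DSR, fD_DSR, fR_DSR)

Train f as follows: train with the 2D matrix TD, train with the 2D matrix UT, and train over the 3D matrix DSR. Below, f_t, f_d, f_u, f_s and f_r are the feature values of term t, document d, user u, sender s and receiver r for the matrix being trained, and nb(M) is the set of nonblank cells of M. Each partial derivative sums over the nonblank cells that involve the variable in question.

TD:
sse = \sum_{(t,d) \in nb(TD)} (f_t f_d - TD_{t,d})^2
\partial sse / \partial f_d = 2 \sum_{t} (f_t f_d - TD_{t,d}) f_t
\partial sse / \partial f_t = 2 \sum_{d} (f_t f_d - TD_{t,d}) f_d

UT:
sse = \sum_{(u,t) \in nb(UT)} (f_u f_t - UT_{u,t})^2
\partial sse / \partial f_u = 2 \sum_{t} (f_u f_t - UT_{u,t}) f_t
\partial sse / \partial f_t = 2 \sum_{u} (f_u f_t - UT_{u,t}) f_u

DSR:
sse = \sum_{(d,s,r) \in nb(DSR)} (f_d f_s f_r - DSR_{d,s,r})^2
\partial sse / \partial f_d = 2 \sum_{s,r} (f_d f_s f_r - DSR_{d,s,r}) f_s f_r
\partial sse / \partial f_s = 2 \sum_{d,r} (f_d f_s f_r - DSR_{d,s,r}) f_d f_r
\partial sse / \partial f_r = 2 \sum_{d,s} (f_d f_s f_r - DSR_{d,s,r}) f_d f_s

pSVD classification predicts blank cell values.

[Figure: the DSR, TD and UT matrixes with their feature vectors fD_TD, fT_TD, fT_UT, fU_UT, fS_DSR, fD_DSR and fR_DSR.]

pSVD FAUST Cluster: Use pSVD to speed up FAUST clustering by looking for gaps in the pSVD approximation of TD rather than in TD itself (i.e., using SVD predicted values rather than the actual given TD values). The same goes for DT, UT, TU, DSR, SDR and RDS. E.g., on the T(d_1,...,d_n) table, the t-th row is pSVD estimated as (f_t f_{d_1}, ..., f_t f_{d_n}) and the dot product v \circ t is pSVD estimated as \sum_{k=1}^{n} v_k f_t f_{d_k}. So we analyze gaps in this column of values taken over all rows t.

pSVD FAUST Classification: Use pSVD to speed up FAUST classification by finding optimal cutpoints in the pSVD approximation of TD rather than in TD itself (i.e., using SVD predicted values rather than the actual given TD values). The same goes for DT, UT, TU, DSR, SDR and RDS.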
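The 3D case differs from the 2D ones only in that each prediction is a product of three feature values. The sketch below computes the DSR sse and its partial derivatives over the nonblank cells. The dictionary keyed by (d, s, r) index triples and the function names are assumptions made for illustration.

```python
# One-feature sse and gradient for a 3D DSR matrix, nonblank cells only.
# cells maps (d, s, r) index triples to the stored value; names are hypothetical.

def dsr_sse(cells, fD, fS, fR):
    return sum((fD[d] * fS[s] * fR[r] - v) ** 2 for (d, s, r), v in cells.items())

def dsr_gradient(cells, fD, fS, fR):
    gD = [0.0] * len(fD)
    gS = [0.0] * len(fS)
    gR = [0.0] * len(fR)
    for (d, s, r), v in cells.items():
        err = fD[d] * fS[s] * fR[r] - v
        gD[d] += 2 * err * fS[s] * fR[r]   # d(sse)/d(f_d)
        gS[s] += 2 * err * fD[d] * fR[r]   # d(sse)/d(f_s)
        gR[r] += 2 * err * fD[d] * fS[s]   # d(sse)/d(f_r)
    return gD, gS, gR

# Made-up example: doc 0 sent by sender 0 to receivers 0 and 1,
# doc 1 sent by sender 1 to receiver 0.
dsr_cells = {(0, 0, 0): 1.0, (0, 0, 1): 1.0, (1, 1, 0): 1.0}
fD, fS, fR = [0.5, 1.5], [1.0, 2.0], [1.0, 0.5]
print(dsr_sse(dsr_cells, fD, fS, fR))
print(dsr_gradient(dsr_cells, fD, fS, fR))
```

A quick way to validate such a gradient is to compare one component against a central finite difference of dsr_sse before plugging it into gradient descent.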

Recalling the massive interconnection of relationships between entities, any analysis we do on this we can do after estimating each matrix using pSVD-trained feature vectors for the entities. On the next slide we display the pSVD1 (one feature) replacement by a feature vector which approximates the non-blank cell values and predicts the blanks.

[Figure: interconnected relationship matrixes (cards) over the entities People, Customer, Item, Author, movie, Doc, term, Gene, Exp, PI and Course. The relationships shown include DSR (doc, sender, receiver), UT, customer rates movie, cust-item, term-doc, author-doc, doc-doc, term-term (share stem?), gene-gene (ppi), exp-gene, exp-PI and Course Enrollments.]

On this slide we display the pSVD1 (one feature) replacement by a feature vector which approximates the non-blank cell values and predicts the blanks.

Train the following feature vector through gradient descent of sse, but with each matrix's set of feature vectors trained only on the sse over the nonblank cells of that matrix. For example, train the two feature vectors of GG1 on GG1, the two of EG on EG, and those of GG2 on GG2, and the same for the rest of them.

Any data mining we can do with the matrixes, we can do (estimate) with the feature vectors (e.g., Netflix-like recommenders, prediction of blank cell values, FAUST gap-based classification and clustering, including anomaly detection).

[Figure: the relationship matrixes DSR (Doc, Sender, Receiver), UT, CI, TD, AD, DD, TermTerm, UserMovie ratings, Enroll, ExpPI, ExpG, GG1 and GG2, each bordered by its one-feature vectors: fDSR,D fDSR,S fDSR,R; fUT,U fUT,T; fCI,C fCI,I; fTD,T fTD,D; fTT,T1 fTT,T2; fUM,M; fE,S fE,C; fD1 fD2; fE1 fE2; fG1 fG2 fG3 fG4 fG5.]

An n-dim vector space, RC(C_1,...,C_n), is a matrix or TwoEntityRelationship (with row entity instances R_1...R_N and column entity instances C_1...C_n). ARC will denote the pSVD approximation of RC.

An (N+n)-vector, f = (fR, fC), defines the prediction p_{i,j} = fR_i fC_j and the error e_{i,j} = p_{i,j} - RC_{i,j}. Then ARC_{f,i,j} \equiv fR_i fC_j and

ARC_{f,row_i} = fR_i fC = fR_i (fC_1, ..., fC_n) = (fR_i fC_1, ..., fR_i fC_n).

Use sse gradient descent to train f.

Once f is trained and if d is a unit n-vector, the SPTS ARC_f \circ d^t has, for i = 1..N, the i-th entry

(fR_i fC) \circ d^t = \sum_{k=1}^{n} fR_i fC_k d_k = fR_i \sum_{k=1}^{n} fC_k d_k = fR_i (fC \circ d^t).

So compute fC \circ d^t = \sum_{k=1}^{n} fC_k d_k, form a constant SPTS with it, and multiply that SPTS by the SPTS fR.

Any data mining that can be done on RC can be done using this pSVD approximation of RC, ARC, e.g., FAUST Oblique (because ARC \circ d^t should show us the large gaps quite faithfully).

Given any K \times (N+n) feature matrix, F = [FR FC], with FR_i = (f_{1,R_i}, ..., f_{K,R_i}) and FC_j = (f_{1,C_j}, ..., f_{K,C_j}), the prediction is

p_{i,j} = FR_i \circ FC_j = \sum_{k=1}^{K} f_{k,R_i} f_{k,C_j}.

Once F is trained and if d is a unit n-vector, the SPTS ARC \circ d^t has, for i = 1..N, the i-th entry

(FR_i \circ FC) \circ d^t = \sum_{k=1}^{n} (f_{1,R_i} f_{1,C_k} + ... + f_{K,R_i} f_{K,C_k}) d_k = FR_i \circ (FC \circ d^t).

Keeping in mind that we have decided (tentatively) to approach all matrixes as rotatable tables, this then is a universal method of approximation. The big question is, how good is the approximation for data mining? It is known to be good for Netflix-type recommender matrixes, but what about others?
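The regrouping of the sums can be checked numerically. The sketch below projects the K-feature approximation onto d in two ways: row by row through the full N x n approximation, and through the K numbers FC o d computed once. It then picks the largest gap in the projected values. The gap finder is a simplified stand-in for the FAUST gap analysis, which the slides do not spell out, and all names and data are made up for illustration.

```python
# FR[k][i] is feature k of row entity i; FC[k][j] is feature k of column entity j.

def project_exact(FR, FC, d):
    """Build each approximated row, then take its dot product with d."""
    K, N, n = len(FR), len(FR[0]), len(FC[0])
    out = []
    for i in range(N):
        row = [sum(FR[k][i] * FC[k][j] for k in range(K)) for j in range(n)]
        out.append(sum(row[j] * d[j] for j in range(n)))
    return out

def project_fast(FR, FC, d):
    """Compute FC o d once (K numbers), then one K-term sum per row."""
    K, N = len(FR), len(FR[0])
    c = [sum(FC[k][j] * d[j] for j in range(len(d))) for k in range(K)]
    return [sum(FR[k][i] * c[k] for k in range(K)) for i in range(N)]

def largest_gap(values):
    """Return (gap width, midpoint) of the widest gap between sorted values."""
    s = sorted(values)
    return max((s[i + 1] - s[i], (s[i] + s[i + 1]) / 2) for i in range(len(s) - 1))

FR = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]   # K = 2, N = 4
FC = [[1.0, 0.0, 2.0], [2.0, 1.0, 0.0]]             # K = 2, n = 3
d = [0.6, 0.8, 0.0]                                  # unit 3-vector

print(project_exact(FR, FC, d))
print(project_fast(FR, FC, d))
print(largest_gap(project_fast(FR, FC, d)))
```

The fast form needs on the order of K(N + n) multiplications instead of K N n, which is the practical reason for working with the feature vectors rather than the full matrix.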

Of course, if we take the previous data (all nonblanks = 1) and we only count errors in those nonblanks, then the pure-1 f (every feature value equal to 1) has error = 0. But if the matrix is an image (a fax-type image of 0/1 values), then there are no blanks, and the zero positions must be assessed for error too. So we change the data.

[Table: a small example with rows a, b and columns 1 to 9 and a to f, the feature vectors fR and fC, and a results row with columns t, sse and e.]
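The difference between the two ways of counting error can be seen on a tiny made-up example. The sketch below scores the all-ones feature vector twice: once over the nonblank cells only, and once treating every unlisted cell as a 0 that must also be matched. Function names and data are illustrative assumptions.

```python
# Compare sse over nonblank cells with sse over the full 0/1 image.

def sse_nonblank(cells, fR, fC):
    return sum((fR[i] * fC[j] - v) ** 2 for (i, j), v in cells.items())

def sse_full(cells, fR, fC):
    """Treat every cell missing from `cells` as a 0 that counts toward the error."""
    return sum(
        (fR[i] * fC[j] - cells.get((i, j), 0.0)) ** 2
        for i in range(len(fR))
        for j in range(len(fC))
    )

ones_cells = {(0, 0): 1.0, (0, 2): 1.0, (1, 1): 1.0}   # 2 x 3, three cells set to 1
fR, fC = [1.0, 1.0], [1.0, 1.0, 1.0]                   # the pure-1 f

print(sse_nonblank(ones_cells, fR, fC))   # only the three 1-cells are scored
print(sse_full(ones_cells, fR, fC))       # the three 0-cells are scored too
```

Under the first measure the pure-1 f is a perfect fit, while under the second each zero position contributes a squared error of 1, so a trivial f no longer wins.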
