Semi-Supervised Classification (2)

Name: Semi-Supervised Classification (2)
Uploaded: 2017-07-12T12:41:46+00:00
Duration: PTM10S33
Channel: Meagan Stewart
Description: Semi-Supervised Classification (2)

Semi-Supervised Classification (2)
Zohreh Karimi
Introduction to Semi-Supervised Learning, Xiaojin Zhu and Andrew B. Goldberg, University of Wisconsin, Madison, 2009.

Semi-supervised learning methods
Mixture models and the EM method
Co-Training
Graph-based methods
SVM-based methods
Human semi-supervised learning
Theory

Co-Training
Named entity classification (example class: Location)

Co-Training
Learns two classifiers, each trained on one view of the data.
Instances that one classifier labels with high confidence are added to the training data of the other classifier.
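The Co-Training loop can be sketched as follows. This is a minimal illustration, not the reference algorithm from the book: `train_centroid` is a hypothetical 1-D nearest-centroid learner standing in for any real classifier, each view is assumed to be a single number, and the `rounds` and `per_round` parameters are illustrative.

```python
def train_centroid(xs, ys):
    """Hypothetical stand-in learner: 1-D nearest-centroid classifier.

    Returns a function mapping a feature value to (label, confidence).
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    mu_pos = sum(pos) / len(pos)
    mu_neg = sum(neg) / len(neg)

    def predict(x):
        d_pos, d_neg = abs(x - mu_pos), abs(x - mu_neg)
        label = 1 if d_pos <= d_neg else -1
        return label, abs(d_pos - d_neg)  # larger gap = more confident

    return predict


def co_training(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: list of ((view1, view2), y); unlabeled: list of (view1, view2)."""
    train1 = [(v[0], y) for v, y in labeled]  # training set of the view-1 classifier
    train2 = [(v[1], y) for v, y in labeled]  # training set of the view-2 classifier
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        f1 = train_centroid(*zip(*train1))
        f2 = train_centroid(*zip(*train2))
        # Each classifier picks the pool instances it is most confident about.
        by_f1 = sorted(pool, key=lambda v: -f1(v[0])[1])[:per_round]
        by_f2 = sorted(pool, key=lambda v: -f2(v[1])[1])[:per_round]
        # f1's confident labels teach f2, and f2's confident labels teach f1.
        for v in by_f1:
            train2.append((v[1], f1(v[0])[0]))
        for v in by_f2:
            train1.append((v[0], f2(v[1])[0]))
        pool = [v for v in pool if v not in by_f1 and v not in by_f2]
    return train_centroid(*zip(*train1)), train_centroid(*zip(*train2))
```

The key design point is the cross-feeding: a classifier never adds its own confident predictions to its own training set, only to the other view's.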

Co-Training assumptions
Each view alone is sufficient for classification.
The two views are conditionally independent given the class label.
Why is the conditional independence assumption important for Co-Training? If the view-2 classifier f^(2) decides that the context "headquartered in" indicates Location with high confidence, Co-Training will add unlabeled instances with that context as view-1 training examples. These new training examples for f^(1) will include all representative Location named entities x^(1), thanks to the conditional independence assumption. If the assumption didn't hold, the new examples could all be highly similar and thus be less informative for the view-1 classifier. It can be shown that if the two assumptions hold, Co-Training can learn successfully from labeled and unlabeled data. However, it is actually difficult to find tasks in practice that completely satisfy the conditional independence assumption. After all, the context "Prime Minister of" practically rules out most locations except countries. When the conditional independence assumption is violated, Co-Training may not perform well. If the conditional independence assumption holds, then on average each added document will be as informative as a random document, and the learning will progress.

Applications
Web-page classification. Views: the page text (words occurring on the page) and the hyperlink text (words occurring in hyperlinks that point to the page).
Speech phoneme classification. Views: the audio signal and the video signal showing lip movement.

Multiview learning (1)
The squared loss: c(x, y, f(x)) = (y − f(x))^2
The 0/1 loss: c(x, y, f(x)) = 0 if y = f(x), and 1 otherwise
An asymmetric cost: c(x, y = healthy, f(x) = diseased) = 1 and c(x, y = diseased, f(x) = healthy) = 100
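The three loss functions on this slide can be written down directly. This is only a small sketch; the function names are made up for illustration, and the zero cost for a correct medical prediction is an assumption, since the slide lists only the two error costs.

```python
def squared_loss(y, fx):
    """Squared loss (y - f(x))^2, typically used for regression."""
    return (y - fx) ** 2


def zero_one_loss(y, fx):
    """0/1 loss: 0 for a correct prediction, 1 otherwise."""
    return 0 if y == fx else 1


def medical_cost(y, fx):
    """Asymmetric cost: missing a disease is 100 times worse than a false alarm."""
    if y == fx:
        return 0  # assumption: correct predictions cost nothing
    return 100 if y == "diseased" else 1
```

The asymmetric case shows that the loss encodes what a mistake costs in the application, not only whether a mistake happened.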

Multiview learning (2)

Multiview learning (3)
Goal: produce k models based on k views.
The semi-supervised regularizer measures the disagreement of the k models on the unlabeled data.
Objective terms: individual regularized risk; semi-supervised regularizer.
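One way to measure the disagreement of k models on unlabeled data is to sum a pairwise loss over all model pairs. The sketch below assumes real-valued model outputs and uses the squared difference as that pairwise loss; both choices, and the function name, are assumptions made for illustration.

```python
from itertools import combinations


def disagreement(models, unlabeled):
    """Sum of squared prediction differences over all model pairs and all unlabeled x."""
    return sum(
        (f(x) - g(x)) ** 2
        for f, g in combinations(models, 2)
        for x in unlabeled
    )
```

The term is zero exactly when all models give the same output on every unlabeled instance, so minimizing it pushes the k hypotheses toward agreement.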

Multiview learning (4)
Assumption: the hypotheses agree with one another and, in addition, each has a small empirical risk.

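Graph-based semi-supervised classification (1)
Labeled and unlabeled instances correspond to the vertices of the graph.
The similarity between two instances corresponds to the weight of the edge between the two vertices.
Graph types: fully connected graph; kNN graph; εNN graph.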
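A kNN graph of this kind can be built as in the sketch below. The Gaussian weight exp(−d²/2σ²), the symmetrization rule, and the parameter names are assumptions chosen for illustration; other similarity functions work equally well.

```python
import math


def knn_graph(points, k, sigma=1.0):
    """Return a symmetric weight matrix W for a kNN graph.

    points: list of coordinate tuples. An edge i-j exists if j is among
    the k nearest neighbors of i, or i is among those of j.
    """
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            w = math.exp(-d * d / (2 * sigma ** 2))  # assumed Gaussian similarity
            W[i][j] = W[j][i] = w  # keep the graph undirected
    return W
```

Setting k to n − 1 recovers the fully connected graph; a small k keeps only local similarities, which is what the later smoothness arguments rely on.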

Graph-based semi-supervised classification (2)

Regularization framework
A label function f on the graph.
The predictions of f should be close to the given labels on the labeled data (loss function).
f should be smooth over the whole graph (regularization framework): special graph-based regularization.

Mincut (1)
Positively labeled instances correspond to source vertices.
Negatively labeled instances correspond to sink vertices.
Goal: find a minimum set of edges whose removal separates the sources from the sinks.
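On a tiny graph, the mincut labeling can be found by brute force, which makes the objective easy to inspect. This is only a sketch: it enumerates every labeling of the unlabeled vertices and so takes exponential time, whereas practical solvers use max-flow algorithms. The function name and input format are hypothetical.

```python
from itertools import product


def mincut_labels(W, labels):
    """W: symmetric weight matrix; labels: dict vertex -> +1 (source) or -1 (sink).

    Returns (labeling, cut_weight) minimizing the total weight of edges
    whose endpoints receive different labels.
    """
    n = len(W)
    free = [i for i in range(n) if i not in labels]
    best, best_cut = None, float("inf")
    for assignment in product([1, -1], repeat=len(free)):
        f = dict(labels)  # labeled vertices are clamped to their labels
        f.update(zip(free, assignment))
        cut = sum(
            W[i][j] for i in range(n) for j in range(i + 1, n) if f[i] != f[j]
        )
        if cut < best_cut:
            best, best_cut = f, cut
    return best, best_cut
```

For a chain 0-1-2-3 with a weak middle edge, the cut goes through that weak edge, so each unlabeled vertex takes the label of the labeled vertex it is strongly tied to.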

Mincut (2)
[Figure: example graph; node labels 1, 3, 5, 4, 2]

Mincut (3)
Cost function; regularizer; Mincut regularized risk problem

Harmonic Function (1)

Harmonic Function (2)
If f(x) >= 0, predict y = 1, and if f(x) < 0, predict y = −1. The harmonic function f has many interesting interpretations. For example, one can view the graph as an electric network. Each edge is a resistor with resistance 1/w_ij, or equivalently conductance w_ij. The labeled vertices are connected to a 1-volt battery, so that the positive vertices connect to the positive side, and the negative vertices connect to the ground. Then the voltage established at each node is the harmonic function; see Figure 5.3(a). The harmonic function f can also be interpreted by a random walk on the graph. Imagine a particle at vertex i. In the next time step, the particle will randomly move to another vertex j with probability proportional to w_ij: P(j | i) = w_ij / Σ_k w_ik. The random walk continues in this fashion until the particle reaches one of the labeled vertices. This is known as an absorbing random walk, where the labeled vertices are absorbing states. Then the value of the harmonic function at vertex i, f(x_i), is the probability that a particle starting at vertex i eventually reaches a positive labeled vertex.

Harmonic Function (3)
Iterative solution; closed-form solution
Unnormalized graph Laplacian matrix L
W is an (l + u) × (l + u) weight matrix, whose (i, j)-th element is the edge weight w_ij
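The iterative solution can be sketched as repeated neighborhood averaging: labeled vertices stay clamped to their labels, and every unlabeled vertex is replaced by the weighted mean of its neighbors. The code assumes a symmetric W with zero diagonal in which every unlabeled vertex has at least one neighbor; the function name and the fixed iteration count are illustrative choices. With the Laplacian L = D − W partitioned into labeled and unlabeled blocks, the closed-form solution that this iteration approaches is f_u = −L_uu^{-1} L_ul y_l.

```python
def harmonic_iterative(W, labels, iters=1000):
    """W: symmetric weight matrix; labels: dict vertex -> label value.

    Returns the list f with labeled entries fixed and unlabeled entries
    set to the weighted average of their neighbors.
    """
    n = len(W)
    f = [labels.get(i, 0.0) for i in range(n)]  # unlabeled vertices start at 0
    for _ in range(iters):
        for i in range(n):
            if i in labels:
                continue  # labeled vertices are clamped
            degree = sum(W[i])
            f[i] = sum(W[i][j] * f[j] for j in range(n)) / degree
    return f
```

On a four-vertex chain with unit weights and end labels +1 and −1, the result interpolates linearly between the two labels, which matches the voltage picture of the electric network interpretation.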

Harmonic Function (4) unnormalized graph Laplacian matrix

Manifold Regularization (1)
Problems with the existing methods:
They are transductive: only the unlabeled data already present can be labeled.
They are sensitive to noise: they assume f(x) = y for the labeled data.

Inductive; stable in noisy environments

Normalized graph Laplacian matrix L; powers of the normalized and unnormalized Laplacian matrices

Assumption of graph-based methods (1)

Spectral graph theory

A smaller eigenvalue corresponds to a smoother eigenvector over the graph. The graph has k connected components if and only if λ1 = ... = λk = 0. The corresponding eigenvectors are constant on individual connected components, and zero elsewhere.
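The connected-components statement can be checked by hand on a small graph without an eigenvalue solver: a vector that is constant on one component and zero elsewhere is mapped to the zero vector by L, so it is an eigenvector with eigenvalue 0. The helper names below are hypothetical, and the unnormalized Laplacian L = D − W is assumed.

```python
def laplacian(W):
    """Unnormalized graph Laplacian L = D - W, where D holds the vertex degrees."""
    n = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]


def matvec(L, f):
    """Matrix-vector product L f."""
    return [sum(row[j] * f[j] for j in range(len(f))) for row in L]


def smoothness(L, f):
    """Quadratic form f^T L f; equals the sum over edges of w_ij (f_i - f_j)^2."""
    return sum(fi * v for fi, v in zip(f, matvec(L, f)))
```

For a graph with two components {0, 1} and {2, 3}, the indicator vector [1, 1, 0, 0] gives L f = 0, while a vector that changes sign inside a component has a strictly positive f^T L f and is therefore less smooth.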

Graph Spectrum

Regularization term
If a_i or λ_i is close to zero, the regularization term is small. In other words, f prefers to use smooth basis vectors (those with small λ_i).
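The role of a_i and λ_i can be made explicit with a short derivation. It assumes the unnormalized Laplacian L with orthonormal eigenvectors φ_i and eigenvalues λ_i, and writes a_i for the coefficients of f in that basis; the symbol φ_i is introduced here for illustration.

```latex
\begin{align*}
L &= \sum_{i=1}^{l+u} \lambda_i \,\phi_i \phi_i^{\top},
\qquad
f = \sum_{i=1}^{l+u} a_i \,\phi_i, \\
f^{\top} L f
  &= \sum_{i=1}^{l+u} \sum_{j=1}^{l+u} a_i a_j \,\phi_i^{\top} L \,\phi_j
   = \sum_{i=1}^{l+u} \sum_{j=1}^{l+u} a_i a_j \lambda_j \,\phi_i^{\top}\phi_j
   = \sum_{i=1}^{l+u} a_i^{2} \lambda_i .
\end{align*}
```

Each term a_i² λ_i is small when either factor is small, so a function pays little for using eigenvectors with small eigenvalues and is penalized for putting weight on the non-smooth ones.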

In a graph with k connected components, the minimum of the regularization term

Performance is sensitive to the graph structure and the edge weights

Intuition
The distance from the decision boundary to the margin: the geometric margin.

Support Vector Machines

Support Vector Machines
The signed geometric margin: the distance from the decision boundary to the closest labeled instance.
Decision boundary
The maximum margin hyperplane must be unique.

Non-Separable Case (1)

Non-Separable Case (2)
Instances fall into three groups:
those that lie inside the margin, but on the correct side of the decision boundary;
those that lie on the wrong side of the decision boundary and are misclassified;
those that are correctly classified.
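The three groups can be told apart by the slack value ξ = max(0, 1 − y f(x)) used in the soft-margin SVM. The sketch below assumes labels y in {−1, +1} and a real-valued decision function output; the function names and the returned strings are illustrative.

```python
def slack(y, fx):
    """Slack variable xi = max(0, 1 - y * f(x)), which is also the hinge loss."""
    return max(0.0, 1.0 - y * fx)


def region(y, fx):
    """Describe where an instance lies relative to the margin."""
    xi = slack(y, fx)
    if xi == 0:
        return "correctly classified"  # on or outside the margin
    if xi <= 1:
        return "inside the margin, correct side"  # xi == 1 is on the boundary itself
    return "wrong side, misclassified"
```

Because the slack grows linearly with the size of the violation, summing it over the labeled data gives the hinge-loss term of the SVM objective.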

S3VM (1)

S3VM (2) the majority (or even all) of the unlabeled instances are predicted in only one of the classes

S3VM (3) Convex function The S3VM objective function is non-convex
The research in S3VMs has focused on how to efficiently find a near-optimum solution
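The non-convexity comes from the loss placed on unlabeled instances. A common formulation, assumed here, uses the hat loss max(0, 1 − |f(x)|) for unlabeled data alongside the hinge loss for labeled data; the linear model, the weights `lam1` and `lam2`, and the function names are illustrative, and the class balance constraint is omitted from this sketch.

```python
def hinge_loss(y, fx):
    """Loss on a labeled instance: max(0, 1 - y * f(x))."""
    return max(0.0, 1.0 - y * fx)


def hat_loss(fx):
    """Loss on an unlabeled instance: penalizes predictions inside the margin."""
    return max(0.0, 1.0 - abs(fx))


def s3vm_objective(w, b, labeled, unlabeled, lam1=1.0, lam2=1.0):
    """Hinge loss on labeled data + lam1 * ||w||^2 + lam2 * hat loss on unlabeled data."""
    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    return (
        sum(hinge_loss(y, f(x)) for x, y in labeled)
        + lam1 * sum(wi * wi for wi in w)
        + lam2 * sum(hat_loss(f(x)) for x in unlabeled)
    )
```

The hat loss is 1 at f(x) = 0 but 0 at both f(x) = −1 and f(x) = +1, so its value at the midpoint exceeds the average of the endpoints, which a convex function never allows.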

Logistic regression SVM and S3VM are non-probabilistic models
conditional log likelihood Gaussian distribution as the prior on w:

Logistic regression regularizer Logistic loss
The second line follows from Bayes rule, and ignoring the denominator that is constant with respect to the parameters.
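With a Gaussian prior on w, maximizing the posterior amounts to minimizing the logistic loss plus a squared-norm regularizer. The sketch below assumes labels in {−1, +1}, a linear model f(x) = w·x + b, and a weight `lam` that absorbs the prior variance; all names are illustrative.

```python
import math


def logistic_loss(y, fx):
    """log(1 + exp(-y * f(x))), computed in a numerically stable way."""
    m = y * fx
    if m > 0:
        return math.log1p(math.exp(-m))
    return -m + math.log1p(math.exp(m))


def regularized_logistic_objective(w, b, labeled, lam=1.0):
    """Sum of logistic losses on labeled data plus lam * ||w||^2."""
    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    return sum(logistic_loss(y, f(x)) for x, y in labeled) + lam * sum(
        wi * wi for wi in w
    )
```

Unlike the hinge loss, the logistic loss is never exactly zero, which reflects that the model outputs a probability for each class.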

Logistic regression

Entropy Regularizer
Logistic regression + entropy regularizer for semi-supervised learning
Intuition: if the two classes are well-separated, then the classification on any unlabeled instance should be confident: it either clearly belongs to the positive class, or to the negative class. Equivalently, the posterior probability p(y|x) should be either close to 1, or close to 0.
Entropy

Semi-supervised Logistic Regression
entropy regularizer for logistic regression
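The entropy regularizer sums the entropy of the predicted class distribution over the unlabeled instances. The sketch assumes a binary linear logistic model p(y = 1 | x) = 1 / (1 + exp(−(w·x + b))); the function names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def entropy(p):
    """Binary entropy of a distribution (p, 1 - p), with 0 * log 0 taken as 0."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0)


def entropy_regularizer(w, b, unlabeled):
    """Total prediction entropy over the unlabeled instances."""
    def f(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    return sum(entropy(sigmoid(f(x))) for x in unlabeled)
```

An unlabeled instance sitting on the decision boundary contributes the maximum entropy log 2, while one far from the boundary contributes almost nothing, so minimizing this term moves the boundary away from the unlabeled data.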

Entropy Regularizer

Assumption of the S3VM and entropy regularization methods
