
Machine Learning, Sebastiano Galazzo


1 Machine Learning: best practices and vulnerabilities
Sebastiano Galazzo, Microsoft MVP, A.I. Category

2

3 Sebastiano Galazzo Microsoft MVP
@galazzoseba

4 Best practices

5 The perceptron
In machine learning, the perceptron is a binary classifier: a function that can decide whether or not an input, represented by a vector of numbers, belongs to some specific class.

f(x) = χ(⟨w, x⟩ + b)

w is a vector of real-valued weights, the operator ⟨·,·⟩ is the scalar product, b is the 'bias' (a constant not tied to any input value), and χ(y) is the output function.
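As a sketch, the formula above can be written directly in Python; χ is taken as a step function and the example weights (a perceptron computing logical AND) are hypothetical, not from the slides:

```python
# A direct sketch of the perceptron formula, with chi as a step function
# and hypothetical example weights (a perceptron computing logical AND).

def perceptron(x, w, b):
    y = sum(wi * xi for wi, xi in zip(w, x)) + b   # <w, x> + b
    return 1 if y >= 0 else 0                      # chi: step output

w, b = [1.0, 1.0], -1.5
print(perceptron([1, 1], w, b))  # 1
print(perceptron([0, 1], w, b))  # 0
```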

6 Main evolutions
Easy way: Logistic Regression, Support Vector Machines
Pros: easy and fast to use
Cons: lower accuracy (compared to neural networks)
Hard way: Neural Networks
Pros: if you reach convergence you gain very high accuracy (state of the art)
Cons: very difficult to model; a lot of experience is required

7 Easy way
Pseudo-equation: x·α + y·β + c·δ + … + z·ω → (0, 1)
#logisticregression #svm
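A minimal sketch of that pseudo-equation as a logistic regression: a weighted sum of the inputs squashed by a sigmoid into the interval (0, 1). The weights below are hypothetical:

```python
import math

# Sketch of the "easy way": a weighted sum of inputs mapped by a
# sigmoid into (0, 1). Weights and bias here are hypothetical.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(features, weights, bias):
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)   # a probability strictly between 0 and 1

p = predict([0.3, 0.25], [2.0, -1.0], 0.1)
print(0.0 < p < 1.0)  # True
```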

8 Hard way #neuralnetwork

9 Advanced modelling of Neural Networks
Use case: predict a customer's willingness to vote for a political party.

Age | Gender            | Income | City          | Political party
30  | Male              | 38,000 | New York      | Democrat
39  | Female            | 42,000 | Page          | Republican
24  | Other             | 39,000 | San Francisco |
51  | Prefer not to say | 71,000 | Seattle       |

10 Advanced modelling of Neural Networks
Naive encoding: one input node per category of every field.
Age buckets: [0,17] [18,24] [25,35] [36,45] [46,60] [>60]
Gender: [male] [female] …
City: [urban] [rural] [suburban] …
Party: … [Democrat] [Republican]
> 20 parameters

11 Advanced modelling of Neural Networks
Compact encoding:
Age: divided by 100, so the range [0,100] maps into [0,1]
Gender: divided by 4 → 0 = Male, 0.25 = Female, 0.5 = Other, 0.75 = Prefer not to say, 1 = Unknown
Example row: [0.25][<20,000][ ][ ][ ]…
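A sketch of this compact encoding; the gender mapping values come from the slide, while the function and dictionary names are mine:

```python
# Sketch of the compact encoding described above. The 0.25-step gender
# values are from the slide; the names here are illustrative.

GENDER = {"Male": 0.0, "Female": 0.25, "Other": 0.5,
          "Prefer not to say": 0.75, "Unknown": 1.0}

def encode(age, gender):
    # age scaled from [0, 100] into [0, 1]; gender in steps of 0.25
    return [age / 100.0, GENDER[gender]]

print(encode(30, "Male"))    # [0.3, 0.0]
print(encode(39, "Female"))  # [0.39, 0.25]
```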

12 Advanced modelling of Neural Networks
Method: 1-of-(C-1) effects-coding
Standard deviation: σ = √( (1/N) Σᵢ₌₁ᴺ (xᵢ − μ)² ), where μ is the average of all values

13 Advanced modelling of Neural Networks
Example on the Age column:
mean age μ = ( … ) / 4 = 40.0
σ = √( ((x₁−40)² + (x₂−40)² + (x₃−40)² + (x₄−40)²) / 4 ) = 8.12

14 Advanced modelling of Neural Networks
V′ = (V − mean) / (std dev)
The V′ value will be used as input in place of the original value.
Given that the age average is 40.0, the standard deviation is 8.12, and our current value is 30.0:
V′ = (30 − 40) / 8.12 = −1.23
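The same standardization step, sketched in Python with the slide's example numbers (mean 40.0, std dev 8.12, value 30.0):

```python
import math

# Sketch of the standardization step V' = (V - mean) / std_dev,
# using the population standard deviation formula from the slides.

def std_dev(values):
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def standardize(v, mean, sd):
    return (v - mean) / sd

print(round(standardize(30.0, 40.0, 8.12), 2))  # -1.23
```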

15 Advanced modelling of Neural Networks
One of the parameters: Italian cities (about 8,000): Milano, Torino, Roma, …, Catania
Binary compression: 2¹³ = 8192

City    | Value
Milano  | 0,0,0,0,0,0,0,0,0,0,0,0,0,0
Torino  | 0,0,0,0,1,1,0,0,0,0,0,1,0,0
Catania | 0,1,0,0,1,0,0,0,0,1,0,1,1,0

With 13 nodes we can map 8192 values (having the same meaning/context).
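A sketch of the binary compression: 13 binary nodes can represent 2¹³ = 8192 distinct cities. The index assigned to each city here is hypothetical:

```python
# Sketch of binary compression: encode a city's index as 13 bits.
# The city-to-index assignment is hypothetical.

def to_bits(index, n_bits=13):
    # most-significant bit first
    return [(index >> i) & 1 for i in reversed(range(n_bits))]

print(2 ** 13)        # 8192
print(to_bits(0))     # e.g. Milano -> all zeros
print(to_bits(8191))  # the last representable city -> all ones
```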

16 Advanced modelling of Neural Networks
Full encoded row: [−1.23][0.25]…
The model has a 1:1 mapping between concepts and the number of neurons. Only 5 parameters! It could even be handled without a neural network, by an IF/THEN sequence in the code.

17 Advanced modelling of Neural Networks
Data must be manipulated and made understandable by the machine, not by humans!

18 Vulnerabilities

19 Vulnerabilities
Let’s imagine that we run an auction website like eBay. On our website, we want to prevent people from selling prohibited items. Enforcing these kinds of rules is hard if you have millions of users. We could hire hundreds of people to review every auction listing by hand, but that would be expensive.

20 Vulnerabilities Instead, we can use deep learning to automatically check auction photos for prohibited items and flag the ones that violate the rules. This is a typical image classification problem.

21 Vulnerabilities – Image Classification
We repeat this thousands of times with thousands of photos until the model reliably produces the correct results with an acceptable accuracy.

22 Vulnerabilities - Convolutional neural networks
Convolutional neural networks are powerful models that consider the entire image when classifying it. They can recognize complex shapes and patterns no matter where they appear in the image. In many image recognition tasks, they can equal or even beat human performance.

23 Vulnerabilities - Convolutional neural networks
With a fancy model like that, changing a few pixels in the image to be darker or lighter shouldn’t have a big effect on the final prediction, right? Sure, it might change the final likelihood slightly, but it shouldn’t flip an image from “prohibited” to “allowed”. Those are the “expectations”.

24 Vulnerabilities - Convolutional neural networks
It was discovered that this isn’t always true.

25 Vulnerabilities - Convolutional neural networks
If you know exactly which pixels to change and exactly how much to change them, you can intentionally force the neural network to predict the wrong output for a given picture without changing the appearance of the picture very much. That means we can intentionally craft a picture that is clearly a prohibited item but which completely fools our neural network.

26 Vulnerabilities - Convolutional neural networks
Why is this?

27 Vulnerabilities - Convolutional neural networks
A machine learning classifier works by finding a dividing line between the things it’s trying to tell apart. Here’s how that looks on a graph for a simple two-dimensional classifier that’s learned to separate green points (acceptable) from red points (prohibited) Right now, the classifier works with 100% accuracy. It’s found a line that perfectly separates all the green points from the red points.

28 Vulnerabilities - Convolutional neural networks
But what if we want to trick it into mis-classifying one of the red points as a green point? What’s the minimum amount we could move a red point to push it into green territory? If we add a small amount to the Y value of a red point right beside the boundary, we can just barely push it over into green territory.
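The nudge can be sketched in a few lines, assuming a hypothetical linear boundary x + y = 1 separating green (above) from red (below):

```python
# Tiny sketch of pushing a point across a linear decision boundary.
# The boundary x + y = 1 and the point are hypothetical.

def classify(x, y, w=(1.0, 1.0), b=-1.0):
    return "green" if w[0] * x + w[1] * y + b > 0 else "red"

x, y = 0.4, 0.55                 # a red point right beside the boundary
print(classify(x, y))            # red
print(classify(x, y + 0.06))     # green: a tiny nudge to Y flips it
```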

29 Vulnerabilities - Convolutional neural networks
In image classification with deep neural networks, each “point” we are classifying is an entire image made up of thousands of pixels. That gives us thousands of possible values that we can tweak to push the point over the decision line. If we make sure that we tweak the pixels in the image in a way that isn’t too obvious to a human, we can fool the classifier without making the image look manipulated. Global AI Nights - London 2019

30 Vulnerabilities - Convolutional neural networks
[photo of people] + [crafted perturbation] = classified as “Squirrel”

31 Perturbation of math model


39 Vulnerabilities – The steps
1. Feed in the photo that we want to hack.
2. Check the neural network’s prediction and see how far off the image is from the answer we want to get for this photo.
3. Tweak our photo using back-propagation to make the final prediction slightly closer to the answer we want to get.
4. Repeat steps 1–3 a few thousand times with the same photo until the network gives us the answer we want.
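The steps above can be sketched on a toy model. This is not the Keras script from the deck: the "network" is a linear model with a sigmoid output, the weights are hypothetical, and a sign-of-the-gradient step (the FGSM idea) stands in for full back-propagation:

```python
import numpy as np

# Toy sketch of the attack loop: nudge the input until the prediction
# flips. Model, weights and step size are all hypothetical.

rng = np.random.default_rng(0)
w, b = rng.normal(size=100), 0.0       # toy model weights
x = rng.normal(size=100)               # the "photo" we want to hack

def predict(x):
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

# The answer we want is the opposite of the current prediction:
target_green = predict(x) <= 0.5
for _ in range(1000):
    if (predict(x) > 0.5) == target_green:
        break                          # the network now gives our answer
    # for a linear model the gradient sign w.r.t. the input is sign(w);
    # move every "pixel" a tiny step in that direction
    x += 0.01 * np.sign(w) if target_green else -0.01 * np.sign(w)

# x is now a slightly perturbed input classified as the target class
```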

40 Vulnerabilities – Snippet of a Python script using Keras

41 How can we protect ourselves against these attacks?
Simply create lots of hacked images and include them in your training data set going forward; that seems to make your neural network more resistant to these attacks. This is called Adversarial Training and is probably the most reasonable defense to consider adopting right now.
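The adversarial-training idea can be sketched on toy data; everything below (data, labels, and the random-sign perturbation standing in for gradient-crafted noise) is illustrative, not from the deck:

```python
import numpy as np

# Sketch of adversarial training: add slightly perturbed copies of the
# training examples, keeping their ORIGINAL correct labels.

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))             # toy training inputs
y = (X.sum(axis=1) > 0).astype(float)      # toy labels

eps = 0.05
# stand-in for gradient-crafted noise: small random sign perturbations
X_adv = X + eps * np.sign(rng.normal(size=X.shape))

X_train = np.vstack([X, X_adv])            # originals + hacked copies
y_train = np.concatenate([y, y])           # labels stay the truth

print(X_train.shape)  # (400, 10)
```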

42 How can we protect ourselves against these attacks?
Pretty much every other idea researchers have tried so far has failed to be helpful in preventing these attacks.

43

44 Thanks! Sebastiano Galazzo Microsoft MVP @galazzoseba

