1
Using Quantum Fuzzy Logic to learn facial gestures of a Schrödinger Cat puppet for Robot Theater
Arushi Raghuvanshi, Prof. Marek Perkowski, 24 May 2008
2
Background: Quantum Robots
[Figure: Old Duck, Biped, Quantum Braitenberg, Mr PotatoHead (ISMVL 2007), Schrödinger's Cat (*character in Interactive Robot Theatre)]
3
Programming Robot Behaviors
Simple sequential flow with no feedback
[Diagram blocks: Behavior Selection, sound, Theatre Director Input, Initialization, Quantum or other logic controller, Measurement, Effectors]
4
Programming Robot Behaviors
Adding emotions and environmental feedback
[Diagram blocks: Behavior Selection, sound, Theatre Director Input, Initialization, Quantum or other logic controller, Measurement, Effectors, Theatre Director emotion, Environment including human audience]
5
Programming Robot Behaviors
Emotional Interactive Robots with Sensors and Feedback Modifying the Behavior
[Diagram blocks: Behavior Selection, sound, Theatre Director Input, Initialization, Quantum or other logic controller, Measurement, Effectors, Theatre Director emotion, sensors, Environment including human audience]
6
Quantum & Fuzzy Logic
Quantum Circuit (can be transformed into Quantum Fuzzy Logic by replacing gates):
NOT -> Fuzzy NOT
OR -> MAX
AND -> MIN
Fuzzy Logic with MIN & MAX operators
New operators and literals can be defined for Quantum Fuzzy Logic
7
Fuzzy Logic Example
[Figure: example circuit annotated with fuzzy signal values 0.3, 0.7, and 1]
8
Fuzzy Logic Operations
There are multiple ways to create Fuzzy operations. Two examples are below.
Example 1
NOT (a) = (1 - a), e.g. NOT (0.34) = 0.66
MIN (a, b) = if (a < b) then a else b, e.g. MIN (0.3, 0.75) = 0.3
MAX (a, b) = if (a > b) then a else b, e.g. MAX (0.63, 0.83) = 0.83
Example 2
MIN (a, b) = a * b, e.g. MIN (0.3, 0.7) = 0.21
MAX (a, b) = (a + b) - a*b, e.g. MAX (0.3, 0.7) = 0.3 + 0.7 - 0.21 = 0.79
As in Example 2, MAX and MIN may be misnomers. They can be called OR and AND operations instead.
The MAX of Example 2 follows from De Morgan's rule:
a MAX b = NOT (NOT (a) MIN NOT (b))
= NOT ((1-a)*(1-b))
= NOT (1-a-b+a*b)
= 1-1+a+b-a*b
= a+b-a*b
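The two operator families above can be checked numerically. The following is a minimal Python sketch; the function names (`min_and`, `product_and`, and so on) are illustrative choices for this example, not names from the slides.

```python
import math


def fuzzy_not(a):
    """Fuzzy NOT: NOT(a) = 1 - a."""
    return 1.0 - a


# Example 1: comparison-based operators.
def min_and(a, b):
    return a if a < b else b


def max_or(a, b):
    return a if a > b else b


# Example 2: product-based operators.
def product_and(a, b):
    return a * b


def prob_sum_or(a, b):
    return a + b - a * b


def de_morgan_or(and_op, a, b):
    """Build the OR that pairs with a given AND: NOT(NOT(a) AND NOT(b))."""
    return fuzzy_not(and_op(fuzzy_not(a), fuzzy_not(b)))


if __name__ == "__main__":
    a, b = 0.3, 0.7
    print("Example 1:", min_and(a, b), max_or(a, b))
    print("Example 2:", product_and(a, b), prob_sum_or(a, b))
    # De Morgan's rule reproduces each family's own OR operator.
    print(math.isclose(de_morgan_or(product_and, a, b), prob_sum_or(a, b)))
    print(math.isclose(de_morgan_or(min_and, a, b), max_or(a, b)))
```

Passing the AND operator as an argument makes it easy to see that the same De Morgan construction yields the matching OR in both families.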
9
Representing Fuzzy Values on Bloch Sphere
Fuzzy values can be represented in different ways on the Bloch Sphere
The simplest way is along the meridian (as shown on left)
After measurement, the value can be 0, 1, or anywhere in between
Other mechanisms (e.g. values inside the Bloch Sphere, or parallels of latitude) can also be used
[Figure: Bloch sphere with axes X, Y, Z, poles |0⟩ (Z = 1) and |1⟩ (Z = -1), and measurements 0.15, 0.5, 0.8, 1]
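One way to place a fuzzy value on the meridian can be sketched in Python. This assumes the fuzzy value f is the probability of measuring |1⟩ and that the state has real amplitudes cos(θ/2) and sin(θ/2); the function names are hypothetical.

```python
import math


def fuzzy_to_theta(f):
    """Polar angle (radians from the |0> pole) for a fuzzy value f in [0, 1]."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("fuzzy value must lie in [0, 1]")
    return 2.0 * math.asin(math.sqrt(f))


def theta_to_amplitudes(theta):
    """Real amplitudes (for |0>, for |1>) of a state on the meridian."""
    return math.cos(theta / 2.0), math.sin(theta / 2.0)


def bloch_z(theta):
    """Z coordinate of the state on the Bloch sphere (1 at |0>, -1 at |1>)."""
    return math.cos(theta)


if __name__ == "__main__":
    for f in (0.15, 0.5, 0.8, 1.0):
        theta = fuzzy_to_theta(f)
        amp0, amp1 = theta_to_amplitudes(theta)
        print(f, round(math.degrees(theta), 2), round(amp1 ** 2, 4), round(bloch_z(theta), 4))
```

Under this assumption the Z coordinate equals 1 - 2f, so fuzzy values spread linearly along the Z axis between the two poles.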
10
Quantum Fuzzy Literals
Rotation Around Y Axis
Rotation Around X Axis
Phase Shift (270 degree rotation around Z axis)
We use this to define the Fuzzy NOT operations (other literals can be used as well).
[Figure: Bloch sphere with axes X, Y, Z]
11
Quantum Fuzzy ‘NOT’ operator
The inverter is defined in exactly the same way as in quantum logic:
Fuzzy Quantum NOT: $\mathrm{NOT}(\alpha|0\rangle + \beta|1\rangle) = \beta|0\rangle + \alpha|1\rangle$
where the squared magnitude of the (in general complex) amplitude associated with ket $|1\rangle$ is the equivalent of the fuzzy value in the interval [0, 1].
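The inverter can be illustrated with a short Python sketch that stores a qubit as a pair of complex amplitudes. The helper names are assumptions made for this example.

```python
import math


def quantum_not(state):
    """Swap the amplitudes of |0> and |1> (the quantum inverter)."""
    alpha, beta = state
    return (beta, alpha)


def fuzzy_value(state):
    """Fuzzy value = squared magnitude of the amplitude of |1>."""
    return abs(state[1]) ** 2


if __name__ == "__main__":
    # A normalized state whose fuzzy value is 0.34, with a complex |1> amplitude.
    state = (math.sqrt(0.66), 1j * math.sqrt(0.34))
    print(fuzzy_value(state), fuzzy_value(quantum_not(state)))
```

For a normalized state, swapping the amplitudes turns a fuzzy value a into 1 - a, which matches the classical Fuzzy NOT.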
12
Quantum Fuzzy ‘MIN’ operator
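Quantum Fuzzy MIN = Davio($\alpha_1|0\rangle + \alpha_2|1\rangle$, $\beta_1|0\rangle + \beta_2|1\rangle$, $|0\rangle$), where the Davio gate is here the Toffoli gate.
The input is the Kronecker product of the 3 parallel input lines:
$(\alpha_1|0\rangle + \alpha_2|1\rangle) \otimes (\beta_1|0\rangle + \beta_2|1\rangle) \otimes |0\rangle = \alpha_1\beta_1|000\rangle + \alpha_1\beta_2|010\rangle + \alpha_2\beta_1|100\rangle + \alpha_2\beta_2|110\rangle$
The output is:
$\alpha_1\beta_1|000\rangle + \alpha_1\beta_2|010\rangle + \alpha_2\beta_1|100\rangle + \alpha_2\beta_2|111\rangle = (\alpha_1\beta_1|00\rangle + \alpha_1\beta_2|01\rangle + \alpha_2\beta_1|10\rangle)|0\rangle + (\alpha_2\beta_2|11\rangle)|1\rangle$
=> The probability of measuring '1' is $|\alpha_2\beta_2|^2$
[Figure: Toffoli gate matrix over the basis states 000 to 111, multiplied by the input vector to give the output vector]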
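The MIN operator can be simulated on a plain 8-element state vector. This Python sketch assumes real, normalized input amplitudes with fuzzy value equal to the probability of |1⟩, and the basis ordering 000, 001, ..., 111; the helper names are illustrative.

```python
import math


def kron(u, v):
    """Kronecker product of two amplitude vectors."""
    return [x * y for x in u for y in v]


def toffoli(state):
    """Toffoli gate on 3 qubits: flips the third qubit when the first two are 1.

    In the basis 000..111 this only exchanges the amplitudes of |110> and |111>.
    """
    out = list(state)
    out[6], out[7] = out[7], out[6]
    return out


def prob_third_qubit_one(state):
    """Probability of measuring 1 on the third (rightmost) qubit."""
    return sum(abs(amp) ** 2 for index, amp in enumerate(state) if index & 1)


def quantum_fuzzy_min(a, b):
    """Encode fuzzy values a and b, apply the gate, read out the third qubit."""
    qubit_a = (math.sqrt(1.0 - a), math.sqrt(a))
    qubit_b = (math.sqrt(1.0 - b), math.sqrt(b))
    ancilla = (1.0, 0.0)  # third input line fixed to |0>
    state = kron(kron(qubit_a, qubit_b), ancilla)
    return prob_third_qubit_one(toffoli(state))


if __name__ == "__main__":
    print(quantum_fuzzy_min(0.3, 0.7))
```

With this encoding the measured probability is a * b, so the quantum MIN behaves like the product-style MIN of Example 2 rather than the comparison-based one.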
13
Quantum Fuzzy ‘MAX’ operator
The Fuzzy Quantum Maximum operator is derived from De Morgan's rule: A max B = NOT (NOT (A) min NOT (B)).
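As a sketch of this De Morgan construction, the Python below inverts both inputs, applies a Toffoli gate as the MIN, and inverts the result. It assumes real input amplitudes, a third line initialized to |0⟩, and the basis ordering 000 to 111; all function names are hypothetical.

```python
import math


def apply_x(state, qubit):
    """Invert one qubit (0 = leftmost) of a 3-qubit state vector."""
    mask = 1 << (2 - qubit)
    return [state[index ^ mask] for index in range(8)]


def toffoli(state):
    """Exchange the amplitudes of |110> and |111>."""
    out = list(state)
    out[6], out[7] = out[7], out[6]
    return out


def quantum_fuzzy_max(a, b):
    """A max B = NOT(NOT(A) min NOT(B)), read out on the third qubit."""
    qubit_a = (math.sqrt(1.0 - a), math.sqrt(a))
    qubit_b = (math.sqrt(1.0 - b), math.sqrt(b))
    state = [x * y * z for x in qubit_a for y in qubit_b for z in (1.0, 0.0)]
    state = apply_x(apply_x(state, 0), 1)  # NOT on both inputs
    state = toffoli(state)                 # MIN written to the third qubit
    state = apply_x(state, 2)              # NOT on the result
    return sum(abs(amp) ** 2 for index, amp in enumerate(state) if index & 1)


if __name__ == "__main__":
    print(quantum_fuzzy_max(0.3, 0.7))
```

The measured probability works out to a + b - a*b, the same expression as the MAX of Example 2.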
14
Quantum Fuzzy Logic in Robots
Fuzzy Value Sensors
Light Sensors: 0 = completely dark, 0.5 = semi-dark, 1 = completely bright
Sound Sensors: 0 = pin-drop silence, 0.5 = normal noise (people talking), 1 = loud crash
Image Sensors
Sensors -> Quantum Fuzzy Logic -> Motor Controls causing output behaviors
15
Combination of Genetic Algorithm and Quantum Fuzzy Logic
Back to Robot Theatre…
16
Synchronizing Lips with Speech
[Figure: two images labeled "Want This" and "Not This"]
17
Traditional Methods
Use a mapping of phonetic symbols to lip shapes (as shown on left)
Sound streams can be parsed to generate phonetic symbols
The methods are language dependent (i.e. a different mapping for each language)
They need to be modified for speed and style of speaking
18
Using Genetic Algorithms
[Diagram: Sound Input (A) -> GA Engine -> (B) Sequence representing lip movements matching input stream 'A' -> ESRA Robot shows lip movements]
Initial set of genomes representing lip movements (initial population for GA): these are dynamically generated by the program
Input to Fitness Function: user evaluation (interactive)
*** The matching function is dynamic, so it doesn't matter if people have different accents, talk slower/faster, etc.
19
Genome
A genome (or chromosome) is a pattern that corresponds to a behavior. A possible solution to the given problem can be encoded to create a genome.
In genetic algorithms, a set of random genomes is created. When decoded, these genomes represent possible solutions to the given problem.
In my experiment, a genome is an encoded string that represents a sequence of lip movements. For example: 49__9__31__9__46_1640__
When decoded, this code represents the lip motion for the phrase “Hi I am a robot.”
20
Encoding Lip Shapes for Defining the Genome
Code    Upper   Lower
0, 1    127     127
2       87      173
3       170     120
4       140     56
5       0       0
6       0       167
7, 8    80      45
9       100     45
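The encoding table can be used to decode a genome string into motor positions. The Python sketch below assumes that each character of the genome is one lip-shape code and that '_' is a pause with no new motor target (represented here as `None`); both are assumptions for illustration, not details confirmed by the slides.

```python
# (upper lip motor, lower lip motor) positions from the encoding table.
LIP_SHAPES = {
    "0": (127, 127), "1": (127, 127),
    "2": (87, 173),
    "3": (170, 120),
    "4": (140, 56),
    "5": (0, 0),
    "6": (0, 167),
    "7": (80, 45), "8": (80, 45),
    "9": (100, 45),
}
PAUSE = "_"


def decode_genome(genome):
    """Turn a genome string into a list of motor-position pairs (None = pause)."""
    frames = []
    for symbol in genome:
        if symbol == PAUSE:
            frames.append(None)
        elif symbol in LIP_SHAPES:
            frames.append(LIP_SHAPES[symbol])
        else:
            raise ValueError(f"unknown lip-shape code: {symbol!r}")
    return frames


if __name__ == "__main__":
    for frame in decode_genome("49__9__31__9__46_1640__"):
        print(frame)
```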
21
Fitness Function
The better the robot solves the problem, the higher its fitness score.
When synchronizing sound and lip motion, the fitness function would be a user input.
To test the Genetic Algorithm, I calculated the fitness score by comparing each genome to the best solution. The best solution was determined by the traditional method.
22
Fitness Function Algorithm
Best Genome (for calculating Fitness Score): 1 4 9 5 7 _ 3 8
[Figure: element-by-element comparison of the Genome Under Test with the Best Genome; visible values: Genome Under Test 5 3 _ 8, X = 4 1 9 3, 9-X = 5 8 6 9]
Find the difference X for each corresponding element. Closeness implies a better match (4-3 is better than 1-5).
Pauses '_' must match in position to get any score, so a pause scores either 0 or 9.
Each element scores 9-X, so a higher number is now better.
Total Score = 51
Fitness Score % = (Total/TotalPossible)*100 = 51/72 * 100 = 70.83%
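The scoring rule can be sketched in Python as follows. The genome under test in the demo is a hypothetical one chosen for illustration, since the slide's own test genome is only partly visible; function and constant names are likewise assumptions.

```python
PAUSE = "_"
MAX_ELEMENT_SCORE = 9


def element_score(best, test):
    """Score one position: 9 - |difference|, or all-or-nothing for pauses."""
    if best == PAUSE or test == PAUSE:
        return MAX_ELEMENT_SCORE if best == test else 0
    return MAX_ELEMENT_SCORE - abs(int(best) - int(test))


def fitness_percent(best_genome, test_genome):
    """Fitness Score % = (Total / TotalPossible) * 100."""
    if len(best_genome) != len(test_genome):
        raise ValueError("genomes must have the same length")
    total = sum(element_score(b, t) for b, t in zip(best_genome, test_genome))
    return 100.0 * total / (MAX_ELEMENT_SCORE * len(best_genome))


if __name__ == "__main__":
    best = "14957_38"
    hypothetical_test = "53957_38"  # illustrative only, not the slide's genome
    print(round(fitness_percent(best, hypothetical_test), 2))
```

For an 8-element genome the total possible score is 8 * 9 = 72, which is the denominator used in the slide's percentage.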
23
Selection
The higher the fitness score, the higher the probability of being selected.
Selection methods include the Roulette Wheel, Tournament Selection, and Truncation Selection.
In my experiment, I used a Roulette Wheel for selection.
Tournament Selection: two chromosomes are picked at random and the one with the higher fitness score is kept; this is repeated 2 times per reproduction pair.
Truncation Selection: a fixed percentage of the weakest chromosomes is deleted, so the weakest cannot be selected.
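Roulette Wheel selection can be sketched in a few lines of Python. This is a generic illustration of the technique, not the GALib selector used in the experiment; the function name and the `rng` parameter are assumptions.

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    if total <= 0:
        # No usable fitness information: fall back to a uniform pick.
        return rng.choice(population)
    spin = rng.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if spin < running:
            return individual
    return population[-1]  # guards against rounding at the upper edge


if __name__ == "__main__":
    rng = random.Random(0)
    population = ["weak", "medium", "strong"]
    fitnesses = [10.0, 30.0, 60.0]
    picks = [roulette_select(population, fitnesses, rng) for _ in range(1000)]
    print({name: picks.count(name) for name in population})
```

Unlike truncation, every individual with a nonzero fitness keeps some chance of being chosen.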
24
Crossover
When two chromosomes from the group are selected, they are combined to create a new genome.
Depending on the crossover rate, the bits from each chosen genome are crossed at a randomly chosen point.
The higher the crossover rate, the more likely it is that a crossover will occur.
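A single-point crossover governed by a crossover rate might look like the Python sketch below. It treats genomes as equal-length strings, and the default rate of 0.9 mirrors the 90 percent rate reported later in the deck; the function name and `rng` parameter are illustrative.

```python
import random


def single_point_crossover(parent_a, parent_b, crossover_rate=0.9, rng=random):
    """Return two children; with probability crossover_rate the tails are exchanged."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2 or rng.random() >= crossover_rate:
        return parent_a, parent_b  # no crossover this time
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the genome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


if __name__ == "__main__":
    rng = random.Random(3)
    print(single_point_crossover("11111111", "22222222", rng=rng))
```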
25
Mutation
Depending on the mutation rate, chosen bits of the genome are changed. The higher the mutation rate, the more likely it is that a bit will be changed.
Shown to the right are many types of mutation.
26
Mutation
In my experiment I used two different mutation functions:
Swap mutation
myMutator: I created my own mutator, which changes a single bit rather than swapping two bits.
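The difference between the two mutators can be sketched in Python. These are generic illustrations, not the GALib swap mutator or the author's myMutator; the symbol alphabet (digits plus '_') and the function names are assumptions.

```python
import random

ALPHABET = "0123456789_"  # assumed symbol set: lip-shape codes plus pause


def swap_mutation(genome, rng=random):
    """Exchange the symbols at two randomly chosen positions."""
    if len(genome) < 2:
        return genome
    i, j = rng.sample(range(len(genome)), 2)
    symbols = list(genome)
    symbols[i], symbols[j] = symbols[j], symbols[i]
    return "".join(symbols)


def single_symbol_mutation(genome, rng=random):
    """Replace the symbol at one random position with a different symbol."""
    if not genome:
        return genome
    i = rng.randrange(len(genome))
    replacement = rng.choice([s for s in ALPHABET if s != genome[i]])
    return genome[:i] + replacement + genome[i + 1:]


if __name__ == "__main__":
    rng = random.Random(5)
    print(swap_mutation("49__9__31", rng))
    print(single_symbol_mutation("49__9__31", rng))
```

A swap only rearranges symbols already present in the genome, while a single-symbol change can introduce a symbol the genome did not contain.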
27
Terminating Conditions
This generational process is repeated until a termination condition has been reached. Common terminating conditions are:
* A solution is found that satisfies minimum criteria
* Fixed number of generations reached
* Allocated budget (computation time/money) reached
* The highest ranking solution's fitness is reaching or has reached a plateau such that successive iterations no longer produce better results
* Manual inspection
* Combinations of the above
I used a fixed number of generations as the ending criterion. The default was 4,000 generations; I also experimented with changing the number of generations.
28
Basic Genetic Algorithm Flow
1. Initialize population
2. Select individuals for mating based on Fitness Function
3. Mate individuals to produce offspring
4. Mutate offspring
5. Insert offspring into population
6. Are stopping criteria satisfied? If not, return to step 2; if so, finish
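The flow above can be put together as a compact Python sketch. It is not the GALib-based program from the experiment: it assumes string genomes over digits plus '_', an automated fitness against a known target, full replacement of the population each generation, and a small default generation count so that it runs quickly. All names and defaults are illustrative.

```python
import random

ALPHABET = "0123456789_"  # assumed symbol set


def fitness(genome, target):
    """Sum of per-position scores: 9 - |difference|, pauses all-or-nothing."""
    score = 0
    for g, t in zip(genome, target):
        if g == "_" or t == "_":
            score += 9 if g == t else 0
        else:
            score += 9 - abs(int(g) - int(t))
    return score


def evolve(target, pop_size=30, generations=200,
           crossover_rate=0.9, mutation_rate=0.01, seed=0):
    rng = random.Random(seed)
    n = len(target)
    # 1. Initialize population with random genomes.
    population = ["".join(rng.choice(ALPHABET) for _ in range(n))
                  for _ in range(pop_size)]
    # 6. Stopping criterion: a fixed number of generations.
    for _ in range(generations):
        # 2. Fitness-proportional (roulette-style) selection; +1 avoids all-zero weights.
        weights = [fitness(g, target) + 1 for g in population]
        offspring = []
        while len(offspring) < pop_size:
            a, b = rng.choices(population, weights=weights, k=2)
            # 3. Single-point crossover.
            if n > 1 and rng.random() < crossover_rate:
                point = rng.randint(1, n - 1)
                a = a[:point] + b[point:]
            # 4. Per-symbol mutation.
            child = "".join(rng.choice(ALPHABET) if rng.random() < mutation_rate else s
                            for s in a)
            offspring.append(child)
        # 5. Offspring replace the previous generation.
        population = offspring
    return max(population, key=lambda g: fitness(g, target))


if __name__ == "__main__":
    target = "49__9__31"
    best = evolve(target)
    print(best, fitness(best, target), "of", 9 * len(target))
```

Fixing the random seed makes runs repeatable, which helps when comparing parameter settings such as crossover rate or population size.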
29
GA for Lip Synchronization
[Diagram: original sound input (A) -> GA Engine -> (B) Sequence representing lip movements matching input stream 'A' -> ESRA Robot shows lip movements]
Initial set of genomes representing lip movements (initial population for GA): these are dynamically generated by the program
Automated Mode: test sound input, with a matching sequence for automating Fitness Fn evaluation (length)
Interactive Mode: interactive input to Fitness Function
In a real application, the input to the Fitness Function is dynamic and language independent, and it doesn't matter if people have different accents, talk slower/faster, etc.
30
Genetic Algorithm Behaviors
33
32- less time, but higher objective score
34
GA Results thus far
Created a self-learning robot that can learn how to synchronize sounds and words with appropriate facial expressions.
Finding the best solution depends on different conditions. In general, I noticed that the functions that gave the higher objective scores tended to take more time to complete 4,000 generations.
35
Ongoing work
Combining Quantum Fuzzy Logic with Robotic Theatre.
Modify the body language (hand and arm movements) based on environmental sensors
Sound sensors (fuzzy value input) to detect noisy or quiet environments and modify behavior
Light sensors (fuzzy value input) to detect day and night and modify behavior
Quantum Fuzzy Schrödinger Cat sitting on a Quantum Fuzzy Braitenberg vehicle, arguing with Einstein, singing a song, and going crazy.
36
Cat Singing
A lively little quantum went darting through the air,
Just as happy quanta go speeding everywhere …
37
Thank You
38
Genetic Algorithms
A genetic algorithm is a search technique used in computing to find exact or approximate solutions to optimization and search problems.
Genetic algorithms are a particular class of evolutionary algorithms that use techniques such as inheritance, mutation, selection, and crossover.
39
Traditional Method (Without Genetic Algorithms)
[Diagram: Audio -> Speech Recognition (language dependent) -> phonetic letters, punctuation, and syllables -> static matching of input to the correct lip motion -> sequence representing lip movements matching the audio input string -> ESRA Robot shows lip movements]
*** Since the matching function is static, it will have to be entirely recoded for different people: they have different accents, talk slower/faster, etc.
40
ESRA Robot Facial Expressions
ESRA Robot has several motors for lips, eyelids, and arm movements
I am primarily using the lip motors for my experiment
The specific positions of the lip motors define the shape of the lips
The shape can be matched with speech
[Figure labels: Motor for Eye Lids, Motor for Upper Lip, Motor for Lower Lip]
41
Crossover
Single Point Crossover
Double Point Crossover gives any two points on each genome an equal chance of being split up. In my experiment, I used a single point crossover with a 90 percent crossover rate.
42
Procedure
1. Create a robot with a face, a mouth, and two motors for lip movement.
2. Assign shapes of the mouth for every sound/syllable.
3. Encode these shapes using numbers and characters.
4. Create a random set of genomes for a given input.
5. Depending on the number of encodings that match the appropriate sound, a fitness score will be assigned to each genome.
6. Using a Roulette Wheel, genomes will be selected for reproduction. The higher the fitness score, the higher the probability of being selected for reproduction.
7. To create a new set of offspring, one random crossover point will be chosen for each pair of genomes.
8. There will also be a 1% mutation rate.
9. A new set of genomes (the offspring) is created.
10. Repeat steps 5-9 for a fixed number of generations.
11. Change the Genetic Algorithm parameters and record the dependent variables.
43
Program
I used GALib from MIT as a library in my program.
I designed my own genome
Defined my fitness function
Created an initializer function
Created a mutator function
Program link: Project file EsraGA, main C++ source code
44
Data
Data Tables with swap mutator
Data Tables with my mutator
45
Abstract
The purpose of this project is to create efficient Genetic Algorithms for robotic learning and the synchronization of speech and visual expressions. This experiment uses an ESRA robot, which has a set of motors to control facial expressions including lip motion and eyebrow motion. Emotions can be created using facial expressions and arm motion; however, for the simplicity of this experiment, the focus is on lip motion. Various shapes of the mouth are assigned to the appropriate sounds and encoded. Using these encodings, I create a random set of chromosomes. I then use Genetic Algorithms so the robot can develop the lip motion to correspond with spoken text. Next, I use the Genetic Algorithm to test how long it takes to synchronize text and lip motion for varying length, crossover rate, mutation rate, number of generations, population size, and number of offspring. Overall, I concluded that my hypothesis was supported because, using genetic algorithms for behavioral evolution, I was able to create a robot that can learn how to synchronize sounds and words with appropriate facial expressions. After testing various parameters, I concluded that functions that return higher objective scores take a longer time to complete. Some applications of this project include translating text into lip motion for animated movies and humanoid robots. The next step in this project would be to try different parameters such as convergence and migrating populations. I could also develop body language as well as lip motion.
46
Applications
With a program using genetic algorithms, matching lip movements to speech is language independent. Also, one can use the same program for different people. In the traditional style, the tables would have to be recoded because everyone has an individual accent, body language, and speaking speed.
This program can be used to match text and lip motion for movie animation and humanoid robots. The animation industry would not have to hand-draw lip motion or use a databank of words. This would be most effective if I used a combination of pre-programmed lip codes and user inputs.
This could be used to convert sounds into lip motion so deaf people can understand what is being said in situations in which they can’t see the person who is speaking.
It could also be used in reverse to convert lip motion into text. This could be useful in documenting presentations, speeches, and even court cases. It could also be used to create subtitles in movies.
47
Representing Fuzzy Values on Bloch Sphere
Show L1 through L5 options