MACHINE vs. HUMAN: Who Wins at the DINO GAME?


Today we are going to use artificial intelligence to train a machine to play the Dinosaur Game in Google Chrome and see if it does better than a human being. We've hacked the game a few times here at Manual do Mundo. First I tried to do it with an Arduino: the Arduino would hit the computer key and try to jump. Then I switched and made the Arduino press the key virtually. And then we tricked it even more, putting Sonic in to play in place of the dinosaur and changing the game's code. We managed to beat the game by cheating a little bit. Only now we want to beat the game for real, without cheating, by teaching a machine to play better than a human being. We know it is possible, but can we do it?

You're aware that artificial intelligence has become very popular with the arrival of ChatGPT, which writes text for us, and there are a lot of image creators and search engines that use artificial intelligence too. All these technologies use neural networks, a technology inspired by our brain. So now, with Petrobras sponsoring our SuperTech series, we are going to use a neural network to teach the machine how to play. But first, let's understand what a neural network is, right?

As you may have noticed, this is obviously a mosquito, an Aedes aegypti, and this is an eye, a brain and an arm. The moment we see this dengue mosquito, the first reaction is to slap this animal that we know is dangerous and wants to kill. Really simplifying what happens inside our brain: the moment we see the mosquito and slap it, what happens? Our eye is connected to the brain by neurons. Neurons are cells that we have all over the body, but the brain has a lot of them; that's where most of the neurons live. And it's a very special cell because it can connect with other neurons and transmit and receive electrical impulses. So these neurons that come from the eye communicate and connect with other neurons that are here in our brain.

Just to give you an idea, an adult has more or less 80 billion neurons. And since one neuron can connect with many others, those connections number in the trillions. So, come on: you saw the mosquito, and how will your brain perceive that? It notices because here behind the eye we have a kind of sensor, the retina, which communicates with the brain through the optic nerve, which is made of neurons. So let's assume that a signal comes into your brain and reaches a neuron. This neuron that received the signal can pass it forward or not. If it passes it forward, we say that this neuron was activated. So, let's assume that this neuron here has been activated and it passes the signal to two other neurons. And then the next neurons can be activated or not.

So, let's pretend that this one activated and passed this information forward, activating other neurons that are in front of it. And these other neurons can also activate the next ones or not. And so the information goes on. By now you must have noticed that we have already created a tangle of electrical signals going in all directions. The path that these electrical signals take inside your head depends on a lot of things, including the things you learned during your life and some instincts that you were born with; you were born with a series of configurations in your brain. So, let's agree that you don't even need to know exactly how you identify this mosquito. The moment your brain realizes that there is a mosquito there, it sends an electrical impulse to the nerve in your arm, which triggers the muscle here and slaps the mosquito.

When scientists studying the human body began to realize that this is how the brain works, scientists studying computers began to think, "Wow, wait a minute. Is it not possible to create some artificial neurons and do this on the computer?" Yes, it is possible. In our body, this is called a neural network; in this case it is a natural neural network. And this one is an artificial neural network that we created to explain to you how this works. The function of this neural network is to tell the difference between an apple and an orange. So, if we put an apple in front of it, it knows it's an apple. If you put an orange, it knows it's an orange. But for now it knows nothing. It will have to learn, and you will understand how it learns.

These little balls are the neurons. On the left we have three neurons: a layer of neurons that is responsible for receiving information from the outside world. So, there we have a sensor that reports whether yellow is arriving, whether red is arriving, or whether there is roughness in the fruit that we are analyzing.
On the right side, we have another layer that indicates the probability that it is an apple or an orange. This probability is a number that the computer shows us. And the big secret is in the connections. You can see that the first layer is completely connected to the second layer; there are six little connections there. These connections can be strong or weak. Let me explain. Imagine that we put something very red there in front. This first neuron here, the red one, will register that there is something that is almost completely red. This connection between red and apple has to be strong. Look, here I can control whether the connection is strong or weak. It has to be strong; it has to say "wow, there's a very high probability that it's an apple if it's red".

But these connections can work the other way around too: some information can show which object it is not. Roughness, for example. Look at the apple. If you see anything that is rough, it's probably not an apple. So we can take this roughness connection, assume that there is something more or less rough here, and say to the apple "look, if there is something rough, reduce the probability of it being an apple". With the orange, it's the opposite: if you have something that's rough, it's more likely to be an orange. You can see that as I change these connections, the computer is already beginning to understand whether it is an orange or an apple.

But when we create software from scratch and it needs to be trained to understand something, there is no person there saying whether each connection has to be weak or strong. The computer figures it out on its own, so at first these connections come all scrambled, all messed up. Let's start training the machine. What I'm going to do here is supervised training, that is, there's a human being telling the machine whether it's right or not. And with that, it will get it right more and more often until at some point it has learned.

So, let's assume that our neural network is going to read this apple. I'm going to guess the numbers here, okay? It is 90% red, it has a little yellow too, about 20% yellow, and its roughness is very low, right? It's practically smooth, but it has a little bit of roughness, 10% roughness. And the machine is telling me that it has a 56% chance of being an orange and a 20% chance of being an apple. That is, it doesn't know anything; it thinks the apple is an orange. When that happens, I, the human being doing the training, just say to the machine "no, this is an apple. You're wrong". The machine will change the parameters of the connections a little bit to try to get it right. So, let's assume that the network here is giving a lot of importance to the yellow in the apple.
But yellow is not such a big indication of an apple, so we can tone that down a bit. On the other hand, the red connection is negative. Let's push it a little bit more toward the positive. Now we have a better chance: the network is already saying it's an apple, but it is also saying quite strongly that it's an orange.

Let's run another training cycle here. I put in this orange, and it has a lot of yellow. There's a bit of red too; after all, the color orange is yellow mixed with red. But not so much red. It has quite a bit of roughness, and the network told me it's an apple again. Then I'm going to tell the machine that it's not an apple, my dear, it's an orange. Once again, the machine will try to change these connections. Let's assume that the roughness connection is negative in the case of the orange. No: if it has roughness, it's probably because it's an orange. And it's large in the case of the apple. No: apples do not have roughness. So now it's saying it's an orange. But you have to remember that when I change these parameters, the change applies to both apples and oranges.

Let's put in an apple again. This other apple has quite a lot of yellow. More or less half of the apple is yellow, but it also has almost no roughness, right? So, with 0.1 of roughness, it's saying it's still an apple. What happens in this case? I tell the machine that it's correct. Yes, it's an apple, but there's a problem there, right? There is a greater chance of it being an apple, but the chance of it being an orange is still too big, so these parameters can be improved. And the machine can do that. For example, it is giving too much weight to red in the case of the orange. The same goes for red in relation to the apple: the connection is not good enough, because if it's red, it has a great chance of being an apple. So let's improve this connection, make it stronger.

Let's add one more orange here to finish off. This image of an orange is quite different: it has a lot of yellow, it has a little red, but I would say not so much, 0.2, and it has medium roughness. The computer was really in doubt, it's almost evenly divided, but if it had to choose, it would say apple. So, let's associate yellow with orange more. Look how cool: now there is an 85% chance of being an orange, and the chance of being an apple has decreased. And I can even strengthen the roughness connection a little: if it has roughness, there is an even greater chance of orange. Look, 90% orange, 56% apple. Let's go back to that apple from the beginning to see if it still gets the apple right, because we changed everything and it might start to miss. 96% apple, 48% orange. I think the machine has learned.
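
The training cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program shown in the video: the sigmoid scoring, the update rule (a plain delta rule), the learning rate, and every fruit measurement except the first apple (90% red, 20% yellow, 10% roughness) are guesses made for the example. It keeps the same shape as the demo, though: three input neurons, two output neurons, six connections and no other parameters.

```python
import math
import random


def sigmoid(x):
    """Squash any number into the range 0..1 so it reads like a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, features):
    """Return (apple_score, orange_score) for (red, yellow, roughness) inputs."""
    return tuple(
        sigmoid(sum(w * f for w, f in zip(row, features)))
        for row in weights
    )


def train(weights, examples, learning_rate=0.5, epochs=2000):
    """Supervised training: nudge each connection after every labeled example."""
    for _ in range(epochs):
        for features, targets in examples:
            outputs = predict(weights, features)
            for j, (out, target) in enumerate(zip(outputs, targets)):
                error = target - out  # positive: score too low; negative: too high
                for i, f in enumerate(features):
                    weights[j][i] += learning_rate * error * f
    return weights


# (red, yellow, roughness) -> (is_apple, is_orange).
# Only the first apple uses numbers from the transcript; the rest are assumed.
EXAMPLES = [
    ((0.9, 0.2, 0.1), (1, 0)),  # red apple
    ((0.3, 0.9, 0.7), (0, 1)),  # rough, mostly yellow orange
    ((0.5, 0.5, 0.1), (1, 0)),  # half-yellow apple
    ((0.2, 0.9, 0.5), (0, 1)),  # orange with medium roughness
]

rng = random.Random(0)
# Six scrambled connections to start with: 2 output neurons x 3 input neurons.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
train(weights, EXAMPLES)

apple, orange = predict(weights, (0.9, 0.2, 0.1))
print(f"apple={apple:.2f} orange={orange:.2f}")
```

As in the demo, the two scores are computed independently, so they do not have to add up to 100%.
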
What is important for us to understand is that all of this learning happened in the connections between neurons. So, in the end, machine learning is just fine-tuning the connections between neurons. If we were to store this learning in a file, we would only have a file with six numbers inside it, one per connection: 0.3, 0.5, -0.5 and so on. With only six numbers, we have stored a lesson that lets us tell an apple from an orange. One important thing is that this was just an example: I only have two layers of neurons and there are very few neurons. In real life, we can have many layers of neurons and each layer can have many neurons. If it has more than three layers, we already say that it is a deep network.

With that, you can already learn complex things. And to train these deep networks, we use a kind of learning called deep learning. So, the next time you hear this expression in English, you already know that it is a neural network with more than three layers of neurons that is being trained there. Even these image generators with artificial intelligence and ChatGPT, which everyone is talking about, are trained using deep learning. What's more, half of the programming on this little screen that I used to show you how the neural network works was done by ChatGPT itself.

And it stands to reason that if we are talking about cutting-edge technology, Petrobras could not be left out. They do use a lot of deep learning. For example, with sensors that they have installed under the water, they can see what type of soil it is, or whether it has any specific chemical element… or, by measuring the pressure and temperature of a well in real time, they can predict whether there may be a critical moment in oil exploration. What's more, deep learning is essential to implement 4D seismic, which is the technique that Petrobras pioneered for mapping the ocean floor, and which we already showed in a video not long ago here in Manual do Mundo.

And now let's play our little dinosaur game. If you've never heard of the dinosaur game, it's this little animal that appears in Chrome when the internet goes out. It gives you this option to have some fun, to have something to do without the internet. It's a dinosaur that's in the desert and it has to jump over the cacti. You hit the space bar, or if you have it on your cell phone, just tap the screen and it jumps. Only the cacti come faster and faster, and the game gets harder. Up there it's marking my score, and on the side it marks the record. You can see that I'm not a person who goes very far in this game, right? So, come on, I've already scored 300 points. For now it's easy because the pterosaurs haven't appeared yet… I died before they appeared.
Let's see if with an artificial intelligence we can improve this shameful performance a little bit… Teaching this dinosaur is similar to teaching a computer to identify apples and oranges. We need to show something to the computer. In this case, we need to show what the dinosaur is seeing. So, we made a little program here in Python that does all that. Right now, the dinosaur is stuck in that bunch of cacti over there. What is it seeing in front of it? It is seeing this here, look. A lot of cacti, slightly distorted, but the important thing is that you can identify them. If it had seen that, it would have known it was time to jump.

The dinosaur is only seeing half of the game here, okay? Because if we were to feed in the entire screen, it would take a lot of processing; the full screen is very big. Just so you can have an idea, in this little window it's looking at, each dot, each pixel is being sent to a neuron. There are more than 8,000 neurons just to receive the information. It is very different from the three that we were working with before. I'm going to take another small screen here just to show you. Those two cacti in the distance are showing up nicely, including the cloud over the cacti. And our program is so refined that the dinosaur even sees its own nose; it sees the tip of its nose just as we human beings see the tip of our noses. It sounds silly, but with this information our program is able to know when the dinosaur jumped, because the nose lifts.

A very cool thing is that this game will be programmed in a way very similar to the drawing I did back at the beginning. It's as if the mosquito were the cacti, and the eye is these neurons here that are seeing. There are no longer four neurons connected to the eye; there are a little over 8,000 connected to that small screen, receiving the information. Here, I'm going to have several artificial neurons, several layers of neurons, and that's why we can say that we're working with deep learning. There are many layers, and then we need to take some action at the end. What is this action? It's not about slapping the mosquito. In the case of the dinosaur, it means jumping, crouching down or doing nothing and just continuing to run. In short, based on what it is seeing, it has to take some action, and this will be learned little by little in the connections between these artificial neurons here. Here with the dinosaur there is something that makes it a little easier: you don't need to have a human being saying whether it is an apple or an orange.
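
To make the pixel-to-neuron idea above concrete, here is a minimal sketch in plain Python. It is not the program from the video: the 128 x 64 crop (8,192 pixels), the two hidden layers of 16 neurons, the ReLU activation and the random starting weights are all assumptions chosen for illustration. It only shows how a screen crop becomes one input per pixel and how three output neurons map to the three actions; no learning happens here.

```python
import random

ACTIONS = ("jump", "duck", "run")
WIDTH, HEIGHT = 128, 64  # assumed crop size: 128 * 64 = 8,192 input neurons


def make_layer(n_inputs, n_outputs, rng):
    """One fully connected layer: a list of neurons, each a list of input weights."""
    scale = 1.0 / n_inputs ** 0.5
    return [
        [rng.uniform(-scale, scale) for _ in range(n_inputs)]
        for _ in range(n_outputs)
    ]


def forward(layers, inputs):
    """Pass the signal layer by layer; hidden neurons only fire when their sum is positive."""
    activations = inputs
    for index, layer in enumerate(layers):
        sums = [sum(w * a for w, a in zip(neuron, activations)) for neuron in layer]
        is_last = index == len(layers) - 1
        activations = sums if is_last else [max(0.0, s) for s in sums]
    return activations


def choose_action(layers, frame):
    """Flatten a grayscale frame (rows of 0..255 values) and pick the strongest output."""
    pixels = [p / 255.0 for row in frame for p in row]
    scores = forward(layers, pixels)
    return ACTIONS[scores.index(max(scores))]


rng = random.Random(0)
layers = [
    make_layer(WIDTH * HEIGHT, 16, rng),  # input pixels -> hidden layer 1
    make_layer(16, 16, rng),              # hidden layer 1 -> hidden layer 2
    make_layer(16, len(ACTIONS), rng),    # hidden layer 2 -> one score per action
]

# A random stand-in for a captured screen crop; a real one would come from a screenshot.
frame = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]
print(choose_action(layers, frame))
```

With untrained weights the chosen action is arbitrary, which matches the behavior of a dinosaur that has not learned anything yet.
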
The game itself manages to tell the dinosaur whether it died or not and how many points it is gaining, and the game can start again, so it trains alone: the dinosaur trains with the game itself, without interference from a human being. So let's go! One important thing to remember is that in the beginning those connections are all mixed up, with whatever values they happen to have… the dinosaur is completely stupid if we don't teach it anything. It keeps trying to jump and crouch, there are times when it doesn't do anything; everything is kind of random. It will rarely clear anything, like it did just now, for example, before dying with 55 points.

At this moment the dinosaur is already training, so it will be jumping around, doing crazy moves, and the moment it gets it right, the computer will understand: "Oops, you got it right, looks like this one is the right path. Let's try to vary the parameters a little bit from there". There's only one issue here. Do you realize that every time it plays, it takes like 5 seconds? Until it gets it right the first time it will take several minutes, and the game is slow. This game happens very slowly, we can't speed it up… it jumped two, I think it's learned a little better now, let's see. This game is slow, and we cannot speed up the game to make it learn faster. So this dinosaur will spend hours and hours trying to learn something here.

You know that here at Manual do Mundo we prepare things before recording, and we've already spent a few days training this dinosaur. I'll show you what happened after 24 hours of training. Look, it can already jump some cacti: it sees the cactus, it jumps, but it doesn't even reach 100 points. Remember that the cacti are random, so there are times when it can score a few points more, a few points less, but it is scoring fewer points than I do when I play. That is, training this thing is extremely slow. We left it here training on Friday night and today is Monday, and we haven't yet had time to run it and see whether it learned. Who knows, the dino may be doing kung fu by now. Let's go! Look, 130 points is something, huh? 179 points. It's not stupid, it has learned. But I think it learned little.

The technique we used here to train the dino is what people use to train on Atari games, but later we found out that with Atari you can speed up the game and here we can't. So it trained little. Despite training for three days, that is too few games for it to learn from scratch. The way we did it here, I don't know, it would take about a year to get good. But calm down, we are not going to end on this sad note, no. We discovered something really cool. Vitor Dias, from the Universo Programado channel, had done the same exercise that we are doing now.
He tried to train the dinosaur, but he did something that went way beyond that. Since Chrome's game is too slow to train, he took the game mechanics and re-programmed everything from scratch with his code. And that brings at least two very cool advantages. The first is that you can put thousands of dinosaurs to train at the same time. He said that he puts between a thousand and five thousand dinosaurs to train.

And the second advantage is that, since he has the code behind the game, he knows what's happening inside the game; he doesn't need that little screen to see it. He manages to know exactly how far away the obstacle is, how high the obstacle is, how fast it is approaching… And that saves a lot of neurons. So, just imagine training the game with thousands of dinosaurs and far fewer neurons. I mean, let's see this happening, because he also made a very cool interface for us to watch the game working. Vitor, thank you for releasing the game to us here.

Look how cool. These colorful dinosaurs are the dinosaurs being trained. Thousands of them there. When they hit something, they fall behind and die. Only the dinosaurs that got it right keep jumping and, with that, it makes a selection of the dinosaurs that are working. Right next to it, we can see which neurons are acting, and there are very few of them. Look, there are six neurons just receiving information, then it goes to another layer that only has six and, in the end, we only have two actions: jump or duck, and if both are off, it means that it doesn't need to do anything. And here you can see how quickly a bunch of dinosaurs have been eliminated and there's only one left. And this one is good, huh? Damn, the little dino goes far. A training run like this, which took about 40 seconds, is already equivalent to months of our training.

And look how cool it is when the dinosaur dies. It takes the numbers of this dinosaur that died, the parameters of its neurons as they were, and creates a legion, a family of this dinosaur with very similar parameters and some variations. It's something very similar to what happens in natural selection, to what happens to us. Children inherit their parents' genes with some modifications. So, now, you can see that this second generation of dinosaurs is much better than the previous one.
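
The small network described above (six input neurons, six in the middle, two outputs, 14 neurons in total) could look roughly like the sketch below. This is not Vitor Dias's code. The flat list of 48 connection weights, the tanh activation, the zero threshold for "on", and the tie-break when both outputs are on are assumptions made for illustration. The transcript names only three of the six inputs (obstacle distance, obstacle height, approach speed), so the other three are unnamed placeholders.

```python
import math

N_IN, N_HIDDEN, N_OUT = 6, 6, 2
# One number per connection: 6*6 into the hidden layer plus 6*2 into the outputs.
GENOME_SIZE = N_IN * N_HIDDEN + N_HIDDEN * N_OUT  # 48


def decide(genome, sensors):
    """Map six sensor readings to "jump", "duck" or "run" (do nothing).

    genome  -- flat list of GENOME_SIZE connection weights (assumed layout)
    sensors -- six numbers, e.g. distance, height, speed plus three placeholders
    """
    hidden = [
        math.tanh(sum(genome[h * N_IN + i] * sensors[i] for i in range(N_IN)))
        for h in range(N_HIDDEN)
    ]
    offset = N_IN * N_HIDDEN
    jump, duck = (
        sum(genome[offset + o * N_HIDDEN + h] * hidden[h] for h in range(N_HIDDEN))
        for o in range(N_OUT)
    )
    if jump <= 0 and duck <= 0:
        return "run"  # both output neurons are off: keep running
    # Assumption: if both outputs are on, the stronger one wins.
    return "jump" if jump >= duck else "duck"


# A hand-made genome with a single active path: input 0 -> hidden 0 -> jump output.
genome = [0.0] * GENOME_SIZE
genome[0] = 1.0
genome[N_IN * N_HIDDEN] = 1.0
print(decide(genome, [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]))
```

Because the whole "brain" is just this list of 48 numbers, copying a dinosaur and varying it slightly amounts to copying and nudging a short list.
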
Now we have at least four dinosaurs here, because yellow is worth two, which are going very far; that is, this second generation is already much better than the first. When this generation dies, you'll have a much better dinosaur than the first, and then you take new genes from it, create a family and run again. Third generation now, let's see if this is making sense. The idea is that with each generation the dino gets better until there comes a time when it's perfect, when it won't make any more mistakes, not in a million years. I mean, no one will play for a million years.
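
The generation-by-generation loop described above can be sketched as follows. Again, this is an illustration under assumptions rather than the actual program: the mutation rate and strength, the population size, and keeping the champion unchanged in the next generation are choices made for the example, and `toy_fitness` is a hypothetical stand-in for "how far the dinosaur ran", since the real score would come from playing the game.

```python
import random


def mutate(genome, rng, rate=0.1, strength=0.3):
    """Copy a genome, nudging roughly `rate` of its numbers by a small random amount."""
    return [
        g + rng.gauss(0, strength) if rng.random() < rate else g
        for g in genome
    ]


def evolve(fitness, genome_size, population_size=200, generations=30, seed=0):
    """Each generation: find the best individual, then breed a family of variations."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1, 1) for _ in range(genome_size)]
        for _ in range(population_size)
    ]
    history = []  # best score per generation (the "blue graph")
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        # The champion survives unchanged (an assumption); the rest are mutated copies.
        population = [best] + [mutate(best, rng) for _ in range(population_size - 1)]
    return best, history


def toy_fitness(genome):
    """Hypothetical stand-in for the game score: higher when every number is near 0.5."""
    return -sum((g - 0.5) ** 2 for g in genome)


best, history = evolve(toy_fitness, genome_size=48)
print(f"first generation: {history[0]:.2f}, last generation: {history[-1]:.2f}")
```

Because the champion is carried over unchanged, the best score in this sketch can never get worse from one generation to the next; it can only stay the same or improve.
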

What I think is very beautiful about this example that Vitor made is that you can see the neurons firing in real time. What I drew on the board, one activating the other and everything else, we are seeing happen here in front of us. It's a very well illustrated way of showing how the computer is capable of learning. So now we are in the seventy-first generation, and you can see that the whole group is doing very well. It's not like at the beginning, when almost everyone died. No, now it's much better. That blue graph is the score of the best individual in the family in that game, and the red one is the average of the family. You can see that the two are going up quite a bit. So, now, I'm going to press ESC here and let it run for a while; let's see how far this dinosaur goes.

Generation 155. Some still die, but the dinosaurs are very good. And what I find really interesting is that the dinosaurs behave differently. They jump at different times; sometimes some jump and some don't. But they still move forward, that is, the behavior does not have to be identical. The important thing is that it gets it right when it has to get it right.

Generation 216. There's something very important about this. The program is learning, but it's not saying "ah, there is a right distance to press the jump button and that distance is 200 meters with 20,000 seconds before the dinosaur gets close"… None of that. There are connections between neurons that are being reinforced or reduced, and that's all we have. You don't really know what's going on; if you asked the machine to teach you how to play the Chrome game, it wouldn't know how to tell you. And the craziest thing is that this is exactly what happens with other artificial intelligences that we use today, like ChatGPT. Not even those who programmed the thing know exactly what's going on in there; it's a lot of connections between neurons, just like our brain.
If a surgeon opens your head, he doesn't know what you're thinking. And now, to finish, let's see a human being playing against this artificial intelligence. Let's put them together, shall we? I feel sorry for the dinosaur. Look, the yellow one is me, the gray one is trained by artificial intelligence. So, you can see that I'm already ahead, I'm going, I mean… Yeah… and it only has 14 neurons. Look, in the previous episode of our SuperTech series, I wasn't that embarrassed. We were able to show you some things that your cell phone knows about you that you yourself don't know it knows. Maybe it's more things than you realize, huh? Sponsored by Petrobras.
