11:30:58 I don't know exactly, but I really would like to. 11:31:02 We can just go over some of the topics and experiments that I presented. 11:31:08 If you have any thoughts to share with us, maybe starting with Kashore and India, then everyone else who wants to chip in. 11:31:17 This is a bit orthogonal, but I figured, since we talked about error-based learning, 11:31:23 I just want to clarify why I think many in the field consider it to be separate or distinct from reinforcement learning or statistical learning. 11:31:33 Whether that's true or not is always up for debate. 11:31:36 But the classical literature would refer to it as a supervised form of learning, in which there's a model of what the best motor action actually is. 11:31:43 This is from Daniel Wolpert and others: there's a motor action that you want to perform, and you either successfully perform it or, let's say, you don't. 11:31:51 Then a correction circuit gets initiated, which over time gets you to perform the motor action correctly. 11:31:59 And so you do need to have an internal model of that. 11:32:01 But there's no explicit reinforcement, right? And I think you had mentioned the split-belt treadmill experiment, which again is a perfect example of this type of error-based learning, done in humans. 11:32:13 And it's thought to be cerebellum-dependent, right? 11:32:15 So, if you have cerebellar lesions, you have much more trouble with this type of split-belt treadmill experiment, where basically you have your two feet on two belts, and one of the belts starts to go faster, and initially you have a very odd gait. 11:32:33 But eventually people will adjust their motor program to be able to walk in a way that seems relatively normal. 11:32:39 But if you do that with somebody with a cerebellar lesion, for some reason that is much more difficult; they can use online correction approaches to try to help themselves, but it's not persistent. 11:32:57 So, just to clarify: that's at least one model of error-based learning which is thought to be distinct from reinforcement learning, although there are now a lot more studies coming out suggesting that there's an interaction, at least at the brain level, between these basal ganglia 11:33:12 reinforcement-learning circuits and the cerebellum. 11:33:15 Just... what was that? 11:33:29 [inaudible] 11:33:40 ...the optimization problem of your life. So this is really the last question: why is it? 11:33:44 This is the... 11:33:49 "Why" is always tough. I think there's one version of it which came from the brain up, and that, I think, is true in the human work. 11:33:59 I don't think that's necessarily true of motor skill learning in non-human animals. 11:34:04 But yeah, I don't have a particularly good reason as to why. 11:34:12 [inaudible] 11:34:17 Just for the recording. Yeah, yeah, okay. 11:34:27 I just wanted to... 11:34:29 Typically it's... 11:34:35 an inverse model. 11:34:39 And so to the extent that... 11:34:45 Okay. So, to the extent that we're starting to accept, 11:34:52 perhaps, the reality that in any learning task multiple things are being learned simultaneously, this error-based learning is kind of a perfect example, 11:35:03 where most people now acknowledge that you have multiple models learned simultaneously. 11:35:23 Do I need to repeat that? Is it multiple models learned simultaneously?
11:35:29 Or is it multiple learning mechanisms contributing to the same model? 11:35:31 I mean, I don't have an answer to that; when you said that I was like, oh, I don't know. 11:35:36 I would actually have assumed the opposite: that there are multiple learning mechanisms possibly contributing to the building of the same model. It depends on your definition of a model. 11:35:47 It depends on your definition. I'm off. 11:36:01 So, the split-gait and cerebellum finding sounds really interesting, and I'm slightly worried that maybe it's because the output of the gait is spinal, and the cerebello-spinal circuitry, you know, as opposed to some other output form. How could that 11:36:21 happen? How can one be sure that this is about the nature of the learning, and not just about what you need to express the learning? 11:36:29 I'm by no means sure. Thanks. 11:36:34 I think that's a great question. 11:36:40 Is there any cerebellar person here? Negative? [inaudible] 11:36:50 And... 11:36:53 [inaudible] 11:37:00 Sorry. Okay: large variability in cerebellar neurons corresponds to faster learning in mice. 11:37:06 Right? So there is a correlation, not at the motor output, but at the learning stage, right. 11:37:16 I'll find the paper. 11:37:19 [inaudible] 11:37:21 I think the cerebellum might be thought of as a state estimator in general — so not a final motor output, 11:37:28 but state estimation. 11:37:33 So, I'll just start by talking with a statement I have: 11:37:38 I don't understand the difference between error learning and reinforcement, and I think maybe the difference is... This is sort of the second part, which I wanted to say: even in the worm, what we've found is that, in a simple reinforcement-like learning scenario, there are multiple reinforcement signals 11:37:59 and multiple different circuits — some of them, say, like the temperature, and some of them like the food that has been there, and some of them don't like it 11:38:09 if there is no food, right. And that's a simple worm. And I think that maybe... 11:38:18 What if it's all just some form of error or reinforcement? 11:38:28 But then the error is a negative reinforcement that goes through one channel, and sugar is a positive reinforcement that goes through another channel, and we just don't need to view them as exact opposites of each other. 11:38:45 They are slightly different because they go through different pathways, but eventually they get added up in an error-based model. 11:38:54 Largely, it's considered that you need to have a model, or that there's a supervised element to it, whereas in reinforcement learning you don't need it; you do it, right, but based on the feedback. 11:39:05 Can I make a suggestion as to how to maybe resolve that? 11:39:10 I think you said that what you meant by error-based learning is some form of supervised learning. 11:39:20 So, as we know from Sutton and Barto, there are a couple of main differences between supervised and reinforcement learning. 11:39:28 One is that in supervised learning you get feedback every time. Two,
that the feedback is much more informative, in general, than in reinforcement learning, because in supervised learning you get to know what the correct answer would have been, whereas in reinforcement learning you just get a scalar signal as to how 11:39:43 good your response was, but not what the exact correct response would have been. So, to me, 11:39:51 those are different kinds of task paradigms. 11:39:53 And again, there's a distinction between what the task is and what mechanism 11:39:58 you are deploying to try to solve that task, and it is the case that some learning mechanisms or algorithms will be able to solve one of these kinds of tasks better or more efficiently than the other. 11:40:12 But I think, at least at the level of the task 11:40:16 setting, the distinction is fairly well defined. 11:40:24 Now, you can solve a supervised learning task with a reinforcement- 11:40:32 learning mechanism, essentially. It's hard to go the other way around, 11:40:35 usually, I think. But that's different. 11:40:45 How the task setting is defined is different from how you are solving it, and whether your particular algorithm somewhere in there explicitly computes an error or not — that, to me, is orthogonal to whether you are solving 11:41:04 a supervised learning task or a reinforcement learning task. 11:41:07 You can solve reinforcement learning tasks without explicitly computing errors, necessarily, and you can probably do that with supervised learning tasks as well. 11:41:19 So again, I think the question is whether we are talking about learning mechanisms or learning 11:41:23 algorithms — whether they internally compute an error signal somewhere or not — or we are talking about a task setting, as defined by us, the experimenters, by whether we are providing explicit instructional cues to the animal as to what it should be doing at every point along the way. That's 11:41:46 the way I see it. 11:41:55 Maybe Matthew, then. Do you think that we can formulate the split-belt treadmill task as a reinforcement learning task, rather than error-driven, supervised? 11:42:05 I think, as you mentioned, people do use reinforcement learning algorithms on these tasks. 11:42:13 And they can be... I mean, for sure. I remember the work... 11:42:20 But there are ways to kind of push it, right, where you try to introduce a manipulation that would work in a reinforcement- 11:42:29 learning paradigm but that actually doesn't help the person on the treadmill. Right? 11:42:33 So you can describe the behavior, but you can't manipulate it 11:42:38 using the traditional approaches that you would use in reinforcement learning. 11:42:42 But I don't remember what those details were. [inaudible] 11:42:50 So, then, on top of that: imagine that you do all the experiments that you were planning to do with humans, Ilya, and you can show, again, 11:42:59 that it's not that some specific, really error-driven behavior is happening, but that human subjects are learning a distribution. 11:43:09 Imagine that we have that result. What should we conclude from that result? 11:43:16 That the split-belt treadmill task is not a reinforcement learning task? 11:43:25 I mean, it's an argument about definitions. It's an argument about definitions.
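A minimal sketch of the feedback contrast just drawn (the Sutton and Barto point): the supervised, error-based learner is told what the correct answer would have been and can descend a full error vector, while the reinforcement learner receives only a scalar evaluation of its own perturbed response. Everything here — names, constants, the toy linear learner — is invented for illustration, not anyone's actual model.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)               # one input pattern
target = np.array([1.0, -1.0, 0.5])  # the correct output, known only to the environment

def reward(y):
    # The environment evaluates a response with a single scalar; it never
    # reveals what the correct response would have been.
    return -float(np.sum((target - y) ** 2))

# Supervised / error-based learner: told the correct answer, so it can form a
# full, signed error vector (direction AND magnitude per output).
w_sup = np.zeros(3)
error = target - w_sup * x
w_sup += 0.1 * error * x

# Reinforcement learner: receives only scalars, so it must perturb its output
# and correlate the perturbation with the change in reward.
w_rl = np.zeros(3)
noise = 0.1 * rng.normal(size=3)
advantage = reward(w_rl * x + noise) - reward(w_rl * x)  # did the perturbation help?
w_rl += 0.5 * advantage * noise * x  # node-perturbation-style update from scalars alone

Note that the reinforcement update has to inject exploratory noise and correlate it with the scalar outcome; that is the price of the less informative feedback.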
11:43:41 And these are, in some sense, almost always pointless arguments — almost always pointless arguments. 11:43:46 But we can turn it from an argument about definitions into an argument about algorithms. I mean, on Monday Matthew made a very good point that computationally — like I also said this morning — maybe we can really cast all of these questions in terms of reducing 11:44:08 errors, at a very abstract mathematical level. 11:44:12 But then, at the end of the day, we are dealing with agents that need to implement a behavior based on some algorithms — follow some algorithms — and at the end of the day those algorithms should also be implemented in brains. So maybe definitions are basically about that 11:44:31 computation, and we can kind of forget about that. But, algorithmically, do you think there's a difference between reinforcement learning and your proposal for birds, 11:44:41 where basically they were doing Bayesian inference and Bayesian learning — 11:44:47 forming this distribution? So, in our paper, in the Appendix, we tried to prove that the standard formulations of reinforcement learning cannot explain this. 11:44:56 It's somewhere there, right? So, yeah, at least 11:45:00 there is a difference with the textbook formulations. With human treadmill 11:45:08 learning there is actually a very interesting result itself. The advantage of being at a university with a huge medical school, 11:45:17 right: we have a rehabilitation hospital, and I found out a couple of years ago that when somebody has a brain injury and loses the ability to walk, they are, of course, put on treadmills to slowly 11:45:31 teach them to walk again, and it's a very well known phenomenon that if you put them on a treadmill that moves at a constant speed, it takes forever. 11:45:42 But if you put them on a treadmill that shakes, slows down, stops, accelerates — where velocity is not constant — 11:45:52 then learning is much faster. The protocols were approved in the seventies, and once they are approved, they are kind of frozen; 11:46:01 you cannot really experiment with new ones. But that's the part which I couldn't understand, from the perspective of 11:46:17 whether it is reinforcement, or supervised, or whatever it is: 11:46:21 why would it be faster? Yeah. So why would it be faster for an animal — 11:46:29 for a human, right — to learn on something that shakes? And the animal is not exploring, 11:46:37 right? It's effectively exploration in the sense that you may sometimes accidentally hit the right step because the treadmill just shakes the right way once you were doing it. 11:46:51 But you know that you are not exploring, right? You know that you are working at the same, or trying to activate the same, circuits and so on. 11:46:58 So I don't know how to explain this, but this is an observation that doctors have used for ages, and how to model 11:47:05 this... I don't even know what the name of the model would be. 11:47:08 I just want to understand what's going on. 11:47:18 I don't know if this is a worthwhile direction to go down, but I would say I think there probably is exploration going on there. 11:47:25 I mean, in a developmental context, I think a lot about trajectories of change. 11:47:30 And if you're starting at a different starting point, your trajectory is different.
11:47:34 And so, even though they're trying to do the same behavior again, the path by which someone who has had a brain injury walks again — regardless of where that injury is — is an exploration, finding a new circuit. 11:47:45 But to me it's a fascinating phenomenon, and I had no idea about that. It's amazing. 11:47:51 So, just related to that: isn't what you're talking about, 11:47:54 with the randomness of the treadmills, related to how, in animal behavior, if you give an animal rewards on a kind of random schedule, they tend to learn faster? Which is what you see also in gambling addicts, you know — if you actually go to Vegas, you 11:48:12 get people hooked much faster by giving them random rewards. 11:48:18 Yeah. 11:48:26 That's also true. Yeah, I mean, it's a very interesting observation. 11:48:43 ...to model the situation, at best. 11:48:49 What I want to get from this experiment is different. Humans have different variability, differences, 11:48:56 and I want to be able to measure how this person tries to walk — you know, they sort of cannot really walk: one step, one foot is behind by this much, with this much variability, etc., etc. — and I want to be able to tell the doctor: oh, you need to run your treadmill 11:49:11 with this velocity profile, and then this person is going to... 11:49:15 I mean, I'd like to do that, right, and I have no idea how. It would be particularly interesting to look at patients who had cerebral injuries versus cerebellar injuries. 11:49:27 On this question of the probabilistic rewards: 11:49:34 we're very interested. 11:49:37 All the evidence suggests that animals express their learning more quickly, but we think we have a way of demonstrating that the acquisition of the contingencies is actually pretty similar, and really what's changing is that the strategies the animal uses in a probabilistic paradigm versus a 11:49:57 deterministic paradigm are actually quite different. You know, 11:50:01 I think that's a very interesting question; it doesn't seem directly connected to what we're working on. 11:50:06 But I think this idea — that having some non-rewarded trials that the animal expects to be rewarded on essentially triggers them to change their approach on the immediately following trials, and then that leads to the behavior looking like the performance is better, when in reality they had already learned the 11:50:25 contingencies and even expected the possibility of reward on that trial — that makes sense. 11:50:38 I thought... Brad, did you want to say something before? 11:50:43 After the... 11:51:02 Rewards — adding noise to help, especially on the reward side, which is kind of strange, and that's, like, definitely... 11:51:09 Yeah, so that's the opposite case. There's a task where 11:51:19 you can't do the obvious thing on the first step; 11:51:26 that's a nudge in a direction against the solution. 11:51:41 So, can I go back to the model-based versus model-free 11:51:49 RL? I think, at least in my mind, it's tempting to conclude that we might have two different structures: one can learn these value-based policies, basically, and the other learns about the environment's structure and maintains a model. And then I am tempted to conclude 11:52:14 that that second system — the one that maintains this model of the environment — maybe that's the statistical learning system that we kind of 11:52:27 talk about. So, who agrees? Who disagrees? 11:52:34 Maybe those who disagree should comment.
Everyone agrees. 11:52:50 This is probably going to be the topic of multiple of these discussions, but that makes just a ton of sense to me. And then the standard 11:52:59 statistical learning paradigms that look at sensory regularities, right — 11:53:04 that is essentially taking advantage of that system. So, just to say: I agree. But I'd like to pass it on to somebody else; if you agree or disagree, please jump in. 11:53:24 So... 11:53:29 But yeah, that's... 11:53:33 Sometimes... 11:53:50 Good, we solved it. 11:54:20 Oh, sorry, I was just going to say that I was curious about whether, even in these model-based systems, there's a sense in which the model was learned implicitly, or because, you know, we believe that down the line it will be relevant for some reward task. Once some structure has been 11:54:40 learned, figuring out what the optimal policy is, given that information, is also just computationally reasonably challenging. 11:54:48 So I was sort of curious about that. Yeah — 11:54:54 that feels like a sort of different motivation: learning policies directly, for how to do well, versus learning 11:55:00 something general. Because you have this computational cost: if I learn something general, then I ought to think a bunch about 11:55:06 what to do with that. 11:55:11 I mean, one thing that I still struggle with is that there's a sense of statistical learning that might be related to how one might make decisions — 11:55:26 and, you know, I kind of think of this as some set of relationships between perception, decisions and actions, and rewards, and things of that sort. 11:55:39 And then there's another aspect of statistical learning that I still think about in terms of, basically, just the patterning of our sensory systems in relationship to the environment. 11:55:53 And so one example would be the extent to which all of us are learning the acoustical structure of this room by being in it, so that after we're in this room for a certain period of time, especially under repeated circumstances, we're able to better localize where somebody is speaking in the room, because 11:56:12 we've kind of learned the direct-to-reverberant ratio of the sounds in the room. And I still kind of feel that that type of statistical learning has a model-free component, and that it's something that is basically patterning the responses of our perceptual 11:56:33 apparatus — but the way it's learned is through a biased structure, where attention and reward shape 11:56:41 how we might learn some of these more than others. And I still don't think of that as policies per se, 11:56:48 but I do think that one could make a counter-argument. 11:56:53 And so that's where I kind of feel it's both model-based and model-free — 11:56:58 or that model-free doesn't exist; it's just a different model. 11:57:06 Yeah, so, in reaction to what was just said — I mean, I guess that's what you meant, 11:57:13 but of course the motivation for distinguishing between model-based and model-free comes exactly from the principle you mentioned: that, yes, in principle, the strictly normative thing to do would be to learn a model of everything and then compute policies on the fly 11:57:35 every time, based on that model of everything. But that's computationally prohibitive, and therefore you can only do approximations.
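A compressed sketch of the trade-off being described, in a made-up two-state world (all numbers hypothetical): the model-free route caches values cheaply from experience, while the model-based route stores the transition and reward model and pays compute to re-derive values on demand.

# Hypothetical 2-state, 2-action world, for illustration only.
P = {0: {0: 0, 1: 1},    # transition model: P[state][action] -> next state
     1: {0: 0, 1: 1}}
R = {0: 0.0, 1: 1.0}     # reward received on entering each state
gamma, alpha = 0.9, 0.1

# Model-free (vanilla TD(0)): cache values from experience; cheap per step,
# but the cache goes stale if the world changes.
V = {0: 0.0, 1: 0.0}
for _ in range(2000):
    for s in (0, 1):
        s_next = P[s][1]                                        # fixed policy: always action 1
        V[s] += alpha * (R[s_next] + gamma * V[s_next] - V[s])  # TD error drives learning

# Model-based: store P and R explicitly and re-derive values on demand
# (value iteration); expensive per decision, but re-plans instantly if R changes.
Vm = {0: 0.0, 1: 0.0}
for _ in range(100):
    Vm = {s: max(R[P[s][a]] + gamma * Vm[P[s][a]] for a in (0, 1)) for s in (0, 1)}

print(V, Vm)   # both converge to roughly {0: 10, 1: 10} in this world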
11:57:42 And so really, the question is: out of realistic, approximate algorithms, 11:57:48 what sort of algorithm are you going to use? And there, once you start considering approximations, as you have to, 11:57:56 the road bifurcates, trifurcates, or... multifurcates, I guess, would be the word. 11:58:06 I guess "polyfurcates." That's right. 11:58:08 Yes. And now you can choose different algorithms based, again, on the constraints that you care about. 11:58:16 If your constraint is compute time, then you're absolutely right: 11:58:21 you should not try to use a model, because it's just going to be computationally expensive; 11:58:24 you should do model-free. If your cost is behavioral flexibility, then a model is actually a good idea, because it allows you to be flexible — like latent learning. 11:58:35 So, absolutely. Now, in reaction to what Aaron said: I actually also tend to think — 11:58:43 and I think my thinking has actually evolved quite a bit over the past couple of 11:58:46 days here; it feels like it's been so intense — at least when I think about statistical learning, 11:58:56 I really mean two somewhat different forms or kinds of learning. 11:59:04 One is what Athena has been referring to — and many people in the field refer to — as model-based learning, and I would totally agree with you that if I had to choose between model-free and model-based, I would definitely choose model-based as having something to do with statistical learning. On the other 11:59:21 hand, underneath all this model-based and model-free — you know, 11:59:26 when we teach reinforcement learning, and we come to the point where we say, look, here are your possible reinforcement learning algorithms, starting with vanilla TD as the paradigmatic example of model-free and then going on to other things, the standard way in which we teach this is within what is called the 11:59:45 theoretical framework of Markov decision processes. And, of course, even if you don't know what they are, the key thing — something that I think Athena also mentioned — is that for that you need to start from the assumption that you know what the current state of the world is, the state that is relevant 12:00:00 for you in terms of how things are going to happen in the future and how your actions are going to affect that future. 12:00:07 And so this whole model-based versus model-free business starts from that point, when you already have a state, and then the question is: given your knowledge of states, 12:00:17 how do you learn about things in the world, and how do you use that? 12:00:22 But of course, states are not God-given to you. You have to work to extract states from the environment, and so that's where this other form of learning — which I would also call statistical learning — comes in, which is often called, in fact, even in the definitive textbook of the field of computational 12:00:42 neuroscience, Dayan and Abbott, Chapter 10, representational learning, which is about how we learn useful representations of the world, right? And so, in the big scheme of things, 12:00:57 when you do reinforcement learning — and if you ask me, that's what whole life is about; 12:01:03 it's all about reinforcement learning — it's useful to break it down into components, and 12:01:08 a really important component is how you get from raw sensory observations to states. 12:01:15 And that's where representational learning is super important.
And it just so happens that the computational principles that underlie the algorithms that are going to be useful for learning those representations 12:01:32 are going to be very similar to the computational principles that underlie the algorithms that are useful for learning models, as in model-based reinforcement learning. 12:01:44 So, long story short, I think there really are two forms of learning here: there is model-based reinforcement learning, which kind of operates on states and learns 12:01:54 the structure of the world at the level of states, and how states are related to each other as a function of our actions; 12:02:02 and there is, if you like, a lower-level form of statistical learning, which we sometimes call representational learning, which goes from raw sensory observations to states. So I'll stop — sorry, 12:02:13 that was long. 12:02:16 Thanks, Matthew. Oh, yeah. Okay, so... 12:02:29 Thanks a lot. So now I have in mind that there are some very simple paradigms, like the oddball paradigm or mismatch negativity. 12:02:37 I believe we can also call them statistical learning; I think Eli Nelken uses those paradigms for the purpose of statistical learning, right? 12:02:47 And they are super simple. The oddball paradigm, for those who don't know it, is just: imagine a sequence, A, A, A, A, A... 12:02:53 and then suddenly B. Now, the response to B, when it's embedded in this background of A's, is different from the response to B alone. 12:02:59 So it's mismatch negativity, right? 12:03:02 You can get it under anesthesia, in the A1 12:03:05 of rats or mice; it just doesn't need any conscious perception. 12:03:10 You can have just one neuron that has some sort of intrinsic adaptation properties, and you can get mismatch-negativity or adaptation responses in these paradigms. It's some sort of statistical learning, 12:03:22 but it's just... So how should we treat this one? 12:03:26 And I think there's a kind of spectrum of complexity. Eli Nelken actually has some interesting, very beautiful work — 12:03:35 he will be here in 3 weeks — where, as you start changing the complexity, adding some memory, then the difference between anesthesia and wakefulness comes in, and things like that. 12:03:47 So, since there's a kind of spectrum of complexity in the input that can be learned, maybe with different mechanisms, 12:03:57 how should we think about that? Is it model-based? Or should we just not try 12:04:07 to put it into model-based or model-free — is it another thing? 12:04:14 Okay, upon which... 12:04:21 I like that distinction. 12:04:28 On which level is it a perceptual representation? 12:04:36 Because... 12:04:48 Absolutely, I think that's a great question: how task-dependent are these representations? 12:04:57 I think some are pretty task-general, and there are going to be some that are task-specific. 12:05:01 So the picture that I came to was deliberately... 12:05:04 I think there's some value in that, in all senses. 12:05:11 Well, then you have to deal with the biology. You're totally right that some of our representations are more general-purpose — 12:05:20 here it was, you know, scaffolding — and some are actually specialized. 12:05:37 Is there anything better than a theoretical framework which can capture that? That's actually an interesting...
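The single-neuron point about the oddball paradigm made above can be reproduced with nothing more than input-specific fatigue. A toy sketch with invented time constants, not a fitted model:

# Stimulus-specific adaptation in one model "neuron": each input channel has
# its own resource that depletes with use and slowly recovers.
availability = {"A": 1.0, "B": 1.0}   # input-specific resource (e.g., synaptic)
depletion, recovery = 0.4, 0.1

def present(stim):
    response = availability[stim]            # response scales with remaining resource
    availability[stim] *= 1.0 - depletion    # fatigue only in the stimulated channel
    for s in availability:                   # slow recovery of every channel
        availability[s] += recovery * (1.0 - availability[s])
    return response

for stim in "AAAAAAAB":
    print(stim, round(present(stim), 3))
# Responses to the repeated A shrink trial by trial, while the rare B still
# evokes a full-sized response -- a mismatch signal from pure adaptation,
# with no explicit model of the sequence anywhere.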
12:05:50 I think there's a simple explanation, and you see it both with mismatch negativity and with stimulus-specific adaptation: in both of them, 12:06:05 the system is not consciously aware of this. 12:06:10 And so that, as the kind of more purely perceptual-learning, representational component — both of them can be shaped by attention, and by rewards and other reinforcements. 12:06:29 And so I'm still not sure what we want to call that — model-based or... 12:06:37 Yeah. But the point is that you have the input, quickly, 12:06:47 and that is basically what's being represented in perception. 12:06:51 And then you have attention. Attention will basically bias that response — very early sensory areas, where, you know, responses will be enhanced. 12:07:06 And reinforcement will change the learning of how that is being captured. 12:07:16 It's simplistic, but... You don't think that the reinforcement will change the representation? 12:07:22 What I do think reinforcement does is change the representation; and these things are not truly independent. 12:07:33 The same signals that are thought to be involved in attention are also involved in reinforcement, and so there's lots of that... 12:07:45 if you were to try to understand how it works... 12:07:55 which is what we're trying to do, biologically. 12:07:58 I think this is all very interesting. So, a lot of my experimental bread and butter recently has been looking at representational, perceptual change in infants in different predictive contexts or different learning contexts, and we've been able to show that you find differences in perception that seem to be 12:08:15 connected with prefrontal cortex activity in infants over short amounts of learning. So where does that sit in this framework? And I think this is a very useful framework to be working in, I would say; yeah, I'm so proud of what everyone is saying here 12:08:28 in that context. But, you know, the idea that there can be this very preattentive sort of representational learning 12:08:36 that's very, very simple — fine, but I don't think that explains the phenomena that we're looking at. 12:08:41 Whether or not that's attention is a question I wish Floris were here to weigh in on. 12:08:46 Do you find dissociations between predictions and attention? There have been arguments that they're the same; 12:08:51 there have been arguments that they're different. I am personally agnostic at the moment, but, you know, I think that there is work out there to suggest that it doesn't — 12:08:58 that attention biasing representations doesn't explain everything. 12:09:07 Attention... well, if we first operationalize attention in a useful way, yeah, then some forms of attention... 12:09:20 Yeah, I don't love attentional research, partly because I'm like: what the hell is it? You know what I mean? 12:09:27 Part of the reason why I started doing more predictive work and not attentional work was because it was much better formulated. But it is — 12:09:34 I mean, it's an important question, right? And we do see in some of our tasks that you get, like, frontal-parietal connectivity, 12:09:38 for example, in a way where, well, maybe that is attention. But I do worry — I have to say, one of my criticisms of not just the attention field, but of a lot of folks, is that when you see these sorts of modulations, 12:09:50 they're like: okay, that's just attention. It feels like we're just throwing something into a bag and not explaining it. 12:09:56 I feel like I just don't give a shit at that point. 12:09:58 No, but you can...
Yes, whether you call it attention or not: we basically have the statistics of the world making it through some filter — 12:10:15 you know, right now, some basic sensory filter — and that's something that will change the response. 12:10:29 Attention, learning, reinforcement learning... well, there's one type of change that you could basically think of as momentarily changing the activity; 12:10:44 it would be the same mechanism, computationally. But then, what kind of signals would 12:10:58 then again be changing representations? Because representations change in ways that are not just this simple preattentive kind of thing 12:11:13 that we call attention, and... 12:11:16 And that's where there's a lot of work. 12:11:19 So, to me — and maybe this goes back to one of the things that I talked about earlier 12:11:25 a little bit — it's like: are we talking about one model that's created with multiple different learning systems acting on it, 12:11:31 or are we talking about multiple models being created? And to me, part of what you're saying is more that there's one model — one set of representations — that gets modulated in different ways through multiple different systems. To me that makes a lot of sense. 12:11:43 Right. I think my major point, when I originally raised my hand — mostly what I wanted to say in relation to what Monte was proposing — is: okay, 12:11:49 so maybe there's some kind of very simple representational learning, 12:11:52 and then there's some other complex, maybe model-based, reinforcement learning. 12:11:57 These are obviously interacting as well. And if they all ultimately act on similar, or the same, representations that are relevant to that learning context — relevant to that task — 12:12:05 that means that we have one model that's being learned and modulated by multiple different mechanisms. 12:12:10 And then that allows — to me — an easy way to have these things interacting, 12:12:14 and it also allows learning to be really complex and flexible. 12:12:19 Right. One of the things I was reflecting on — I don't want to take a long time on the mic, but back when I was an undergraduate, doing very simple cognitive psychology work, one of my mentors said to me: we actually 12:12:30 throw out all the most interesting data, which are the practice trials that we get before we actually start the experiment. And part of what's inspiring to me about this field is that we're looking at those practice trials, 12:12:40 right? And he admitted: that is the most intractable problem. 12:12:43 I mean, this was back in the early 2000s, but: this is the most intractable problem — 12:12:47 how would you possibly ever explain that? So that's fantastic. 12:12:52 But we ultimately want to be able to explain how this adaptation happens, and we know there are many paths, and we know there are many mechanisms; we need to segment them to explain them, 12:13:00 but also ultimately not lose the fact that they're all operating together. 12:13:03 And if you want to scale it to the real world, and you think of language learning as an example, we know that there are so many different patterns; 12:13:10 it's so individual; there's feedback; I mean, all of it's happening simultaneously, 12:13:15 and it's all building up a language learning system. So we need to be building in interaction 12:13:19 to begin with; we know these things interact. 12:13:27 This has to do with stimulus-specific adaptation and preattentive processes.
12:13:33 So, I often hear that preattentive processes are simple, and I react to this, 12:13:41 because in the auditory system there's a lot of preattentive processing happening, 12:13:47 and I think it's very far from being simple. Probably my PhD students don't know what "preattentive" is — 12:13:57 can you give us a brief...? Yeah. So this could be another... It's like talking about attention, right? 12:14:02 It's as complicated as attention. "Preattentive" would refer to things 12:14:05 you're not aware of — things that are happening without you being aware that you're learning 12:14:09 them, essentially, or that you even have a representation. 12:14:13 And just to give you two examples — which are actually the next comment I was going to make, about representation: 12:14:18 what is a representation? In the auditory system, within these preattentive processes, you have processes that are acting online, like stimulus-specific adaptation, where you do dt-dt-dt-dt- 12:14:33 dt. And are these generating a representation? They are definitely being detected by the system; 12:14:41 these statistics are being detected, and they are changing the activity of the neurons. 12:14:45 But are they generating a representation, in some sense? I mean, 12:14:50 it would depend on how we conceptualize representation, 12:14:54 but it's not something that lasts for more than 2 or 3 seconds in the rodent; 12:14:59 this information about the statistics is completely gone 12:15:03 if you silence the stimuli for 3 seconds. 12:15:06 While there are other processes that are also preattentive — like, for example, learning the sounds of a room: 12:15:14 this is something that we do automatically; we're not aware of it. 12:15:16 I think I've already mentioned to a couple of you the typical example: you don't realize you're hearing the ventilation until it stops working. 12:15:27 Your brain never told you that the ventilation was part of the characteristic sound of this room; 12:15:32 it's only when something has changed. So, obviously, you had a representation; 12:15:37 you had learned this, and it was stored somewhere. But it's preattentive. 12:15:42 So these are, to me, clearly two forms of statistical learning, but they are very, very different, and they are both likely happening at a very early level of the circuit. 12:15:58 How complex can those sorts of representations be? Because, the way you're describing them, both of those could, to me, be explained based on the processing of individual things, right — processing the ventilation system, processing the different tones. Do you see that that kind of preattentive processing 12:16:19 could also be sequential, for example, or spatial? Totally not a loaded question — I've no idea. 12:16:28 I think that's a very good question, and this is something we're actually starting on in the lab. 12:16:35 You obviously learn motifs at a very high level — 12:16:40 and this is humans, which might be different, because they have training on sequences; speech is very sequential, and music too. 12:16:48 But you can detect — if you hear a new piece of music, 12:16:55 you very quickly and very automatically detect motifs, and often these motifs are repeated at very different frequency ranges, and you will still detect it as a melody, 12:17:07 a motif that belongs to that melody. So it's quite complex, 12:17:13 I would say. 12:17:20 It's preattentive in the sense —
I mean, it's probably related to what you were saying about 12:17:25 how, if you tell people there is a grammar, they have a harder time finding it. And I think it's preattentive in that sense: it's something your brain does automatically. 12:17:36 Yeah. 12:17:43 So, one thing that kind of connects the phenomena that you described — in terms of when you have some statistical learning that's then forgotten 12:17:55 a few seconds later, versus when it's long-lived — can be the interaction with the amount of training time, 12:18:03 what reinforcement signals are there, and things of that sort. 12:18:06 So this is a place where perceptual learning has done a lot of work — 12:18:09 like some of Hubert Dinse's work, where he'll have people walk around: back in the days of the iPod, the iPod would basically be connected to some solenoids that would palpate against the finger, and they'd 12:18:26 be doing this during the day while doing other tasks, not really aware of the stimulation. And depending upon how much time you're stimulated, you basically find that the representational changes — which are typically measured in terms of tactile 12:18:44 sensitivity — are either short-lived or long-lived. And then there are other studies showing that if you pair these things with reward and things of that sort, it matters for whether it's short-lived or long-lived. One of the theories that has 12:18:58 been put forth is that you have a certain amount of stimulus activation that might be appropriate — 12:19:09 some people say that's like short-term potentiation, or something of that sort — but that you might need something extra to have it turn into LTP, 12:19:18 or something of that sort, which may or may not be the correct model. And then, going to the idea of how complex it could be: 12:19:25 one of the things that has been put forth in the theory, once again for perceptual learning, is that you have multiple stages of representations, and a lot of these can still be preattentive to some extent, and the complexity of what you might pick up might 12:19:45 depend upon your hierarchy of representations — and that might differ between the rodent and the human. 12:19:54 Where is the original, highest, 12:19:59 most relevant...? Do you have to...? 12:20:07 Yeah, and this is actually a point where Merav and I disagree, 12:20:13 so I'll kind of throw this out and see whether you're enticed to respond. 12:20:17 I've often tried to look at this from the standpoint that the earlier-stage representations show more limited amounts of plasticity — 12:20:31 so, slower learning rates — and so there are many contexts where you won't get a longer-term change 12:20:42 where you would in a higher area. Merav has put forth the idea of this reverse hierarchy, where, basically, you start with learning at the higher level, and that comes down. I think both are interesting frameworks; 12:20:57 I'm just curious whether you want to add something, Merav. 12:21:04 So, if I understand you correctly, what you are saying is that you propose that lower levels are less inclined to be plastic 12:21:13 than higher ones. 12:21:19 Okay. And the extent to which you'll observe learning across the system... 12:21:24 Well, in what way? Right?
I mean, this level of description is agreed on by both of us. 12:21:37 In what way is it dependent? So, I'll stand for my view, and then perhaps someone can stand for his. 12:21:45 We propose that when you are performing a task — so, whenever you do anything, whenever you're exposed to anything, every minute is a plastic minute — 12:21:57 when you perform a task, the initial stages that are engaged (this was in the context of perceptual tasks) are the higher levels, which means they are more plastic in the sense that they change faster, even though we don't disagree that lower levels change 12:22:18 too. And the advantage is that, in terms of previously learned representations, 12:22:26 the higher level is more general, typically, because higher levels are more abstract. 12:22:34 So what changes faster applies to a broader perceptual context; it is less specific to the environmental context. 12:22:45 So it would be specific to broader aspects, which are represented at higher levels, but it would generalize more across basic features, which are more abstractly represented at the higher level. 12:22:59 And then, if the task requires you to be more refined in terms of the basic features, then you would sort of seek lower-level representations, find the more informative input level, and then make changes there, which will allow you to perform better. But this is only possible 12:23:21 if you get a chance to actually access the more informative level. 12:23:27 So it has an impact on what the protocols are that would allow it. 12:23:31 Bottom-up-induced changes, basically, would be the outcome of adaptation and homeostasis processes that induce changes aimed at roughly retaining the level of activity. 12:23:50 And of course there's the big open question of how these interact. 12:23:58 And the reason why I have a different model actually goes back to the experiment 12:24:04 I briefly showed on day one, where I showed this example where you had a person with a tube in the mouth, 12:24:10 getting drops of water, while we showed different information to each eye. 12:24:15 And so, in that context, I was trying to create a circumstance where the stimulus that was being paired with reward was one that would not be, quote-unquote, seen by the higher-level representations. Because, basically, once you get past primary visual cortex, there are very few neurons that 12:24:42 respond to just one eye; most of them are binocular, and so their responses were dominated by the bright, flickering pattern we were showing to one of the eyes. 12:24:54 And so the argument — and there are caveats I could discuss with people later, of course, as with any study — 12:25:02 is that it was primarily in the early representations, such as perhaps V1, 12:25:07 where you still had a large number of cells responding to the stimulus that was shown to just one eye. 12:25:14 And those cells were responding at the same time as the drops of water — 12:25:20 a primary reinforcer — were being delivered to the system. So, in this case, the relationship between the neural activations and the reward was present in that set of monocular neurons, 12:25:34 probably in early representations, and unlikely to be present later in the system. 12:25:41 And, you know, it's outside of a task context.
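A generic sketch of the mechanism this kind of experiment is taken to imply — Hebbian plasticity gated by a diffuse reward broadcast, with no task and no supervision. The architecture and all constants are schematic stand-ins for illustration, not the actual study:

import numpy as np

# Reward-gated Hebbian learning: dW = eta * reward * post x pre. The reward
# signal (a neuromodulatory broadcast) targets no synapse in particular; it
# simply strengthens whatever correlations were active when it arrived.
rng = np.random.default_rng(1)
w = rng.normal(scale=0.1, size=(4, 4))         # weights from 4 inputs onto 4 cells
eta = 0.05

for step in range(200):
    pre = (rng.random(4) < 0.3).astype(float)  # current input (e.g., the monocular stimulus)
    post = w @ pre                             # postsynaptic activity
    r = 1.0 if step % 5 == 0 else 0.0          # drops of water on some presentations
    w += eta * r * np.outer(post, pre)         # only reward-coincident activity potentiates
    w /= max(1.0, float(np.linalg.norm(w)))    # crude normalization keeps weights bounded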
12:25:45 And so I think when you start looking at task contexts, Merav's model starts to explain a lot more of the data. 12:25:52 But I've spent a lot of my time trying to look at what learning you can get 12:25:56 outside a task context. And so this is where, when we first started talking today about reinforcement learning — to me, I think of reinforcement learning in the context of the study I just described, where you basically have a reward that leads to the release of 12:26:14 neuromodulators, coincident with activation 12:26:17 somewhere in the brain, and that changes the momentary plasticity of those cells and changes the representation. 12:26:36 So, because of the time constraints, I think it would be good to focus a little bit back on the original course, or at least clarify the original goals. 12:26:51 The title of this entire meeting was "The Neurobiology of Statistical Learning," or something along that line, right? 12:26:58 And so the basic concept was that we know that learning is complex, 12:27:02 we know that biology is complex, and we have this problem here that we would like to figure out: what next steps would you take in order to ask interesting questions, in a conceptual space about learning, and tie it back to the things that some of us are doing with animals and other kinds of things? 12:27:21 So, within that framework, if I understand correctly, today's topic on the menu was to clarify whether reinforcement learning and statistical learning — whatever that means — should be handled in one framework 12:27:40 or two frameworks or several frameworks. And I don't know yet. 12:27:45 The main goal was that, because if you answer this one, then hopefully we are going to be able to design better experiments, right? 12:27:53 I guess this is the hidden thing. So it seems like we 12:27:54 all have another 6 or 7 weeks just to clarify what statistical learning is, where it begins and ends, and so on. 12:28:00 So, rather, it would perhaps be a little bit more useful if we say that there are some things that we can accept — that these are delineated types of learning, or things that we unavoidably have to deal with — and then see, within that framework, in that light, what kinds of 12:28:23 experiments we would like to see from each other. 12:28:25 That was the original idea: that the theoreticians would say, okay, give me this kind of experiment; and, vice versa, 12:28:30 the experimenters would say, just give me your definition of statistical learning — something along that line. 12:28:36 So, in that sense, can we collect any kind of final conclusion about what we have learned today, other than that life is very complicated? 12:28:49 Yeah. 12:28:58 I guess, to offer a leading question off of Joseph's comment: 12:29:03 when designing a task — as a student, now, one thing I've been spending a lot of energy thinking about, for example, is: do I have to separate the two? 12:29:17 If I have a task that is trying to get at statistical learning, 12:29:24 and I have reinforcement learning components in that task, 12:29:26 is that a problem? Do people here think that's a problem? 12:29:32 That's one simple leading question I'm going to offer now. 12:29:38 No, I don't think it's a problem. I very much subscribe to the sort of thing that Matthew was talking about.
12:29:45 Whether you can cast statistical learning in terms of reinforcement learning all depends on what your state abstraction is. 12:29:50 And perhaps — sorry to say this one — maybe statistical learning is when you are able to generalize. 12:29:56 This is the generalization-versus-memorization trade-off. 12:29:59 So I can say I have learned some statistics when I can perform better on samples which were not in my empirical training distribution. And so, classic 12:30:08 RL systems, where you're given some grid world, etc. — 12:30:13 you can learn that one single RL task, 12:30:17 and that's just pure memorization. But perhaps, when you're put into a situation where you're allowed to learn some state abstraction — perhaps learning from many tasks — you can learn an interesting state abstraction, and that allows you to then generalize your values to some new situation. That would be 12:30:34 statistical learning within the RL framework as well. So, yeah, I roughly conform to the generalization-memorization trade-off, and in RL it's all about what state abstraction you learn. 12:30:49 Pardon? 12:30:59 I'm not going to say... I think that's one possibility. 12:31:02 That's one way, yeah. I think, in that case: can I disambiguate myself from memorization? — which, I think, many people would argue 12:31:12 isn't statistical learning. And the only way you can disambiguate yourself from memorization is: how well do I perform on samples 12:31:19 I have not seen before? 12:31:28 Thanks. But generalization, I would say, depends on — so, if you say that there's a representation formed in the brain that is perhaps more based on statistical learning, basically long-term integration of your experiences, then you have a map, and generalization, I would say, follows that map, right? 12:31:53 So how you generalize, then — which may be based on more reinforcement- 12:32:02 learning-type experiences on a shorter timescale — that generalization follows what you have, follows the statistics 12:32:11 you have basically integrated. So I think you cannot separate them: 12:32:16 this is only one, and this is the other. But suppose the... 12:32:23 Yeah, the same. 12:32:30 Different stages of the... 12:32:34 Yeah. 12:32:41 [inaudible] 12:32:47 [inaudible] 12:32:54 ...make sure that we do the same learning experience. 12:33:01 ...these maps can be at different hierarchical levels, right? 12:33:29 So you can imagine that there are different forms of plasticity happening at the level of a map, 12:33:35 but the semantics of the map may be very different, 12:33:38 right? So what you're saying is that learning 12:33:42 probably propagates through the different levels of the hierarchy, 12:33:45 and, yes, it will basically affect your representation in maps at the same time. 12:33:52 But, in terms of the implementing neuronal mechanisms, how you change the map can be similar in whatever sensory cortex 12:34:03 to how you change your map in prefrontal cortex. 12:34:05 However, the meaning you attach to your different representational items — 12:34:11 of course, they will be different, right? So at the sensory periphery it will be more... 12:34:18 How you read it out, then, later — this is also an important aspect: 12:34:24 what changes are you trying to assess?
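A toy illustration of the memorization-versus-generalization point raised at the start of this exchange; the feature map standing in for a "state abstraction" is invented for illustration:

# Tabular "memorization" versus generalization through a state abstraction.
states = [(x, y) for x in range(5) for y in range(5)]
goal = (4, 4)

def features(s):
    # Invented state abstraction: collapse 25 grid cells to distance-to-goal.
    return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

value_table = {}        # tabular agent: one entry per visited state
value_by_feature = {}   # abstracted agent: one entry per feature value

for s in states[:10]:                   # "train" on a subset of states only
    v = -features(s)                    # stand-in for a value learned from experience
    value_table[s] = v
    value_by_feature[features(s)] = v

unseen = (2, 2)                                # never visited during training
print(value_table.get(unseen))                 # None: pure memorization cannot generalize
print(value_by_feature.get(features(unseen)))  # -4: the abstraction transfers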
12:34:30 Then, after you have your subject, which has learned, and you try to extract information — maybe you can test more perceptual items. Based on this beautiful slide of the different cues: is there a little wiggle? 12:34:45 And then you can assess that — maybe subconsciously they will report it — or you can ask at a higher level: 12:34:52 did you understand the task? Right? And then they will answer whatever. 12:34:57 But it has happened at the same time at different hierarchical levels. 12:35:02 That's how I would see it, a little bit. Thank you. 12:35:06 We are 5 minutes into the lunch break, technically, now, 12:35:09 so maybe we should end it here. I would love to chat with a lot of you afterwards, so I hope that... yeah, that was mainly it. 12:35:18 You wanted to say something? Could I ask one question? 12:35:27 So, tonight is the wine and cheese, right? On that note: we still need your money. Yeah, 7:30 in the mangro... here, 7:30. 12:35:51 So, yeah, I would like to be there a little bit earlier. 12:35:55 And if anybody wants to check... continue charging. Yeah, we are collecting: Olivia here, Joseph there, and I here are collecting money — not from you, from the PIs. Yeah, and you can come to my apartment, cook, and then we take it to the... oh, the food. 12:36:14 Yeah, it's so... 12:36:23 Really exciting. I have said that before several times, but I am trying to do that. 12:36:28 The thing is that the event itself is at 7 o'clock, right, not 7:30. 12:36:38 The place is the same place where we have been before, and before, and before: in the municipality. 12:36:45 And, for those who were not there: this is to the southwest, if that helps. 12:36:51 Yes, opposite to the end. Yes. 12:36:59 So you should have somebody's telephone number, 12:37:00 so that they will let you in at the door. 12:37:05 Okay. The third one: when we say wine and cheese, 12:37:09 that means that eating is on your own — meaning that you can bring stuff with you and do a barbecue there, 12:37:17 or you can have your dinner beforehand; you can do all kinds of solutions. 12:37:20 The point is that there you will find some wine and some cheese, and that's it — 12:37:26 and some people, hopefully. So that's the overall structure. And then, collecting the money: it's a levy, as I told you; 12:37:33 it's covering previous, present, and future events, 12:37:38 until we run out of that money. And it's on the PIs, 12:37:42 and not on anyone else. 12:37:45 People were lured in here so we could get some money out of them. 12:37:48 Any more questions about this particular case? Not a bad statistic. 12:38:01 I just told you: that's right. If you want more than what you find there, bring your own stuff.