08:40:20 So, yeah, in the next 20 to 25 minutes or so, 08:40:25 I'm going to tell you something very useful for just about everyone, hopefully, which is how to dig in and get started with analyzing and using simulation data. 08:40:35 So, galaxy and cosmological simulation data. I put "IllustrisTNG" here, but actually I wanted to make the title (I can change the slide) slightly more general, because indeed 08:40:53 a lot of what I'll talk about is applicable to two or three simulations which are currently publicly available, so beyond just Illustris. 08:40:59 Okay, so I want to start with two pictures, which are fluff. The point here is important, though: in modern cosmological simulations we have an extraordinary amount of information. These images are showing, from the left 08:41:15 to the right, the dark matter, the gas density, the gas velocity, the stellar structure, the gas temperature, the gas metallicity, shocks in the gas, the magnetic field of the gas, and X-ray emission from the gas. This is just a sampling, on large scales, 08:41:31 so hundred-megaparsec scales from top to bottom, of the kinds of information content we have, which covers the dark matter, gas, stars and, not shown here, also the supermassive black holes. 08:41:44 Right. 08:41:45 It's good to keep in mind that when you go from large scales to small scales, you have simultaneously, in these large volumes, actual galaxies, highly resolved individual galaxies. So I'm showing you images from TNG50 just to emphasize that point. 08:42:02 This is a smattering of TNG50 redshift-zero stellar-light mock images of galaxies, showing you some of my favorites, showing you the incredible range of structure, all the way from the interiors of disks, at kind of kiloparsec 08:42:18 and even sub-kiloparsec scales, out to the CGM and halo, at the 200-kiloparsec level. 08:42:25 Okay. 
So, cosmological simulations: we're talking about (I think you can see my cursor; let me change it to a laser) IllustrisTNG, EAGLE, Horizon-AGN, Simba, these kinds of projects. 08:42:38 I like to refer to these as synthetic universes in a box. These are literally cubic boxes where hopefully we have a statistically representative galaxy population, and thus a representation of the universe. The volumes we're talking about here 08:42:54 these days are roughly 50 to, say, 300 comoving megaparsecs on a side, and these are the four main constituents that we have: the gas, represented as a fluid of course, and then the dark matter, stars and supermassive black holes, all represented 08:43:11 as N-body particles. 08:43:15 In a simulation like this, you're going to have roughly 100 snapshots, so a hundred snapshots between the initial condition, which is redshift 100 or so, and redshift zero. 08:43:25 Each one of these hundred snapshots is big, it can be a few terabytes each, and each one contains, roughly speaking, for gas, dark matter, stars and black holes, where they are in space and then a list of properties of those particles. 08:43:40 So these are the snapshots. Accompanying them are the catalogs: each snapshot has a catalog which tells you what collapsed structures we've actually found and identified at that point in time. 08:43:51 These are the dark matter halos, and then subhalos, also called galaxies. 08:43:57 The third piece of information we have is a merger tree which, although it does tell you about mergers, is simply the way that we link things in time. If you don't have the connectivity in time, 08:44:06 each of these hundred catalogs and hundred snapshots is fully independent, at different points in time throughout cosmic evolution. 08:44:14 Okay, so I want to make two quick points here. 
The first is about resolution. Resolution is always finite, always limited in the simulations. As we discussed in the past weeks, it's best in the interstellar medium at high density, and it always gets worse 08:44:28 towards lower-density gas, the CGM. 08:44:31 The second point is about the physical realism of the simulations. I'm simply saying that they're getting very good. This is not a science talk. 08:44:40 I just want to make the point that you can often go into these modern galaxy simulations and simply look at what happens, look at the outcome, without worrying too much about whether it is correct enough. Of course, it certainly could have issues, could have tensions 08:44:53 in certain regimes, but they're getting very good, the physical outcomes and the, 08:45:00 yeah, the properties of the galaxies and their environments. Okay, so what kinds of halos and what kinds of galaxies actually exist in these simulations? These are simply two histograms, of dark matter halo mass on the left and galaxy stellar 08:45:16 mass on the right, for three representative volumes: 50, 100 and 300 comoving megaparsecs. So if you pick your favorite halo mass, say Milky Ways, and you look up at what the number is here, this tells you roughly how many objects 08:45:30 of that mass exist in such a box, in a bin of 0.2 dex, so plus or minus 0.1 dex. 08:45:38 This drop-off of the massive halos to the right side of these figures is the natural result of limited volume; these are rare objects. 08:45:48 On the other hand, the drop-off to the left-hand side here is numerical. These turnovers represent the smallest objects which exist, which are identified in the simulation, so this is the absolute minimum where you can look at halos, or correspondingly 08:46:02 the galaxies, for a given resolution. 08:46:05 Okay, so this is my one table about the simulations, and it has quite a few important points. 
So the three different rows are three different classes of resolution. On the top is TNG50, high resolution, and we ... 08:46:36 ... number, there's not much going on. There's not much happening in the simulations; you can think of everything as essentially smooth below this scale. 08:46:44 Roughly speaking, that's 200 parsecs up to around a kiloparsec for lower-resolution simulations. The next column is also very important: this is the minimum galaxy mass, stellar mass, you would ever want to think about, that you would ever consider possibly resolved, 08:47:00 possibly of interest. 08:47:02 So if you say, I want maybe a minimum of 100 stellar particles, or 1,000 stellar particles, and maybe 10,000 dark matter particles in my halo, these are the kinds of numbers that you get: stellar mass greater than 10^7 to 10^8 with very high resolution 08:47:17 simulations, even up to 10^9 or 10^10, as the minimum stellar mass you probably want to consider in doing analysis. 08:47:27 And these last four columns give you some counts, roughly the number of objects you have in these kinds of volumes. Dwarfs: lots and lots and lots, never short on dwarfs. Milky Ways: say 100 in a small volume, 10,000 in a really big volume. Massive groups: 08:47:42 maybe 20 or so in TNG50, and thousands in TNG300. And big clusters: one, of course, in 50 megaparsecs, they're quite rare, and only a few hundred clusters above 10^14 in a big box. 08:47:57 Okay, pushing into the actual data of the simulation: what does it look like? For us, the data is in HDF5 files. If you like these, great; if you don't, don't be frightened. An HDF5 file is essentially the same as a FITS file, and in the end it doesn't 08:48:12 even matter, because we rarely touch the actual files. But if you go to a snapshot, snapshot 99 at redshift zero, and you say, let's list everything that's in that HDF5 file. 
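Listing what sits inside one such file can be sketched in a few lines of Python. This is a minimal sketch, not the project's own tooling: it assumes the `h5py` package is installed, and the file name `snap_099.0.hdf5` is a hypothetical placeholder for one downloaded piece of snapshot 99. The walker relies only on groups behaving like dictionaries, so it also runs on plain nested dicts.

```python
def walk(group, prefix=""):
    """Yield (path, shape) for every dataset below a dict-like group.

    Works on h5py.File / h5py.Group objects and on plain nested dicts,
    because both expose .items(); anything without .items() is a leaf.
    """
    for name, item in group.items():
        path = f"{prefix}/{name}"
        if hasattr(item, "items"):
            yield from walk(item, path)               # sub-group: recurse
        else:
            yield path, getattr(item, "shape", None)  # dataset: report its shape


def list_snapshot_file(filename):
    """Print every dataset in one HDF5 file.

    Needs h5py; the import is local so walk() has no third-party dependency.
    """
    import h5py
    with h5py.File(filename, "r") as f:
        for path, shape in walk(f):
            print(path, shape)


# Hypothetical usage on one downloaded piece of a snapshot:
# list_snapshot_file("snap_099.0.hdf5")
```

Groups that carry only attributes and no datasets (a header, for instance) produce no output from this walker; their metadata would have to be read from the group's attributes instead.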
08:48:23 This is what an actual snapshot of a simulation looks like. There's a bunch of metadata here at the top, and then we see PartType 0, 1, 3, 4, 5. These different types of particles correspond as follows: PartType0 is the gas, PartType1 is the dark matter, PartType3 08:48:40 are tracers, PartType4 is for the stars, and PartType5 are the supermassive black holes. 08:48:52 What does this mean? For every gas cell, PartType0, each of the 15 billion gas cells inside TNG300, we know the x, y, z position, so the three-vector of coordinates, and we know all these other properties of every single gas cell. Similarly for the dark 08:49:06 matter, the stars and the black holes. So this is what I call the snapshot. This is the particle-level data; it is quite large. Along with the snapshot we have the group catalog. The group catalog is simpler and has just two things. 08:49:21 It has what we call groups, which are halos, the dark matter halos, and we know a bunch of properties about each of them. And it has subhalos, both dark and luminous: a dark subhalo is a subhalo with no stars, and a luminous subhalo, which has 08:49:36 stars, is what I would call a galaxy. And again, we know lots of pre-computed properties about every halo and every galaxy in the simulation. 08:49:46 Okay. So given all that data, what can you actually compute about the gas, stars and black holes? The answer is a whole lot of stuff, basically everything you can possibly imagine. Some of it comes very directly from the simulation, like the temperature, 08:50:01 density and star formation rate of the gas. Other things require more post-processing or modeling effort on top of the simulation, for instance metal ions coming from Cloudy, X-ray emission coming from APEC, and so on. 08:50:15 Information about the stars, high-energy observables, radio, the shocks, things about the black holes, things about the dark matter: an extraordinary amount of information in these simulations. 
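As an example of a quantity that comes almost directly from the snapshot, the gas temperature can be derived from two per-cell values. The sketch below uses a commonly quoted conversion and should be checked against the data specifications before use. Its assumptions are: the inputs are the internal energy per unit mass in (km/s)^2 and the electron abundance per hydrogen nucleus, the adiabatic index is 5/3, and the hydrogen mass fraction is 0.76. The function name is hypothetical.

```python
BOLTZMANN_CGS = 1.380649e-16      # erg / K
PROTON_MASS_CGS = 1.67262192e-24  # g
GAMMA = 5.0 / 3.0                 # adiabatic index of a monatomic gas
X_H = 0.76                        # assumed hydrogen mass fraction


def gas_temperature(internal_energy_kms2, electron_abundance):
    """Temperature in kelvin from specific internal energy u [(km/s)^2]
    and electron abundance x_e (electrons per hydrogen nucleus)."""
    # Mean molecular weight, in grams, for a hydrogen/helium mixture.
    mu = 4.0 / (1.0 + 3.0 * X_H + 4.0 * X_H * electron_abundance) * PROTON_MASS_CGS
    u_cgs = internal_energy_kms2 * 1.0e10  # (km/s)^2 -> (cm/s)^2
    return (GAMMA - 1.0) * u_cgs * mu / BOLTZMANN_CGS
```

For the same internal energy, neutral gas (x_e = 0) comes out hotter than fully ionized gas, because its mean molecular weight is larger.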
08:50:26 And this is all, again, at the particle level. All this content about the gas and the stars and the black holes exists also in the context of what galaxy or what halo that matter is sitting within. 08:50:38 You can also, of course, correlate and look at the relations between all these properties of the particles and the gas as a function of the galaxies and the dark matter halos. 08:50:50 Okay. So, in terms of actually using the data, there are three ways. 08:50:56 I'm talking here about the data release platform for IllustrisTNG, which I'll show you in just a second. There are three fundamental ways to look at the data. 08:51:07 The first I call local data, local analysis. Local means on your machine. Here we're talking about actually downloading the raw data files to your system and working there to do the analysis. 08:51:20 The second option is remote data, local analysis. Here you leave all the data on the remote server, but you do the analysis on your own machine. Here we're using an online API, a way to interact with the server to get back small pieces of data. 08:51:36 The third option I call remote data, remote analysis: you leave all the data on the remote machine, and you also do all the analysis on the remote machine. 08:51:44 This is very nice. This is through a web-based JupyterLab or Jupyter notebook kind of interface. 08:51:50 Okay, so option number one, if you want to download all the data yourself. Who am I talking to here? You have your own cluster and you have your own storage; 08:51:58 this would be a requirement. If you want to do complex or slow analysis, you're thinking option one. There are lots of examples in this context, in these three languages: Python, MATLAB, IDL. 08:52:09 What about remote data, local analysis? If you've heard of REST APIs, this might be for you. 
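A request to a REST API of this kind can be prepared with the Python standard library alone. This is a sketch under stated assumptions rather than the documented client: the base URL, the `api-key` header name and the example path are assumptions to be checked against the API walkthrough on the website, and `MY_KEY` is a placeholder for a personal key.

```python
import json
import urllib.request

BASE_URL = "https://www.tng-project.org/api/"  # assumed base address


def build_request(path, api_key):
    """Prepare (but do not send) a GET request for one API endpoint."""
    url = BASE_URL + path.lstrip("/")
    return urllib.request.Request(url, headers={"api-key": api_key})


def get_json(path, api_key):
    """Send the request and decode the JSON reply (needs network access)."""
    with urllib.request.urlopen(build_request(path, api_key)) as reply:
        return json.load(reply)


# Hypothetical usage: metadata for one snapshot of one simulation.
# info = get_json("TNG100-1/snapshots/99/", "MY_KEY")
```

Keeping the request construction separate from the network call makes it possible to inspect the URL and headers before anything is sent.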
You can do searches, you can make queries online, you can download small pieces of data, pieces of catalogs, and do all the analysis and plotting on your laptop. 08:52:25 This is option two. Option three is actually working on the remote web-based platform. This is great for quick exploration, it's great for full analysis of the particle data without downloading anything, and it's great for prototyping and developing your 08:52:40 analysis scripts and ... 08:56:34 Um, I was simply explaining here these three different ways to basically analyze, so download or not, and analyze the data. And then I was jumping immediately on; I'm going to show you the actual website and give you a bit of a walkthrough, hopefully, 08:56:51 if it works, and show you how this works a bit more in practice. So I'm going to actually quit the slides and open up a web browser. You should probably be seeing our beautiful 08:57:07 website here; I type it in, and this is where you land. 08:57:12 If you click on Data Access on the top bar here, you get to the data page for these simulations. 08:57:22 What you see here on the upper left is Getting Started; this explains exactly what I just explained, the three ways to look at the data, to analyze the data. 08:57:31 Here on the lower left is documentation; this is quite important, and I'll show you a bit of this. On the upper right, we have a few tools for exploration and quick looks at the data, and then on the lower right there's a discussion forum for questions, an 08:57:43 FAQ, and so on. 08:57:47 Let's start with actually downloading the data. If you go down on this page, what you see here is an actual table of the simulations which exist, which are available. 08:57:57 Let me hide the dark-matter-only versions, so these are only the baryonic simulations that have gas in them. 
And you see the different families: you see the TNG100 simulations, -1, -2, -3; these are the three different 08:58:11 resolution levels, so going down in resolution. 08:58:17 There's TNG300, the large volume; TNG50, the small volume; here's the original Illustris simulation, and EAGLE sitting at the bottom. 08:58:24 So for each simulation you have just a little bit of metadata: you have the size of the volume itself, you have the number of dark matter particles, some resolution statistics in terms of the mass of a dark matter particle and the mass of the gas, the number of snapshots, 08:58:39 and the number of subhalos, which is a proxy for the number of galaxies, say, in the simulation at redshift zero. 08:58:45 If you go into one of these pages, so I'll just click on TNG100-1 and open it in a new tab, where you get to at the top is just a list of metadata about this simulation: how it was run, what the box size is, what the cosmology was, and so on. 08:59:01 And if you scroll down, you get to a table of the actual snapshots, starting with snapshot zero at redshift 20 and going down to the bottom, where the last snapshot is snapshot 99 at redshift zero. Again, all these numbers are just some statistics about 08:59:20 each snapshot: where they are, how many particles and how many galaxies exist at that point in time. And these links: if you click on Download Snapshot, you get an offer here for wget commands, which, if you type them into a console, will start to 08:59:35 download, as it says, unfortunately, 500 gigabytes' worth of data to get that snapshot. So you can go ahead and download that onto your cluster for analysis. 
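Before starting a download of that size, it can help to estimate what it involves. The arithmetic below is a rough sketch: the 500 gigabyte total is the figure quoted on the page, while the piece size of 1.2 gigabytes and the bandwidth of 50 MB/s are assumed example values, and the function name is hypothetical.

```python
import math


def download_plan(total_gb, piece_gb, bandwidth_mb_per_s):
    """Return (number of pieces, hours needed) for a snapshot download."""
    n_pieces = math.ceil(total_gb / piece_gb)
    seconds = total_gb * 1000.0 / bandwidth_mb_per_s  # 1 GB taken as 1000 MB
    return n_pieces, seconds / 3600.0


pieces, hours = download_plan(total_gb=500, piece_gb=1.2, bandwidth_mb_per_s=50)
print(f"{pieces} pieces, about {hours:.1f} hours")
```

An estimate like this is mainly useful for deciding between downloading a full snapshot and requesting only the small pieces of data that an analysis actually needs.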
08:59:47 And what you're actually doing there, as it says, is that the snapshot is split into several files. If you go and click on this link, you can see indeed that a snapshot is simply a collection of a bunch of smaller files, and you can download them. Click on one 09:00:01 and you'll download, even in the web browser, that particular piece of the snapshot, 1.2 gigabytes. 09:00:12 Okay, so that's actually downloading data directly. Back here, let me go through in about two minutes the documentation, which is quite important. So, link one, background and important details: as it says, it's important details. It's just 09:00:27 an overview of the simulation. In the context of TNG, it presents the three boxes and what each of them are, what their characteristics are. 09:00:38 It talks a bit about the resolution in these three simulations, and then this table again gives you a complete listing of all the IllustrisTNG simulations: on the top are the baryonic runs, and on the bottom here are the 09:00:55 dark-matter-only analogs for each of those runs, and again statistics about each. 09:01:00 I'm back here at the top. 09:01:02 Scientific remarks and cautions: this is quite important for anyone who's interested in the data. This discusses observational tensions with the simulations, and it discusses numerical considerations which are important to know when you start to 09:01:15 dig into the data. 09:01:17 Okay, so that's the background and important details. The next piece of documentation is the data specifications; again, this is quite important. This tells you every possible piece of information you could need to know about the data: 09:01:27 about the snapshots here at the top, then about the catalogs, about the merger trees and so on. For instance, if we want to understand what is in the snapshots with respect to the gas, we can go down to PartType0, the gas, and here we have a listing 09:01:41 of every field that we know about for every gas cell. 
09:01:45 So its name, the units (units have to be taken into account very carefully when you're converting any of these fields into a physical quantity) and then a description of the field itself. 09:01:59 Okay, so let me go back to the top. 09:02:02 As with the particle data and snapshots, we also have details about the catalog objects: the dark matter halos here, and the subhalos, or galaxies. If I go down to the subhalos, for instance, again there's a listing of the properties, every property we 09:02:16 know, already computed, about subhalos, with their units and descriptions. For instance, the center of mass, an x, y, z position; or, scrolling down, the total mass of the subhalo, just a number in units of 10^10 solar masses over h; or, say, the star 09:02:34 formation rate of the subhalo in solar masses per year. Just to give three quick examples. 09:02:41 Okay, so those are the two most important pieces of documentation: the background, and the data specifications. Then we have two walkthroughs here. The first shows, if you download the data to your local machine, how to actually load it and do some very simple 09:02:58 analysis. The second walkthrough here, of the web-based API, is showing you option number two. 09:03:08 If you don't want to download any data, and instead you want to make remote requests and do the analysis locally, this is the second way of working with the data. I'm not going to jump into these right now, but I'll just open it very quickly 09:03:23 to show you that what it is is essentially a step-by-step walkthrough in Python or IDL, as you like, 09:03:33 going through the steps of how you would use this service from the very beginning. So it's quite easy to follow and walk through in, say, half an hour. 09:03:41 Okay. 09:03:45 The forum here is filled with questions and people asking what is going on, so I encourage you to also post your questions, or things that are not working, 09:03:57 and we try to answer them. 
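The unit handling mentioned for the catalog fields can be sketched as follows. The value h = 0.6774 for the Hubble parameter is an assumption here (it is better read from the simulation metadata than hard-coded), and the function name is hypothetical.

```python
def subhalo_mass_in_msun(code_mass, h=0.6774):
    """Convert a catalog mass given in units of 1e10 Msun/h to solar masses."""
    return code_mass * 1.0e10 / h


# A catalog value of 100 corresponds to roughly 1.5e12 solar masses for this h.
print(f"{subhalo_mass_in_msun(100.0):.3e}")
```

A star formation rate already listed in solar masses per year needs no such factor, which is one reason to check the units column field by field rather than applying one conversion everywhere.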
09:03:58 And an FAQ, of course, is always good to check. 09:04:02 Let me focus here on the upper right. These are a few very important things. The top is option three: this is the web-based interface for analyzing the data remotely. I'm going to jump into that in just a second. 09:04:15 First I want to show you these three other quick tools, which are all just ways to make quick explorations of the data. Let me open the first one here; it's called Search the Catalogs. 09:04:27 What this page does is simply allow you to select the simulation, select a snapshot or redshift, so redshift zero, and then make a search over the catalogs. Right now I've searched on the mass of the halo with these two bounds and I've 09:04:43 gotten this list of objects, which you can see below if I scroll down. There's a whole lot of, sorry, 43,000 matches, so subhalos which match that particular search. And if I wanted to make some other kind of search on a property we know, for instance 09:04:58 the star formation rate, I could now restrict this further to between five and eight solar masses per year, hit search again, and get a listing of the objects which satisfy that criterion. There are actually only six which satisfy this combined restriction on mass 09:05:11 and star formation. So it's a quick way to see what is in the catalog. 09:05:29 Okay, let me go back to the main page. The second tool here, it says Plot Catalogs, is very similar. It allows you to take a very quick look at 09:05:25 relations between properties of halos or galaxies. So again, first pick the simulation, TNG100, and pick the redshift or the snapshot, and then there's a whole bunch of options. If you just leave that all alone and 09:05:38 hit Request Plot, you get a default. 09:05:40 What this default is, is the specific star formation rate versus stellar mass for every central galaxy in the simulation at redshift zero. This is the TNG100 result. 
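The same kind of combined cut on mass and star formation rate can be reproduced offline once catalog values are in hand. The sketch below is plain Python with made-up numbers; the field names `mass` and `sfr` and the bounds are hypothetical stand-ins for whatever is typed into the search form.

```python
def select(rows, **bounds):
    """Keep rows whose fields fall inside every (low, high) bound, inclusive."""
    return [
        row for row in rows
        if all(low <= row[field] <= high for field, (low, high) in bounds.items())
    ]


# Made-up catalog entries: total mass in solar masses, SFR in Msun/yr.
catalog = [
    {"id": 0, "mass": 2.0e12, "sfr": 6.5},
    {"id": 1, "mass": 1.5e12, "sfr": 0.0},
    {"id": 2, "mass": 8.0e10, "sfr": 7.0},
    {"id": 3, "mass": 1.2e12, "sfr": 5.2},
]

by_mass = select(catalog, mass=(1.0e12, 3.0e12))
by_mass_and_sfr = select(catalog, mass=(1.0e12, 3.0e12), sfr=(5.0, 8.0))
print([row["id"] for row in by_mass_and_sfr])
```

Adding a second bound only ever shrinks the selection, which mirrors how the match count drops when the star formation rate restriction is added to the mass restriction.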
09:05:51 Right, so we can change things; we can plot all kinds of things about galaxies. If you hover over one of these possibilities for what you want on the x-axis, you'll get a little description here, for instance of the galaxy gas fraction. So we can 09:06:07 leave the x-axis on the stellar mass, which I like, and say let's look at the mass-metallicity relation, so let's put the gas metallicity on the y-axis and request the plot. 09:06:17 And there is the mass-metallicity relation for the simulation: gas-phase metallicity as a function of stellar mass. 09:06:34 One other very nice tool here: right now the color of this plot represents the number of galaxies, it's just showing you how many galaxies are on this plane of metallicity versus stellar mass. But instead we can assign 09:06:43 a color to every pixel based on the values of the galaxies which are there. We could, for instance, pick the gas fraction, so let's color the plot by the gas fraction of the galaxies, and you see that the color bar here has changed now to be the gas fraction. 09:06:58 And there are higher gas fractions at lower masses, and low gas fractions for quenched galaxies. 09:07:05 A very quick exploration of what these catalog data show. 09:07:09 Okay, so that's enough of that. The last quick little tool here for exploring is this thing called Visualize. It's very similar: pick the simulation, pick the object, either in terms of the subhalo number or the halo number, and then there are lots of options; again, 09:07:24 leave them all alone. 09:07:33 Hit Visualize and you get the default, which is a picture. Perhaps, if my internet works or not. 09:07:40 Typically, a picture. 09:07:43 Oh, halo 324 10 101. 09:07:49 That's quite impressively not working. Too bad. 09:07:52 Try again, refresh. 09:07:59 Very strange. 09:08:02 We actually have this block. 09:08:10 Okay, so there it is. This is a rendered image of this particular halo. 
So the circle is the virial radius, and here's the color bar at the bottom: gas column density. 09:08:14 It's made on the fly, when I demanded it; that's why it's a bit slow. We can change what we're looking at, and there's a huge number of things that we can look at. Say we're looking at gas column density; you could change it to the stars and look at 09:08:29 the stellar column density, which is this image instead, or any other property of the particles we know about. 09:08:36 And there's an extraordinary number here, including many things you can look at later, for any object, any halo or any galaxy in any simulation. 09:08:48 So it's a very powerful way to visualize sample objects which you found interesting. 09:08:54 Okay. So in the last, I don't know, five minutes or so, hopefully, I want to show you the last tool here, which is the JupyterLab interface. I'll click on it and you get to a page that looks like this. 09:09:08 If you scroll down, there'll be at the bottom here a button which says Request Access, probably, if you have never done this before; or, if you've already pushed that button and we've granted access, it will say Launch. 09:09:21 Now, if I click on that, again if things work, you should see a screen which looks something like this. Okay, so what have we just done? We've launched a small little compute system, 09:09:33 a remote compute instance on a server in Germany, where there is a copy of all of these simulations. We can now run code here, and load and analyze the data, which looks as if it's sitting there locally on this computer. 09:09:52 If you've ever used JupyterLab before, you're familiar with this interface. 09:09:56 If not, you can also start a normal Jupyter notebook in this context. The reason I like this is because we can start a terminal here; we can use a normal Linux terminal. 
09:10:07 At the same time we can open up, say, a Python script, and we can flip it over to the right-hand side, and we can start up Python in the terminal, 09:10:17 and we can develop and edit our script while at the same time running things in Python. Personally, this is how I work, and this is why I really like this interface, because you can directly edit files in a text editor on the right and use 09:10:32 a terminal in the middle. And, yes, it's all again running remotely on the server where the data is. So let me list the directory here; what's actually in this directory? 09:10:45 There's an examples folder, there are some scripts, these are the helper scripts to load the data, 09:10:55 and there are three folders, called sims.illustris, sims.other and sims.TNG. So what's in those? Inside sims.illustris is literally the three old Illustris simulations and the dark-matter-only analogs; similarly in sims.TNG 09:11:09 are the actual IllustrisTNG simulations. If we go into one of these folders, say TNG100-1, there's an output directory, there's a postprocessing directory, and there's something called simulation.hdf5. Going into the output directory, say, here 09:11:21 we see a hundred folders called groups, these are the group catalogs, and a hundred folders called snap, the snapshots; initial conditions, subboxes, I won't get into that. Again, say in the groups of snapshot zero, we have just a literal dump of all the HDF5 files 09:11:40 of the simulation. So this is literally looking at the simulation data as it exists, as we use it all the time, every day. 09:11:47 Okay, so let me show you very quickly 09:11:50 an example of analysis. I'm just going to go in here into the examples folder, and if you do that you'll see a tutorial file, which looks like this. This is a walkthrough, a very short tutorial of loading and analyzing data on this platform. 09:12:06 So I'm going to go super fast, of course, just to give you a flavor. 
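The directory layout just described can be turned into file paths programmatically. This is a sketch: the base path, the `groups_NNN` and `snapdir_NNN` folder names, and the `fof_subhalo_tab_NNN.C.hdf5` and `snap_NNN.C.hdf5` file names are assumptions about the naming convention, to be checked against the actual directory listing.

```python
from pathlib import PurePosixPath

BASE = PurePosixPath("sims.TNG/TNG100-1/output")  # assumed location


def groupcat_file(snap, chunk=0, base=BASE):
    """Path of one piece of the group catalog for a snapshot number."""
    return base / f"groups_{snap:03d}" / f"fof_subhalo_tab_{snap:03d}.{chunk}.hdf5"


def snapshot_file(snap, chunk=0, base=BASE):
    """Path of one piece of the particle snapshot for a snapshot number."""
    return base / f"snapdir_{snap:03d}" / f"snap_{snap:03d}.{chunk}.hdf5"


print(groupcat_file(99))
print(snapshot_file(0, chunk=3))
```

In practice the helper scripts hide these paths, so a sketch like this is mostly useful for understanding what those scripts open behind the scenes.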
But we're importing our usual modules, and we define which simulation we're interested in looking at just by setting the path of the TNG100-1 simulation. 09:12:19 And then, for instance, we can load the catalog. This command is loading all the subhalos in the catalog, and it's loading the mass and the star formation rate of the subhalos. 09:12:30 If you type this command here you're literally loading, again, the local data, because this compute instance is running on the cluster where the data is. 09:12:39 And I go through making a quick plot of star formation rate versus mass of the halo; that looks a bit funny. I get down to star formation rate versus galaxy stellar mass; it looks a bit better. 09:12:49 And this is the star-forming main sequence of the simulation. 09:12:53 I keep going; I point out some cautions about these star formation rate equals zero galaxies, so absolutely quenched galaxies: don't forget about them. And then we make a stellar mass function. 09:13:04 It's very simple, this is just maybe 20 lines of code, and we make the stellar mass function of this simulation, just a histogram of galaxy stellar mass. 09:13:14 I go a bit further here. So that was the catalogs, and then we look very quickly at the merger trees. The merger trees again give you information as a function of time. 09:13:23 I just use this loadTree function to load, for a few random objects, these five random objects, their mass history, so the mass of that object as a function of time. 09:13:35 This is how they were assembled. For instance, you could use this to make the star formation histories of these objects. 09:13:42 And then I go on to the actual particle data. Again, this very simple command, loadSubset, loads the actual gas, stars or black holes from the simulation, 09:13:52 and I make a picture. 
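The stellar mass function step can be sketched without any simulation data at hand. The example is a stand-in, not the tutorial's own code: the masses are made-up values in solar masses, the box size and bin edges are arbitrary, and only the standard library is used.

```python
import math


def stellar_mass_function(mstar_msun, box_mpc, log_min=8.0, log_max=12.0, bin_dex=1.0):
    """Number density of galaxies per dex of stellar mass.

    Returns (bin centers in log10 Msun, phi in Mpc^-3 dex^-1).
    Entries with zero stellar mass (dark subhalos) are skipped.
    """
    n_bins = int(round((log_max - log_min) / bin_dex))
    counts = [0] * n_bins
    for mass in mstar_msun:
        if mass <= 0.0:
            continue  # no stars, so no defined log mass
        index = int((math.log10(mass) - log_min) // bin_dex)
        if 0 <= index < n_bins:
            counts[index] += 1
    volume = box_mpc ** 3
    centers = [log_min + (i + 0.5) * bin_dex for i in range(n_bins)]
    phi = [count / (volume * bin_dex) for count in counts]
    return centers, phi


# Made-up stellar masses for four subhalos in a 10 Mpc box.
centers, phi = stellar_mass_function([1.5e9, 2.0e9, 5.0e10, 0.0], box_mpc=10.0)
for center, value in zip(centers, phi):
    print(center, value)
```

The explicit skip of zero-mass entries is the same kind of care the caution about zero star formation rates calls for: values that vanish on a logarithmic axis have to be handled deliberately rather than dropped by accident.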
09:13:54 Just about 15 lines of code, of the entire box. You see, I just made a two-dimensional histogram of all of the gas mass in the simulation, so the label here is gas mass surface density, and you see the cosmic web, the large-scale structure of this 09:14:11 TNG100 simulation. 09:14:13 The final thing down here, I'm again loading particles, but only for one specific object, some random halo. I load the temperature of the gas and I just make a histogram. 09:14:25 So this is a maybe reasonable temperature histogram of the gas in one halo, and then I make a picture of it. This is again just a histogram of the temperature, showing you roughly, as a function of x and y, how the temperature looks in such a halo. 09:14:41 Okay, so that was extraordinarily fast, of course, and it's just to give a flavor of all the analysis that you can do when you're running in this online platform, which is, as I said before, a really good starting point for prototyping or doing 09:14:56 some initial looks, maybe then downloading data later. And the last comment I will make, I think, if I just go back to the main web page: 09:15:08 as you're exploring here, as you're using the Lab, as you're maybe downloading actual data, these three methods you can use all together. You can use them as you like, in the ways which are most comfortable to you, to dig into the data, to analyze 09:15:23 it remotely, to download pieces of it, to be plotting on the Lab interface or plotting on your local system. And, yeah, that's what I want to say. Of course, there are a lot of details, 09:15:39 a lot of documentation to wrap your head around, but on the other hand, these getting-started walkthrough tutorials are very simple. So I would encourage you to just jump in and give them a try if you have even 10 or 20 minutes to start to play. 09:15:54 Thank you. 09:15:56 Thank you, Dylan. That was amazing. What an amazing resource for the community. 
I mean, I have no idea how you support all of this. I know you shout out a couple of organizations on the bottom of the page, but this just completely blew my mind as 09:16:14 to the quality of the website and the interfaces. Wow, your collaboration has just done such amazing work making all of these data sets publicly available, so congratulations. 09:16:30 Okay, so if you want to explore these public data sets more, there's a breakout room for it. 09:16:40 It's called the TNG room, and you can go ahead and join. Dylan, go ahead and join that room, you'll moderate your own discussion. And then we are going to get started with the IFU tutorial by Nicholas Taleb, who I see is here, in four minutes. 09:17:00 So you have a four-minute break, and Dylan, I would recommend actually starting your breakout room in four minutes too; it's nice to give people a little break. 09:17:08 Okay, so Nicholas, if you want to start sharing your screen or get set up, you're absolutely welcome to do that, and everybody else, come back 09:17:19 in three minutes for the start of the IFU tutorial. And thank you, Dylan, again for that amazing tutorial.