R programming: What is a List?

Kirill Eremenko
A free video tutorial from Kirill Eremenko
Data Scientist

Lecture description

In this lecture, you will learn what lists are in the R programming language and how to create them.

Learn more from the full course

R Programming: Advanced Analytics In R For Data Science

Take Your R & R Studio Skills To The Next Level. Data Analytics, Data Science, Statistical Analysis in Business, GGPlot2

05:53:11 of on-demand video • Updated April 2021

  • Perform Data Preparation in R
  • Identify missing records in dataframes
  • Locate missing data in your dataframes
  • Apply the Median Imputation method to replace missing records
  • Apply the Factual Analysis method to replace missing records
  • Understand how to use the which() function
  • Know how to reset the dataframe index
  • Work with the gsub() and sub() functions for replacing strings
  • Explain why NA is a third type of logical constant
  • Deal with date-times in R
  • Convert date-times into POSIXct time format
  • Create, use, append, modify, rename, access and subset Lists in R
  • Understand when to use [] and when to use [[]] or the $ sign when working with Lists
  • Create a timeseries plot in R
  • Understand how the Apply family of functions works
  • Recreate an apply statement with a for() loop
  • Use apply() when working with matrices
  • Use lapply() and sapply() when working with lists and vectors
  • Add your own functions into apply statements
  • Nest apply(), lapply() and sapply() functions within each other
  • Use the which.max() and which.min() functions
Hello and welcome back to the Advanced course in R programming. In today's tutorial we're finally proceeding to lists. So, question of the day: what is a list? Remember how, when we were talking about vectors, we said that a vector can only contain elements of the same type? Well, a list is a data object which is like a vector, but it can contain absolutely any type of element, so it can hold a mix of different types of elements. And today we're going to learn how to work with lists.

All right, so to start off, let's subset our data frame. We've got util, and as you recall, if I look at the summary for util, we've got machines, five different machines in here. According to our challenge, we only need to work with the machine which is called RL1. So let's go ahead and subset our data set: we'll say RL1 will be util, and then we'll add a filter, util dollar sign machine equals RL1, comma, nothing, so we get all the columns. Run that. And now if I look at the summary for RL1, you'll see that we only have machine RL1.

Now, the other machines still come up because we need to rerun the factor. We can always do that: we can just say RL1 dollar sign machine, and then here we'll say factor of that same column. This is not compulsory, but it will make things look better. So if we run that and look at the summary for RL1 again, you'll see that there's only that one factor level. So when you take a subset of a data frame, you do need to rerun the factor if you want to get rid of this legacy factor memory. There we go: we've only got one machine in this data frame, and that's great.

So now we can start constructing a list. Let's type that in: construct list. We remember what we want in the list, and we've got that information up at the top here. We wanted a character with the machine name. Then we've got some stats that we want. Then we want a logical: has the utilization ever fallen below a certain value?
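The subsetting and factor-reset steps described above can be sketched roughly as follows. The course's actual `util` data is not shown in the transcript, so the data frame here is a small hypothetical stand-in, and the column names `Machine` and `Utilization` are assumptions.

```r
# Hypothetical stand-in for the course's `util` data frame
# (column names and values are assumptions, not the real data)
util <- data.frame(
  Machine     = factor(c("RL1", "RL1", "RL2", "SR1", "RL1")),
  Utilization = c(NA, 0.92, 0.88, 0.97, 0.85)
)

# Keep only the rows for machine RL1; leaving the slot after the
# comma empty keeps all columns
RL1 <- util[util$Machine == "RL1", ]

# The subset still remembers the levels of the other machines
levels(RL1$Machine)

# Re-running factor() on the column drops the unused levels
RL1$Machine <- factor(RL1$Machine)
levels(RL1$Machine)
```

Base R also provides `droplevels()`, which is not used in the lecture but has the same effect on unused factor levels.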
And then we've got some other things that we want to put in there, but today we're just going to start with these three. So let's maybe copy them and add them in here. All right, let's go ahead and get started. The list is going to have a name, and we'll give it a very descriptive name: we'll call it list_RL1. And before we put anything in the list, let's calculate our stats. Actually, let's get rid of that for now.

Let's calculate our stats. Where you have the machine name, we know that it's RL1. What we want to calculate is these stats: the minimum, the average (or mean), and the maximum of utilization for the month. So let's go ahead and do it. This is a vector, and we're going to create it ourselves, so let's go do that: util_stats_RL1 is going to be a vector. And inside this vector we're going to put the minimum of RL1, that's our data set, and then we just need to address utilization. If we look at RL1 utilization, that is all of the utilization for this specific machine, since we've subset the data frame, and it's got some NAs at the start. So if we just run the minimum like this, we'll get NA. We need to remove the NAs from here, so just say na.rm equals T. That is our minimum function now. Actually, let's just copy that. That'll be handy.

And now into this vector we also want to add the median, or the mean actually, and the maximum. So there we go: we are looking at the minimum of all of these values, the mean, and the maximum. If we run that, we should get this vector. There we go. So that's our minimum utilization: you can see it's about 84.9 percent. 95.1 percent is our average utilization for this machine for the month, excluding hours where there is no information.
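As a minimal sketch of the stats vector described above, assuming a column called `Utilization` and made-up values (the real data is not in the transcript), the effect of `na.rm` looks like this. The name `util_stats_RL1` is a guess at the spelling used in the video.

```r
# Hypothetical utilization values for machine RL1, with NAs at the start
RL1 <- data.frame(Utilization = c(NA, NA, 0.85, 0.95, 0.99))

# Without na.rm, a single missing value makes the result NA
min(RL1$Utilization)

# na.rm = TRUE removes the NAs before the calculation
min(RL1$Utilization, na.rm = TRUE)

# Minimum, mean and maximum collected in one numeric vector
util_stats_RL1 <- c(
  min(RL1$Utilization,  na.rm = TRUE),
  mean(RL1$Utilization, na.rm = TRUE),
  max(RL1$Utilization,  na.rm = TRUE)
)
util_stats_RL1
```

The lecture writes `na.rm = T`; spelling out `TRUE` is slightly safer, because `T` is an ordinary variable that can be reassigned, whereas `TRUE` is a reserved word.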
And the maximum utilization was 99.5 percent. So it never reached 100: in no given hour was the utilization 100 percent.

All right, so now we want to create the flag, this flag over here: has utilization ever fallen below 90 percent? The answer is yes. We can already see that from here, because the minimum is below 90 percent, but we're still going to do it programmatically through R. So let's go ahead and do that: util_under_90. How do we do this? How do we find out if, in this whole vector, we have something under 90? Let's start by looking at the vector, or actually it's a column, which is actually a vector. So if we run this, there is the vector. How do we know if there's something under 90? Well, it's pretty simple: we just compare it to 0.90. So if we run this, we've got a vector of FALSE, FALSE, FALSE (let me make some space here), FALSE, FALSE, FALSE. Maybe somewhere there is a TRUE.

How do we find which ones actually are TRUE? How do we find if there's even at least one which is TRUE? Well, we're going to use the which function. If we put this in brackets, then it will tell us exactly which ones are TRUE. So if I run this, you can see it is telling us which elements of this vector are TRUE. And the good thing about which is that it completely ignores NAs, which is a great thing, because they can sometimes get in the way of your analysis. So it is basically telling us at which indices in this vector you've got TRUE values. Now what we want to do is check whether even one of these is TRUE. If there is even one element in this vector, then that means that utilization did fall under 90 percent, right? So we don't really mind which values are in here; what we need to know is whether there is at least one value. So how do you do that?
Well, you just check the length of this vector: length of which. I know it's getting a bit cumbersome, but it all makes sense, right? This is a vector of values; comparing it to 0.9, you get a logical vector; applying which tells us at which positions there are TRUEs in this vector; and then we're checking the length just to see if there's at least one. So the length is 27, meaning utilization fell below 90 percent 27 times. We don't really mind how many times utilization fell below 90 percent; we just want to check that it happened at least once.

And so how do you convert this to a logical value? Well, there's a couple of ways. You can ask: is this greater than zero? If it's greater than zero, that's TRUE, and that means that utilization has fallen below 90 percent. The other way you can do it is to use as.logical and apply that to the length. But I'll let you play around with that, because that's just the other approach. It basically converts zero into FALSE and anything above zero into TRUE, so it's pretty much the same result. What we're going to do is use this first approach, and we're going to place the result into our flag. If I run this line now and check it, we've got TRUE, meaning that utilization did fall under 90 percent.

I know that this might be a bit fast, and we might be going a bit quickly through things, but my expectation is that you are getting quite comfortable with R by now and that you have some of those basic skills. If it is going too fast, then feel free to rewind a little bit and watch some sections again. If you feel that you're really not keeping up, then it might be a good idea to check out the basic course once again, just to refresh some of that knowledge. But if you're keeping up fine, then that's great. Let's proceed onwards. All right, so we've got utilization under 90. This is actually a flag, so let's give it that name: flag.
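The chain of steps for the flag (compare, `which()`, `length()`, then convert to a logical) can be sketched as below. The values are made up for illustration, and the name `util_under_90_flag` is a guess at the spelling used in the video.

```r
# Hypothetical utilization values, including missing hours
utilization <- c(NA, 0.95, 0.88, 0.97, 0.85, NA)

# Element-wise comparison gives a logical vector (NA stays NA)
below <- utilization < 0.90
below

# which() returns the positions of the TRUE values and skips NAs
which(below)

# The number of hours below 90 percent
length(which(below))

# Approach used in the lecture: at least one hit means TRUE
util_under_90_flag <- length(which(below)) > 0
util_under_90_flag

# The other approach mentioned: zero becomes FALSE, any other count TRUE
as.logical(length(which(below)))

# Not from the lecture: any() with na.rm = TRUE answers the same question
any(below, na.rm = TRUE)
```

All three conversions agree here; the `length(which(...)) > 0` form is the one the lecture keeps.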
And now what we want is to start creating a list, because we know the machine name, we've got this stats vector, and we also know whether utilization has fallen under 90 percent: we've got that flag. So let's go ahead and create this list. To construct our list, we're going to say list_RL1, and here we're going to assign to it list. Surprise, surprise: the function to create a list is called list. It's very convenient, because it's easy to remember that way. So we're going to pass in a character, the name of the machine, then comma, util_stats_RL1, and then util_under_90_flag. If I run this, as you can see, the under-90 flag hasn't been created, because we forgot to run this line. Let's run that again and just check that it's there. It is TRUE.

So now if I run this, the list has been created: list_RL1. And that is what our list looks like. As you can see, it does look like a different structure. We haven't seen something like this before; it's printed out vertically. You've got double brackets around 1, and then you've got RL1. This index over here actually comes from the vector, and this one also comes from the vector, as you would normally see when printing out a vector. That's the same indexation; it's just telling you, when you go onto a new line, which element it's printing. And then you've got this logical value. So basically the indexation of the list, 1, 2 and 3, is in double brackets, as you can see.

It is very different: it's got a combination of things. In here we already have a character, we've got some doubles or numerics, and we've got a logical value. Very interesting structure. And we're going to learn how to work more with lists throughout the section. On that note, this is going to be the end for today. Hope you enjoyed this tutorial, and I look forward to seeing you next time. Until then, happy coding.
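To make the contrast between a vector and a list concrete, here is a small sketch that builds the three-element list from the lecture. The stats values are made up, and the object names are guesses at the spellings used in the video.

```r
# A vector forces all elements to one type: here everything becomes character
c("RL1", 0.85, TRUE)

# Hypothetical inputs standing in for the values computed in the lecture
util_stats_RL1     <- c(0.85, 0.93, 0.99)
util_under_90_flag <- TRUE

# A list keeps each element with its own type
list_RL1 <- list("RL1", util_stats_RL1, util_under_90_flag)

# Printing shows each list element under a double-bracket index: [[1]], [[2]], [[3]]
list_RL1

# str() gives a compact view of the structure
str(list_RL1)

# Each element keeps its own class
sapply(list_RL1, class)
```

The single-bracket `[1]` printed inside each element is the usual vector index, while the double-bracket numbers index the list itself.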