Posts

Showing posts from April, 2021

CHAPTER 9

To get you started, we'll go through four popular statistical tests: chi-squared, t-test, correlation, and regression, as well as a method for extracting the data you need. My new Tauntaun dataset must be loaded into R for all of the analyses in this exercise. Remember that we're working with data that has been "super-cleaned, combined, etc." and is saved as an .RDS file. Our first test, the chi-squared test, is a straightforward method for analyzing categorical data with any number of categories. For example, you can use this test to see if Tauntaun fur length (short, medium, long) and fur color (grey, white) are related to (or dependent on) one another. That is to say, the probability of a specific Tauntaun fur color depends on the length of its fur; and vice versa, the likelihood of a particular fur length depends on its fur color. Now, the purpose of the chi-squared test is to determine if fur length and fur color are INDEPENDENT of each other. The chi-squared test ...
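As a rough illustration of the chi-squared test described in this excerpt, here is a minimal sketch in base R. The real Tauntaun dataset is not reproduced here, so the data frame `tauntauns`, its column names `fur` and `color`, and the randomly generated values are all assumptions made for the example.

```r
# Hypothetical stand-in data: 200 simulated animals with made-up column names.
set.seed(42)
tauntauns <- data.frame(
  fur   = sample(c("short", "medium", "long"), size = 200, replace = TRUE),
  color = sample(c("grey", "white"), size = 200, replace = TRUE)
)

# Cross-tabulate the two categorical variables (3 fur lengths x 2 colors).
fur_by_color <- table(tauntauns$fur, tauntauns$color)
fur_by_color

# Chi-squared test of independence on the contingency table.
result <- chisq.test(fur_by_color)
result$p.value
```

A small p-value would suggest that fur length and fur color are not independent. Because these values are simulated independently of each other, no particular result should be read into the output.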

CHAPTER 7 AND 8

The clean harvest: the first Rdata file contains information about all the Tauntauns that have been harvested, while the second contains information about the hunters. A column named hunter.id appears in both datasets; it is the unique number that identifies each hunter, so it tells us who harvested which animal. We'll use the merge function to combine these two data frames into a single data frame. A "left join" (lower left example) keeps all records in the left table and only those records in the right table that match the left table. NA is used to fill in any missing values. A "right join" (lower right example) keeps all records in the right table and only those records in the left table that match the right table. If we use the order function, we can sort by multiple columns. Each additional sort vector is entered as an argument of the order function, separated by commas. The order function returns a vector of indices...
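The joins and multi-column sort described in this excerpt can be sketched with base R's merge and order functions. The two small data frames below are made up for illustration; only the hunter.id column name comes from the post, while `age`, `town`, and all of the values are assumptions.

```r
# Hypothetical harvest records (left table); hunter 9 has no hunter record.
harvest <- data.frame(
  hunter.id = c(3, 1, 2, 1, 9),
  age       = c(2, 5, 3, 1, 4)
)

# Hypothetical hunter records (right table); hunter 4 harvested nothing.
hunters <- data.frame(
  hunter.id = c(1, 2, 3, 4),
  town      = c("Echo", "Hoth", "Echo", "Hoth")
)

# Left join: keep every harvest row; unmatched hunter columns become NA.
left <- merge(harvest, hunters, by = "hunter.id", all.x = TRUE)

# Right join: keep every hunter row; unmatched harvest columns become NA.
right <- merge(harvest, hunters, by = "hunter.id", all.y = TRUE)

# Sort by town first, then by age within each town.
# order() returns row indices, which are used to reorder the data frame.
sorted <- left[order(left$town, left$age), ]
sorted
```

By default, order places NA values last, so the harvest row with no matching hunter ends up at the bottom of `sorted`.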

CHAPTER 5 AND 6

The final object type we'll discuss is a data frame, which is a collection of vectors with the same length. In R, data frames are used to store most datasets, and we'll get a lot of experience with them in the following chapters. Let's put the three vectors, fiveIntegers, fiveBooleans, and fiveRandoms, into a data frame with the data.frame function. We'll start a new project named Tauntauns here. This project will house all of the materials related to our Tauntaun population study. An absolute file path contains your computer's root directory and all other subdirectories that contain a file or folder. In Windows the root directory is usually the C drive; in Linux and macOS the root directory is a forward slash /, and a tilde ~ is shorthand for your home directory. A relative file path locates your file relative to your working directory, which should be your Tauntaun project directory (one of the benefits of using projects is that the working directory is automatically set upon opening). The backslash key is...
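To make the data frame step concrete, here is a minimal sketch using base R's data.frame function. The vector names come from the post, but their contents are not shown in this excerpt, so the values below and the object name `fiveThings` are assumptions.

```r
# Assumed contents for the three vectors; each has length 5.
fiveIntegers <- 1:5
fiveBooleans <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
set.seed(1)
fiveRandoms <- runif(5)  # five random numbers between 0 and 1

# Combine the equal-length vectors into one data frame;
# each vector becomes a column named after the vector.
fiveThings <- data.frame(fiveIntegers, fiveBooleans, fiveRandoms)

# Inspect the structure: one row per element, one column per vector.
str(fiveThings)
```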

CHAPTER 4

All objects we make must be stored somewhere in R. The global environment is home to our objects, fiveNumbers, outcome, and lowNumbers. Package functions are all contained within their own package environment. The sample function takes four arguments: x is the vector to sample from, size is the number of samples to take, replace is whether you want to sample with replacement (which means the same numbers can be drawn repeatedly), and prob is used to weight your samples such that some have a better probability of being chosen than others. One of the most common coding errors in R, in our experience, is attempting to execute a function on objects of a given class when R is expecting something else. When it comes to R functions, the str function is one of your best friends; it gives you not only the class but also a look at the values. A vector is a single data row or column. The cells A1:A3 are highlighted, and this group of cells is labeled mixed (see the upper left of the figure). This specifi...
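The four arguments of sample, and the use of class and str to check what came back, can be sketched as follows. The object name fiveNumbers comes from the post; the range 1:10, the weights, and the seed are assumptions chosen for illustration.

```r
set.seed(10)  # assumed seed, only to make the draw repeatable

# Draw 5 values from 1:10 with replacement,
# weighting 6 through 10 three times as heavily as 1 through 5.
fiveNumbers <- sample(
  x       = 1:10,
  size    = 5,
  replace = TRUE,
  prob    = c(rep(0.05, 5), rep(0.15, 5))
)

# Check what kind of object we got before passing it to other functions.
class(fiveNumbers)
str(fiveNumbers)
```

Calling str before handing an object to another function is a quick way to catch the class mismatches mentioned above.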

CHAPTER 3

This portion should not be overlooked. The purpose of a helpfile's Examples section is to provide example code that demonstrates how to use a function, beginning with the creation of some sample data and then demonstrating how to use the function on the sample data. By copying and pasting the code into the Console and then submitting it, you will learn a lot. We generated an object called "xx" in line 1 of this code, which holds the integers -9 to 9 in increments of 1. Another way to make a sequence in R is to use the colon operator (:). Simply type xx and submit it to the R console if you want to look at it. This code, starting on line 2, is typical R helpfile code, which nests many functions together in short, concise code. Expert coders tend to keep their code as short as possible... it's succinct, simple to understand, and everyone else who uses it won't have to wade through many lines of code to get to the result. As long as you aren't a fledgling, all of this is per...
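The two ways of building the sequence mentioned in this excerpt can be compared in a short sketch. The helpfile code itself is not shown here, so the seq call below is an assumption about how xx might be created, and `yy` is a hypothetical name used only for the comparison.

```r
# One way: seq() with explicit start, end, and step.
xx <- seq(from = -9, to = 9, by = 1)

# Another way: the colon operator.
yy <- -9:9

# Look at the object by submitting its name.
xx

# Both approaches contain the same values.
all(xx == yy)
```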

CHAPTER 2

We'll take a short tour of RStudio in this chapter. We will be writing the first R script along the way. The RStudio developers have excellent documentation on their website, which you can explore at your leisure. Rather than going over each and every option, the aim here is to give us a broad overview. You can build new folders (directories) on the Files tab. You can also rename, delete, and move files on your computer. You can navigate to any location on your computer using RStudio. When R runs, the program needs to know where to look for inputs and outputs, so it will look in a directory called a "working directory" first. The getwd function, which can be entered in your console or in your script, can be used to locate your working directory. R suggests that you use the double question mark technique, which will be ??meatloaf, since there is no documentation for meatloaf. A keyword search is launched as a result of this. The double question mark is simply a feat...
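Locating the working directory takes a single call in base R. In this short sketch, the object name `wd` is a hypothetical choice, and the path printed will differ from one computer to another.

```r
# Ask R where it currently looks for inputs and writes outputs.
wd <- getwd()
wd

# Confirm that the reported directory exists on this computer.
dir.exists(wd)
```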