CHAPTERS 5 AND 6
The final object type we'll discuss is a data frame, which is a collection of
vectors with the same length. In R, data frames are used to store most datasets,
and we'll get a lot of experience with them in the following chapters. Let's put
the three vectors, fiveIntegers, fiveBooleans, and fiveRandoms, into a data
frame with the data.frame function.

We'll start a new project named Tauntauns here. This project will house all of
the materials related to our Tauntaun population study.

An absolute file path contains your computer’s root
directory and all other subdirectories that contain a file or folder. In Windows,
the root directory is usually the C drive; in Linux and macOS it is a forward
slash /, and a path that begins with a tilde ~ starts from your home directory.
A relative file path locates your file relative to your working directory, which
should be your Tauntaun project directory (one of the benefits of using projects
is that the working directory is automatically set upon opening).

The backslash key is above the Enter key on the keyboard. If
we use the backslash in a file path by mistake, R treats it as the start of an
escape sequence. The following character may or may not form a valid escape, but
regardless of whether it causes an error, the intended command will not be
completed. Forward slashes are the safer choice; R accepts them on Windows too.

The hunter information is included in the second CSV file. We'll use the
download.file function and the relative path notation datasets/hunter.csv in
this example.
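
Here is a minimal sketch of that relative-path download, under a few assumptions. The real download URL is not given in this post, so a small made-up CSV served through a Unix-style file:// URL stands in for it. A temporary folder plays the part of the Tauntauns project directory, and the hunter.id column is invented for illustration.

```r
# Stand-in for the Tauntauns project directory (assumption: a temporary folder).
project_dir <- file.path(tempdir(), "Tauntauns")
dir.create(project_dir, showWarnings = FALSE)
old_wd <- setwd(project_dir)   # setwd() returns the previous working directory

# Made-up source file so the sketch runs offline (assumption: not the real data).
source_file <- tempfile(fileext = ".csv")
write.csv(data.frame(hunter.id = 1:3), source_file, row.names = FALSE)
source_url <- paste0("file://", normalizePath(source_file, winslash = "/"))

# download.file() does not create folders, so make "datasets" first.
dir.create("datasets", showWarnings = FALSE)

# The destination is a relative path, resolved against the working directory.
relative_path <- "datasets/hunter.csv"
download.file(url = source_url, destfile = relative_path, quiet = TRUE)

# The same file, spelled out as an absolute path.
absolute_path <- normalizePath(relative_path, winslash = "/")
file_found <- file.exists(relative_path)
hunter <- read.csv(relative_path)

# If a backslash is really needed inside an R string, it has to be doubled.
windows_style <- "datasets\\hunter.csv"

setwd(old_wd)   # restore the original working directory
```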
The relative path instructs R to navigate one level down from its current
working directory to a folder called “datasets” and create a file called
“hunter.csv.”

We learned about the projects available in RStudio and created our Tauntaun
project. We learned some new file and directory functions, as well as how to
read in files from Excel and use our first ‘apply’ function. All of this work
lays the groundwork for a seamless transition in the following chapters.

The summary
function summarizes the data in an object rather than the object itself. There
are a few spelling errors in the data (such as the column name individul; the
data were entered by volunteers, so we can't really complain), and at least one
column (method) has NA values, indicating missing data. We'll need to correct
these typos.
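
To make the summary step concrete, here is a small sketch on made-up data. Only the column names individul and method come from the text above; the harvest object, the length column, and every value are assumptions for illustration.

```r
# A tiny made-up data frame with the same kinds of problems described above.
harvest <- data.frame(
  individul = 1:5,                            # misspelled on purpose
  method    = c("bow", NA, "bow", "trap", NA),
  length    = c(380.2, 412.5, 355.0, 401.3, 390.8)
)

# summary() describes the data held in each column, not the object itself.
harvest_summary <- summary(harvest)
print(harvest_summary)

# Count the missing values in each column.
na_counts <- colSums(is.na(harvest))

# Fix the misspelled column name in place.
names(harvest)[names(harvest) == "individul"] <- "individual"
```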
When saving R objects with the save function, storing data as factors saves
storage space, and factors also use less memory during the R session.
Additionally, certain statistical and graphic functions require the data to be
stored as factors in order to work properly.

In this chapter, we learned some new functions and had plenty of opportunities
to practice indexing. As Tauntaun biologists (as they call us in this chapter),
we'll probably spend a lot of time working with results, and we've just
scratched the surface.
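
As a short recap, the sketch below builds a data frame from three equal-length vectors, indexes it, and compares a character vector with its factor version. The vector values, the name fiveThings, and the sex example are assumptions made for illustration; they are not the values used in the chapters.

```r
# Three vectors of equal length (values are assumptions for illustration).
set.seed(42)
fiveIntegers <- 1:5
fiveBooleans <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
fiveRandoms  <- runif(5)

# data.frame() binds equal-length vectors together as columns.
fiveThings <- data.frame(fiveIntegers, fiveBooleans, fiveRandoms)

# Indexing: rows go before the comma, columns after it.
second_row <- fiveThings[2, ]                        # one row, all columns
true_rows  <- fiveThings[fiveThings$fiveBooleans, ]  # rows where the flag is TRUE
one_value  <- fiveThings[3, "fiveIntegers"]          # a single cell

# A factor stores each distinct label once, plus an integer code per element.
sex_chr <- rep(c("male", "female"), times = 5000)
sex_fct <- factor(sex_chr)
sex_levels <- levels(sex_fct)

# Compare the in-memory size of the two representations (in bytes).
size_chr <- as.numeric(object.size(sex_chr))
size_fct <- as.numeric(object.size(sex_fct))
```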