By: Karthik Janar in data-science Tutorials on 2018-05-22
While doing data analysis, it is highly recommended to use proper naming conventions for files, variables and especially column names. This is very important for two reasons:
The make.names() function in R helps with this by converting arbitrary character strings into syntactically valid R names. To demonstrate make.names(), let us use a simple data frame.
Create a simple employee data frame with four variables and four rows of values.
vcode <- c(20001,20002,20003,20004)
vFname <- c("Brian","Jeff","Roger","Karthik")
vLname <- c("Caffo","Leek","Peng","Janar")
vSal <- c(10000,15000,18000,20000)
emp <- data.frame(vcode,vFname,vLname,vSal)
str(emp)
## 'data.frame': 4 obs. of 4 variables:
## $ vcode : num 20001 20002 20003 20004
## $ vFname: Factor w/ 4 levels "Brian","Jeff",..: 1 2 4 3
## $ vLname: Factor w/ 4 levels "Caffo","Janar",..: 1 3 4 2
## $ vSal : num 10000 15000 18000 20000
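As you can see, str() shows the column names as the names of the vectors we created earlier. (In R 4.0.0 and later, the two name columns are shown as chr rather than Factor, because stringsAsFactors now defaults to FALSE.) So let us first assign some new column names as below. We have deliberately included spaces and parentheses to show how make.names() converts them.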
names(emp) <- c("Code","First Name","Last Name", "Salary(SGD)")
str(emp)
## 'data.frame': 4 obs. of 4 variables:
## $ Code : num 20001 20002 20003 20004
## $ First Name : Factor w/ 4 levels "Brian","Jeff",..: 1 2 4 3
## $ Last Name : Factor w/ 4 levels "Caffo","Janar",..: 1 3 4 2
## $ Salary(SGD): num 10000 15000 18000 20000
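To see why names like these are inconvenient, here is a minimal sketch using a small stand-in data frame (the name `demo` and its values are made up for illustration and are not part of the example above). Columns whose names contain spaces or parentheses can only be reached with backticks or quoted strings:

```r
# Hypothetical stand-in data frame with non-syntactic column names
demo <- data.frame(c("Brian", "Jeff"), c(10000, 15000))
names(demo) <- c("First Name", "Salary(SGD)")

# Writing demo$First Name without backticks would be a syntax error
first <- demo$`First Name`      # backticks are required with $
sal <- demo[["Salary(SGD)"]]    # or quote the name inside [[ ]]

print(first)
print(mean(sal))
```

Once the names are syntactically valid, the plain `$` form works without any quoting.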
Now let us call make.names() to clean the column names.
names(emp) <- make.names(names(emp))
emp
## Code First.Name Last.Name Salary.SGD.
## 1 20001 Brian Caffo 10000
## 2 20002 Jeff Leek 15000
## 3 20003 Roger Peng 18000
## 4 20004 Karthik Janar 20000
Now the spaces and parentheses have been replaced with dots, and the names look much cleaner. Make it a habit to clean the column names of every data frame you read from a file as a first step in data cleaning.
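A few further behaviours of make.names() are worth knowing when cleaning names that come from files. The sketch below is illustrative only: the column names and the inline CSV text are made up. It shows that a name starting with a digit gets an "X" prefix, that unique = TRUE disambiguates duplicate names, and that read.csv() applies the same cleaning by default unless check.names = FALSE is passed.

```r
# Made-up column names for illustration
raw <- c("First Name", "First Name", "2018 Sales", "Salary(SGD)")

clean <- make.names(raw)                        # duplicates are left as they are
clean_unique <- make.names(raw, unique = TRUE)  # duplicates get a numeric suffix
print(clean)
print(clean_unique)

# read.csv() checks the header with make.names() by default (check.names = TRUE)
csv_text <- "First Name,Salary(SGD)\nBrian,10000\nJeff,15000"
as_is <- read.csv(text = csv_text, check.names = FALSE)  # keep the header verbatim
cleaned <- read.csv(text = csv_text)                     # header cleaned on read
print(names(as_is))
print(names(cleaned))
```

If a file is read with check.names = FALSE, or the names are assigned by hand as in this tutorial, calling make.names() explicitly gives the same result as the default read.csv() behaviour.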