When you work together on a project, it's a good idea to use the same coding style. Similarly, since I have to grade all of your coding assignments, it helps me if we all use the same style. We will adopt the following style for this class.
Variable and function names should generally be lowercase, except where mathematical convention suggests otherwise. Use an underscore (_) to separate words within a name, except where numbers are appended. Strive for names that are short (try to keep them under 10 characters) yet meaningful so that the code is self-documenting (a convention widely used in the software industry). Nevertheless, use compound names sparingly and don't overdo the meaningful name idea. In contrast to the software industry, much of what we do is math and related to equations or models with widely used conventions. In those cases, single-letter variable names that match the equations are desirable.
day1 #Nice
d #Best if d is from an equation
d_1 #Best if d subscript 1 is from an equation
day_one #Excessive for this particular example
fdom #Last resort for when you must be explicit but very short
N #Widely used in ecology for population abundance
dayOne #This is camel case, which we won't use
first_day_of_month #Who wants to type this. Limit to rare cases.
Day_one #Don't capitalize
dayone
day.one
population_abundance #As ecologists, we know what N means
abundance #Ditto
Some software industry folks might be appalled that, say, first_day_of_month or population_abundance are not in style but d_1 or N are. However, R style tends to be more terse. For example, consider that the name of the function to calculate the standard deviation is sd, not calcStandardDeviation. Frankly, I'm glad we don't have to type the latter.
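As a small illustration of names that match an equation, here is a sketch of discrete logistic growth, N[t+1] = N[t] + r N[t] (1 - N[t] / K). The model and the parameter values are assumptions chosen only for this example; the point is that N, r, and K read directly against the equation.

```r
# Discrete logistic growth: N[t+1] = N[t] + r * N[t] * (1 - N[t] / K)
r <- 0.5    #Intrinsic growth rate (illustrative value)
K <- 100    #Carrying capacity (illustrative value)
N <- numeric(20)
N[1] <- 10    #Initial abundance (illustrative value)
for ( t in 1:19 ) {
    N[t + 1] <- N[t] + r * N[t] * (1 - N[t] / K)
}
```

Longer names such as growth_rate or carrying_capacity would make the update line harder to check against the equation.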
Use <-, not =, for assignment.
# Good
x <- 5
# Bad
x = 5
Place spaces around all mathematical operators (arithmetic, logical, comparison) and the assignment operator (+, -, <-, etc.) but not around other operators. However, you can use a judicious lack of space to clarify groupings in complicated equations. No spaces after the comma in indexes. Keep your script clean (for example, don't leave a string of spaces on an empty line).
# Good
x <- a + b * c
x <- 1:10
x <- y[3,2]
x <- 2 * 3 ^ 2
x <- 2 * 3^2
x <- 2 * 3 + a - b / 2
# Bad
x<-a+b*c
x <- 1 : 10
x <- y [ 3, 2 ]
x <- 2*3^2
x <- 2*3+a-b/2
Always use curly braces. An opening curly brace should never go on its own line and should always be followed by a new line. A closing curly brace should always go on its own line, unless it is followed by else. Indent the code inside curly braces by four spaces. If you're feeling like a pro, feel free to go with two spaces if that's what you're used to. But if you're new to structured programming, four spaces really helps to make the algorithmic structure stand out. There should be a space before and after the opening parenthesis of the control structure and a space before and after the closing parenthesis. Never use tabs.
# Good
if ( y < 0 ) {
    message("Y is negative")
}
if ( y == 0 ) {
    log(x)
} else {
    y ^ x
}
for ( i in 1:10 ) {
    x[i] <- 2 * i
}
# Bad
if( y < 0 )
    message("Y is negative")
if( y < 0 ) message("Y is negative")
if(y ==0){
    log(x)
}else {
    y ^ x
}
for (i in 1:10) {
    x[i] <- 2 * i
}
Don't use spaces around function parentheses. Spaces are optional after commas or around mathematical operators within parentheses of a function call. Don't use spaces around the = symbol. Balance the need for readability and fitting on a line. Use your judgment.
plot(x,y)
plot(x, y)
plot(x, y=2*z, xlim=c(0,100), ylim=c(-10,10), type="l")
plot(x,y,xlim=c(0,100),ylim=c(-10,10),main="My plot of some data",type="l")
y <- c(1,2,6,2,8,6,12)
mean (y)
plot(x, y, xlim = c(0,100), ylim = c(-10,10), main = "My plot of some data", type = "l")
plot(x, y, xlim=c(0, 100), ylim=c(-10, 10))
plot( x, y )
In function definitions, use no extra spaces inside parentheses, a space after each comma in the argument list, a space before the opening brace, and four spaces of indentation. Do not rely on R’s implicit return feature. It is better to be clear about your intent to return() an object.
# Good
add_values <- function(x, y) {
    return(x + y)
}
# Bad
add_values <- function( x, y ) {
    x + y
}
Strive to limit your code to 80 characters per line.
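One way to stay within 80 characters is to break a long call after a comma and align the continuation with the first argument. The alignment is an assumption for this sketch, not a rule stated above, and y_stats is a hypothetical name.

```r
# Written on one line, this call would run to about 90 characters
y <- c(1,2,6,2,8,6,12)
y_stats <- c(mean=mean(y), sd=sd(y), median=median(y), min=min(y),
             max=max(y), n=length(y))
```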
Each line of a comment should begin with the comment symbol (#) and a single space. Inline comments should be short, positioned for readability, and do not require a space after the #. In function definitions, arguments and returns should be described.
# Linear regression of weight data
fit <- lm(weight~height)
plot(height,weight)
abline(fit) #Note outlier
residuals(fit) #Outlier is case 52
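For function definitions, a minimal sketch of describing arguments and returns might look like the following. The layout of the comment block is one possibility rather than a required format, and se_mean and na_rm are hypothetical names used only for illustration.

```r
# Calculate the standard error of the mean
# Arguments:
#     x: numeric vector of observations
#     na_rm: logical; if TRUE, drop missing values before calculating
# Returns: the standard error of the mean (a single number)
se_mean <- function(x, na_rm=FALSE) {
    if ( na_rm ) {
        x <- x[!is.na(x)]
    }
    return(sd(x) / sqrt(length(x)))
}
```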
Order your program as follows:
- Description (include purpose of program, author, version, date)
- source() and library() statements
- Function definitions
- Executed statements
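A short script laid out in this order might look like the sketch below. The data values, the slope function, and the header fields are made up for illustration.

```r
# Description: Estimate the slope of a regression of weight on height
# Author: (your name)
# Version: 1.0
# Date: (date)

# source() and library() statements
library(stats)    #Attached by default; shown only to mark the position

# Function definitions

# Slope of a simple linear regression of y on x
# Arguments:
#     x, y: numeric vectors of equal length
# Returns: the estimated slope (a single number)
slope <- function(x, y) {
    fit <- lm(y ~ x)
    return(unname(coef(fit)[2]))
}

# Executed statements
height <- c(150, 160, 170, 180, 190)    #Illustrative data
weight <- c(52, 58, 69, 77, 88)    #Illustrative data
b <- slope(height, weight)
```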
There is no agreed-upon R style. For example, Hadley Wickham has written one R style that is used with tidyverse. Wickham says his style is based on Google's original R style guide (though the two differed in many details). Google now has a new R style guide. Our class style adopts components of these styles. This class is about algorithms, so the spacing rules emphasize the control structures in algorithms to a greater degree than tidyverse style, which instead emphasizes functions.