R Machine Learning By Example

Working with functions

Next, we look at functions, which let you structure and modularize your code: you wrap the lines that perform a specific task in a function and then execute it whenever you need it, without writing those lines again and again. In R, functions are treated as just another data type: you can assign them to variables, manipulate them as needed, and pass them as arguments to other functions. We explore all of this in the following sections.
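The idea that functions are ordinary values can be sketched in a few lines. The names add.one, increment, ops, and apply.twice are hypothetical and used only for illustration; they do not appear elsewhere in this chapter.

```r
# A minimal sketch: functions behave like any other R object.
# add.one, increment, ops and apply.twice are illustrative names only.
add.one <- function(x) {
  x + 1
}

# Assign the same function to a second name
increment <- add.one

# Store functions in a list and call one by its element name
ops <- list(inc = add.one, root = sqrt)

# Pass a function as an argument to another function
apply.twice <- function(f, x) {
  f(f(x))
}

print(class(add.one))           # "function"
print(increment(1))             # 2
print(ops$root(16))             # 4
print(apply.twice(add.one, 1))  # 3
```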

Built-in functions

R ships with many functions in the base package and in the other packages that are loaded at startup. As you install more packages, you get more functionality, which is also made available in the form of functions. We will look at a few built-in functions in the following examples:

> sqrt(5)
[1] 2.236068
> sqrt(c(1,2,3,4,5,6,7,8,9,10))
[1] 1.000000 1.414214 1.732051 2.000000 2.236068 2.449490 2.645751
[8] 2.828427 3.000000 3.162278
> # aggregating functions
> mean(c(1,2,3,4,5,6,7,8,9,10))
[1] 5.5
> median(c(1,2,3,4,5,6,7,8,9,10))
[1] 5.5

You can see from the preceding examples that functions such as mean, median, and sqrt are built in: they are available as soon as you start R, without loading any other packages or defining the functions yourself.
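If you are curious where a built-in function comes from, a small sketch like the following can help. It assumes a default R session, in which the base, stats, and utils packages are attached at startup; find() is provided by utils.

```r
# A minimal sketch, assuming a default R session with the standard
# packages attached. find() reports where a name is visible on the
# search path.
print(find("sqrt"))    # "package:base"
print(find("mean"))    # "package:base"
print(find("median"))  # "package:stats"

# search() lists every package attached in the current session
print(search())
```

Note that median lives in the stats package rather than in base, but since stats is attached by default it is available just like sqrt and mean.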

User-defined functions

The real power lies in defining your own functions for the operations and computations you want to perform on your data, so that R executes them exactly the way you intend. Some illustrations follow:

square <- function(data){
 return (data^2)
}
> square(5)
[1] 25
> square(c(1,2,3,4,5))
[1] 1 4 9 16 25
point <- function(xval, yval){
 return (c(x=xval,y=yval))
}
> p1 <- point(5,6)
> p2 <- point(2,3)
> 
> p1
x y 
5 6 
> p2
x y 
2 3

As the preceding code snippet shows, we can define a function such as square, which computes the square of a single number or of every element of a vector using the same code. A function such as point is useful for representing a specific entity, in this case a point in two-dimensional coordinate space. Next, we will look at how to use these functions together.
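The reason one definition of square handles both a single number and a vector is that the ^ operator works element by element. The following sketch repeats the two definitions so that it runs on its own, compares the vectorized call with an explicit loop, and applies square directly to a point; the variable names values and looped are illustrative only.

```r
# Definitions repeated from the examples above so the block is self-contained
square <- function(data) {
  return(data^2)
}
point <- function(xval, yval) {
  return(c(x = xval, y = yval))
}

# An explicit loop over the elements gives the same result as one
# vectorized call, because ^ operates element by element
values <- c(1, 2, 3, 4, 5)
looped <- numeric(length(values))
for (i in seq_along(values)) {
  looped[i] <- square(values[i])
}
print(identical(looped, square(values)))  # TRUE

# square can be applied directly to a point; the element names are kept
p1 <- point(5, 6)
print(square(p1))
```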

Passing functions as arguments

When you define any function, you can also pass other functions to it as arguments if you intend to use them inside your function to perform some complex computations. This reduces the complexity and redundancy of the code. The following example computes the Euclidean distance between two points using the square function defined earlier, which is passed as an argument:

> # defining the function
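euclidean.distance <- function(point1, point2, square.func){
  # [[ ]] extracts each coordinate without its name, so no as.integer()
  # coercion is needed and fractional coordinates are not truncated
  distance <- sqrt(
    square.func(point1[['x']] - point2[['x']]) +
    square.func(point1[['y']] - point2[['y']])
  )
  return (c(distance=distance))
}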
> # executing the function, passing square as argument
> euclidean.distance(point1 = p1, point2 = p2, square.func = square)
distance 
4.242641 
> euclidean.distance(point1 = p2, point2 = p1, square.func = square)
distance 
4.242641 
> euclidean.distance(point1 = point(10, 3), point2 = point(-4, 8), square.func = square)
distance
14.86607

Thus, once you have defined a function, you can execute it as many times as you need, with different arguments each time, including other functions.
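The same pattern is used by R's own higher-order functions. The sketch below passes an anonymous function to sapply to compute the distance from several points to the origin, and then swaps in a different squaring function. The definitions are repeated so the block runs on its own, and the coordinates are read with [[ ]] as a simplifying assumption of this sketch; origin, points, and distances are illustrative names.

```r
# Definitions repeated so the block is self-contained
square <- function(data) {
  return(data^2)
}
point <- function(xval, yval) {
  return(c(x = xval, y = yval))
}
# Assumption of this sketch: [[ ]] is used to read each coordinate
# without its name
euclidean.distance <- function(point1, point2, square.func) {
  distance <- sqrt(
    square.func(point1[["x"]] - point2[["x"]]) +
    square.func(point1[["y"]] - point2[["y"]])
  )
  return(c(distance = distance))
}

origin <- point(0, 0)
points <- list(point(3, 4), point(6, 8), point(5, 12))

# sapply receives a function as its second argument; here it is an
# anonymous function that in turn passes square along
distances <- sapply(points, function(p) {
  euclidean.distance(point1 = p, point2 = origin, square.func = square)
})
print(distances)  # the three distances are 5, 10 and 13

# Any function with the same behavior can be swapped in
same <- euclidean.distance(point(3, 4), origin, square.func = function(v) v * v)
print(same)
```

Because the squaring step is a parameter, the caller decides how it is computed without touching the body of euclidean.distance.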