03. Filtering Vectors: all(), any(), which(), subset()

When working with vectors, you'll often need to extract the values that meet a condition. In R, this process is known as filtering, and the language provides several functions to help you extract subsets of data.

The any() Function

The any() function returns TRUE if at least one element of its argument meets the condition, and FALSE if none do. The result is a single value of type logical.

> x <- 1:100
> any(x > 101)
[1] FALSE
> any(x == 2)
[1] TRUE
> any(x <= 50)
[1] TRUE
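
Missing values change what any() can report. The following is a minimal sketch, assuming a small hypothetical vector y that is not part of the examples above; it shows how any() behaves when an NA is present and how the na.rm argument affects the result.

```r
# Hypothetical example vector containing a missing value
y <- c(1, NA, 3)

# The 3 satisfies the condition, so the NA cannot change the answer
r1 <- any(y > 2)                 # TRUE

# No known element matches, but the NA might, so the answer is unknown
r2 <- any(y > 5)                 # NA

# na.rm = TRUE drops the missing values before testing
r3 <- any(y > 5, na.rm = TRUE)   # FALSE
```

If a strict TRUE or FALSE is needed, for instance inside an if() statement, passing na.rm = TRUE is one way to avoid an NA result.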

The all() Function

On the flip side, we can use the all() function to test whether every value meets a condition.

> x <- 1:100
> all(x > 40)
[1] FALSE
> all(x > 0)
[1] TRUE
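
The two functions are closely related: saying that all values meet a condition is the same as saying that no value fails it. The sketch below illustrates this with the same 1:100 vector, and also shows the results for an empty logical vector, an edge case that can be surprising. The variable names cond, a, b, and the two empty_ results are chosen here for illustration only.

```r
x <- 1:100
cond <- x > 40          # logical vector, one entry per element of x

a <- all(cond)          # FALSE: the values 1 to 40 fail the test
b <- !any(!cond)        # FALSE: the same question, phrased with any()

# Edge case: an empty logical vector
empty_all <- all(logical(0))   # TRUE, no element fails the test
empty_any <- any(logical(0))   # FALSE, no element passes the test
```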

Comparison Operations

With vectors, we can run comparison operations that return a vector of logical values. For example:

> x <- c(1,2,3,4,5,6)
> x > 3
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE

As you can see, the result is a logical vector with one TRUE or FALSE per element, indicating whether the element at that position satisfied the comparison.

How is this useful? We can use these resulting logical vectors to pull out subvectors. Let's say we only want to pull out odd values - we can write:

> x <- c(12,423,52,21,324)
> x[x %% 2 == 1]
[1] 423  21

The expression x %% 2 == 1 returns a logical vector, and indexing x with it keeps only the elements at positions where that vector is TRUE.
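
Logical vectors can also be combined before they are used as an index. As a rough sketch, the element-wise operators & (and), | (or), and ! (not) build compound conditions; the result names below are made up for this example.

```r
x <- c(12, 423, 52, 21, 324)

# Even AND greater than 50
even_big <- x[x %% 2 == 0 & x > 50]     # 52 324

# Odd OR less than 20
odd_or_small <- x[x %% 2 == 1 | x < 20] # 12 423 21

# NOT greater than 100
not_large <- x[!(x > 100)]              # 12 52 21
```

Note that the single-character forms & and | work element by element; the doubled forms && and || are meant for single values and are not suitable for filtering.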

We can also use this feature to replace the values that meet a condition:

> x <- c(1,2,3,4,5,6)
> x[x*x > 20] <- 1337
> x
[1]    1    2    3    4 1337 1337

Pulling out subvectors with subset()

With the indexing approach shown above, an NA in the vector produces an NA in the result: the comparison itself evaluates to NA at that position, and indexing with NA returns NA.

> x <- c(1,2,3,NA,5,6)
> x[x>2]
[1]  3 NA  5  6

When you need to exclude the NA values, you can use the subset() function, which drops the elements where the condition evaluates to NA.

> subset(x, x>2)
[1] 3 5 6
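
For comparison, the same result can be obtained with plain indexing by testing for missing values explicitly. This is a sketch of one alternative rather than a recommendation; is.na() is a base R function, and the names by_index and by_subset are chosen here for illustration.

```r
x <- c(1, 2, 3, NA, 5, 6)

# !is.na(x) is FALSE at the NA position, and FALSE & NA evaluates to FALSE,
# so the missing value is not selected
by_index <- x[!is.na(x) & x > 2]   # 3 5 6

# subset() gives the same values
by_subset <- subset(x, x > 2)      # 3 5 6
```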

Pulling out indices with which()

If you need the positions of the matching values rather than the values themselves, use the which() function. It returns the indices of all elements for which the condition is TRUE.

> z <- c(1,2,3,4,5,6)
> which(z > 3)
[1] 4 5 6
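
The indices returned by which() can be fed back into the vector to recover the values. The sketch below uses a hypothetical vector w (not from the examples above) whose values differ from their positions, so the two results are easy to tell apart. Because which() only reports positions where the condition is TRUE, positions where it evaluates to NA are skipped.

```r
# Hypothetical vector with a missing value
w <- c(10, 25, NA, 40, 5)

idx <- which(w > 20)    # positions of the matches: 2 4
vals <- w[idx]          # the matching values: 25 40
n_matches <- length(idx)  # number of matches: 2
```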

