Coding Basics 2: Loops and Functions

tutorial workflow rstats NetLogo

A brief introduction into various coding basics for people who are beginning to use R or other programming languages. In this session, we will be looking at both functions and loops in R, with examples from NetLogo and C++ and go over use and basic functionality.

Tobias Kürschner (IZW Berlin)https://ecodynizw.github.io/
2022-11-14

Functions and Loops

What are they for?

In short: making you life easier. They are used to automate certain steps in your code to make the execution faster.

Functions

A simple example: you have multiple data sets with a measurement in inches. To continue working with that data you need to convert it meter, but doing that manually would take hours. Solution: a function!!

In R:

inch_to_meter <-   #function name
  function(inch) #function input parameter(s)
  { # function body
    inM <- (inch * 0.0254) # in our case: transformation
    
    return(inM) # function reporter
  }

Now lets apply that function to some random example data:

myOldData <- c(15,98,102,5,17)

myNewData <- inch_to_meter(myOldData)

Our function is applied to all elements of the old data, giving us the converted measurements.

myOldData
[1]  15  98 102   5  17
myNewData
[1] 0.3810 2.4892 2.5908 0.1270 0.4318

For a more general approach we can use the following template for simple functions:

myFunction <-
  function(input1, input2, input3)
  {
    output <- input1 * input3 / input2 # example operation
    
    return(output)
  }

In C++:

# the viableCell function of the class Grid returns a bool and takes two inputs
# (x and y coordinates)
bool Grid::viableCell(CellCount_t x, CellCount_t y) 
{
    return !m_grid.empty() && x < m_grid.cbegin() -> size() && y < m_grid.size();
}

In NetLogo, the closest thing we have to functions are reporters:

to-report Male_Alpha_Alive 

  ifelse (any? turtles-here with [isAlpha = true AND isFemale = false])
  [report true]
  [report false] #else

end

Loops

What is a loop? It a simply a piece of code we want to repeat. But wait, isn’t that exactly what a function does? Well, yes and no. A function (after it is declared) is an embodiment of a piece of code that we can run anytime just by calling it. A loop is a local repetition of code.

Lets stay in R and look at some examples:

A Classic, the ‘for’ Loop

If you auto fill “for” in R, it will give you the following structure:

for (variable in vector)
{
  #body
}

So what is variable and what is vector? In this case the variable, you could also call iterator (often you will see the letter i used) is simply put a counter that tells our loop how many times it has repeated itself. The vector is determined by us and tells the loop how many repetitions we want before it stops.

for (i in 1:10)
{
  #body
}

The loop above will now repeat exactly 10 times and then stop. R is helpful in the sense that it automatically increments i after each repetition. Other language like C++ for example need a manual increment of i:

for (i = 0, i <= 10, ++i) #C++ 
{
  #body
}

With the basics out of the way lets look at an example:

myData <- rnorm(30) # random numbers
myResults <- 0 # initializing myResults

for(i in 1:10)
{
  # i-th element of `myData` squared into `i`-th position of `myResults`
  myResults[i] <- myData[i] * myData[i]  
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
myResults
 [1] 2.348470e+00 8.045558e-02 9.254717e+00 6.298287e-01 5.587004e-02
 [6] 5.496863e-01 2.161664e+00 1.326536e-05 2.026844e+00 2.065682e-01

This was of course a very simple loop and there is pretty much no limit to the level of complexity those loops can have and how many loops could be nested. But be warned there can be some pitfalls with loops. Exercise: Would the following loop work?

for (i in 1:length(summIdentSplit))
{
  tmpSumm <- summIdentSplit[[i]]
  tmpName <- SummIdentVector[[i]]
  
  if (base::grepl("HD", tmpName) == TRUE)
  {
    clearName <- "Habitat-driven movement"
    cn <- "HD"
  }
  
  combiMelt <- tmpSumm %>%
    dplyr::ungroup() %>%
    dplyr::select(t, Ninfected_mean, Nimmune_mean) %>%
    dplyr::rename(Infected = Ninfected_mean, Immune = Nimmune_mean)
  
  combiMelt_sd <- tmpSumm %>%
    dplyr::ungroup() %>%
    dplyr::select(t, Ninfected_sd, Nimmune_sd) %>%
    dplyr::rename(Inf_sd = Ninfected_sd, Imm_sd = Nimmune_sd)
  
  combiMelt_c <- dplyr::left_join(combiMelt, combiMelt_sd , by = "t")
  
  q <- 0
  combiMelt_c$quarter <- 0
  
  for (i in 1:nrow(combiMelt_c))
  {
    if (i %% 13 == 0)
    {
      q <- q + 1
    }
    combiMelt_c$quarter[i] <- q
  }
}

The answer is no. There are two loops involved where one is nested inside the other. So far, that would not be an issue, however, both loops use the iterator (variable) i. To see what would happen:

First iteration:
  i = 1

for (i in 1:length(summIdentSplit))
{
  tmpSumm <- summIdentSplit[[i]]
  tmpName <- SummIdentVector[[i]]
  
  if (base::grepl("HD", tmpName) == TRUE)
  {
    clearName <- "Habitat-driven movement"
    cn <- "HD"
  }
  
  combiMelt <- tmpSumm %>%
    dplyr::ungroup() %>%
    dplyr::select(t, Ninfected_mean, Nimmune_mean) %>%
    dplyr::rename(Infected = Ninfected_mean, Immune = Nimmune_mean)
  
  combiMelt_sd <- tmpSumm %>%
    dplyr::ungroup() %>%
    dplyr::select(t, Ninfected_sd, Nimmune_sd) %>%
    dplyr::rename(Inf_sd = Ninfected_sd, Imm_sd = Nimmune_sd)
  
  combiMelt_c <- dplyr::left_join(combiMelt, combiMelt_sd , by = "t")
  
  q <- 0
  combiMelt_c$quarter <- 0

At this point i is still 1 and lets say nrow(combiMelt_c) is also 10 (like in our outer loop)

  for (i in 1:nrow(combiMelt_c))
  {
    if (i %% 13 == 0)
    {
      q <- q + 1
    }
    combiMelt_c$quarter[i] <- q

at this point i is iterated within the inner loop

  }

once we reach this point i is 10, so the for the next iteration of the outer loop, i = 10 so most iterations of the outer loop will be skipped!

}

Solution: make sure to use different iterators in nested loops. For example the outer loop uses i, the inner loop uses j and maybe that loop has also a nested loop which then uses k as its iterator.

‘while’ Loops

Similar to for loops, while loops also repeat a certain block of code. The difference here is, that while loop repeat until a certain condition is fulfilled, potentially forever. R’s auto fill provides us the following code snippet:

while (condition)
{
  
}

A simple example:

n <- 0

while(n < 100)
{
  n = n + 1
}

print(n)
[1] 100

As long as n is below 100 we increment n every repeat. Take note: if the condition is never fulfilled, the loop will run forever and the software might crash. In complex loops you could run a “escape timer” such as in this example plucked from an IBM:

while(!viableCell())
{
   turn(2 * (atan (( (1 - 0.5) / (1 + 0.9)) * tan ((randomDouble(1) - 0.5)) * 180)) + 1) 
}

This little loop runs on an individual and check’s the cell in front of the individual for its viability to move into. As long as viableCell() reports false, the individual turns. However, there are cases when they are no viable cells around so the individual would turn in a circle forever. Introducing a random timer (note, this is not an optimal solution just a quick fix).

int timer = 0

while(!viableCell() && timer < 50 )
{
   turn(2 * (atan (( (1 - 0.5) / (1 + 0.9)) * tan ((randomDouble(1) - 0.5)) * 180)) + 1) 
  
  ++timer #increment the timer
}

Now the individual turns a maximum of 50 times before the loop ends. Another option would be conditionals:

Bonus Round: Conditionals

Most people already know what a conditional is. Basically, when a certain condition is fulfilled something happens (usually done via if and/or else).

int timer = 0

while(!viableCell())
{
   turn(2 * (atan (( (1 - 0.5) / (1 + 0.9)) * tan ((randomDouble(1) - 0.5)) * 180)) + 1) 
  
  ++timer #increment the timer
  
  if(timer == 50)
  {
    break # break is a function that for example ends a loop
  }
}

Conditionals can be used in a variety of way within and outside of loops but have the advantage that they can be used stop loops under certain conditions. Lets say you are running a complex construct of multiple nested for loops to find a certain value in your data. You don’t need to always iterate through all the data if you can create certain logical stop conditions.

Citation

For attribution, please cite this work as

Kürschner (2022, Nov. 14). Ecological Dynamics: Coding Basics 2: Loops and Functions. Retrieved from https://ecodynizw.github.io/posts/coding-basics/

BibTeX citation

@misc{kürschner2022coding,
  author = {Kürschner, Tobias},
  title = {Ecological Dynamics: Coding Basics 2: Loops and Functions},
  url = {https://ecodynizw.github.io/posts/coding-basics/},
  year = {2022}
}