## Exploring Big O Notation in Polyglot Notebooks

Mai 17, às 03:24

·

10 min de leitura

·

0 leituras

Polyglot Notebooks is a great way of running interactive code experiments mixed together with rich markdown documentation.

In this short article I want to introduce you to the `#!time`

magic command and show you how you can easily measure the execution time of a block of code.

This can be helpful for understanding the rough performance characteristics of a block of code inside of your Polyglot Notebook.

In fact, we'll use this to explore the programming concepts behind Big O notation and how code performance changes based on the number of items.

This article builds upon basic knowledge of Polyglot Notebooks so if you are not yet familiar with that, I highly recommend you read my Introducing Polyglot Notebooks article first.

## Recording Execution Time

Let's start off here showing how the `#!time`

magic command can be used to measure the execution time of a cell.

Before we do that, let's define a number of items variable in a C# code cell:

```
// The number of items for our loop
int numItems = 100;
```

We can then reference this variable in another cell:

```
// This is constant Time O(1)
Console.WriteLine(numItems);
```

When we run this cell, we'll see `100`

printed out below the cell.

However, if you wanted to instead see the amount of time the cell took to execute, you could add the `#!time`

magic command to the beginning of the code block as shown below:

```
#!time
// This is constant Time O(1)
Console.WriteLine(numItems);
```

Now when you run this cell you'll see a time it took the .NET Interactive kernel to run that cell. If you run the cell multiple times you'll likely see some variance in responses. These are a few values I saw:

- Wall time: 13.2689ms
- Wall time: 9.1655ms
- Wall time: 9.793ms

There are a few things I note about this.

First, the first run tends to be slower than subsequent runs. That's often a normal thing you see in dotnet in general with just-in-time compilation (JIT). I'm not sure if that's a factor under the hood for Polyglot Notebooks / .NET Interactive, but I wouldn't be surprised if it was.

Secondly, the wall time metrics reported below the cell are typically different and more precise than the cell's recorded time running itself.

This is where the real value of the `#!time`

magic command comes into play: it records just the time of your .NET code and returns the time with a greater degree of precision than the cell's user interface displays.

Finally, note that any cell output is still rendered below the cell. The wall time indicator is just an additional piece of information available to you.

## Exploring Big O Performance with the Time Magic Command

Now that we have the basics of the `#!time`

magic command down, we can use it to explore the performance characteristics of different types of algorithms in code.

This article isn't intended to be a broad introduction to Big O notation, and a proper exploration of Big O requires a class or two from a computer science curriculum, but here's Big O notation in a nutshell:

**Big O notation is a way of understanding the way that the duration of an algorithm scales as the number of items the algorithm is acting upon increases.**

### Big O(1) - Constant Time

The code we did earlier where we did a `Console.WriteLine(numItems);`

is *Big O(1) or constant time*. It doesn't matter how large `numItems`

is, it should take roughly the same amount of time to write its value whether `numItems`

is 1 or 2.1 billion.

Here's the full table of results I observed for various item levels:

Number of Items | Time in Milliseconds |
---|---|

1 | 15.5097 |

10 | 11.5248 |

100 | 10.6307 |

1000 | 11.7832 |

10000 | 12.1653 |

100000 | 11.5297 |

As you can see, there's some natural fluctuations, but the performance level stays more or less the same as the number of items increases. This variation is natural in programming due to small shifts in CPU utilization and RAM between runs.

We can represent this on a chart as a more or less constant performance level that doesn't shift as the item count grows:

### Big O(N) - Linear Time

Let's take a look at Big O(N) or linear time. This type of algorithm will have its time grow in a predictable and linear manner as `numItems`

increases.

A simple example of Big O(N) or linear time is a simple `for`

loop as shown below:

```
#!time
long sum = 0;
// Calculate the sum of all numbers from 0 to 1 less than numItems
for (int i = 0; i < numItems; i++)
{
sum += i;
}
```

Here our time measurements should follow a roughly linear line that increases as the number of items increases:

Number of Items | Time in Milliseconds |
---|---|

1 | 8.5396 |

10 | 9.6775 |

100 | 8.1355 |

1000 | 14.1006 |

10000 | 16.9402 |

100000 | 19.4948 |

While this data has a certain degree of noise to it, once we pass the natural variation levels a linear trend emerges where the more items we see, the time taken grows in a fairly linear and predictable manner as shown below:

*Note: the charts in this article do not strictly match the data but represent typical Big O notation curves*

### Big O(N Squared) - Exponential Time

Sometimes you need to have loops nested inside of other loops. This causes your performance to increase at an exponential rate.

We refer to this performance characteristic as Big O(N^2) or exponential time.

We generally try to avoid exponential time if we can, but it's not always possible. For example, sometimes you really need to look at combinations of every item with every other item.

Here's some exponential time code:

```
#!time
long sum = 0;
for (int i = 0; i < numItems; i++)
{
for (int j = 0; j < numItems; j++)
{
sum += i;
}
}
```

Note that for every item we are looping over all items again inside of its for loop. This results in the following performance characteristics:

Number of Items | Time in Milliseconds |
---|---|

1 | 7.3658 |

10 | 7.7903 |

100 | 15.7163 |

1000 | 15.6515 |

10000 | 324.2602 |

100000 | 26716.3727 |

When plotted on a graph, this exponential level of growth quickly becomes clear:

As you can see, exponential time should be avoided when possible, because even though it may perform *faster* than other routines at low item counts, large item counts will quickly impact its performance.

### Big O(log n) - Logarithmic Time

Logarithmic time, or Big O(log n) algorithms are more complex, but typically involve dividing a problem in half recursively until a solution is reached.

This approach has a higher degree of complexity, but scales far better at larger levels than even Big O(N).

The following code doubles `i`

every time through the loop, which should exhibit Big O(log N) performance.

```
#!time
long sum = 0;
for (int i = 1; i <= numItems; i *= 2)
{
sum += i;
}
```

I find the code above to be a bit contrived, but an equivalent type of performance would involve using the built-in `List<T>.Sort`

method.

```
using System;
using System.Collections.Generic;
Random rand = new();
// Add a series of random doubles to a list of numbers
List<double> numbers = new();
for (int i = 0; i < numItems; i++)
{
numbers.Add(rand.Next());
}
```

The code above uses linear time or Big O(N) to populate a list of numbers, but the code below uses the built-in Big O(log N) sort performance of List to sort those numbers:

```
#!time
numbers.Sort();
```

All together, these two separate algorithms share similar performance characteristics:

Number of Items | For Loop Time in Milliseconds | List Sort Time in Milliseconds |
---|---|---|

1 | 8.1706 | 8.3194 |

10 | 9.6122 | 11.9892 |

100 | 8.1899 | 11.7189 |

1000 | 8.9837 | 10.7112 |

10000 | 10.277 | 18.3251 |

100000 | 11.4038 | 20.616 |

As you can see, the duration of the operations increases as the number of items grows, but it does so at a less aggressive rate than linear or exponential time.

However, the inherent complexity of Log N operations may make them a less ideal choice than other algorithms if the item count is guaranteed to be small.

## Closing Thoughts

Big O is an intimidating concept for many new learners, but looking at simple code examples and being able to explore their performance using the `#!time`

magic command in Polyglot Notebooks helps it become much more manageable in my experience.

I wouldn't use the `#!time`

magic command to get mission-critical measurements or comparisons of code - I'd use a profiling tool like Visual Studio Enterprise or DotTrace for that.

However, the `#!time`

magic command is really handy for gaining quick insights of how your code performed at the time the cell was executed.

Using the measurements I was able to gather using the `#!time`

magic command, I gathered enough data to be able to plot a fairly representative graph of Big O performance characteristics.

I don't expect myself to use this command frequently, but when I need it, it's nice to have it.

## Continue lendo

DEV

Hoje, às 19:00

Tech Crunch

Hoje, às 15:00

Hacker News

Hoje, às 14:18

AI | Techcrunch

Hoje, às 14:00

TabNews

Hoje, às 13:15

Hacker News

Hoje, às 09:12

Mashable

Hoje, às 09:00

Tech Crunch

Hoje, às 08:30

DEV

Hoje, às 04:24

TabNews

Hoje, às 02:32