Startec

Startec

Exploring Big O Notation in Polyglot Notebooks

Mai 17, às 03:24

·

10 min de leitura

·

0 leituras

Polyglot Notebooks is a great way of running interactive code experiments mixed together with rich markdown documentation. In this short article I want to introduce you to the #!time magic command and show...
Exploring Big O Notation in Polyglot Notebooks

Polyglot Notebooks is a great way of running interactive code experiments mixed together with rich markdown documentation.

In this short article I want to introduce you to the #!time magic command and show you how you can easily measure the execution time of a block of code.

This can be helpful for understanding the rough performance characteristics of a block of code inside of your Polyglot Notebook.

In fact, we'll use this to explore the programming concepts behind Big O notation and how code performance changes based on the number of items.

Big O Chart

This article builds upon basic knowledge of Polyglot Notebooks so if you are not yet familiar with that, I highly recommend you read my Introducing Polyglot Notebooks article first.

Recording Execution Time

Let's start off here showing how the #!time magic command can be used to measure the execution time of a cell.

Before we do that, let's define a number of items variable in a C# code cell:

// The number of items for our loop
int numItems = 100;

Enter fullscreen mode Exit fullscreen mode

We can then reference this variable in another cell:

// This is constant Time O(1)
Console.WriteLine(numItems);

Enter fullscreen mode Exit fullscreen mode

When we run this cell, we'll see 100 printed out below the cell.

However, if you wanted to instead see the amount of time the cell took to execute, you could add the #!time magic command to the beginning of the code block as shown below:

#!time
// This is constant Time O(1)
Console.WriteLine(numItems);

Enter fullscreen mode Exit fullscreen mode

Now when you run this cell you'll see a time it took the .NET Interactive kernel to run that cell. If you run the cell multiple times you'll likely see some variance in responses. These are a few values I saw:

  • Wall time: 13.2689ms
  • Wall time: 9.1655ms
  • Wall time: 9.793ms

There are a few things I note about this.

First, the first run tends to be slower than subsequent runs. That's often a normal thing you see in dotnet in general with just-in-time compilation (JIT). I'm not sure if that's a factor under the hood for Polyglot Notebooks / .NET Interactive, but I wouldn't be surprised if it was.

Secondly, the wall time metrics reported below the cell are typically different and more precise than the cell's recorded time running itself.

Wall Time

This is where the real value of the #!time magic command comes into play: it records just the time of your .NET code and returns the time with a greater degree of precision than the cell's user interface displays.

Finally, note that any cell output is still rendered below the cell. The wall time indicator is just an additional piece of information available to you.

Exploring Big O Performance with the Time Magic Command

Now that we have the basics of the #!time magic command down, we can use it to explore the performance characteristics of different types of algorithms in code.

This article isn't intended to be a broad introduction to Big O notation, and a proper exploration of Big O requires a class or two from a computer science curriculum, but here's Big O notation in a nutshell:

Big O notation is a way of understanding the way that the duration of an algorithm scales as the number of items the algorithm is acting upon increases.

Big O(1) - Constant Time

The code we did earlier where we did a Console.WriteLine(numItems); is Big O(1) or constant time. It doesn't matter how large numItems is, it should take roughly the same amount of time to write its value whether numItems is 1 or 2.1 billion.

Here's the full table of results I observed for various item levels:

Number of Items Time in Milliseconds
1 15.5097
10 11.5248
100 10.6307
1000 11.7832
10000 12.1653
100000 11.5297

As you can see, there's some natural fluctuations, but the performance level stays more or less the same as the number of items increases. This variation is natural in programming due to small shifts in CPU utilization and RAM between runs.

We can represent this on a chart as a more or less constant performance level that doesn't shift as the item count grows:

Big O Chart Showing Constant Time

Big O(N) - Linear Time

Let's take a look at Big O(N) or linear time. This type of algorithm will have its time grow in a predictable and linear manner as numItems increases.

A simple example of Big O(N) or linear time is a simple for loop as shown below:

#!time
long sum = 0;
// Calculate the sum of all numbers from 0 to 1 less than numItems
for (int i = 0; i < numItems; i++) 
{
 sum += i;
}

Enter fullscreen mode Exit fullscreen mode

Here our time measurements should follow a roughly linear line that increases as the number of items increases:

Number of Items Time in Milliseconds
1 8.5396
10 9.6775
100 8.1355
1000 14.1006
10000 16.9402
100000 19.4948

While this data has a certain degree of noise to it, once we pass the natural variation levels a linear trend emerges where the more items we see, the time taken grows in a fairly linear and predictable manner as shown below:

Big O Chart Showing Linear Time

Note: the charts in this article do not strictly match the data but represent typical Big O notation curves

Big O(N Squared) - Exponential Time

Sometimes you need to have loops nested inside of other loops. This causes your performance to increase at an exponential rate.

We refer to this performance characteristic as Big O(N^2) or exponential time.

We generally try to avoid exponential time if we can, but it's not always possible. For example, sometimes you really need to look at combinations of every item with every other item.

Here's some exponential time code:

#!time
long sum = 0;
for (int i = 0; i < numItems; i++) 
{
 for (int j = 0; j < numItems; j++) 
 {
 sum += i;
 }
}

Enter fullscreen mode Exit fullscreen mode

Note that for every item we are looping over all items again inside of its for loop. This results in the following performance characteristics:

Number of Items Time in Milliseconds
1 7.3658
10 7.7903
100 15.7163
1000 15.6515
10000 324.2602
100000 26716.3727

When plotted on a graph, this exponential level of growth quickly becomes clear:

Big O Chart

As you can see, exponential time should be avoided when possible, because even though it may perform faster than other routines at low item counts, large item counts will quickly impact its performance.

Big O(log n) - Logarithmic Time

Logarithmic time, or Big O(log n) algorithms are more complex, but typically involve dividing a problem in half recursively until a solution is reached.

This approach has a higher degree of complexity, but scales far better at larger levels than even Big O(N).

The following code doubles i every time through the loop, which should exhibit Big O(log N) performance.

#!time
long sum = 0;
for (int i = 1; i <= numItems; i *= 2)
{
 sum += i;
}

Enter fullscreen mode Exit fullscreen mode

I find the code above to be a bit contrived, but an equivalent type of performance would involve using the built-in List<T>.Sort method.

using System;
using System.Collections.Generic;
Random rand = new();
// Add a series of random doubles to a list of numbers
List<double> numbers = new();
for (int i = 0; i < numItems; i++) 
{
 numbers.Add(rand.Next());
}

Enter fullscreen mode Exit fullscreen mode

The code above uses linear time or Big O(N) to populate a list of numbers, but the code below uses the built-in Big O(log N) sort performance of List to sort those numbers:

#!time
numbers.Sort();

Enter fullscreen mode Exit fullscreen mode

All together, these two separate algorithms share similar performance characteristics:

Number of Items For Loop Time in Milliseconds List Sort Time in Milliseconds
1 8.1706 8.3194
10 9.6122 11.9892
100 8.1899 11.7189
1000 8.9837 10.7112
10000 10.277 18.3251
100000 11.4038 20.616

Big O Chart Showing Log N Performance

As you can see, the duration of the operations increases as the number of items grows, but it does so at a less aggressive rate than linear or exponential time.

However, the inherent complexity of Log N operations may make them a less ideal choice than other algorithms if the item count is guaranteed to be small.

Closing Thoughts

Big O is an intimidating concept for many new learners, but looking at simple code examples and being able to explore their performance using the #!time magic command in Polyglot Notebooks helps it become much more manageable in my experience.

I wouldn't use the #!time magic command to get mission-critical measurements or comparisons of code - I'd use a profiling tool like Visual Studio Enterprise or DotTrace for that.

However, the #!time magic command is really handy for gaining quick insights of how your code performed at the time the cell was executed.

Using the measurements I was able to gather using the #!time magic command, I gathered enough data to be able to plot a fairly representative graph of Big O performance characteristics.

Big O Chart

I don't expect myself to use this command frequently, but when I need it, it's nice to have it.


Continue lendo

DEV

React MUI Content Security Policy
Material-UI is a user interface library that provides predefined and customizable React components for faster and easy web development, these Material-UI components are based on top of Material Design by...

Hoje, às 19:00

Tech Crunch

Taking the pulse on the Northeast seed market with Techstars' Kerty Levy
Techstars’ Kerty Levy knows a thing or two about where seed funding is, and where it might be going, in the Northeast. During a presentation at TechCrunch’s Early Stage in Boston last month, Levy took a brief...

Hoje, às 15:00

Hacker News

Mastering CSS Blend Modes
CSS mix blend modes provide an easy, yet powerful way to create visually interesting designs. Visual effects galore The modes allow you to manipulate how elements interact with each other. Which can lead in...

Hoje, às 14:18

AI | Techcrunch

3 Views on a16z's latest reported early-stage effort
a16z, a venture capital firm known for its large fund sizes and for shaking up the VC game when it piled into the industry back in 2009, is cooking up a new strategy to potentially bolster its deal flow,...

Hoje, às 14:00

TabNews

Dúvida sobre os "níveis" de experiência · Vnj
Olá galera, estou estudando programação ja faz uns meses e me veio a seguinte dúvida: o que te faz ser um júnior? Pelo que li em alguns posts daqui, seria ter algum contato com o mercado...

Hoje, às 13:15

Hacker News

ARM’s Cortex A53: Tiny But Important
Tech enthusiasts probably know ARM as a company that develops reasonably performant CPU architectures with a focus on power efficiency. Product lines like the Cortex A7xx and Cortex X series use we…

Hoje, às 09:12

Mashable

Save up to 70% on cables, power stations, and more in this Memorial Day sale
This is a great time to upgrade your charging setup. The following content is brought to you by Mashable partners. If you buy a product featured here, we may earn an affiliate commission or other...

Hoje, às 09:00

Tech Crunch

Startups should absolutely work with governments to support defense projects
Maëlle Gavet is the CEO of Techstars and was previously a senior executive at numerous large tech companies around the world. In these times of heightened tensions and global volatility, I believe startups...

Hoje, às 08:30

DEV

How to Track Gumroad Sales in Notion Using Notion API and Python
Introduction In this tutorial, you’ll learn how to track Gumroad1 sales in real-time in Notion2 using 🐍 Python. You will also learn, What are APIs? How to use Gumroad API? How to use Notion API? How run a...

Hoje, às 04:24

TabNews

Como criar um git/github (e as primeras configs) obs: no windows e com o vscode · NicolasdevNx
Olá, este "artigo" tem como objetivo ensinar como baixar e usar o git eo o github(para este não é neseçario o dowload) então vomos lá. 1:Acesse o site https://git-scm.com/downloads escolh...

Hoje, às 02:32