Big O Notation, algorithm efficiency: Internet vs. Pigeon transfer

Big O Notation

Today, we will cover two of my favorite topics: big O and algorithmic efficiency. I'm going to start off with a story. In Big O notation, the internet's data transfer is represented as O(n), while a pigeon's data transfer is represented as O(1). This means that the internet's transfer time increases linearly with the amount of data being transferred, while a pigeon's transfer time is constant regardless of the amount of data

Table Of Content 

Case Study of a Pigeon vs. slow intInternetansferring Data

Back in 2009, there was this company in South Africa, and they had a prolonged internet they were really, really frustrated by; they have actually two offices about 50 miles away, so they set up this little test to see if it would be faster transfer big chunk of data over the intInternetth their actual slow internet service provider or via carrier pigeon literally.  They took a bunch of USB sticks or one giant one strapped into a pigeon and flew it from one office to the next, and they got a ton of press over this test, and the media kinda love this stuff, right. And, of course, the pigeon won all right. It wouldn't be a funny story otherwise. And so they got a whole bunch of press over that, and people were like, oh my god, I can't believe that a pigeon beat the IntInternet.
But the problem was actually a ridiculous test because of the thing. Transferring data over the internet takes longer and longer with how much data there is with certain simplifications and assumptions; that's not the case with pigeons. Pigeons are slow or fast, but it always takes the same amount of time for a pigeon to transfer some chunk of data from one office to the next. It just has to fly 50 miles. It does not take longer because that USB stick contains more data, so of course, for a sufficiently large file, the pigeons will win.

Pigeon transfer data (constant time)

1gb = 60 minutes 
2gb = 60 minutes
3gb = 60 minutes 
1000 gb = 60 minutes 

So, in big O, we talk about this by describing the pigeon transfer speed as O of 1 or O(1), which is constant time concerning the input size. It takes little time with more input. Again, with certain simplifications and assumptions about the pigeon's ability to carry a USB stick. 

Internet transfer data (Linear Time) 

1gb = 30 minutes 
2gb = 60 minutes
3gb = 90 minutes 
1000 gb = 500 hours

But Internet time is. Internet transfer time is O of n or O(n). It scales linearly with the amount of input. Twice the amount of data will take roughly twice as much time. So that's what Big O is. 

How the runtime scales concerning some input variables

Big O is an equation that describes how the runtime scales pertaining to some input variables. I will talk about four specific roles, but first, let me show you what this might look like in code. 

This simple function just walks through an array and checks if it contains a particular value. You would describe this as O of n, where n is the size of the array. What about this method that talks through an array and prints all pairs of values in the array? Note that this will print if it has the elements 5 and 2 in the variety. It'll print both 2, 5 and 5, 2, So you would describe this probably as O of n squared  O(n2) where n is the size of the array. You can see that because it has n elements in the array, it has n squared pairs, so the amount of time it will take you to run that function will increase concerning n squared. Ok, so that's the basics of Big O.

How would you describe the runtime to mow a square plot of land?

Let's take another real-world example. How would you describe the runtime to mow a square plot of land? So, your square grass plot means you have this grass plot and need to harvest it all. What's the runtime of this? Well, it's an interesting question, but you could describe it in two ways. One way you can explain it is O of a, where a is the area in that plot of land. Remember that this is just an equation that expreexpressingime increases, so you don't have to use n in your euse other variables, and often, it makes sense.

So you can describe this as O of a where a is the area. You could also describe this as O of s squared where s is the amount of, which is the length of one side because it is a square plot of land, so the amount of area is s squared. So you must realize that O of a or O(a) and this O of s O(s) squared are saying essentially the same thing, just like when you described the area of a square. You could describe it as 25, or you could describe it as 5 squared because it has a square, and it doesn't make it bigger, so both ways of describing the runtime. It depends on what is clearer for that problem.

Four Important rules to know about Big O

(1) Different steps get added

The first one is if you have two different steps in your algorithm, you add up those steps, so if you have a first step that takes you to know O of a time and a second step that takes O of b, you add up those runtimes, and you get over a plus b. So, for example, you have the runtime that, you have this algorithm, that first walks through one array, and then it walks through this other array. It will be how long it takes to walk through each of them. So you'll add up their runtimes.

(2) Drop Contents

The second way, in the second rule, is to drop constants. So, let's look at these two ways to compress out the min and Max element in your array. One finds the min element and then the max element; the other sees them in the max simultaneously. Now these are fundamentally doing the same thing, they're doing essentially the exact same operations just doing them in slightly different orders.

Both of these get described as O of n, where n is the size of the array. Now, it's really tempting for people to sometimes see two different loops and describe them as O of 2n or O(2n), so you must remember that you drop constants and Big O. You do not tell them as O of 2n or O of 3n because you're just looking for how things scale roughly. Is it a linear relationship, or is it a quadratic relationship, so you always drop constants.

(3) Different input -> different variables

The third rule is that if you have different inputs, you usually use other variables to represent them. So, let's take this example where we have two arrays and walk through them to figure out the number of elements in common between them. Some people will mistakenly call this O of n squared or O(n2), but that needs to be corrected. When discussing runtime, if you describe things as O of n  or O of n squared or O of n log, n must have a meaning; it's not like things are inherently data or other. So when you describe this as O of n squared, it doesn't make sense because it doesn't. What does n mean? n is not the size of the array if there are two different arrays. What you want to describe instead is O of a times b. Again, fundamentally, Big O is an equation expressing how the runtime changes and scales, so you described it as O of a times b or O(a*b).

(4) Drop non-dominant terms

 The fourth rule is that you drop nondominant terms. So you have this code here that walks on the dominant array and prints out the most significant number, and then, for some reason, it prints all pairs of values in the array. So, the first step takes O of n O(n) time. The second step takes O of n squared time O(n2), so you could first get as a, you know, less simplified form,  you can describe to them as O of n+ n squared O(n+n2), but note that you compare this two O of n and the runtime, or the compare this to the run time O of n squared and the runtime O of n squared plus n squared. Both runtimes reduce to n squared, and what's interesting here is that O of n plus n squared should logically be between them. So, this n-squared thing dominates this O of n thing, so you drop that non-dominant term. It's the n squared that's really gonna drive how the runtime changes.

And if you want, nondominant a more academic explanation for when exactly you drop things and what exactly you can't, but this is layman's explanation. We've now covered the fundamental pieces of Big O. This is a basic concept to master for your interviews, so don't be lazy here. Ensure you understand this, and try many exercises to solidify your understanding. Good luck

Post a Comment