What Makes for a Good Beginner's Project?

You can write the small, fun scripts that programming courses teach. But when you want to create something "real-world," you don't know where to start. Between programs that are laughably simplistic and ones that are gigantic monoliths, how do you find a project that's "real-world" enough?

First, let's look at what makes a program complex, and how that relates to your skill. Larger programs are not necessarily more complex than smaller ones. Let me explain.

You might agree that a 1000-line program is more complex than a 10-line piece of code. But it is possible for a 1000-line program to be just:

print("Hello 1")
print("Hello 2")
...
print("Hello 1000")

The program above is not "complex." It's just one line repeated 1000 times.
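One way to see how little is going on in that program is to notice that it collapses into a short loop. The sketch below is only an illustration; the function name `hello_lines` and the `count` parameter are my own choices, not part of the original example.

```python
def hello_lines(count=1000):
    """Build the text of each print statement, one string per line."""
    return [f"Hello {i}" for i in range(1, count + 1)]


# Each line depends only on its own number, never on any other line.
for line in hello_lines():
    print(line)
```

Because no line depends on another, the whole program is described by a single rule applied `count` times.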

However, it is possible for just a few lines of code to be extremely confusing. Look at the piece of pseudo-code below:

# Pseudo-code credit: Wikipedia
procedure F(A : list of items)
    n := length(A)
    repeat
        swapped := false
        for i := 1 to n - 1 inclusive do
            if A[i - 1] > A[i] then
                swap(A[i - 1], A[i])
                swapped := true
            end if
        end for
        n := n - 1
    until not swapped
end procedure

The code above might look complex to you. It is the pseudo-code for an algorithm called BubbleSort. The code aside, it also takes effort to understand the algorithm itself, its efficiency and correctness.
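If you'd like to run it yourself, here is one possible Python translation of the pseudo-code above. The name `bubble_sort` and the choice to sort the list in place and also return it are my assumptions; the pseudo-code does not specify either.

```python
def bubble_sort(items):
    """Sort a list in place in ascending order and return it."""
    n = len(items)
    swapped = True  # forces the first pass, like "repeat ... until"
    while swapped:
        swapped = False
        for i in range(1, n):
            if items[i - 1] > items[i]:
                # Neighbours are out of order, so exchange them.
                items[i - 1], items[i] = items[i], items[i - 1]
                swapped = True
        # After each pass the largest remaining item has reached its
        # final position, so the next pass can stop one item earlier.
        n -= 1
    return items


print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```

Notice how `n`, `swapped`, and the inner loop all depend on one another: change any one of them and you have to rethink the rest.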

Thus, it is possible for a small amount of code to look complex, and a large amount of code to still be simple. So where does the complexity come from if not lines of code?

Complexity is the quantity of interrelationships between lines of code within a piece of software.

In the first example of endless print statements, we had a great quantity of lines, but each line was fairly independent. You could directly understand line 477 without having to have read the 476 lines that preceded it.

In the second example, however, you might find yourself reading those 13 lines of code repeatedly to understand the role of each line in the grand scheme of things. That's because each line is related to the lines around it in ways that may not be obvious on a first pass.

Imagine having a 1000-line program where every line was related to every other line, and you had to re-read those 1000 lines repeatedly to understand what each line actually does. Sounds scary? It is. That is why one major aspect of software engineering is managing complexity by using abstractions like functions, classes, modules, or even APIs. For example, when you read the code inside one function, you can understand it without needing to know what's inside the other functions it calls.
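Here is a small sketch of that idea. The three functions and their names (`parse_scores`, `average`, `report`) are hypothetical, invented only for illustration.

```python
def parse_scores(text):
    """Turn a comma-separated string such as "70, 85, 90" into a list of ints."""
    return [int(part) for part in text.split(",")]


def average(numbers):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(numbers) / len(numbers)


def report(text):
    """Summarise a string of scores in one sentence."""
    scores = parse_scores(text)
    return f"{len(scores)} scores, average {average(scores):.1f}"


print(report("70, 85, 90"))  # 3 scores, average 81.7
```

To follow `report`, you only need to know what `parse_scores` and `average` give back, not how they do it. Each function keeps its own interrelationships to itself.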

As a programmer, your job is to deal with the complexity of software. Thus, if the software you're creating does NOT exercise your ability to deal with complexity, you won't grow.

Here are some examples that might be fun to make, but don't push your boundaries much:

  1. Very simple scripts: These scripts serve as examples of language features. Their purpose is to teach rather than to do something meaningful.

  2. Scripts heavily using external libraries: A small script that does something cool because you off-loaded the work to Pandas or NumPy won't push your boundaries. The complexity is being handled by Pandas or NumPy, not you.

  3. Copy-pasting existing code "while understanding it": The process of thinking about how to write code is far more important than the code itself. When you copy code written by others, you might understand it, but you've deprived yourself of the opportunity to design it.

Instead, try the following:

  1. Implementing an algorithm. It doesn't matter which one, because any algorithm packs many interrelationships into a few lines. It's just like a jigsaw puzzle: the final picture can be anything, but the puzzle is defined by the shapes of the pieces themselves.

  2. Creating something basic from scratch rather than using libraries like Pandas or NumPy. Your code might be inefficient, but you will still learn a lot from implementing the "logic" yourself. And that's the most important part.

  3. Changing an existing program to do something else or something more. Changing a program means you don't have the flexibility of starting from scratch. You're already constrained by prior decisions. This is where you will understand how abstractions like functions, classes, and modules actually work to reduce complexity.
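As an illustration of the second suggestion, here is a minimal sketch of a "group and total" operation written by hand, the kind of work you would normally hand to a library. The function name `group_totals` and the sample sales data are made up for this example.

```python
def group_totals(rows, key, value):
    """Add up row[value] for each distinct row[key]."""
    totals = {}
    for row in rows:
        group = row[key]
        # Start each new group at 0, then accumulate.
        totals[group] = totals.get(group, 0) + row[value]
    return totals


# Hypothetical data, invented for the example.
sales = [
    {"city": "Pune", "amount": 120},
    {"city": "Delhi", "amount": 80},
    {"city": "Pune", "amount": 30},
]

print(group_totals(sales, "city", "amount"))  # {'Pune': 150, 'Delhi': 80}
```

It is slower and less general than what a library offers, but writing it forces you to decide how groups are stored, what happens the first time a group appears, and what the result should look like. Those decisions are the "logic" you are practising.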