Given a positive integer, compute the sum of its digits. This very simple task ended up taking me in my first journey through Python’s internals.
The first solution that came to my mind was:
I could make it a one-liner:
So if I start with the number 1234, I’ll first convert it to the string “1234”. Then I will look at every character in the string convert it back to an integer and add that. That’s ugly, and it sounds so inefficient.
I could get the last digit of an integer doing n % 10. And the next-to-last digit is just the last digit of n/101:
Or an iterative version:
The second way is harder to read, but it’s better, right? Python has to know every digit of a number to convert it to a string, so I was duplicating work by looking at every digit again.
I decided to experiment calling the 4 functions 100,000 times each 2 with a 400-digit integer and comparing run times. My theory: there would be no real difference between two functions of the same way, but the second way would be faster then the first. My results:
Turns out my intuition was right about the first part, but very wrong about the second.
The answer lies in str(n).
Just like second_way, str(n) also does divisions and modulos to get every digit. But it is smarter. For a 400-digit number, instead of doing 400 divisions and modulos by 10 (like second_way) it first does 45 divisions by 10^9, getting 44 9-digit numbers and one 4-digit number. Then it does 400 divisions and modulos by 10 to extract the digits of those numbers.3 Those divisions are much faster because the dividend fits in a machine integer4 so there is way less overhead and the division is essentially just one machine instruction. On the other hand, almost every dividend in second_way does not fit in a machine int.5
Let’s apply this optimization to second_way:
And time it:
That’s faster than first_way!
This experience taught me that the smarter-looking code is not always better. Right after I wrote first_way I thought it was a stupid solution. Then I came up with second_way and I got ashamed of myself for having written first_way. Reflecting on why one solution was so much better than the other led me here.
The next time I need to sum the digits of an integer6 I will use the first way and be proud of it.
I spent a ridiculous amount of time deciding if the base case should be n == 0 or n < 10. If you have a strong opinion about it, please tell me! ↩
100,000 was enough to get a stable run time between several different runs. ↩
second_way does d divisions by 10 for a d-digit number, str does d/9 divisions by 10^9 and then d divisions by 10. ↩
For 32-bits machines, n fits in a machine integer when n < 2,147,483,647. Since 999,999,999 < 2,147,483,647, every 9-digit number fits in a machine integer. ↩
CPython uses 2 types of integers: PyInt (when the number is smaller then a machine int) and PyLong. PyInt divisions are essentially one machine instruction, for PyLong divisions there is overhead. Since CPython does not automatically convert a PyLong to a PyInt when it fits, every division in second_way is a PyLong division. int() forces the conversion. ↩
Okay, I never actually needed to add digits, but if it happens, I’m prepared. ↩