Limitations of 32-bit machines


As the stability of modern machines improves, and the amount of data they process increases, the limitations of 32-bit machines are becoming increasingly annoying.

32-bit machines operate with (...drum-roll please...) 32 bits! That means that the numbers they use internally are limited to 32 bits. Each bit can be either 1 or 0 - in other words, each bit can be in one of two states, on or off. So 32 bits can represent 2 to the power of 32 different values, which equals 4,294,967,296 - or, when you're counting bytes, 4 Gigabytes. That seems like a pretty big number, right? The trouble is, it's no longer big enough.
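
To see what that limit looks like in practice, here's a quick C sketch of the wrap-around itself - push a 32-bit counter one step past 4,294,967,295 and it rolls straight back to zero, while a 64-bit counter carries on without fuss:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* A 32-bit unsigned counter can hold 0 .. 4,294,967,295 and nothing more. */
        uint32_t counter = 4294967295u;            /* the largest value that fits */
        printf("counter     = %" PRIu32 "\n", counter);
        counter = counter + 1;                     /* one more byte / tick / whatever */
        printf("counter + 1 = %" PRIu32 "\n", counter);  /* wraps back to 0 */

        /* The same value is no trouble at all for a 64-bit counter. */
        uint64_t big = 4294967295u;
        big = big + 1;
        printf("64-bit      = %" PRIu64 "\n", big);      /* 4294967296 */
        return 0;
    }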

There are four areas where this has annoyed me, and there are bound to be more with time:

  1. Ever copied more than 4 Gigabytes in Microsoft Windows? There is a counter that says how long the copy is going to take. Copy less than 4 Gigabytes, and this counter goes down normally (30 seconds, 20 seconds, 10 seconds, done). Copy more than 4 Gigabytes, and this time goes completely wacko (30 seconds, 20 seconds, 10 seconds, then 4351295 minutes!). How long is it really going to take? Who knows! So why the strange behaviour? Because somewhere along the way the counter is held in a 32-bit number, which can't cope with more than 4 Gigabytes of data.
  2. Ever run the "uptime" command in Linux? It shows how long a machine has been running (e.g. "128 days 7 hours 12 minutes"). Seems to work pretty well, right? The trouble is, it's using a counter to remember how long the machine has been running, and that counter is limited to 32 bits. So how long before it stuffs up? Well, in Linux version 2.4 and below, the counter ticks at a rate of 100 Hertz, or 100 'ticks' per second. So to work out how many days before it stuffs up, we have: 4294967296 ticks / (100 ticks per second * 60 seconds per minute * 60 minutes per hour * 24 hours per day) = 497 days (the arithmetic is also sketched in the code after this list). Sounds like it should be long enough, right? Actually, no - I and many others maintain Linux servers with a longer uptime than 497 days, and it's annoying that the operating system is unable to represent that accurately past this seemingly-arbitrary cut-off point.
  3. Ever measured the amount of data received or transmitted in Linux? Start sending data, and then do a "cat /proc/net/dev" to show the number of bytes of data sent and received - and after 4 Gigabytes, this number will suddenly jump back to zero. Why? Limitations of 32-bit numbers again!
  4. Ever wanted to see the amount of data sent through an ADSL modem? Take the Netcomm NB1300 ADSL modem as an example. Look at the amount of data sent in the modem's configuration, and then transfer more than 4 Gigabytes. Look again - and most of the numbers are now negative. How can you send a negative amount of data? Answer: You can't - it's just another instance of 32-bit numbers being used when they're too small for the task at hand: most likely, once the count goes past 2,147,483,647 it's being displayed as a signed 32-bit number, so it shows up as negative (see the sketch after this list).
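
The arithmetic behind items 2 and 4 is easy to check for yourself. The C sketch below (mine, not anything from the kernel or the modem firmware) works out the 497-day figure and shows how a perfectly ordinary byte count between 2 and 4 Gigabytes turns negative the moment something prints it as a signed 32-bit number:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* Item 2: a 32-bit tick counter at 100 ticks per second wraps after
           2^32 ticks.  How many days is that? */
        double seconds = 4294967296.0 / 100.0;     /* ticks / ticks-per-second */
        double days    = seconds / (60 * 60 * 24); /* 86,400 seconds per day */
        printf("uptime counter wraps after about %.1f days\n", days);  /* ~497.1 */

        /* Item 4: a traffic counter that has reached 3 Gigabytes, displayed as
           a signed 32-bit number. */
        uint32_t bytes_sent = 3u * 1024u * 1024u * 1024u;   /* 3,221,225,472 */
        printf("as unsigned: %" PRIu32 " bytes\n", bytes_sent);
        /* On the usual two's-complement machines this prints -1,073,741,824. */
        printf("as signed:   %" PRId32 " bytes\n", (int32_t)bytes_sent);
        return 0;
    }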

The fact is that these things are annoying to me now, and they'll be annoying more and more people as time progresses.

There are two ways of solving these problems:

  1. Work around the problem - e.g. use two 32-bit numbers internally to make up a 64-bit value in the situations where large numbers are anticipated.
  2. Use 64-bit computers.

The work-around is problematic - it's potentially tricky (32-bit machines provide atomic operations on 32-bit values, not 64-bit ones, so care has to be taken to ensure that the two halves always stay consistent with each other; the sketch below gives an idea of what's involved). Furthermore, programmers are bound to miss some situations where 32 bits are not enough, and because of the nature of the problem, the mistake usually only shows up after a lot of time has passed or a lot of data has been sent - potentially making the 'debug cycle' take a very long time.
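
To give a feel for why that care is needed, here's a rough C sketch of one common shape for the work-around: a 64-bit counter kept as two 32-bit halves, with a small sequence counter so readers can tell when they've caught the writer mid-update. It's a simplified, single-writer illustration of my own - a real implementation also needs locking or memory barriers on top of this - but it shows how easy the scheme would be to get subtly wrong:

    #include <stdint.h>

    /* A 64-bit byte counter stored as two 32-bit halves. */
    struct counter64 {
        volatile uint32_t seq;   /* odd while an update is in progress */
        volatile uint32_t low;
        volatile uint32_t high;
    };

    /* Writer: called for every chunk of data sent (single writer assumed). */
    static void counter_add(struct counter64 *c, uint32_t bytes)
    {
        c->seq = c->seq + 1;            /* mark the update as in progress */
        uint32_t old_low = c->low;
        c->low = old_low + bytes;
        if (c->low < old_low)           /* the low half wrapped past 2^32 ... */
            c->high = c->high + 1;      /* ... so carry into the high half */
        c->seq = c->seq + 1;            /* mark the update as complete */
    }

    /* Reader: the two halves can't be read in one atomic 32-bit operation,
       so retry whenever a write overlapped the read.  Forget this loop and
       the value is occasionally garbage - but only once the counter has
       actually crossed 4 Gigabytes. */
    static uint64_t counter_read(const struct counter64 *c)
    {
        uint32_t s1, s2, low, high;
        do {
            s1   = c->seq;
            low  = c->low;
            high = c->high;
            s2   = c->seq;
        } while (s1 != s2 || (s1 & 1u));
        return ((uint64_t)high << 32) | low;
    }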

The real solution is to use 64-bit computers - that way programmers don't have to do anything special to take care of the cases where 32 bits are not enough. The trouble is that 64-bit computers won't become really cheap until they are mass-manufactured in the way that 32-bit computers are. Currently, the vast bulk of people are running 32-bit chips, since they are very cheap, very powerful, and have great software support. The chip manufacturers are having trouble selling 64-bit chips, as they are more expensive and their software support is comparatively poor. The software writers support 32-bit machines far more than 64-bit machines, since they only care about what their customers are using. So getting 64-bit machines out there and into general use is a bit of a chicken-and-egg problem: although we definitely have the technology to make 64-bit machines that are relatively inexpensive, just about everyone is still running 32-bit hardware, which tends to exhibit these unexpected behaviours when working with big numbers.