Garbage collection

You don't get to decide when garbage collection is done…
Someone just comes along and does it whether you're ready or not

Garbage collection (GC) is an activity performed by a program to reclaim resources, usually memory, that are no longer used and make them available again.

In the case of memory, many languages - including scripting languages such as Tcl or PHP - use a strategy where memory for a value is allocated from a pool and a ''reference count'' records how many places currently refer to it. The count is incremented each time a new reference to the value is created and decremented each time a reference goes away. This allows pointers or references to the value to be passed around rather than copies of the value itself, which can save memory and be quite fast. Not every language works this way: Java, for example, periodically traces which objects are still reachable rather than counting references, and C++ offers reference counting only through library facilities such as std::shared_ptr.

However, there needs to be some mechanism which detects when a value is no longer referenced, for example when its reference count reaches zero, so that its memory can be recycled. This is garbage collection.
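The counting scheme described above can be sketched in a few lines. This is a toy model rather than how any particular language implements it: the class CountedPool and its methods allocate, add_ref and release are hypothetical names invented for this illustration.

```python
class CountedPool:
    """Toy pool that recycles a slot when its reference count drops to zero."""

    def __init__(self, size):
        self.free_slots = list(range(size))   # slots available for allocation
        self.values = {}                      # slot -> stored value
        self.ref_counts = {}                  # slot -> number of live references

    def allocate(self, value):
        """Take a slot from the pool; the caller holds the first reference."""
        slot = self.free_slots.pop()
        self.values[slot] = value
        self.ref_counts[slot] = 1
        return slot

    def add_ref(self, slot):
        """A new reference to the slot has been handed out."""
        self.ref_counts[slot] += 1

    def release(self, slot):
        """A reference went away; reclaim the slot if it was the last one."""
        self.ref_counts[slot] -= 1
        if self.ref_counts[slot] == 0:
            del self.values[slot]
            del self.ref_counts[slot]
            self.free_slots.append(slot)      # the slot is available again


pool = CountedPool(size=2)
a = pool.allocate("hello")    # count is 1
pool.add_ref(a)               # passed to a second owner: count is 2
pool.release(a)               # first owner finished: count is 1, value survives
still_alive = a in pool.values
pool.release(a)               # last owner finished: count is 0, slot recycled
recycled = a not in pool.values and a in pool.free_slots
```

In this sketch the reclamation happens inside release, at the moment the count reaches zero. Real implementations may defer or batch that work, which is where the timing problems discussed below come from.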

The same process is sometimes used for other resources such as file handles and threads, but this is much less common than garbage collection of memory.

While a good idea, and necessary for some variable management approaches, GC suffers from several problems when we're considering optimisation. The most immediate are:

  • You don't know when or how often it'll happen, and
  • You don't know how long it will take each time it happens.

Firstly, garbage collection is usually not under the direct control of the programmer. It just happens. This means the time it takes can't easily be predicted or budgeted for when optimising for speed.
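Even when collections can't be scheduled, they can sometimes be observed. The sketch below assumes the CPython interpreter, whose gc module exposes a gc.callbacks list that is invoked just before and just after each run of its cycle collector. The names record_gc, pauses and Node are made up for this example. It only reports how many collections happened and how long they took; it does not control them.

```python
import gc
import time

pauses = []          # (generation, seconds) for every collection observed
_started_at = None


def record_gc(phase, info):
    """Called by CPython's collector just before and just after each run."""
    global _started_at
    if phase == "start":
        _started_at = time.perf_counter()
    elif phase == "stop" and _started_at is not None:
        pauses.append((info["generation"], time.perf_counter() - _started_at))
        _started_at = None


class Node:
    """Hypothetical object that refers to itself, so counting alone can't free it."""

    def __init__(self):
        self.me = self


gc.callbacks.append(record_gc)
try:
    for _ in range(50_000):
        Node()               # each one becomes cyclic garbage immediately
    gc.collect()             # force a final collection so at least one is recorded
finally:
    gc.callbacks.remove(record_gc)

longest = max(seconds for _, seconds in pauses)
print(f"{len(pauses)} collections, longest pause {longest * 1e6:.0f} microseconds")
```

The number of collections and their durations are likely to differ from run to run and from one interpreter release to the next, which is exactly the unpredictability described above.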

Secondly, Wikipedia's article on the subject says, "Like other memory management techniques, garbage collection may take a significant proportion of total processing time in a program and can thus have significant influence on performance." (Wikipedia:Garbage collection (computer science))

Thirdly, language developers are under few or no constraints about how they implement garbage collection and how consistent they are about it. This means that if you develop in a language which uses garbage collection, you may find that your software suddenly performs differently when a new release of the language changes its garbage collection mechanism.

Mitigation

Where timing is important, languages which perform garbage collection may not be the best choice, and more predictable and consistent languages such as C or even assembly language may be better.

You can also avoid some garbage collection by using statically-allocated variables as much as possible, but in some languages there's a limit to how far you can go with this approach. Object-oriented languages can be particularly problematic here when new objects are being created and destroyed.
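In a language without true static allocation, the nearest equivalent is to allocate the objects once, up front, and reuse them. The following is a minimal sketch of that idea, assuming a fixed upper bound on how many objects are needed at once. Particle and ParticlePool are hypothetical names used only for illustration.

```python
class Particle:
    """Hypothetical small object that a hot loop would otherwise create repeatedly."""

    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 0.0
        self.y = 0.0


class ParticlePool:
    """Allocates every Particle once, then hands the same objects out again."""

    def __init__(self, size):
        self._free = [Particle() for _ in range(size)]   # all allocation happens here

    def acquire(self, x, y):
        particle = self._free.pop()      # raises IndexError if the pool is exhausted
        particle.x, particle.y = x, y
        return particle

    def release(self, particle):
        self._free.append(particle)      # return it for reuse instead of dropping it


pool = ParticlePool(size=4)
seen_ids = set()
for step in range(1_000):
    p = pool.acquire(float(step), 0.0)
    seen_ids.add(id(p))                  # track which objects the loop really touched
    pool.release(p)
```

The loop touches only objects that already existed before it started, so it creates no new Particle for the collector to reclaim. It also shows the limit mentioned above: each float(step) is still a fresh temporary value, and in many languages there is no practical way to avoid allocations of that kind.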