20 watts
  • A human brain uses about 20 W of power.
  • A single A100 GPU card uses 400 W of power, H100 uses 700 W.
  • Rumour has it that GPT-4 requires 128 such cards to run a single instance of the model. So a low estimate would be 128 × 400 W ≈ 50 kW to run inference on a current-generation LLM.
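The comparison above can be checked with a quick back-of-envelope calculation (the 128-card figure is a rumour, as noted, so the result is only indicative):

```python
# Back-of-envelope power comparison using the figures from the notes above.
brain_w = 20      # human brain, watts
a100_w = 400      # single A100 GPU, watts
n_gpus = 128      # rumoured card count per GPT-4 inference instance

llm_w = a100_w * n_gpus        # total inference power in watts
ratio = llm_w / brain_w        # how many brains' worth of power

print(f"LLM inference: {llm_w / 1000:.1f} kW, {ratio:.0f}x a human brain")
```

So even the low A100-based estimate puts LLM inference roughly three orders of magnitude above the brain's 20 W budget.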

  • Why this discrepancy? One possible explanation is as follows.

  • Biological brains do not compute using the standard von Neumann architecture, flipping bits under a synchronous, sequential instruction stream.
  • Brains use time as an important communication and computation resource. This utilisation of time goes beyond the simple clocking of a CPU or concurrency.
  • Spiking neurons perform communication and processing in space-time, with the emphasis on time. In this paradigm, time is used as a freely available resource for both communication and computation.
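A minimal sketch of this idea is the leaky integrate-and-fire (LIF) neuron, where information is carried by *when* spikes occur rather than by continuous activation values. All parameters below (tau, thresholds, input magnitude) are illustrative assumptions, not values from the notes:

```python
# Minimal leaky integrate-and-fire (LIF) neuron: the output is a list of
# spike *times*, illustrating time as the carrier of information.

def simulate_lif(input_current, dt=1.0, tau=20.0, v_rest=0.0,
                 v_thresh=1.0, v_reset=0.0):
    """Return spike times (ms) for a sequence of input-current samples."""
    v = v_rest
    spike_times = []
    for step, i_in in enumerate(input_current):
        # Leaky integration: the membrane potential decays toward rest
        # while being driven by the input current.
        v += dt / tau * (v_rest - v + i_in)
        if v >= v_thresh:              # threshold crossing -> emit a spike
            spike_times.append(step * dt)
            v = v_reset                # reset after spiking
    return spike_times

# A constant suprathreshold input makes the neuron spike periodically;
# a stronger input would shift the spikes earlier (rate/latency coding).
spikes = simulate_lif([1.5] * 200)
```

Note that nothing here is clocked in the von Neumann sense: the spike train itself, i.e. the pattern of firing times, is the signal.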

  • Neuromorphic chips that could run such bio-inspired neural networks are being researched. Here is one example:

  • Neuromorphic computing