ASCII & Unicode

A bit is a single binary value: 0 or 1, on or off.

A byte is a set of 8 bits, e.g. 01101010.
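As a quick sketch (assuming a Python 3 interpreter), a byte's bit pattern can be read as a base-2 number and converted back again:

```python
# A byte is 8 bits; Python can parse a bit string as a base-2 number.
bits = "01101010"
value = int(bits, 2)         # binary 01101010 is decimal 106
print(value)                 # 106
print(format(value, "08b"))  # back to the 8-bit string: 01101010
```

Working out the value by hand gives the same answer: 64 + 32 + 8 + 2 = 106.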

All characters in all languages, along with numbers, punctuation marks and so on, have to be represented using bits and bytes of binary. This is done using character sets.

One example of a character set is ASCII (American Standard Code for Information Interchange). This uses 7 bits per character, giving 128 possible combinations to represent 128 different characters. Enough for the English language.

The first 32 codes represent non-printing control characters (carriage return, tab, etc.). The rest are for letters, numbers, punctuation and the space.
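A minimal sketch of the mapping between characters and their ASCII codes, using Python's built-in `ord()` and `chr()`:

```python
# ord() gives a character's code; chr() goes the other way.
print(ord("A"))   # 65 — a printable character
print(chr(97))    # a
# Codes 0-31 are non-printing controls, e.g. carriage return is 13.
print(ord("\r"))  # 13
```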



For many languages, 128 different combinations was not enough, so Unicode was created. Unicode originally used 16 bits per character, which allows 65,536 different combinations. The standard was later extended beyond 16 bits, and it can now represent over a million characters. More than enough for all of the different characters from every language, plus punctuation, emoticons etc.
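A short Python sketch of this: an emoji's code point is larger than 65,535, so it cannot fit in a single 16-bit unit and takes 4 bytes in the common UTF-8 encoding:

```python
heart = "\u2764"       # ❤  — code point 10,084, fits in 16 bits
smiley = "\U0001F600"  # 😀 — code point 128,512, needs more than 16 bits
print(ord(heart))                   # 10084
print(ord(smiley))                  # 128512
print(len(smiley.encode("utf-8")))  # 4 bytes in UTF-8
```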
