<– 2 – Pixels, Images, and Adobe Photoshop | 4 – Storing Data: Spreadsheets and Databases –>

Learning Objectives

After completing this unit students will be able to:

  • Explain why files need to be compressed
  • Identify metadata and its importance in compressing data
  • Demonstrate how images, text, audio, and video can be compressed and decompressed
  • Identify and explain “lossy” compression techniques, such as JPEG
  • Identify and explain “lossless” compression techniques
  • Explain the differences between fixed-length encoding and variable-length (prefix-free) encoding
  • Compress/Decompress text documents using dictionaries, Huffman Trees, and LZ77
  • Compress/Decompress images using run-length encoding and discarding data
  • Compare and contrast two types of video compression: interframe and intraframe
  • Explain how psychoacoustics plays a role in compressing audio

Suggested Reading

Important Vocab

  • Binary Tree – a data structure in which each node has at most two child nodes, or “branches”
  • Bit Depth – the number of bits used for each audio sample, which determines how precisely the amplitude of the analog wave is recorded
  • Bit Rate – the number of bits that can be processed per second
  • Codec – a computer program that encodes or decodes data
  • Dictionary – a key stored in metadata that gives the instructions for encoding or decoding compressed data
  • Discarding Data – a type of lossy compression that removes unneeded data with no way to get that data back
  • Fixed-length Code – a code in which every block is the same size
  • Huffman Tree – a binary tree built from character frequencies that yields a prefix-free code; among codes that give each individual character a whole number of bits, it produces the shortest encoding
  • Interframe Compression – a video compression technique that reuses redundant pixels from one frame to the next; also known as temporal compression
  • Intraframe Compression – a video compression technique that compresses each frame on its own; also known as spatial compression
  • Lossless – data compression that does not lose data during compression
  • Lossy – data compression that loses data during compression
  • Metadata – additional data about the main data, usually at the beginning of a file
  • Prefix-Free Code – a specific type of variable-length code in which no code word is the beginning of another, so it can be decoded without pauses or separators between code words
  • Psychoacoustics – a sub-branch of psychophysics that deals specifically with sound
  • Psychophysics – a branch of psychology that studies how physical stimuli relate to perception; lossy compression relies on the fact that the human eye or ear cannot perceive the loss of certain data
  • Redundancy – repeated or predictable data, such as frequent symbols or recurring patterns, that compression can represent more compactly
  • Run-Length Encoding – a lossless technique that replaces each run of repeated values with a single value and a count
  • Sample Rate – how often an analog signal is measured (sampled) when converting it to digital, usually expressed in kHz
  • Uncompressed – all the information from an original file in the same format
  • Variable-length Code – each data block can be a different length
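
Several of the terms above can be seen working together in a short sketch. The Python below is only an illustration under stated assumptions, not the course's own material: the function names (rle_encode, rle_decode, huffman_codes, is_prefix_free) and the sample strings are made up for this example. It shows run-length encoding as a lossless round trip, then builds a Huffman code and compares its size with a fixed-length code.

```python
import heapq
from collections import Counter
from itertools import groupby


def rle_encode(text):
    """Run-length encode: collapse each run into a (symbol, count) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs):
    """Reverse run-length encoding; nothing is lost (lossless)."""
    return "".join(symbol * count for symbol, count in pairs)


def huffman_codes(text):
    """Build a prefix-free, variable-length code from symbol frequencies."""
    counts = Counter(text)
    if len(counts) == 1:
        # Edge case: a single distinct symbol still needs a one-bit code.
        return {symbol: "0" for symbol in counts}
    # Heap entries: (frequency, tie_breaker, {symbol: code_so_far}).
    heap = [(freq, i, {symbol: ""})
            for i, (symbol, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees under a new parent node.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def is_prefix_free(codes):
    """True if no code word is the beginning of another code word."""
    words = list(codes.values())
    return not any(a != b and b.startswith(a) for a in words for b in words)


# Run-length encoding on a made-up row of white (W) and black (B) pixels.
pixels = "WWWWWWBBBWWWW"
pairs = rle_encode(pixels)          # [('W', 6), ('B', 3), ('W', 4)]
restored = rle_decode(pairs)        # identical to the original row

# Huffman coding on a made-up message with 4 distinct characters.
message = "aaaaaaaabbbccd"
codes = huffman_codes(message)
huffman_bits = sum(len(codes[ch]) for ch in message)   # 23 bits
fixed_bits = len(message) * 2       # 2 bits per character -> 28 bits
```

The frequent character "a" receives a one-bit code word while the rare "c" and "d" receive three bits each, which is why the variable-length total (23 bits) beats the fixed-length total (28 bits). The comparison leaves out the dictionary that would have to be stored as metadata so a decoder can rebuild the code.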
