Multimedia

Multimedia

Introduction

  • Odds are you use it everyday, but what is it?

Audio

  • Computers are good at recording, playing back, and generating audio
  • Uses different file formats
    • File formats are just a way of storing 0s and 1s on disk so that certain software knows how to interpret it
  • MIDI
    • Way of storing musical notes for certain songs
    • Can do this for different instruments
      • Programs can render the notes for these instruments
  • GarageBand

    • Included with macOS

      https://cs50.harvard.edu/technology/2017/notes/3/garageband.png

    • This is the Star Wars theme in MIDI

      • Doesn’t sound quite as good as the actual version
      • Computer synthesizes the notes
        • Not an actual recording
        • Computer interprets notes in the MIDI file
  • MIDI is common in the digital workspace among musicians who wish to share music with each other.
  • Humans typically like to hear music preformed and recorded by humans
    • File formats for recorded music include:
      • AAC
      • MIDI
      • MP3
      • WAV
  • WAV is an early sound format, but still used
    • Uncompressed data storage allowing high quality
  • MP3
    • File format for audio that uses compression
      • Significantly reduce how many bits are necessary to store a song
      • Discards 0s and 1s that humans can’t necessarily hear
        • True audiophiles may disagree
      • Trade off between optimizing storage space and sacrificing quality
      • This compression is said to be lossy
        • Losing the quality in the compression process
  • AAC
    • Similar to MP3
    • May see when you download a song from iTunes
  • Streaming services such as Spotify don’t transfer a file to you but rather stream bits of information to you
  • How do we think about the quality of these formats?
    • Sampling frequency
      • Number of times per seconds we take a digital snapshot of what a person would hear
    • Bit depth
      • Number of bits used for these individual snapshots
    • Sampling frequency x bit depth = number of bits necessary to store one second of music
    • Audio file formats allow you to modify what these parameters are

Graphics

  • A graphic, what we see with multimedia, is really just a bunch of pixels both horizontal and vertical

    • All file formats are rectangular in nature, though transparent pixels can make images look to take on other shapes
    • In the simplest form, each of the dots or pixels is a bunch of 0s and 1s

      https://cs50.harvard.edu/technology/2017/notes/3/smile.png

    • To create a file format, we just need to determine a mapping

  • This image is only black and white, so how to represent color?

RGB

  • RGB stands for Red Green Blue
    • With information giving an amount of red, an amount of green, and an amount of blue, you can tell a computer how to colorize pixels
    • None of the colors yields a black pixel
    • All of the colors yields a white pixel
    • In between these two options is where we get all sorts of colors
  • Consider the three bytes: 11111111 00000000 00000000
    • If we interpret these bytes to represent colors, it appears we want all of the red, none of the green, and none of the blue
    • These 24 bits (3 bytes = 3 x 8 bits = 24 bits) represent the color we know as red!
      • If a computer wanted to represent this color, it would store these 24 bits
  • Consider the three bytes: 00000000 11111111 00000000
    • Green
  • Consider the three bytes: 00000000 0000000 11111111
    • Blue
  • Consider the three bytes: 00000000 0000000 0000000
    • Black
  • Consider the three bytes: 11111111 11111111 11111111
    • White
  • Can get many color variations by mixing the above colors in different quantities
  • When we talk about image formats, we typically don’t talk in terms of binary but rather something called hexadecimal (base-16, contains 16 digits)
    • 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, a, b, c, d, e, f
    • 0 is the smallest number we can represent in single digit
    • f is the largest number (value of 15) we can represent in a single digit
  • Consider the 8 bits: 1111 1111
    • Each hexadecimal digit represents four bits
    • One hexadecimal digit can represent the first four bits, another can represent the second four
      • Represent something with eight symbols using only two!
    • 1111 is the decimal number 15, which is f
    • Therefore, 1111 1111 in hexadecimal is ff
  • Red can thus be represented in hexadecimal as ff 00 00
  • Green can be represented in hexadecimal as 00 ff 00
  • Blue can be represented in hexadecimal as 00 00 ff
  • A lot of graphical editing software such as Photoshop use hexadecimal to represent colors

Bitmap Format

https://cs50.harvard.edu/technology/2017/notes/3/xp.png

  • This background for Windows XP was a bitmap file (.bmp)
    • A mapping or grid of bits much like the smiley face from before
  • Zooming in on this image show that it is just a grid of dots

    https://cs50.harvard.edu/technology/2017/notes/3/xpzoom.png

    • Notice the pixelation
  • Much like with audio, so too in the world of images do you have discretion over how many bits to use
    • How many bits to represent each pixel’s color?
    • Resolution is another factor
      • An image that is only 100 pixels scaled up only duplicates the existing limited information, resulting in a blotchy image
      • Would be better to start with image that has a higher resolution (more pixels)
  • A lot of repeated colors, so it seems silly to represent each color with the same number of bits

Image Compression

  • Graphical file formats can often be compressed
  • Can be done lossy or losslessly
    • With audio, we threw away audio information that the human ear can’t necessarily hear
      • This is lossy compression; throwing information away
    • Using fewer bits to represent the same information is lossless compression
  • Lossless compression

    https://cs50.harvard.edu/technology/2017/notes/3/apples.png

  • There is a lot of repeated blue in the first image

    • Using the same 24 bits to represent each pixel!
  • The second image is compressed and not what a user would see
    • The first column contains the color that the rest of the row (scan line) should have
      • Image contains instructions on how to repeat the color in a particular row
    • When a color is encountered that isn’t in the first column (the apple in this case), the instructions would list the colors for each non-repeated pixel
    • This uses less bits but makes the original information recoverable
  • Lossy compression

    https://cs50.harvard.edu/technology/2017/notes/3/sunflower1.png

  • This is a .jpg photograph that is somewhat compressed, but not easy to tell

  • Let’s say we want to compress this image further so that we can share it without going over a social media platform’s limit
  • It contains more complicated patterns of colors, so let’s try a lossy compression resulting in the following:

    https://cs50.harvard.edu/technology/2017/notes/3/sunflower2.png

  • Lossy compression means that I won’t be able to get that original image back

    • The compression throws away bits of information
      • “Does the sky really need this many shades of blue?”
      • “Does this leaf really need this many shades of green”
        • Replaces bits with only a few colors giving an approximation
        • I will not be able to know how clear the sky used to be from this information

Image File Formats

  • BMP
    • Originally used by Windows
    • Not super common these days
  • GIF

    • Low quality images
      • Only supports 8-bit color
    • Often used for memes

      • Can be animated
      • Like a video file with only a few images

        https://cs50.harvard.edu/technology/2017/notes/3/nyan.gif

  • JPEG

    • Supports 24-bit color
    • Losslessly compresses
      • Can minimize amount of compression to create high quality photos
  • PNG
    • High quality graphics
    • Supports 24-bit color
  • All these formats ultimately have an limited amount of information
    • Ultimate just store pixels and colors of when the image was taken

“Enhance”

  • Common for popular culture abuses of what it means to be a multimedia format
    • “Enhancing” means to make an image as clear as possible not matter what format it was saved in
  • David shows a clip of characters “enhancing” an image
    • The characters zoom into a pixelated frame of a video and somehow clear it up to see a reflection
      • Video is just a whole bunch of images being shown to us quickly (24 frames per second, etc.)
    • The pixelated image only contains information for those pixels
      • There is no way to obtain a clear image unless the original image was already at a high resolution
  • David contrasts this with an aware clip of Futurama

Video Compression

  • You can think of a video format as similar to a flip book
  • Video formats are just a bunch of images shown quickly in succession to create the illusion of motion
    • Not necessarily all information stored as png, jpg, gif, or even images
    • Algorithms and mathematics can help go from one frame to another
  • Opportunities for compression

    • Can leverage same image compression techniques for each frame (intra-frame coding)
    • Background of multiple frames can contain redundant information

      https://cs50.harvard.edu/technology/2017/notes/3/interframe.png

    • Compare current frame and next frame of video and determine what has changed

      • Store these differences
      • Key frames store a snapshot of time to remember what the video looks like
      • In each subsequent frame remember what has changed
        • Using algorithms and math, background is drawn
      • Key frames are stored multiple times to guarantee that frames can be recovered

Video File Formats

  • In the world of video, there are more solutions on how to store information
  • Video file formats are containers
    • Containers are digital container in which you can put multiple types of data
    • Can include a video track, audio track, a secondary audio track (for different languages), closed captions, …
  • AVI
    • Commonly used in Windows
  • DivX
  • Matroska
    • Open source container meant to be more versatile
  • MP4
    • Pretty much universal in all browsers
  • QuickTime
    • Commonly used in MacOS
  • Codecs
    • Ways of storing and encoding information
    • For video:
      • H.264
      • MPEG-4 Part-2
    • For audio:
      • Can be stand alone files or tracks in a container!
      • AAC
      • MP3

3D Video

  • Increasingly, 3D formats are becoming more common

    https://cs50.harvard.edu/technology/2017/notes/3/sanders.png

    • This is a 360 degree image of Sanders Theatre
      • A spherical image
      • Looks distorted in 2D
      • Like flattening a globe
  • Images can contain metadata
    • Information that viewers can’t see
    • Tells programs, applications, and browsers how to display the image
  • With sensors on a headset, users can experience virtual reality

    https://cs50.harvard.edu/technology/2017/notes/3/vr.png

  • More file formats are still on the horizon, but ultimately all of them boil down to storing 0s and 1s and why!

Did you find this article valuable?

Support Rahul Singh Parmar by becoming a sponsor. Any amount is appreciated!