CS286 -- Lab 4: Compressing



->Courses
->CS286
-->Homework
-->Lab 1
-->Lab 2
-->Lab 3
->Lab 4
->Electronic Submission
->Old Exams
->C++ Examples
->MSVC++ Info
->STL Info
->MFC/GUI Info
->Software
->Tentative Schedule
->Support Forum
->Course Policies

[Courses]
[Rich][Home][Rich]
[PHome]

Spring Quarter 2003

Objectives Addressed

  • Be able to apply asymptotic time complexity analysis to choose among competing algorithms.
  • Understand how graph and tree structures are implemented.
  • Be familiar with engineering applications for many of the fundamental algorithms discussed in the course.

Purpose

This lab assignment is designed to provide exposure to an example of a greedy algorithm and develop more advanced software design skills.

Assignment

You are to design a C++ program that will compress and uncompress the words.txt file. Your program should allow for two forms of compression:

  • Optimal fixed length codeword (FLC) compression
  • Optimal variable length codeword (VLC) compression using Huffman Coding

You may assume that the file consists of a total of 28 symbols (newline, end of file, and the twenty-six lowercase letters of the English alphabet). If you choose to make your program capable of handling a larger set of symbols, be sure to mention this in your report. Although not required, you may wish to consider using command line arguments for the user interface.

You should consider the following questions when preparing your report:

  • Which method produces a smaller file? Is this what you would expect?
  • Is it possible that an input file could be constructed that would cause the better of the two methods in this case to produce the larger output?
  • Would sorting the words.txt file before compression improve the results of either compression scheme?
  • What is the largest code word needed using Huffman compression (using the words.txt file)?

Design Decisions

While the concepts implemented in this assignment are not difficult, the details involved in implementing them are not trivial. Be sure to spend adequate time in the design stages before moving on. I have intentionally left many of those design decisions for you to discover.

Week 9 demo (due beginning of week 9 lab)

You will be required to demonstrate your FLC algorithm for compressing and uncompressing a text file. DEMO FILE

Lab report (due 4:00pm, Friday of week 10)

Each pair should submit one lab report. The report need not be self-contained. Your report should include:

  • Discussion (your analysis of the compression rates, answers to the questions, the reasons for defining the classes as you did, how you worked as a team, any problems you encountered, etc...)
  • Program output -- include the following:
    1. The frequency count of each letter found in words.txt (within <code><![CDATA[ and ]]></code> formatting commands)
    2. The VLC table which indicates the binary codeword used for each symbol found in the input file (also within appropriate formatting commands)
    3. The binary VLC compressed file of the words.txt file (as a separate file that is linked within the XML report).
  • CSP spreadsheet (Continue with the spreadsheet you used for lab 2.)
  • Documented source code

Test Files

Template Lab Report

As with any report you submit, correct spelling and grammar are required. In addition, your report should be submitted electronically following the Electronic Submission Guidelines. (You may wish to consult the sample report before submitting your report.) Be sure to keep copies of all your files, in case something gets lost. It may be wise to keep a diskette backup as well.

If you have any questions, consult the instructor.

© 2001-2003 Dr. Christopher C. Taylor Office: CC-27C Phone: 277-7339 Last Updated: Fri Apr 25 19:47:34 2003
I am responsible for all content posted on these pages; MSOE is welcome to share these opinions but may not want to.