601.229 (F19): Homework 1: Warmup
- Out on: Tuesday, September 3, 2019
- Due by: Friday, September 13, 2019 before 10:00 pm
- Collaboration: None
- Grading: Packaging 10%, Style 10%, Design 10%, Functionality 70%
Update September 9th: Link to Coding style guidelines.
Update September 5th: Clarified program exit code and packaging/Makefile requirements.
Acknowledgment
This assignment was originally developed by Peter Froehlich for his version of CSF.
Overview
This assignment is mostly a warmup exercise that gives you a chance to review your C programming skills. You’ll also learn something useful about what’s actually in a file, which will come in very handy for future assignments. The program is also easy to test because you can compare your output against that of existing UNIX tools, so test a lot! If you have trouble with this assignment, you’ll likely have even more trouble with future assignments.
Problem 1: Hexdumps (100%)
Start by reading up on what hexdumps are. For this problem, you will write a program hex.c that produces a hexdump on standard output for data read from standard input.
Let’s start with an example:
$ ./hex
Hello
00000000: 48 65 6c 6c 6f 0a                                Hello.
The program was started, then the user typed the word “Hello” followed by return/enter, then CTRL-D was used to stop the input. The result shows the ASCII code for each character (in hexadecimal, so it’s guaranteed to be two digits wide for each character), including the newline character generated by the return/enter key. The formatting may look a bit strange, but the purpose of the “large gap” becomes apparent if we examine a longer input:
$ ./hex
This is a longer example of a hexdump. Marvel at its magnificence.
00000000: 54 68 69 73 20 69 73 20 61 20 6c 6f 6e 67 65 72  This is a longer
00000010: 20 65 78 61 6d 70 6c 65 20 6f 66 20 61 20 68 65   example of a he
00000020: 78 64 75 6d 70 2e 20 4d 61 72 76 65 6c 20 61 74  xdump. Marvel at
00000030: 20 69 74 73 20 6d 61 67 6e 69 66 69 63 65 6e 63   its magnificenc
00000040: 65 2e 0a                                         e..
This time the user entered two sentences, then signaled end of input with CTRL-D. Again, we see the ASCII code for each character (including spaces and newlines). The formatting is set up so that regardless of the number of characters, we always have three “columns” of output:
- First the overall “position” in the input. Note that this is also a hexadecimal number, formatted to 8 digits.
- Then the ASCII values for each character in hexadecimal, at most 16 to a line.
- Finally a string-like representation of the data, with printable characters shown but non-printable characters (like newline or tab) replaced with a dot.
Note that there’s a single space between the colon after the offset and the ASCII values, but there are two spaces between the ASCII values and the string-like representation.
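The three-column layout described above can be sketched as a single helper that formats one line. This is only an illustration under stated assumptions, not the starter code or a reference solution: the names format_line, BYTES_PER_LINE and LINE_BUF_SIZE are hypothetical, and the offset is assumed to fit in 32 bits so that it never needs more than 8 hex digits.

```c
#include <assert.h>
#include <ctype.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#define BYTES_PER_LINE 16
/* 8 offset digits + ':' + 16 * " xx" + 2 spaces + 16 text chars + '\n' + NUL */
#define LINE_BUF_SIZE (8 + 1 + BYTES_PER_LINE * 3 + 2 + BYTES_PER_LINE + 1 + 1)

/*
 * Hypothetical helper (not from the starter code): format up to 16 bytes
 * as one hexdump line in out, which must hold LINE_BUF_SIZE bytes.
 * Returns the length of the line, including the trailing newline.
 */
int format_line(uint32_t offset, const unsigned char *data, size_t count, char *out)
{
    assert(count <= BYTES_PER_LINE);

    /* Column 1: the offset as 8 hex digits, followed by a colon. */
    int pos = sprintf(out, "%08" PRIx32 ":", offset);

    /* Column 2: one space, then two hex digits, for each byte present. */
    for (size_t i = 0; i < count; i++) {
        pos += sprintf(out + pos, " %02x", (unsigned int)data[i]);
    }

    /* Pad a short last line so the text column stays aligned, then add
       the two spaces that separate the hex values from the text. */
    size_t pad = (BYTES_PER_LINE - count) * 3 + 2;
    memset(out + pos, ' ', pad);
    pos += (int)pad;

    /* Column 3: printable bytes as themselves, everything else as a dot. */
    for (size_t i = 0; i < count; i++) {
        out[pos++] = isprint(data[i]) ? (char)data[i] : '.';
    }
    out[pos++] = '\n';
    out[pos] = '\0';
    return pos;
}
```

Formatting into a buffer rather than printing directly is just one possible design; it keeps the layout logic separate from the input loop and makes the helper easy to check against a known line of xxd -g 1 output.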
On Piazza you’ll find some starter code for this program. You can of course ignore the starter code and write the entire thing from scratch yourself, but we recommend you use the starter code: It contains a few important hints that you may not want to live without. Good luck!
Error handling
Any run-time errors, including (but not limited to) failing to open the input file, or failing to read data from the input source, should be handled by printing an error message to stderr and exiting the program with an exit code of 1. (An exit code is the value returned by the program’s main function, or the value passed to the standard library exit function.)
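As a rough sketch of this requirement, the helper below reads its input in 16-byte chunks and returns the exit code the program should finish with. The name consume_input, the total parameter and the wording of the error message are assumptions made for illustration; the starter code may organize this differently.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read everything from in, 16 bytes at a time, and
 * return the exit code for the program (0 on success, 1 on a read error).
 * The number of bytes read is stored in *total.
 */
int consume_input(FILE *in, unsigned long *total)
{
    unsigned char chunk[16];
    size_t n;

    assert(in != NULL && total != NULL);

    *total = 0;
    while ((n = fread(chunk, 1, sizeof chunk, in)) > 0) {
        *total += n; /* a real hexdump would format and print the chunk here */
    }

    /* fread returns 0 both at end of input and on failure, so ferror is
       what tells the two cases apart. */
    if (ferror(in)) {
        fprintf(stderr, "hex: error while reading input\n");
        return 1;
    }
    return 0;
}
```

With a helper like this, main could pass stdin and return the result directly, which makes the returned value the program's exit code.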
Hints
- You can use the existing xxd program to check your output against. Running xxd -g 1 and then typing into that should produce exactly the same output as your hex.c program.
- Try to find a function that allows you to check whether a character is “printable” or not. Don’t try to decide that question yourself!
- You’ll need a small, fixed-size array for this problem, but the details are already laid out for you in the starter code.
- Remember to break up your code into smaller chunks using helper functions!
- One purpose of hexdumps is to examine binary files. You can try this with your hex program itself: ./hex <hex will show you lots of interesting bits that you may not expect.
- (We use “character” above when we maybe should say “byte” instead because these days encodings such as Unicode interpret “character” differently. Alas, in C, the char type is virtually identical to “byte” so it’s probably close enough to use “character” this way; at least in the context of a hexdump program.)
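One practical consequence of the character-versus-byte remark shows up with binary input: on platforms where plain char is signed, a byte such as 0xff becomes negative, which changes how it prints in hexadecimal and makes it unsafe to pass to the standard isprint function. The sketch below illustrates the difference; the helper names are hypothetical, and isprint from ctype.h is offered as one candidate for the “printable” check, not necessarily the one the starter code has in mind.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: how many hex digits are printed for a byte that was
   stored in a signed char. A negative value sign-extends when converted to
   unsigned int, so 0xff stored as -1 prints as more than two digits. */
int hex_digits_signed(signed char c)
{
    char buf[32];
    return sprintf(buf, "%02x", (unsigned int)c);
}

/* The same conversion for a byte stored in an unsigned char: always two
   digits, because the value stays in the range 0..255. */
int hex_digits_unsigned(unsigned char c)
{
    char buf[32];
    return sprintf(buf, "%02x", (unsigned int)c);
}

/* Hypothetical helper: the character to show in the text column. isprint
   expects a value representable as unsigned char (or EOF), so taking the
   byte as unsigned char keeps the call well defined. */
char display_char(unsigned char c)
{
    return isprint(c) ? (char)c : '.';
}
```

Storing input bytes in an unsigned char array sidesteps both issues, which is why the sketch takes that type throughout.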
Testing
As noted above, the output of your program should be exactly identical to the output of xxd -g 1. If you have a file foobar, then you can test your program with the commands
./hex < foobar > actual.txt
xxd -g 1 < foobar > expected.txt
diff expected.txt actual.txt
If the diff command doesn’t produce any output, then your program produced the same result as xxd -g 1. If the diff command does produce output, then think about why your program behaved the way it did, and how to fix it. We highly recommend that you test your program thoroughly on a variety of different input files.
Submitting
Submit the assignment to Gradescope. Specifically, upload a zipfile containing all files necessary to compile your submission as HW1. See the “Homework submission” link in the “General resources” section of the Piazza Resources page for more details.
If you base your code on the starter project posted on Piazza (hw1_skeleton.zip), then you should be able to run the command make solution.zip and then upload the resulting solution.zip.
Important packaging requirements: Your program code should be in a single C source file called hex.c. Your Makefile (which should be included in your submission) should build an executable called hex as its default target: i.e., running the make command in the directory with your extracted submission files should result in compiling your code and producing an executable called hex. If you base your solution on the starter code, then you shouldn’t need to do anything special to meet these requirements.
Grading
For reference, here is a short explanation of the grading criteria; some of the criteria don’t apply to all problems, and not all of the criteria are used on all assignments.
Packaging refers to the proper organization of the stuff you hand in, following both the packaging requirements under Submitting above as well as the general submission instructions for assignments on Piazza.
Style refers to C programming style, including things like consistent indentation, appropriate identifier names, useful comments, suitable documentation, etc. Simple, clean, readable code is what you should be aiming for. Please see the Coding style guidelines.
Design refers to proper modularization (functions, modules, etc.) and an appropriate choice of algorithms and data structures.
Performance refers to how fast/with how little memory your programs can produce the required results compared to other submissions.
Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous, ask for clarification! (It also refers to you simply doing the required work, which may not be programming alone.)
If your programs cannot be built you will get no points whatsoever. If your programs cannot be built without warnings using the required compiler options given on Piazza we will take off 10% (except if you document a very good reason). If your programs cannot be built using make we will take off 10%. If valgrind detects memory errors in your programs, we will take off 10%. If your programs fail miserably even once, i.e. terminate with an exception of any kind or dump core, we will take off 10% (for each such case).