SICP Section 2.3.4

This section puts what we’ve learned about sets into practice by implementing procedures that encode and decode messages using Huffman code trees, as well as a procedure that generates the trees.

;;Exercise 2.67
> (decode sample-message sample-tree)
(A D A B B C A)
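For anyone who wants to reproduce this at the REPL, here is a sketch of the supporting definitions: the tree representation, decode, sample-tree and sample-message. They are transcribed from the book's section 2.3.4 and exercise 2.67 rather than written by me, so check them against your own copy.

```scheme
;; Leaves: (leaf symbol weight)
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))

;; Internal nodes: (left right symbols weight)
(define (make-code-tree left right)
  (list left
        right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))

;; 0 selects the left branch, 1 the right branch.
(define (choose-branch bit branch)
  (cond ((= bit 0) (left-branch branch))
        ((= bit 1) (right-branch branch))
        (else (error "bad bit - CHOOSE-BRANCH" bit))))

;; Follow bits down from the root; on reaching a leaf, emit its
;; symbol and start again from the root.
(define (decode bits tree)
  (define (decode-1 bits current-branch)
    (if (null? bits)
        '()
        (let ((next-branch (choose-branch (car bits) current-branch)))
          (if (leaf? next-branch)
              (cons (symbol-leaf next-branch)
                    (decode-1 (cdr bits) tree))
              (decode-1 (cdr bits) next-branch)))))
  (decode-1 bits tree))

(define sample-tree
  (make-code-tree (make-leaf 'A 4)
                  (make-code-tree
                   (make-leaf 'B 2)
                   (make-code-tree (make-leaf 'D 1)
                                   (make-leaf 'C 1)))))

(define sample-message '(0 1 1 0 0 1 0 1 0 1 1 1 0))
```

With these loaded, the bits split up as 0, 110, 0, 10, 10, 111, 0, which is where the seven symbols above come from.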

;;Exercise 2.68
The encode-symbol procedure I wrote is straightforward, but it takes a rather large number of steps to encode a symbol, because at each node we have to search a set of symbols to find the correct branch to take. I’m not sure if there’s a better way to do this…

(define (encode-symbol symbol tree)
  (cond ((leaf? tree) '())
        ((element-of-set? symbol (symbols (left-branch tree)))
         (cons 0 (encode-symbol symbol (left-branch tree))))
        ((element-of-set? symbol (symbols (right-branch tree)))
         (cons 1 (encode-symbol symbol (right-branch tree))))
        (else (error "symbol not in tree - ENCODE-SYMBOL" symbol))))
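One possible way to cut down on the repeated searching (a sketch, not something from the book) is to walk the tree once and record the bits for every symbol in an association list, so that encoding a symbol becomes a single lookup. code-table and encode-with-table are hypothetical names; the tree representation at the top is restated from the book only so that the block runs on its own.

```scheme
;; Tree representation, restated from the book's section 2.3.4.
(define (make-leaf symbol weight) (list 'leaf symbol weight))
(define (leaf? object) (eq? (car object) 'leaf))
(define (symbol-leaf x) (cadr x))
(define (weight-leaf x) (caddr x))
(define (make-code-tree left right)
  (list left
        right
        (append (symbols left) (symbols right))
        (+ (weight left) (weight right))))
(define (left-branch tree) (car tree))
(define (right-branch tree) (cadr tree))
(define (symbols tree)
  (if (leaf? tree) (list (symbol-leaf tree)) (caddr tree)))
(define (weight tree)
  (if (leaf? tree) (weight-leaf tree) (cadddr tree)))

;; Hypothetical helper: visit every leaf once and return an
;; association list whose entries are (symbol . bits).
(define (code-table tree)
  (define (walk node path)          ; path = bits so far, in reverse
    (if (leaf? node)
        (list (cons (symbol-leaf node) (reverse path)))
        (append (walk (left-branch node) (cons 0 path))
                (walk (right-branch node) (cons 1 path)))))
  (walk tree '()))

;; Hypothetical helper: encode a whole message using the table.
(define (encode-with-table message table)
  (if (null? message)
      '()
      (let ((entry (assv (car message) table)))
        (if entry
            (append (cdr entry)
                    (encode-with-table (cdr message) table))
            (error "symbol not in table - ENCODE-WITH-TABLE"
                   (car message))))))

;; The sample tree from exercise 2.67.
(define sample-tree
  (make-code-tree (make-leaf 'A 4)
                  (make-code-tree
                   (make-leaf 'B 2)
                   (make-code-tree (make-leaf 'D 1)
                                   (make-leaf 'C 1)))))

(define sample-table (code-table sample-tree))
;; sample-table is ((A 0) (B 1 0) (D 1 1 0) (C 1 1 1))
```

The lookup with assv is still linear in the number of symbols, but it happens once per symbol rather than once per level of the tree.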

;;Exercise 2.69
Successive-merge was actually fairly straightforward to code. However, when I first attempted it, I did not fully understand the tree generation procedure itself! Perhaps this is what the authors were warning about when they said it was tricky. I was certainly tricked initially :D

(define (successive-merge leaf-set)
  (if (= (length leaf-set) 1)
      (car leaf-set)
      (successive-merge
       (adjoin-set (make-code-tree (car leaf-set)
                                   (cadr leaf-set))
                   (cddr leaf-set)))))
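To see the generation procedure on its own, here is a small sketch that tracks only the weights and leaves the trees out. insert-weight and merge-trace are hypothetical names; insert-weight stands in for the book's adjoin-set, and the starting weights are assumed to be those of the sample tree from exercise 2.67 (D 1, C 1, B 2, A 4).

```scheme
;; Hypothetical stand-in for adjoin-set: insert w into a list of
;; weights that is kept in ascending order.
(define (insert-weight w weights)
  (cond ((null? weights) (list w))
        ((< w (car weights)) (cons w weights))
        (else (cons (car weights)
                    (insert-weight w (cdr weights))))))

;; Hypothetical helper: return the ordered weight list as it looks
;; before each merge, ending with the single root weight.
(define (merge-trace weights)
  (if (null? (cdr weights))
      (list weights)
      (cons weights
            (merge-trace
             (insert-weight (+ (car weights) (cadr weights))
                            (cddr weights))))))

(merge-trace '(1 1 2 4))
;; => ((1 1 2 4) (2 2 4) (4 4) (8))
```

At every step the two smallest weights sit at the front of the list, which is why successive-merge only ever needs car and cadr.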

;;Exercise 2.70
I never knew 50s rock was so repetitive ;)
Probably better than the junk that gets put out these days though…

(define freq-pairs (list (list 'A 2) (list 'BOOM 1) (list 'GET 2) (list 'JOB 2) (list 'NA 16) (list 'SHA 3) (list 'YIP 9) (list 'WAH 1)))

(define message '(GET A JOB
                  SHA NA NA NA NA NA NA NA NA
                  GET A JOB
                  SHA NA NA NA NA NA NA NA NA
                  WAH YIP YIP YIP YIP YIP YIP YIP YIP YIP
                  SHA BOOM))

(define rock50s-tree (generate-huffman-tree freq-pairs))

(define encoded-message (encode message rock50s-tree))
(define huffman-length (length encoded-message))

(define logbase2-of-8 3)
(define fixed-length (* (length message) logbase2-of-8))

fixed-length turns out to be 108 bits (36 symbols at 3 bits each), whereas huffman-length is only 84 bits. That’s a saving of 24 bits, or about 22%.

;;Exercise 2.71
I’m not really a fan of drawing large trees, so I’ll just give the code to generate it.

(define freq-pairs-5 (list (list '1 1) (list '2 2) (list '3 4) (list '4 8) (list '5 16)))
(define freq-pairs-tree-5 (generate-huffman-tree freq-pairs-5))

(define freq-pairs-10 (list (list '1 1) (list '2 2) (list '3 4) (list '4 8) (list '5 16) (list '6 32) (list '7 64) (list '8 128) (list '9 256) (list '10 512)))
(define freq-pairs-tree-10 (generate-huffman-tree freq-pairs-10))

The number of bits required per symbol for n=5:
Most frequent symbol: 1 bit
Least frequent symbol: 4 bits
For n=10:
Most frequent symbol: 1 bit
Least frequent symbol: 9 bits

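In general, the most frequent symbol always gets 1 bit and the least frequent gets (n-1) bits. The reason is that 1 + 2 + … + 2^(k-1) = 2^k - 1, which is less than the next weight 2^k, so every merge joins the tree built so far with the next single leaf and the tree degenerates into a chain.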
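The code lengths can also be checked without drawing anything. The sketch below repeats the merge procedure but, instead of building trees, keeps a depth counter for each symbol. Every name in it (make-node, deepen, merge-nodes, insert-node, code-lengths, power-of-two-leaves) is hypothetical, and the node layout is an assumption of the sketch, not the book's representation.

```scheme
;; A node is (weight . entries); each entry is (symbol . depth).
(define (make-node weight entries) (cons weight entries))
(define (node-weight node) (car node))
(define (node-entries node) (cdr node))

;; Merging pushes every symbol in both nodes one level deeper.
(define (deepen entries)
  (map (lambda (e) (cons (car e) (+ 1 (cdr e)))) entries))

(define (merge-nodes a b)
  (make-node (+ (node-weight a) (node-weight b))
             (deepen (append (node-entries a) (node-entries b)))))

;; Keep the set in ascending order of weight, as adjoin-set does.
(define (insert-node x set)
  (cond ((null? set) (list x))
        ((< (node-weight x) (node-weight (car set))) (cons x set))
        (else (cons (car set) (insert-node x (cdr set))))))

;; Merge the two lightest nodes until one remains, then return its
;; (symbol . depth) entries. The depth is the code length in bits.
(define (code-lengths set)
  (if (null? (cdr set))
      (node-entries (car set))
      (code-lengths
       (insert-node (merge-nodes (car set) (cadr set))
                    (cddr set)))))

;; Leaves for symbols 1..n with weights 1, 2, 4, ..., 2^(n-1),
;; already in ascending order.
(define (power-of-two-leaves n)
  (define (loop i w)
    (if (> i n)
        '()
        (cons (make-node w (list (cons i 0)))
              (loop (+ i 1) (* 2 w)))))
  (loop 1 1))

(code-lengths (power-of-two-leaves 5))
;; => ((1 . 4) (2 . 4) (3 . 3) (4 . 2) (5 . 1))
```

Symbol 5, the most frequent, ends up at depth 1, and the two least frequent symbols share depth 4. Calling it with 10 instead of 5 gives depth 9 for symbol 1 and depth 1 for symbol 10.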

;;Exercise 2.72
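As we descend the tree, at each node we have to search a list of symbols, which is O(n). For the most frequent symbol in the trees from exercise 2.71 we only descend one level, so encoding it takes O(n) steps. For the least frequent symbol we descend n-1 levels, and n-1 searches of at most n symbols each gives an upper bound of O(n^2).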
