CS 124
Fall 2023

Problem Set 6

due by 10 p.m. on Friday, September 22, 2023

suggested self-deadline of Wednesday, September 20, 2023

Preliminaries

In your work on this assignment, make sure to abide by the collaboration policies of the course.

If you have questions while working on this assignment, please post them on Piazza or come to TA help hours. Also see the FAQ.

Make sure to submit your work on Gradescope, following the procedures.

Important note regarding test cases and Gradescope:

  • You must test each function after you write it. Here are two ways to do so:

    • Run your file after you finish a given function. Doing so will bring you to the Shell, where you can call the function using different inputs and check to see that you obtain the correct outputs.
    • Add test calls to the bottom of your file, inside the if __name__ == '__main__' control structure. For example:

      if __name__ == '__main__':
      
          print("mystery(6,7) returned", mystery(6,7))
      

      These tests will be called every time that you run the file, which will save you from having to enter the tests yourself. We have given you an example of one such test in the starter file.

  • You must not leave any print statements or function calls in the global scope. This will cause an error with the Gradescope autograder. Make sure all of your print statements are inside a function scope or inside the if __name__ == '__main__' control structure.

Problem 1: Caesar cipher and decipher

50 points; pair-optional and group-of-3-optional

This problem asks you to write two functions using aspects of functional programming that we have covered in class: conditionals, recursion, and/or list comprehensions. The guidelines that we gave you for Problem Set 3, problem 4 also apply here, except you are allowed to use list comprehensions instead of recursion. In particular, make sure that you use the exact function names that we have specified.

Begin by downloading the file ps6pr1.py and opening it in VS Code. We have given you:

Here are the descriptions of the two functions:

  1. Write a function encipher(s, n) that takes as inputs an arbitrary string s and a non-negative integer n between 0 and 25, and that returns a new string in which the letters in s have been “rotated” by n characters forward in the alphabet, wrapping around as needed. For example:

    >>> encipher('hello', 1)
    'ifmmp'
    >>> encipher('hello', 2)
    'jgnnq'
    >>> encipher('hello', 4)
    'lipps'
    

    Upper-case letters should be “rotated” to upper-case letters, even if you need to wrap around. For example:

    >>> encipher('XYZ', 3)
    'ABC'
    

    Lower-case letters should be “rotated” to lower-case letters:

    >>> encipher('xyz', 3)
    'abc'
    

    Non-alphabetic characters should be left unchanged:

    >>> encipher('#caesar!', 2)
    '#ecguct!'
    

    Hints/reminders:

    • You can use the built-in functions ord and chr convert from single-character strings to integers and back:

      >>> ord('a')
      97
      >>> chr(97)
      'a'
      
    • You can use the following test to determine if a character is between 'a' and 'z' in the alphabet:

      if 'a' <= c <= 'z':
      

      A similar test will work for upper-case letters.

    • We recommend writing a helper function rot(c, n) that rotates a single character c forward by n spots in the alphabet. We have given you a template for this helper function in ps6pr1.py that checks to ensure that c is a single-character string. We wrote rot13(c) in the video; rot(c, n) will be very close to rot13(c)! You can test your rot(c, n) as follows:

      >>> rot('a', 1)
      'b'
      >>> print(rot('a', 1))
      b
      >>> rot('y', 2)
      'a'
      >>> rot('A', 3)
      'D'
      >>> rot('Y', 3)
      'B'
      >>> rot('!', 4)
      '!'
      
    • Once you have rot(c, n), you can write a recursive encipher function.

    Once you think you have everything working, here are three more examples to try:

    >>> encipher('xyza', 1)
    'yzab'
    >>> encipher('Z A', 2)
    'B C'
    >>> encipher('Caesar cipher? I prefer Caesar salad.', 25)
    'Bzdrzq bhogdq? H oqdedq Bzdrzq rzkzc.'
    
  2. Write a function decipher(s) that takes as input an arbitrary string s that has already been enciphered by having its characters “rotated” by some amount (possibly 0). decipher should return, to the best of its ability, the original English string, which will be some rotation (possibly 0) of the input string s. For example:

    >>> decipher('Bzdrzq bhogdq? H oqdedq Bzdrzq rzkzc.')
    'Caesar cipher? I prefer Caesar salad.'
    

    Note that decipher does not take a number specifying the amount of rotation! Rather, it should determine the rotation (out of all possible rotations) that produces the most plausible English string. We have given you a helper function called letter_prob(c) that takes a single-character string c and returns a value that is based on the frequency with which that character appears in English texts. That function provides the basis for a number of possible approaches for judging the “Englishness” of a given rotation, but you are welcome to try an alternative approach. Use a short comment above your decipher function to describe your overall approach.

    Your function does not need to be perfect. Some strings have more than one English “deciphering.” In addition, it’s difficult if not impossible to handle very short strings correctly. However, your function should work almost all of the time on long stretches of English text (e.g., sentences of 8 or more words). You will not lose any credit for not getting the correct deciphering of a single word or short phrase.

    Here are two more examples:

    >>> decipher('Hu lkbjhapvu pz doha ylthpuz hmaly dl mvynla lclyfaopun dl ohcl slhyulk.')
    'An education is what remains after we forget everything we have learned.'
    >>> decipher('python')
    'eniwdc'
    

    Note that our version of decipher gets the second example wrong! (When you provide English text as the input, you should ideally get back the text itself–i.e., a string that has been deciphered using a rotation of 0–but that didn’t happen in this case, which is completely okay!) It’s possible that your version of decipher will return a different value here. It might even return 'python', but it’s okay if it doesn’t.

    Hints/reminders:

    • Since deciphering involves rotating the characters, you already have the function needed to create each of the possible decipherings!

    • We recommend that you start by using a list comprehension to create a list of all possible decipherings. You will consider all possible shifts – from 0 to 25 – that could be used to create a deciphering. Assign the result of this list comprehension to a variable.

    • Next, you should use the “list-of-lists” technique that we discussed in the video, lecture, and lab to give each possible deciphering a score. The best_word function from the Scrabble example is a good example of this technique.

      As discussed above, the score that you assign to a given deciphering should provide some measure of its “Englishness.” You are welcome to write additional helper functions to assist you in the scoring and in the other steps involved.

    • After you have scored all of the options, you should choose the option with the highest score. Here again, the best_word function from the Scrabble example is a good model for this.

    • Once you think you have a working decipher, try it on the following string:

      'kwvozibctibqwva izm qv wzlmz nwz iv mfkmttmvb rwj'
      

Problem 2: More algorithm design!

50 points; pair-optional or group-of-three-optional

This problem asks you to write three additional functions using the functional-programming techniques that we have discussed in lecture.

In VS Code, use the File -> New Window menu option to open a new editor window for your code, and save it using the name ps6pr2.py. Make sure to specify the .py extension.

In writing the functions described below, you should continue to follow the guidelines from problem 2.

Here are the descriptions of the functions:

  1. Write a function index(elem, seq) that takes as inputs an element elem and a sequence seq, and that uses recursion to find and return the index of the first occurrence of elem in seq. The sequence seq can be either a list or a string. If seq is a string, elem will be a single-character string; if seq is a list, elem can be any value. Don’t forget that the index of the first element in a sequence is 0.

    Important notes:

    • If elem is not an element of seq, the function should return -1.

    • You may not use the in operator in this function.

    Here are some examples:

    >>> index(5, [4, 10, 5, 3, 7, 5])
    2
    >>> index('hi', ['well', 'hi', 'there'])
    1
    >>> index('b', 'banana')
    0
    >>> index('a', 'banana')
    1
    >>> index('i', 'team')
    -1
    >>> print(index('hi', ['hello', 120, True]))
    -1
    >>> print(index('a', ''))      # the empty string 
    -1
    >>> print(index(42, []))       # the empty list
    -1
    

    Hints:

    • You will need more than one base case for this function.

    • To figure out the logic of the function, we strongly encourage you to begin by considering concrete cases. Ask yourself the types of design questions that we have asked in lecture and lab, and use the answers to those questions to determine what the function should do.

    • When deciding what to do after the recursive call returns, you will need to base your decision on the result of the recursive call.

  2. Write a function jscore(s1, s2) that takes two strings s1 and s2 as inputs and returns the Jotto score of s1 compared with s2 – i.e., the number of characters in s1 that are shared by s2. The positions and the order of the shared characters within each string do not matter. Repeated letters are counted multiple times, as long as they appear multiple times in both strings. For example:

    >>> jscore('diner', 'syrup')            # just the 'r'
    1
    >>> jscore('always', 'bananas')         # two 'a's and one 's'
    3
    >>> jscore('always', 'walking')         # one 'a', 'l', and 'w'
    3
    >>> jscore('recursion', 'excursion')    # everything but the 'r' in s1 is shared by s2
    8
    >>> jscore('recursion', '')             # 0 because s2 is empty
    0
    

    Notes/hints:

    • Unlike the words in traditional Jotto, which always have five letters, there are no restrictions on the lengths of the input strings s1 and s2.

    • If either s1 or s2 is the empty string, the Jotto score is 0.

    • You can write the function any way you like as long as you use functional programming (loops are not allowed since they have not been covered yet).

    • We encourage you to consider using one or more helper functions. In particular, one possible approach involves using a function like the rem_first function that we mention in class. However, you would need to rework that function so that it operates on a string rather than a list.

    • This line turns out to be useful:

      if s1[0] in s2:
      
  3. Write a function lcs(s1, s2) that takes two strings s1 and s2 and returns the longest common subsequence (LCS) that they share. The LCS is a string whose letters appear in both s1 and s2; these letters must appear in the same order in both s1 and s2, but not necessarily consecutively. For example:

    >>> lcs('human', 'chimp')          # 'h' comes before 'm' in both strings
    'hm'
    >>> lcs('gattaca', 'tacgaacta')
    'gaaca'
    >>> lcs('wow', 'whew')
    'ww'
    >>> lcs('', 'whew')                 # first string is empty
    ''
    >>> lcs('abcdefgh', 'efghabcd')     # tie! 'efgh' would also be fine
    'abcd'
    

    Notes/hints:

    • If either s1 or s2 is the empty string, the LCS is also the empty string.

    • If there are ties for the LCS, any one of the ties is acceptable. In the last example above, a return value of 'efgh' would also have been fine, because both 'abcd' and 'efgh' are common subsequences of the two strings, and both of these subsequences have a length of 4.

    • Here’s one possible strategy for the recursive case (after you have checked for your base case(s)):

      • If the first characters in the two strings match, include that character in the LCS, and process the remaining characters of the two strings.

      • If the first characters don’t match, make two recursive calls: one that eliminates the first character in s1, and one that eliminates the first character in s2. Your code will look something like this:

        result1 = lcs(s1[1:], s2)
        result2 = lcs(___, ___)
        

        where you should fill in the blanks in the appropriate way.

      • Return the better of these two results.

      • Caution: You must avoid having your recursive case make three recursive calls. We will test your code with inputs of up to length 20, and three recursive calls will not run in time!


Submitting Your Work

You should use Gradesope to submit the following files:

Warnings

  • Make sure to use these exact file names, or Gradescope will not accept your files. If Gradescope reports that a file does not have the correct name, you should rename the file using the name listed in the assignment or on the Gradescope upload page.

  • If you make any last-minute changes to one of your Python files (e.g., adding additional comments), you should run the file in VS Code after you make the changes to ensure that it still runs correctly. Even seemingly minor changes can cause your code to become unrunnable.

  • If you submit an unrunnable file, Gradescope will accept your file, but it will not be able to auto-grade it. If time permits, you are strongly encouraged to fix your file and resubmit. Otherwise, your code will fail most if not all of our tests.

  • If your submission times out on Gradescope, first check the last hint for the lcs problem.