Unit 5. Data Manipulation
Revision Date: Jun 11, 2020 (Version 3.0)
Summary: Students are introduced to the theory of computation, computability, the halting problem, and advanced algorithms. In particular, they will learn about heuristic search used by artificial intelligence (AI) programs to play games.
Objective:
Students will be able to:
Overview:
Session 1
Session 2
Student computer usage for this lesson is: required
Links to videos and online tools as indicated in the lesson plan.
Alternative instruction could present the Towers of Hanoi problem and discuss the algorithm for solving it. Some demonstrations are available here:
Think-Pair-Share: In pairs, think about and try to answer each of the following questions:
Note: Give students only a few minutes to attempt the factoring, then bring them back together to discuss: which operations were much harder to perform than their inverses? Can you simply invert the steps? Why or why not?
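To make the asymmetry concrete for the discussion, here is a minimal Python sketch. It assumes factoring by simple trial division, and the function name trial_division is hypothetical (it does not come from the lesson materials). Multiplying two primes is a single operation, while recovering them requires many trial divisions.

```python
def trial_division(n):
    """Factor n by trying divisors 2, 3, 4, ... in order.

    Returns (smallest factor, cofactor, number of divisions tried).
    Hypothetical helper for illustration only.
    """
    steps = 0
    d = 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            return d, n // d, steps
        d += 1
    return n, 1, steps  # no divisor found: n itself is prime


# Forward direction: one multiplication.
p, q = 101, 103
product = p * q

# Inverse direction: many divisions before the factor is found.
factor, cofactor, steps = trial_division(product)
print(product, "=", factor, "x", cofactor, "after", steps, "trial divisions")
```

Students can try larger primes and watch the number of trial divisions grow while the multiplication stays a single step.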
Make a connection to the previous lesson by comparing these to sorting algorithms, where some, like Merge sort and Quick sort, are fast and efficient, and others, like Bubble sort, become impractically slow on large inputs. Highlight that different problems have different lower bounds on how efficient any solution can be, and that some problems, like integer factorization, have solutions that take too long to be practical for large inputs.
Discuss the definition of computation (in a theoretical sense) with your students. Computation is input plus processing to get output. A computer is one system that is a "model of computation" since it takes input, processes it, and produces output.
Another model of computation is called a Turing machine, named after Alan Turing (one of the most famous computer scientists). A Turing machine is a theoretical entity that has a tape of symbols (a line of 0s and 1s), a head that can read only one symbol at a time, and an internal state that can change based on instructions as the head reads symbols. Turing and the mathematician Alonzo Church are responsible for the "Church-Turing" thesis, which says that a Turing machine can compute anything that a digital computer can. This is a fundamental idea of the theory of computation, and it implies that anything one computer can compute can also be computed by another, given enough resources (time and memory).
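For classes that want to see a Turing machine in action, the following is a minimal Python sketch of a simulator, not a formal definition. The names run_turing_machine and FLIP_RULES, the "_" blank symbol, and the step limit are all assumptions made for illustration. The example machine flips every bit on the tape and halts when it reaches a blank.

```python
def run_turing_machine(tape, rules, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol read) -> (symbol to write, "L" or "R", next state).
    The machine stops when it enters the state "halt" or after max_steps steps.
    """
    cells = dict(enumerate(tape))  # tape position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)             # read one symbol
        write, move, state = rules[(state, symbol)]  # look up the instruction
        cells[head] = write                          # write one symbol
        head += 1 if move == "R" else -1             # move the head
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical example machine: flip each bit, halt at the first blank.
FLIP_RULES = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine("1011", FLIP_RULES))
```

The whole machine is just a table of rules, which is a useful talking point: the "program" is data that the simulator reads.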
Parallel computing is a computational model where the program is broken into multiple smaller sequential computing operations, some of which are performed simultaneously. Distributed computing is a computational model in which multiple devices are used to run a program. Comparing efficiency of solutions can be done by comparing the time it takes them to perform the same task.
Now discuss the idea of computability with your students. Ask your students to answer or think-pair-share: are there things it is impossible for a computer to compute? A decision problem that can be solved by an algorithm that halts on all inputs in a finite number of steps is decidable. The classic "undecidable" (non-computable) question is called the Halting Problem. The Halting Problem is: write a program that can tell whether any other program will halt (terminate at some point eventually) or will loop forever and never end. This can be done for specific algorithms (instances of the Halting Problem) but not for the general problem regarding all possible algorithms.
The Halting Problem is impossible for a computer to solve, which you can prove (informally) by contradiction, using a paradox. Suppose you did have a program that solved the Halting Problem, called HALT(X), which takes the code for some program X as input and says "yes" if X terminates or "no" if X loops forever. Then you could write a new program that uses HALT inside it, which we will call PARADOX(X). First, PARADOX(X) runs HALT(X); if the result is "no," PARADOX halts, but if the result is "yes," PARADOX loops forever. But here is the problem: what if we use the code for PARADOX as the input to PARADOX, running PARADOX(PARADOX)? If HALT says that PARADOX halts, then PARADOX runs forever, and if HALT says that PARADOX runs forever, then PARADOX halts. Either way, HALT gives the wrong answer. This contradiction means that the premise, that a program called HALT could exist, must be wrong. Therefore, the Halting Problem is impossible for a computer to solve.
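The argument above can be walked through with a small Python model. This is only a sketch of the reasoning: a real HALT cannot be written, so the code simply takes HALT's supposed answer ("yes" or "no") as input and reports what PARADOX would then do. The function names are hypothetical.

```python
def paradox_outcome(halt_answer):
    """What PARADOX(PARADOX) does, given HALT's answer about it."""
    if halt_answer == "yes":    # HALT claims PARADOX halts...
        return "loops forever"  # ...so PARADOX deliberately loops.
    return "halts"              # HALT claims PARADOX loops, so it stops.


def halt_was_correct(halt_answer):
    """Compare HALT's prediction with PARADOX's actual behavior."""
    predicted = "halts" if halt_answer == "yes" else "loops forever"
    return predicted == paradox_outcome(halt_answer)


for answer in ("yes", "no"):
    print("HALT says", answer, "-> PARADOX", paradox_outcome(answer),
          "-> HALT correct?", halt_was_correct(answer))
```

Whichever answer HALT gives, the comparison comes out false, which is the contradiction students should be able to explain in their own words.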
Video explanation with optional student simulation
Think-Pair-Share:
This session concerns advanced algorithms, in particular heuristic search, which is commonly used in artificial intelligence. Refresh your students' minds on the definitions of computation, computability, and undecidable problems. Additionally, mention the properties we consider when we compare algorithms:
Introduce the idea of heuristic search, which is a class of algorithms used in many artificial intelligence programs. A heuristic is something that is used to find a good solution in a reasonable time, and a heuristic search algorithm is an algorithm that uses heuristics to determine how to search through some space.
A great way to introduce heuristic search is first to discuss game trees. A game tree is a structure that is used to represent the "space" of a game that an algorithm wants to search through.
Think of a game like chess: you make a move, the opponent makes a move, and the process continues until the ending conditions have been met (one player in checkmate or stalemate). A game tree is a mathematical structure used by AI and heuristic search algorithms to model the moves made in a chess game. At any turn, we can make a "tree" by drawing the root node as representing the current state of the board and drawing one branch under it for every possible move. In Tic-Tac-Toe, if you are the starting player, then the root node represents a blank board, and there will be nine branches, one for each possible move (each space where you could place your mark). Following a branch in the game tree takes you to a new node that represents the configuration of the game that results from having taken that move. In Tic-Tac-Toe, if I am the first player and place my X in the center space, I have "followed" that branch down the tree to a new node that represents the board with an X in the center space. The opponent then uses this node as the root of their game tree, and has a branch for each of their possible moves.
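The Tic-Tac-Toe tree described above can be generated with a few lines of Python. This is a minimal sketch under stated assumptions: the board is represented as a 9-character string, and the names winner, children, and count_games are hypothetical choices for illustration.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def children(board, player):
    """One child node per legal move; no children once the game is over."""
    if winner(board) is not None:
        return []
    return [board[:i] + player + board[i + 1:]
            for i, cell in enumerate(board) if cell == " "]


def count_games(board=" " * 9, player="X"):
    """Count the complete games (leaf nodes) below this node."""
    kids = children(board, player)
    if not kids:
        return 1
    other = "O" if player == "X" else "X"
    return sum(count_games(kid, other) for kid in kids)


root = " " * 9
print(len(children(root, "X")))         # branches from the blank board
print(len(children("    X    ", "O")))  # replies after X takes the center
```

Calling count_games() visits the entire tree and finds 255,168 complete games, which helps motivate why drawing every branch by hand is impractical even for Tic-Tac-Toe.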
Think-Pair-Share: Have your students pair off, play a game of Tic-Tac-Toe, and try to draw the game tree as they play, drawing a node for each move they made and every potential branch from those nodes. Bring them back into discussion and ask: what if they had to draw every node down every branch? Now ask them to imagine the game tree for chess, which has 20 possible moves on the first turn, 400 possible positions after each player has moved once, and many, many more as the game goes on. How can an artificially intelligent program learn to play chess when there are so many (too many) options? Chess has around 35^100 nodes in its game tree and roughly 10^40 legal states.
Heuristic search on game trees is one way AI programs are able to play games like chess. How good are computer game players?
Typically, games modelled with game trees are 2-person games in which players alternate moves, and they are zero-sum (meaning one player's loss is the other's gain). More complicated games may include hidden information (like other players' hands), chance (dice), or multiple players.
How does an AI program use heuristic search to play a game? Typically in these steps:
The key problems are:
For evaluation, an evaluation function is typically hand-coded or learned over time.
Refer to the "Advanced Algorithms" slides in the lesson resources folder for examples of uninformed search. For an activity, you may want to create a game tree for Tic-Tac-Toe and have your students walk through how each of the following algorithms would operate over it.
Uninformed Search algorithms work without a heuristic, using no information about the likely "direction" of the goal node. Algorithms include:
For any games with variety and complexity, certainly for chess and even checkers, uninformed search is simply too slow because it is exhaustive. This problem is another example, like with sorting, where the efficiency of our solution matters a great deal. To get programs to play games, we need them to be efficient and intelligent about the number and quality of moves they consider.
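As one concrete illustration of uninformed search, here is a minimal Python sketch of breadth-first search, a standard algorithm of this kind (whether the slides use this exact algorithm is an assumption). The function name, the small example graph, and its node labels are hypothetical. The search explores level by level and uses no information about where the goal is.

```python
from collections import deque


def breadth_first_search(start, is_goal, neighbors):
    """Explore states level by level; return the first path to a goal, or None."""
    frontier = deque([[start]])  # queue of paths waiting to be extended
    visited = {start}
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if is_goal(state):
            return path
        for nxt in neighbors(state):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None


# Hypothetical example graph: each node lists the nodes reachable in one move.
GRAPH = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": []}

print(breadth_first_search("A", lambda s: s == "F", lambda s: GRAPH[s]))
```

Because every node on a level is examined before moving deeper, the work grows with the size of the whole tree, which is why this approach breaks down for games like chess.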
Informed Search algorithms each follow some heuristic that uses information about the game to determine smart directions to explore. Examples include:
Advanced classes may wish to discuss local search algorithms, such as hill-climbing and genetic algorithms (in the "Advanced Algorithms" slides in the lesson folder).
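To contrast informed search with exhaustive search, the following minimal Python sketch uses greedy best-first search, one common informed algorithm (its use here is an assumption, not necessarily what the slides present). The 5x5 grid, the Manhattan-distance heuristic, and all function names are hypothetical choices for illustration.

```python
import heapq


def greedy_best_first(start, goal, neighbors, heuristic):
    """Always expand the state the heuristic rates closest to the goal.

    Returns (path, number of states expanded).
    """
    frontier = [(heuristic(start), start, [start])]  # priority queue by heuristic
    visited = {start}
    expanded = 0
    while frontier:
        _, state, path = heapq.heappop(frontier)
        expanded += 1
        if state == goal:
            return path, expanded
        for nxt in neighbors(state):
            if nxt not in visited:
                visited.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), nxt, path + [nxt]))
    return None, expanded


SIZE = 5          # hypothetical 5x5 open grid
GOAL = (4, 4)


def grid_neighbors(cell):
    """Cells one step up, down, left, or right that stay on the grid."""
    r, c = cell
    steps = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    return [(a, b) for a, b in steps if 0 <= a < SIZE and 0 <= b < SIZE]


def manhattan(cell):
    """Heuristic: number of grid steps to the goal, ignoring obstacles."""
    return abs(GOAL[0] - cell[0]) + abs(GOAL[1] - cell[1])


path, expanded = greedy_best_first((0, 0), GOAL, grid_neighbors, manhattan)
print(len(path), "cells on the path;", expanded, "of", SIZE * SIZE, "cells expanded")
```

On this open grid the heuristic steers the search straight toward the goal, so only a fraction of the cells are expanded. Greedy best-first search does not guarantee the shortest path in general, which is worth raising with students.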
Thinking about game trees again, we want to select the branch that takes us to a node with the maximum evaluated state. But there is a catch: the opponent gets to make moves, too. That is, every other branch in our game tree is the opponent's turn. How does the AI program account for the other player?
The most natural approach is for the AI program to assume the other player will play optimally. Just as the AI will take the branch that leads to the state with the greatest evaluation, it assumes the other player takes the branch leading to the state that is best for them, which in a zero-sum game is the state with the lowest evaluation for the AI. In other words, the AI searches through its game tree by following the branch with the maximum value on its own turn and the branch with the minimum value on the opponent's turn. This algorithm is called minimax and is the basis of many AI programs that play 2-person zero-sum games.
Questions in the AP Classroom Question Bank may be used for summative purposes.
Sixty of the 80 questions are restricted to teacher access. The remaining 20 questions are from public resources.
Questions are identified by their initial phrases.
A certain computer game is played between a human
Under which of the following conditions is it m...
Which of the following programs is most likely to
Which of the following statements is true?
For the Halting Problem proof, it is important that students can translate the solution that is on the video into a representation that makes (some) sense to them. Acting out the inputs and outputs of the set of machines is an approach worth trying.
The following "Checks for Understanding" could be used to guide the students towards the three learning objectives.
Objective: Students will identify some Advanced Algorithms that Exploit Inverse Operations Efficiency.
Objective: Students will identify some Advanced Algorithmic Techniques.
Objective: Students will be able to discuss at least one example of a computing problem that is unsolvable.
Students will be able to summarize -- in their own words or with simple models -- the proof of the Halting Problem.
Students will be able to identify the sensitivity of cryptography to the difficulty of factoring large numbers.