CSC 121, Spring 2017, Large Assignment #2, Part 2 script.

We'll see how well we can predict words in Jane Austen's “Pride and Prejudice”.

> source("lga2-defs2.R")
> 
> text <- 
+   scan("http://www.cs.utoronto.ca/~radford/csc121/pride-and-prejudice.txt","")

Separate text into first and second halves.

> text_1st_half <- text[1:(length(text)/2)]
> text_2nd_half <- text[(length(text)/2+1):length(text)]
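
The split above gives two complete halves only when length(text) is even. With an odd length, length(text)/2 is fractional, and because R truncates fractional indices, the last word would be left out of the second half. A minimal sketch of a split that uses integer division instead, on a toy vector; the names `words`, `n_half`, `first`, and `second` are hypothetical and not part of the assignment code:

```r
# Toy stand-in for the scanned text (hypothetical data, odd length on purpose).
words <- c("it", "is", "a", "truth", "universally", "acknowledged", "that")

# Integer division gives a whole-number split point even for odd lengths.
n_half <- length(words) %/% 2

first  <- words[seq_len(n_half)]
second <- words[(n_half + 1):length(words)]

# Every word lands in exactly one half.
stopifnot(length(first) + length(second) == length(words))
```

With an odd length the extra word goes to the second half; which half receives it is a design choice.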

Predict with the first method, showing the fraction of guesses that are correct.

> guesses_with_method_1 <- predict_method_1 (text_1st_half, text_2nd_half)
> mean (guesses_with_method_1 == text_2nd_half)

[1] 0.03397
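
The fraction correct is the mean of a logical vector: the comparison yields TRUE or FALSE for each position, and mean() counts TRUE as 1 and FALSE as 0. A small sketch with made-up vectors (`guesses` and `actual` are hypothetical, not the assignment's data):

```r
# Hypothetical guesses and the words that actually occurred.
guesses <- c("the", "of", "the", "to", "the")
actual  <- c("the", "to", "a",  "to", "and")

# Element-wise comparison: TRUE where the guess matches the actual word.
correct <- guesses == actual

# Mean of a logical vector is the proportion of TRUE values (here 2 of 5).
frac_correct <- mean(correct)
stopifnot(isTRUE(all.equal(frac_correct, 0.4)))
```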

Predict with the second method, showing the fraction of guesses that are correct. Also, show how much time this method took.

> system.time (guesses_with_method_2 <- predict_method_2 (text_1st_half, text_2nd_half))

   user  system elapsed 
 69.133   5.214  74.379
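
system.time() evaluates its argument and reports user CPU, system CPU, and elapsed (wall-clock) time in seconds. An assignment written inside the call still takes effect in the calling environment, which is why the guesses are available afterwards. A minimal sketch, where `slow_sum` is a hypothetical stand-in for an expensive function:

```r
# Hypothetical stand-in for an expensive computation.
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# Time the call; the assignment to `result` happens in the calling environment.
timing <- system.time(result <- slow_sum(100000))

# The assigned value is visible after system.time() returns.
stopifnot(result == 100000 * 100001 / 2)

# Individual components can be extracted by name.
elapsed_seconds <- timing[["elapsed"]]
```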

> mean (guesses_with_method_2 == text_2nd_half)

[1] 0.1152

Predict with the improved second method, showing how much time this method took. Also, verify that it produces the same answer as the original version.

> system.time (guesses_with_method_2b <- predict_method_2b (text_1st_half, text_2nd_half))

   user  system elapsed 
 13.645   0.618  14.271

> all (guesses_with_method_2 == guesses_with_method_2b)

[1] TRUE
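
From the elapsed times reported above, the improved version is roughly 5.2 times faster (74.379 / 14.271). Note that all(a == b) relies on the two vectors having the same length and no NA values; identical() is a stricter check. A small sketch of the difference, using hypothetical toy vectors `a` and `b`:

```r
# Speedup computed from the elapsed times reported in this script.
speedup <- 74.379 / 14.271

# Hypothetical vectors of different lengths.
a <- c("a", "b", "a", "b")
b <- c("a", "b")

# `==` recycles the shorter vector, so this comparison reports TRUE.
recycled_equal <- all(a == b)

# identical() compares the objects as a whole and reports FALSE.
strictly_equal <- identical(a, b)

stopifnot(recycled_equal, !strictly_equal)
```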