The exercise is to write the program DifferentWords, which prints out the set of different words in a given file, and their number. For example, if given the file containing
This is some sample text. Some text is sampled for this purpose, but this text is merely a sample.
then the output will be
a but for is merely purpose sample sampled some text this
Number of words: 11
Use the file sometext. There should be 2443 different words.
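One possible starting point is sketched below; it is not the only reasonable design. It assumes that a "word" is a maximal run of the letters a to z after lower-casing, and that the file name is passed as the first command-line argument. Neither assumption is stated in the exercise, and a different definition of a word may give a slightly different count for sometext. The helper name differentWords is made up for this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

class DifferentWords {
    // Returns the distinct words of the text in alphabetical order.
    // Assumption: a word is a maximal run of letters, compared case-insensitively.
    static Set<String> differentWords(String text) {
        Set<String> words = new TreeSet<>();   // a TreeSet drops duplicates and keeps its elements sorted
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {            // split can yield an empty first token
                words.add(token);
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the file to read is named on the command line.
        String text = new String(Files.readAllBytes(Paths.get(args[0])));
        Set<String> words = differentWords(text);
        System.out.println(String.join(" ", words));
        System.out.println("Number of words: " + words.size());
    }
}
```

A TreeSet is used because it gives both properties the output needs: each word appears once, and the words come out in alphabetical order.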
Write a class BarChart with methods
public void add(double value)
public void draw(Graphics g)
that displays a chart of the added values. To use it, create a new BarChart,
add some values, and call the draw method to draw it. You can assume that
all the values added are positive. Hint: you must figure out the maximum of
the values. Set a coordinate system so that the x-range equals the number of
bars and the y-range goes from 0 to the maximum.
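The hint can be sketched roughly as follows. This version assumes that Graphics means java.awt.Graphics and that the chart is drawn into a fixed area of WIDTH by HEIGHT pixels. Both constants and the helper scaledHeight are made up for this sketch; a graphics library that lets you set a coordinate system directly would make the scaling arithmetic unnecessary.

```java
import java.awt.Graphics;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class BarChart {
    // Assumed size of the drawing area in pixels; the exercise does not specify one.
    static final int WIDTH = 300;
    static final int HEIGHT = 200;

    private final List<Double> values = new ArrayList<>();

    public void add(double value) {
        values.add(value);
    }

    // Maps a value in the range 0..max onto a bar height in the range 0..HEIGHT pixels.
    static int scaledHeight(double value, double max) {
        return (int) Math.round(value / max * HEIGHT);
    }

    public void draw(Graphics g) {
        if (values.isEmpty()) {
            return;                                  // nothing to draw
        }
        double max = Collections.max(values);        // the y-range is 0..max
        double barWidth = (double) WIDTH / values.size(); // the x-range is split evenly among the bars
        for (int i = 0; i < values.size(); i++) {
            int left = (int) Math.round(i * barWidth);
            int right = (int) Math.round((i + 1) * barWidth);
            int height = scaledHeight(values.get(i), max);
            // Pixel y grows downwards, so a bar of this height starts at HEIGHT - height.
            // One pixel is left free on the right to separate neighbouring bars.
            g.fillRect(left, HEIGHT - height, right - left - 1, height);
        }
    }
}
```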
A problem with predictive text on mobiles is that there are clashes: for example,
the words "good" and "home" clash because they both have
the key signature 4663. Nokia has asked you to investigate this problem, and
your first task is to figure out the largest class of words all of which clash
together, starting from a given body of text. Roughly speaking, your program
has to read in each word of the body of text, compute its key signature, and
store it along with the other words it has encountered with the same key signature.
Then, when all the words have been read, it looks for the key signature with
the biggest set of words stored against it, and prints out that set.
Hint: think about how to organise the data. One way is to keep it as
a TreeSet of WordSigs, where a WordSig is a word and its signature. Your WordSigs
would be ordered by the signature field. Another, more efficient way would be
to use HashMaps, but you'd have to read beyond what was said in this lecture.
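As an illustration of the second suggestion, here is a minimal sketch using a HashMap from each key signature to the set of words that share it. It assumes the standard phone keypad layout (2 = abc, 3 = def, 4 = ghi, 5 = jkl, 6 = mno, 7 = pqrs, 8 = tuv, 9 = wxyz) and treats a word as a run of letters, ignoring case. The names KeyClashes, signature and largestClash are made up for this sketch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class KeyClashes {
    // Keypad digit for each letter a..z, assuming the standard phone layout.
    private static final String KEYS = "22233344455566677778889999";

    // Computes the key signature of a word, e.g. "good" gives "4663".
    static String signature(String word) {
        StringBuilder sig = new StringBuilder();
        for (char c : word.toLowerCase(Locale.ROOT).toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                sig.append(KEYS.charAt(c - 'a'));
            }
        }
        return sig.toString();
    }

    // Returns the largest set of distinct words in the text that share one signature.
    // If several signatures tie for the largest set, which one is returned is unspecified.
    static Set<String> largestClash(String text) {
        Map<String, Set<String>> bySignature = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            // Create the set for this signature on first use, then add the word to it.
            bySignature.computeIfAbsent(signature(token), k -> new TreeSet<>()).add(token);
        }
        Set<String> largest = new TreeSet<>();
        for (Set<String> words : bySignature.values()) {
            if (words.size() > largest.size()) {
                largest = words;
            }
        }
        return largest;
    }
}
```

For example, largestClash("good home gone hood cat act") would return the four words with signature 4663, since "cat" and "act" form a smaller class under 228.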
As in Exercise 1, we want to print out the words in a file, but this program should also print how often each word occurs, in ascending order of occurrence. For example, if given the input file
This is some sample text. Some text is sampled for this purpose, but this text is merely a sample.
then the output will be
[a=1, but=1, for=1, merely=1, purpose=1, sampled=1, sample=2, some=2, is=3, text=3, this=3]
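One way to get this ordering is sketched below, under the same assumption as before that a word is a lower-cased run of letters. The counts are collected in a TreeMap, so the entries start out in alphabetical order, and are then sorted by count with a stable sort, which leaves words with equal counts in alphabetical order. The names WordCounts and countsAscending are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class WordCounts {
    // Returns word=count entries in ascending order of count;
    // words with the same count stay in alphabetical order.
    static List<Map.Entry<String, Integer>> countsAscending(String text) {
        Map<String, Integer> counts = new TreeMap<>();   // keys kept in alphabetical order
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);    // start at 1, otherwise add 1
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so the alphabetical order survives within each count.
        entries.sort(Map.Entry.comparingByValue());
        return entries;
    }
}
```

Printing the returned list with System.out.println produces the bracketed word=count format of the sample output.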
© 2001 Mark Ryan and Alan Sexton