Stream programming exercise: Buzzword Bingo
This exercise is a formative assessment: it will not count towards the final mark. You must demonstrate your solution to one of the demonstrators during the lab session in order to get feedback.
Buzzword Bingo is a game in which two players compete to find text that is full of buzzwords. (Inspired by the inventor of the internet, not to mention Algoreithms, Al Gore.) You can choose your own list of buzzwords, or use the sample buzzword file provided. Note that a buzzword can contain spaces. Each line counts as one buzzword.
Each player supplies the URL of a web page of their choice. The program then connects to the URL and scans the web page for the buzzwords from the buzzword file. Every time the program finds an occurrence of one of those buzzwords, the player's score is incremented. At the end, the player with the higher score wins.
To make it even easier, here are some example steps which would be necessary for creating this program:
- Design your program to read the buzzword.txt file using I/O streams and store each term (either String or StringBuffer should do nicely for storage).
- Construct a regular expression for the buzzwords, compile it to a Pattern, and obtain a Matcher from it.
- Open a connection to a URL, for example "http://www.cs.bham.ac.uk/~hxt/2010/19343/", download the content of the web page using streams, and store it as well (again, String or StringBuffer should be sufficient). This is easily accomplished in Java. Please consult the Java API; here is an article detailing URL connections.
- Match repeatedly using find, and update the score accordingly.
- Repeat for the second player and display the scores on screen. The player with the higher score wins.
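The steps above could be sketched roughly as follows. This is only one possible outline under stated assumptions, not the model solution: the class name BuzzwordSketch and all method names are made up for illustration, matching is case-insensitive (the exercise does not say either way), the page is assumed to be UTF-8, and a StringBuilder is used where the text mentions StringBuffer. The sketch also assumes the buzzword list is not empty.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BuzzwordSketch {

    // Read one buzzword per line; blank lines are skipped.
    public static List<String> readBuzzwords(Reader source) throws IOException {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (!line.isEmpty()) {
                    words.add(line);
                }
            }
        }
        return words;
    }

    // Build one alternation of all buzzwords. Pattern.quote makes sure that
    // characters such as '.' or '+' in a buzzword are matched literally.
    // Assumption: the list is not empty and matching ignores case.
    public static Pattern buildPattern(List<String> words) {
        StringBuilder regex = new StringBuilder();
        for (String word : words) {
            if (regex.length() > 0) {
                regex.append('|');
            }
            regex.append(Pattern.quote(word));
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE);
    }

    // Count the matches by calling find() until it returns false.
    public static int score(Pattern pattern, CharSequence page) {
        Matcher matcher = pattern.matcher(page);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Download a page into a String. Assumption: the page is UTF-8 encoded.
    public static String fetch(String address) throws IOException {
        URL url = URI.create(address).toURL();
        try (Reader in = new InputStreamReader(url.openStream(), StandardCharsets.UTF_8)) {
            StringBuilder content = new StringBuilder();
            char[] buffer = new char[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                content.append(buffer, 0, n);
            }
            return content.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // Offline demo: in the real program the words would come from
        // new FileReader("buzzword.txt") and the page from fetch(url).
        List<String> words = readBuzzwords(new StringReader("synergy\ncloud computing\n"));
        Pattern pattern = buildPattern(words);
        String page = "Synergy in the cloud: cloud computing creates synergy.";
        System.out.println("Score: " + score(pattern, page)); // prints "Score: 3"
    }
}
```

For the full game, the same pattern would be applied to the text returned by fetch for each player's URL, and the two scores compared. Note that with an alternation the order of the buzzwords matters when one buzzword is a prefix of another.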
Please note that this is all that you are required to do. Below we supply some implementation ideas which aren't necessary for this exercise. They are there in case you finish the exercise early and wish to experiment further with streams.
Some optional implementation ideas
Once again, you are not required to follow these and we even urge you not to unless you are certain that you have fulfilled the minimum mandatory requirements for the exercise. If you did, and you would like to extend your knowledge on streams and regexps, the ideas below might inspire you.
- The model solution will contain a networked version so that the two players can play remotely. This is NOT required at all and is there just for demonstration purposes. More precisely, we are trying to show you the various types of streams Java has available. In Java, networking is simplified and completely platform-independent, and it is quite easy to create a server and client so that the game can be played across the network. The messages passed between the server and the client will be synchronization messages. For example, the server could inform the client what score it calculated for itself and the client could answer with its own score. There are further ideas here on which you could expand. For example, the client and the server could synchronize the chosen list of buzzwords (perhaps even merge them). This would ensure that neither the client nor the server is cheating by altering their buzzword.txt file before scanning the website for terms.
- You can optionally calculate buzzword density, that is, the score divided by the length of the web page. This makes the game fairer. The current specification of the exercise allows each player to choose an arbitrary website, so one of the players could raise their chances of winning by supplying a much longer web page than the other player. To avoid this problem, the scores would have to be made proportional to the length of the page by calculating a ratio. However, that would imply first using regexps to strip out the HTML from the page your program downloads.
- As suggested by Maxim Strygin, the mark of a good programmer consists in anticipating the possible uses of the program. Considering that the Internet is an international medium where you can find websites in all possible languages using all possible character sets, it would be wise to make your streams aware of the different encodings. Although this is not frequently mentioned, it is a detail which a professional programmer should not miss. See the following link detailing the problem.
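As a rough illustration of the last two ideas, the sketch below shows one way to pick a character set from an HTTP Content-Type header value and one way to compute a buzzword density. All names here (BuzzwordExtras, charsetFrom, stripTags, density) are hypothetical, the fallback to UTF-8 is an assumption, and the tag-stripping regex is deliberately crude: it does not handle scripts, comments or entities. Density is taken here to mean matches per character of stripped text, which is only one of several reasonable definitions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BuzzwordExtras {

    // Finds the charset parameter in a value such as "text/html; charset=ISO-8859-1".
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    // Choose the charset named in the Content-Type value.
    // Assumption: fall back to UTF-8 if none is given or the name is unknown.
    public static Charset charsetFrom(String contentType) {
        if (contentType != null) {
            Matcher matcher = CHARSET.matcher(contentType);
            if (matcher.find()) {
                try {
                    return Charset.forName(matcher.group(1));
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback below.
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Crude HTML removal: every <...> run is replaced by a single space.
    public static String stripTags(String html) {
        return html.replaceAll("<[^>]*>", " ");
    }

    // Matches per character of the tag-free text; 0 for an empty page.
    public static double density(int score, String html) {
        int length = stripTags(html).length();
        return length == 0 ? 0.0 : (double) score / length;
    }

    public static void main(String[] args) {
        System.out.println(charsetFrom("text/html; charset=ISO-8859-1"));
        System.out.println(density(3, "<i>abcd</i>"));
    }
}
```

In a real program the Content-Type value could be obtained from URLConnection.getContentType(), and the resulting Charset passed to the InputStreamReader that wraps the connection's input stream.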
Model solution
Simple model answer: BuzzBingoMultiple.java.