Information Studies 277 -- Information Retrieval Systems: User-Centered Designs

Spring 2005

Course Web page:
http://polaris.gseis.ucla.edu/pagre/is277.html

Assignment for Weeks 3, 4, 5, 6, 7, 8, and 10.

For all but the first couple of class meetings, we will spend the first half of the class reading documents that have been marked up with particular languages: XML (Week 3), XML document type definitions and XML Schema (Week 4), XHTML (Week 5), RDF (Week 6), RDF Schema (Week 7), an application of RDF called RSS (Week 8), and OWL (Week 10). To prepare for each of these classes, please find a Web page that uses the markup language of the week. Thus, for example, in Week 3 you will bring an XML page to class, in Week 4 you will bring an XML document type definition or XML Schema page, and so on.

RSS pages are extremely easy to find, so for Week 8, learn to use an RSS reader (also easy to find) and bring in an RSS page that relates in some way to your term project. Because so few marked-up educational resources are available on the public Web, there is no assignment for Week 9. Note that we will not spend any class time reading OWL-S documents except during the lecture in Week 10.

Because this is a paperless class, you will "hand in" your document by linking to it from a Web page that you create for the purpose. Here is an example of what such a Web page might look like:

http://polaris.gseis.ucla.edu/pagre/is277-example.html

Try not to bring in documents that use markup languages from later in the class. Thus, for example, all RDF documents are also XML documents, but you should not bring in an RDF document during Week 3. Likewise, all OWL documents are RDF documents, but you should not bring in OWL documents before Week 10. You can readily determine whether a given document is an RDF or OWL document because it will almost certainly mention RDF or OWL namespaces near the very beginning.
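The namespace check described above can be approximated in a few lines of Python. This is only a sketch: the function name detect_vocabularies and the 2000-character cutoff standing in for "near the very beginning" are assumptions made for illustration. The two namespace URIs are the standard W3C ones for RDF and OWL.

```python
# Standard W3C namespace URIs for RDF and OWL.
NAMESPACES = {
    "RDF": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "OWL": "http://www.w3.org/2002/07/owl#",
}


def detect_vocabularies(text, head_chars=2000):
    """Return the names of the vocabularies whose namespace URI appears
    in the first head_chars characters of the document text.

    head_chars is an arbitrary cutoff chosen for this sketch.
    """
    head = text[:head_chars]
    return [name for name, uri in NAMESPACES.items() if uri in head]


# A tiny made-up document that declares both namespaces.
sample = '''<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Ontology rdf:about=""/>
</rdf:RDF>'''

print(detect_vocabularies(sample))  # ['RDF', 'OWL']
```

A document that reports an empty list is probably safe for Week 3. One that reports RDF but not OWL is a candidate for the RDF weeks, subject to the caution about RSS in the next paragraph.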

Also, the vast majority of RDF documents on the Web are RSS documents. We will be reading RSS documents in Week 8, so please do not bring RSS documents for Week 6 or Week 7. Again, you can readily determine whether a given document is an RSS document because it will almost certainly mention an RSS namespace near the very beginning.
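The RSS check can be sketched the same way. The example below assumes the document is an RSS 1.0 feed, which is the RDF-based version of RSS and uses the namespace http://purl.org/rss/1.0/. The function name is_rss_1_0 is made up for illustration. Feeds in other RSS versions that use a plain <rss> root element are not RDF documents and are not covered by this sketch.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS_NS = "http://purl.org/rss/1.0/"  # RSS 1.0 namespace


def is_rss_1_0(xml_text):
    """Return True if the document has an rdf:RDF root element and
    contains at least one element from the RSS 1.0 namespace."""
    root = ET.fromstring(xml_text)
    if root.tag != "{%s}RDF" % RDF_NS:
        return False
    # ElementTree spells qualified names as "{namespace}localname".
    return any(el.tag.startswith("{%s}" % RSS_NS) for el in root.iter())


# Made-up RSS 1.0 feed: RSS is the default namespace.
feed = '''<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/">
  <channel rdf:about="http://example.org/news">
    <title>Example feed</title>
  </channel>
</rdf:RDF>'''

# Made-up RDF document that is not RSS.
plain = '''<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:title>Example</dc:title>
  </rdf:Description>
</rdf:RDF>'''

print(is_rss_1_0(feed))   # True
print(is_rss_1_0(plain))  # False
```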

You are very welcome to bring documents each week that pertain to the domain about which you are writing your term paper. Thus, for example, if your term paper is about classics, then you should certainly locate all of the important XML and Semantic Web research projects in the humanities generally and classics in particular. The Web sites of those projects, if they are any good, should include marked-up Semantic Web documents pertaining to that domain. Alternatively, their research papers might include some markup. (It is better to get a naturally occurring marked-up document than a research paper that includes and explains the markup, but I realize that this might be hard.) Your paper for the class will probably end up linking to many of the pages that you bring to class from week to week.

These seven marked-up documents will be 25% of the course grade. I will drop the lowest of the seven grades.

A good way to find pages for this assignment is to use the "filetype" field in Google. For example, if your domain is geology, you might type the following into Google.com:

   geology filetype:xml

This will retrieve, more or less, only Web pages that include the word "geology" and whose filenames end in ".xml".

For Week 6, when the assignment is to locate an RDF page that is not an RSS or OWL page, you can type this into Google:

   geology -rss -owl filetype:rdf

The "-" in front of a word excludes pages that contain that word, so pages that include "rss" or "owl" will not be returned.
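If you want to try the same pattern for several domains or file types, the two queries above can be assembled with a short Python sketch. The function names build_query and search_url are made up for illustration, and the search URL assumes that Google accepts the query in its usual "q" parameter.

```python
from urllib.parse import urlencode


def build_query(topic, filetype, exclude=()):
    """Assemble a query string such as 'geology -rss -owl filetype:rdf'."""
    parts = [topic]
    parts += ["-" + word for word in exclude]  # "-" excludes a word
    parts.append("filetype:" + filetype)       # restrict by file extension
    return " ".join(parts)


def search_url(query):
    """Return a Google search URL for the query (assumes the 'q' parameter)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


print(build_query("geology", "xml"))
# geology filetype:xml
print(build_query("geology", "rdf", ["rss", "owl"]))
# geology -rss -owl filetype:rdf
print(search_url(build_query("geology", "xml")))
# https://www.google.com/search?q=geology+filetype%3Axml
```

Substitute the domain of your term paper for "geology" and change the file type to match the markup language of the week.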