Saturday 29 October 2011

Coursework Part 1

Understanding Web 1.0 in the context of Public Libraries:

Introduction:
While the internet has had a disruptive effect on commercial areas such as publishing and the music industry, it has helped pull public libraries out of their 1990s lull, restoring them as centres of education and providing users with access to electronic services such as the World Wide Web. In this essay I will explore the aspects of Web 1.0 which have affected library services, following the flow of information from the webpage itself through to the end result, and assessing its relevance to the user.

A person attempting to find a book in a public library would approach the OPAC (Online Public Access Catalogue) terminal and use methods of Information Retrieval to retrieve the information from the bibliographic database, which is usually hosted online and can be accessed through the Internet and the World Wide Web.

Internet and the World Wide Web:

Firstly, how is a functioning webpage created so that it can be accessed by someone using an internet browser?

To convert a page of text into a webpage, it needs to be written in a mark-up language. One example of this is HTML (another is XML), which is an SGML-based language. The simplest webpage that could be made with HTML would look like:

    <html>
      <head>
        <title>Title of the page</title>
      </head>
      <body>
        Text of the page.
      </body>
    </html>

Once designed as above, this can be saved as an .html or .htm file and hosted on a server, where it can be accessed via its URL. For example, the URL for the Westminster Libraries E-Catalogue is:

    http://elibrary.westminster.gov.uk/uhtbin/webcat

The 'http' stands for HyperText Transfer Protocol, the most recognisable protocol (alongside FTP and POP3). The 'elibrary.westminster.gov.uk' section is the domain name, which the DNS (Domain Name System) resolves to the address of the host computer. The rest is the local path to the resource on the server. In this case, the internet browser acts as a Client and sends a request to the computer at the address in the URL. The Server constantly runs an HTTP 'daemon' which listens for Client requests; once it has detected a message, the Server sends back a response, in this case the information for the webpage.
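
To make this request/response cycle concrete, the sketch below uses Python's standard http.client module to fetch the page above. It is only an illustration of the exchange just described, not part of any library system; the host and path are taken from the URL quoted earlier.

    # Sketch of the Client/Server exchange: the browser's role is played
    # here by Python's standard http.client module.
    import http.client

    conn = http.client.HTTPConnection("elibrary.westminster.gov.uk")
    conn.request("GET", "/uhtbin/webcat")    # the Client's request
    response = conn.getresponse()            # the Server's response
    print(response.status, response.reason)  # e.g. 200 OK
    html = response.read()                   # the HTML for the webpage
    conn.close()

If the daemon is listening and the path is valid, the response carries the HTML for the page; a mistyped path would instead return a 404 status.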

Looking at the page source for the above website, it is mostly HTML with some scripts written in JavaScript, and most of the design is held in an external Cascading Style Sheet (CSS), which is linked to from the .html file but saved under its own .css file extension.

Databases:

To manage the entries within the library catalogue, they are compiled into a database, which stores all entries centrally so that there are no inconsistencies. Unlike spreadsheets, a database can contain millions of separate entries, and the OPAC database above is no exception. However, a user of the library is more likely to use natural language to navigate the OPAC (Van Riel, 2008), and it would usually be the task of the information professional to query the database.

A database is a compilation of data tables, which are two-dimensional tables of data formed into columns and rows. A column is a field, for example Author, whereas a row makes up a complete record spanning the fields (for example: 'Fields, Factories and Workshops', 'Kropotkin, Peter Alexseivich' and '1912' would each fall under a separate column but together form a single row).

Where a spreadsheet is used for compiling the data in one place for calculations, graphs etc., a database is used to answer specific queries. SQL (Structured Query Language) is a common language for communicating with database management software. SQL can be used for building databases and data tables, but it is most commonly used for querying the database with commands such as SELECT, FROM and WHERE.
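
As a rough sketch of how such a query might look in practice, the example below uses Python's built-in sqlite3 module. The 'catalogue' table and its column names are invented for illustration and are not the actual schema of the OPAC above.

    # Sketch: building a tiny catalogue table and querying it with
    # SELECT, FROM and WHERE. Table and column names are illustrative.
    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("""CREATE TABLE catalogue (
                      title    TEXT,
                      surname  TEXT,
                      forename TEXT,
                      year     INTEGER)""")
    db.execute("INSERT INTO catalogue VALUES (?, ?, ?, ?)",
               ("Fields, Factories and Workshops",
                "Kropotkin", "Peter Alexseivich", 1912))

    # SELECT names the fields, FROM names the table, WHERE filters rows.
    query = "SELECT title, year FROM catalogue WHERE surname = ?"
    for row in db.execute(query, ("Kropotkin",)):
        print(row)  # ('Fields, Factories and Workshops', 1912)

Note how the record from the earlier example spans the fields of a single row, and how the WHERE clause narrows the result to just the rows that match.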

If the data you have stored is not homogeneous or is unstructured, then it may be more appropriate to access the data via other means, such as information retrieval.

Information Retrieval:

Information retrieval is the field which explores information-seeking behaviour and the relevance of results to the user (whether they satisfy the user's information needs). Quite often in a public library, a user does not know precisely what they are looking for (Van Riel, 2008); therefore the information needs to be indexed before being entered into the database, along with the metadata and keywords which the user may be searching for.

There are times when a keyword being searched for is particular to the result, but may be lost in a natural language search. Take the example of Roman numerals, in this case Star Wars: Episode I. It is possible to search for this and find it through Best Match, but alternatively the query could be modified with +I to make sure the numeral is included (Clegg, 2006), or the whole query could be entered as a phrase using quotation marks: “Star Wars: Episode I”.

It is quite likely that someone accustomed to using internet search engines would instinctively use a Best Match technique, entering natural language queries and then modifying the search terms if the results are not relevant. However, there are other search modes which could be used, for example Boolean keyword searching, which notably removes stop words (e.g. 'all', 'the') and uses logical operators and positional operators.

Logical operators are commands which refine the results you would get; for example, Anarchists AND Communists would return results which include both terms together, whereas Anarchists OR Communists would return results with either, and Anarchists NOT Communists would return results about Anarchists alone. Positional operators allow you to retrieve results in which the keywords stand in a given relation to each other; for example, ADJ would return results with the keywords side by side, whereas SAME would return results where the keywords appear within the same bibliographic record.
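
The logic behind these operators can be sketched with set operations over a small inverted index, as in the Python example below. The records and stop word list are invented for the example; a real OPAC would of course operate over its full bibliographic index.

    # Sketch of Boolean keyword searching over a toy inverted index.
    # The records and stop words are invented for illustration.
    STOP_WORDS = {"all", "the", "a", "an", "of", "in", "and"}

    records = {
        1: "the anarchists of chicago",
        2: "anarchists and communists in the commune",
        3: "a history of the communists",
    }

    # Build an inverted index: keyword -> set of record numbers.
    index = {}
    for rec_id, text in records.items():
        for word in text.split():
            if word not in STOP_WORDS:
                index.setdefault(word, set()).add(rec_id)

    anarchists = index.get("anarchists", set())
    communists = index.get("communists", set())

    print(anarchists & communists)  # AND: records with both terms -> {2}
    print(anarchists | communists)  # OR: records with either -> {1, 2, 3}
    print(anarchists - communists)  # NOT: anarchists alone -> {1}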

Conclusion:

It always comes down to precision. An incorrect set of parameters in an SQL query would result in an error message, a badly worded search engine query would bring you irrelevant results and the smallest slip in the coding of HTML would result in a broken section of the webpage.

In 2008, Google announced that they would be updating their programming in order to capture database content (Devine and Egger-Sider, 2009), which would blur the lines between internet search engines and data retrieval from databases. If successful, it would become easier for users to procure relevant and precise information which satisfies their query. But of course, no matter how simple a task becomes, human error will always be present, and it will still be up to information professionals to construct precise queries.



Bibliography and References:

  • http://philsci-archive.pitt.edu/2536/1/iimd.pdf (accessed 27/10/11)
  • MacFarlane, A., Butterworth, R. and Dykes, J. (2011) Lecture 02: The Internet and the World Wide Web. London: City University.
  • http://elibrary.westminster.gov.uk/uhtbin/webcat (accessed 25/10/11)
  • http://www.w3.org/People/Raggett/book4/ch02.html (accessed 26/10/11)
  • http://www.w3.org/TR/html4/intro/sgmltut.html (accessed 26/10/11)
  • http://www.isgmlug.org/sgmlhelp/g-sg.htm (accessed 26/10/11)
  • MacFarlane, A., Butterworth, R. and Krause, A. (2011) Lecture 03: Structuring and Querying Information Stored in Databases. London: City University.
  • http://sqlzoo.net/w.htm (accessed 26/10/11)
  • MacFarlane, A. (2011) Lecture 04: Information Retrieval. London: City University.
  • Van Riel, R., Fowler, O. and Downes, A. (2008) The Reader-friendly Library Service. Newcastle upon Tyne: The Society of Chief Librarians.
  • http://library.indstate.edu/about/units/instruction/key.pdf (accessed 26/10/11)
  • Clegg, B. (2006) Studying Using The Web. New York: Routledge.
  • Devine, J. and Egger-Sider, F. (2009) Going Beyond Google. London: Facet Publishing.

Blog URL: http://teapotverseslife.blogspot.com/

All images created by Shaun Condon for the purpose of this blog.
