Saturday, 29 October 2011

Coursework Part 1

Understanding Web 1.0 in the context of Public Libraries:

Introduction:
While the internet has had a disruptive effect on commercial areas such as publishing and the music industry, it has helped pull public libraries out of the 90’s lull, returning them as centres of education and providing users with access to electronic services such as the World Wide Web. In this essay I will be exploring the aspects of Web 1.0 which have affected library services by following the process of information, from the webpage itself to the end result and assessing its relevance to the user.

A person attempting to find a book in a public library would approach the OPAC (Online Public Access Catalogue) terminal and use methods of Information Retrieval to retrieve the information from the bibliographic database, which is usually hosted online and can be accessed through the Internet and the World Wide Web.

Internet and the World Wide Web:

Firstly, how is a functioning webpage created in order for it to be accessed by someone using an internet browser?

To turn a page of text into a webpage, it needs to be written in a mark-up language. One example is HTML (another is XML), which is an SGML-based language. The simplest webpage that could be made with HTML would look like:
Title of
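
As a hedged illustration of how small a working page can be, the sketch below holds a minimal page in a Python string and reads its title back with the standard library's html.parser. It assumes the usual html, head, title and body skeleton; the title and paragraph text are placeholders of my own, not the original example.

```python
from html.parser import HTMLParser

# Placeholder page: the title and body text are illustrative only.
MINIMAL_PAGE = """<html>
<head>
<title>Example page</title>
</head>
<body>
<p>Hello, library users.</p>
</body>
</html>
"""


class TitleFinder(HTMLParser):
    """Collects the text found inside the title element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def page_title(html_text):
    """Return the title of an HTML page, or an empty string if none."""
    finder = TitleFinder()
    finder.feed(html_text)
    return finder.title.strip()
```

Saving the string in MINIMAL_PAGE to a file ending in .html should be enough for a browser to display it.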

Once written as above, the page can be saved as an .html or .htm file, hosted on a server and accessed through its URL. For example, the URL for the Westminster Libraries E-Catalogue is:

http://elibrary.westminster.gov.uk/uhtbin/webcat

The http stands for HyperText Transfer Protocol, the most recognisable of the internet protocols (others include FTP and POP3). The ‘elibrary.westminster.gov.uk’ section is the domain name, which the Domain Name System (DNS) translates into the address of the server. The rest is the path to the resource on that server. In this case, the internet browser acts as a Client and sends a request to the computer identified in the URL. The Server constantly runs an http ‘daemon’ which listens for Client requests; once it has detected a request, the Server sends a response, in this case the information for the webpage.
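
To make the parts of a URL concrete, here is a minimal sketch using Python's urllib.parse. The function names describe_url and build_get_request are my own, and the request text is only a simplified picture of what a browser might send, not a full client.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the protocol, the host name and the path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
    }


def build_get_request(url):
    """Build the text of a bare HTTP/1.1 GET request for the URL."""
    parts = urlsplit(url)
    path = parts.path or "/"  # an empty path means the site root
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {parts.hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )
```

Calling describe_url on the catalogue address separates the http protocol, the elibrary.westminster.gov.uk host and the /uhtbin/webcat path.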

Looking at the page source for the above website, it is mostly HTML with some scripts written in JavaScript. Most of the design is held in an external Cascading Style Sheet (CSS), which is linked from the .html file but saved separately with its own .css file extension.

Databases:

To manage the entries within the library catalogue, they would be compiled within a database, which stores all entries centrally so there are no inconsistencies. Unlike spreadsheets, a database can contain millions of separate entries and the OPAC database above is no exception. However, a user of the library is more likely to use natural language to navigate the OPAC (Van Riel, 2008) and it would usually be the task of the information professional to query the database.

A database is a compilation of data tables, which are two-dimensional tables of data arranged in columns and rows. A column is a field, for example Author, whereas a row makes up a complete record spanning the fields (for example: Fields, Factories_and_workshops, Kropotkin, Peter_Alexseivich, 1912 – each value falls under a column but together they form a single row).

Where a spreadsheet is used for compiling the data in one place for calculations, graphs etc, a database is used to answer specific queries. SQL (Structured Query Language) is a common language for communicating with database management software. SQL can be used for building databases and data tables, but it is most commonly used for querying the database with statements built from clauses such as SELECT, FROM and WHERE.
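
A small sketch of such a query, using Python's built-in sqlite3 module rather than the library's real catalogue software. The table name books, its columns and the second row are assumptions made up for the illustration; only the Kropotkin record comes from the example above.

```python
import sqlite3


def find_titles_by_author(surname):
    """Return (title, year) rows for one author from a tiny in-memory table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table: one row per book, one column per field.
    conn.execute("CREATE TABLE books (title TEXT, author TEXT, year INTEGER)")
    conn.executemany(
        "INSERT INTO books VALUES (?, ?, ?)",
        [
            ("Fields, Factories and Workshops", "Kropotkin", 1912),
            ("An Illustrative Title", "Example", 2000),  # invented row
        ],
    )
    # SELECT names the columns, FROM the table, WHERE the condition.
    rows = conn.execute(
        "SELECT title, year FROM books WHERE author = ?", (surname,)
    ).fetchall()
    conn.close()
    return rows
```

The question mark is a placeholder that sqlite3 fills in safely, which avoids building the query by pasting text together.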

If the data you have stored is not homogeneous or is unstructured, then it may be more appropriate to access the data via other means, such as information retrieval.

Information Retrieval:

Information retrieval is the field exploring information seeking behaviour and its relevance to the user (whether it satisfies their information needs). Quite often in a public library, a user does not know precisely what they are looking for (Van Riel, 2008), therefore the information needs to be indexed before being entered into the database, along with the metadata and keywords which the user may be searching for.

There are times when a keyword being searched for is essential to the result, but may be lost in a natural language search. Take the example of Roman numerals, in this case Star Wars: Episode I. It is possible to search for this and find it through Best Match, but alternatively the query could be modified with +I to make sure the numeral is included (Clegg, 2006), or the whole query could be entered as a phrase using quotation marks: “Star Wars: Episode I”.

It is quite likely that someone accustomed to using internet search engines would instinctively use a Best Match technique, using natural language queries and then modifying the search terms if the results aren’t relevant. However, there are other search modes which could be used, for example Boolean keyword searching, which notably removes stop words (e.g. all, the) and uses logical operators and positional operators.

Logical operators are commands which refine the results you would get; for example, Anarchists AND Communists would return results which include both terms together, whereas Anarchists OR Communists would return results with either, and Anarchists NOT Communists would return results which mention Anarchists but not Communists. Positional operators allow you to retrieve results which have keywords in relation to each other; for example, ADJ would return results with the keywords side by side whereas SAME would return results with the keywords in the same bibliographic record.
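
The logical operators map neatly onto set operations, which the following sketch tries to show. The three records are invented, and treating ADJ as "first word immediately followed by second word" is an assumption, since systems differ in how they define it.

```python
# Invented records standing in for bibliographic entries.
RECORDS = {
    1: "anarchists and communists in russia",
    2: "a history of the anarchists",
    3: "communists in europe",
}


def build_index(records):
    """Map each word to the set of record ids that contain it."""
    index = {}
    for record_id, text in records.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(record_id)
    return index


INDEX = build_index(RECORDS)


def term(word):
    """Return the ids of records containing the word."""
    return INDEX.get(word.lower(), set())


def adjacent(first, second):
    """Return ids of records where `first` is directly followed by `second`."""
    hits = set()
    for record_id, text in RECORDS.items():
        words = text.lower().split()
        if any(a == first and b == second for a, b in zip(words, words[1:])):
            hits.add(record_id)
    return hits


both = term("anarchists") & term("communists")             # AND
either = term("anarchists") | term("communists")           # OR
only_anarchists = term("anarchists") - term("communists")  # NOT
```

Here AND is a set intersection, OR a union and NOT a difference, so each operator narrows or widens the set of matching records.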

Conclusion:

It always comes down to precision. An incorrect set of parameters in an SQL query would result in an error message, a badly worded search engine query would bring you irrelevant results and the smallest slip in the coding of HTML would result in a broken section of the webpage.

In 2008, Google announced that they would be updating their programming in order to capture database content (Devine and Egger-Sider, 2009), which would blur the lines between internet search engines and data retrieval from databases. If successful, it would become easier for users to procure relevant and precise information which satisfies their query. But of course, no matter how simple a task becomes, human error will always be present, and it will still be up to information professionals to construct precise queries.



Bibliography and References:

  • http://philsci-archive.pitt.edu/2536/1/iimd.pdf (accessed 27/10/11)
  • Andy MacFarlane, Richard Butterworth, and Jason Dykes (2011) Lecture 02: The Internet and the World Wide Web. London: City University.
  • http://elibrary.westminster.gov.uk/uhtbin/webcat (accessed 25/10/11)
  • http://www.w3.org/People/Raggett/book4/ch02.html (accessed 26/10/11)
  • http://www.w3.org/TR/html4/intro/sgmltut.html (accessed 26/10/11)
  • http://www.isgmlug.org/sgmlhelp/g-sg.htm (accessed 26/10/11)
  • Andy MacFarlane, Richard Butterworth, and Anton Krause (2011) Lecture 03: Structuring and querying information stored in databases
  • http://sqlzoo.net/w.htm (accessed 26/10/11)
  • Andrew MacFarlane (2011) Lecture 04: Information Retrieval
  • Van Riel, R, Fowler, O and Downes, A (2008) The Reader-friendly Library Service. Newcastle upon Tyne: The Society of Chief Librarians.
  • http://library.indstate.edu/about/units/instruction/key.pdf (accessed 26/10/11)
  • Clegg, B (2006) Studying Using The Web. New York: Routledge.
  • Devine, J and Egger-Sider, F (2009) Going Beyond Google. London: Facet Publishing.

Blog URL: http://teapotverseslife.blogspot.com/

All images created by Shaun Condon for the purpose of this blog.

Monday, 10 October 2011

DITA - Week 3

When I attempted to query the database with the examples shown in the lecture, I quickly found that I was using the wrong words and descriptions for the entities and the data tables. For example, I am asked in Task 1.1 to use the field 'Company Name' which I tried as 'company name', 'CompanyName' and other variations before having to ask for help.

I used the command "show tables;" to find the first set of data tables, then specified that I wanted publishers in order to find the exact term for 'Company Name' using the command "desc publishers;".

+--------------+------------------+------+-----+---------+-------+
| Field        | Type             | Null | Key | Default | Extra |
+--------------+------------------+------+-----+---------+-------+
| pubid        | int(10) unsigned | NO   | PRI | 0       |       |
| name         | varchar(100)     | YES  |     | NULL    |       |
| company_name | varchar(100)     | YES  |     | NULL    |       |
| address      | varchar(100)     | YES  |     | NULL    |       |
| city         | varchar(100)     | YES  |     | NULL    |       |
| state        | varchar(100)     | YES  |     | NULL    |       |
| zip          | varchar(30)      | YES  |     | NULL    |       |
| telephone    | varchar(30)      | YES  |     | NULL    |       |
| fax          | varchar(30)      | YES  |     | NULL    |       |
| comments     | text             | YES  |     | NULL    |       |
+--------------+------------------+------+-----+---------+-------+

This gave me the first answer: Company Name is written as company_name. Now on with the tasks.

Task:

Develop SQL queries to return the following information:

  1. A list of the PubID, Name, Company Name and City for all publishers based in the city of New York
  2. A list of all fields for publishers named Prentice Hall.
  3. A list of the Title, Year and ISBN for all titles published in 1994.
  4. A list of the Title, Year, ISBN and PubID for all titles published since 1980 in year order
  5. A list of all fields in the Titles table for books whose title begins with the word 'database' (regardless of upper/lower case letters)
  6. A list of all fields in the Titles table for books with the word 'database' anywhere in the title (regardless of upper/lower case letters)
  7. A list of the title, Year Published and ISBN for all books with 'SQL' in the title written since 1990 in date order
  8. A list of the Company Names of all publishers who have published books on programming since 1990
  9. The name of the publisher who published a book with ISBN 0-0280074-8-4
  10. The name of the author who wrote "A Beginner's Guide to Basic" listing also, the ISBN and name of this book.

Answers:

1) select PubID, Name, Company_Name, City
from publishers
where City = "New York";

2) select * from publishers where name = "Prentice Hall";

3) select title, year_published, isbn
from titles
where year_published = 1994;

(For 1.3, I managed to get the result, but the entries were too wide to fit on the screen. The MySQL shell couldn't fit all the characters from the result on screen; I tried to rearrange them by adding \G or \g to the end of the command, but the result was still off screen.)

4) I originally tried "select title, year_published, isbn, pubid from titles where year_published >1979;" but found that the years were out of order (again the result was too large to fit on the screen). Instead I tried

select title, year_published, isbn, pubid
from titles
where year_published >1979 order by year_published asc;

... and found my result.

5) select *
from titles
where title like "database%";

6) This is similar, only to get an 'any' result I add another wildcard.

select *
from titles
where title like "%database%";
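
The difference between the two wildcard patterns can be tried out with a throwaway table. This sketch uses Python's sqlite3 module rather than the MySQL shell from the exercise, and the three titles are invented; SQLite's LIKE, like MySQL's with its default settings, ignores case for plain ASCII letters.

```python
import sqlite3


def like_matches(pattern):
    """Return the invented titles that match a LIKE pattern, in title order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE titles (title TEXT)")
    conn.executemany(
        "INSERT INTO titles VALUES (?)",
        [
            ("Database Design",),
            ("Introduction to DATABASE Systems",),
            ("Learning SQL",),
        ],
    )
    rows = conn.execute(
        "SELECT title FROM titles WHERE title LIKE ? ORDER BY title",
        (pattern,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

With these rows, "database%" should match only the title that starts with the word, while "%database%" should also pick up the title with the word in the middle.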

7) My original guess was "select title, year_published, isbn from titles where title like "%SQL%", year_published >= 1990 order by year_published asc;", but it was not correct. Whereas columns such as title and year_published can be separated by a comma in the select list, the two conditions on "%SQL%" and year_published had to be joined with an and.

select title, year_published, isbn
from titles
where title like "%SQL%" and year_published >= 1990 order by year_published asc;

8) select distinct company_name
from publishers, titles
where title like "%programming%" and publishers.pubid = titles.pubid and year_published >= 1990;

Had real trouble with this one. I knew I had to knit two tables together but struggled to remember lessons already learned, such as that the columns in the SELECT part can be separated by commas, but from then on the conditions must be separated with AND. Also, finding the right combination of primary key and foreign key was a stretch for the imagination.

9) select company_name
from publishers, titles
where publishers.pubid = titles.pubid and isbn like "0-0280074-8-4";

Similar to the previous one, I tied the two parts together with primary and foreign keys, the rest was rather simple.

10) I needed help with this one. With the previous question, I was only attempting to query the company name, whereas here I have to find the author's name as well as the isbn and title of the book. However, I did get a satisfactory answer.

select author, titles.isbn, title
from authors, title_author, titles
where authors.au_id = title_author.au_id and title_author.isbn = titles.isbn
and title = "A Beginner's Guide to Basic";

The part I still am trying to comprehend is the where section, with knitting together two parts of the table.
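
One way to see what the where section is doing is to rebuild it on a toy copy of the tables. The sketch below uses Python's sqlite3 module; the table and column names follow my answer to Task 10, but the rows are invented and the real database has more columns. It runs the comma-style query next to the same query written with explicit JOIN ... ON, which should return the same rows.

```python
import sqlite3


def demo_join():
    """Run the same three-table lookup in comma style and in JOIN style."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (au_id INTEGER PRIMARY KEY, author TEXT);
        CREATE TABLE titles (isbn TEXT PRIMARY KEY, title TEXT);
        -- Link table: one row per (book, author) pair.
        CREATE TABLE title_author (isbn TEXT, au_id INTEGER);
        INSERT INTO authors VALUES (1, 'Author One'), (2, 'Author Two');
        INSERT INTO titles VALUES ('isbn-a', 'First Book'), ('isbn-b', 'Second Book');
        INSERT INTO title_author VALUES ('isbn-a', 1), ('isbn-b', 2);
    """)
    # Comma style: every table is listed, then WHERE matches the keys.
    implicit = conn.execute("""
        SELECT author, titles.isbn, title
        FROM authors, title_author, titles
        WHERE authors.au_id = title_author.au_id
          AND title_author.isbn = titles.isbn
          AND title = 'First Book'
    """).fetchall()
    # JOIN style: each key match sits next to the table it brings in.
    explicit = conn.execute("""
        SELECT author, titles.isbn, title
        FROM authors
        JOIN title_author ON authors.au_id = title_author.au_id
        JOIN titles ON title_author.isbn = titles.isbn
        WHERE title = 'First Book'
    """).fetchall()
    conn.close()
    return implicit, explicit
```

Each key comparison keeps only the row combinations where the primary key in one table equals the foreign key in the next, so title_author acts as the bridge between authors and titles.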

Sunday, 9 October 2011

Adding CSS

I stumbled over this for a while, in my last post I mentioned how I was worried by CSS, well I had a reason.

However, I believe I have it.

There are two ways of adding CSS: External and Internal Style Sheets. If I were to add it internally, I would have to add it in the HTML for every page - seeing as I have only 3 pages and am struggling, I have opted for External.

An example taken from W3 Schools for an Internal Style Sheet is this:

(head)
(style type="text/css")
hr {color:sienna;}
p {margin-left:20px;}
body {background-image:url("images/back40.gif");}
(/style)
(/head)

So the 'style type' bit identifies it as being CSS rather than something like JavaScript Style Sheets (JSSS), which Netscape 4 used but no one else did. The rest of the text defines the look of the page: colour, margin, background image etc.

For External Style Sheets I had to create a .css file. I opened notepad and copied the Simple Style Sheet offered in the exercises for week 2 and saved it as stsh1.css (style sheet one), saved it to a folder called CSS. Then I had to link my pages to it.

(Actually - at this point - it automatically 'worked' when I opened up the page to test it; it turns out that as I'd been playing around with the CSS I had enough on there to make it work locally. It was only when I tried to upload it to the University computers that it crashed and died again.)

In the HEAD section of every page I created, I added this:
(LINK REL="STYLESHEET" TYPE="text/css" HREF="css/stsh1.css")

This is the link to the CSS file I have saved and this simple line is what turned my page from white space to something slightly more palatable.

Now I am afraid to play around with it for f34r of breaking it.

Monday, 3 October 2011

DITA - Week 2

The Internet and the Web:

Task One: "Find out about three of the following tags or elements and attributes and parameters that can be used within them."

Paragraphs and Line Breaks:

I feel comfortable with the tags for paragraphs (p) and line breaks (br), which are used to group text on the page: a new line in Notepad doesn't translate to a new line in a webpage unless a line break or paragraph tag is added. An (hr) tag puts a horizontal line across the webpage.

Ones I am not so familiar with are Meta Information, Tables, Unordered Lists and Ordered Lists, while I can assume what they do, I have never used them.

Meta Information:

From what I can tell, Meta Tags will not show up in the main body of a webpage, it is also advised to put them into the HEAD section of the HTML. Their function is more behind the scenes than anything else, with smart keywords allowing for search engine optimisation to be more efficient.

Here are a few examples that I have found:

(meta name="description" content="Free Web tutorials" /)
(meta name="keywords" content="HTML,CSS,XML,JavaScript" /)
(meta name="author" content="Hege Refsnes" /)

Pretty self-explanatory: the keywords are there for search engine optimisation, so that a search engine can find the page through a keyword search. I would place myself as the author of the webpage, and the description would be a few quick words that describe my new webpage.

Unordered / Ordered List:

For the task ahead of me I have written a brief example of an unordered and an ordered list. A brief unordered list of demands, and an ordered list of the faux rebellion's demands.
(ul)
(li)More cheese on top of pasta bakes(/li)
(li)Less tax on space station duties(/li)
(li)Total acceptance of the new world order(/li)
(/ul)

(ol)
(li)Rebellion(/li)
(li)?????(/li)
(li)Profit!(/li)
(/ol)

One limitation of ordered and unordered lists that I have noticed is that they seem limited to just bullet points and numbers. I wonder how these could be customised to include Roman numerals, for example; the answer may lie in another language.

Task Two:
Producing Some HTML:

As with last week, I have decided to do something a little different from the set task, rather than create a webpage for myself about myself, I have created a fake rebellion with some silly aims.

Firstly, I created a first page which will be the starting point of my website. I did this by using the original HEAD, TITLE and BODY template provided. Once I had some reasonable looking HTML code I opened it in EditPlus2, which highlighted the tags and made things a lot easier to do.

Unfortunately, I did not heed the advice of the tutors and mistakenly left out the closing / for one of my tags, oh the embarrassment.
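
A slip like this can be caught by a small script. Below is a rough sketch using Python's html.parser that keeps a stack of open tags and reports any that never close. The check_tags name and the short VOID_TAGS list are my own, and the check is stricter than HTML really is, since browsers forgive some missing closing tags such as those for li and p.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag (a short, incomplete list).
VOID_TAGS = {"br", "hr", "meta", "link", "img", "input"}


class TagChecker(HTMLParser):
    """Tracks open tags on a stack and records mismatches."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # covers self-closing forms such as <br />
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check_tags(html_text):
    """Return a list of problems found; an empty list means balanced."""
    checker = TagChecker()
    checker.feed(html_text)
    checker.close()
    unclosed = [f"unclosed <{tag}>" for tag in checker.open_tags]
    return checker.problems + unclosed
```

Writing a second opening tag where a closing one was meant leaves both copies on the stack, so the forgotten slash shows up as two unclosed tags.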

Once I had my first page, I created an index page and linked the first page to it. I created the lists as shown above and played around with some of the meta tags.

(META NAME="Author" CONTENT="MrTeapot")
(META NAME="Keywords" CONTENT="New World Order, Rebellion, Take Over")
(META NAME="Description" CONTENT="A massive yarn on you all.")

I was rather impressed as I had only just discovered Meta Tags hidden in the page source and had no clue how to use them, but was soon adding tags to my pages.

CSS

I was worried by the introduction of CSS, and as I had no time to tackle it in lesson, I skipped that part. My webpage will be designed in my own time, I wished to publish my webpage for all to enjoy and skipped to the next task *slap on the wrist for me*.

Publishing

This was the most complicated part for me, understanding how webpages online are managed.

Firstly, I Mapped my W: Drive to allow me to just drag and drop any files I want into my public folder, which can be accessed online.

To actually publish, I had to open a program called Telnet. Once I had connected to unix.city.ac.uk and logged in, I could use a single keystroke to publish my html documents and access them again with the URL that leads to my public folder.

Phew. Time to go back and do some CSS stuff now.