Sunday, November 14, 2010

Reading Notes- Week 11, Nov 22, 2010

Web Search Engines: Part 1 and 2
The links for Part 1 and Part 2 about search engines led me to the homepage for Computer, the flagship publication of the IEEE Computer Society. I searched for both articles within the journal and found them, but couldn't access them. All I could see were the abstracts. So I went to the ULS website and found them by looking through the electronic journals. Just thought I'd throw that out there in case anyone else was having trouble accessing the articles from the links provided.
I thought these articles provided a good overview of how search engines and web crawlers work. I had never heard of the "politeness delay" Hawking mentions in Part 1, but it makes sense because I guess sending too many requests too quickly would put too much stress on the servers being crawled. Also, I liked the term "politeness delay." Considering how many steps go into crawling web pages, I am amazed at how fast you get search results back. It's hard to believe all that is going on in a fraction of a second. I am just so impressed by how quickly and efficiently search engines work. With all the junk out there on the web, I might expect to get a lot more irrelevant hits, but clearly a lot of thought has been put into designing these search engines, and they generally do a good job.
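To make the "politeness delay" idea concrete for myself, here is a minimal sketch in Python of how a crawler might space out its requests to the same host. This is not taken from the articles: the class name PoliteScheduler, the 2-second default delay, and the method names are all my own assumptions for illustration.

```python
import time
from urllib.parse import urlparse


class PoliteScheduler:
    """Hypothetical helper: enforces a minimum gap between requests per host."""

    def __init__(self, delay_seconds=2.0, clock=time.monotonic, sleep=time.sleep):
        self.delay_seconds = delay_seconds  # assumed value, not from the article
        self._clock = clock
        self._sleep = sleep
        self._last_request = {}  # host -> time of the most recent request

    def wait_time(self, url):
        """Return how many seconds must pass before this URL's host may be hit."""
        host = urlparse(url).netloc
        last = self._last_request.get(host)
        if last is None:
            return 0.0  # first visit to this host, no need to wait
        elapsed = self._clock() - last
        return max(0.0, self.delay_seconds - elapsed)

    def acquire(self, url):
        """Sleep if needed, then record that the host is being contacted now."""
        pause = self.wait_time(url)
        if pause > 0:
            self._sleep(pause)
        self._last_request[urlparse(url).netloc] = self._clock()
        return pause
```

A crawler would call acquire(url) right before each download. Two pages on the same host get spaced out, while a page on a different host can be fetched right away, which is presumably part of how crawlers stay fast overall.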

Current developments and future trends for the OAI protocol for metadata harvesting
The Open Archives Initiative sounds really interesting. It seems to allow various groups to collect their own metadata and then share it through service providers. But toward the end of the article the authors described how even though everyone's using Dublin Core, there are still differences in how data are being entered. Will our field ever be able to reach a standard for interoperability? Or are there just too many archives and too many libraries out there with too many diverse and unique collections to make this possible? Maybe that's not even the problem. Is communication between different institutions the issue?
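Here is a rough sketch in Python of what harvesting might look like, as I understand it. A harvester sends a ListRecords request asking for Dublin Core (the oai_dc metadata prefix) and then reads fields such as dc:date out of the XML that comes back. The base URL, the function names, and the sample records are invented for illustration, and the sample leaves out the full OAI-PMH response envelope to keep things short.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional: harvest only one set
    return base_url + "?" + urlencode(params)


def extract_dc_dates(xml_text):
    """Collect every dc:date value, in document order."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("{%s}date" % DC_NS) if el.text]


# Invented sample: two archives using the same Dublin Core field differently.
SAMPLE = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>Letter A</dc:title><dc:date>2010-11-14</dc:date></record>
  <record><dc:title>Letter B</dc:title><dc:date>November 14, 2010</dc:date></record>
</records>"""

print(build_list_records_url("http://archive.example.org/oai"))
print(extract_dc_dates(SAMPLE))
```

Both sample records are valid Dublin Core, but a service provider that wants to sort or search by date still has to reconcile the two formats. That seems to be the kind of difference the authors were describing.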

The Deep Web: Surfacing Hidden Value
I am constantly amazed by how large the internet is. And I feel like because it is so gigantic I can't even imagine how gigantic it is. According to this article, when people use search engines, they are only searching 0.03 percent of the internet. That's crazy! How many pages are out there that you might want to see but never will? The article states: "Traditional search engines can not 'see' or retrieve content in the deep Web — those pages do not exist until they are created dynamically as the result of a specific search. Because traditional search engine crawlers can not probe beneath the surface, the deep Web has heretofore been hidden." It was interesting to read how the web has evolved and how search engines have evolved with it. Does anyone have any predictions for the future of internet search engines? Will they ever penetrate the deep web?
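To picture why a crawler misses these pages, I tried a toy simulation in Python. Everything in it is made up: a tiny "site" with three static pages, a small "database" behind a search form, and a crawler that only follows links. It is a sketch of the idea in the quote, not how any real search engine works.

```python
import re

# Invented static pages: these exist whether or not anyone searches.
STATIC_PAGES = {
    "/": '<a href="/about">About</a> <a href="/search">Search</a>',
    "/about": '<a href="/">Home</a>',
    "/search": '<form action="/results"><input name="q"></form>',
}

# Invented database content that is only reachable through the search form.
DATABASE = {"xml": "Record on XML", "dublin core": "Record on Dublin Core"}


def fetch(path):
    """Return the HTML for a path, building result pages on demand."""
    if path.startswith("/results?q="):
        term = path.split("=", 1)[1]
        return DATABASE.get(term, "No results")
    return STATIC_PAGES.get(path, "")


def crawl(start="/"):
    """Follow href links only, the way a simple surface crawler would."""
    seen, queue = set(), [start]
    while queue:
        path = queue.pop(0)
        if path in seen:
            continue
        seen.add(path)
        for link in re.findall(r'href="([^"]+)"', fetch(path)):
            queue.append(link)
    return seen
```

Here crawl() only ever finds the three static pages. The record pages exist only when someone submits a query such as /results?q=xml, and no link points to them, so the crawler never sees them.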

Monday, November 8, 2010

Week 10 Comments

My comments for Week 10, Nov 15th, are below:

http://bds46.blogspot.com/2010/11/muddiest-point-118.html?showComment=1289274488305#c8845671163837204619

http://lostscribe459.blogspot.com/2010/11/week-10-reading-assignments.html?showComment=1289274196514#c8175607923355329919

Muddiest Point for November 8th class

XML sounds like it's really flexible and can do a lot. In what situations wouldn't you use XML?

Reading Notes- Week 10, Nov 15, 2010

Digital Libraries: Challenges and Influential Work
It sounds like there are a lot of interesting projects going on in digital library land. The article gave some examples but didn't really go into much depth about the different projects. Has anyone used any of these before? What did you think of your digital library experience?

Dewey Meets Turing
This article clearly laid out the sometimes complicated relationship between computer scientists and librarians. There was a lot of discussion about one group or the other worrying about getting their toes stepped on. I can understand these concerns. But I think that there will always be room for both groups because they do different tasks and have different focuses and different skill sets. Does anyone think that these two groups might one day merge though? Will librarians of the future have to have the technical knowledge of computer scientists?

Institutional Repositories
This article details the challenges and benefits of institutional repositories. One thing that stuck out to me, in light of conversations we've had in LIS 2000, was the chance that it might shift control of what scholars publish from the scholars to their institutions. The issue of control stood out to me, since in 2000 we've talked so much about digital repositories being open access and giving authors more freedom to publish what they want.

Monday, November 1, 2010

Week 9 Comments

Here are my comments for Nov 8th.

Zach's LIS 2600 blog
http://pittlis2600.blogspot.com/2010/11/week-nine-reading-notes.html?showComment=1288659674811#c1926928485211520581

James McNeil's blog
http://jrm170.blogspot.com/2010/11/118-reading-responses.html?showComment=1288659755871#c8157705901944824138

Muddiest Point for November 1st class

I am confused about how Jeipu got to the Notepad document so that he could type in HTML commands and then make it a web page. Do you just create a new text document? And then... how do you export this or something so that it becomes a website? I didn't understand that step. Thanks!
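For what it's worth, my understanding so far is that there is no real "export" step: a web page is just a plain text file saved with an .html extension, and a browser can open that file directly. Here is a small sketch of the same idea in Python instead of Notepad. The page content, the file name index.html, and the function name are my own examples.

```python
from pathlib import Path

# The same kind of text you would type into Notepad.
PAGE = """<html>
  <head><title>My First Page</title></head>
  <body><h1>Hello</h1><p>Typed as plain text.</p></body>
</html>
"""


def save_page(folder, name="index.html"):
    """Write the HTML text to a file whose name ends in .html."""
    path = Path(folder) / name
    path.write_text(PAGE, encoding="utf-8")
    return path
```

Double-clicking the saved file should open it in a browser. Putting it on the web for other people to see would be a separate step of copying the file to a web server.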

Reading Notes- Week 9, Nov 8, 2010

Introducing the Extensible Markup Language (XML)
Is the link that's posted in CourseWeb the correct one? When I clicked on it I was taken to a page entitled "The Brighton University Resource Kit for Students" written by a man named John English. But I guess I can briefly comment on it... I think the idea of providing free information for the purposes of education is really admirable. Textbooks are so expensive. And, in this case, the internet used to be really expensive. I'm always a fan of providing students with free resources. Furthering education will only lead to positive things for the individuals who are learning, and the society that they are able to impact with what they've learned.

A survey of XML standards: Part 1. January 2004
The article says that the current version of XML was translated into English. What does this mean? Is it explained in simpler terms somewhere? XML is based on SGML and is supposed to be a simplified version of that. All of the links to tutorials and other sources look extensive. I like that the author didn't try to reinvent the wheel, but just led you to tutorials and instructions that already exist. I might have missed this, but XML sounds a lot like HTML to me...what is the difference? I'm sure we've covered this in class or somewhere before, but they're both markup languages. Does it have to do with compatibility?

Extending your Markup: an XML tutorial by Andre Bergholz
This article maybe answers my questions from the above article. It says that XML lets you "meaningfully annotate text." I see under "addressing and linking" some specific differences are listed. You can do certain things with XML that you can't do with HTML. The article explains XML in depth. But I think I still don't really see the differences between HTML and XML. Maybe it would help if I saw someone actually using it instead of just the figures shown in the article.
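Trying it out myself, here is a small Python sketch that marks up the same made-up information twice, once with HTML-style tags and once with XML tags I invented (book, title, year). The sample data and the function name are my own and are not from the tutorial.

```python
import xml.etree.ElementTree as ET

# HTML-style markup: the tags say how the text should look (bold, italic).
HTML_VERSION = "<p><b>Sample Title</b> <i>2010</i></p>"

# XML markup: the tags are made up by me and say what each piece of text is.
XML_VERSION = """<book>
  <title>Sample Title</title>
  <year>2010</year>
</book>"""


def describe(markup):
    """Map each child tag name to the text inside it."""
    root = ET.fromstring(markup)
    return {child.tag: child.text for child in root}


print(describe(HTML_VERSION))
print(describe(XML_VERSION))
```

With the HTML version, a program only learns that one piece of text is bold and another is italic. With the XML version, it can tell which piece is the title and which is the year, which I think is what "meaningfully annotate text" is getting at.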

XML Schema Tutorial
Just like the tutorial we looked at last week from W3schools, this one looks very comprehensive. I don't really know what else to say about it except that it looks thorough and like a good resource for anyone interested in learning about XML and seeing lots of examples.