Schedule On One Page (2009)

This page contains all the exercises from the whole semester.


01 Introduction

by samoore (08 Sep 2009 00:48; last edited on 15 Sep 2009 19:46)

Description of what the course is about, what we'll do in the class, why students should take it, why all their friends should take it.

Class held on 09/09/2009. (student notes; possible questions).

My notes

  1. Introduce myself.
  2. Go through the class pitch on SlideShare
    • At the end of the pitch have a discussion about the merits of the class, what they're thinking, what they think sounds good, what sounds confusing.
    • Slideshow
  3. Take roll; use the Photo Roster
  4. Course wiki
  5. Discuss interesting parts of class from my perspective:
    • practical
    • class wiki (that is open, evolves, contains blog so that we all can teach each other)
    • twitter (I'm drsamoore)
    • individualized learning
    • project-centric
    • your own wiki
    • so much changes from year-to-year; for example:
      • Yahoo/Microsoft Live to Bing
      • Lots of tools disappeared
      • Existing tools have improved
      • New tools have appeared
      • Lots of information available on the Web; I capture stuff that's of interest to me on delicious — specifically, under the bit330 tag.
      • Lots of news and blogs available
    • the class is revised from last year
      • 4 totally new days
      • 3 mostly new days
      • every other day has changed non-trivially (because of technological changes, if nothing else)
      • we learned lots about what tools are good, bad, and indifferent
    • web-based almost completely
    • little reading but lots of doing
    • lots of small tasks; can't fall behind
  • Office hours: MTW 3:30-4:30 in the Winter Garden
[Image: bit330topics2009.png]

To do

  1. Get a twitter account if you don't already have one.
    • If you have a cell phone plan with unlimited texting (so that you are not charged per message), then I'd like you to set it up with twitter.
    • I have set it up (for now) so that my phone will receive tweets from 9am-9pm.
    • Go to Settings/Devices (from the home page) after you have set up your account.
    • Also upload a photo so that I can see who is tweeting me.
    • We will do more with this in future classes.
  2. Sign up as a WikiDot member
  3. Apply for membership to the BIT330 web site.
  4. Sign up for class notes or questions (as described on the assignments page).
  5. Sign up for search industry updates (as described on this page).

02 Web Search

by samoore (11 Sep 2009 15:18; last edited on 06 Oct 2009 15:12)

Discuss basics of Web search and why students should use multiple search tools (rather than just Google).

Class held on 09/14/2009. (student notes; possible questions).

At beginning of class

  1. Someone should take notes and post them for today's class (sign up here)
  2. By now you should have done the following. This is not optional. This is not to do later. These sign-ups are due today.
    1. Become a member of twitter. Follow me, drsamoore.
      • How will I use twitter? How can you use twitter?
    2. Become a WikiDot member
    3. Become a member of the BIT330 web site
    4. Read through the major pages of the wiki and look through the rest of it
    5. You will want to sign up for
  3. The structure of today's class is going to be fairly standard for the rest of the semester:
    • Start off with announcements and taking questions and comments.
    • Lecture for a bit. This lecture will be something of an overview and will provide the motivation and background for the exercises that you will complete and the assignments that you will have to work on.
    • Provide some time for you to start on your exercises (which will allow you to explore the specifics of the topic that my lecture introduces).

My notes

  1. Go through “At beginning of class”
  2. Take roll.
  3. Make presentation
  4. Go through the search syntax page
  5. Talk about blogging
  6. Point out the exercises for today; they should start working on these as soon as I'm done talking.
  7. Point out the “To do after class” section on this page

To do after class

  1. Finish the Web search exercises.
  2. Think about the following questions:
    1. Search tools can differ by their functions: generating results, exploring results, and monitoring changes. How do Google, Bing, and Ask differ along the first two dimensions (we'll explore the third later)?
    2. Look at any one of the search engines we used today (other than Google). Analyze it in terms of the "search experience" that it provides.
  3. Read Life before Google.
  4. You should be very, very familiar with the search syntax page by the next class. I'm going to update it to include Ask very soon; and I'm also going to update the Yahoo search information.

Resources

  1. Life before Google
  2. Search syntax — for Google, Bing, and (soon) Ask.
  3. Today's slides on SlideShare and as a PDF

03 Wikidot And Twitter

by samoore (15 Sep 2009 23:33; last edited on 23 Nov 2009 15:13)

We'll go over techniques and tricks related to using this wiki, which will also be the host of your term project wiki. We will also learn a bit more about twitter, enough to get started using it.

Class held on 09/16/2009. (student notes; possible questions).

Before class

  1. Keep your cell phone out but put it on vibrate.
  2. Open your Web browser
  3. Make sure you are on these pages:
  4. Take note of these pages:
    • Search related feeds (from "Content" menu)
    • RSS feed items (from "Content" menu)
    • Search syntax (from "Content" menu) — updated for Ask.com information
    • On Monday, the "Grades" menu will lead to an online database in which you will "turn in" your assignments.
  5. Note the new menu structure
    • Dynamic "Schedule" menu
      • After the wikidot tutorial, be sure to look at how the top menu is put together (i.e., look at the code itself).
  6. This course Web site is your course Web site.

At beginning of class

  1. Take roll.
  2. Talk about assignments
    1. Term project topics
    2. Assigned/due dates
      • Notes — notes to be posted by the end of the class day
      • Questions — questions to be posted (at least first draft) by one week later
      • General blog entries — write-up to be posted by the following class
        • These will be posted on your own wiki. We'll see how to do this today.
      • Industry updates — write-up to be posted on the day listed
        • These will be posted on the course wiki. Ditto.
    3. The biggest challenges with this class
      • Knowing what to do
      • Staying familiar with the Web site
      • Completing the daily exercises and frequent blog assignments so as to not fall behind
      • Coming up with an interesting topic for your term project

My notes

Twitter

Some background

  1. Today we're going to complete the setup of our twitter accounts and make sure that we know the basics of how to use it.
  2. Twitter is not an IM Client — "The basic idea behind Twitter is to produce occasional status updates, not hold personal conversations. Conversations with more than one person are exactly what Twitter is for and these should be encouraged, but if it is obvious that there is only one other participant take it off-Twitter to an IM client." (from The ultimate guide to everything twitter, below)
  3. Uses of twitter for this class
    • Ask me questions about class (use #bit330 in message)
    • Ask me questions about BBA program (use #rossbba in message)
    • I'll remind you about something you need for an upcoming class or some assignment that is due
  4. Message types (in the US, all messages go to 40404)
    • To update your twitter status: "message goes here"
    • Direct message to a user: "d username message goes here"
      • This is like a direct text message. The message does not appear in your twitter log; it only appears in your direct message outbox.
    • Bring a message to someone's attention: "@username message goes here"
      • This message appears in your twitter log but is brought to the attention of the person you identified.
    • To start following someone: follow username
    • To stop following someone: leave username
    • To get a user's messages on your phone: on username
    • To stop getting a user's messages on your phone: off username
    • To stop all messages from going to your phone: off
      • Actually, you'll still get direct messages; in order to stop getting direct messages as well, send off again.
    • To nudge a person to update twitter: nudge username
      • This is encouragement for someone to update their twitter status.
    • To get statistics about your account: stats
    • To invite a non-twitterer to join the fun: "invite 404 555 1212"
  5. Hashtags
    • Words that are preceded by the hash #
    • The hashtag for this class is #bit330
    • Use the hashtag whenever you are referring to this class. It will make the message easier to find later. You'll see this in a later class.
  6. The twitter exercises for today's class can be found here. Do them now.
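The message types listed above can be sketched as a small classifier. This is only an illustration of how the text-message commands are distinguished from ordinary status updates; the function names `classify_sms` and `hashtags` are hypothetical, and twitter's own handling may differ in edge cases.

```python
import re

# Commands that are followed by exactly one username.
USER_COMMANDS = {"follow", "leave", "on", "nudge"}


def classify_sms(text):
    """Return (kind, username, body) for a text message sent to twitter.

    Illustrative sketch only: anything that does not look like one of the
    commands in the list above is treated as a status update.
    """
    stripped = text.strip()
    words = stripped.split()
    if not words:
        return ("empty", None, "")
    first = words[0].lower()
    rest = words[1:]
    if first == "d" and len(rest) >= 2:
        # "d username message goes here" is a direct message.
        return ("direct", rest[0], " ".join(rest[1:]))
    if first.startswith("@") and len(first) > 1:
        # "@username message goes here" brings a status to someone's attention.
        return ("mention", words[0][1:], " ".join(rest))
    if first in USER_COMMANDS and len(rest) == 1:
        return (first, rest[0], "")
    if first == "off":
        if len(rest) == 1:
            return ("off", rest[0], "")
        if not rest:
            return ("off_all", None, "")
    if first == "stats" and not rest:
        return ("stats", None, "")
    if first == "invite" and rest:
        return ("invite", None, " ".join(rest))
    return ("status", None, stripped)


def hashtags(text):
    """Return the hashtags in a message, without the leading #."""
    return re.findall(r"#(\w+)", text)


for message in ["follow drsamoore",
                "d drsamoore when is the quiz?",
                "Working on the #bit330 exercises"]:
    print(classify_sms(message), hashtags(message))
```

Note that a two-word status such as "on vacation" would be read as a command in this sketch, which is a reminder to be careful when a status update starts with one of the command words.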

Wikidot

  1. Today we're going to learn about wikidot, get an idea of how to use it, and get more familiar with working with a wiki.
  2. Wikidot is the host of the course Web site, but it's also going to be the host of your term project Web site.
  3. You are a member of the course Web site, and you will be the administrator of your own term project site. This means that, while you have total and complete control over your own site, you also have the ability to edit and create pages (but not delete them) within this course Web site.
  4. I will expect that you will be a very good user of this wikidot site by the end of the semester. Maybe you won't be an expert, but you'll be able to make a wiki that is filled with properly formatted content and useful navigation.
  5. The wikidot exercises for today's class can be found here. Do them now.
    • If you have any questions, you can either send them on twitter or put them at the bottom of this page under "Common questions".

During and after today's class

  1. Complete all of the twitter tutorial and the wikidot tutorial before next class.
    • This includes your twitter account setup (including headshot loaded and identifying personal info), sending a few twitter messages, your wikidot account setup (again, including headshot loaded and identifying personal info), a test blog (on your wiki), a blog listing page (on your wiki), and an introductory blog (on the course Web site).
  2. Add your information to the student list page before next class.
    • Please stop kicking other people off of a page when editing. I'm going to have to come up with different methods of doing these assignments but, until then, please be polite. Thank you.
    • This should all be done by Sunday night at 10pm. This is the first part of your participation grade. (I won't always tell you this, but I'm reminding you this first time.)
  3. Think about what you might do for the term project.

Resources

Wikidot

  1. Videos and other resources about wikidot
  2. Twitter tutorial
  3. Wikidot tutorial

Twitter

  1. Twitter tutorials
  2. Twitter resources
  3. Story sources

Common questions

If you have questions about wikidot or twitter, add them to the list below. Someone (preferably a student) will answer your question.

About wikidot

  1. What does the pound sign mean in the menu bar (top menu bar)?
    • It means that the menu item cannot be followed — generally it's a menu header or divider.
  2. How do you delete pages?
    • Use the "site options" button at the bottom of the page.
  3. How do you edit pages?
    • Use the "edit" button at the bottom of the page.
  4. How do you list the pages on the site?
    • Use the button on the side bar that says "List all pages".
  5. Can you have the top menu bar automatically update with the most recent blog entries (or whatever)?
  6. Is it possible to dynamically generate a table from other pages within a Web site?
    • Yes, and you should look at the code I use to generate the "Announcements" or "Schedule" or "Blogs" on the start page.
  7. How do you put an image hosted on mFile on your page?
    • The instructions are discussed on this page.

04 Search Techniques

by samoore (19 Sep 2009 22:25; last edited on 07 Oct 2009 14:01)

We go over several standard search techniques and strategies.

Class held on 09/21/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” info
  2. Lecture through the slides (as a PDF)
  3. Talk through the examples
  4. Go through “At end of lecture”

At beginning of class

  1. Look at announcements made since the previous class
  2. If you're going to ask me a question via twitter or email, first do the following:
    • Look at my previous twitter messages at drsamoore
    • Look at my recent announcements on the wiki
    • If you ask me about a wiki page, use http://bit.ly to send me the link to that page so that I can look at it.
  3. Do not wait until the last minute to start your assignments.
    • There are technical issues that you have to learn related to wikidot. This can't be taught very well over twitter. Maybe you've noticed this?
  4. My office hours are in the Winter Garden on MTW from 3:30-4:30 (or, generally, when students stop coming by, so I might leave early if no one is there or I might leave late if I'm busy).
  5. Check who is doing what:
  6. Blog template trouble
    • Many of you are having issues with this. You should have the following pages on your wiki (substitute for myWiki and nameOfMyFirstBlog):
      • myWiki.wikidot.com/blog:_template
      • myWiki.wikidot.com/blog:nameOfMyFirstBlog
      • myWiki.wikidot.com/bloglist
    • You need to understand the relationship among these three pages.
  7. Also, let's fix some blog formatting issues.
    • First paragraphs, mainly.
    • But also paragraph separation.
  8. File history information
  9. Grade tool (what to do???)
  10. Your first possible blog entry (on today's exercises) could be turned in next class (see the schedule-2009 for details on the timing of blog entries)
  11. From two classes ago: Why do search engines return different results?
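For the blog template item above, the three page addresses can be sketched in a few lines. The sketch assumes the usual Wikidot convention that a page named `_template` inside a category supplies the layout for the other pages in that category; the function names `blog_pages` and `is_blog_entry` are made up for this illustration.

```python
def blog_pages(wiki, first_blog):
    """Return the three blog-related page addresses for a wiki.

    `wiki` and `first_blog` stand in for myWiki and nameOfMyFirstBlog.
    """
    base = f"{wiki}.wikidot.com"
    return {
        # Assumed convention: shared layout for every page in the blog: category.
        "template": f"{base}/blog:_template",
        # One blog entry; it lives in the blog: category.
        "entry": f"{base}/blog:{first_blog}",
        # The page that lists the blog entries.
        "listing": f"{base}/bloglist",
    }


def is_blog_entry(page_name):
    """True for pages in the blog: category other than the template itself."""
    category, _, name = page_name.partition(":")
    return category == "blog" and name not in ("", "_template")


for role, address in blog_pages("myWiki", "nameOfMyFirstBlog").items():
    print(role, address)
```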

My notes

Search techniques

These are most of the search techniques that we'll cover in today's class.

  1. Special search syntax: These operators let you target your searches on specific parts of documents. Since text in different parts of a document means different things and performs different functions, you can use these operators to raise the precision of your queries.
    • Full text search engines
      • Title — intitle:
      • Site — site:
      • Top-level domain — site:
      • URL contents — inurl:
      • Links — link:
  2. Unique words and phrases: Using multiple unique words and phrases is key both to reducing the number of documents retrieved and to raising the precision of your queries. It also increases the chances of retrieving content-filled documents (that is, it increases the number of “meaty” documents).
    • They can be used to focus in on more specialized pages that would use those terms
    • Gather related words using summaries
    • Use search engines to find related words
      • Example at Ask.com (both “Narrow your search” and “Expand your search”)
      • Google
        • Google Suggest feature
        • “Related searches” at bottom of search results window
      • Yahoo
        • Yahoo Search Assist feature
        • “Also try” at top or bottom of search results window
      • Yahoo Directory (we'll cover this in a future class) can point in the right direction
    • Use "means" queries
  3. Query specificity
    • Narrow to more general: this is when you have a really good idea of what you're looking for.
    • More general to narrow: this is when you don't know what you're looking for.
  4. Alternative naming
    • People
      • Using different name forms can return different information
      • Sometimes you have to use other information to differentiate two identically named people
      • Also, search specifiers can help target the information (intitle, site type, include, exclude)
    • Places
      • Use addresses (streets, zips, area codes, phone numbers)
      • Use "official"
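The special search syntax and the exclude technique from the list above can be combined mechanically. The following is a minimal sketch, assuming the operators behave as described on the search syntax page; `build_query` and its parameter names are hypothetical and are not part of any search engine's API.

```python
def quote(phrase):
    """Wrap a multi-word phrase in double quotes so it is matched exactly."""
    return f'"{phrase}"' if " " in phrase else phrase


def build_query(terms, title=None, site=None, url=None, exclude=()):
    """Assemble a query string from plain terms and special-syntax operators."""
    parts = [quote(term) for term in terms]
    if title:
        parts.append(f"intitle:{quote(title)}")   # word must appear in the page title
    if site:
        parts.append(f"site:{site}")              # a site or a top-level domain
    if url:
        parts.append(f"inurl:{url}")              # word must appear in the URL
    parts.extend(f"-{quote(item)}" for item in exclude)  # leave these out
    return " ".join(parts)


print(build_query(["tigers"], exclude=["Detroit Tigers"]))
print(build_query(["animal"], title="tigers"))
print(build_query(["tigers"], site="org"))
```

Building the query in pieces like this makes it easy to add or drop one operator at a time and watch how the number of results changes.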

Sites

These are the best summaries of the major general search engines that I could come up with. I have also linked to several useful help pages for each site.

  • Google
    • The best, most reliable, fastest, most wide-ranging general purpose search engine. Nice features: Showable "Options" on the left with lots of choices (especially time-related and Related Searches switch). When you're serious about searching, you have to make at least one stop here.
    • Useful pages
  • Yahoo
  • Ask
    • A great search engine for exploring a topic. Nice features: the "Related searches" on the right, the binoculars hiding the page preview and page statistics; also larger images appear on mouse-over. Notice there are sponsored results at the top and bottom of the page.
    • Useful pages
  • Bing
    • A search engine that focuses on the user experience during the search. Nice features: "More on this page" and "Popular Links" in the pop-up bar on the right; "Related Searches" immediately available on left.
    • Useful pages

Useful settings

Each of these search engines provides a way to set up an account and, thereby, set up preferences. I generally use the following preferences:

  • 30-50 results per page — I like the ability to scan more information more quickly
  • Filtering (moderate on Google) — don't want this stuff popping up in the middle of class or a group meeting
  • Open search results in new browser window — this keeps the search results up and available so that they're not so easily lost or closed
  • Turn on search suggestions — I find these to be amazingly useful as I structure queries.

In-class examples

For most of the following I will (by default) use Google as the search engine as a demonstration of the search technique. For the most part, each of these search engines (other than Bing) could have been used.

Special search syntax example: Information about tigers

  1. tigers (31.9 million results)
  2. tigers -"Detroit Tigers" (29.0 million results)
  3. tigers animal (4.61 million results)
  4. animal intitle:tigers (1.45 million results)
  5. Tigers (the animal but not any sports teams):
  6. Information from an organization
  7. Information from an organization or a government
  8. Information from a zoo
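To run the same query at more than one search engine, the query string has to be encoded into a URL. The sketch below shows one way to do that with Python's standard library; the two endpoint addresses are assumptions about how Google and Bing accept a query in the `q` parameter, and `search_url` is a made-up helper name.

```python
from urllib.parse import urlencode

# Assumed search endpoints; each takes the query text in the "q" parameter.
SEARCH_ENDPOINTS = {
    "google": "https://www.google.com/search",
    "bing": "https://www.bing.com/search",
}


def search_url(engine, query):
    """Return a URL that runs `query` at the named search engine."""
    return SEARCH_ENDPOINTS[engine] + "?" + urlencode({"q": query})


# The first four queries from the tigers example above.
queries = [
    "tigers",
    'tigers -"Detroit Tigers"',
    "tigers animal",
    "animal intitle:tigers",
]

for query in queries:
    for engine in SEARCH_ENDPOINTS:
        print(search_url(engine, query))
```

Spaces, quotation marks, and colons are escaped by `urlencode`, so the operators survive the trip into the address bar unchanged.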

Unique words and phrases

  1. Bunch of birds example
  2. Use "means" and "definition" queries: Hydrocephalus
  3. Related words: Investment guidance
  4. Fun with quotes
  5. Lyrics

Query specificity

  1. Dog breed information
  2. Dog breed disease information

Alternative naming

People

  1. George Washington information
  2. Stephen Hawking (as a name example)
  3. Levi Strauss (since there are two/three of them)

Places

  1. Pizza places in Ann Arbor
  2. The Sears Tower (as a landmark)

At end of lecture

  1. Start working on today's exercises. The exercises are on this page. You should work on them for no more (but, probably, no less) than another hour outside of class; we will have more time in the next class after the lecture to continue working on them before going on to that day's exercises.

05 More Search Techniques

by samoore (22 Sep 2009 17:38; last edited on 23 Nov 2009 15:16)

We go through the exercises related to search techniques; we also discuss evaluating sources from the Web.

Class held on 09/23/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
    • This should include you adding your information to the Grades database.
  2. Lecture (but no slides today) going through “My notes”
  3. Work on exercises
    1. Finish the exercises from last class
    2. Then work through the exercises for this class
  4. Work on experiment:

At beginning of class

Before class starts (for you to do)

  1. Check who is doing what:
    • Notes & questions
    • Special blogs (these should be written and posted on this howcanifindit site)
    • Other blogs should be written and posted on your own personal site. I will then tell authors of posts that get "10" grades to transfer the blog to the howcanifindit site.
  2. Look at recent writings for the class:
  3. Stay up with recent information on these pages:
  4. Content update:
    • We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.

Information for me to cover

  1. Grade stuff
    • The Grade Database is evolving, and I need you to help out.
    • If you want a grade for this class, I need you to do the following:
      1. Choose Grades from the Class Info menu.
      2. Use the Student menu to enter your information.
      3. Use the Add Grade Record to create your grade record; all you will have to do is enter your uniqname and save it. This gives you a database record where I can put your grades.
      4. If you have written a blog entry or notes or questions, then enter this information under Add Wiki Assignment.
    • Next steps for me are as follows:
      • To create more reports so that you can see grades that I have entered as well as your summary grades for the semester. I'll keep you posted.
      • To enter your participation grades so far.
      • To grade the blogs you have written.
  2. You should be thinking about the topic for your term project.
    • Be sure to read the description of the assignment.
    • A student from last year's class re-iterated my point that doing a project about a sports team would not be the best use of your time.
    • Also look at the list of industry sectors that you might select.
    • On day 8, which is October 5, you are turning in the first status report for your term project. By that date, you need to have decided on the topic, discussed your topic with me, described the topic on the start page of your wiki, and updated your information (that is, indicated the title of your wiki) on the class wiki's list of student wikis.
      • Every single one of you needs to meet with me next week during office hours!!
  3. Another note for your term project. Your term project reports will include a section on information sources (as we will discuss today). Part of this will be an evaluation of the quality of the information sources that you identified. You will want to describe how you evaluate the sources, and indicate on the report your evaluation of each one of them. This will not be a separate deliverable but should be integrated into the final report.
  4. There are so many blog opportunities from these two classes (i.e., today and Monday). If you want to blog on both classes, you don't have to choose something from “last class” and then something from “this class”. These are both the same topic; you can use any two things you want to blog on from both classes. It doesn't matter if they were the same or different days. (Again, you don't have to blog today, or last class. But you'll have to blog sometime, and you might as well start sooner rather than later.)
  5. If you want to know how I format anything on any of the wiki pages that I have, you can just look at the source yourself.
  6. Talk about the experiment.
  7. Questions about any or all of this?

My notes

Discussion

  1. Long-term research projects, or more difficult queries, require another level of effort and analysis.
    • Gather and save as much information as you can.
      • Use information from the search results, page characteristics, and contents of the results pages.
        • Look for names, contents, concepts, URLs, page titles, unique words, dates, places, facts, etc.
      • Create a wiki site to keep information and links.
    • Sometimes finding a set of related nouns and unique names can help you find what you need.
      • Use Google Sets
      • Use the queries ["type of X"], ["there are * types of X"], ["compared to X"], ["X vs." OR "X versus"]
  2. Evaluate the potential validity of the Web page from which you get information.
    • Facets to evaluation
      • Location of the page
      • Speaker's identity
      • Speaker's motivation
      • Credibility of sources
      • Speaker's history
      • Speaker's reputation
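The four query patterns listed under item 1 of the discussion above (for finding related nouns and unique names) can be generated for any topic. This is a small sketch only; `related_term_queries` is a hypothetical helper, and the `*` is the wildcard that stands for a missing word inside a quoted phrase.

```python
def related_term_queries(x):
    """Return the four pattern queries from the discussion, filled in for topic x."""
    return [
        f'"type of {x}"',
        f'"there are * types of {x}"',
        f'"compared to {x}"',
        f'"{x} vs." OR "{x} versus"',
    ]


for query in related_term_queries("candy bar"):
    print(query)
```

Each query is meant to surface pages that name alternatives to, or categories of, the topic; those names then become the unique words for the next round of searching.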

In-class examples

Candy bars

Types of things

Resources


06 RSS Introduction

by samoore (27 Sep 2009 23:40; last edited on 17 Dec 2009 19:49)

We are going to introduce the topic of RSS feeds, blogs, subscribing to RSS feeds, and the basics of searching for RSS feeds.

Class held on 09/28/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. Go through today's slides (as a PDF).
  3. Demonstrate feed readers: Google Reader and bloglines.
  4. Work on exercises.

At beginning of class

Before class starts (for you to do)

  1. You probably need to talk with me about your project during office hours (MTW, 3:30-4:30).
  2. You should go to the Grades database for this class.
    • If you haven't already done so, enter your personal information under "Student".
    • If you haven't already done so, enter your uniqname under "Add Grade Record".
    • If you have completed a wiki-based assignment (blog entry, industry update, notes, or questions), then enter that information under "Add Wiki Assignment".
    • Do this now.
  3. I will grade blogs by next class. Why? I want to have more turned in before I assign these grades.
  4. Check who is doing what:
    • Notes & questions
      • Notes posted by the end of the class day.
      • Questions (at least the first draft) posted by one week later.
    • Special blogs (these should be written and posted on this howcanifindit site)
      • Post these by the beginning of class on the day you sign up for.
    • Other blogs should be written and posted on your own personal site. I will then tell authors of posts that get "10" grades to transfer the blog to the howcanifindit site.
      • You can post these by the following class (since they are about what we do in class).
  5. Look at recent writings for the class:
  6. Stay up with recent information on these pages:
  7. Content update:
    • We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.

I'll go over

  1. Problems with wikidot (and the class wiki) over the weekend.
    • Web sites go down. They come back up. They work most of the time. But they don't work all of the time.
    • Your work planning must take this into account.
    • Use this very small assignment as a learning opportunity to apply to your tasks for the rest of the semester (or, possibly, your life).
  2. I will give you the results of the experiment on Wednesday. I've delayed this so that I can get the results of students who did not complete the assignment in time.
  3. What this class should be like so far
  4. Any questions about this class so far this semester? Where we're going? Anything at all?

My notes

The Internet is changing all the time. New resources are being added at a phenomenal pace in millions of different sites. You can't keep up with everything on your own. You need help.

It's all about getting computers to work for you, to work while you're not using them. Use the computer to search through information so you don't have to. Use the computer to deliver information to your email inbox or to a specific Web page so you don't have to go get it. You don't have to remember to do the query.

You still have to define the search. You probably have to spend more time up-front when defining the query.

  1. RSS
    1. What it stands for
      • Really Simple Syndication
      • Rich Site Summary
      • RDF Site Summary
    2. RSS is an application of XML.
    3. RSS is an open definition so anyone can use it.
    4. RSS is a standard widely adopted by millions of Web sites
    5. If you have a Web site that is updated relatively frequently, it makes sense to put these updates into an RSS feed.
  2. Compare HTML and XML
  3. For our purposes, what are the benefits of XML (and, hence, RSS)
    1. Can easily be translated into HTML for display purposes
    2. Can specify "fields" that can be searched
  4. So, what does this mean for RSS?
    1. RSS is a common representation for lots of databases and lots of Web sites
    2. This common representation means lots of tools can be specially written to work with that standard (send it, search it, slice it, dice it)
  5. So, what does this mean for you?
    1. Saved time
    2. Saved attention
  6. Classes of RSS feeds
    1. Blogs
    2. Newspaper articles
  7. Online feed readers
    • Why not a feed reader application?
  8. Where can you find RSS feeds
    1. In RSS feed directories (with search)
    2. In searchable subject indices of RSS feeds (with browsing)
    3. On RSS-enabled Web pages
    4. In keyword-based feeds that you create at search engines
  9. Types of RSS feeds
    1. Static feeds
    2. Keyword-based feeds

Resources

Online feed readers

Where can you find RSS feeds

  1. Top lists
  2. In RSS feed database (with search)
  3. In searchable subject indices of RSS feeds (with browsing)
  4. On RSS-enabled Web pages
  5. In keyword-based feeds that you create at search engines

RSS feeds from Wikidot

  1. RSS feeds for a separate page
  2. Be notified: RSS feed guide

07 RSS Lab

by samoore (29 Sep 2009 19:20; last edited on 23 Nov 2009 15:04)

We are going to work through more exercises allowing you to explore RSS feeds and related tools.

Class held on 09/30/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. No lecture today.
  3. Work on exercises for today.
  4. Complete the experiment before this Sunday. You should put your results here.

At beginning of class

Before class starts (for you to do)

  1. Term project
    • I thoroughly enjoyed meeting with most of you during the last 10 days or so and discussing your term projects. I'm really looking forward to seeing these projects develop.
    • Make sure that you have updated the student list page if you have talked with me about your term project topic. Do this now.
    • The first status report is due by the beginning of class on October 5.
  2. You should go to the Grades database for this class.
    • If you have completed a wiki-based assignment (blog entry, industry update, notes, or questions), then enter that information under "Add Wiki Assignment".
  3. You can check to see if I have recorded any grades for you on the SiteMaker page.
    • I have graded everything that was submitted correctly through 9/26/09. I'm catching up!
    • All blog entries, industry updates, notes, and questions are points out of 10.
    • A “9” grade on a blog is what I would call a “normal, high-quality, well-written, informative blog entry.” A “10” means that you exceeded this standard. Your entry was somehow more informative, more insightful, more engaging (don't discount this — I very much welcome reading an interesting well-written entry with a good story integrated into it) than my expectations.
    • If you get a “10” on a blog entry, I want you to copy your blog entry from your wiki to my wiki. Create a page with the same name (i.e., “blog:XXX”) but it should be in the class wiki. Do this as soon as you see your grade. Thanks. This gives other people the chance to learn about 1) what you wrote about in your blog, and 2) what a well-written blog entry looks like.
  4. Check who is doing what:
    • Notes & questions
      • Notes posted by the end of the class day.
      • Questions (at least the first draft) posted by one week later.
    • Special blogs (these should be written and posted on this howcanifindit site)
      • Post these by the beginning of class on the day you sign up for.
    • Other blogs should be written and posted on your own personal site. I will then tell authors of posts that get "10" grades to transfer the blog to the howcanifindit site.
      • You can post these by the following class (since they are about what we do in class).
    • Assigned/due dates
      • Notes — notes to be posted by the end of the class day
      • Questions — questions to be posted (at least first draft) by one week later
      • General blog entries — write-up to be posted by the following class
        • These will be posted on your own wiki. We'll see how to do this today.
      • Industry updates — write-up to be posted on the day listed
        • These will be posted on the course wiki. Ditto.
  5. Look at recent writings for the class:
  6. Stay up with recent information on these pages:
  7. Content update:
    • We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.

I'll go over

  1. Feedback about blogs
    • word usage lesson: it's versus its
    • spelling lesson: "definitely"
    • spelling lesson: lose/loose and choose/chose
  2. Specifics about grading criteria
    • Notes — useful, complete, formatting makes easy to scan.
      • If we don't have a lecture, then summarize what a student should have learned from the in-class exercises.
    • Questions — variety, depth of coverage, usefulness of questions.
      • Again, if we don't have a lecture, then questions should come from what students should have learned from the in-class exercises.
    • Blogs — informs the reader, personal reaction, insight, context, detailed.
  3. Web search experiment results
  4. As for today's experiment… you should complete the experiment before this Sunday at 5pm. You should put your results here.
    • Note that your results are supposed to go in alphabetical order by family name. Please do this. And if you see that someone before you has messed up — go ahead and fix it!!

08 News Search

by samoore (02 Oct 2009 15:45; last edited on 17 Dec 2009 19:46)

We learn about the major news search tools, as well as how to integrate them with your knowledge of RSS.

Class held on 10/05/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. I'll lecture for a bit using some slides (as a PDF)
  3. Work on today's news search exercises.

At beginning of class

On your own

  1. Read the current to-do list on the course home page.
    • I will keep this up-to-date for each class. I hope this will make it easier for you to figure out just what it is that you're supposed to be doing (and when) during the semester.
  2. No grades (too much other prep going on)

Resources

News search

Print newspapers


10 Real Time Information

by samoore (07 Oct 2009 02:41; last edited on 17 Dec 2009 19:39)

We learn about real time information exchange and social networks, and get an introduction to real time search tools.

Class held on 10/07/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. I'll lecture for a bit — but no slides today
  3. Work on today's exercises

At beginning of class

On your own

  1. Read the current to-do list on the course home page.
  2. No grades (too much other prep going on)

What I'll cover

  1. I loaded all of the slides from this semester for each day I did a lecture.
  2. RSS search experiment results
  3. Search engine analysis assignment

Notes

  1. When Twitter results are included with other results, they overwhelm everything else.

Communication channels

Channel | Mobile | Private | Public | Conversation based | Length | Concurrent
email ? x x varied
chat x x x x short x
microblog x x x ? short
texting x x x varied
Facebook x x x x varied ?
blogging x medium

Uses of Twitter

  • For business
    • polls
    • advertise events
    • follow what people are saying about your product
    • reminder of events and information
    • bring attention to Web items, YouTube
    • provide personal touch with customers & clients
  • For personal
    • stay in touch with friends
    • share photos, videos
    • monitor what's really current
    • get answer to a question

twitter.jpg

Resources

General

  1. bit330 realtime at delicious
  2. The ultimate guide for everything Twitter
  3. An Illustrated Guide To Using Twitter
  4. Hashtags explained
  5. 50 Useful Twitter Tools for Writers and Researchers

Real-time search

  1. Scoopler (about)
  2. OneRiot (about)
  3. Collecta — realtime search for blog posts, articles, comments, twitter, flickr, twitpic, youtube
  4. MicroPlaza — "welcome to your personal micro-news agency. Discover relevant information filtered by the people you follow."
  5. Surchur — "the dashboard to right now";
  6. Addict-o-matic — "inhale the web"; instantly create a custom page with the latest buzz on any topic
    • seems best for actively tracking a topic versus coming to it with a random query
  7. Yauba — search that tries to be all things to all people
  8. FeedMil (about) — "real-time feed search"

General search based on real-time information flow

  1. Topsy — "a search engine powered by tweets"

Twitter search

  1. Search at Twitter
  2. Tweetzi (help) — "real-time Twitter search & trends"
  3. TweetMeme — "hottest links on twitter"
  4. TweetGrid (how-to, FAQ) — "create a Twitter search dashboard that updates in real time."
  5. CrowdEye (about) — "what all the twitter is about"
  6. Twitter Power Search
  7. BackType (about) — "a conversational search engine"
  8. TwiST — "Twitter Search Tool"
  9. Tweetag
  10. Combining Twitter search with regular Web search
    • Twiogle — "search twitter & google at the same time"
    • BingTweets — "fusing twitter trends with bing insights"

Twitter trends

  1. TwitScoop (watch the live buzz cloud; search; hot trends)
  2. Twazzup — "search twitter. get real insights."
  3. Twendz (about) — "exploring Twitter conversations and sentiment"
  4. Twopular — "trends on Twitter aggregator"
  5. Twemes (about) — "twitter memes — global tags for twitter" (hashtag grouping and searching)
  6. Twitt(url)y (about) — "We track and rank what URLs people are talking about on Twitter."
  7. Retweet Radar — "Finding trends in the mountains of information 'retweet'ed on Twitter."
  8. Trendistic (help) — "see trends in twitter"
  9. TweetVolume — find out how frequently specific words appear in tweets.

Distribution

  1. TwitPic — "share photos on twitter"
  2. TweeTube (about)— "sharing stuff on Twitter"

Local search

  1. NearbyTweets — finds tweets in your neighborhood
  2. AskTwitr — look at tweets on a map
  3. TrendsMap — "real-time local Twitter trends"

Twit search

  1. Twittorati — "Twittorati tracks the tweets from the highest authority bloggers, starting with the entire Technorati Top 100 and soon including many more of the web's most influential voices."
  2. TwitSeeker (about)— "who you're looking for…by what they're talking about"
  3. Twibs — Twitter business directory
  4. LocalTweeps — "a ZIP-code level Twitter directory"

Other Twitter tools

  1. BackTweets — "search for links on Twitter"
  2. Cloud.li — real time cloud generation of search results
  3. TwitterFall — "Twitterfall is a way of viewing the latest 'tweets' of upcoming trends and custom searches on the micro-blogging site Twitter. Updates fall from the top of the page in near-realtime."
  4. PollDaddy — polls on Twitter
  5. TwitLinks — "The latest links from the worlds top tech twitter users."
  6. MySkyStatus — Real time flight tracking updates. Shows your flight's location to twitter and facebook followers

Organization

  1. TwTask — manage to-do lists from Twitter
  2. Twit2Do (faq) — "create to-do lists with twitter"
  3. Postica (faq, twitter) — create and share sticky notes across the Web

Other real-time tools

  1. uberVU (tour, tools) — "easy way to find and follow conversations"
  2. ReadTwit (about)— converts Twitter feed into an RSS feed (with URLs un-shortened, filter users in/out of feeds, and filter out #hashtags).
  3. Near real-time search on Google
  4. Enable real-time updating of RSS feed readers from publishing blogs
    1. RSSCloud
    2. PubSubHubBub

Using Twitter

  1. TweeTree (about) — "we built this site to help us better use Twitter"
  2. TweetCloud — "what's being said?"; gives you a tag cloud that helps summarize the tweets of a specific user. Can help you decide whether or not to follow a twit.
  3. Twitter Karma — helps you see whether or not you are following your followers, whether your followers are following you. (Click on the "Whack!" button after logging in to Twitter.)

12 Research Sites

by samoore (20 Oct 2009 17:59; last edited on 17 Dec 2009 19:41)

We learn about several kinds of academic research sites. We also learn what the Deep Web is, why we need to care about it, and how we might go about accessing it.

Class held on 10/21/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. I'll lecture for a bit (no slides today).
  3. Work on exercises.

At beginning of class

On your own

  1. Read the current to-do list on the course home page.
  2. No grades (too much other prep going on)

What I'll cover

  1. Project stuff
    • RSS feeds vs current events stuff
  2. Grading stuff
    • Dean
    • Priority: status report feedback

My notes

  1. General Web search
    • Suggests that all information can be searched within one system
    • Easy and self-explanatory
    • Has only a limited understanding of "structure"
  2. The Invisible Web
    • "Invisible" to the general search engines since they don't index it
    • You'll hear about the "Invisible Web" or the "Deep Web" — same thing
    • Pages that are invisible
      • Disconnected page
      • Page consisting primarily of images, audio, video
      • Flash, Shockwave, compressed files
      • Content retrieved as a result of filling out forms
      • Real time information (ex: stock quotes)
      • Pages that are proprietary
    • Significance of the Invisible Web
      • Bergman's widely-cited statistic is that there are 550 billion documents in the invisible Web
        • Others believe it's more like 20-100 billion
      • Estimated that there's about 300K Web sites with queryable databases
      • 60 of the largest Deep Web sites containing about 750 terabytes of data
  3. Academic Web-based search
    • More academic content is moving to the Web exclusively
    • Part of general trend from print to electronic
    • Much of this is contained in the Invisible Web
  4. Explain how search engines work
    • General
      • Crawlers go out and send information back to the central database
      • Queries go against the central database
      • SE company expertise is in design of the index and design of the query process (including input interface and output formatting and reporting)
    • Academic
      • Crawlers go out, find a database, and what? Index the query interface page? Send some standard queries to the index and save the results?
  5. Should you consider using Google Scholar?
    • Pros
      • A cross-database (federated) search engine
      • Returns snippets from articles (and sometimes abstracts)
      • Indexes the full text (actually, part of the full text) and not just the abstracts and subject terms
      • Can link to your own school's library
    • Cons
      • Secretive about its coverage of specific publishers, journals
      • Limits its searches to the first 100-120K of a page
      • Hasn't been updated much (at all?) since its launch
      • Returns far fewer documents than the native search engines
      • Searching by field is fairly unreliable and counter-productive
  6. What do we want from an academic search engine?
    • Comprehensive
      • Contains lots of journals over lots of topics
      • Goes far back in time
      • Up-to-date
    • Integrated across databases
    • Integrated into a database
    • Transparent as to what it contains or doesn't contain
  7. Recommendation
    • Use Google Scholar
      • as a way to find free, online versions of articles you already know you want
      • like you use Wikipedia — as a good starting place for exploring
    • Use the other Deep Web search tools — Scirus, Turbo10, plus the LII.
    • To do a complete search, you should definitely talk to a librarian and use the Library's immense set of resources.
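
Note 4 above separates crawling, building the central database, and answering queries. Here is a minimal Python sketch of that split, assuming a toy "crawl" of three made-up pages (the URLs and text are invented for illustration). Real search engines are far more elaborate than this.

```python
from collections import defaultdict

# Hypothetical crawl results: URL -> page text (invented for illustration).
pages = {
    "a.example/1": "carbon trading markets in Europe",
    "a.example/2": "low carb diets and weight loss",
    "b.example/1": "carbon tax versus carbon trading",
}

def build_index(pages):
    """Central database: map each word to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def query(index, terms):
    """Answer a query from the index alone; every term must match (AND)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return sorted(set.intersection(*sets)) if sets else []

index = build_index(pages)
print(query(index, ["carbon", "trading"]))
```

Notice that the query never touches the pages themselves, only the index. A page that the crawler cannot reach (for example, content that appears only after filling out a form) never gets into `pages`, so no query can find it. That is the Invisible Web problem in miniature.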

In-class demonstration and discussion

  1. Google Scholar (the gorilla in the room)
    1. Basics
      1. intitle:"carbon trading" — 472 (271 citations in 2008)
        • Cited by
        • Referenced by (under “Related articles”)
        • Web search
        • Availability at UM library (set up under "Scholar Preferences")
        • "Recent articles" vs. "All articles"
    2. Weird logic — that appears to have been fixed in 2009!
      1. the — 10.6 million records (2.03 billion in 2008)
      2. a — 10.8 million records (13.1 million in 2008)
      3. a OR the — 11.2 million records (13.6 million in 2008)
    3. Subject groups
      1. intitle:Vietnamese — 11,000 records (9,690 in 2008)
      2. allintitle:Vietnam — 98,900 records (816,000 records in 2008) (all subject areas)
      3. allintitle:Vietnam — 23,600 records (29,100 records in 2008) (with all of the subject areas checked)
      4. allintitle: Vietnam OR Vietnamese — 109,000 records (104,000 in 2008; notice that this is less than the 816,000 found for Vietnam alone above)
      5. allintitle: Vietnam OR Vietnamese — 30,000 records (141,000 in 2008) (with all of the subject areas checked)
      6. Publication year strangeness
        1. intitle:Vietnam 1435-2008 — 20,200 records
        2. intitle:Vietnam 1960-2008 — 20,900 records
        3. intitle:Vietnam 2010-2050 — 2 records
  2. Scirus (deep web search competitor)
    1. title:"low carb" "low fat" "weight loss" — 560 hits
      • Ability to filter on the left (sources, file types)
      • Recommendations of refining your search on the left
      • Save or email the results.
      • Sort by relevance or date.
      • Similar results
  3. Google Books (book-based)
  4. UM Library (library-based)
  5. Biznar (specialized deep web search)
  6. BNet (another specialized search tool)
    1. carbon trading
      • Content types to right
      • RSS feeds
  7. Wolfram|Alpha (computational knowledge)
  8. Yahoo Directory (Web site directory)
    • Explore Business sites
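
The Google Scholar counts in the demonstration above can be checked mechanically. A query `A OR B` matches every record that `A` alone matches, so its count should never be smaller than either single-term count. The following Python sketch applies that one rule to the hit counts listed above; the function name is my own, and the check is only a sanity test, not a description of how Google Scholar counts.

```python
def or_count_is_plausible(count_a, count_b, count_a_or_b):
    """A OR B must match at least as many records as A alone or B alone."""
    return count_a_or_b >= max(count_a, count_b)

# Hit counts copied from the Google Scholar demonstration above.
checks = {
    "a OR the (2009)": (10_800_000, 10_600_000, 11_200_000),
    "a OR the (2008)": (13_100_000, 2_030_000_000, 13_600_000),
    "Vietnam OR Vietnamese (2009)": (98_900, 11_000, 109_000),
    "Vietnam OR Vietnamese (2008)": (816_000, 9_690, 104_000),
}

for label, counts in checks.items():
    verdict = "plausible" if or_count_is_plausible(*counts) else "inconsistent"
    print(label, verdict)
```

Both 2009 rows pass and both 2008 rows fail, which matches the note that the weird logic appears to have been fixed in 2009.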

Possible blog entries

There are two possible blog entries related to this class — you can write one, both or neither of these. But I would find these interesting.

  1. Write a blog entry on what you observed, what you learned and found interesting, focusing on information that other students might find useful.
  2. Go talk to a Ross librarian. Tell them your topic and ask which 3 to 5 databases or tools you might find most useful given that topic. See what databases they tell you to focus on. Use them for a while. By the end of the semester, write a blog entry describing how the information you find in these databases differs from what you would find in the Web at large or what you found in the Deep Web search tools we were introduced to above.

BTW, I would find it rather remarkable if you didn't have in your term project a section or group of resources or something related to information a person could get in a library's database (compared with Deep Web and the Web itself).

Resources

Research tools

Primary

The following sites are traditional Deep Web search sites. Each one of these takes a different approach to accessing documents in the Deep Web, so they're each worth trying.

  1. Google Scholar
  2. Scirus
  3. IncyWincy — the invisible Web search engine
  4. DeepDyve

Library- and book-based

Each of these tools provides a different way of accessing information in books. Lots of resources are being thrown at Google Books so we should definitely keep our eyes on it as more books enter the system.

  1. University of Michigan Library
  2. Google Books
  3. Amazon Advanced Book Search — Yes, I am including Amazon, the book seller, on this list.
  4. WorldCat

Specialized Deep Web search

Each of these is a deep web search engine but the underlying document sets are specialized.

  1. Green Info Online
    • Review on Peter's Reference Shelf
    • Be sure to look under "Search Options", "Advanced Search", and "Visual Search"
    • At the top of the screen, be sure to look at "Publications" and "New Features!"
    • "GreenFILE offers well-researched information covering all aspects of human impact to the environment. Its collection of scholarly, government and general-interest titles includes content on the environmental effects of individuals, corporations and local/national governments, and what can be done at each level to minimize these effects. Multidisciplinary by nature, GreenFILE draws on the connections between the environment and a variety of disciplines such as agriculture, education, law, health and technology. Topics covered include global climate change, green building, pollution, sustainable agriculture, renewable energy, recycling, and more. The database provides indexing and abstracts for approximately 384,000 records, as well as Open Access full text for more than 4,700 records."
  2. BNet — management, strategy, work life skills & advice for professionals. This is more of a collection of useful business-related information but I couldn't figure out where else in this course to let you know about it. So here it is.
  3. Biznar — deep web business search
  4. Mednar — deep web medical search
  5. ScienceResearch.com — "the world's science all in one place"
  6. Science.gov

General reference and answers

Each of these sites provides access to sets of facts and answers to questions. The first is a computational knowledge engine and the other sites have well-organized sets of traditional articles and entries about specific topics.

  1. Wolfram|Alpha
  2. Information Please Almanac
  3. Encyclopedia.com
  4. Britannica
  5. Wikipedia

Secondary deep web sites

These are worth peeking at if you need some more information. Each one of these provides reliable resources.

  1. InfoMine (UCal, Riverside)
    • Isn't being updated any more but still seems useful
  2. Directory of Open Access Journals — 1673 (1262 in 2008) journals are searchable at the article level, 319,861 (211,294 in 2008) articles.
  3. Bing

Web directories

The purpose of each one of these sites is to provide an organized and categorized set of Web sites that have been evaluated for usefulness. Each one of these is worth looking at to see if you might get lucky.

  1. Yahoo Directory
  2. Google Directory
  3. Intute — "Helping you find the best websites for study and research"
  4. Librarian's Internet Index
    • Overview: describes who they are, what they do, and what you might expect to get from looking at their site.
  5. Internet Public Library

Pay sites

Each one of these sites is quite useful but they require you to pay so I'm guessing you are out of luck; however, when you get out to the working world remember that these exist. You might be able to gain access to them through your employer.

  1. Web of Science
  2. Scopus
  3. OECD Factbook 2009

In development

I have these listed here just so that I can remember to look at them in future years to see if they have evolved into something more useful than their current condition.

  1. Q-Sensei
    • Includes the Library of Congress (I believe).
  2. DeepPeep
    • About
    • "DeepPeep is a search engine specialized in Web forms. The current beta version tracks 13,000 forms across 7 domains."

Dead

Each one of these was a viable deep web search engine but now they are not worth investigating or don't exist in any form.

  1. CompletePlanet — 70K databases (but appears to be dead as of 2004!)
  2. Turbo10
  3. Microsoft Live Search Academic — closed down in May 2008.
  4. OAIster: find the pearls
    • Integrated into WorldCat in October 2009

Articles

  1. Exploring a 'Deep Web' that Google can't grasp, NYTimes, February 22, 2009
  2. Accessing the Deep Web
  3. Exploring the academic invisible Web
  4. Google Scholar revisited by Peter Jacsó, Online Information Review, 32:1, 2008, pp. 102-114.
  5. The Deep Web: Surfacing hidden value
    • As summarized by the editor of The Journal of Electronic Publishing: "Michael K. Bergman, whose BrightPlanet company offers a new approach to search engines, examines the wealth of information that is available only on dynamically created Web sites, those that don't exist except as relational databases until someone seeks information from them. As more sites adopt the dynamic approach to pages, they are creating a challenge for standard search engines. This article looks at some alternatives."
  6. Search engine technology and digital libraries: Libraries need to discover the academic internet
  7. Google Scholar -- a new data source for citation analysis, by Anne-Wil Harzing, February 5, 2008 (7th version).

E-books

  1. Google Book Search
  2. Project Gutenberg
  3. American Memory (by the U.S. Library of Congress)
  4. Million Book Project
  5. Google Electronic Text Archives

Other


13 Change Notification Tools

by samoore (25 Oct 2009 14:24; last edited on 23 Nov 2009 15:05)

We are going to discuss different tools that can notify you in different ways and in different circumstances when some specific thing has changed on the Web: email alerts, page monitoring software, and RSS feed-manipulation software.

Class held on 10/26/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. I'll lecture for a bit (no slides today).
  3. Work on exercises.

At beginning of class

On your own

  1. FYI, I added a scanned copy of the diagram I created for the real-time information class.
  2. Read the current to-do list on the course home page.
  3. No grades (too much other prep going on)
    • I am about 1/4 the way through the status reports. I'm working on them, I promise.
    • I haven't graded any blogs in a very long time.

My notes

change-notification.jpg

Monitoring changes

  1. Email alert service
    • Monitor entire site
    • These are set up by the Web site and you subscribe to them
    • No false positives
    • Sometimes you want email (cell phone! or even Messenger)
  2. Page monitors
    • Monitor specific pages (but not sites)
    • Lots of false positives unless keyword based
  3. RSS feeds
    • Problem: False positives
      • Unless keyword based or filtered somehow
    • Focused RSS feed — If you’re lucky, there is a keyword-based, or specific-topic defined, RSS feed available for a site you can subscribe to.
      • Specific sites (findable in all the usual ways)
      • Dapps at Dapper.net
      • Pipes at Yahoo Pipes
    • General RSS feed: If there's simply a general RSS feed (such as "Yahoo breaking news"), then you should run that feed through a keyword tool:
    • The following are useful if there's no RSS feed available on a page but you would like to set one up:
      • FeedYes: I would try this first since it's the easiest to use when setting up a feed.
      • Feed43: This is more powerful but more difficult to use.
      • Dapper: This is another powerful tool.
  4. Why not just use RSS
    • Some sites don't have RSS feeds
      • So use site-based email alerts
      • Or use a tool to make an RSS feed
    • Some information isn't site based
      • So use search-based email alerts
    • Some information is too fine-grained to be covered by RSS feeds
      • So use page monitors
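
The false-positive problem with a general RSS feed, and the keyword filtering that fixes it, can be sketched in a few lines of Python. The feed below is made up and inlined so that the sketch runs without a network connection. The function name is my own; this is not how any particular filtering tool is implemented.

```python
import xml.etree.ElementTree as ET

# A made-up general feed, inlined so the sketch runs without a network.
FEED = """<rss version="2.0"><channel><title>Breaking news</title>
<item><title>Copper prices climb</title><description>Metals rally.</description></item>
<item><title>Election results</title><description>Votes counted.</description></item>
<item><title>Mining update</title><description>New copper mine opens.</description></item>
</channel></rss>"""

def filter_items(feed_xml, keywords):
    """Return titles of items whose title or description mentions a keyword."""
    root = ET.fromstring(feed_xml)
    wanted = [k.lower() for k in keywords]
    matches = []
    for item in root.iter("item"):
        text = " ".join([
            item.findtext("title", default=""),
            item.findtext("description", default=""),
        ]).lower()
        if any(k in text for k in wanted):
            matches.append(item.findtext("title", default=""))
    return matches

print(filter_items(FEED, ["copper"]))
```

Without the filter you would be notified about all three items. With the keyword "copper" you only see the two that matter to you.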

Email alerts

Finding email alerts

  1. Search for email alerts
    • Query: "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
    • Google results (189 million in 2009) (77.2 million in 2008) (60.2 million in 2007)
    • Yahoo results (464 million in 2009) (392 million in 2008)
  2. More specific search for email alerts
    • Query: inurl:mail OR inurl:alert "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
    • Google results (88,100 in 2009) (115,000 in 2008)
    • Yahoo results (280,000 in 2009) (242,000 in 2008)
  3. Science email alerts
    • Query: science "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
      • Google results (38.6 million in 2009) (17.1 million in 2008) (2.34 million in 2007)
      • Yahoo results (72.3 million in 2009) (69.9 million in 2008)
    • INURL query
  4. Copper email alerts
    • Query: copper "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
    • INURL query
  5. So, think about how you might apply this both to a company you are interested in and to an industry you are interested in
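
The queries above all follow one pattern: an optional topic word, an optional `inurl:` restriction, and the four spellings of "email alert". Here is a small Python sketch that builds them for any topic. The function name is my own, and where the topic goes in the `inurl:` version is my assumption, since the notes above only label it "INURL query".

```python
from urllib.parse import quote_plus

# The four spellings used in every query above.
ALERT_PHRASES = ['"email alerts"', '"e-mail alerts"', '"email alert"', '"e-mail alert"']

def alert_query(topic="", restrict_url=False):
    """Build a search query for finding email alert sign-up pages."""
    parts = []
    if restrict_url:
        parts.append("inurl:mail OR inurl:alert")
    if topic:
        # Assumption: the topic word follows the inurl: restriction.
        parts.append(topic)
    parts.append(" OR ".join(ALERT_PHRASES))
    return " ".join(parts)

print(alert_query("copper"))
# URL-encoded form, ready to use as the value of a search engine's query parameter.
print(quote_plus(alert_query("copper", restrict_url=True)))
```

Swap in a company name or an industry term as the topic to get a starting query for your own project.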

General email alert services

  1. Yahoo Alerts
    • Some types of alerts
    • Be sure to look over the whole list of categories of alerts.
  2. Google Alerts (help)
    • All of this is based on submitting queries
      • Once a day
      • Once a week
      • "As it happens"
    • Broad-ranging alerts
      • Web & comprehensive alerts
    • More specific
      • Keyword-based alerts for news, blogs, video and groups
    • Can receive as email or as an RSS feed

Page monitoring software

Overview

Page monitors were the next big thing five years ago. A page monitor is either a Web-based service or a program that you download. Each day (or at whatever interval you set) it downloads the Web page, and if the page has changed since the last check it sends you an email. Some monitors tell you what has changed, while others just tell you that something has changed.

At first, you might not be that impressed with page monitors. But once you realize that they can be used for a lot more than news, they can be quite a useful tool. WatchThatPage.com is the best free site.

WatchThatPage has a limit of 250 characters for the URL. Also, shortened URLs (from tinyurl.com or bit.ly) do not work. To get around these problems, use TrackEngine, which has neither limitation.
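
To make the idea of a page monitor concrete, here is a minimal Python sketch of the compare step. It assumes that you already have yesterday's and today's copies of the page as text; the downloading, scheduling, and emailing are left out. The function names are my own, and this is not how WatchThatPage or TrackEngine are actually built.

```python
import difflib
import hashlib

def fingerprint(page_text):
    """Short digest used to tell quickly whether a page changed at all."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()

def check_page(old_text, new_text, keywords=None):
    """Compare two snapshots and return the added lines worth an alert.

    With keywords, only added lines containing one of them are reported,
    which cuts down on false positives.
    """
    if fingerprint(old_text) == fingerprint(new_text):
        return []  # nothing changed, so no email
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    added = [line[2:] for line in diff if line.startswith("+ ")]
    if keywords:
        added = [l for l in added if any(k.lower() in l.lower() for k in keywords)]
    return added

# Made-up snapshots of the same page on two days.
yesterday = "Course news\nNo announcements"
today = "Course news\nExam moved to Friday\nPage updated 10/26"

print(check_page(yesterday, today))
print(check_page(yesterday, today, keywords=["exam"]))
```

The first call reports every added line, including the uninteresting "Page updated" line. The second call shows why keyword matching matters: only the exam line gets through.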

Web-based

  • WatchThatPage
    • Free (for any number of pages), or $20/year for priority service
    • Can highlight changes in pages
    • Changes sent in an email
    • Keyword matching
    • This site doesn't appear to be updated any more (4+ years)
  • TrackEngine
    • Free for 5 bookmarks, or $20/year for 10 pages, or $59/year for 50 pages
    • Highlights new content in HTML email
    • Monitors changes daily
    • Does do keyword matching
    • This site hasn't been worked on for 7+ years
  • Other possible sites: InfoMinder, ChangeDetect, Trackle

Windows software

Feed creation software

Overview

Make a feed

From other feeds
From a page
  • Dapper
  • FeedYes
  • Feed43
    • Feed43 is a little bit more complicated. You have to find the actual HTML within the source code of the page.
      • Define extraction rules by finding the specific places (within the code) that hold the information you want the RSS feed to monitor. There are directions for what specific code to use in the program.
      • Then click extract
      • Then you can give it a title, description, URL, etc.
      • Then specify where the title, date, etc. of each item are
    • If a site is updated only once a month, it's too much of a hassle to make one of these feeds (use a page monitor instead). But if it is updated daily and you want to monitor it, then it might be a good idea to make one!
    • Free, or $29/year for 20 hourly updates
    • My feeds
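
The extraction-rule idea behind tools like Feed43 can be sketched with an ordinary regular expression in Python. The page source and its markup below are made up, and Feed43 has its own pattern syntax that this sketch does not reproduce. The point is only that you describe where the title, link, and date sit in the HTML, and each match becomes one feed item.

```python
import re

# Made-up page source; the markup is an assumption for illustration.
HTML = """
<ul id="news">
  <li><a href="/post/1">Copper output rises</a> <span class="date">2009-10-20</span></li>
  <li><a href="/post/2">New mine approved</a> <span class="date">2009-10-26</span></li>
</ul>
"""

# Extraction rule: one named group per piece of a feed item.
ITEM_RULE = re.compile(
    r'<li><a href="(?P<link>[^"]+)">(?P<title>[^<]+)</a>\s*'
    r'<span class="date">(?P<date>[^<]+)</span></li>'
)

def extract_items(html):
    """Apply the rule to the page source and return one dict per feed item."""
    return [m.groupdict() for m in ITEM_RULE.finditer(html)]

for item in extract_items(HTML):
    print(item["date"], item["title"], item["link"])
```

If the site changes its HTML, the rule stops matching and the feed goes quiet. That is the main maintenance cost of a feed built this way.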

Examples

Email filtering

  1. Gmail
    • Limit around 7.2GB (4.5GB in October 2007)
    • Can use a filter
      • To forward just some emails (to different people?)
      • To apply a label to emails
    • Plus addressing
      • A powerful method for email alerts is to use “plus addressing” when you sign up for an alert (e.g., one based on some query): tell the service that your address is dummy+someQueryIdentifier@gmail.com instead of the normal address dummy@gmail.com. Mail sent to the plus address still arrives in your regular mail account, and you can filter it by what comes after the plus. This is extremely helpful since it makes it easier to filter emails.
      • Description for GMail
      • Use a different address for each email alert
      • Helps you filter
      • Helps you track who is selling your email address
  2. Defining a filter
    • Keep definition to a minimum, as simple as possible
    • Test, test, test
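
Here is a small Python sketch of what a filter on plus addresses does. The tags and label names are hypothetical. In practice you would define the filter inside your mail service rather than write code, so treat this only as an illustration of the matching logic.

```python
def plus_tag(address):
    """Return the part between '+' and '@' in an address, or None if absent."""
    local, _, _domain = address.partition("@")
    _base, plus, tag = local.partition("+")
    return tag if plus else None

# Hypothetical filter rules: plus tag -> label to apply.
LABELS = {"copper": "Alerts/Copper", "science": "Alerts/Science"}

def label_for(address):
    """Pick a label from the plus tag; untagged mail stays in the inbox."""
    return LABELS.get(plus_tag(address), "Inbox")

print(label_for("dummy+copper@gmail.com"))
print(label_for("dummy@gmail.com"))
```

Because each alert gets its own tag, a message that arrives with the "copper" tag from a sender you never gave it to also tells you who passed your address along.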

Tools you now have at your disposal

  • Method to follow to find site-based email alerts
  • Tools to create search-based email alerts
  • Tools to monitor Web pages for any changes to their contents
  • Tools to apply keyword-based filters to RSS feeds
  • Tools to convert tabular Web page content to an RSS feed

Your term project

Email alerts and your term project

You should do the following for your project wiki:

  1. You should figure out some way that you are going to document the email alerts that you use in your email account to route your incoming alerts. Maybe print the alert page to a PDF file and link it to your wiki? Maybe take a screenshot of your email inbox and highlight the email alerts?
  2. In either case, you are going to want to have a section in your wiki called "Email alerts".
  3. On this page you should describe each of the email alerts that you used: the page from which you subscribed to it, why it is useful, and if there are any keywords (or such) that you used to generate it.

All of the above also applies to your page monitors, any feeds you create using FeedYes/Feed43/Dapper, and any feeds you filter using FeedRinse or Yahoo Pipes.

Possible blog topics

You do not have to write a blog. These are suggested blog topics if you were to write one. There are lots of possibilities in this class.

  • Describe different ways that you found these tools useful (or not useful).
  • Describe how you used Yahoo Pipes, possibly differently than how we have described them here.

Hints about possible test questions

You're definitely going to be held responsible for the following topics:

  1. What WatchThatPage (as an example of a page monitor) can do
  2. What Dapper can do
  3. What Feed43 can do and how its search patterns work
  4. What Yahoo Pipes can do and how feeds can be manipulated (for example, Fetch Feeds, Union, Filter, Sort)
  5. Under what circumstances would you use each one of these tools (as opposed to another)
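
As a rough illustration of item 4, the four Yahoo Pipes operations can be mimicked on in-memory data. This is only a Python sketch, not Pipes itself: the feed items and the helper names (`union`, `filter_items`, `sort_items`) are made up for the example, and Fetch Feeds is simulated with hard-coded lists rather than real RSS downloads.

```python
from datetime import date

# Hypothetical feed items; in Pipes, "Fetch Feed" would download these from RSS URLs.
feed_a = [
    {"title": "Carbon trading bill advances", "published": date(2009, 10, 20)},
    {"title": "Local sports roundup", "published": date(2009, 10, 21)},
]
feed_b = [
    {"title": "Carbon markets explained", "published": date(2009, 10, 19)},
]

def union(*feeds):
    """Union: merge several feeds into one list of items."""
    merged = []
    for feed in feeds:
        merged.extend(feed)
    return merged

def filter_items(items, keyword):
    """Filter: keep only items whose title contains the keyword (case-insensitive)."""
    return [item for item in items if keyword.lower() in item["title"].lower()]

def sort_items(items):
    """Sort: newest items first."""
    return sorted(items, key=lambda item: item["published"], reverse=True)

# Fetch -> Union -> Filter -> Sort, in the same order you would wire the modules.
pipeline_output = sort_items(filter_items(union(feed_a, feed_b), "carbon"))
titles = [item["title"] for item in pipeline_output]
```

The order of the steps matters for efficiency but not for the result here: filtering before sorting just means fewer items to sort.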

14 Custom Search Engine

by samooresamoore (27 Oct 2009 13:57; last edited on 17 Dec 2009 19:42)

We discuss custom search engines, and how you can build your own.

Class held on 10/28/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. Explain what custom search engines are
  3. Work on exercises

At beginning of class

  1. Assignments, you, and me
  2. Changes (this topic, this class)

My notes

  1. What is it
    • A search tool that uses Google's search engine (as the back end) but that you can instruct in the following ways:
      • Look in a certain list of URLs (pages, whole sites, or subsets of sites)
      • Avoid a certain list of URLs
      • Append a set of terms to any user-supplied query
      • Customize its looks (within bounds)
  2. How can it be used
    • Its own Web page
    • An iGoogle widget
    • Embedded in any Web page
  3. Why would you use it
    • Captures creator's knowledge of the field
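
The instructions listed under "What is it" can be pictured as a rewrite of the user's query. The sketch below is only a mental model in Python: Google Custom Search Engine is configured through its own control panel, not through code like this, and the function name `build_custom_query` and the example sites are hypothetical. It uses ordinary Google query operators (`site:`, `OR`, and `-` for exclusion) to show the effect.

```python
def build_custom_query(user_query, include_sites=(), exclude_sites=(), extra_terms=()):
    """Rewrite a user's query the way a custom search engine conceptually does."""
    parts = [user_query]
    # Append a fixed set of terms to whatever the user typed.
    parts.extend(extra_terms)
    # Look only in a certain list of URLs (pages, whole sites, or subsets of sites).
    if include_sites:
        included = " OR ".join(f"site:{site}" for site in include_sites)
        parts.append(f"({included})")
    # Avoid a certain list of URLs.
    parts.extend(f"-site:{site}" for site in exclude_sites)
    return " ".join(parts)

query = build_custom_query(
    "image search",
    include_sites=["example.edu", "example.org/blog"],
    exclude_sites=["spam.example.com"],
    extra_terms=["tutorial"],
)
```

Here `query` ends up as a single string that any Google search box would accept, which is why a custom search engine "captures the creator's knowledge": the site lists and extra terms are the knowledge.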

In previous years students explored several custom search engine tools: Topicle, Eurekster Swicki, and Rollyo. Over the last year or so, however, these sites have been completely displaced by the Google Custom Search Engine, so that is the only tool we are going to look at today.

Blog topics

  1. Describe how useful or not Google Custom Search Engine is for your site.
  2. Describe how you chose the sites to include in your custom search engine.
  3. Compare and contrast 2 (or more) different custom search engines.

Resources

  1. Topicle
  2. Eurekster Swicki
  3. Rollyo
  4. Google Custom Search Engine
  5. BuildASearch
  6. Yahoo Search BOSS — Build your Own Search Service
  7. 3 Guides to FireFox Quick Searches (Smart Keywords)

15 Project Day

by samooresamoore (02 Nov 2009 17:48; last edited on 02 Nov 2009 17:48)

We worked on our projects

Class held on 11/02/2009. (student notes; possible questions).

We just worked on our projects.


16 Wikidot Day

by samooresamoore (04 Nov 2009 15:03; last edited on 04 Nov 2009 16:24)

We go through some special features of Wikidot while also quickly discussing some tools for investigating the popularity of certain Web sites.

Class held on 11/04/2009. (student notes; possible questions).

Web site popularity

Wikidot issues

  1. Question from student: "We have blog:blogtitle and we can see them all by going to bloglist. Can we make a template for other things, like to list feeds or reviews?"
    • Create (e.g.) news:_template (and have the contents be something like "blog:_template")
    • Create (e.g.) news-stories (and have the contents be something like "bloglist")
    • You can also look at my code on the homepage for the list of announcements (annc), schedule items (sched09), and blogs (blog).
  2. Images not showing up
    • Make sure the image name is allOneWordWithNoSpaces.
  3. Header images
  4. Help with coding Wikidot

17 Image Search

by samooresamoore (06 Nov 2009 14:31; last edited on 23 Nov 2009 15:06)

We discuss and explore the variety of image resources and search tools available.

Class held on 11/09/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. Go through diagram explaining page monitors & RSS filters
  3. Work on exercises (tweets)

At beginning of class

  1. No new grades.
  2. Tags on your pages (on this site and your own site).

My notes

  1. Diversity of image search tools
    • Basic image search on the Web
    • Search for images related to news stories
    • Search for images on flickr
    • Search for images by "similarity"
    • Search for images of a person's face
    • Search for images related to what's going on right now
    • Search high quality for-pay or for-free images

Basic image search

  1. Google Images: "Ann Arbor" example, only large black & white of Ann Arbor
    • Can search by size, type of image, color, and description, of course
    • Under advanced image search, you can search for images related to news content, that have faces in them, that have a specific file type, or that are from a specific domain (.edu or a specific site)
  2. Ask Images: large B&W photos of Ann Arbor
    • Can search by size, filetype, color
  3. Yahoo Images: large B&W of Ann Arbor
    • Can search by size, color, domain
    • "Travel overlay" — consider Las Vegas, NV (look in left column)
  4. PicSearch (and advanced search page)

News image search

  1. Images at Yahoo News: you can't restrict a search to images, but they appear at the top of the results page
    • This works as a direct link — just replace "football" with your own search term
  2. Images at Google News: again, you can't search for images directly; you'll have to click on Images on the left side
    • This works as a direct link — just replace the "detroit+lions" with your own search term

Flickr search tools

  1. Flickr
  2. Compfight: a flickr search tool
  3. Behold: a Flickr search tool

Similar images

  1. Pixolu — search for [eiffel tower]
    • Searches Google, Flickr, and Yahoo
  2. At Google Images, you can search for "similar images"
    • Starbucks coffee and then click on "Find similar images" on the image you like
  3. I wish they were better:
    1. GazoPa — find similar images
    2. TinEye — find similar images

Face search

  1. Exalead
  2. Google Images
  3. I wish these were better:
    1. Picitup — image search (plus face search and "similar" search)
    2. FaceSaerch

Real time image search

  1. PicFog
    • Just watch the images go by on the home page
    • Or you can also search
  2. TwitCaps
  3. PicBrk

Stock photography

These are recommended by Presentation Zen.

  1. Inexpensive (but good)
    1. iStockPhoto
    2. Getty Images - free for students?
    3. DreamsTime
    4. Fotolia
    5. StockXpert
  2. Free (but not bad)
    1. MorgueFile
    2. Flickr's Creative Commons images
    3. ImageAfter
    4. StockXchng
    5. EveryStockPhoto
    6. FreePixels
  3. Cyclops: a stock-photo image search site
  4. EveryStockPhoto
    • eiffel tower
    • Advanced search allows for license, shape, and "safe search"
  5. US Library of Congress images

Blog ideas

Resources

  1. Where to find free images and visuals...
  2. LIFE magazine photo archive hosted by Google

18 Geography Based Sites

by samooresamoore (08 Nov 2009 14:53; last edited on 17 Dec 2009 19:45)

We discuss all types of geography-based search tools and resources.

Class held on 11/11/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. Go through diagram explaining the tools we'll be looking at today
  3. Work on exercises (tweets)

My notes

Here's what we're doing today:

  • Mostly just you exploring some amazingly cool and useful Web resources.

International, country-specific Web search engines

  1. SearchEngineColossus.com: "International Directory of Search Engines"
  2. Yahoo International: Yahoo home pages from countries around the world

Google Maps

  1. Google Maps (tour, popular content, featured content)
  2. Google Maps Mania: "An unofficial Google Maps blog tracking the websites, mashups and tools being influenced by Google Maps."
    1. 100 things to do with Google Maps mashups: the most fun you can have with maps.
  3. Related tool: Google Earth

Travel

  1. Google Sightseeing: "Google Sightseeing takes you on tour of the world as seen from satellite, using the free Google Earth program, or Google Maps in your web browser. Each weekday your guides James and Alex present new weird and wonderful sights as suggested by readers."
  2. Articles analyzing the industry
  3. Standard travel search tools
    1. Expedia
    2. Orbitz
    3. Travelocity
      travel-websites-large.tiff
  4. Newer general travel search tools
    1. Kayak: travel aggregator; searches 140+ travel sites with one search (review, another review)
    2. TripWolf: worldwide travel guide (review)
    3. UpTake: "your first step on a great trip"; "search over 1000 travel websites and 20M opinions at once" (review, review)
    4. WeGo: "Wego searches through 100+ travel sites in the time that it takes you to search one. We’ll help you find the best prices and connect you to the best place to buy." (review)
    5. Goby: "Create your own adventure" (review)
      travel-websites-small.tiff
    6. Bing Travel: "Bing Travel Price Predictor tells you whether fare prices are expected to go up, down or stay the same."
  5. Hotel, home, and hostel search
    1. Sprice: "smart prices to go…anywhere" (actually, focuses on hotels in SE Asia, India, Europe) (review)
    2. Hotelicopter: "elevate your search" (review)
    3. LetMyBed: "more places to stay" (review)
    4. Unusual hotels of the world (review)
  6. Specialized travel information
    1. Ixigo: travel in India

Local search

  1. Biggest players
    1. Ask City: search for businesses, movies, and events
    2. MSN City Guides
    3. Mapquest Local: restaurants, events, news, weather
    4. Yelp: restaurant reviews (around 250K daily visitors)
  2. Newer entrants
    local-search-websites.tiff
    1. Outside.In: "What's happening. Where you are. Right now." (description, review)
    2. BooRah: "restaurant reviews, menus, pictures, and more" (review)
    3. When.com: "where to go, what to do, local events" (review, another)
    4. WebLocal: Canadian local search (review)

Road trips and driving

  1. Driving directions
    1. Mapquest directions (search cheat sheet)
    2. Mapquest RouteBuilder
    3. Multimap driving directions
    4. Google Maps driving directions
  2. Driving itineraries
    roadtrips-websites.tiff
    1. RoadsideAmerica Maps: "Find oddities and tourist attractions and plan trips more quickly"
    2. MileByMile: free road map RV itinerary guides
  3. Traffic information
    1. Waze: "real-time maps and traffic information based on the wisdom of the crowd" (review)
  4. Outside the U.S.
    1. Streetmap.uk: Great Britain street and road maps
    2. Australian driving directions: Australian travel maps, street directory, driving directions, and aerial photographs
    3. European driving directions (ViaMichelin)
    4. Mappy.com: European maps, route plans, and address guide (by country; look in upper left of window)
    5. TheAA.com: routes, maps, and directions for U.K. and Europe
    6. Zoombu: "Find the best way from your home to your destination", door-to-door journey planner for Europe (review)
  5. Public transit
    1. How to use Google Maps to plan a trip by public transportation (cities included)
    2. HopStop: Provides door-to-door subway and bus directions and maps for NYC. Currently expanding to other major cities. Very popular in Manhattan.
    3. NYC subway
    4. Google Transit: plan a trip using public transportation
    5. PublicRoutes: "get public transit, driving directions, and maps" (review)

Maps

Entertainment

  1. Google Moon Map

Information

  1. Historical Maps at the Perry-Castaneda Library Map Collection at the University of Texas
  2. Maps of current interest (from Perry-Castaneda Library Map Collection)
  3. Perry-Castaneda Library Map Collection (Univ of Texas)
  4. NationalAtlas Map Maker: build your own layered map with a wide variety of information
  5. WorldMapper: "a collection of world maps, where territories are re-sized on each map according to the subject of interest. There are now nearly 600 maps."
    • Animation: be sure to check out this animation
  6. World Sunlight Map: "Watch the sun rise and set all over the world on this real-time, computer-generated illustration of the earth's patterns of sunlight and darkness. The clouds are updated every 3 hours with current weather satellite imagery."
  7. National Geographic Atlas Explorer: "investigate our world"; a visual guide to global trends
  8. EarthPulse: State of the Earth 2010
  9. EarthTools: "find places, latitude/longitude, sunrise, sunset, elevation, local time, and time zones"

Commerce

  1. AuctionMapper: search eBay for auctions (info)
  2. Oodle: buy and sell locally (classifieds)
  3. LiveDeal: online local marketplace

Real estate

  1. Zillow: "your edge in real estate"
    real-estate-websites.tiff
  2. ZIP Realty: "your home is where our heart is"
  3. Trulia: real estate search
  4. RealtyTrac: "foreclosure real estate listings"
  5. Roost: "homes for sale and MLS listings"
  6. HomeFinder: "homes for sale, real estate listings & foreclosures"
  7. ActiveRain: "world's largest real estate network"
  8. Smaller sites
    1. PropSmart: real estate search (and community)
    2. HousingMaps: a mashup of Google Maps and Craigslist
    3. Enormo: "Every property. Everywhere" (review)

Interactive tools

  1. GMap Pedometer: plan your walking trips and measure their length
  2. Wikimapia: a mashup of Google Maps and Wikipedia. Completely addicting to explore.
  3. MapMyRun: a tool to plot your running route and see what the distance was

Clocks

  1. The World Clock - Time Zones: Current local times around the world
  2. Greenwich Mean Time: use this to set your clock to the right time wherever you are
  3. World Time Zone: find the time using a map

Mobile tools

  1. Mapquest Wireless
  2. Google adds local to mobile web search

19 Video Search

by samooresamoore (14 Nov 2009 17:17; last edited on 23 Nov 2009 15:07)

We discuss several different ways to search for videos.

Class held on 11/16/2009. (student notes; possible questions).

Class structure

  1. Go through "At beginning of class" information
  2. Go through diagram explaining the tools we'll be looking at today
  3. Work on exercises (tweets)

At beginning of class

  1. I'm gathering your tweets: geography class

My notes

The numbers in parentheses are the average daily visitors in the most recent months (as determined by Google Trends). In each section the sites are generally listed in order of popularity.

Video

online-video-market-share.tiff

Video search tools come in three basic varieties:

  1. General Web search based on video description and tags
  2. "Deep video" search (my term) based on an analysis of the audio and video content of the video
  3. Video directories in which human experts have classified videos (or video series) by their general content

Each of these three varieties of search tools can be applied to two different targets:

  • Site-specific search
  • Web-wide search

So this means that we have six types of tools that we might consider:

               Web   Site
General Web    A     B
Deep video     C     D
Directories    E     F

Now, even for these six types of tools, we can still have sub-categories of video search tools that differ based on their "target" content. For example, some sites search specifically for entertainment content, others for podcast-type content, and others for academic content.

Finally, sites can differ on dimensions other than those discussed above, most commonly:

  • Uploads — does the site allow videos to be uploaded
  • Host — does the site host videos or is it just a search site
  • Social — does the site allow visitors to tag and/or comment on the videos

Video search

All of the following sites allow you to search for videos around the Web.

Site           Search              Scope          Uploads  Host  Social
Bing Video     General             Web            No       No    No
Google Video   General             Web            No       Yes   No
Yahoo Video    General             Web            No       No    No
Blinkx         Deep video          Web            No       No    No
Truveo         Deep video          Web            No       No    No
Pod-o-matic    General, Directory  Web            Yes      Yes   Yes
VideoSurf      Deep video          Web            No       No    No
YouTube        General             Site           Yes      Yes   Yes
Daily Motion   General             Web            Yes      Yes   Yes
MegaVideo      General             Entertainment  Yes      Yes   Yes
Metacafe       General             Entertainment  No       Yes   Yes
Veoh           General             Web            Yes      Yes   No
Hulu           General             Shows          No       Yes   No
Clicker        General             Web, Shows     No       No    No

  1. Bing Video
  2. Google Video
    video-search-sites.tiff
  3. Yahoo Video
  4. Blinkx: video search engine (100K)
  5. Truveo: "search video across the Web" (40K, down from 400K 12 months ago)
    • Description: "Truveo video search lets you search and find videos from across the Web. Use Truveo to find all types of online video including hit television shows, full-length movies, breaking news clips, sports highlights, music videos, or the latest viral videos. If you are looking for a specific video, Truveo video search can help you find exactly the video you want. Truveo can also help you browse through video across the web and discover new videos that you might like."
    • sample… so many things to look at on this page:
      • On the left are results by channel, which you can use as filters
      • In the center you can choose "Top ranked", "Most recent", "Most popular", and "Highest rated" — FTW!
      • On the right you get results from Bloomberg. Why Bloomberg? I have no idea.
      • Clicking on the Search button you can choose to search Channels, Categories, or Shows — try it out and see how the results differ
    • Help
    • Articles
  6. Pod-o-matic (10K)
  7. VideoSurf (read this article) (7K)
    • Browse categories of news
    • Browse people in the news, for example, J
    • sample — look at all of the information on this page:
      • In the left column, lots of different filters.
      • In the center column, the results of the query with a short description, age of the clip, and a film-strip with shots of what is happening throughout the video
      • You can also move your pointer over the film strip and get an option to show the faces of the people in the video — amazing!
      • You can sort in multiple ways, and you can limit the results to videos added by time period
      • Description: "VideoSurf is video search engine that has created a better way for people to search, discover, and watch online videos. Using computer vision VideoSurf has taught computers to “see” inside videos to let users find and watch videos that they really want to see. Whether you’re looking to watch funny videos or scary videos, movie clips or TV full episodes, the hottest new music videos or breaking news clips, VideoSurf’s video search engine is the place to go to find the videos you’ll love."
  8. YouTube
  9. Daily Motion: "Dailymotion is about finding new ways to see, share and engage your world through the power of online video. You can find - or upload - videos about your interests and hobbies, eyewitness accounts of recent news and distant places, and everything else from the strange to the spectacular." Site is at 1.4M visitors per day, but it has lost 1/2 of its traffic in the last 18 months.
    • sample; you can sort these results in many different ways
  10. MegaVideo: "I'm watching it." (1.4M)
    • This is almost exclusively entertainment videos.
    • sample
  11. Metacafe:
    entertainment-video.tiff
    "Metacafe is one of the world's largest video sites, attracting more than 40 million unique viewers each month (comScore Media Metrix). We specialize in short-form original content - from new, emerging talents and established Hollywood heavyweights alike. We're committed to delivering an exceptional entertainment experience, and we do so by engaging and empowering our audience every step of the way." (500K)
  12. Veoh: "Veoh is a revolutionary online video service that gives users the power to easily discover, watch, and personalize their entertainment viewing experience. With a simple broadband connection Veoh gives you free access to all of the great TV and film studio content, independent productions, and user-generated videos on the Web." (1.1M)
    • sample
      • Notice the ability to limit by category and sort in different ways
      • Also, under the Advanced button you can filter by length
  13. Hulu: "help people find and enjoy the world's premium video content when, where and how they want it" (600K)
  14. Clicker: "What's on online"
    • Description: "Clicker is the complete guide to Internet Television. Our mission is to make it simple for you to find the right show, right now. … Clicker catalogs all broadcast programming online, along with TV-quality Web originals, from these silos and delivers them in one seamless, organized experience so you can easily discover what's available to watch (and what isn't) online, where to watch it, and what's worth watching."
    • Articles
    • sample; notice the features on the page:
      • Across the top, you can filter by whether the source came from TV, Web, Music or Movies
      • Across the top, you can sort by relevance, popularity, or airdate
      • Down the right, you can see two things:
        • The source results
        • The categories from which the results come — if you click on one of them, then you re-run the query and just get the results from that category

Podcasts

The term podcast has not been well-defined, though there are some elements that are generally agreed-upon. As stated in Wikipedia, "A podcast is a series of digital media files (either audio or video) that are released episodically and downloaded through web syndication." A single episode of a podcast can be thought of as a talk-radio show or an editorial on TV news or a TV news segment. Listeners receive podcasts by subscribing to them; some podcasts are released on a regular schedule while others are not.

Part of the trouble with this type of search is that the term itself isn't agreed-upon. Alternative terms are vodcast (referring specifically to video podcasts), vidcast, netcast, audio blog, blogcast, or DIY radio. This is not a good situation. A related problem is that these terms generally can be used to refer either to a single episode or the whole series.

The other major difficulty with podcast search is that the content of a specific episode is less well captured by tags and descriptions than by the content of the episode itself. As stated above, a podcast generally consists of an audio or video file — it does not contain a searchable text translation of that content. Every podcast series generally has a text description. So, if you're looking for a podcast about the U.S. economy, you're in luck; however, if you're looking for a podcast that specifically mentions the U.S. foreign trade balance, you have a much more difficult time.

This generally means that a superior podcast search engine would provide the following:

  • Aggregate podcasts (by whatever name), and only podcasts, at the site
  • Provide the ability to search over the text description of the podcast series
  • Provide the ability to search over the text translation of the podcast episode.

The third feature requires the search engine to download each podcast episode and run speech-to-text translation on the file. Unfortunately, this is a very computationally expensive task. Blinkx provides a search engine that uses speech-to-text technology on all of the videos that it indexes; it currently has over 35 million hours of video on its site. EveryZing is a company that provides multimedia search tools (including speech-to-text tools) for Web sites that want to provide this type of search on their content.
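
The difference between searching a series description and searching an episode's text can be seen with a toy example. This is a Python sketch on invented data: the series, its episodes, and the function names are hypothetical, and the `transcript` strings simply stand in for the output of a speech-to-text step that is not shown.

```python
# Hypothetical podcast series: one text description plus per-episode transcripts.
series = {
    "title": "Economy Weekly",
    "description": "A weekly podcast about the U.S. economy.",
    "episodes": [
        {"title": "Episode 1", "transcript": "Today we look at housing prices."},
        {"title": "Episode 2", "transcript": "The foreign trade balance narrowed last month."},
    ],
}

def matches_description(series, phrase):
    """Search only the series-level text description."""
    return phrase.lower() in series["description"].lower()

def matching_episodes(series, phrase):
    """Search the text of each episode; return the titles of episodes that mention the phrase."""
    return [
        episode["title"]
        for episode in series["episodes"]
        if phrase.lower() in episode["transcript"].lower()
    ]

# A broad topic is found through the description alone.
found_by_description = matches_description(series, "U.S. economy")
# A specific phrase is invisible to description search...
missed_by_description = matches_description(series, "foreign trade balance")
# ...but is found once the episode text is searchable.
found_by_transcript = matching_episodes(series, "foreign trade balance")
```

The expensive part in practice is producing the transcripts, not searching them, which is why few sites offer this feature.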

Except for iTunes, Google, YouTube and Bing, all of the following are niche sites. That doesn't mean that the small sites aren't worthwhile; it just means that this is not a popular class of Web sites. The sites in this section are a good place to look for business- and news-related content.

podcast-sites.tiff

Many podcast search engines and sites have existed over the last few years, but none have succeeded to any significant extent. iTunes is probably your best bet as a first place to look for podcasts but, for your particular topic, you should give the other sites a try as well.

If you're going to search on a general video search engine (such as Google Video or Bing Video), then you should use the following query:

theTopic podcast|vodcast|vidcast|netcast|"audio blog"|blogcast|"DIY radio"
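
If you run this query for several topics, it can be generated instead of typed. The following Python sketch builds the same string from the list of alternative terms given earlier; the function name `podcast_query` is made up for the example, and whether a particular search engine honors the `|` operator is something you should check for that engine.

```python
# Alternative names for podcasts, taken from the list above.
PODCAST_TERMS = ["podcast", "vodcast", "vidcast", "netcast", "audio blog", "blogcast", "DIY radio"]

def podcast_query(topic):
    """Return the topic followed by all podcast terms joined with '|' (OR)."""
    # Multi-word terms are wrapped in double quotes so they are searched as phrases.
    alternatives = "|".join(f'"{term}"' if " " in term else term for term in PODCAST_TERMS)
    return f"{topic} {alternatives}"

query = podcast_query("economy")
```

Keeping the terms in one list makes it easy to add a new name for podcasts if another one catches on.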

Other

  1. Yubby — find, collect, and publish from 30+ video sites
  2. YoVisto — "academic video search" (article)
  3. CastTV: "one stop watching" (40K)
    • sample; look at all the options on this page:
      • Down the left column are a lot of different filters for the search
      • Down the center column are all the results of the query
  4. Vuze: "find, download, and play high quality and HD video" (review) (40K)
  5. MeFeedia: "watch video from around the Web" (review) (25K)
  6. VideoSift (9K)
  7. LiveLeak: "redefining the media" (5K)

Music

Search

  1. Using Google Web Search: Add [music:] before any music related query (artist, song title, lyrics); article
  2. YouTube
  3. Yahoo Audio Search (click on "options" to be able to search on format, duration, source) (help)
  4. Yahoo Video (using video search to search for music)
  5. Altavista Audio Search (search by length and file type)

Music search and music providers

music-sites.tiff

  1. iTunes
  2. Last.FM: "social music revolution" (review) (250K)
  3. Rhapsody (50K)
  4. Pandora (400K)
  5. New
    1. Moozikk — "music search made simple. find, listen, save and share your favorite songs." (article)
    2. Noiset — "Noiset.com is a music search engine where you can search for music albums, artist biographies and songs. Using Noiset, you will be able to find your favourite artist's discography, browse entire album collection, listen to song previews and find free download links. Noiset scans several up-to-date music blogs hosted on blogspot or wordpress and collects rapidshare, megaupload and mediafire download links for you."

Music blogs

  1. The Hype Machine — follows music blog discussions (20K)
  2. Search for audio blog or music blog or mp3 blog:

Other

  1. Musipedia — "a searchable, editable, and expandable collection of tunes, melodies, and musical themes"

20 Metasearch

by samooresamoore (16 Nov 2009 02:06; last edited on 23 Nov 2009 15:08)

We discuss and evaluate metasearch engines.

Class held on 11/18/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. Go through highlights of some of the amazing search tools that are available.
  3. Work on exercises

At beginning of class

My notes

The search community has a tool called a meta-search engine. These tools derive their functionality from other search engines. A complicating factor is that there are two very different types of tools that are called meta-search engines. The search community does not recognize these subtypes, but they are very real and quite significant:

  • Integrated search for multiple search engines: the meta-search engine uses an algorithm to combine the results from multiple search engines into one result list
  • Unified interface for separate search engines: the meta-search engine provides a single interface, enabling easy and quick switching between the results of different search engines

These are very different tools. In the first case, the functionality provided by the meta-search engine revolves around an informed method of combining the results of multiple search engines. In the second case, the functionality consists of providing a unified interface and passing through the results from one external search engine at a time.
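
The contrast between the two subtypes can be sketched in a few lines of Python. The integrated version below uses reciprocal rank fusion, a generic rank-combining technique chosen here only for illustration; nothing in these notes says that any of the sites listed below works this way. The result lists, the function names, and the constant `k=60` are all assumptions made for the example.

```python
from collections import defaultdict

def fuse_results(ranked_lists, k=60):
    """Integrated search: combine several ranked lists into one result list.

    Each result earns 1 / (k + rank) from every engine that returned it,
    so results ranked highly by several engines rise to the top.
    """
    scores = defaultdict(float)
    for results in ranked_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))

def unified_interface(results_by_engine, engine_name):
    """Unified interface: pass through one engine's results unchanged."""
    return results_by_engine[engine_name]

# Hypothetical results from three engines for the same query.
engine_a = ["a.example", "b.example", "c.example"]
engine_b = ["b.example", "a.example", "d.example"]
engine_c = ["b.example", "c.example", "a.example"]

integrated = fuse_results([engine_a, engine_b, engine_c])
passed_through = unified_interface({"A": engine_a, "B": engine_b}, "B")
```

In the integrated case the output is a new ranking that no single engine produced; in the unified case the output is always exactly one engine's own list.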

Resources

Integrated search for multiple search engines

  1. Info.com (about, review) (60K -> 40K)
    • examples: carbon trading (web), carbon trading (images), carbon trading (reference)
    • other: separate columns for Web results & sponsored results
    • Features
      • Integrates Google, Yahoo, Bing, Ask, About
      • Separate tabs for Web, Research, News, Images, Video, Health, Shop, Classifieds, Flights, Jobs, Hotels, Movies, Audio, Yellow Pages, White Pages, Webmail
  2. StartPage (about, review)
    • examples: cannot link directly to results
    • other
      • explicit aggregation (with stars), refinement of results with user input
      • formerly known as ixQuick
    • Features
      • Integrates All the Web, Ask, EntireWeb, Exalead, Gigablast, MSN, NBC, Open Directory, Qkport, Wikipedia, Winzy, Yahoo
      • Separate tabs for Web, International phone directory, Video, Pictures
  3. Search.com (review, help, search tips) (110K -> 40K)
  4. InfoSpace (200k -> 500k)
  5. Clusty (about, review) (8k -> 3K)
  6. DogPile (about, review) (140k -> 70k)
    • examples: carbon trading (web), carbon trading (images)
    • other: favorite fetches (on home page), related searches
    • Features
      • Integrates Google, Yahoo, Bing, Ask
      • Separate tabs for Web, Images, Audio, Video, News, Yellow Pages, White Pages
  7. URL.com
    • examples: carbon trading
    • Users can contribute to the results ranking process
  8. Scour
    • examples: carbon trading
    • Features
      • You can change the ordering of the results by clicking on an icon at the top of the results
      • You can see how each search engine ranked each item at the right of each item in the list
  9. Others

Unified interface for separate search engines

Results from multiple sites available separately

  1. Soovle (click "secrets" in upper right)
    • other: top searches,
      • automatic refinement of search in multiple search engines in real time (really cool)
      • Be sure to hit the right arrow key after you have entered a search (but before pressing enter)
    • Features
      • Separate searches for Google, Yahoo, Ask, Wikipedia, Amazon, Answers.com, YouTube
      • Just Web search
  2. Search.IO
    • other: latest searches
    • Features
      • Separate searches for 8-10 sites in each category
      • Separate tabs for Audio, Blogs, Books, CSS Galleries, Fonts, Images, Jobs, Lyrics, News, People, Recipes, Search Engines, Social Bookmarks, Stock Photos, Torrents, Tutorials, Videos, Web 2.0 Sites
  3. Joongel (review)
    • examples: carbon trading (web), carbon trading (images)
    • Features
      • Integrates the "10 leading Websites" in each category
      • Separate tabs for General Search, Images, Music, Videos, Shopping, Social, Q&A, Health, Torrents, Gossip
  4. Zuula (about, help)
    • other: tracks recent searches
    • Features
      • Separate searches for Google, Yahoo, Bing, Gigablast, Exalead, Alexa, Entireweb, Mahalo, Mojeek (for the Web; others for the other categories)
      • Separate tabs for Web, Images, Video, News, Blog, Jobs

Results from multiple sites returned simultaneously & separately

  1. LeapFish (about, review) (2k -> 5k)
    • carbon trading
    • Features
      • Separate searches for Google, Yahoo, MSN
      • Separate tabs for Web, News, Answers, Videos, Images, Shopping, Blogs
  2. Kosmix (about)
    • carbon trading
    • Features
      • Multiple stories returned in multiple different types of searches, all displayed on one page
      • Good way to get overview of a topic at the beginning of an investigation

Internet Start page with focus on unified search interface

  1. MrSapo
    • Features
      • Separate searches for 10-20 web sites per category
      • Separate tabs for General, Images, Video, News, Social, Files, Reference, Academic, Business, Tech, Shop
  2. Symbaloo
    • Features
      • Just one search engine per category

21 Social Sites

by samooresamoore (21 Nov 2009 17:52; last edited on 23 Nov 2009 16:24)

We discuss social news and bookmarking sites.

Class held on 11/23/2009. (student notes; possible questions).

Class structure

  1. Go through "At beginning of class" information
  2. Go through metasearch results
  3. Work on exercises

At beginning of class

  1. Test information
  2. Timing of the test — Wednesday, December 2?
  3. Cut-off date for modifications (questions, notes, etc.)
  4. Nigel Melville

My notes

Introductory information

  1. Describing with tags
    • Taxonomies
    • Folksonomies
  2. Types of social sites
    • Social News (based on voting)
      • Technology
      • Search & Internet marketing
      • For researchers & scientists
    • Social Bookmarking (based on tagging)
    • Social activity (e.g., shopping)
  3. Features & dimensions
    • Voting
    • Current (e.g., "what's hot")
    • Time periods (e.g., "last hour", "today", "last week", "last month")
    • Topic categories (e.g., "business", "entertainment")
    • Tags (both personal and shared)
  4. Discuss the sites
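
The features above (voting, time periods, topic categories, tags) combine naturally into a "what's hot" listing. The following is a minimal sketch in Python, not how any of these sites actually rank stories: the `Story` record, the `top_stories` function, and the sample data are all hypothetical, and ranking purely by vote count within a time window is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Story:
    """Hypothetical record for a submitted story on a social news site."""
    title: str
    votes: int
    submitted: datetime
    category: str
    tags: set = field(default_factory=set)


def top_stories(stories, now, window, category=None, tag=None):
    """Return stories submitted within `window` of `now`, most votes first.

    `category` and `tag` are optional filters; None means "don't filter".
    """
    cutoff = now - window
    selected = [
        s for s in stories
        if s.submitted >= cutoff
        and (category is None or s.category == category)
        and (tag is None or tag in s.tags)
    ]
    return sorted(selected, key=lambda s: s.votes, reverse=True)


# Made-up sample data, for illustration only.
NOW = datetime(2009, 11, 23, 12, 0)
SAMPLE_STORIES = [
    Story("A", 120, NOW - timedelta(minutes=30), "business", {"carbon"}),
    Story("B", 300, NOW - timedelta(hours=5), "business", {"energy"}),
    Story("C", 50, NOW - timedelta(days=10), "entertainment", {"movies"}),
]

if __name__ == "__main__":
    for label, window in [("last hour", timedelta(hours=1)),
                          ("today", timedelta(days=1)),
                          ("last month", timedelta(days=30))]:
        titles = [s.title for s in top_stories(SAMPLE_STORIES, NOW, window)]
        print(label, titles)
```

Widening the window from "last hour" to "last month" pulls in older stories, which is why the same site can show very different front pages depending on the time period you pick.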

General market size information

The following series of charts shows the relative traffic for several social news and/or bookmarking sites.

digg-reddit-delicious-stumbleupon.tiff

From the first, you can see that the traffic at these sites has shrunk over the last 18 months. About mid-2008 you can see a big jump in the traffic for Delicious; this is when it was renamed from del.icio.us to its current name. This first chart shows the three largest social news sites (except for Yahoo! Buzz) and the largest social bookmarking site.

reddit-slashdot-fark-mixx-newsvine.tiff

The second image shows four smaller social news sites (with reddit shown as a comparison since the scale of this chart differs from the first). Slashdot is a technology-focused social news site. You can see that it has lost 75% of its daily visitor volume in the last two years. Mixx had a 3x growth from mid-2008 to mid-2009 but most of those extra visitors are now gone.

diigo-delicious-icio.tiff

The third chart actually shows two (not three) bookmarking sites. In mid-2008 the site del.icio.us changed its name to delicious. This site lost 2/3 of its daily visitor volume in the last two years. As you can see, Diigo is much, much smaller than delicious. There's really not much competition in this market.

Resources

Social News sites

  • Digg (about, tour, search): "Digg is a place for people to discover and share content from anywhere on the web… We’re here to promote that conversation and provide tools for our community to discuss the topics that they’re passionate about." (Propeller is quite similar to this.)
  • Yahoo! Buzz (about): "The buzz can be about anything — from breaking stories on major news to viral videos on personal blogs. Instead of editors, people like you submit the stories and "buzz up" the best ones."
  • Reddit (about, search): "reddit is a source for what's new and popular on the web — personalized for you. Your votes train a filter, so let reddit know what you liked and disliked, because you'll begin to be recommended links filtered to your tastes."
  • StumbleUpon (about, video intro, guide, search, Rating on StumbleUpon): "StumbleUpon helps you discover and share great websites. As you click Stumble!, we deliver high-quality pages matched to your personal preferences… This helps you discover great content you probably wouldn't find using a search engine." This site could also be considered a social bookmarking site.
  • Fark (about, help, search): "Fark.com, the Web site, is a news aggregator and an edited social networking news site… The idea was to have the word Fark come to symbolize news that is really Not News."
  • Mixx (about, tour, search): "You find it; we'll Mixx it. Use YourMixx to tailor the content categories, tags, specific users and groups, and we'll deliver the top-rated content as chosen by you and people who share your passions. So go ahead and whip up your own version of the web. Just tell us how you like it Mixxed and we'll deliver the best the web has to offer"
  • NewsVine (welcome, help, search): "At Newsvine, you can read stories from established media organizations like the Associated Press and ESPN as well as individual contributors from all around the world. Placement of stories is determined by a multitude of factors including freshness, popularity, and reputation. Contribution is open to all, and editorial judgement is in the hands of the community."
  • Slashdot (about, help, search): "News for nerds. Stuff that matters."

Social Bookmarking sites

Social activity site

  • Kaboodle: "Have fun shopping with friends, share, and discover new products."

General articles

  1. General articles
  2. Social bookmarking sites
  3. Articles on folksonomies
  4. Articles on tagging sites

Possible blog topics

  • Compare either Digg or Yahoo Buzz to one of the other social news sites.
  • Compare delicious and Diigo.

22 People Search

by samooresamoore (22 Nov 2009 19:43; last edited on 30 Nov 2009 15:46)

We discuss several different ways to find out information about people through their Web presence.

Class held on 11/25/2009. (student notes; possible questions).

Class structure

  1. Go through “At beginning of class” information
  2. Go through diagram explaining the tools we'll be looking at today
  3. Work on exercises (twitter)

At beginning of class

People search

  1. This problem has applicability to many areas
    • Just trying to find a person's address or phone number
    • Background checks (including criminal checks)
    • Finding missing persons
    • Ancestry (including obituary searches)
  2. A very frustrating, confusing topic
  3. Monetary interests have really infiltrated this set of tools
    • Lots of Web site purchases (industry consolidation)
    • Lots of pay-for-results

People search engines

General people search

  1. WhoZat (review)
  2. Pipl (about) — can search by name, email, username (screenname), and phone; emphasizes that it searches the Deep Web.
  3. ZoomInfo (advanced search, help) — find people or companies
  4. Spock (people search on the web, blogs, social networks)
  5. iSearch (review) — can search by name, phone, email, and username (screenname)
  6. Whoozy (review) — searches the Web and social networks

White page directories

  1. 411 Locate — many different search tools:
  2. WhitePages — lots of different search tools:
    • people & business search
    • reverse phone & address
    • find area codes, zip codes
    • neighborhood search (find the names of people who live near an address)
    • search for an email address
  3. AnyWho — people search, reverse phone
  4. ZabaSearch (advanced search, review) — search by name or by phone number.

Social site search

  1. Wink — people search, phone number; will look not only for the specified name but also for similar names (e.g., Ted Kennedy, Theodore Kennedy…)
  2. 123people — people search (review, review)
  3. PeekYou — people search, username search
  4. The Internet Address Book — "Find, manage and discover internet addresses worldwide"
  5. Spokeo — good for searching social networking sites; most of the results require that you pay a fee.

Use public records

  1. Public Record Finder — this uses Intelius (like so many that I don't list here) so it shows partial results and then wants you to pay.
  2. Criminal Searches

Specialized

  1. Birthday Database
  2. FaceSaerch (yes, that's the right spelling!) — search for faces
  3. Namepedia — "world's largest information platform and community about personal names. Data is collected about names of all languages and cultures…" (review)
  4. PrivateEye — find maiden names, possible relatives, and roommates
  5. User Name Check — searches for a specific user name at a number of social sites (around 70 in late 2009); it tells you that the username is either available or not at each of the sites.
  6. InfoBel — international people search using white pages

Obituary search

  1. Social Security Death Index (ancestry.com) (1875-current, US Social Security numbers)
  2. Obituary Daily Times search
  3. National Obituary Archive (obituaries, memorials, funeral homes)
  4. Obituary Central (obituaries, cemetery searches)
  5. NY Times Obituaries

Ancestry information

  1. YourFamily (finding ancestors and lost relatives)
  2. Family Search (online birth, marriage, death, census, church, and other indices; run by Mormon Church)

General Web search

  1. Google (people search, reverse phone #)
    • Searching phone numbers (details)
      • phonebook
      • Doesn't find cell phone numbers
    • Searching names
      • "full name" location
      • "full name" company
  2. Yahoo People search (people search, reverse phone #, email search)
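
The name-search patterns listed under Google above ("full name" location, "full name" company) amount to quoting the name so it is matched as a phrase and then adding one narrowing term. Here is a minimal sketch in Python of building such queries; the helper names `name_query` and `search_url` are hypothetical, and the example name and terms are made up.

```python
from urllib.parse import quote_plus


def name_query(full_name, extra=None):
    """Build a query with the name in quotes, plus an optional narrowing term.

    `extra` would be a location or a company name, as in the patterns above.
    """
    query = f'"{full_name}"'
    if extra:
        query += f" {extra}"
    return query


def search_url(query, base="https://www.google.com/search?q="):
    """Turn a query string into a URL you can paste into a browser."""
    return base + quote_plus(query)


if __name__ == "__main__":
    # Hypothetical examples: one narrowed by location, one by company.
    for extra in ("Ann Arbor", "Acme Corp"):
        q = name_query("Jane Q. Public", extra)
        print(q)
        print(search_url(q))
```

Trying the same quoted name with several different narrowing terms is usually more productive than one long query, since each term cuts the results along a different dimension.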

Resources

  1. People search engines: The newest web privacy threat (PC Advisor, March 14, 2009)
  2. 25 free people search engines to find anyone
  3. Ten Ways you can find Phone Numbers on the Web (about.com)
  4. Top Ten Ways to do a free people search on the Web (about.com)
  5. Fifteen People Search Engines (about.com)
  6. How to find someone online (about.com)
  7. 4 people search engines: Looking for someone online
  8. iSearch
  9. 123people searches the social web
  10. Facesaerch: search for people's faces
  11. Namepedia (review): the name database
  12. Yasni (review): people search
  13. Google, Yahoo, Ask, Cluuz

23 Project

by samooresamoore (30 Nov 2009 16:00; last edited on 07 Dec 2009 15:40)

No scheduled activities. Just going to be helping you w/ your projects, talking about test, talking about SE analysis grades.

Class held on 11/30/2009. (student notes; possible questions).

  1. Corporate visits:
    • Microsoft — they're not coming this semester. But let's talk.
    • Google — they're coming next Monday. Be attentive. Be on time. Ask questions. The moment that I start talking, close the computer in front of you and turn off your cell phone.
  2. Projects
    • Last part is due on 12/14/2009
    • The last part of the assignment (optional) is that you can write a blog post (ungraded) that will help me give you the best possible grade on your project. Since you can't stand over my shoulder as I'm grading your assignment, think about the types of things that you would like to make sure that I pay good attention to so that you get full credit for the assignment. Don't post this until the last couple of days before the assignment is due. The title of the post should be "Final project: titleOfYourTermProject". It would be great if you could put a link to the home page of your project near the top of the post as well.
  3. Test
    • Information about the test
    • It's looking like there are going to be more multiple-choice questions on the exam — you folks did a nice job (early in the semester especially; some later days were less impressive).
  4. Search engine analysis assignments
2|5
3|5
4|
5|
6|5
7|5
8|257778
9|000000222233355557778
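
The display above reads like a stem-and-leaf plot. Assuming that each stem is the tens digit and each leaf is the ones digit of one score (an interpretation, not something stated on this page), the following Python sketch turns it back into a list of numbers. The function name `decode` is hypothetical.

```python
from statistics import median

# The display copied from above; "stem|leaves", one stem per line.
STEM_AND_LEAF = """\
2|5
3|5
4|
5|
6|5
7|5
8|257778
9|000000222233355557778
"""


def decode(display):
    """Expand a stem-and-leaf display into individual scores.

    Assumes stem = tens digit and each leaf character = ones digit.
    """
    scores = []
    for line in display.splitlines():
        stem, _, leaves = line.partition("|")
        scores.extend(int(stem) * 10 + int(leaf) for leaf in leaves.strip())
    return scores


if __name__ == "__main__":
    scores = decode(STEM_AND_LEAF)
    print("count:", len(scores))
    print("low/high:", min(scores), max(scores))
    print("median:", median(scores))
```

Under that reading, most of the scores sit in the 80s and 90s, with a handful of much lower ones.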

25 Google Inc.

by samooresamoore (07 Dec 2009 15:52; last edited on 11 Dec 2009 14:12)

We discuss the history, technology, and business model of Google, Inc.

Class held on 12/07/2009. (student notes; possible questions).

At beginning of class

  1. You should check your email for information about your test.
  2. Your final blog is due today. Be sure that your grade database information is up-to-date.
  3. The only assignment you have left is the term project.
  4. I have a request: Please send a tweet (or multiple) with #bit330 in it and suggest a search engine (or two) that you want to share with the rest of the class on Wednesday (our "Other Search Engine" day). You have blogged about or analyzed multiple search engine-related topics this semester beyond what we covered in class. I would like to make sure that the good ones that you have discovered are shared by all. (And also warn us off of search engines that you think are particularly awful.) Thanks for your help. Please do this by 6pm today (but right now would be okay).
  5. I will be discussing Google today.

Resources for today


26 Other Search

by samooresamoore (08 Dec 2009 19:54; last edited on 09 Dec 2009 17:49)

We explore a not-quite-random collection of search engines that we haven't looked at as a class. Many of these were suggested by students in this class.

Class held on 12/09/2009. (student notes; possible questions).

At beginning of class

  1. Google — review
  2. Rest of the semester:
    1. 12/14: Semester summary, project unveiling
      • By the time class begins, you should have made your site public. You need to keep it public at least until the beginning of next semester (early January). At that point you can do anything you want with it, though I would prefer that it be kept public.
      • If you would like to make a 2 minute presentation to the class about your project (because it's so cool!), please send me an email.
      • By the end of Tuesday 12/15, I would like a short blog entry (it would be approximately one page if printed) on your project site that basically summarizes this information (and covers those things that you would have liked to have covered if you had been given more time). The point of this blog entry is to make it easier for me to grade your assignment. You want to point out the good things about your site so that I don't overlook them. Point out those parts that took particular effort on your part or that you think are deserving of special attention (for some reason) on my part. This entry won't be graded per se; I will use it as a guide while grading your project.
  3. Office hours will end at 4pm. I have a meeting with the Dean about BA201.
    • I will hold office hours on Thursday from 3:30-4:30.

Sites

Video search sites

  1. ChizMax: lyrics and video search engine (Matt K)
  2. CastTV: "casttv is a pretty good video search engine, much better than hulu and comparable to clicker" (Liz J)
  3. Tudou: "One of the largest online video sites in China" (Liz J)
  4. ClipBlast: world's largest video search (Ray Park)
  5. Hello Movies: a great place to find movies to watch (Ray Park)
  6. Videosurf: "for a great video search engine use videosurf it has everything every other video surf engines offers, plus a great interface" (Tim F)
  7. PirateBay: "i dont think you want to be talking about torrenting, but the pirate bay is really good and no pesky ads" (Roberto J)
  8. ScrapeTorrent Torrent Metasearch Engine
Video Search popular in Foreign Countries: Youku and Tudou are popular in China; they have Chinese video content (news, shows, user content) as well as US videos (subbed of course)
  1. Youku
  2. Tudou
Index Video Sites: Indexes shows from a variety of sources, very frequently updated. Shows are usually found a day later
  1. Alluc
  2. OVGuide

Music

  1. Fizy: easy way to find songs and music videos; "Might want to check out music search engine Fizy.com. Simple, but really gets the job done. Like YouTube but with playlists" (Adrienne G)
  2. Songza: "Songza has become a daily staple for me. The playlist feature makes it something I can minimize and leave on for hours." (Nikhil G)
  3. Grooveshark: "Awesome, it has popular playlists, allows you to share playlists with friends and to make as many playlists as your heart desires."
  4. PlayList: "similar to songza, but better layout. great for listening to songs you don't have downloaded, can create playlists" (Diane B)
  5. Music Map: "you type in an artist, and it visually recommends similar artists for you. very cool!" (Diane B)
  6. The Hype Machine follows music blog discussions (Omer I)
  7. Stereo Mood: we've created a way to suggest songs that follow your feelings (Nikhil G)
  8. Musicovery: Like Stereo Mood but with a better interface and the ability to link with your iTunes library (Vitaliy I)

Updated sites

  1. Google Trends: notice the "Hot Topics" and their new real-time search tools.
  2. Google Caffeine - New UI for Google
  3. At Google, notice the new real-time search results under "Latest results".
  4. At Google Finance, streaming news is available. Also note the "Sector summary" at the bottom of the page.
  5. Be sure that you're aware of the Related searches option and the Wonder wheel option.

Other (from students)

  1. Hakia: "I really enjoyed using Hakia this semester and want to tell the rest of the class about it on Wednesday!!" (Rachel B)
  2. WolframAlpha: "for one of the most unique search, or should I say computational knowledge engines, check out wolframalpha.com the site is sick…" (Larry W)
  3. Cuil: "if you are interested in the very basics about a topic you know little about, www.cuil.com is pretty useful. don't expect more tho" (Mike D)
  4. Living Stories: "The Living Stories project is an experiment in presenting news, one designed specifically for the online environment. The project was developed by Google in collaboration with two of the country's leading newspapers, The New York Times and The Washington Post." (from the home page of the site)
  5. Yahoo Glue: "pretty sweet. Similar to Kosmix. Kind of like a meta-search engine." (Joe K)
  6. PDF Database (Ray Park)
  7. Scribd: "Good website full of all types of freely downloadable documents, mostly pdfs. You can see the files on the website before you download them, which is helpful." (Dan B)
  8. Yebol: "Yebol is a new search website that could be useful if you are looking for a directory search, but don't use it for a regular search" (Michael A)
  9. Gigablast: "Gigablast was an awesome search engine that we didn't explore in depth. It returns a bunch of relevant results and great news sites" (Isaiah M)
  10. Sport Search (Andrew C)
  11. Funlus: "the game search engine" (Joe C)
  12. StumbleUpon: "great for just finding cool websites online/wasting time" (Tim F)
  13. Googlism: "an opinion site that may turn some unflattering results" (Rob L)
  14. Zebra Tickets: "a good site to visit when trying to find and compare ticket prices of concert and events (metasearch)" (Rob L)
  15. Foodieview: "if you are looking for a good cooking search engine try http://www.foodieview.com/, lots of cool customization" (Dan B)
  16. Epicurious: A good search tool for recipes and more. Also has a good community aspect.
  17. Boorah: "a restaurant review site" (Ran F)

Health search

Law

Shopping

Other (from me)

  1. CoolIris — the coolest browser search plugin AWESOME
  2. iSeek (about): focuses on ways to refine your search quickly and easily
  3. Quintura (about): visual information web
    • example: [detroit lions], [carbon trading]
  4. SenseBot In-depth search (about): in-depth summaries of Web pages on a topic plus a tag cloud
    • example: [carbon trading] (50 pages and 20 sentences) — really interesting.
  5. Hakia (about): quality, not just popular, search results
  6. Exalead (about): thumbnails, search expansion
  7. Cluuz (about): advanced results display
  8. FactBites (about): results based on content analysis instead of link popularity
  9. Evri (about): summary information, relationships, facts, web search, images, videos
  10. Lexxe (about): lexical analysis and clustered results
  11. Search Cloud (help): use of a “cloud tag” interface to create the queries
    • example: [carbon trading, environment, credits (in decreasing order)]
  12. Tip of my tongue (about): find words
  13. Abbreviations (about): find abbreviations and acronyms by category
  14. CarZen (about): automobile search
  15. IconFinder (about): exactly what it says!
  16. Snooth (about): wine search

27 Wrap up

by samooresamoore (14 Dec 2009 15:06; last edited on 14 Dec 2009 16:17)

We wrap up the semester and all that we have learned.

Class held on 12/14/2009. (student notes; possible questions).

Before class

  1. Requests
    • Please keep this wiki public for the next year (at least). As you know these projects would be a lot of help to next year's class.
    • After I (actually) grade your blogs, I still would like you to transfer your "10" blogs to this course Web site. Again, this will help next year's class.
  2. Reminders
    • Here is the list of student wikis.
    • Make your wiki public.
    • Be sure that all of your blogs (and notes) are in the grades database.
  3. Announcements
    • I will send twitters out when I have updated the grades database over the next 10 days.
    • Don't expect final grades until the middle of next week. Again, I'll send a twitter when I've posted them.
    • No more office hours.
    • I will keep this wiki available and will continue to update it. I plan on using this same address for next year's course wiki.

In-class

  1. Student presentations
    • Liz
    • Billy
  2. My final presentation
  3. Course evaluations (go to CTools)
  4. Discussion
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License