This page contains all the exercises from the whole semester.
01 Introduction
by
samoore (08 Sep 2009 00:48; last edited on 15 Sep 2009 19:46)
Description of what the course is about, what we'll do in the class, why students should take it, why all their friends should take it.
Class held on 09/09/2009. (student notes; possible questions).
My notes
- Introduce myself.
- Go through the class pitch on SlideShare
- At the end of the pitch have a discussion about the merits of the class, what they're thinking, what they think sounds good, what sounds confusing.
- Slideshow
- Take roll; use the Photo Roster
- Course wiki
- Show it
- Everyone who is going to be in the class should join my site (see below).
- Show them the main pages of the site: schedule, instructor info, syllabus, assignments, last year's class info.
- Discuss interesting parts of class from my perspective:
- practical
- class wiki (that is open, evolves, contains blog so that we all can teach each other)
- twitter (I'm drsamoore)
- individualized learning
- project-centric
- your own wiki
- so much changes from year-to-year; for example:
- Yahoo/Microsoft Live to Bing
- Lots of tools disappeared
- Existing tools have improved
- New tools have appeared
- Lots of information available on the Web; I capture stuff that's of interest to me on delicious — specifically, under the bit330 tag.
- Lots of news and blogs available
- the class is revised from last year
- 4 totally new days
- 3 mostly new days
- every other day has changed non-trivially (because of technological changes, if nothing else)
- we learned lots about what tools are good, bad, and indifferent
- web-based almost completely
- little reading but lots of doing
- lots of small tasks; can't fall behind
- Office hours: MTW 3:30-4:30 in the Winter Garden

To do
- Get a twitter account if you don't already have one.
- If you have a cell phone plan with unlimited texting (that is, you are not charged per message), then I'd like you to set it up with twitter.
- I have set it up (for now) so that my phone will receive tweets from 9am-9pm.
- Go to Settings/Devices (from the home page) after you have set up your account.
- Also upload a photo so that I can see who is tweeting me.
- We will do more with this in future classes.
- Sign up as WikiDot member
- Apply for membership to the BIT330 web site.
- Sign up for class notes or questions (as described on the assignments page).
- Sign up for search industry updates (as described on this page).
02 Web Search
by
samoore (11 Sep 2009 15:18; last edited on 06 Oct 2009 15:12)
Discuss basics of Web search and why students should use multiple search tools (rather than just Google).
Class held on 09/14/2009. (student notes; possible questions).
At beginning of class
- Someone should take notes and post them for today's class (sign up here)
- By now you should have done the following. This is not optional. This is not to do later. These sign-ups are due today.
- Become a member of twitter. Follow me, drsamoore.
- How will I use twitter? How can you use twitter?
- Become a WikiDot member
- Become a member of the BIT330 web site
- Read through the major pages of the wiki and look through the rest of it
- Schedule
- Assignments (and all the individual assignment pages). I mean this — read these through carefully!
- Syllabus
- You will want to sign up for
- The structure of today's class is going to be fairly standard for the rest of the semester:
- Start off with announcements and taking questions and comments.
- Lecture for a bit. This lecture will be something of an overview and will provide the motivation and background for the exercises that you will complete and the assignments that you will have to work on.
- Provide some time for you to start on your exercises (which will allow you to explore the specifics of the topic that my lecture introduces).
My notes
- Go through “At beginning of class”
- Take roll.
- Make presentation
- Go through the search syntax page
- Talk about blogging
- Point out the blogging guidelines page
- Point out the exercises for today; they should start working on these as soon as I'm done talking.
- Point out the “To do after class” section on this page
To do after class
- Finish the Web search exercises.
- Think about the following questions:
- Search tools can differ by their functions: generating results, exploring results, and monitoring changes. How do Google, Bing, and Ask differ along the first two dimensions (we'll explore the third later)?
- Look at any one of the search engines we used today (other than Google). Analyze it in terms of the "search experience" that it provides.
- Read Life before Google.
- You should be very, very familiar with the search syntax page by the next class. I'm going to update it to include Ask very soon; and I'm also going to update the Yahoo search information.
Resources
- Life before Google
- Search syntax — for Google, Bing, and (soon) Ask.
- Today's slides on SlideShare and as a PDF
03 Wikidot And Twitter
by
samoore (15 Sep 2009 23:33; last edited on 23 Nov 2009 15:13)
We'll go over techniques and tricks related to using this wiki, which will also be the host of your term project wiki. We will also learn a bit more about twitter, enough to get started using it.
Class held on 09/16/2009. (student notes; possible questions).
Before class
- Keep your cell phone out but put it on vibrate.
- Open your Web browser
- Make sure you are on these pages:
- Take note of these pages:
- Search related feeds (from "Content" menu)
- RSS feed items (from "Content" menu)
- Search syntax (from "Content" menu) — updated for Ask.com information
- On Monday, the "Grades" menu will lead to an online database in which you will "turn in" your assignments.
- Note the new menu structure
- Dynamic "Schedule" menu
- After the wikidot tutorial, be sure to look at how the top menu is put together (i.e., look at the code itself).
- This course Web site is your course Web site.
At beginning of class
- Take roll.
- Talk about assignments
- Term project topics
- Assigned/due dates
- Notes — notes to be posted by the end of the class day
- Questions — questions to be posted (at least first draft) by one week later
- General blog entries — write-up to be posted by the following class
- These will be posted on your own wiki. We'll see how to do this today.
- Industry updates — write-up to be posted on the day listed
- These will be posted on the course wiki. Ditto.
- The biggest challenges with this class
- Knowing what to do
- Staying familiar with the Web site
- Completing the daily exercises and frequent blog assignments so as to not fall behind
- Coming up with an interesting topic for your term project
My notes
Some background
- Today we're going to complete the setup of our twitter accounts and make sure that we know the basics of how to use it.
- Twitter is not an IM Client — "The basic idea behind Twitter is to produce occasional status updates, not hold personal conversations. Conversations with more than one person are exactly what Twitter is for and these should be encouraged, but if it is obvious that there is only one other participant take it off-Twitter to an IM client." (from The ultimate guide to everything twitter, below)
- Uses of twitter for this class
- Ask me questions about class (use #bit330 in message)
- Ask me questions about BBA program (use #rossbba in message)
- I'll remind you about something you need for an upcoming class or some assignment that is due
- Message types (in the US, all messages go to 40404)
- To update your twitter status: "message goes here"
- Direct message to a user: "d username message goes here"
- This is like a direct text message. The message does not appear in your twitter log; it only appears in your direct message outbox.
- Bring a message to someone's attention: "@username message goes here"
- This message appears in your twitter log but is brought to the attention of the person you identified.
- To start following someone: follow username
- To stop following someone: leave username
- To get a user's messages on your phone: on username
- To stop getting a user's messages on your phone: off username
- To stop all messages from going to your phone: off
- Actually, you'll still get direct messages; in order to stop getting direct messages as well, send off again.
- To nudge a person to update twitter: nudge username
- This is encouragement for someone to update their twitter status.
- To get statistics about your account: stats
- To invite a non-twitterer to join the fun: "invite 404 555 1212"
- Hashtags
- Words that are preceded by the hash #
- The hashtag for this class is #bit330
- Use the hashtag whenever you are referring to this class. It will make the message easier to find later. You'll see this in a later class.
- The twitter exercises for today's class can be found here. Do them now.
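The message types listed above can be sketched as a small classifier. This is only an illustration of the syntax in these notes, not how twitter itself processes texts: the function names `classify_message` and `hashtags` are made up for this example, and the rule that a keyword followed by extra words counts as an ordinary status update is an assumption.

```python
import re


def classify_message(text):
    """Return (kind, target, body) for one text message sent to 40404.

    The rules follow the list in these notes; how twitter really resolves
    ambiguous messages is an assumption here.
    """
    words = text.split()
    if not words:
        return ("empty", None, "")
    first, rest = words[0], words[1:]
    key = first.lower()
    # "d username message goes here" is a direct message.
    if key == "d" and len(rest) >= 2:
        return ("direct", rest[0], " ".join(rest[1:]))
    # "@username message goes here" is public but aimed at one person.
    if first.startswith("@") and len(first) > 1:
        return ("mention", first[1:], " ".join(rest))
    # follow / leave / nudge take exactly one username.
    if key in ("follow", "leave", "nudge") and len(rest) == 1:
        return (key, rest[0], "")
    # on / off take an optional username.
    if key in ("on", "off") and len(rest) <= 1:
        return (key, rest[0] if rest else None, "")
    if key == "stats" and not rest:
        return ("stats", None, "")
    # "invite 404 555 1212": everything after the keyword is a phone number.
    if key == "invite" and rest and all(w.isdigit() for w in rest):
        return ("invite", "".join(rest), "")
    # Anything else is a plain status update.
    return ("status", None, " ".join(words))


def hashtags(text):
    """Return the hashtags in a message, lowercased and without the #."""
    return [tag.lower() for tag in re.findall(r"#(\w+)", text)]


if __name__ == "__main__":
    print(classify_message("d drsamoore when is the blog due?"))
    print(classify_message("@drsamoore see you in class #bit330"))
    print(hashtags("Question about the term project #bit330"))
```

Note that a status such as "off to class now" starts with a command word; the sketch treats it as a status update because it has too many words to be an off command.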
Wikidot
- Today we're going to learn about wikidot, get an idea of how to use it, and get more familiar with working with a wiki.
- Wikidot is the host of the course Web site, but it's also going to be the host of your term project Web site.
- You are a member of the course Web site, and you will be the administrator of your own term project site. This means that, while you have total and complete control over your own site, you also have the ability to edit and create pages (but not delete them) within this course Web site.
- I will expect that you will be a very good user of this wikidot site by the end of the semester. Maybe you won't be an expert, but you'll be able to make a wiki that is filled with properly formatted content and useful navigation.
- The wikidot exercises for today's class can be found here. Do them now.
- If you have any questions, you can either send them on twitter or put them at the bottom of this page under "Common questions".
During and after today's class
- Complete all of the twitter tutorial and the wikidot tutorial before next class.
- This includes your twitter account setup (including headshot loaded and identifying personal info), sending a few twitter messages, your wikidot account setup (again, including headshot loaded and identifying personal info), a test blog (on your wiki), a blog listing page (on your wiki), and an introductory blog (on the course Web site).
- Add your information to the student list page before next class.
- Please stop kicking other people off of a page when editing. I'm going to have to come up with different methods of doing these assignments but, until then, please be polite. Thank you.
- This should all be done by Sunday night at 10pm. This is the first part of your participation grade. (I won't always tell you this, but I'm reminding you this first time.)
- Think about what you might do for the term project.
Resources
Wikidot
- Twitter tutorials
- Twitter resources
- URL shorteners
- Popular tweeters — Twitterholic
- Twitter-related news: Twittown
- Story sources
- shared-items-from-rss-feeds — when reading through stories in my RSS feeds, I mark some to "share"; these are those stories
Common questions
If you have questions about wikidot or twitter, add them to the list below. Someone (preferably a student) will answer your question.
About wikidot
- What does the pound sign mean in the menu bar (top menu bar)?
- It means that the menu item cannot be followed — generally it's a menu header or divider.
- How do you delete pages?
- Use the "site options" button at the bottom of the page.
- How do you edit pages?
- Use the "edit" button at the bottom of the page.
- How do you list the pages on the site?
- Use the button on the side bar that says "List all pages".
- Can you have the top menu bar automatically update with the most recent blog entries (or whatever)?
- Yes, look at the top navigation menu for this Web site. Look at "Schedule".
- Is it possible to dynamically generate a table from other pages within a Web site?
- Yes, and you should look at the code I use to generate the "Announcements" or "Schedule" or "Blogs" on the start page.
- How do you put an image hosted on mFile on your page?
- The instructions are discussed on this page.
04 Search Techniques
by
samoore (19 Sep 2009 22:25; last edited on 07 Oct 2009 14:01)
We go over several standard search techniques and strategies.
Class held on 09/21/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” info
- Lecture through the slides (as a PDF)
- Talk through the examples
- Go through “At end of lecture”
At beginning of class
- Look at announcements made since the previous class
- If you're going to ask me a question via twitter or email, first do the following:
- Look at my previous twitter messages at drsamoore
- Look at my recent announcements on the wiki
- If you ask me about a wiki page, use http://bit.ly to send me the link to that page so that I can look at it.
- Do not wait until the last minute to start your assignments.
- There are technical issues that you have to learn related to wikidot. This can't be taught very well over twitter. Maybe you've noticed this?
- My office hours are in the Winter Garden on MTW from 3:30-4:30 (or, generally, when students stop coming by, so I might leave early if no one is there or I might leave late if I'm busy).
- Check who is doing what:
- Notes & questions
- Special blogs
- Let's look at what these folks did for today.
- Blog template trouble
- Many of you are having issues with this. You should have the following pages on your wiki (substitute for myWiki and nameOfMyFirstBlog):
- myWiki.wikidot.com/blog:_template
- myWiki.wikidot.com/blog:nameOfMyFirstBlog
- myWiki.wikidot.com/bloglist
- You need to understand the relationship among these three pages.
- Also, let's fix some blog formatting issues.
- First paragraphs, mainly.
- But also paragraph separation.
- File history information
- Grade tool (what to do???)
- Your first possible blog entry (on today's exercises) could be turned in next class (see the schedule-2009 for details on the timing of blog entries)
- From two classes ago: Why do search engines return different results?
My notes
Search techniques
These are most of the search techniques that we'll cover in today's class.
- Special search syntax — This is the tool at your disposal for targeting your searches on specific parts of documents. Since text in different parts of a document means different things and performs different functions, you can use these operators to raise the precision of your queries.
- Full text search engines
- Title — intitle:
- Site — site:
- Top-level domain — site:
- URL contents — inurl:
- Links — link:
- Unique words and phrases — The use of multiple unique words and phrases is key both to reducing the number of documents that are retrieved and to raising the precision of your queries. Further, using multiple words and phrases increases the chances of retrieving content-filled documents (that is, it increases the number of “meaty” documents).
- They can be used to focus in on more specialized pages that would use those terms
- Gather related words using summaries
- Use search engines to find related words
- Example at Ask.com (both “Narrow your search” and “Expand your search”)
- Google
- Google Suggest feature
- “Related searches” at bottom of search results window
- Yahoo
- Yahoo Search Assist feature
- “Also try” at top or bottom of search results window
- Yahoo Directory (we'll cover this in a future class) can point in the right direction
- Use "means" queries
- Query specificity
- Narrow to more general: this is when you have a really good idea of what you're looking for.
- More general to narrow: this is when you don't know what you're looking for.
- Alternative naming
- People
- Using different name forms can return different information
- Sometimes you have to use other information to differentiate two identically named people
- Also, search specifiers can help target the information (intitle, site type, include, exclude)
- Places
- Use addresses (streets, zips, area codes, phone numbers)
- Use "official"
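The special search syntax operators listed above (intitle:, site:, inurl:, link:) can be combined in a single query. Here is a minimal sketch of a query builder, assuming the operators behave the same way on whichever engine you use. The names `build_query` and `quote` are hypothetical helpers for this example, not part of any search engine's tools.

```python
def quote(term):
    """Wrap a multi-word term in double quotes so it is searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(words=(), title=(), site=None, url=(), links_to=None):
    """Assemble a query string from plain words plus targeted operators.

    words    -- plain words or phrases searched in the full text
    title    -- terms that must appear in the page title (intitle:)
    site     -- a site or top-level domain such as "org" (site:)
    url      -- terms that must appear in the URL (inurl:)
    links_to -- a page that results should link to (link:)
    """
    parts = [quote(w) for w in words]
    parts += [f"intitle:{quote(t)}" for t in title]
    parts += [f"inurl:{u}" for u in url]
    if site:
        parts.append(f"site:{site}")
    if links_to:
        parts.append(f"link:{links_to}")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_query(words=["animal"], title=["tigers"], site="org"))
    print(build_query(words=["dog", "breed"],
                      title=["cavalier king charles spaniel"]))
```

Keeping each operator as a separate argument makes it easy to add or drop one restriction at a time and watch how the number of results changes.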
Sites
These are the best summaries of the major general search engines that I could come up with. I have also linked to several useful help pages for each site.
- Google
- The best, most reliable, fastest, most wide-ranging general purpose search engine. Nice features: Showable "Options" on the left with lots of choices (especially time-related and Related Searches switch). When you're serious about searching, you have to make at least one stop here.
- Useful pages
- Yahoo
- Historically, the second best search engine in terms of returning relevant results. Nice feature: the hideable "Search Assist" box at the top that also shows Related Searches.
- Useful pages
- Ask
- A great search engine for exploring a topic. Nice features: the "Related searches" on the right, the binoculars hiding the page preview and page statistics; also larger images appear on mouse-over. Notice there are sponsored results at the top and bottom of the page.
- Useful pages
- Advanced search tips
- Site features: 1/2
- Bing
- A search engine that focuses on the user experience during the search. Nice features: "More on this page" and "Popular Links" in the pop-up bar on the right; "Related Searches" immediately available on left.
- Useful pages
- Help center
- Tour of Bing's features (video)
Useful settings
Each of these search engines provides a way to set up an account and, thereby, set up preferences. I generally use the following preferences:
- 30-50 results per page — I like the ability to scan more information more quickly
- Filtering (moderate on Google) — I don't want explicit results popping up in the middle of class or a group meeting
- Open search results in new browser window — this keeps the search results up and available so that they're not so easily lost or closed
- Turn on search suggestions — I find these to be amazingly useful as I structure queries.
In-class examples
For most of the following I will (by default) use Google as the search engine as a demonstration of the search technique. For the most part, each of these search engines (other than Bing) could have been used.
Special search syntax example: Information about tigers
- tigers (31.9mm)
- tigers -"Detroit Tigers" (29.0mm)
- tigers animal (4.61mm)
- animal intitle:tigers (1.45mm)
- Tigers (the animal but not any sports teams):
- Google: tigers -detroit -memphis -missouri -baseball -lsu -football -athletics -sports -mlb -soccer -"Louisiana State" (14.4mm)
- Bing: tigers -detroit -memphis -missouri -baseball -lsu -football -athletics -sports -mlb -soccer -"Louisiana State" (5.21mm)
- Yahoo: tigers -detroit -memphis -missouri -baseball -lsu -football -athletics -sports -mlb -soccer -"Louisiana State" (49.7mm)
- Ask: tigers -detroit -memphis -missouri -baseball -lsu -football -athletics -sports -mlb -soccer -Louisiana -State (4.25mm)
- What's wrong with this page?
- Information from an organization
- animal intitle:tigers site:org (25.6k)
- Information from an organization or a government
- Information from a zoo
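The long "tigers but not sports teams" queries above all follow one pattern: the topic followed by a minus sign in front of each unwanted term. Here is a small sketch that generates them. The function name `exclusion_query` and the `negated_phrases` flag are made up for this example; the flag only reproduces the difference between the Ask query and the others in these notes, and the notes do not say why the Ask query splits the phrase.

```python
def exclusion_query(topic, unwanted, negated_phrases=True):
    """Build a query for topic that excludes every term in unwanted.

    When negated_phrases is False, a multi-word term is split into
    separately excluded words instead of one excluded quoted phrase.
    """
    parts = [topic]
    for term in unwanted:
        if " " in term and negated_phrases:
            parts.append(f'-"{term}"')
        else:
            parts.extend(f"-{word}" for word in term.split())
    return " ".join(parts)


# The unwanted terms used in the in-class example.
UNWANTED = ["detroit", "memphis", "missouri", "baseball", "lsu", "football",
            "athletics", "sports", "mlb", "soccer", "Louisiana State"]

if __name__ == "__main__":
    print(exclusion_query("tigers", UNWANTED))
    print(exclusion_query("tigers", UNWANTED, negated_phrases=False))
```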
Unique words and phrases
- Bunch of birds example
- "flock of seagulls" "gaggle of geese" sparrows turkeys
- Lesson: put what you know in the search
- Use "means" and "definition" queries: Hydrocephalus
- Ask — hydrocephalus (300k) — look at "Related searches"
- Yahoo directory — hydrocephalus
- Google — hydrocephalus — 2.0 million documents (2.34 in 2008; 2.26 in 2007); note the "Refine results" part of the page. Also note the “definition” link near the top of the page.
- Google — hydrocephalus means — 1.15mm documents (385k in 2008; 789k in 2007)
- Google — 'hydrocephalus means' — 3280 documents (844 in 2008; 415 in 2007)
- Google — intitle:hydrocephalus (intitle:means OR intitle:definition) — 1460 documents (470 in 2008; 200 in 2007)
- Google — 'hydrocephalus means' (site:edu OR site:org OR site:gov) — 1020 documents (44 in 2008; 131 in 2007).
- Google — define hydrocephalus (359k documents)
- Related words: Investment guidance
- investment guidance — 4.05mm (487k in 2008; 4.48mm in 2007)
- 'investment guidance' — 44.1k (82.8k in 2008; 71.7k in 2007)
- investment guidance financial goals stocks bonds portfolio — 600k (235k in 2008; 1.62mm in 2007)
- 'investment guidance' financial goals stocks bonds portfolio — 872 documents (13.1k in 2008; 10.9k in 2007)
- Fun with quotes
- 'statistical analysis' means — 10.4mm documents (26mm in 2008; 21.5mm in 2007)
- 'statistical analysis' mean — 6.56mm documents
- 'statistical analysis' 'means' — 5.55mm documents (4.73mm in 2008; 7.04mm in 2007)
- 'statistical analysis' 'mean' — 6.57mm documents
- define:"statistical analysis"
- Lyrics
- Google — 'big rock stars' nickelback lyrics 'we all just' 'drugs come cheap' — 34 lyrics (6 results in 2007, and they were all good)
- Google — rockstar nickelback intitle:official video
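Two of the hydrocephalus queries above use a parenthesized OR group, once with intitle: and once with site:. The sketch below shows how such a group is put together. The helper names (`any_of`, `definition_query`, `trusted_sites_query`) are hypothetical, and the sketch uses double quotes for phrases even though some of the examples on this page are written with single quotes.

```python
def any_of(operator, values):
    """Return an OR group such as (site:edu OR site:org OR site:gov)."""
    alternatives = [f"{operator}:{value}" for value in values]
    if len(alternatives) == 1:
        # A single alternative needs no parentheses.
        return alternatives[0]
    return "(" + " OR ".join(alternatives) + ")"


def definition_query(term):
    """Pages with the term in the title plus 'means' or 'definition'."""
    return f"intitle:{term} " + any_of("intitle", ["means", "definition"])


def trusted_sites_query(phrase, domains=("edu", "org", "gov")):
    """An exact phrase restricted to a set of top-level domains."""
    return f'"{phrase}" ' + any_of("site", domains)


if __name__ == "__main__":
    print(definition_query("hydrocephalus"))
    print(trusted_sites_query("hydrocephalus means"))
```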
Query specificity
- Dog breed information
- Google — dog breed cavalier king charles spaniel — 220k documents (355k in 2008; 888k in 2007)
- Google — dog breed 'cavalier king charles spaniel' — 195k documents (890k in 2008; 535k in 2007)
- Google — dog breed intitle:'cavalier king charles spaniel' — 40.1k documents (26.2k in 2008; 15.4k in 2007)
- Yahoo Directory — dog breed 'cavalier king charles spaniel' — 67 documents (69 documents in 2008 and 2007)
- Dog breed disease information
- Google — 'cavalier king charles spaniel' 'heart problem' OR 'heart murmur' OR 'mitral valve' — 4.54k documents (7,710 in 2008; 22,900 in 2007)
- Google — intitle:'cavalier king charles spaniel' 'heart problem' OR 'heart murmur' OR 'mitral valve' — 2.9k documents (250 in 2008)
- Yahoo — dog breed 'cavalier king charles spaniel' 'heart problem' — no documents in the directory
Alternative naming
People
- George Washington information
- 'George Washington' biography -site:com -'Carver' — 941k documents (1.22mm in 2008; 1.06mm in 2007).
- intitle:'George Washington' biography -site:com -'Carver' — 293k documents (218k in 2008; 240k in 2007)
- "George Washington": — one whole category on George Washington, plus 84 other related categories
- Stephen Hawking (as a name example)
- Stephen Hawking — 1.93mm documents (3.61mm in 2008; 2.27mm in 2007)
- 'Stephen Hawking' — 1.86mm documents (3.86mm in 2008; 2.12mm in 2007)
- Note that the 2008 results make no sense when compared with the previous result. At least not given my understanding of how Google should operate.
- intitle:'Stephen Hawking' — 73.4k documents (61.3k in 2008; 63.1k in 2007)
- intitle:"Stephen * Hawking" — 3.1mm documents (9,310 in 2008; 9,190 in 2007)
- intitle:"Stephen * Hawking" OR intitle:"Stephen Hawking" — 628k documents (62,900 in 2008; 75,200 in 2007)
- Note that the 2009 results really make no sense. Look at the number of results for the previous two queries.
- "Hawking, Stephen" — 270k documents (535k in 2008; 241k in 2007) — library and books, mostly
- "Hawking, Stephen W." — 47.6k documents (72.1k in 2008; 53.2k in 2007) — again, library and books, mostly.
- "Hawking, Stephen William" — 75.2k documents (20,400 in 2008; 13,900 in 2007) — lots of encyclopedia type entries.
- Levi Strauss (since there are two/three of them)
- "Levi Strauss" — 1.77mm documents (3.97mm in 2008; 2.24mm in 2007)
- "Levi Strauss" -french -france -philosopher — 1.19mm documents (2.21mm in 2008; 2.06mm in 2007)
- intitle:"Levi Strauss" — 56.6k documents (78,100 in 2008; 68,200 in 2007)
- intitle:"Levi Strauss" -french -france -philosopher — 46.2k documents (66,300 in 2008; 53,700 in 2007)
- intitle:"Levi Strauss" (french OR france OR philosopher) — 9,420 documents (9,680 in 2008; 15,900 in 2007)
- intitle:"Levi Strauss" claude (french OR france OR philosopher) — 883 documents (1,160 in 2008; 556 in 2007)
- intitle:"Levi Strauss" bavaria germany — 310 documents (241 in 2008; 48 in 2007)
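The Stephen Hawking example above tries the same person under several name forms. Here is a minimal sketch that generates those forms from the parts of a name; `name_queries` is a hypothetical helper, and it covers only the forms used in the example (there are certainly others, such as nicknames and titles).

```python
def name_queries(first, last, middle=None):
    """Return quoted phrase queries for common forms of one person's name."""
    queries = [
        f'"{first} {last}"',     # natural order
        f'"{first} * {last}"',   # wildcard stands in for a middle name or initial
        f'"{last}, {first}"',    # catalog order, common in library and book listings
    ]
    if middle:
        queries.append(f'"{last}, {first} {middle[0]}."')  # middle initial
        queries.append(f'"{last}, {first} {middle}"')      # full middle name
    return queries


if __name__ == "__main__":
    for query in name_queries("Stephen", "Hawking", "William"):
        print(query)
```

Any of these can be prefixed with intitle: to reproduce the title-restricted versions of the searches.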
Places
- Pizza places in Ann Arbor
- pizza "ann arbor" — 763k documents (1.19mm in 2008; 887k in 2007).
- Look at all of the information this query has available at the top of the results page.
- pizza "ann arbor" william — 102k documents (547k in 2008; 629k in 2007)
- (734) 669-6973
- pizza 734 "ann arbor" — 109k documents
- The Sears Tower (as a landmark)
- "sears tower" — 1.12mm documents (1.44mm in 2008; 1.49mm in 2007)
- "sears tower" official — 1.12mm documents (1.11mm in 2008; 257k in 2007)
- intitle:"sears tower" — 170k documents (28,800 in 2008; 19k in 2007)
- intitle:"sears tower" official — 37.9k documents (28.7k in 2008; 1,440 in 2007)
- intitle:"willis tower" official — 15.7k documents
At end of lecture
- Start working on today's exercises. The exercises are on this page. You should work on them for no more (but, probably, no less) than another hour outside of class; we will have more time in the next class after the lecture to continue working on them before going on to that day's exercises.
05 More Search Techniques
by
samoore (22 Sep 2009 17:38; last edited on 23 Nov 2009 15:16)
We go through the exercises related to search techniques; we also discuss evaluating sources from the Web.
Class held on 09/23/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- This should include you adding your information to the Grades database.
- Lecture (but no slides today) going through “My notes”
- Work on exercises
- Finish the exercises from last class
- Then work through the exercises for this class
- Work on experiment:
- The instructions for the experiment
- Where you will post the results of your experiment
At beginning of class
Before class starts (for you to do)
- Check who is doing what:
- Notes & questions
- Special blogs (these should be written and posted on this howcanifindit site)
- Other blogs should be written and posted on your own personal site. I will then tell authors of posts that get "10" grades to transfer the blog to the howcanifindit site.
- Look at recent writings for the class:
- Recent announcements
- Twitter stuff:
- Recent blog-entries (you are responsible for these for the test; ask me about them if you have questions, comments, any interest, etc.)
- Stay up with recent information on these pages:
- Content update:
- We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.
Information for me to cover
- Grade stuff
- The Grade Database is evolving, and I need you to help out.
- If you want a grade for this class, I need you to do the following:
- Choose Grades from the Class Info menu.
- Use the Student menu to enter your information.
- Use the Add Grade Record to create your grade record; all you will have to do is enter your uniqname and save it. This gives you a database record where I can put your grades.
- If you have written a blog entry or notes or questions, then enter this information under Add Wiki Assignment.
- Next steps for me are as follows:
- To create more reports so that you can see grades that I have entered as well as your summary grades for the semester. I'll keep you posted.
- To enter your participation grades so far.
- To grade the blogs you have written.
- You should be thinking about the topic for your term project.
- Be sure to read the description of the assignment.
- A student from last year's class re-iterated my point that doing a project about a sports team would not be the best use of your time.
- Also look at the list of industry sectors that you might select.
- On day 8, which is October 5, you are turning in the first status report for your term project. On that date, you need to have decided on the topic, discussed your topic with me, described the topic on the start page of your wiki, and updated your information (that is, indicated the title of your wiki) on the class wiki's list of student wikis.
- Every single one of you needs to meet with me next week during office hours!!
- Another note for your term project. Your term project reports will include a section on information sources (as we will discuss today). Part of this will be an evaluation of the quality of the information sources that you identified. You will want to describe how you evaluate the sources, and indicate on the report your evaluation of each one of them. This will not be a separate deliverable but should be integrated into the final report.
- There are so many blog opportunities from these two classes (i.e., today and Monday). If you want to blog on both classes, you don't have to choose something from “last class” and then something from “this class”. These are both the same topic; you can use any two things you want to blog on from both classes. It doesn't matter if they were the same or different days. (Again, you don't have to blog today, or last class. But you'll have to blog sometime, and you might as well start sooner rather than later.)
- If you want to know how I format anything on any of the wiki pages that I have, you can just look at the source yourself.
- Talk about the experiment.
- Questions about any or all of this?
My notes
Discussion
- Long term research projects, or more difficult queries, require another level of effort and analysis.
- Gather and save as much information as you can.
- Use information from the search results, page characteristics, and contents of the results pages.
- Look for names, contents, concepts, URLs, page titles, unique words, dates, places, facts, etc.
- Create a wiki site to keep information and links.
- Sometimes finding a set of related nouns and unique names can help you find what you need.
- Use Google Sets
- Use the queries ["type of X"], ["there are * types of X"], ["compared to X"], ["X vs." OR "X versus"]
- Evaluate the potential validity of the Web page from which you get information.
- Facets to evaluation
- Location of the page
- Speaker's identity
- Speaker's motivation
- Credibility of sources
- Speaker's history
- Speaker's reputation
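The query templates from the discussion above ("type of X", "there are * types of X", "compared to X", "X vs." OR "X versus") can be filled in for any topic. Here is a short sketch; `related_term_queries` is a made-up name, and the example topic is only for illustration.

```python
def related_term_queries(x):
    """Fill the four related-term query templates with the topic x."""
    return [
        f'"type of {x}"',
        f'"there are * types of {x}"',
        f'"compared to {x}"',
        f'"{x} vs." OR "{x} versus"',
    ]


if __name__ == "__main__":
    for query in related_term_queries("candy bar"):
        print(query)
```

The pages these queries return tend to name sibling items and categories, which you can then feed back into more specific searches.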
In-class examples
Candy bars
- Use of Google Sets
- Google Sets on twix, snickers, butterfinger, "baby ruth", "tootsie roll"
- twix "tootsie roll" butterfinger "baby ruth" snickers "almond joy" skittles starburst "milky way" "kit kat" twizzlers "nestle crunch" "clark bar" — 22 documents
Types of things
Resources
- Sites to use for evaluation
- Resources to use for evaluation
- Evaluating information found on the Internet from Johns Hopkins University
- Information evaluation form from the Cal-Berkeley library.
06 RSS Introduction
by
samoore (27 Sep 2009 23:40; last edited on 17 Dec 2009 19:49)
We are going to introduce the topic of RSS feeds, blogs, subscribing to RSS feeds, and the basics of searching for RSS feeds.
Class held on 09/28/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- Go through today's slides (as a PDF).
- Demonstrate feed readers: Google Reader and bloglines.
- Work on exercises.
At beginning of class
Before class starts (for you to do)
- You probably need to talk with me about your project during office hours (MTW, 3:30-4:30).
- Update the student list page if you have already talked with me. Do this now.
- You should go to the Grades database for this class.
- If you haven't already done so, enter your personal information under "Student".
- If you haven't already done so, enter your uniqname under "Add Grade Record".
- If you have completed a wiki-based assignment (blog entry, industry update, notes, or questions), then enter that information under "Add Wiki Assignment".
- Do this now.
- I will grade blogs by next class. Why? I want to have more turned in before I assign these grades.
- Check who is doing what:
- Notes & questions
- Notes posted by the end of the class day.
- Questions (at least the first draft) posted by one week later.
- Special blogs (these should be written and posted on this howcanifindit site)
- Post these by the beginning of class on the day you sign up for.
- Other blogs should be written and posted on your own personal site. I will then tell authors of posts that get "10" grades to transfer the blog to the howcanifindit site.
- You can post these by the following class (since they are about what we do in class).
- Assigned/due dates
- Notes — notes to be posted by the end of the class day
- Questions — questions to be posted (at least first draft) by one week later
- General blog entries — write-up to be posted by the following class
- These will be posted on your own wiki. We'll see how to do this today.
- Industry updates — write-up to be posted on the day listed
- These will be posted on the course wiki. Ditto.
- Look at recent writings for the class:
- Recent announcements (this page now contains recent tweets related to this class!)
- Twitter stuff:
- Recent blog-entries (you are responsible for these for the test; ask me about them if you have questions, comments, any interest, etc.)
- Stay up with recent information on these pages:
- Content update:
- We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.
I'll go over
- Problems with wikidot (and the class wiki) over the weekend.
- Web sites go down. They come back up. They work most of the time. But they don't work all of the time.
- Your work planning must take this into account.
- Use this very small assignment as a learning opportunity to apply to your tasks for the rest of the semester (or, possibly, your life).
- I will give you the results of the experiment on Wednesday. I've delayed this so that I can get the results of students who did not complete the assignment in time.
- What this class should be like so far
- Any questions about this class so far this semester? Where we're going? Anything at all?
My notes
The Internet is changing all the time. New resources are being added at a phenomenal pace in millions of different sites. You can't keep up with everything on your own. You need help.
It's all about getting computers to work for you, to work while you're not using them. Use the computer to search through information so you don't have to. Use the computer to deliver information to your email inbox or to a specific Web page so you don't have to go get it. You don't have to remember to run the query.
You still have to define the search. You probably have to spend more time up-front when defining the query.
- RSS
- What it stands for
- Really Simple Syndication
- Rich Site Summary
- RDF Site Summary
- RSS is an application of XML.
- RSS is an open definition so anyone can use it.
- RSS is a standard widely adopted by millions of Web sites
- If you have a Web site that is updated relatively frequently, it makes sense to put these updates into an RSS feed.
- Compare HTML and XML
- For our purposes, what are the benefits of XML (and, hence, RSS)
- Can easily be translated into HTML for display purposes
- Can specify "fields" that can be searched
- So, what does this mean for RSS?
- RSS is a common representation for lots of databases and lots of Web sites
- This common representation means lots of tools can be specially written to work with that standard (send it, search it, slice it, dice it)
- So, what does this mean for you?
- Saved time
- Saved attention
- Classes of RSS feeds
- Blogs
- Newspaper articles
- Online feed readers
- Why not a feed reader application?
- Where can you find RSS feeds
- In RSS feed directories (with search)
- In searchable subject indices of RSS feeds (with browsing)
- On RSS-enabled Web pages
- As keyword-based feeds created at search engines
- Types of RSS feeds
- Static feeds
- Keyword-based feeds
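The two XML benefits noted above (fields that can be searched, and easy translation into HTML) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feed below is made up, and read_items, search_field, and to_html are hypothetical helper names. Only the standard-library xml.etree.ElementTree and html modules are used.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up two-item RSS 2.0 feed, used only for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example course blog</title>
    <link>http://example.com/</link>
    <description>Sample feed for illustration</description>
    <item>
      <title>Carbon trading update</title>
      <link>http://example.com/carbon</link>
      <description>New rules for carbon markets.</description>
    </item>
    <item>
      <title>Feed readers compared</title>
      <link>http://example.com/readers</link>
      <description>Online readers versus desktop applications.</description>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return each <item> as a dict of its title, link, and description fields."""
    root = ET.fromstring(feed_xml)
    fields = ("title", "link", "description")
    return [
        {field: item.findtext(field, default="") for field in fields}
        for item in root.iter("item")
    ]


def search_field(items, field, keyword):
    """Keep the items whose chosen field contains the keyword (case-insensitive)."""
    return [item for item in items if keyword.lower() in item[field].lower()]


def to_html(items):
    """Render the items as a simple HTML list of links."""
    rows = [
        f'<li><a href="{escape(item["link"], quote=True)}">{escape(item["title"])}</a></li>'
        for item in items
    ]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


items = read_items(SAMPLE_FEED)
matches = search_field(items, "title", "carbon")
print(to_html(matches))
```

Because every feed uses the same item fields, the same three functions would work on any RSS 2.0 feed, which is the point of having a common representation. Filtering a static feed with search_field is also a rough picture of what a keyword-based feed does.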
Resources
Online feed readers
- General
- Bloglines
- Google Reader
- iGoogle
- My Yahoo
- NewsIsFree
Where can you find RSS feeds
- Top lists
- Featured Reading Lists at Google Reader
- Top 100 Blogs at Technorati
- Most Popular Blogs at Bloglines
- 50 best business blogs
- AllTop
- Blogs.com
- In RSS feed database (with search)
- In searchable subject indices of RSS feeds (with browsing)
- Yahoo Directory — RSS feeds of updates to specific Yahoo Directory categories
- NewsIsFree — directory of RSS feeds
- Financial Investing RSS feeds
- On RSS-enabled Web pages
- As keyword-based feeds created at search engines
- NewsIsFree (paid)
RSS feeds from Wikidot
07 RSS Lab
by
samoore (29 Sep 2009 19:20; last edited on 23 Nov 2009 15:04)
We are going to work through more exercises allowing you to explore RSS feeds and related tools.
Class held on 09/30/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- No lecture today.
- Work on exercises for today.
- Complete the experiment before this Sunday. You should put your results here.
At beginning of class
Before class starts (for you to do)
- Term project
- I thoroughly enjoyed meeting with most of you during the last 10 days or so and discussing your term projects. I'm really looking forward to seeing these projects develop.
- Make sure that you have updated the student list page if you have talked with me about your term project topic. Do this now.
- The first status report is due by the beginning of class on October 5.
- You should go to the Grades database for this class.
- If you have completed a wiki-based assignment (blog entry, industry update, notes, or questions), then enter that information under "Add Wiki Assignment".
- You can check to see if I have recorded any grades for you on the SiteMaker page.
- I have graded everything that was submitted correctly through 9/26/09. I'm catching up!
- All blog entries, industry updates, notes, and questions are points out of 10.
- A “9” grade on a blog is what I would call a “normal, high-quality, well-written, informative blog entry.” A “10” means that you exceeded this standard. Your entry was somehow more informative, more insightful, more engaging (don't discount this — I very much welcome reading an interesting well-written entry with a good story integrated into it) than my expectations.
- If you get a “10” on a blog entry, I want you to copy your blog entry from your wiki to my wiki. Create a page with the same name (i.e., “blog:XXX”) but it should be in the class wiki. Do this as soon as you see your grade. Thanks. This gives other people the chance to learn about 1) what you wrote about in your blog, and 2) what a well-written blog entry looks like.
- Check who is doing what:
- Notes & questions
- Notes posted by the end of the class day.
- Questions (at least the first draft) posted by one week later.
- Special blogs (these should be written and posted on this howcanifindit site)
- Post these by the beginning of class on the day you sign up for.
- Other blogs should be written and posted on your own personal site. I will then tell authors of posts that get "10" grades to transfer the blog to the howcanifindit site.
- You can post these by the following class (since they are about what we do in class).
- Assigned/due dates
- Notes — notes to be posted by the end of the class day
- Questions — questions to be posted (at least first draft) by one week later
- General blog entries — write-up to be posted by the following class
- These will be posted on your own wiki. We'll see how to do this today.
- Industry updates — write-up to be posted on the day listed
- These will be posted on the course wiki. Ditto.
- Look at recent writings for the class:
- Recent announcements (this page now contains recent tweets related to this class!)
- Twitter stuff:
- Recent blog-entries (you are responsible for these for the test; ask me about them if you have questions, comments, any interest, etc.)
- Stay up with recent information on these pages:
- Content update:
- We now have some information on the notes page and questions page. Even if people are not signed up for specific days on the class notes page, I would still recommend that you post notes and questions. The more good information that is on these pages, the more that I will be able to use this information on the tests, and the better that you will do on the test — as opposed to me coming up with some random, poorly-worded question that you have to guess on.
I'll go over
- Feedback about blogs
- word usage lesson: it's versus its
- spelling lesson: "definitely"
- spelling lesson: lose/loose and choose/chose
- Specifics about grading criteria
- Notes: useful, complete, formatted so that they are easy to scan.
- If we don't have a lecture, then summarize what a student should have learned from the in-class exercises.
- Questions — variety, depth of coverage, usefulness of questions.
- Again, if we don't have a lecture, then questions should come from what students should have learned from the in-class exercises.
- Blogs — informs the reader, personal reaction, insight, context, detailed.
- Web search experiment results
- As for today's experiment… you should complete the experiment before this Sunday at 5pm. You should put your results here.
- Note that your results are supposed to go in alphabetical order by family name. Please do this. And if you see that someone before you has messed up — go ahead and fix it!!
08 News Search
by
samoore (02 Oct 2009 15:45; last edited on 17 Dec 2009 19:46)
We learn about the major news search tools, as well as how to integrate them with your knowledge of RSS.
Class held on 10/05/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- I'll lecture for a bit using some slides (as a PDF)
- Work on today's news search exercises.
At beginning of class
On your own
- Read the current to-do list on the course home page.
- I will keep this up-to-date for each class. I hope this will make it easier for you to figure out just what it is that you're supposed to be doing (and when) during the semester.
- No grades (too much other prep going on)
Resources
News search
- General news search
- Historical
- Google News Archive Search
- Newspaper Archive (requires paid subscription)
- International
- Local newspapers
Print newspapers
- Can Newspapers Be Saved? Part 1, Part 2
- Bringing history online one newspaper at a time
10 Real Time Information
by
samoore (07 Oct 2009 02:41; last edited on 17 Dec 2009 19:39)
We learn about real time information exchange and social networks, and get an introduction to real time search tools.
Class held on 10/07/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- I'll lecture for a bit — but no slides today
- Work on today's exercises
At beginning of class
On your own
- Read the current to-do list on the course home page.
- No grades (too much other prep going on)
What I'll cover
- I loaded all of the slides from this semester for each day I did a lecture.
- RSS search experiment results
- Search engine analysis assignment
Notes
- When Twitter results are included with other results, they overwhelm everything else.
Communication channels
| Channel | Mobile | Private | Public | Conversation-based | Length | Concurrent |
|---|---|---|---|---|---|---|
| ? | x | x | varied | |||
| chat | x | x | x | x | short | x |
| microblog | x | x | x | ? | short | |
| texting | x | x | x | varied | ||
| x | x | x | x | varied | ? | |
| blogging | x | medium | ||||
Uses of Twitter
| For business | For personal |
|---|---|
| polls | stay in touch with friends |
| advertise events | share photos, videos |
| follow what people are saying about your product | monitor what's really current |
| reminder of events and information | get answer to a question |
| bring attention to Web items, YouTube | |
| provide personal touch with customers & clients | |
Resources
General
- bit330 realtime at delicious
- The ultimate guide for everything Twitter
- An Illustrated Guide To Using Twitter
- Hashtags explained
- 50 Useful Twitter Tools for Writers and Researchers
Real-time search
- Scoopler (about)
- OneRiot (about)
- Collecta — realtime search for blog posts, articles, comments, twitter, flickr, twitpic, youtube
- MicroPlaza — "welcome to your personal micro-news agency. Discover relevant information filtered by the people you follow."
- Surchur — "the dashboard to right now";
- Addict-o-matic — "inhale the web"; instantly create a custom page with the latest buzz on any topic
- seems best for actively tracking a topic versus coming to it with a random query
- Yauba — search that tries to be all things to all people
- FeedMil (about) — "real-time feed search"
General search based on real-time information flow
- Topsy — "a search engine powered by tweets"
Twitter search
- Search at Twitter
- Tweetzi (help) — "real-time Twitter search & trends"
- TweetMeme — "hottest links on twitter"
- TweetGrid (how-to, FAQ) — "create a Twitter search dashboard that updates in real time."
- CrowdEye (about) — "what all the twitter is about"
- Twitter Power Search
- BackType (about) — "a conversational search engine"
- TwiST — "Twitter Search Tool"
- Tweetag
- Combining Twitter search with regular Web search
- Twiogle — "search twitter & google at the same time"
- BingTweets — "fusing twitter trends with bing insights"
Twitter trends
- TwitScoop (watch the live buzz cloud; search; hot trends)
- Twazzup — "search twitter. get real insights."
- Twendz (about) — "exploring Twitter conversations and sentiment"
- Twopular — "trends on Twitter aggregator"
- Twemes (about) — "twitter memes — global tags for twitter" (hashtag grouping and searching)
- Twitt(url)y (about) — "We track and rank what URLs people are talking about on Twitter."
- Retweet Radar — "Finding trends in the mountains of information 'retweet'ed on Twitter."
- Trendistic (help) — "see trends in twitter"
- TweetVolume — find out how frequently specific words appear in tweets.
Distribution
Local search
- NearbyTweets — finds tweets in your neighborhood
- AskTwitr — look at tweets on a map
- TrendsMap — "real-time local Twitter trends"
Twit search
- Twittorati — "Twittorati tracks the tweets from the highest authority bloggers, starting with the entire Technorati Top 100 and soon including many more of the web's most influential voices."
- TwitSeeker (about)— "who you're looking for…by what they're talking about"
- Twibs — Twitter business directory
- LocalTweeps — "a ZIP-code level Twitter directory"
Other Twitter tools
- BackTweets — "search for links on Twitter"
- Cloud.li — real time cloud generation of search results
- TwitterFall — "Twitterfall is a way of viewing the latest 'tweets' of upcoming trends and custom searches on the micro-blogging site Twitter. Updates fall from the top of the page in near-realtime."
- PollDaddy — polls on Twitter
- TwitLinks — "The latest links from the worlds top tech twitter users."
- MySkyStatus — Real time flight tracking updates. Shows your flight's location to twitter and facebook followers
Organization
- TwTask — manage to-do lists from Twitter
- Twit2Do (faq) — "create to-do lists with twitter"
- Postica (faq, twitter) — create and share sticky notes across the Web
Other real-time tools
- uberVU (tour, tools) — "easy way to find and follow conversations"
- ReadTwit (about)— converts Twitter feed into an RSS feed (with URLs un-shortened, filter users in/out of feeds, and filter out #hashtags).
- Near real-time search on Google
- Enable real-time updating of RSS feed readers from publishing blogs
Using Twitter
- TweeTree (about) — "we built this site to help us better use Twitter"
- TweetCloud — "what's being said?"; gives you a tag cloud that helps summarize the tweets of a specific user. Can help you decide whether or not to follow a twit.
- Twitter Karma — helps you see whether or not you are following your followers, whether your followers are following you. (Click on the "Whack!" button after logging in to Twitter.)
12 Research Sites
by
samoore (20 Oct 2009 17:59; last edited on 17 Dec 2009 19:41)
We learn about several kinds of academic research sites. We also learn what the Deep Web is, why we need to care about it, and how we might go about accessing it.
Class held on 10/21/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- I'll lecture for a bit (no slides today).
- Work on exercises.
At beginning of class
On your own
- Read the current to-do list on the course home page.
- No grades (too much other prep going on)
What I'll cover
- Project stuff
- RSS feeds vs current events stuff
- Grading stuff
- Dean
- Priority: status report feedback
My notes
- General Web search
- Suggests that all information can be searched within one system
- Easy and self-explanatory
- Has only a limited understanding of "structure"
- The Invisible Web
- "Invisible" to the general search engines since they don't index it
- You'll hear about the "Invisible Web" or the "Deep Web" — same thing
- Pages that are invisible
- Disconnected page
- Page consisting primarily of images, audio, video
- Flash, Shockwave, compressed files
- Content retrieved as a result of filling out forms
- Real time information (ex: stock quotes)
- Pages that are proprietary
- Significance of the Invisible Web
- Bergman's widely-cited statistic is that there are 550 billion documents in the invisible Web
- Others believe it's more like 20-100 billion
- Estimated that there are about 300K Web sites with queryable databases
- 60 of the largest Deep Web sites contain about 750 terabytes of data
- Academic Web-based search
- More academic content is moving to the Web exclusively
- Part of general trend from print to electronic
- Much of this is contained in the Invisible Web
- Explain how search engines work
- General
- Crawlers go out and send information back to the central database
- Queries go against the central database
- SE company expertise is in design of the index and design of the query process (including input interface and output formatting and reporting)
- Academic
- Crawlers go out, find a database, and what? Index the query interface page? Send some standard queries to the index and save the results?
- Should you consider using Google Scholar?
- Pros
- A cross-database (federated) search engine
- Returns snippets from articles (and sometimes abstracts)
- Indexes the full text (actually, part of the full text) and not just the abstracts and subject terms
- Can link to your own school's library
- Cons
- Secretive about its coverage of specific publishers, journals
- Limits its searches to the first 100-120K of a page
- Hasn't been updated much (at all?) since its launch
- Returns far fewer documents than the native search engines
- Searching by field is fairly unreliable and counter-productive
- What do we want from an academic search engine?
- Comprehensive
- Contains lots of journals over lots of topics
- Goes far back in time
- Up-to-date
- Integrated across databases
- Integrated into a database
- Transparent as to what it contains or doesn't contain
- Recommendation
- Use Google Scholar
- as a way to find free, online versions of articles you already know you want
- like you use Wikipedia — as a good starting place for exploring
- Use the other Deep Web search tools — Scirus, Turbo10, plus the LII.
- To do a complete search, you should definitely talk to a librarian and use the Library's immense set of resources.
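The general search engine design described in the notes above (crawlers send pages back to a central database, and queries run against that database) can be sketched as a tiny inverted index. This is a simplified Python illustration, not how any real search engine is implemented; the pages, URLs, and function names are all made up.

```python
from collections import defaultdict

# Stand-in for pages a crawler has fetched; the URLs and text are made up.
CRAWLED_PAGES = {
    "http://example.com/a": "carbon trading markets in Europe",
    "http://example.com/b": "low carb diets and weight loss",
    "http://example.com/c": "carbon offsets and trading schemes",
}


def build_index(pages):
    """Build the central index: each word maps to the set of URLs containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def query(index, terms):
    """Answer an AND query from the index alone, without revisiting any page."""
    words = terms.lower().split()
    if not words:
        return []
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return sorted(results)


index = build_index(CRAWLED_PAGES)
print(query(index, "carbon trading"))
```

A page that exists only as the answer to a filled-out form never makes it into CRAWLED_PAGES, so no query against the index can return it. That is the Invisible Web problem in miniature.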
In-class demonstration and discussion
- Google Scholar (the gorilla in the room)
- Basics
- intitle:"carbon trading" — 472 (271 citations in 2008)
- Cited by
- Referenced by (under “Related articles”)
- Web search
- Availability at UM library (set up under "Scholar Preferences")
- "Recent articles" vs. "All articles"
- Weird logic — that appears to have been fixed in 2009!
- Subject groups
- intitle:Vietnamese — 11,000 records (9,690 in 2008)
- allintitle:Vietnam — 98,900 records (816,000 records in 2008) (all subject areas)
- allintitle:Vietnam — 23,600 records (29,100 records in 2008) (with all of the subject areas checked)
- allintitle: Vietnam OR Vietnamese — 109,000 records (104,000 in 2008; notice that this is less than the 816,000 found for Vietnam alone above)
- intitle:Vietnamese OR intitle:Vietnam — 109,000 records (105,000 in 2008; this is less than the 816,000 found for Vietnam alone above)
- allintitle: Vietnam OR Vietnamese — 30,000 records (141,000 in 2008) (with all of the subject areas checked)
- intitle:Vietnam OR intitle:Vietnamese — 40,000 records (27,900 in 2008) (with all of the subject areas checked); this should be exactly the same as the previous query.
- Publication year strangeness
- intitle:Vietnam 1435-2008 — 20,200 records
- intitle:Vietnam 1960-2008 — 20,900 records
- intitle:Vietnam 2010-2050 — 2 records
- Scirus (deep web search competitor)
- title:"low carb" "low fat" "weight loss" — 560 hits
- Ability to filter on the left (sources, file types)
- Recommendations of refining your search on the left
- Save or email the results.
- Sort by relevance or date.
- Similar results
- Google Books (book-based)
- UM Library (library-based)
- Biznar (specialized deep web search)
- BNet (another specialized search tool)
- carbon trading
- Content types to right
- RSS feeds
- Wolfram|Alpha (computational knowledge)
- Yahoo Directory (Web site directory)
- Explore Business sites
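The "weird logic" in the Google Scholar counts above comes down to one rule: a query for A OR B should return at least as many records as the query for A alone or for B alone. Here is a short Python check of that rule using counts recorded in the notes. The function name is hypothetical, and the check treats intitle: and allintitle: as equivalent for a single word, which is an assumption.

```python
def or_count_is_plausible(count_a, count_b, count_a_or_b):
    """An OR query should match at least as many records as either term alone."""
    return count_a_or_b >= max(count_a, count_b)


# Counts copied from the in-class notes (all subject areas).
# 2009: Vietnamese 11,000; Vietnam 98,900; Vietnam OR Vietnamese 109,000
print(or_count_is_plausible(11_000, 98_900, 109_000))   # True
# 2008: Vietnamese 9,690; Vietnam 816,000; Vietnam OR Vietnamese 104,000
print(or_count_is_plausible(9_690, 816_000, 104_000))   # False
```

The 2008 figures fail the check, which is why the notes flag them. The 2009 figures pass, consistent with the observation that the problem appears to have been fixed.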
Possible blog entries
There are two possible blog entries related to this class — you can write one, both or neither of these. But I would find these interesting.
- Write a blog entry on what you observed, what you learned and found interesting, focusing on information that other students might find useful.
- Go talk to a Ross librarian. Tell them your topic and ask which 3 to 5 databases or tools you might find most useful given that topic. Use them for a while. By the end of the semester, write a blog entry describing how the information you find in these databases differs from what you would find on the Web at large or what you found in the Deep Web search tools we were introduced to above.
BTW, I would find it rather remarkable if you didn't have in your term project a section or group of resources or something related to information a person could get in a library's database (compared with Deep Web and the Web itself).
Resources
Research tools
Primary
The following sites are traditional Deep Web search sites. Each one takes a different approach to accessing documents in the Deep Web, so they're each worth trying.
- Google Scholar
- About Google Scholar
- Advanced Scholar Search Tips
- Native search engines vs. Google Scholar
- Redux Peter's Digital Reference Shelf: Google Scholar — basically a scathing review of Google Scholar.
- Google still not indexing hidden Web URLs by Hagedorn and Santelli, D-Lib Magazine, 14:7/8, July/August 2008.
- Scirus
- 350 million scientific documents indexed
- Advanced search
- About Scirus: provides a ton of useful details about Scirus.
- Peter's Digital Reference Shelf: Scirus
- IncyWincy — the invisible Web search engine
- Advanced search
- Be sure to investigate the preferences page.
- DeepDyve
Library- and book-based
Each of these tools provides a different way of accessing information in books. Lots of resources are being thrown at Google Books so we should definitely keep our eyes on it as more books enter the system.
- University of Michigan Library
- Google Books
- Amazon Advanced Book Search — Yes, I am including Amazon, the book seller, on this list.
- WorldCat
- 1.4 billion items in more than 10,000 libraries worldwide
- About
- Advanced search
Specialized Deep Web search
Each of these is a deep web search engine but the underlying document sets are specialized.
- Green Info Online
- Review on Peter's Reference Shelf
- Be sure to look under "Search Options", "Advanced Search", and "Visual Search"
- At the top of the screen, be sure to look at "Publications" and "New Features!"
- "GreenFILE offers well-researched information covering all aspects of human impact to the environment. Its collection of scholarly, government and general-interest titles includes content on the environmental effects of individuals, corporations and local/national governments, and what can be done at each level to minimize these effects. Multidisciplinary by nature, GreenFILE draws on the connections between the environment and a variety of disciplines such as agriculture, education, law, health and technology. Topics covered include global climate change, green building, pollution, sustainable agriculture, renewable energy, recycling, and more. The database provides indexing and abstracts for approximately 384,000 records, as well as Open Access full text for more than 4,700 records."
- BNet — management, strategy, work life skills & advice for professionals. This is more of a collection of useful business-related information but I couldn't figure out where else in this course to let you know about it. So here it is.
- Biznar — deep web business search
- Advanced search
- About and Help
- Mednar — deep web medical search
- Advanced search
- About and Help
- ScienceResearch.com — "the world's science all in one place"
- Science.gov
- Advanced search
- About and Help
- "Great dot-gov Web sites"
- "Deep Web Technologies powered federated search engine Science.gov takes government to next level" — "searches 38 government science collections, and comprises over 200 million pages of science information"
General reference and answers
Each of these sites provides access to sets of facts and answers to questions. The first is a computational knowledge engine and the other sites have well-organized sets of traditional articles and entries about specific topics.
- Wolfram|Alpha
- About
- Examples
- Wolfram|Alpha and Google Face Off (from Technology Review)
- I was positively impressed with Wolfram|Alpha (Doug Lenat)
- Wolfram Alpha gets mixed reviews (WSJ)
- Information Please Almanac
- Encyclopedia.com
- Britannica
- Wikipedia
Secondary deep web sites
These are worth peeking at if you need some more information. Each one of these provides reliable resources.
- InfoMine (UCal, Riverside)
- Isn't being updated any more but still seems useful
- Directory of Open Access Journals — 1673 (1262 in 2008) journals are searchable at the article level, 319,861 (211,294 in 2008) articles.
- Bing
Web directories
The purpose of each one of these sites is to provide organized and categorized sets of Web sites that have been evaluated for usefulness. Each one of these is worth looking at to see if you might get lucky.
- Yahoo Directory
- Google Directory
- Intute — "Helping you find the best websites for study and research"
- Librarian's Internet Index
- Overview: describes who they are, what they do, and what you might expect to get from looking at their site.
- Internet Public Library
Pay sites
Each one of these sites is quite useful but they require you to pay so I'm guessing you are out of luck; however, when you get out to the working world remember that these exist. You might be able to gain access to them through your employer.
In development
I have this listed here just so that I can remember to look at it in future years to see if it has evolved into something more useful than its current condition.
- Q-Sensei
- Includes the Library of Congress (I believe).
- DeepPeep
- About
- "DeepPeep is a search engine specialized in Web forms. The current beta version tracks 13,000 forms across 7 domains."
Dead
Each one of these was a viable deep web search engine but now they are not worth investigating or don't exist in any form.
- CompletePlanet — 70K databases (but appears to be dead as of 2004!)
- Turbo10
- Microsoft Live Search Academic — closed down in May 2008.
- OAIster: find the pearls
- Integrated into WorldCat in October 2009
Articles
- Exploring a 'Deep Web' that Google can't grasp, NYTimes, February 22, 2009
- Accessing the Deep Web
- Exploring the academic invisible Web
- Google Scholar revisited by Peter Jacso, Online Information Review, 32:1, 2008, pp. 102-114.
- The Deep Web: Surfacing hidden value
- As summarized by the editor of The Journal of Electronic Publishing: "Michael K. Bergman, whose BrightPlanet company offers a new approach to search engines, examines the wealth of information that is available only on dynamically created Web sites, those that don't exist except as relational databases until someone seeks information from them. As more sites adopt the dynamic approach to pages, they are creating a challenge for standard search engines. This article looks at some alternatives."
- Search engine technology and digital libraries: Libraries need to discover the academic internet
- Google Scholar -- a new data source for citation analysis, by Anne-Wil Harzing, February 5, 2008 (7th version).
E-books
- Google Book Search
- Project Gutenberg
- American Memory (by the U.S. Library of Congress)
- Million Book Project
- Google Electronic Text Archives
Other
- BrightPlanet
- Academic databases and search engines (wikipedia)
- Other databases of interest
- Science.gov
- Medline Cognition
- PubMed
- Infovell — the Deep Web of Life Sciences.
13 Change Notification Tools
by
samoore (25 Oct 2009 14:24; last edited on 23 Nov 2009 15:05)
We are going to discuss different tools that can notify you in different ways and in different circumstances when some specific thing has changed on the Web: email alerts, page monitoring software, and RSS feed-manipulation software.
Class held on 10/26/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- I'll lecture for a bit (no slides today).
- Work on exercises.
At beginning of class
On your own
- FYI, I added a scanned copy of the diagram I created for the real-time information class.
- Read the current to-do list on the course home page.
- No grades (too much other prep going on)
- I am about 1/4 of the way through the status reports. I'm working on them, I promise.
- I haven't graded any blogs in a very long time.
My notes
Monitoring changes
- Email alert service
- Monitor entire site
- These are set up by the Web site and you subscribe to them
- No false positives
- Sometimes you want email (cell phone! or even Messenger)
- Page monitors
- Monitor specific pages (but not sites)
- Lots of false positives unless keyword based
- RSS feeds
- Problem: False positives
- Unless keyword based or filtered somehow
- Focused RSS feed: If you're lucky, the site offers a keyword-based or topic-specific RSS feed that you can subscribe to.
- General RSS feed: If there's simply a general RSS feed (such as "Yahoo breaking news"), then you should run that feed through a keyword tool:
- FeedRinse
- Yahoo Pipes (if other processing is needed)
- The following are useful if there's no RSS feed available on a page but you would like to set one up:
- FeedYes: I would try this first since it's the easiest to use when setting up a feed.
- Feed43: This is more powerful but more difficult to use.
- Dapper: This is another powerful tool.
- Why not just use RSS
- Some sites don't have RSS feeds
- So use site-based email alerts
- Or use a tool to make an RSS feed
- Some information isn't site based
- So use search-based email alerts
- Some information is too fine-grained to be covered by RSS feeds
- So use page monitors
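The choices above can be read as a small decision procedure. Here is a minimal sketch of it in Python; the function name, the parameter names, and the wording of the suggestions are my own illustration, not part of any tool.

```python
def suggest_tools(site_based, has_feed=False, feed_is_focused=False,
                  site_offers_alerts=False, fine_grained=False):
    """Return monitoring suggestions that follow the notes above.

    All names here are hypothetical; this only restates the outline as code.
    """
    if not site_based:
        # Information spread across many sites: monitor a search query instead.
        return ["search-based email alert"]
    if fine_grained:
        # Changes too small for a feed to report: watch the page itself.
        return ["page monitor"]
    if has_feed:
        if feed_is_focused:
            return ["subscribe to the focused RSS feed"]
        # A general feed produces false positives unless it is filtered.
        return ["filter the general RSS feed by keyword"]
    suggestions = []
    if site_offers_alerts:
        suggestions.append("site-based email alert")
    # With no feed available, one can still be built from the page.
    suggestions.append("create an RSS feed from the page")
    return suggestions


print(suggest_tools(site_based=True, has_feed=True))
```

Real situations are messier than five yes/no questions, so treat the output as a starting point.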
Email alerts
Finding email alerts
- Search for email alerts
- Query: "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (189 million in 2009) (77.2 million in 2008) (60.2 million in 2007)
- Yahoo results (464 million in 2009) (392 million in 2008)
- More specific search for email alerts
- Query: inurl:mail OR inurl:alert "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (88,100 in 2009) (115,000 in 2008)
- Yahoo results (280,000 in 2009) (242,000 in 2008)
- Science email alerts
- Query: science "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (38.6 million in 2009) (17.1 million in 2008) (2.34 million in 2007)
- Yahoo results (72.3 million in 2009) (69.9 million in 2008)
- INURL query
- Google results (8,140 in 2009) (382,000 in 2008)
- Yahoo results (155,000 in 2009) (85,600 in 2008)
- Copper email alerts
- Query: copper "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (757,000 in 2009) (384,000 in 2008)
- Yahoo results (5.49 million in 2009) (2.75 million in 2008)
- INURL query
- Google results (475 in 2009) (531 in 2008)
- Yahoo results (3,560 in 2009) (835 in 2008)
- So, think about how you might apply this both to a company you are interested in or an industry you are interested in
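To apply the same queries to your own company or industry, it may help to generate them rather than retype them. This is a small sketch; the function name is made up, and placing the topic word after the inurl: terms in the combined query is my assumption, since the notes do not spell out the exact INURL query used with a topic.

```python
# The four phrase variants used in the queries above.
ALERT_PHRASES = ['"email alerts"', '"e-mail alerts"', '"email alert"', '"e-mail alert"']


def alert_query(topic="", restrict_url=False):
    """Build a search query for pages that offer email alerts.

    topic        -- optional word such as "science" or "copper"
    restrict_url -- if True, prepend the inurl: terms from the more specific query
    """
    parts = []
    if restrict_url:
        parts.append("inurl:mail OR inurl:alert")
    if topic:
        parts.append(topic)
    parts.append(" OR ".join(ALERT_PHRASES))
    return " ".join(parts)


print(alert_query("copper"))
print(alert_query(restrict_url=True))
```

Paste the printed query into the search engine of your choice; result counts will differ from the ones recorded above.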
General email alert services
- Yahoo Alerts
- Some types of alerts
- Broad-ranging alerts
- Breaking news (via SMS, Email, or IM), Local News (via email)
- More specific
- Keyword News (via Mobile, Email, IM), Stocks Watch (via Mobile, Email, IM)
- Be sure to look over the whole list of categories of alerts.
- Google Alerts (help)
- All of this is based on submitting queries
- Once a day
- Once a week
- "As it happens"
- Broad-ranging alerts
- Web & comprehensive alerts
- More specific
- Keyword-based alerts for news, blogs, video and groups
- Can receive as email or as an RSS feed
Page monitoring software
Overview
Page monitors were the next big thing five years ago. A page monitor is a program, either downloaded software or a Web-based service, that downloads a Web page each day (or at whatever interval you set) and sends you an email if the page differs from the previous copy. Some monitors tell you what has changed, while others just tell you that something has changed.
At first, you might not be that impressed with page monitors. But once you realize that they can be used for a lot more than news, they can be quite useful tools. WatchThatPage.com is the best free site.
WatchThatPage has a limit of 250 characters for the URL. Also, shortened URLs (from tinyurl.com or bit.ly) do not work. To get around these limits, use TrackEngine, which has neither problem.
- Capabilities
- Automatically determine if a Web page, or part of a Web page, has changed
- Results might be delivered via email, RSS feed, or a summary Web page
- Page Monitoring Software Examples
- Track a company's press release page (Goldman Sachs)
- Find out when a new version of software is released (BBEdit)
- Find out when a new product is released (Canon cameras)
- Track a product category (Flat panel LCD TVs at Amazon)
- Monitor product information (comments about a movie at Amazon)
- Track auctions
- Track new jobs
- Monitor earnings releases (at JPMorganChase)
- Track who is linking to you (e.g., link:pogue.blogs.nytimes.com/ -site:pogue.blogs.nytimes.com David Pogue)
- Follow investment information about a company (e.g., American Express at The Motley Fool)
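The core of a page monitor can be sketched in a few lines: compare yesterday's copy of a page with today's, and report what was added. The sketch below is an illustration only, not how WatchThatPage or TrackEngine actually works. It operates on two text snapshots that you supply (a real monitor would fetch and store them on a schedule), and the function names and sample text are made up.

```python
import difflib


def added_lines(old_text, new_text):
    """Return lines present in the new snapshot but not in the old one."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff marks lines that appear only in the new text with a "+ " prefix.
    return [line[2:] for line in diff if line.startswith("+ ")]


def check_page(old_text, new_text, keywords=None):
    """Return the added lines worth reporting; an empty list means no alert.

    If keywords are given, only added lines mentioning one of them are kept,
    which is how keyword matching cuts down on false positives.
    """
    changes = added_lines(old_text, new_text)
    if keywords:
        wanted = [k.lower() for k in keywords]
        changes = [c for c in changes if any(k in c.lower() for k in wanted)]
    return changes


# Hypothetical snapshots of a press release page on two consecutive days.
yesterday = "Press releases\nQ2 earnings announced\n"
today = "Press releases\nQ3 earnings announced\nQ2 earnings announced\nNew office opens\n"

print(check_page(yesterday, today))
print(check_page(yesterday, today, keywords=["earnings"]))
```

Without keywords, every added line triggers a report; with the keyword filter, only the earnings line does.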
Web-based
- WatchThatPage
- Free (for any number of pages), or $20/year for priority service
- Can highlight changes in pages
- Changes sent in an email
- Keyword matching
- This site doesn't appear to be updated any more (4+ years)
- TrackEngine
- Free for 5 bookmarks, or $20/year for 10 pages, or $59/year for 50 pages
- Highlights new content in HTML email
- Monitors changes daily
- Does do keyword matching
- This site hasn't been worked on for 7+ years
- Other possible sites: InfoMinder, ChangeDetect, Trackle
Windows software
- WebSite-Watcher
- Free for 30 days; $45 purchase
Feed creation software
Overview
- Capabilities
- Demonstrations
- Demonstration with Feed43 and the JPMorganChase Annual report (the feed)
- Demonstration with Feed43 and a Google Web search results page
- Demonstration with Feed43 and the Goldman Sachs press release page
Make a feed
From other feeds
- FeedRinse: From their site, “Feed Rinse is an easy to use tool that lets you automatically filter out syndicated content that you aren't interested in. It's like a spam filter for your RSS subscriptions.”
- Can test on this page: http://feeds.nytimes.com/nyt/rss/companies
- Yahoo Pipes
- FeedZero: This uses adaptive filtering software to learn what feed articles you like, and which you don't like, based on your input.
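To make the idea of keyword filtering concrete, here is a rough sketch of what a filter does to a general RSS feed. It is not FeedRinse's or Yahoo Pipes' actual implementation. The sample feed is invented, the function name is hypothetical, and a real filter would download the feed from its URL rather than read it from a string.

```python
import xml.etree.ElementTree as ET

# A tiny, made-up RSS 2.0 feed standing in for a general news feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Companies</title>
<item><title>Copper prices rise</title><description>Mining shares gain.</description></item>
<item><title>Retail sales flat</title><description>Holiday outlook unclear.</description></item>
</channel></rss>"""


def filter_items(rss_xml, keywords):
    """Return (title, description) pairs for items mentioning any keyword."""
    root = ET.fromstring(rss_xml)
    wanted = [k.lower() for k in keywords]
    kept = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        text = (title + " " + description).lower()
        # Keep the item only if at least one keyword appears in it.
        if any(k in text for k in wanted):
            kept.append((title, description))
    return kept


print(filter_items(SAMPLE_FEED, ["copper"]))
```

The same loop with the test reversed would give you a "block" filter instead of an "allow" filter.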
From a page
- Dapper
- Description: Dapper is pretty slick. You can look through user created Dapps or you can (easily) create your own. Don’t forget to use the “get a nice short url” option and create your own that is easier to look at/use. This allows you to get an RSS feed for more things (instead of just news and blogs) such as searches.
- The Glory, Bliss and How-to of Screen Scraping for RSS
- Demo
- Video tutorial
- Useful Dapps
- FeedYes
- Create feed from http://www2.goldmansachs.com/our-firm/press/press-releases/index.html
- Feeds will work for 14 days, then you have to pay $30 per year
- Feed43
- Feed43 is a little bit more complicated. You have to find the actual html within the source code of the page.
- Define extraction rules: find the specific places within the page's HTML source that hold the information you want the RSS feed to monitor. The program gives directions for the specific pattern code to use.
- Then click extract
- Then you can give it a title, description, URL, etc.
- Then specify where each item's title, date, and so on are found
- If these sites are updated once a month, it's too much of a hassle to make one of these (use a page monitor). But if it is updated daily and you want to monitor it, then it might be a good idea to make one!
- Free, or $29/year for 20 hourly updates
- My feeds
Examples
- From TrackEngine
- TrackEngine help
- TrackEngine tutorials
- TrackEngine hot lists
- InfoMinder examples
- ChangeDetect examples
- Feed43 example
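The Feed43 extraction rules described earlier are easier to understand with a small imitation. As I understand Feed43's pattern language, {%} marks a value to capture and {*} marks text to skip; the sketch below translates such a pattern into a regular expression. This is my own approximation, not Feed43's code, and the HTML snippet and its paths are invented.

```python
import re


def pattern_to_regex(pattern):
    """Translate a {%}/{*} search pattern into a compiled regular expression."""
    parts = re.split(r"(\{%\}|\{\*\})", pattern)
    regex = ""
    for part in parts:
        if part == "{%}":
            regex += "(.*?)"           # capture this value
        elif part == "{*}":
            regex += ".*?"             # skip over anything here
        else:
            regex += re.escape(part)   # literal HTML must match exactly
    return re.compile(regex, re.DOTALL)


def extract_items(html, pattern):
    """Return one tuple of captured values per repetition of the pattern."""
    return pattern_to_regex(pattern).findall(html)


# Hypothetical source code of a press release page.
PAGE = """<ul>
<li><a href="/press/2009-10-15.html">Third quarter results</a> <span>Oct 15</span></li>
<li><a href="/press/2009-09-30.html">New board member</a> <span>Sep 30</span></li>
</ul>"""

RULE = '<li><a href="{%}">{%}</a>{*}<span>{%}</span></li>'

for link, title, date in extract_items(PAGE, RULE):
    print(title, "|", link, "|", date)
```

Each captured tuple supplies the link, title, and date of one feed item, which is the mapping you fill in after clicking extract.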
Email filtering
- Gmail
- Limit around 7.2GB (4.5GB in October 2007)
- Can use a filter
- To forward just some emails (to different people?)
- To apply a label to emails
- Plus addressing
- A powerful technique for email alerts is “plus addressing”: when you sign up for an alert (e.g., one based on some query), give your address as dummy+someQueryIdentifier@gmail.com instead of the normal dummy@gmail.com. Mail sent to the plus address still arrives in your regular account, so you can filter on whatever comes after the plus. This is extremely helpful because it makes alert emails much easier to filter.
- Description for GMail
- Use a different address for each email alert
- Helps you filter
- Helps you track who is selling your email address
- Defining a filter
- Keep definition to a minimum, as simple as possible
- Test, test, test
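Plus addressing is simple enough to show in code. The sketch below builds a plus address for each alert and recovers the tag from an incoming recipient address, which is the part a filter would match on. The function names are hypothetical; dummy@gmail.com is the placeholder address from the notes.

```python
def plus_address(address, tag):
    """Insert +tag before the @ of an address."""
    local, domain = address.split("@", 1)
    return f"{local}+{tag}@{domain}"


def tag_of(address):
    """Return the text after the plus in an address, or None if there is none."""
    local = address.split("@", 1)[0]
    if "+" not in local:
        return None
    return local.split("+", 1)[1]


alert_address = plus_address("dummy@gmail.com", "someQueryIdentifier")
print(alert_address)
print(tag_of(alert_address))
```

In Gmail itself you would not run code; you would define a filter on the "To" field that matches the full plus address and applies a label.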
Tools you now have at your disposal
- Method to follow to find site-based email alerts
- Tools to create search-based email alerts
- Tools to monitor Web pages for any changes to their contents
- Tools to apply keyword-based filters to RSS feeds
- Tools to convert tabular Web page content to an RSS feed
Your term project
Email alerts and your term project
You should do the following for your project wiki:
- You should figure out some way that you are going to document the email alerts that you use in your email account to route your incoming alerts. Maybe print the alert page to a PDF file and link it to your wiki? Maybe take a screenshot of your email inbox and highlight the email alerts?
- In either case, you are going to want to have a section in your wiki called "Email alerts".
- On this page you should describe each of the email alerts that you used: the page from which you subscribed to it, why it is useful, and if there are any keywords (or such) that you used to generate it.
All of the above also applies to your page monitors, any feeds you create using FeedYes/Feed43/Dapper, and any feeds you filter using FeedRinse or Yahoo Pipes.
Possible blog topics
You do not have to write a blog. These are suggested blog topics if you were to write one. There are lots of possibilities in this class.
- Describe different ways that you found these tools useful (or not useful).
- Describe how you used Yahoo Pipes, possibly differently than how we have described them here.
Hints about possible test questions
You're definitely going to be held responsible for the following topics:
- What WatchThatPage (as an example of a page monitor) can do
- What Dapper can do
- What Feed43 can do and how its search patterns work
- What Yahoo Pipes can do and how feeds can be manipulated (for example, Fetch Feeds, Union, Filter, Sort)
- Under what circumstances would you use each one of these tools (as opposed to another)
14 Custom Search Engine
by
samoore (27 Oct 2009 13:57; last edited on 17 Dec 2009 19:42)
We discuss custom search engines, and how you can build your own.
Class held on 10/28/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- Explain what custom search engines are
- Work on exercises
At beginning of class
- Assignments, you, and me
- Changes (this topic, this class)
My notes
- What is it
- A search tool that uses Google's search engine (as the back end) but that you can instruct in the following ways:
- Look in a certain list of URLs (pages, whole sites, or subsets of sites)
- Avoid a certain list of URLs
- Append a set of terms to any user-supplied query
- Customize its looks (within bounds)
- How can it be used
- Its own Web page
- An iGoogle widget
- Embedded in any Web page
- Why would you use it
- Captures creator's knowledge of the field
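One way to picture what a custom search engine does is to write out the ordinary query it roughly corresponds to. The sketch below is only an approximation using the standard site: and -site: operators. A Google Custom Search Engine is configured through its own control panel, not by building query strings, and the function name and example sites are hypothetical.

```python
def custom_query(user_query, include_sites=(), exclude_sites=(), extra_terms=()):
    """Approximate a custom search engine with ordinary search operators.

    include_sites -- only look in these sites
    exclude_sites -- avoid these sites
    extra_terms   -- terms appended to every user-supplied query
    """
    parts = [user_query, *extra_terms]
    if include_sites:
        parts.append("(" + " OR ".join(f"site:{s}" for s in include_sites) + ")")
    parts.extend(f"-site:{s}" for s in exclude_sites)
    return " ".join(parts)


print(custom_query(
    "copper prices",
    include_sites=["example.com", "example.org"],
    exclude_sites=["example.net"],
    extra_terms=["mining"],
))
```

The list of sites is where the creator's knowledge of the field gets captured: the user types only "copper prices" and gets the benefit of the curated list.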
In previous years students explored several custom search tools: Topicle, Eurekster Swicki, and Rollyo. For the last year or so, however, these sites have been completely displaced by the Google Custom Search Engine, so that is the only tool we are going to look at today.
Blog topics
- Describe how useful or not Google Custom Search Engine is for your site.
- Describe how you chose the sites to include in your custom search engine.
- Compare and contrast 2 (or more) different custom search engines.
Resources
- Topicle
- Eurekster Swicki
- Rollyo
- Google Custom Search Engine
- BuildASearch
- Yahoo Search BOSS — Build your Own Search Service
- 3 Guides to FireFox Quick Searches (Smart Keywords)
15 Project Day
by
samoore (02 Nov 2009 17:48; last edited on 02 Nov 2009 17:48)
We worked on our projects
Class held on 11/02/2009. (student notes; possible questions).
We just worked on our projects.
16 Wikidot Day
by
samoore (04 Nov 2009 15:03; last edited on 04 Nov 2009 16:24)
We go through some special features of Wikidot while also quickly discussing some tools for investigating the popularity of certain Web sites.
Class held on 11/04/2009. (student notes; possible questions).
Web site popularity
Wikidot issues
- Question from student: "We have blog:blogtitle and we can see them all by going to bloglist. Can we make a template for other things, like to list feeds or reviews?"
- Create (e.g.) news:_template (and have the contents be something like "blog:_template")
- Create (e.g.) news-stories (and have the contents be something like "bloglist")
- You can also look at my code on the homepage for the list of announcements (annc), schedule items (sched09), and blogs (blog).
- Images not showing up
- Make sure the image name is allOneWordWithNoSpaces.
- Header images
- Start with Site Manager/ Appearance/ Themes
- And then progress with Custom Themes
- Sample themes can be found at pages such as
- Themes preview page
- Theme that I use
- Design your own CSS theme
- Free Wikidot Themes
- ColorBlender — for setting up color themes
- HTML Color Codes (and color picker)
- Articles about color and Web design
- Help with coding Wikidot
- Wikidot Snippets
- Modules
- Templates (like for blogs or news items)
17 Image Search
by
samoore (06 Nov 2009 14:31; last edited on 23 Nov 2009 15:06)
We discuss and explore the variety of image resources and search tools available.
Class held on 11/09/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- Go through diagram explaining page monitors & RSS filters
- Work on exercises (tweets)
At beginning of class
- No new grades.
- Tags on your pages (on this site and your own site).
My notes
- Diversity of image search tools
- Basic image search on the Web
- Search for images related to news stories
- Search for images on flickr
- Search for images by "similarity"
- Search for images of a person's face
- Search for images related to what's going on right now
- Search for high-quality for-pay or for-free images
Basic image search
- Google Images — "Ann Arbor" example, only large black & white of Ann Arbor
- Can search by size, type of image, color, and description, of course
- Under advanced image search, you can search for images related to news content, that have faces in them, that have a specific file type, or that are from a specific domain (.edu or a specific site)
- Ask Images — large B&W photos of Ann Arbor
- Can search by size, filetype, color
- Yahoo Images — large B&W of Ann Arbor
- Can search by size, color, domain
- "Travel overlay" — consider Las Vegas, NV (look in left column)
- PicSearch (and advanced search page)
- "ann arbor" "michigan theater"
- Can search for animations separately; also search by color, shape, and size
News image search
- Images at Yahoo News — you can't specify to search for images but they appear at the top of the results page
- This works as a direct link — just replace "football" with your own search term
- Images at Google News — again, you can't specify to search for images directly; you'll have to click on Images on the left side
- This works as a direct link — just replace the "detroit+lions" with your own search term
Flickr search tools
- Flickr
- Can search by full text or tags
- Explore flickr — you could spend many, many hours starting on this page
- Exploring flickr via map (article: Explore the world with flickr)
- Compfight: a flickr search tool
- eiffel tower (via tags) and eiffel tower (via text)
- Notice difference in searching "Tags only" vs. "All text"
- Behold: a Flickr search tool
- Eiffel tower
- Can search based on license
Similar images
- Pixolu — search for [eiffel tower]
- Searches Google, Flickr, and Yahoo
- At Google Images, you can search for "similar images"
- Starbucks coffee and then click on "Find similar images" on the image you like
- I wish they were better:
- GazoPa — find similar images
- Find image similar to http://seedmagazine.com/images/uploads/turtle_article.jpg
- Or to http://www.mccullagh.org/db9/1ds-4/eiffel-tower-from-below.jpg
- Ummmmm…..
- TinEye — find similar images
- Sign up here first
- Search for this photo: http://etc.usf.edu/clipart/3700/3758/eiffel-tower_1_lg.gif
- I get lots of "Received empty response."
Face search
- Exalead
- Search for Gerald Ford
- Watch what happens as you scroll down the page
- Only medium-sized face shots of Gerald Ford
- Doesn't play well with Camino on a Mac
- Google Images
- Search for "Gerald R. Ford"|"Gerald Ford"
- Using the options on the side of the page, search for facial shots of Gerald Ford
- I wish these were better:
- Picitup — image search (plus face search and "similar" search)
- Gerald Ford
- Search for "Gerald Ford"
- Only facial shots of Gerald Ford
- Search from Yahoo, Flickr, or Picasa
- Search filters based on size, layout, faces, landscapes
- FaceSaerch
- Search for Gerald Ford
Real time image search
Stock photography
These are recommended by Presentation Zen.
- Inexpensive (but good)
- Free (but not bad)
- Cyclops: a stock-photo image search site
- Search for Eiffel tower (at BigStockPhoto)
- EveryStockPhoto
- eiffel tower
- Advanced search allows for license, shape, and "safe search"
- US Library of Congress images
Blog ideas
- Compare your results (related to efforts for your term project) using Google Images compared with one of Ask Images, Yahoo Images, PicSearch, Exalead Image, Compfight, and Flickr.
- Compare the effectiveness (related to efforts for your term project) of the two news image search tools (Google News, Yahoo News).
Resources
- Where to find free images and visuals...
- LIFE magazine photo archive hosted by Google
18 Geography Based Sites
by
samoore (08 Nov 2009 14:53; last edited on 17 Dec 2009 19:45)
We discuss all types of geography-based search tools and resources.
Class held on 11/11/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- Go through diagram explaining the tools we'll be looking at today
- Work on exercises (tweets)
My notes
Here's what we're doing today:
- Mostly just you exploring some amazingly cool and useful Web resources.
International, country-specific Web search engines
- SearchEngineColossus.com: "International Directory of Search Engines"
- Yahoo International: Yahoo home pages from countries around the world
Google Maps
- Google Maps (tour, popular content, featured content)
- Google Maps Mania: "An unofficial Google Maps blog tracking the websites, mashups and tools being influenced by Google Maps."
- 100 things to do with Google Maps mashups: the most fun you can have with maps.
- Related tool: Google Earth
Travel
- Google Sightseeing: "Google Sightseeing takes you on tour of the world as seen from satellite, using the free Google Earth program, or Google Maps in your web browser. Each weekday your guides James and Alex present new weird and wonderful sights as suggested by readers."
- Articles analyzing the industry
- Standard travel search tools
- Newer general travel search tools
- Kayak: travel aggregator; searches 140+ travel sites with one search (review, another review)
- TripWolf: worldwide travel guide (review)
- UpTake: "your first step on a great trip"; "search over 1000 travel websites and 20M opinions at once" (review, review)
- WeGo: "Wego searches through 100+ travel sites in the time that it takes you to search one. We’ll help you find the best prices and connect you to the best place to buy." (review)
- Goby: "Create your own adventure" (review)
- Bing Travel: "Bing Travel Price Predictor tells you whether fare prices are expected to go up, down or stay the same."
- Hotel, home, and hostel search
- Sprice: "smart prices to go…anywhere" (actually, focuses on hotels in SE Asia, India, Europe) (review)
- Hotelicopter: "elevate your search" (review)
- LetMyBed: "more places to stay" (review)
- Unusual hotels of the world (review)
- Specialized travel information
- Ixigo: travel in India
Local search
- Biggest players
- Ask City: search for businesses, movies, and events
- MSN City Guides
- Mapquest Local: restaurants, events, news, weather
- Yelp: restaurant reviews (around 250K daily visitors)
- Newer entrants
- Outside.In: "What's happening. Where you are. Right now." (description, review)
- BooRah: "restaurant reviews, menus, pictures, and more" (review)
- When.com: "where to go, what to do, local events" (review, another)
- WebLocal: Canadian local search (review)
Road trips and driving
- Driving directions
- Driving itineraries
- RoadsideAmerica Maps: "Find oddities and tourist attractions and plan trips more quickly"
- MileByMile: free road map RV itinerary guides
- Traffic information
- Outside the U.S.
- Streetmap.uk: Great Britain street and road maps
- Australian driving directions: Australian travel maps, street directory, driving directions, and aerial photographs
- European driving directions (ViaMichelin)
- Mappy.com: European maps, route plans, and address guide (by country; look in upper left of window)
- TheAA.com: routes, maps, and directions for U.K. and Europe
- Zoombu: "Find the best way from your home to your destination", door-to-door journey planner for Europe (review)
- Public transit
- How to use Google Maps to plan a trip by public transportation (cities included)
- HopStop: Provides door-to-door subway and bus directions and maps for NYC. Currently expanding to other major cities. Very popular in Manhattan.
- NYC subway
- Google Transit: plan a trip using public transportation
- PublicRoutes: "get public transit, driving directions, and maps" (review)
Maps
- US Maps
- Mapquest maps
- Maps of the Americas: Perry-Castaneda Library Map Collection
- World Maps
- Map Machine (National Geographic): political or satellite maps of anywhere in the U.S. or world
- Mapquest world maps
- Multimap maps
- Holt, Rinehart, & Winston maps
- Maps made for printing and copying (National Geographic)
- Atlas Explorer (National Geographic): tool to explore geophysical and geopolitical maps of the world.
- CIA's World Factbook: maps of every country in the world
Entertainment
Information
- Historical Maps at the Perry-Castaneda Library Map Collection at the University of Texas
- Maps of current interest (from Perry-Castaneda Library Map Collection)
- Perry-Castaneda Library Map Collection (Univ of Texas)
- NationalAtlas Map Maker: build your own layered map with a wide variety of information
- WorldMapper: "a collection of world maps, where territories are re-sized on each map according to the subject of interest. There are now nearly 600 maps."
- Animation: be sure to check out this animation
- World Sunlight Map: "Watch the sun rise and set all over the world on this real-time, computer-generated illustration of the earth's patterns of sunlight and darkness. The clouds are updated every 3 hours with current weather satellite imagery."
- National Geographic Atlas Explorer: "investigate our world"; a visual guide to global trends
- EarthPulse: State of the Earth 2010
- EarthTools: "find places, latitude/longitude, sunrise, sunset, elevation, local time, and time zones"
Commerce
- AuctionMapper: search eBay for auctions (info)
- Oodle: buy and sell locally (classifieds)
- LiveDeal: online local marketplace
Real estate
- Zillow: "your edge in real estate"
- ZIP Realty: "your home is where our heart is"
- Trulia: real estate search
- RealtyTrac: "foreclosure real estate listings"
- Roost: "homes for sale and MLS listings"
- HomeFinder: "homes for sale, real estate listings & foreclosures"
- ActiveRain: "world's largest real estate network"
- Smaller sites
- PropSmart: real estate search (and community)
- HousingMaps: a mashup of Google Maps and Craigslist
- Enormo: "Every property. Everywhere" (review)
Interactive tools
- GMap Pedometer: plan your walking trips and measure their length
- Wikimapia: a mashup of Google Maps and Wikipedia. Completely addicting to explore.
- MapMyRun: a tool to plot your running route and see what the distance was
Clocks
- The World Clock - Time Zones: Current local times around the world
- Greenwich Mean Time: use this to set your clock to the right time wherever you are
- World Time Zone: find the time using a map
Mobile tools
19 Video Search
by
samoore (14 Nov 2009 17:17; last edited on 23 Nov 2009 15:07)
We discuss several different ways to search for videos.
Class held on 11/16/2009. (student notes; possible questions).
Class structure
- Go through "At beginning of class" information
- Go through diagram explaining the tools we'll be looking at today
- Work on exercises (tweets)
At beginning of class
- I'm gathering your tweets: geography class
My notes
The numbers in parentheses are the average daily visitors in the most recent months (as determined by Google Trends). In each section the sites are generally listed in order of popularity.
Video
Video search tools come in three basic varieties:
- General Web search based on video description and tags
- "Deep video" search (my term) based on an analysis of the audio and video content of the video
- Video directories in which human experts have classified videos (or video series) by their general content
Each of these three varieties of search tools can be applied to different targets:
- Site-specific search
- Web-wide search
So this means that we have six types of tools that we might consider:
| | Web | Site |
|---|---|---|
| General Web | A | B |
| Deep video | C | D |
| Directories | E | F |
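The count of six is just three search varieties times two targets. As a quick check, this short sketch enumerates the combinations and assigns them the same letters as the table; the variable names are my own.

```python
from itertools import product

SEARCH_VARIETIES = ["General Web", "Deep video", "Directories"]
TARGETS = ["Web", "Site"]

# Pair each letter with one (variety, target) combination, row by row.
tool_types = dict(zip("ABCDEF", product(SEARCH_VARIETIES, TARGETS)))

for label, (variety, target) in tool_types.items():
    print(label, variety, target)
```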
Now, even for these six types of tools, we can still have sub-categories of video search tools that differ based on their "target" content. For example, some sites search specifically for entertainment content, others for podcast-type content, and others for academic content.
Finally, sites can differ on dimensions other than those discussed above, most commonly:
- Uploads — does the site allow videos to be uploaded
- Host — does the site host videos or is it just a search site
- Social — does the site allow visitors to tag and/or comment on the videos
Video search
All of the following sites allow you to search for videos around the Web.
| Site | Search | Scope | Uploads | Host | Social |
|---|---|---|---|---|---|
| Bing Video | General | Web | No | No | No |
| Google Video | General | Web | No | Yes | No |
| Yahoo Video | General | Web | No | No | No |
| Blinkx | Deep video | Web | No | No | No |
| Truveo | Deep video | Web | No | No | No |
| Pod-o-matic | General, Directory | Web | Yes | Yes | Yes |
| VideoSurf | Deep video | Web | No | No | No |
| YouTube | General | Site | Yes | Yes | Yes |
| Daily Motion | General | Web | Yes | Yes | Yes |
| MegaVideo | General | Entertainment | Yes | Yes | Yes |
| Metacafe | General | Entertainment | No | Yes | Yes |
| Veoh | General | Web | Yes | Yes | No |
| Hulu | General | Shows | No | Yes | No |
| Clicker | General | Web, Shows | No | No | No |
- Bing Video
- Bing vs Google rematch on video search
- News videos at Bing
- sample (check the filters in the left column)
- Google Video
- Advanced video search
- sample (be sure to check the "Show options" box for the filters on the left)
- Yahoo Video
- Advanced video search
- sample (check out the filters at the top of the page)
- Blinkx: video search engine (100K)
- sample
- "wall" of query results — and you can embed it!
- Truveo: "search video across the Web" (40K, down from 400K 12 months ago)
- Description: "Truveo video search lets you search and find videos from across the Web. Use Truveo to find all types of online video including hit television shows, full-length movies, breaking news clips, sports highlights, music videos, or the latest viral videos. If you are looking for a specific video, Truveo video search can help you find exactly the video you want. Truveo can also help you browse through video across the web and discover new videos that you might like."
- sample… so many things to look at on this page:
- On the left are results by channel, which you can use as filters
- In the center you can choose "Top ranked", "Most recent", "Most popular", and "Highest rated" — FTW!
- On the right you get results from Bloomberg. Why Bloomberg? I have no idea.
- Clicking on the Search button you can choose to search Channels, Categories, or Shows — try it out and see how the results differ
- Help
- Articles
- Pod-o-matic (10K)
- sample — notice the following on this page:
- You can search by episode or by podcast (series)
- You can browse categories of podcasts
- You can search by audio, video, or both
- You can look at community-based information
- You can upload your own podcasts
- VideoSurf (read this article) (7K)
- Browse categories of news
- Browse people in the news, for example, J
- sample — look at all of the information on this page:
- In the left column, lots of different filters.
- In the center column, the results of the query with a short description, age of the clip, and a film-strip with shots of what is happening throughout the video
- You can also move your pointer over the film strip and get an option to show the faces of the people in the video — amazing!
- You can sort in multiple ways, and you can limit the results to videos added by time period
- Description: "VideoSurf is video search engine that has created a better way for people to search, discover, and watch online videos. Using computer vision VideoSurf has taught computers to “see” inside videos to let users find and watch videos that they really want to see. Whether you’re looking to watch funny videos or scary videos, movie clips or TV full episodes, the hottest new music videos or breaking news clips, VideoSurf’s video search engine is the place to go to find the videos you’ll love."
- YouTube
- sample; there are lots of good tools across the top of the page:
- The ability to look for specific episodes, channels, or playlists
- The ability to sort several different ways
- The ability to filter by length and by type of video
- The advanced options button provides several more tools to refine your search
- Articles
- Daily Motion: "Dailymotion is about finding new ways to see, share and engage your world through the power of online video. You can find - or upload - videos about your interests and hobbies, eyewitness accounts of recent news and distant places, and everything else from the strange to the spectacular." Site is at 1.4M visitors per day, but it has lost 1/2 of its traffic in the last 18 months.
- sample; you can sort these results in many different ways
- MegaVideo: "I'm watching it." (1.4M)
- This is almost exclusively entertainment videos.
- sample
- Metacafe: "Metacafe is one of the world's largest video sites, attracting more than 40 million unique viewers each month (comScore Media Metrix). We specialize in short-form original content - from new, emerging talents and established Hollywood heavyweights alike. We're committed to delivering an exceptional entertainment experience, and we do so by engaging and empowering our audience every step of the way." (500K)
- Veoh: "Veoh is a revolutionary online video service that gives users the power to easily discover, watch, and personalize their entertainment viewing experience. With a simple broadband connection Veoh gives you free access to all of the great TV and film studio content, independent productions, and user-generated videos on the Web." (1.1M)
- sample
- Notice the ability to limit by category and sort in different ways
- Also, under the Advanced button you can filter by length
- Hulu: "help people find and enjoy the world's premium video content when, where and how they want it" (600K)
- Clicker: "What's on online"
- Description: "Clicker is the complete guide to Internet Television. Our mission is to make it simple for you to find the right show, right now. … Clicker catalogs all broadcast programming online, along with TV-quality Web originals, from these silos and delivers them in one seamless, organized experience so you can easily discover what's available to watch (and what isn't) online, where to watch it, and what's worth watching."
- Articles
- sample; notice the features on the page:
- Across the top, you can filter by whether the source came from TV, Web, Music or Movies
- Across the top, you can sort by relevance, popularity, or airdate
- Down the right, you can see two things:
- The source results
- The categories from which the results come — if you click on one of them, then you re-run the query and just get the results from that category
Podcasts
The term podcast has not been well defined, though some elements are generally agreed upon. As stated in Wikipedia, "A podcast is a series of digital media files (either audio or video) that are released episodically and downloaded through web syndication." A single episode of a podcast can be thought of as a talk-radio show, an editorial on TV news, or a TV news segment. Listeners receive podcasts by subscribing to them; some series release episodes on a regular basis while others are less regular.
Part of the trouble with this type of search is that the term itself isn't agreed upon. Alternative terms are vodcast (referring specifically to video podcasts), vidcast, netcast, audio blog, blogcast, or DIY radio, so a search on any single term misses content published under the others. A related problem is that these terms can generally refer either to a single episode or to the whole series.
The other major difficulty with podcast search is that tags and descriptions capture the content of a specific episode far less well than the episode itself does. As stated above, a podcast generally consists of an audio or video file; it does not contain a searchable text transcript of that content. What is searchable is the text description that nearly every podcast series has. So, if you're looking for a podcast about the U.S. economy, you're in luck; however, if you're looking for a podcast that specifically mentions the U.S. foreign trade balance, you will have a much more difficult time.
This generally means that a superior podcast search engine would provide the following:
- Aggregate podcasts (by whatever name), and only podcasts, at the site
- Provide the ability to search over the text description of the podcast series
- Provide the ability to search over a text transcript of each podcast episode
The third feature requires the search engine to fetch each podcast episode from the Internet and run speech-to-text conversion on the file. Unfortunately, this is a very computationally expensive task. Blinkx provides a search engine that uses speech-to-text technology on all of the videos that it indexes; it currently has over 35 million hours of video on its site. EveryZing is a company that provides multimedia search tools (including speech-to-text tools) for Web sites that want to offer this type of search on their content.
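To make the difference between searching a series description and searching an episode's spoken content concrete, here is a toy sketch. Everything in it is made up for illustration: the two episode records, the field names, and the `search` helper. A real engine would have to produce the transcript with speech-to-text rather than have it on hand.

```python
# Hypothetical episode records. 'description' is what the publisher wrote;
# 'transcript' stands in for the output of a speech-to-text step.
EPISODES = [
    {"series": "Economy Weekly",
     "description": "A weekly podcast about the U.S. economy.",
     "transcript": "This week the foreign trade balance narrowed as exports rose."},
    {"series": "Market Minute",
     "description": "Short daily updates on the U.S. economy and markets.",
     "transcript": "Stocks closed higher today after the jobs report."},
]


def search(episodes, phrase, field):
    """Return the series names whose given text field contains the phrase."""
    phrase = phrase.lower()
    return [e["series"] for e in episodes if phrase in e[field].lower()]


# A broad topic is found from descriptions alone.
print(search(EPISODES, "U.S. economy", "description"))
# A specific mention is invisible to a description-only search...
print(search(EPISODES, "foreign trade balance", "description"))
# ...and only shows up once the spoken content is searchable as text.
print(search(EPISODES, "foreign trade balance", "transcript"))
```

The first search returns both series, the second returns nothing, and the third returns only "Economy Weekly", which is exactly the U.S. economy versus foreign trade balance situation described above.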
Except for iTunes, Google, YouTube and Bing, all of the following are niche sites. That doesn't mean that the small sites aren't worthwhile; it just means that this is not a popular class of Web sites. The sites in this section are a good place to look for business- and news-related content.
Many podcast search engines and sites have existed over the last few years, but none have succeeded to any significant extent. iTunes is probably your best bet as a first place to look for podcasts but, for your particular topic, you should give the other sites a try as well.
If you're going to search on a general video search engine (such as Google Video or Bing Video), then you should use the following query:
theTopic podcast|vodcast|vidcast|netcast|"audio blog"|blogcast|"DIY radio"
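If you want to build this query for several topics, it can be assembled from the list of alternative terms. This is a minimal sketch: `build_podcast_query` is a hypothetical helper name, and it assumes the search engine treats `|` as OR and double quotes as a phrase, as in the query above.

```python
# Alternative names for "podcast", taken from the list earlier in this section.
PODCAST_TERMS = ["podcast", "vodcast", "vidcast", "netcast",
                 "audio blog", "blogcast", "DIY radio"]


def build_podcast_query(topic, terms=PODCAST_TERMS):
    """Return 'topic term1|term2|...' with multi-word terms in quotes."""
    quoted = ['"%s"' % t if " " in t else t for t in terms]
    return topic + " " + "|".join(quoted)


print(build_podcast_query("carbon trading"))
```

This prints `carbon trading podcast|vodcast|vidcast|netcast|"audio blog"|blogcast|"DIY radio"`.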
Other
- Yubby — find, collect, and publish from 30+ video sites
- YoVisto — "academic video search" (article)
- CastTV: "one stop watching" (40K)
- sample; look at all the options on this page:
- Down the left column are a lot of different filters for the search
- Down the center column are all the results of the query
- Vuze: "find, download, and play high quality and HD video" (review) (40K)
- MeFeedia: "watch video from around the Web" (review) (25K)
- VideoSift (9K)
- LiveLeak: "redefining the media" (5K)
Music
Search
- Using Google Web Search: Add [music:] before any music related query (artist, song title, lyrics); article
- YouTube
- Yahoo Audio Search (click on "options" to be able to search on format, duration, source) (help)
- Yahoo Video (using video search to search for music)
- Altavista Audio Search (search by length and file type)
Music search and music providers
- iTunes
- Last.FM: "social music revolution" (review) (250K)
- Rhapsody (50K)
- Pandora (400K)
- New
- Moozikk — "music search made simple. find, listen, save and share your favorite songs." (article)
- Noiset — "Noiset.com is a music search engine where you can search for music albums, artist biographies and songs. Using Noiset, you will be able to find your favourite artist's discography, browse entire album collection, listen to song previews and find free download links. Noiset scans several up-to-date music blogs hosted on blogspot or wordpress and collects rapidshare, megaupload and mediafire download links for you."
Music blogs
- The Hype Machine — follows music blog discussions (20K)
- Search for audio blog or music blog or mp3 blog:
Other
- Musipedia — "a searchable, editable, and expandable collection of tunes, melodies, and musical themes"
20 Metasearch
by
samoore (16 Nov 2009 02:06; last edited on 23 Nov 2009 15:08)
We discuss and evaluate metasearch engines.
Class held on 11/18/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- Go through highlights of some of the amazing search tools that are available.
- Work on exercises
At beginning of class
My notes
The search community has a tool called a meta-search engine. These tools derive their functionality from other search engines. A complicating factor is that there are two very different types of tools that are called meta-search engines. The search community does not recognize these subtypes, but they are very real and quite significant:
- Integrated search for multiple search engines
- the meta-search engine uses an algorithm to combine the results from multiple search engines into one result list
- Unified interface for separate search engines
- the meta-search engine provides a single interface, enabling easy and quick switching between the results of different search engines
These are very different tools. In the first case, the value of the meta-search engine lies in an informed method of combining the results of multiple search engines. In the second case, the value lies in the unified interface alone; the results are passed through from one external search engine at a time.
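The two types can be contrasted in a few lines of code. This is only a sketch under stated assumptions: I don't know which combining algorithm any particular meta-search engine uses, so reciprocal rank fusion stands in here as one well-known way to merge ranked lists. The engine names, the URLs, and both function names are invented for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(result_lists, k=60):
    """First type: merge several ranked URL lists into one list, best first.

    Each URL earns 1 / (k + rank) from every list it appears in, so a URL
    ranked well by several engines rises to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scores, key=lambda url: (-scores[url], url))


def pass_through(results_by_engine, engine):
    """Second type: show one engine's results, unchanged."""
    return list(results_by_engine[engine])


# Made-up results from three made-up engines.
engine_results = {
    "engine_a": ["a.example", "b.example", "c.example"],
    "engine_b": ["b.example", "a.example", "d.example"],
    "engine_c": ["b.example", "c.example", "a.example"],
}

merged = reciprocal_rank_fusion(engine_results.values())
print(merged)
print(pass_through(engine_results, "engine_b"))
```

The merged list is `['b.example', 'a.example', 'c.example', 'd.example']`: b.example wins because two engines rank it first and one ranks it second. The pass-through call returns engine_b's list as is, which is all a unified-interface tool does with the results.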
Resources
Integrated search for multiple search engines
- Info.com (about, review) (60K -> 40K)
- examples: carbon trading (web), carbon trading (images), carbon trading (reference)
- other: separate columns for Web results & sponsored results
- Features
- Integrates Google, Yahoo, Bing, Ask, About
- Separate tabs for Web, Research, News, Images, Video, Health, Shop, Classifieds, Flights, Jobs, Hotels, Movies, Audio, Yellow Pages, White Pages, Webmail
- StartPage (about, review)
- examples: cannot link directly to results
- other
- explicit aggregation (with stars), refinement of results with user input
- formerly known as ixQuick
- Features
- Integrates All the Web, Ask, EntireWeb, Exalead, Gigablast, MSN, NBC, Open Directory, Qkport, Wikipedia, Winzy, Yahoo
- Separate tabs for Web, International phone directory, Video, Pictures
- Search.com (review, help, search tips) (110K -> 40K)
- example: carbon trading (web), carbon trading (images), carbon trading (reference)
- other: top searches
- Features
- Integrates Google, Ask, MSN, DMOZ
- Separate tabs for Web, Images, Video, Reference (Wikipedia), Directory, Downloads, Shopping, People, Games, Music, Entertainment, News
- InfoSpace (200k -> 500k)
- example: carbon trading (web), carbon trading (images), carbon trading (news)
- Features
- Integrates Google, Yahoo, Bing, Ask, Twitter
- Separate tabs for Web, Images, Video, News, Business, and People
- article
- Clusty (about, review) (8k -> 3K)
- examples: advanced search form, carbon trading (advanced search results), carbon trading (web), carbon trading (images), carbon trading (wikipedia)
- other: metasearch with clustering (description of their clustering technology)
- Features
- Integrates Ask, Bing, NY Times, Open Directory, Yahoo News, others
- Separate tabs for Web, News, Images, Wikipedia, Blogs, Jobs, Shopping, Gov, Labs
- DogPile (about, review) (140k -> 70k)
- examples: carbon trading (web), carbon trading (images)
- other: favorite fetches (on home page), related searches
- Features
- Integrates Google, Yahoo, Bing, Ask
- Separate tabs for Web, Images, Audio, Video, News, Yellow Pages, White Pages
- URL.com
- examples: carbon trading
- Users can contribute to the results ranking process
- Scour
- examples: carbon trading
- Features
- You can change the ordering of the results by clicking on an icon at the top of the results
- You can see how each search engine ranked each item at the right of each item in the list
- Others
Unified interface for separate search engines
Results from multiple sites available separately
- Soovle (click "secrets" in upper right)
- other: top searches
- automatic refinement of search in multiple search engines in real time (really cool)
- Be sure to hit the right arrow key after you have entered a search (but before pressing enter)
- Features
- Separate searches for Google, Yahoo, Ask, Wikipedia, Amazon, Answers.com, YouTube
- Just Web search
- Search.IO
- other: latest searches
- Features
- Separate searches for 8-10 sites in each category
- Separate tabs for Audio, Blogs, Books, CSS Galleries, Fonts, Images, Jobs, Lyrics, News, People, Recipes, Search Engines, Social Bookmarks, Stock Photos, Torrents, Tutorials, Videos, Web 2.0 Sites
- Joongel (review)
- examples: carbon trading (web), carbon trading (images)
- Features
- Integrates the "10 leading Websites" in each category
- Separate tabs for General Search, Images, Music, Videos, Shopping, Social, Q&A, Health, Torrents, Gossip
- Zuula (about, help)
- other: tracks recent searches
- Features
- Separate searches for Google, Yahoo, Bing, Gigablast, Exalead, Alexa, Entireweb, Mahalo, Mojeek (for the Web; others for the other categories)
- Separate tabs for Web, Images, Video, News, Blog, Jobs
Results from multiple sites returned simultaneously & separately
- LeapFish (about, review) (2k -> 5k)
- carbon trading
- Features
- Separate searches for Google, Yahoo, MSN
- Separate tabs for Web, News, Answers, Videos, Images, Shopping, Blogs
- Kosmix (about)
- carbon trading
- Features
- Multiple stories returned in multiple different types of searches, all displayed on one page
- Good way to get overview of a topic at the beginning of an investigation
Internet Start page with focus on unified search interface
- MrSapo
- Features
- Separate searches for 10-20 web sites per category
- Separate tabs for General, Images, Video, News, Social, Files, Reference, Academic, Business, Tech, Shop
- Symbaloo
- Features
- Just one search engine per category
21 Social Sites
by
samoore (21 Nov 2009 17:52; last edited on 23 Nov 2009 16:24)
We discuss social news and bookmarking sites.
Class held on 11/23/2009. (student notes; possible questions).
Class structure
- Go through "At beginning of class" information
- Go through metasearch results
- Work on exercises
At beginning of class
- Test information
- Timing of the test — Wednesday, December 2?
- Cut-off date for modifications (questions, notes, etc.)
- Nigel Melville
My notes
Introductory information
- Describing with tags
- Taxonomies
- Folksonomies
- Types of social sites
- Social News (based on voting)
- Technology
- Search & Internet marketing
- For researchers & scientists
- Social Bookmarking (based on tagging)
- Social activity (e.g., shopping)
- Features & dimensions
- Voting
- Current (e.g., "what's hot")
- Time periods (e.g., "last hour", "today", "last week", "last month")
- Topic categories (e.g., "business", "entertainment")
- Tags (both personal and shared)
- Discuss the sites
General market size information
The following series of charts shows the relative traffic for several social news and/or bookmarking sites.
From the first, you can see that the traffic at these sites has shrunk over the last 18 months. Around mid-2008 you can see a big jump in the traffic for Delicious; this is when it was renamed from del.icio.us to its current name. This first chart shows the three largest social news sites (except for Yahoo! Buzz) and the largest social bookmarking site.
The second image shows four smaller social news sites (with reddit shown as a comparison, since the scale of this chart differs from the first). Slashdot is a technology-focused social news site. You can see that it has lost 75% of its daily visitor volume in the last two years. Mixx had 3x growth from mid-2008 to mid-2009, but most of those extra visitors are now gone.
The third chart actually shows two (not three) bookmarking sites. In mid-2008 the site del.icio.us changed its name to delicious. This site has lost 2/3 of its daily visitor volume in the last two years. As you can see, Diigo is much, much smaller than delicious. There's really not much competition in this market.
Resources
Social News sites
- Digg (about, tour, search): "Digg is a place for people to discover and share content from anywhere on the web… We’re here to promote that conversation and provide tools for our community to discuss the topics that they’re passionate about." (Propeller is quite similar to this.)
- Yahoo! Buzz (about): "The buzz can be about anything — from breaking stories on major news to viral videos on personal blogs. Instead of editors, people like you submit the stories and "buzz up" the best ones."
- Reddit (about, search): "reddit is a source for what's new and popular on the web — personalized for you. Your votes train a filter, so let reddit know what you liked and disliked, because you'll begin to be recommended links filtered to your tastes."
- StumbleUpon (about, video intro, guide, search, Rating on StumbleUpon): "StumbleUpon helps you discover and share great websites. As you click Stumble!, we deliver high-quality pages matched to your personal preferences… This helps you discover great content you probably wouldn't find using a search engine." This site could also be considered a social bookmarking site.
- Fark (about, help, search): "Fark.com, the Web site, is a news aggregator and an edited social networking news site… The idea was to have the word Fark come to symbolize news that is really Not News."
- Mixx (about, tour, search): "You find it; we'll Mixx it. Use YourMixx to tailor the content categories, tags, specific users and groups, and we'll deliver the top-rated content as chosen by you and people who share your passions. So go ahead and whip up your own version of the web. Just tell us how you like it Mixxed and we'll deliver the best the web has to offer"
- NewsVine (welcome, help, search): "At Newsvine, you can read stories from established media organizations like the Associated Press and ESPN as well as individual contributors from all around the world. Placement of stories is determined by a multitude of factors including freshness, popularity, and reputation. Contribution is open to all, and editorial judgement is in the hands of the community."
- Slashdot (about, help, search): "News for nerds. Stuff that matters."
Social Bookmarking sites
- Delicious (about, video tutorial, help, getting started, search): "Delicious is a social bookmarking service that allows you to tag, save, manage and share Web pages all in one place. With emphasis on the power of the community, Delicious greatly improves how people discover, remember and share on the Internet."
- Diigo (about, video tour, tour, search): "Bookmark, highlight, and add sticky notes to any web page. Organize your bookmarks and annotations by tags or lists. Multiple ways to share your bookmarks and annotations."
Social activity site
- Kaboodle: "Have fun shopping with friends, share, and discover new products."
General articles
- General articles
- Social bookmarking sites
- Articles on folksonomies
- Articles on tagging sites
- Tag, you're it: Scientists describe collaborative tagging sites like delicious (at Scientific American)
Possible blog topics
- Compare either Digg or Yahoo Buzz to one of the other social news sites.
- Compare delicious and Diigo.
22 People Search
by
samoore (22 Nov 2009 19:43; last edited on 30 Nov 2009 15:46)
We discuss several different ways to find out information about people through their Web presence.
Class held on 11/25/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- Go through diagram explaining the tools we'll be looking at today
- Work on exercises (twitter)
At beginning of class
- I forgot to mention the metasearch results last class!
- A very special quiz
People search
- This problem has applicability to many areas
- Just trying to find a person's address or phone number
- Background checks (including criminal checks)
- Finding missing persons
- Ancestry (including obituary searches)
- A very frustrating, confusing topic
- Monetary interests have really infiltrated this set of tools
- Lots of Web site purchases (industry consolidation)
- Lots of pay-for-results
People search engines
General people search
- WhoZat (review)
- Pipl (about) — can search by name, email, username (screenname), and phone; emphasizes that it searches the Deep Web.
- ZoomInfo (advanced search, help) — find people or companies
- Spock (people search on the web, blogs, social networks)
- iSearch (review) — can search by name, phone, email, and username (screenname)
- Whoozy (review) — searches the Web and social networks
White page directories
- 411 Locate — many different search tools:
- WhitePages — lots of different search tools:
- people & business search
- reverse phone & address
- find area codes, zip codes
- neighborhood search (find the names of people who live near an address)
- search for an email address
- AnyWho — people search, reverse phone
- ZabaSearch (advanced search, review) — search by name or by phone number.
Social site search
- Wink — people search, phone number; will not only look for the specified name but similar names (i.e. Ted Kennedy, Theodore Kennedy…)
- 123people — people search (review, review)
- PeekYou — people search, username search
- The Internet Address Book — "Find, manage and discover internet addresses worldwide"
- Spokeo — good for searching social networking sites; most of the results require that you pay a fee.
Use public records
- Public Record Finder — this uses Intelius (like so many that I don't list here) so it shows partial results and then wants you to pay.
- Criminal Searches
Specialized
- Birthday Database
- FaceSaerch (yes, that's the right spelling!) — search for faces
- Namepedia — "world's largest information platform and community about personal names. Data is collected about names of all languages and cultures…" (review)
- PrivateEye — find maiden names, possible relatives, and roommates
- User Name Check — searches for a specific user name at a number of social sites (around 70 in late 2009); it tells you that the username is either available or not at each of the sites.
- InfoBel — international people search using white pages
Obituary search
- Social Security Death Index (ancestry.com) (1875-current, US Social Security numbers)
- Obituary Daily Times search
- National Obituary Archive (obituaries, memorials, funeral homes)
- Obituary Central (obituaries, cemetery searches)
- NY Times Obituaries
Ancestry information
- YourFamily (finding ancestors and lost relatives)
- Family Search (online birth, marriage, death, census, church, and other indices; run by Mormon Church)
General Web search
- Google (people search, reverse phone #)
- Searching phone numbers (details)
- phonebook
- Doesn't find cell phone numbers
- Searching names
- "full name" location
- "full name" company
- Yahoo People search (people search, reverse phone #, email search)
Resources
- People search engines: The newest web privacy threat (PC Advisor, March 14, 2009)
- 25 free people search engines to find anyone
- Ten Ways you can find Phone Numbers on the Web (about.com)
- Top Ten Ways to do a free people search on the Web (about.com)
- Fifteen People Search Engines (about.com)
- How to find someone online (about.com)
- 4 people search engines: Looking for someone online
- iSearch
- 123people searches the social web
- Facesaerch: search for people's faces
- Namepedia (review): the name database
- Yasni (review): people search
- Google, Yahoo, Ask, Cluuz
23 Project
by
samoore (30 Nov 2009 16:00; last edited on 07 Dec 2009 15:40)
No scheduled activities. Just going to be helping you w/ your projects, talking about test, talking about SE analysis grades.
Class held on 11/30/2009. (student notes; possible questions).
- Corporate visits:
- Microsoft — they're not coming this semester. But let's talk.
- Google — they're coming next Monday. Be attentive. Be on time. Ask questions. The moment that I start talking, close the computer in front of you and turn off your cell phone.
- Projects
- Last part is due on 12/14/2009
- The last part of the assignment (optional) is that you can write a blog post (ungraded) that will help me give you the best possible grade on your project. Since you can't stand over my shoulder as I'm grading your assignment, think about the types of things that you would like to make sure that I pay good attention to so that you get full credit for the assignment. Don't post this until the last couple of days before the assignment. The title of the post should be "Final project: titleOfYourTermProject". It would be great if you could put a link to the home page of your project near the top of the post as well.
- Test
- Information about the test
- It's looking like there are going to be more multiple-choice questions on the exam. You folks did a nice job (early in the semester especially; some later days were less impressive).
- Search engine analysis assignments
- Nice analyses: one and one and this one
- Some basic information
- Median grade: 90
- Range: 25-98
- Distribution of SE Analysis grades
2|5
3|5
4|
5|
6|5
7|5
8|257778
9|000000222233355557778
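The display above is a stem-and-leaf plot: the digit before the bar is the tens digit, and each digit after it is the ones digit of one grade (so `6|5` is a single grade of 65). As a quick check on the summary numbers, here is a short sketch that expands the plot and recomputes them; the `expand` helper is a name introduced just for this example.

```python
from statistics import median

# The stem-and-leaf display, copied from the notes above.
STEM_AND_LEAF = """\
2|5
3|5
4|
5|
6|5
7|5
8|257778
9|000000222233355557778
"""


def expand(display):
    """Turn 'stem|leaves' lines into a sorted list of integer grades."""
    grades = []
    for line in display.splitlines():
        stem, leaves = line.split("|")
        # Each leaf digit is one grade: tens digit from the stem, ones from the leaf.
        grades.extend(10 * int(stem) + int(leaf) for leaf in leaves)
    return sorted(grades)


grades = expand(STEM_AND_LEAF)
print(len(grades), min(grades), max(grades), median(grades))
```

This prints `31 25 98 90`: 31 grades, ranging from 25 to 98, with a median of 90, which agrees with the numbers listed above.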
25 Google Inc.
by
samoore (07 Dec 2009 15:52; last edited on 11 Dec 2009 14:12)
We discuss the history, technology, and business model of Google, Inc.
Class held on 12/07/2009. (student notes; possible questions).
At beginning of class
- You should check your email for information about your test.
- Here is some information about the test scores.
- My secretary (in room R5492) will have your exam. You can look at it (but not keep it) through the end of the semester.
- Your final blog is due today. Be sure that your grade database information is up-to-date.
- The only assignment you have left is the term project.
- I have a request: Please send a tweet (or multiple) with #bit330 in it and suggest a search engine (or two) that you want to share with the rest of the class on Wednesday (our "Other Search Engine" day). You have blogged about or analyzed multiple search engine-related topics this semester beyond what we covered in class. I would like to make sure that the good ones that you have discovered are shared by all. (And also warn us off of search engines that you think are particularly awful.) Thanks for your help. Please do this by 6pm today (but right now would be okay).
- I will be discussing Google today.
Resources for today
- Presentation on slideshare and as a PDF
- Google Doodles video
- Inside the Mind of Google (CNBC video)
- Another Inside the Mind of Google (CNBC video)
26 Other Search
by
samoore (08 Dec 2009 19:54; last edited on 09 Dec 2009 17:49)
We explore a not-quite-random collection of search engines that we haven't looked at as a class. Many of these were suggested by students in this class.
Class held on 12/09/2009. (student notes; possible questions).
At beginning of class
- Google — review
- Rest of the semester:
- 12/14: Semester summary, project unveiling
- By the time class begins, you should have made your site public. You need to keep it public at least until the beginning of next semester (early January). At that point you can do anything you want with it, though I would prefer that it be kept public.
- If you would like to make a 2 minute presentation to the class about your project (because it's so cool!), please send me an email.
- By the end of Tuesday 12/15, I would like a short blog entry (it would be approximately one page if printed) on your project site that basically summarizes this information (and covers those things that you would have liked to have covered if you had been given more time). The point of this blog entry is to make it easier for me to grade your assignment. You want to point out the good things about your site so that I don't overlook them. Point out those parts that took particular effort on your part or that you think are deserving of special attention (for some reason) on my part. This entry won't be graded per se; I will use it as a guide while grading your project.
- Office hours will end at 4pm. I have a meeting with the Dean about BA201.
- I will hold office hours on Thursday from 3:30-4:30.
Sites
Video search sites
- ChizMax: lyrics and video search engine (Matt K)
- CastTV: "casttv is a pretty good video search engine, much better than hulu and comparable to clicker" (Liz J)
- Tudou: "One of the largest online video sites in China" (Liz J)
- ClipBlast: world's largest video search (Ray Park)
- Hello Movies: a great place to find movies to watch (Ray Park)
- Videosurf: "for a great video search engine use videosurf it has everything every other video surf engines offers, plus a great interface" (Tim F)
- PirateBay: "i dont think you want to be talking about torrenting, but the pirate bay is really good and no pesky ads" (Roberto J)
- ScrapeTorrent Torrent Metasearch Engine
Video Search popular in Foreign Countries: Youku and Tudou are popular in China; they have Chinese video content (news, shows, user content) as well as US videos (subtitled, of course)
Index Video Sites: Indexes shows from a variety of sources, very frequently updated. Shows are usually found a day later
Music
- Fizy: easy way to find songs and music videos; "Might want to check out music search engine Fizy.com. Simple, but really gets the job done. Like YouTube but with playlists" (Adrienne G)
- Songza: "Songza has become a daily staple for me. The playlist feature makes it something I can minimize and leave on for hours." (Nikhil G)
- Grooveshark: "Awesome, it has popular playlists, allows you to share playlists with friends and to make as many playlists as your heart desires."
- PlayList: "similar to songza, but better layout. great for listening to songs you don't have downloaded, can create playlists" (Diane B)
- Music Map: "you type in an artist, and it visually recommends similar artists for you. very cool!" (Diane B)
- The Hype Machine follows music blog discussions (Omer I)
- Stereo Mood: we've created a way to suggest songs that follow your feelings (Nikhil G)
- Musicovery: Like Stereo Mood but better interface and ability to link with your iTunes library (Vitaliy I)
Updated sites
- Google Trends: notice the "Hot Topics", part of their new real-time search tools.
- Google Caffeine - New UI for Google
- At Google, notice the new real-time search results under "Latest results".
- At Google Finance, streaming news is available. Also note the "Sector summary" at the bottom of the page.
- Be sure that you're aware of the Related searches option and the Wonder wheel option.
Other (from students)
- Hakia: "I really enjoyed using Hakia this semester and want to tell the rest of the class about it on Wednesday!!" (Rachel B)
- WolframAlpha: "for one of the most unique search, or should I say computational knowledge engines, check out wolframalpha.com the site is sick…" (Larry W)
- Cuil: "if you are interested in the very basics about a topic you know little about, www.cuil.com is pretty useful. don't expect more tho" (Mike D)
- Living Stories: "The Living Stories project is an experiment in presenting news, one designed specifically for the online environment. The project was developed by Google in collaboration with two of the country's leading newspapers, The New York Times and The Washington Post." (from the home page of the site)
- Yahoo Glue: "pretty sweet. Similar to Kosmix. Kind of like a meta-search engine." (Joe K)
- PDF Database (Ray Park)
- Scribd: "Good website full of all types of freely downloadable documents, mostly pdfs. You can see the files on the website before you download them, which is helpful." (Dan B)
- Yebol: "Yebol is a new search website that could be useful if you are looking for a directory search, but don't use it for a regular search" (Michael A)
- Gigablast: "Gigablast was an awesome search engine that we didn't explore in depth. It returns a bunch of relevant results and great news sites" (Isaiah M)
- Sport Search (Andrew C)
- Funlus: "the game search engine" (Joe C)
- StumbleUpon: "great for just finding cool websites online/wasting time" (Tim F)
- Googlism: "an opinion site that may turn some unflattering results" (Rob L)
- Zebra Tickets: "a good site to visit when trying to find and compare ticket prices of concert and events (metasearch)" (Rob L)
- Foodieview: "if you are looking for a good cooking search engine try http://www.foodieview.com/, lots of cool customization" (Dan B)
- Epicurious: A good search tool for recipes and more. Also has a good community aspect.
- Boorah: "a restaurant review site" (Ran F)
Health search
- GoPubMed (help): search PubMed
- example: [focal nodular hyperplasia] (and check the what, who, where, when on the left side — amazing!)
- Medline (Cognition) (about, video)
- example: focal nodular hyperplasia, liver parasite
- HealthPricer (about): comparison shopping for health products (prescription drugs, contact lenses, beauty products, etc.)
- Healia (about): health search engine, focused on support groups and communities
- example: diabetes
- iMedix (about): health community
Law
- LexisWeb (about): runs Web searches, but has option of going to LexisNexis recommended sources
- FeeFiFoe Firm (about): law firm search engine
- example: detroit "environmental law"
- Case law (Cognition)
Shopping
- PicItUp (about): visual shopping and similarity matching
- example: men watch
- Picitup, shopitup, and buyitup
- Retrevo (about): matching people and electronics
- example: canon camera
- TheFind (about): shopping search
Other (from me)
- CoolIris — the coolest browser search plugin AWESOME
- iSeek (about): focuses on ways to refine your search quickly and easily
- example: carbon trading
- Quintura (about): visual information web
- example: [detroit lions], [carbon trading]
- SenseBot In-depth search (about): in-depth summaries of Web pages on a topic plus a tag cloud
- example: [carbon trading] (50 pages and 20 sentences) — really interesting.
- Hakia (about): quality, not just popular, search results
- example: carbon trading (health and environment are the only focus right now for “credible” sites)
- Hakia hopes to expand coverage enlisting the help of librarians
- Hakia serves credible results in new search interface
- Exalead (about): thumbnails, search expansion
- example: carbon trading
- What's up with Exalead?
- Cluuz (about): advanced results display
- example: carbon trading
- FactBites (about): results based on content analysis instead of link popularity
- example: carbon trading
- Evri (about): summary information, relationships, facts, web search, images, videos
- Lexxe (about): lexical analysis and clustered results
- example: carbon trading
- Search Cloud (help): uses a “tag cloud” interface to create queries
- example: [carbon trading, environment, credits (in decreasing order)]
- Tip of my tongue (about): find words
- Abbreviations (about): find abbreviations and acronyms by category
- example: ARM
- CarZen (about): automobile search
- IconFinder (about): exactly what it says!
- example: stop
- Snooth (about): wine search
27 Wrap up
by
samoore (14 Dec 2009 15:06; last edited on 14 Dec 2009 16:17)
We wrap up the semester and all that we have learned.
Class held on 12/14/2009. (student notes; possible questions).
Before class
- Requests
- Please keep your wiki public for the next year (at least). As you know, these projects would be a great help to next year's class.
- After I (actually) grade your blogs, I still would like you to transfer your "10" blogs to this course Web site. Again, this will help next year's class.
- Reminders
- Here is the list of student wikis.
- Make your wiki public.
- Be sure that all of your blogs (and notes) are in the grades database.
- Announcements
- I will send out tweets when I have updated the grades database over the next 10 days.
- Don't expect final grades until the middle of next week. Again, I'll send a tweet when I've posted them.
- No more office hours.
- I will keep this wiki available and will continue to update it. I plan on using this same address for next year's course wiki.
In-class
- Student presentations
- Liz
- Billy
- My final presentation
- Course evaluations (go to CTools)
- Discussion












