We are going to discuss tools that notify you, in different ways and under different circumstances, when something specific has changed on the Web: email alerts, page monitoring software, and RSS feed-manipulation software.
Class held on 10/26/2009. (student notes; possible questions).
Class structure
- Go through “At beginning of class” information
- I'll lecture for a bit (no slides today).
- Work on exercises.
At beginning of class
On your own
- FYI, I added a scanned copy of the diagram I created for the real-time information class.
- Read the current to-do list on the course home page.
- No grades (too much other prep going on)
- I am about 1/4 the way through the status reports. I'm working on them, I promise.
- I haven't graded any blogs in a very long time.
My notes
Monitoring changes
- Email alert service
- Monitor entire site
- These are set up by the Web site and you subscribe to them
- No false positives
- Sometimes you want email (cell phone! or even Messenger)
- Page monitors
- Monitor specific pages (but not sites)
- Lots of false positives unless keyword based
- RSS feeds
- Problem: False positives
- Unless keyword based or filtered somehow
- Focused RSS feed: If you're lucky, the site offers a keyword-based or topic-specific RSS feed that you can subscribe to.
- General RSS feed: If there's simply a general RSS feed (such as "Yahoo breaking news"), then you should run that feed through a keyword tool:
- FeedRinse
- Yahoo Pipes (if other processing is needed)
- The following are useful if there's no RSS feed available on a page but you would like to set one up:
- FeedYes: I would try this first since it's the easiest to use when setting up a feed.
- Feed43: This is more powerful but more difficult to use.
- Dapper: This is another powerful tool.
- Why not just use RSS
- Some sites don't have RSS feeds
- So use site-based email alerts
- Or use a tool to make an RSS feed
- Some information isn't site based
- So use search-based email alerts
- Some information is too fine-grained to be covered by RSS feeds
- So use page monitors
Email alerts
Finding email alerts
- Search for email alerts
- Query: "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (189 million in 2009) (77.2 million in 2008) (60.2 million in 2007)
- Yahoo results (464 million in 2009) (392 million in 2008)
- More specific search for email alerts
- Query: inurl:mail OR inurl:alert "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (88,100 in 2009) (115,000 in 2008)
- Yahoo results (280,000 in 2009) (242,000 in 2008)
- Science email alerts
- Query: science "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (38.6 million in 2009) (17.1 million in 2008) (2.34 million in 2007)
- Yahoo results (72.3 million in 2009) (69.9 million in 2008)
- INURL query
- Google results (8,140 in 2009) (382,000 in 2008)
- Yahoo results (155,000 in 2009) (85,600 in 2008)
- Copper email alerts
- Query: copper "email alerts" OR "e-mail alerts" OR "email alert" OR "e-mail alert"
- Google results (757,000 in 2009) (384,000 in 2008)
- Yahoo results (5.49 million in 2009) (2.75 million in 2008)
- INURL query
- Google results (475 in 2009) (531 in 2008)
- Yahoo results (3,560 in 2009) (835 in 2008)
- So, think about how you might apply this to a company or an industry you are interested in
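The queries above all follow one pattern: an optional `inurl:` restriction, an optional topic word, and the four alert phrases joined with OR. Here is a minimal sketch of that pattern in Python. The function name `alert_query` and its parameters are my own, and putting the `inurl:` terms before the topic in the combined query is an assumption, since the notes don't spell out the topic-plus-INURL query.

```python
# The four phrase variants used in every query in these notes.
ALERT_PHRASES = ["email alerts", "e-mail alerts", "email alert", "e-mail alert"]


def alert_query(topic="", inurl_terms=()):
    """Build a search query for finding pages that offer email alerts.

    topic       -- optional keyword such as "science" or "copper"
    inurl_terms -- optional words that must appear in the URL (e.g. "mail")
    """
    phrase_part = " OR ".join(f'"{phrase}"' for phrase in ALERT_PHRASES)
    inurl_part = " OR ".join(f"inurl:{term}" for term in inurl_terms)
    # Assumed order: inurl restriction, then topic, then the phrases.
    parts = [inurl_part, topic, phrase_part]
    return " ".join(part for part in parts if part)


general = alert_query()
specific = alert_query(inurl_terms=("mail", "alert"))
copper = alert_query("copper")
print(general)
print(specific)
print(copper)
```

To apply this to your own company or industry, swap in your own topic word and paste the result into the search engine.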
General email alert services
- Yahoo Alerts
- Some types of alerts
- Broad-ranging alerts
- Breaking news (via SMS, Email, or IM), Local News (via email)
- More specific
- Keyword News (via Mobile, Email, IM), Stocks Watch (via Mobile, Email, IM)
- Be sure to look over the whole list of categories of alerts.
- Google Alerts (help)
- All of this is based on submitting queries
- Once a day
- Once a week
- "As it happens"
- Broad-ranging alerts
- Web & comprehensive alerts
- More specific
- Keyword-based alerts for news, blogs, video and groups
- Can receive as email or as an RSS feed
Page monitoring software
Overview
Page monitors were the next big thing five years ago. A page monitor is either a program that you download or a Web-based service. Each day (or at whatever interval you set), it downloads the Web page and, if the page differs from the previous copy, sends you an email. Some monitors tell you what has changed, while others just tell you that something has changed.
At first, you might not be that impressed with page monitors. But once you realize that they can be used for a lot more than news, they become quite useful tools. WatchThatPage.com is the best free site.
WatchThatPage has a limit of 250 characters for the URL. Also, shortened URLs (from tinyurl.com or bit.ly) do not work. To get around these problems, use TrackEngine, which has neither limitation.
- Capabilities
- Automatically determine if a Web page, or part of a Web page, has changed
- Results might be delivered via email, RSS feed, or a summary Web page
- Page Monitoring Software Examples
- Track a company's press release page (Goldman Sachs)
- Find out when a new version of software is released (BBEdit)
- Find out when a new product is released (Canon cameras)
- Track a product category (Flat panel LCD TVs at Amazon)
- Monitor product information (comments about a movie at Amazon)
- Track auctions
- Track new jobs
- Monitor earnings releases (at JPMorganChase)
- Track who is linking to you (e.g., link:pogue.blogs.nytimes.com/ -site:pogue.blogs.nytimes.com David Pogue)
- Follow investment information about a company (e.g., American Express at The Motley Fool)
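To make the idea concrete, here is a rough sketch of what a page monitor does internally: keep the previous copy of the page, compare it with the current copy, and report the added lines (optionally only those matching a keyword, which cuts down on false positives). This is not how WatchThatPage or TrackEngine are actually implemented. The function names are my own, and the example compares two inline strings rather than fetching a live page.

```python
import difflib
import hashlib
import urllib.request


def fetch(url):
    """Download a page as text (not called in the example below)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def fingerprint(page_text):
    """Short, comparable summary of a page's contents."""
    return hashlib.sha256(page_text.encode("utf-8")).hexdigest()


def check_page(previous, current, keywords=()):
    """Return (changed, added_lines) for two snapshots of the same page.

    With keywords, a change only counts if an added line mentions one.
    """
    if fingerprint(previous) == fingerprint(current):
        return False, []
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff marks lines that exist only in the new snapshot with "+ ".
    added = [line[2:] for line in diff if line.startswith("+ ")]
    if keywords:
        wanted = [keyword.lower() for keyword in keywords]
        added = [line for line in added
                 if any(keyword in line.lower() for keyword in wanted)]
        return bool(added), added
    return True, added


yesterday = "Press releases\nQ2 results posted"
today = "Press releases\nQ2 results posted\nQ3 earnings release scheduled"
changed, added = check_page(yesterday, today, keywords=["earnings"])
if changed:
    print("Page changed:", added)
```

A real monitor would call `fetch` on a schedule, store each snapshot, and send the added lines by email instead of printing them.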
Web-based
- WatchThatPage
- Free (for any number of pages), or $20/year for priority service
- Can highlight changes in pages
- Changes sent in an email
- Keyword matching
- This site doesn't appear to be updated any more (4+ years)
- TrackEngine
- Free for 5 bookmarks, or $20/year for 10 pages, or $59/year for 50 pages
- Highlights new content in HTML email
- Monitors changes daily
- Does do keyword matching
- This site hasn't been worked on for 7+ years
- Other possible sites: InfoMinder, ChangeDetect, Trackle
Windows software
- WebSite-Watcher
- Free for 30 days; $45 purchase
Feed creation software
Overview
- Capabilities
- Demonstrations
- Demonstration with Feed43 and the JPMorganChase Annual report (the feed)
- Demonstration with Feed43 and a Google Web search results page
- Demonstration with Feed43 and the Goldman Sachs press release page
Make a feed
From other feeds
- FeedRinse: From their site, “Feed Rinse is an easy to use tool that lets you automatically filter out syndicated content that you aren't interested in. It's like a spam filter for your RSS subscriptions.”
- Can test on this page: http://feeds.nytimes.com/nyt/rss/companies
- Yahoo Pipes
- FeedZero: This uses adaptive filtering software to learn what feed articles you like, and which you don't like, based on your input.
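The filtering tools above boil down to a few operations on feed items: combine feeds, keep only items that match a keyword, and sort the result. Here is a minimal sketch of those operations using only the Python standard library. It is not how FeedRinse or Yahoo Pipes work internally. The function names and the two tiny sample feeds (with example.com links) are made up for illustration, and the sketch assumes every item has a `pubDate`.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two made-up feeds; a real tool would download these from feed URLs.
FEED_A = """<rss version="2.0"><channel><title>Feed A</title>
<item><title>Copper prices rise</title><link>http://example.com/a1</link>
<description>Metals markets</description>
<pubDate>Mon, 26 Oct 2009 10:00:00 GMT</pubDate></item>
<item><title>Local weather</title><link>http://example.com/a2</link>
<description>Rain expected</description>
<pubDate>Mon, 26 Oct 2009 11:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Feed B</title>
<item><title>Mining firm expands</title><link>http://example.com/b1</link>
<description>New copper mine opens</description>
<pubDate>Mon, 26 Oct 2009 12:00:00 GMT</pubDate></item>
<item><title>Copper prices rise</title><link>http://example.com/a1</link>
<description>Metals markets</description>
<pubDate>Mon, 26 Oct 2009 10:00:00 GMT</pubDate></item>
</channel></rss>"""


def parse_items(rss_xml):
    """Turn RSS 2.0 text into a list of dicts (assumes pubDate is present)."""
    items = []
    for item in ET.fromstring(rss_xml).iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def union(*feeds):
    """Merge several item lists, dropping items whose link was already seen."""
    seen, merged = set(), []
    for feed in feeds:
        for item in feed:
            if item["link"] not in seen:
                seen.add(item["link"])
                merged.append(item)
    return merged


def keep_matching(items, keywords):
    """Keep items whose title or description mentions any keyword."""
    wanted = [keyword.lower() for keyword in keywords]
    return [item for item in items
            if any(keyword in (item["title"] + " " + item["description"]).lower()
                   for keyword in wanted)]


def newest_first(items):
    """Sort items by publication date, most recent first."""
    return sorted(items, key=lambda item: item["published"], reverse=True)


result = newest_first(
    keep_matching(union(parse_items(FEED_A), parse_items(FEED_B)), ["copper"]))
for item in result:
    print(item["published"], item["title"])
```

Filtering on keywords is what removes the false positives you would otherwise get from a general feed.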
From a page
- Dapper
- Description: Dapper is pretty slick. You can look through user-created Dapps or you can (easily) create your own. Don’t forget to use the “get a nice short url” option to create a URL of your own that is easier to read and use. Dapper lets you get an RSS feed for more kinds of content than just news and blogs, such as search results.
- The Glory, Bliss and How-to of Screen Scraping for RSS
- Demo
- Video tutorial
- Useful Dapps
- FeedYes
- Create feed from http://www2.goldmansachs.com/our-firm/press/press-releases/index.html
- Feeds will work for 14 days, then you have to pay $30 per year
- Feed43
- Feed43 is a little bit more complicated. You have to find the actual HTML within the source code of the page.
- Define extraction rules: find the specific places (within the code) that hold the information you want the RSS feed to monitor. The program gives directions on what specific code to use.
- Then click Extract
- Then you can give the feed a title, description, URL, etc.
- Then fill in where each item's title, date, etc. come from
- If a site is updated only once a month, it's too much of a hassle to make one of these feeds (use a page monitor instead). But if it is updated daily and you want to monitor it, then it might be a good idea to make one!
- Free, or $29/year for 20 hourly updates
- My feeds
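To give a feel for how extraction rules work, here is a small sketch that imitates a Feed43-style search pattern in Python. It assumes, as I understand the Feed43 syntax, that `{%}` marks a piece of text to capture, `{*}` skips over anything, and `{%1}`, `{%2}`, ... refer to the captured pieces in the item templates. Check the Feed43 directions for the exact rules. The HTML snippet and the function names are made up for illustration.

```python
import re

# A made-up fragment of a press release page.
PAGE = """<ul>
<li><a href="/press/2009-10-26.html">Firm announces results</a> <span class="date">Oct 26, 2009</span></li>
<li><a href="/press/2009-10-20.html">New office opens</a> <span class="date">Oct 20, 2009</span></li>
</ul>"""

# Assumed Feed43-style pattern: {%} captures, {*} skips.
ITEM_PATTERN = '<li><a href="{%}">{%}</a>{*}<span class="date">{%}</span>'


def pattern_to_regex(pattern):
    """Translate the {%} / {*} placeholders into a regular expression."""
    pieces = []
    for part in re.split(r"(\{%\}|\{\*\})", pattern):
        if part == "{%}":
            pieces.append("(.*?)")   # capture as little text as possible
        elif part == "{*}":
            pieces.append(".*?")     # skip text without capturing it
        else:
            pieces.append(re.escape(part))  # literal HTML must match exactly
    return re.compile("".join(pieces), re.DOTALL)


def extract_items(html, pattern):
    """Return one tuple of captured values per repeated match."""
    return [match.groups() for match in pattern_to_regex(pattern).finditer(html)]


def render(template, values):
    """Fill {%1}, {%2}, ... in a template with the captured values."""
    return re.sub(r"\{%(\d+)\}",
                  lambda match: values[int(match.group(1)) - 1],
                  template)


for values in extract_items(PAGE, ITEM_PATTERN):
    print(render("{%2} ({%3})", values), "->", render("{%1}", values))
```

The literal HTML around the placeholders is what ties the rule to one page layout, which is why these feeds break when the site redesigns its pages.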
Examples
- From TrackEngine
- TrackEngine help
- TrackEngine tutorials
- TrackEngine hot lists
- InfoMinder examples
- ChangeDetect examples
- Feed43 example
Email filtering
- Gmail
- Limit around 7.2GB (4.5GB in October 2007)
- Can use a filter
- To forward just some emails (to different people?)
- To apply a label to emails
- Plus addressing
- A powerful method that can be applied to email alerts is “plus addressing”. When you sign up for an email alert (e.g., for some query), give your address as dummy+someQueryIdentifier@gmail.com instead of the normal address dummy@gmail.com. When mail sent to this address arrives in your account, you can filter it by what comes after the plus! This is extremely helpful since it makes the emails easier to filter.
- Description for GMail
- Use a different address for each email alert
- Helps you filter
- Helps you track who is selling your email address
- Defining a filter
- Keep definition to a minimum, as simple as possible
- Test, test, test
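Here is a minimal sketch of the logic behind a plus-address filter: pull out whatever follows the plus sign in the "To" address and use it to pick a label. In Gmail you would set this up as a filter in the web interface rather than write code. The `LABELS` table, the label names, and the function names are hypothetical.

```python
from email.utils import parseaddr
from typing import Optional

# Hypothetical mapping from plus tags to the labels you want applied.
LABELS = {
    "someQueryIdentifier": "Alerts/some-query",
    "copper": "Alerts/copper",
}


def plus_tag(to_header: str) -> Optional[str]:
    """Return the text after '+' in the local part of an address, if any."""
    _name, address = parseaddr(to_header)
    local_part, _at, _domain = address.partition("@")
    _base, plus, tag = local_part.partition("+")
    return tag if plus and tag else None


def label_for(to_header: str, default: str = "Inbox") -> str:
    """Pick a label from the plus tag; unknown or missing tags get the default."""
    return LABELS.get(plus_tag(to_header), default)


print(label_for("dummy+someQueryIdentifier@gmail.com"))
print(label_for("dummy@gmail.com"))
```

Because each alert gets its own tag, the filter definition stays as simple as possible: one tag, one label.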
Tools you now have at your disposal
- Method to follow to find site-based email alerts
- Tools to create search-based email alerts
- Tools to monitor Web pages for any changes to their contents
- Tools to apply keyword-based filters to RSS feeds
- Tools to convert tabular Web page content to an RSS feed
Your term project
Email alerts and your term project
You should do the following for your project wiki:
- You should figure out some way that you are going to document the email alerts that you use in your email account to route your incoming alerts. Maybe print the alert page to a PDF file and link it to your wiki? Maybe take a screenshot of your email inbox and highlight the email alerts?
- In either case, you are going to want to have a section in your wiki called "Email alerts".
- On this page you should describe each of the email alerts that you used: the page from which you subscribed to it, why it is useful, and if there are any keywords (or such) that you used to generate it.
All of the above also applies to your page monitors, any feeds you create using FeedYes/Feed43/Dapper, and any feeds you filter using FeedRinse or Yahoo Pipes.
Possible blog topics
You do not have to write a blog. These are suggested blog topics if you were to write one. There are lots of possibilities in this class.
- Describe different ways that you found these tools useful (or not useful).
- Describe how you used Yahoo Pipes, possibly differently from how we have described it here.
Hints about possible test questions
You're definitely going to be held responsible for the following topics:
- What WatchThatPage (as an example of a page monitor) can do
- What Dapper can do
- What Feed43 can do and how its search patterns work
- What Yahoo Pipes can do and how feeds can be manipulated (for example, Fetch Feeds, Union, Filter, Sort)
- Under what circumstances would you use each one of these tools (as opposed to another)