As the volume of web content grows, the ability to find, understand and use that information becomes exponentially more difficult. While fascinating technologies and applications exist for processing text, images, video and other web media, until now their developers have not been able to monetize them.
80legs, the web-scale crawling service, is providing a channel for developers of these applications to monetize their technology. Later this Fall, we will be launching the 80apps Store, which will allow developers to sell web content processing applications directly to 80legs users.
We are issuing a challenge to developers from around the world to build the best 80legs application.
Examples of possible 80Apps may be:
The potential applications are limitless, though. We are interested in your ideas for programs that can best harness 80legs’ ability to crawl and process up to 2 billion web pages per day.
Additionally, all submitted apps will be offered through the 80legs App Store. When the store launches, 80legs users will be able to buy and use apps developed by contestants. 80App developers can set their own prices and keep 100% of the revenue earned.
Three levels of prizes will be awarded for the “Best App.”
1st prize will be the winner’s choice of the following:
2nd prize will be the winner’s choice of the following:
3rd prize will be the winner’s choice of the following:
Winners will be decided by a five-person panel of judges that includes:
All applications must be submitted by December 11th. Winners will be announced December 15th. (Dates are subject to change.)
Contestants should submit a text description of their 80App to ChallengePost.
In addition to the ChallengePost submission, contestants should create accounts at http://portal.80legs.com and upload their submissions under the Code section.
Please name your 80App using the format "Contest_Submision_xxxx".
For information on how to create an 80App, see http://80legs.pbworks.com/80Apps. This page outlines specific requirements for 80Apps.
SOLUTION DEADLINE
December 11, 2009
(Well, not really sure if this is the place to submit a competition entry.) Contest_Submision_dabsused is a crawler made for one of my pet projects, dabsused.com, which tracks price changes of used stock on several websites run by a popular UK e-commerce company, Dabs. I wrote the existing crawler in Java a few months ago and it's been in production since. This 80App replicates what that crawler does for checking prices. It's not a general-purpose crawler; rather, it's designed specifically for the current layout of those websites, which share a common backend system. I managed to get very similar, definitely usable results from creating an 80App crawler, but it wasn't plain sailing either. I've put together a few slides to express my opinions (see attachment) and hopefully that's going to help someone or 80legs :) The crawler project's source is available for download too.
Companies are increasingly aware that they need to monitor the blogosphere to see what is being said about their brands and products. Given the fact that there are too many blogs to follow individually, which blogs are important to monitor? Obviously, looking at the number of incoming links is a known metric for impact but we need additional measures to determine who is influential in the blogosphere and who is not.
Blogcrawler is an 80legs.com hosted application to measure the community strength of a blog. Community strength is defined as the number of returning commenters divided by the total number of commenters, scaled by the number of posts on the blog and the total number of comments. This measure is most useful when comparing multiple blogs whose traffic stats are unknown. Returning commenters are an indication that a blogger has the capability to attract and retain an audience.
Blogcrawler is a generic tool for blogs as it supports generic parsers for most blogs and can be extended with custom parsers if needed to crawl a specific blog.
After the crawler has crawled all required blogs, a simple Python script is used to parse the results and calculate community strength for each blog. Community strength can be calculated for different time windows: last week, last month or even last year.
Detailed output is also possible: how many articles were posted in a given time window, how many comments were made and how many unique commenters participated.
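To make the measure concrete, here is a minimal Python sketch of the core ratio, not the actual Blogcrawler script. It assumes the crawl results have already been parsed into (post_id, commenter, posted_at) tuples, and it assumes a "returning" commenter is one who commented on at least two different posts inside the time window; both are illustrative choices. The scaling by number of posts and comments is not spelled out above, so the sketch reports the raw counts alongside the unscaled ratio instead of guessing a formula.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def community_strength(comments, now, window_days):
    """Summarize commenter activity inside a time window.

    `comments` is an iterable of (post_id, commenter, posted_at) tuples.
    This input format is an assumption made for the sketch.
    """
    cutoff = now - timedelta(days=window_days)
    posts_by_commenter = defaultdict(set)
    total_comments = 0

    for post_id, commenter, posted_at in comments:
        # Skip comments outside the requested time window.
        if posted_at < cutoff or posted_at > now:
            continue
        posts_by_commenter[commenter].add(post_id)
        total_comments += 1

    unique = len(posts_by_commenter)
    # Assumption: "returning" means commenting on two or more distinct posts.
    returning = sum(1 for posts in posts_by_commenter.values() if len(posts) >= 2)

    return {
        "comments": total_comments,
        "unique_commenters": unique,
        "returning_commenters": returning,
        # Unscaled ratio only; the scaling step is left out on purpose.
        "returning_ratio": returning / unique if unique else 0.0,
    }
```

Calling it with window_days set to 7, 30 or 365 corresponds to the last-week, last-month and last-year windows mentioned above.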
The default parsers are configured for a number of Canadian political blogs including: eaves.ca, terahertzatheist.ca, djkelly.ca and stageleft.info.
You can reach me at @drdee_is_wired
AdverSpider is a next-generation web crawler that specializes in detecting and analyzing online advertising. AdverSpider takes a list of URLs, visits each page and figures out how many ads are being displayed, which ad networks are being used and which sizes of ads are available. This data is returned to you as XML which makes it easy to process however you'd like.
For more details see: http://shawn.simister.ca/AdverSpider/
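As a rough illustration of processing that XML, the Python sketch below tallies ads per page, ad networks and ad sizes. The real AdverSpider output format is not described here, so the element and attribute names (results, page, ad, url, network, width, height) and the sample document are entirely hypothetical; adjust them to whatever the crawler actually returns.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample output; the real schema may differ completely.
SAMPLE = """<results>
  <page url="http://example.com/">
    <ad network="ExampleAds" width="300" height="250"/>
    <ad network="ExampleAds" width="728" height="90"/>
  </page>
  <page url="http://example.org/">
    <ad network="OtherAds" width="300" height="250"/>
  </page>
</results>"""


def summarize(xml_text):
    """Return (ads per page, ads per network, ads per size) from the XML."""
    root = ET.fromstring(xml_text)
    ads_per_page = {}
    networks = Counter()
    sizes = Counter()

    for page in root.iter("page"):
        ads = page.findall("ad")
        ads_per_page[page.get("url")] = len(ads)
        for ad in ads:
            networks[ad.get("network")] += 1
            sizes[(int(ad.get("width")), int(ad.get("height")))] += 1

    return ads_per_page, networks, sizes


if __name__ == "__main__":
    per_page, networks, sizes = summarize(SAMPLE)
    print(per_page)
    print(networks.most_common())
    print(sizes.most_common())
```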