Hack 22. Yahoo! Directory Mindshare in Google

are, are you sure your site will stand out and get clicks? Maybe you should choose a different category.

We got this idea from a similar experiment done by Jon Udell (http://weblog.infoworld.com/udell) in 2001. He used AltaVista instead of Google; see http://udell.roninhouse.com/download/mindshare-script.txt. We appreciate the inspiration, Jon!

1.23.1. The Code

You will need a Google API account (http://api.google.com) as well as the Perl modules SOAP::Lite (http://www.soaplite.com) and HTML::LinkExtor (http://search.cpan.org/author/GAAS/HTMLParser/lib/HTML/LinkExtor.pm) to run the following code. You'll also need a copy of the Google WSDL file in the same directory as the script (http://api.google.com/GoogleSearch.wsdl). Save the following code to a file called mindshare.pl:

#!/usr/bin/perl -w
use strict;
use LWP::Simple;
use HTML::LinkExtor;
use SOAP::Lite;

my $google_key  = "your API key goes here";
my $google_wdsl = "GoogleSearch.wsdl";

# the default path is cut off in the source at each "...".
my $yahoo_dir   = shift || "/Computers_and_Internet/Data_...".
                           "eXtensible_Markup_Language_/RS...";

# download the Yahoo! directory.
my $data = get("http://dir.yahoo.com" . $yahoo_dir) or die $!;

# create our Google object.
my $google_search = SOAP::Lite->service("file:$google_wdsl");
my %urls; # where we keep our counts and titles.

# extract all the links and parse 'em.
HTML::LinkExtor->new(\&mindshare)->parse($data);

sub mindshare { # for each link we find...
    my ($tag, %attr) = @_;

    # only continue on if the tag was a link, the URL isn't one of
    # Yahoo!'s redirector links, and it's a full http URL.
    return if $tag ne 'a';
    return if $attr{href} =~ /us.rd.yahoo/;
    return unless $attr{href} =~ /^http/;

    # and process each URL through Google. the start index and
    # maximum-results values (0, 1) are not visible in the source;
    # one result is enough because only the estimated count is used.
    my $results = $google_search->doGoogleSearch(
        $google_key, "link:$attr{href}", 0, 1,
        "true", "", "false", "", "", ""
    ); # wheee, that was easy.
    $urls{$attr{href}} = $results->{estimatedTotalResultsCount};
}

# now sort and display.
my @sorted_urls = sort { $urls{$b} <=> $urls{$a} } keys %urls;
foreach my $url (@sorted_urls) { print "$urls{$url}: $url\n"; }

1.23.2. Running the Hack

The hack's only configuration, the Yahoo! directory you're interested in, is passed as a single argument (in quotes) on the command line. If you don't pass one of your own, a default directory is used instead:

% perl mindshare.pl "/Entertainment/Humor/Procrastinati...

Your results show the URLs in that directory, sorted by total Google links:

554: http://www.p45.net/
339: http://www.ishouldbeworking.com/
124: http://www.india.com/
45: http://www.geocities.com/SouthBeach/1915/
15: http://www.eskimo.com/~spban/creed.html
15: http://www.jlc.net/~useless/
5: http://www.black-schaffer.org/scp/
2: http://www.angelfire.com/mi/psociety
1: http://www.geocities.com/wastingstatetime/
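The sort-and-display step at the end of the script can be tried on its own, without a Google API key. The following is a minimal sketch, assuming the counts have already been collected into a hash. The rank_by_count helper and the alphabetical tie-breaking are illustrative additions, not part of the original script, and the sample counts are copied from the output above.

```perl
use strict;
use warnings;

# Sample counts copied from the output above; in the real script
# this hash is filled in by the mindshare() callback.
my %urls = (
    "http://www.india.com/"                   => 124,
    "http://www.jlc.net/~useless/"            => 15,
    "http://www.p45.net/"                     => 554,
    "http://www.eskimo.com/~spban/creed.html" => 15,
    "http://www.ishouldbeworking.com/"        => 339,
);

# Hypothetical helper: return the URLs ordered by descending count.
# Ties are broken alphabetically (an assumption added here so the
# order is repeatable; the original sorts on the count alone).
sub rank_by_count {
    my ($counts) = @_;
    return sort { $counts->{$b} <=> $counts->{$a} or $a cmp $b }
           keys %$counts;
}

my @sorted_urls = rank_by_count(\%urls);
print "$urls{$_}: $_\n" for @sorted_urls;
```

Because Perl hashes have no fixed key order, two URLs with the same count can otherwise swap places between runs; the secondary comparison only pins that down.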



1.23.3. Hacking the Hack

Yahoo! isn't the only searchable subject index out there, of course; there's also the Open Directory Project (DMOZ, http://www.dmoz.org), which is the product of thousands of volunteers busily cataloging and categorizing sites on the Web: the web community's Yahoo!, if you will. This hack works just as well on DMOZ as it does on Yahoo!; they're very similar in structure.

Replace the default Yahoo! directory with its DMOZ equivalent:

# the default path is cut off in the source at each "...".
my $dmoz_dir = shift || "/Reference/Libraries/Library_a...Science/".
                        "Technical_Services/Cataloguing...".
                        "Applications/RSS/News_Readers/";

You'll also need to change the download instructions:

# download the Dmoz.org directory.
my $data = get("http://dmoz.org" . $dmoz_dir) or die $!;



Next, replace the lines that check whether a URL should be measured for mindshare. When we were scraping Yahoo! in our original script, we skipped over Yahoo! links and those that weren't web sites:

return if $attr{href} =~ /us.rd.yahoo/;
return unless $attr{href} =~ /^http/;



Since DMOZ is an entirely different site, we'll make sure it's a full-blooded location (i.e., it starts with http://) as before and that it doesn't match any of DMOZ's internal page links. Likewise, we'll ignore searches on other engines:

return unless $attr{href} =~ /^http/;
return if $attr{href} =~ /dmoz|google|altavista|lycos|y...
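To see which links these two checks let through, the filter can be pulled into a small function and run against a few hand-written links. This is a minimal sketch under stated assumptions: wanted_link is a hypothetical helper, the sample links are made up, and the pattern lists only the engine names that are visible in the text above (the rest of the original pattern is cut off).

```perl
use strict;
use warnings;

# Hypothetical helper: true if a link should be measured for mindshare.
# Takes the same (tag, attribute pairs) arguments that HTML::LinkExtor
# passes to its callback.
sub wanted_link {
    my ($tag, %attr) = @_;
    return 0 if $tag ne 'a';                  # only anchor tags
    return 0 unless defined $attr{href};      # anchors without an href
    return 0 unless $attr{href} =~ /^http/;   # full URLs only
    # Only the names visible in the source; extend as needed.
    return 0 if $attr{href} =~ /dmoz|google|altavista|lycos/;
    return 1;
}

# Made-up sample links for illustration.
my @links = (
    [ 'a',   href => "http://www.example.com/" ],
    [ 'a',   href => "/Reference/Libraries/" ],
    [ 'a',   href => "http://search.dmoz.org/" ],
    [ 'img', src  => "http://www.example.com/logo.gif" ],
);

for my $link (@links) {
    my ($tag, %attr) = @$link;
    my ($url) = values %attr;
    printf "%-4s %s\n", (wanted_link(@$link) ? "keep" : "skip"), $url;
}
```

Only the first sample link is kept: the second is a relative path, the third matches one of the excluded names, and the fourth is not an anchor tag.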



Can you go even further with this? Sure! You might want to search a more specialized directory, such as the FishHoo! fishing search engine (http://www.fishhoo.com).

You might want to return only the most linked-to URL from the directory, which is quite easy by piping the results to head, another common Unix utility:

% perl mindshare.pl | head -n 1
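Where head isn't available, the same trimming can be done in Perl before printing. The following is a minimal sketch, assuming %urls and @sorted_urls are built the way mindshare.pl builds them; top_n is a hypothetical helper, and the three sample counts are taken from the default-directory output shown later in this hack.

```perl
use strict;
use warnings;

# Hypothetical helper: the first $n items of a list
# (or the whole list if it has fewer than $n items).
sub top_n {
    my ($n, @items) = @_;
    $n = @items if $n > @items;
    return @items[0 .. $n - 1];
}

# Sample counts for illustration.
my %urls = (
    "http://www.bloglines.com/"  => 22700,
    "http://www.newsburst.com/"  => 24600,
    "http://radio.userland.com/" => 9640,
);

my @sorted_urls = sort { $urls{$b} <=> $urls{$a} } keys %urls;

# Print only the most linked-to URL.
print "$urls{$_}: $_\n" for top_n(1, @sorted_urls);
```

With the sample data this prints the single line for http://www.newsburst.com/, which has the highest count.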



Alternatively, you might want to go ahead and grab the top 10 Google matches for the URL that has the most mindshare. To do so, add the following code to the bottom of the script:

print "\nMost popular URLs for the strongest mindshare:\n";
my $most_popular = shift @sorted_urls;
my $results = $google_search->doGoogleSearch(
    $google_key, "$most_popular", 0, 10,
    "true", "", "false", "", "", ""
);

foreach my $element (@{$results->{resultElements}}) {
    next if $element->{URL} eq $most_popular;
    print "* $element->{URL}\n";
    print "  \"$element->{title}\"\n\n";
}
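The loop above can be exercised without calling Google by feeding it a hand-built structure of the same shape. This is a minimal sketch: the sample result data is made up, related_results is a hypothetical helper, and only the resultElements, URL, and title keys come from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: every result element except the one whose
# URL is the page we searched for. A missing resultElements key is
# treated as an empty result set.
sub related_results {
    my ($results, $skip_url) = @_;
    return grep { $_->{URL} ne $skip_url }
           @{ $results->{resultElements} || [] };
}

# Made-up stand-in for what doGoogleSearch returns.
my $most_popular = "http://www.example.com/";
my $results = {
    resultElements => [
        { URL => "http://www.example.com/",      title => "Example" },
        { URL => "http://www.example.org/links", title => "Links page" },
    ],
};

for my $element (related_results($results, $most_popular)) {
    print "* $element->{URL}\n";
    print "  \"$element->{title}\"\n\n";
}
```

Filtering with grep before the loop keeps the printing code free of the next-if check and makes the empty-result case explicit.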



Then run the script as usual (the output here uses the default hardcoded directory):

% perl mindshare.pl
24600: http://www.newsburst.com/
22700: http://www.bloglines.com/
9640: http://radio.userland.com/
6890: http://www.feedreader.com/
4770: http://www.sharpreader.net/
4660: http://www.newsgator.com/
3580: http://www.newsisfree.com/
2680: http://www.pubsub.com/
2090: http://www.disobey.com/amphetadesk/
1740: http://www.serence.com/site.php?page=prod_klipfol
1690: http://www.pluck.com/
1610: http://www.rssbandit.org/
1160: http://www.allheadlinenews.com/
1140: http://www.newzcrawler.com/
961: http://www.rojo.com/

Most popular URLs for the strongest mindshare:
* http://www.newsburst.com/Source/?add=PUTyourFEEDurlHE
  ""

* http://deeplinking.net/xmlsrv/rss.php?blog=4
  "Deeplinking"

* http://www.bloglines.com/citations?url=http://www.new
  "Bloglines | Citations"

* http://www.feedforall.com/forum/posting.php?mode=quot
  "FeedForAll :: Post a reply"

Kevin Hemenway and Tara Calishain

Chapter 2. Services

Section 2.1. Hacks 23-50: Introduction
Hack 23. Track Your Investments
Hack 24. Build Your Own Stock Update Email
Hack 25. Download Financial Data Using Excel Web Queries
Hack 26. Convert Currencies with One Click
Hack 27. Do the Math with Yahoo! Calculators
Hack 28. Add a Yahoo! Bookmark with One Click
Hack 29. Import Existing Bookmarks into Yahoo! Bookmarks
Hack 30. Open Yahoo! Bookmarks in a Sidebar
Hack 31. Publish Your Yahoo! Bookmarks
Hack 32. Track the Media's Attention Span over Time
Hack 33. Monitor the News with RSS
Hack 34. Personalize My Yahoo!
Hack 35. Track Your Favorite Sites with RSS
Hack 36. Add a Feed to My Yahoo! with a Right-Click
Hack 37. Build Your Own News Crawler
Hack 38. Replace Your Phone Book with Yahoo!
Hack 39. Monitor Your Commute
Hack 40. Get the Facts at Yahoo! Reference
Hack 41. Find and Rate Movies
Hack 42. Subscribe to Movie Showtimes
Hack 43. View Movie Lists on Your Cell Phone
Hack 44. Plan Your TV Viewing
Hack 45. Create a TV Watch List
Hack 46. Develop and Share a Trip Itinerary
Hack 47. Shop Intelligently
Hack 48. Visualize Your Music Collection
Hack 49. Take Yahoo! on the Go
Hack 50. Stay Connected with Yahoo! Alerts

2.1. Hacks 23-50: Introduction

In addition to pointing people to documents and resources across the Web through Yahoo! Search and the Yahoo! Directory, Yahoo! has become a destination itself. By gathering information from many sources under a single roof, Yahoo! has made following the financial markets [Hack #23], the daily news [Hack #33], or even the products available online [Hack #47] a breeze.

Yahoo! also allows you to personalize the information you find at the sites so the information is more meaningful to you. This means you can gather and track your favorite news sources together at My Yahoo! [Hack #34] or even visualize your personal music collection [Hack #48] in a new way. And once your personal preferences are stored at Yahoo!, they're accessible from any computer connected to the Internet. This is especially useful for storing bookmarks [Hack #28] or keeping tabs on movies you'd like to see [Hack #43].

Yahoo! also has several methods of routing your personalized information to you when and where you need it. Yahoo! Alerts [Hack #50] can send updated information to you via email, instant messenger, or cell phone. And Yahoo! Mobile [Hack #49] can give you access to your settings on a mobile device when you're out and about.

The hacks in this chapter are about personalizing and working with the data you'll find across Yahoo! properties. The hacks here represent only a portion of Yahoo!. Here are some additional Yahoo! properties that didn't make it into the book:

Ask Yahoo!
http://ask.yahoo.com

Banking
http://banking.yahoo.com

Cars
http://autos.yahoo.com

Classifieds
http://classifieds.yahoo.com

Health
http://health.yahoo.com

Horoscopes
http://astrology.yahoo.com

Insurance
http://insurance.yahoo.com

Jobs
http://hotjobs.yahoo.com

Loans
http://loans.yahoo.com

Lottery
http://lottery.yahoo.com

Pets
http://pets.yahoo.com

Real Estate
http://realestate.yahoo.com

Small Business
http://smallbusiness.yahoo.com

Sports
http://sports.yahoo.com
