nicolas leroy

Yahoo! Query Language rocks!!

February 27, 2009

This article is a techy one :) At Kelkoo, I work on the product side, managing the product development team; but I like doing a bit of development, both at work and at home… Lately, I had two technical problems to solve, and the Yahoo! Query Language (YQL) helped me a great deal with both…

What is YQL?

In Yahoo!’s own words:

Yahoo! makes a lot of structured data available to developers, primarily through its web services. These services require developers to locate the right URLs and documentation to access and query them which can result in a very fragmented experience. The YQL platform provides a single endpoint service that enables developers to query, filter and combine data across Yahoo! and beyond. YQL exposes a SQL-like SELECT syntax that is both familiar to developers and expressive enough for getting the right data. Through the SHOW and DESC commands we enable developers to discover the available data sources and structure without opening another web browser.

By writing YQL statements and submitting them to the Yahoo! YQL web service, developers can easily retrieve the results of their queries as XML or JSON, two formats that are easy to parse in any language.
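To make that a bit more concrete, here is a minimal Python sketch that builds the request URL for a YQL statement. It assumes the public endpoint is `http://query.yahooapis.com/v1/public/yql` and that it takes the statement in a `q` parameter and the output format in a `format` parameter; check the YQL documentation before relying on either. The helper name `build_yql_url` is made up for this example.

```python
from urllib.parse import urlencode

# Assumed public YQL endpoint (not confirmed by this article).
YQL_ENDPOINT = "http://query.yahooapis.com/v1/public/yql"


def build_yql_url(statement, fmt="json"):
    """Return the GET URL that submits `statement` and asks for `fmt` output."""
    if fmt not in ("json", "xml"):
        raise ValueError("fmt must be 'json' or 'xml'")
    # urlencode takes care of escaping spaces, quotes and slashes in the statement.
    return YQL_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


if __name__ == "__main__":
    print(build_yql_url("show tables", "xml"))
```

Under those assumptions, `build_yql_url("show tables", "xml")` gives `http://query.yahooapis.com/v1/public/yql?q=show+tables&format=xml`, which can then be fetched with any HTTP client.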

Using YQL to retrieve Google Search results as XML

At Kelkoo, I’ve been getting interested in SEO lately, and I needed a way to extract Google Search results as an XML file. However, Google doesn’t provide any API to do so. With YQL and XPath, that’s straightforward.

Using the YQL console, I simply used the following YQL statement to parse a Google result page:

select * from html where url="http://www.google.fr/search?hl=fr&q=kelkoo" and xpath='//h3[@class="r"]/a'
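To show what that XPath expression actually selects, here is a small Python sketch that applies the same path to a local snippet. The markup in `SAMPLE` is a hypothetical, heavily simplified stand-in for a result page, and `result_links` is a name invented for this example. It also says nothing about how YQL evaluates the expression on its side; it only illustrates the selection logic.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a result page; real pages are far messier
# and usually not well-formed XML, which ElementTree requires.
SAMPLE = """
<html><body>
  <h3 class="r"><a href="http://www.kelkoo.fr/">Kelkoo</a></h3>
  <h3 class="other"><a href="http://example.com/skip">Not a result</a></h3>
  <div><h3 class="r"><a href="http://example.com/two">Second result</a></h3></div>
</body></html>
"""


def result_links(markup):
    """Return (href, text) pairs for every <a> directly under an <h3 class="r">."""
    root = ET.fromstring(markup.strip())
    # ElementTree only accepts relative paths, so '//h3' is written './/h3'.
    anchors = root.findall(".//h3[@class='r']/a")
    return [(a.get("href"), a.text) for a in anchors]


if __name__ == "__main__":
    for href, text in result_links(SAMPLE):
        print(href, text)
```

Only the two links sitting under an `h3` with class `r` are returned; the `h3` with another class is skipped, at whatever depth it appears.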

Used? Yes, past tense: for a few days now, YQL has been answering “robots.txt for that domain disallows crawling for that url”… Too bad: Google no longer allows this method, and has blacklisted the YQL web services :(

Retrieving Flickr data without bothering about authentication

I’m using Flickr to host the images for this blog, and I had written some code to display photos using the phpflickr API… It worked well, but I had to deal with authentication: Flickr API key, secret key, token… Quite complicated, given that all my photos are public on Flickr… Using YQL, I can easily query the Flickr API and get the results as JSON (even simpler to parse in PHP than XML) without any authentication. For instance,

select * from flickr.photos.info where photo_id=3303578613
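Parsing the JSON answer then takes only a few lines. The sketch below is in Python rather than PHP, and it assumes the response wraps the data in a `query` object whose `results` field holds a `photo` record, and that `results` is empty when nothing matches. That envelope, the sample payload and its placeholder title, and the `extract_photo` helper are all assumptions made for illustration, not something shown in the console output.

```python
import json


def extract_photo(raw):
    """Pull the photo record out of a YQL JSON response, or None if absent."""
    envelope = json.loads(raw)
    # Assumed envelope: {"query": {"results": {"photo": {...}}}}
    results = envelope.get("query", {}).get("results")
    if not results:
        return None
    return results.get("photo")


if __name__ == "__main__":
    # Hypothetical payload; the title is a placeholder, not the real photo title.
    sample = (
        '{"query": {"count": 1, "results": '
        '{"photo": {"id": "3303578613", "title": "placeholder title"}}}}'
    )
    photo = extract_photo(sample)
    print(photo["id"], photo["title"])
```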

YQL console

Overall, the YQL language and the YQL console are convenient tools for developers. They don’t solve any new problems, but they provide a faster answer to common ones. Sure, they are most valuable when you want to access Yahoo!-owned data (those tools are part of the Y!OS – Yahoo! Open Strategy – initiative); but the ability to easily parse HTML, Atom, RSS, CSV, Microformats, XML… makes them nice tools to add to a developer’s toolbox.


2 comments

Jonathan
on Feb 28, 2009 / 3pm
Nicolas,

We added support for what we call "open data tables" a few weeks ago which enables any developer to create and use a YQL table for most web services, so far less Yahoo!-centric now (and, as you mentioned, we've always had the "generic" tables like xml and html).

Also, one of the really cool things with YQL is the ability to join across any web service. That's something you might want to explore too.

Jonathan

nicolasleroy
on Mar 01, 2009 / 9am
Hi Jonathan,
Indeed "open data tables" seem promising, I will def have a look at this new feature!
Thanks,
Nicolas