This article is a techy one :) At Kelkoo, I work on the product side, managing the product development team, but I also like doing a bit of development, both at work and at home… Lately, I had two technical problems to solve, and the Yahoo! Query Language (YQL) helped me a great deal with both…
Yahoo! makes a lot of structured data available to developers, primarily through its web services. These services require developers to locate the right URLs and documentation to access and query them, which can result in a very fragmented experience. The YQL platform provides a single endpoint service that enables developers to query, filter and combine data across Yahoo! and beyond. YQL exposes a SQL-like SELECT syntax that is both familiar to developers and expressive enough for getting the right data. Through the SHOW and DESC commands, developers can discover the available data sources and their structure without opening another web browser.
By submitting YQL statements to the Yahoo! YQL web service, developers can retrieve the results of their queries in XML or JSON, two formats that are easy to parse in any language.
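To make the "single endpoint" idea concrete, here is a minimal Python sketch that builds the request URL for a YQL statement. The endpoint URL and the `q` / `format` parameter names are assumptions based on how the public YQL web service was documented, and `build_yql_url` is a hypothetical helper name; the sketch only builds the URL and does not call the service.

```python
from urllib.parse import urlencode

# Assumption: the public YQL endpoint as it was documented by Yahoo!.
YQL_ENDPOINT = "https://query.yahooapis.com/v1/public/yql"


def build_yql_url(statement, fmt="json"):
    """Return the GET URL that submits a YQL statement and asks for `fmt` output."""
    if fmt not in ("json", "xml"):
        raise ValueError("format must be 'json' or 'xml'")
    # urlencode takes care of escaping spaces, quotes, '&' and '=' in the statement.
    return YQL_ENDPOINT + "?" + urlencode({"q": statement, "format": fmt})


# SHOW TABLES is the discovery command mentioned above.
print(build_yql_url("show tables"))
```

The whole statement travels in one query parameter, so switching between XML and JSON output is just a matter of changing the second parameter.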
At Kelkoo, I have been getting interested in SEO lately, and I needed a way to extract Google Search results as an XML file. However, Google doesn’t provide any API to do so. With YQL and XPath, that’s straightforward.
Using the YQL console, I simply used the following YQL statement to parse a Google result page:
select * from html where url="http://www.google.fr/search?hl=fr&q=kelkoo" and xpath='//h3[@class="r"]/a'
Used? Yes, in the past tense: for a few days now, YQL has been answering “robots.txt for that domain disallows crawling for that url”… Too bad: Google no longer allows this method and has blocked the YQL web services :(
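To show what the XPath expression in that statement selects, here is a small Python sketch that applies the same path to a local snippet with the standard library. The sample markup is a made-up, well-formed stand-in for a result page (not real Google output), and `extract_result_links` is a hypothetical helper; YQL's html table did the fetching and parsing on Yahoo!'s side.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a search result page.
SAMPLE_PAGE = """
<html><body>
  <h3 class="r"><a href="http://www.kelkoo.fr/">Kelkoo</a></h3>
  <h3 class="other"><a href="http://example.com/skip">Not a result</a></h3>
  <h3 class="r"><a href="http://example.com/about">About Kelkoo</a></h3>
</body></html>
"""


def extract_result_links(page):
    """Return (title, href) pairs for every <a> directly under <h3 class="r">."""
    root = ET.fromstring(page.strip())
    # ElementTree only accepts relative paths, hence the leading "." before "//".
    return [(a.text, a.get("href")) for a in root.findall(".//h3[@class='r']/a")]


for title, href in extract_result_links(SAMPLE_PAGE):
    print(title, href)
```

Only the links under headings with class "r" come back; the heading with another class is skipped, which is exactly the filtering the xpath clause performs in the YQL statement.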
I’m using Flickr to host the images for this blog, and I had written some code to display photos using the phpflickr API… It worked well, but I had to deal with authentication (Flickr API key, secret key, token…), which is quite complicated given that all my photos are public on Flickr… Using YQL, I can easily query the Flickr API and get the results as JSON (even simpler to parse in PHP than XML) without any authentication. For instance:
select * from flickr.photos.info where photo_id=3303578613
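Here is a rough sketch of the parsing step, written in Python rather than PHP. The sample response is hypothetical: the `query` / `results` envelope is an assumption about the shape of YQL's JSON output, and the `photo` fields shown are illustrative, not actual data for this photo. `photo_title` is a made-up helper name.

```python
import json

# Hypothetical response body; the envelope and field names are assumptions.
SAMPLE_RESPONSE = """
{"query": {"count": 1,
           "results": {"photo": {"id": "3303578613",
                                 "title": "Sample title"}}}}
"""


def photo_title(raw):
    """Return the photo title from a YQL-style JSON response, or None if empty."""
    data = json.loads(raw)
    results = data["query"]["results"]
    # Assumption: an empty result set is reported as a null "results" value.
    if results is None:
        return None
    return results["photo"]["title"]


print(photo_title(SAMPLE_RESPONSE))
```

No API key, secret or token appears anywhere in the sketch, which is the point made above: for public photos, the JSON can be read directly.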
Overall, the YQL language and the YQL console are convenient tools for developers. They don’t solve any new problems, but they provide a faster answer to common ones. Sure, they are more valuable when you want to access Yahoo!-owned data (those tools are part of the Y!OS – Yahoo! Open Strategy – initiative); but the ability to easily parse HTML, ATOM, RSS, CSV, Microformats, XML… makes them nice tools to add to a developer’s toolbox.