Showing posts with label java. Show all posts

Saturday, August 02, 2008

Fav5

My pick of the week's top 5 items of interest:
  1. In The 'Anti-Java' Professor and the Jobless Programmers, CS professor Robert Dewar complains of the Java-centric mentality that many American CS graduates have today. Here are two questions Dewar might ask a recent graduate in an interview in order to separate the wheat from the chaff:
    1) You begin to suspect that a problem you are having is due to the compiler generating incorrect code. How would you track this down? How would you prepare a bug report for the compiler vendor? How would you work around the problem?

    2) You begin to suspect that a problem you are having is due to a hardware problem, where the processor is not conforming to its specification. How would you track this down? How would you prepare a bug report for the chip manufacturer, and how would you work around the problem?

  2. In Google Still Not Indexing Hidden Web URLs, Kat Hagedorn and Joshua Santelli follow up on a paper I published two years ago. Apparently Google is still not doing a good job indexing the OAI-PMH corpus; only 44% of the URLs tested were indexed by Google.

  3. Carnegie Mellon has just introduced a new master's degree: Master of Tangible Interaction Design. The one-year degree combines computer science and architecture.

  4. The Google Blog has a series of posts discussing how Google ranks their results. The discussion is understandable for non-techies and delves into the psychology of web search.

  5. Just for fun: Super Mario Bros. in 20 lines of JavaScript.

Thursday, July 10, 2008

Yahoo's new Search BOSS API

Yahoo has just released a new web search API called BOSS (Build your Own Search Service) which improves on their earlier API in several ways:
  1. No daily query limits.

  2. No restrictions on how the results are displayed, ordered, or mixed in with other proprietary results.

  3. Ability to make money showing paid results.

The BOSS API is REST-based. You can receive results in either JSON or XML format, and you can get 10-50 results back per query.
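Since the API is REST-based, a request is just a URL. Here is a minimal sketch of building one, using the endpoint and the appid, count, and format parameters from the full example later in this post. The class and method names are hypothetical, and the value "xml" for the format parameter is my assumption based on the XML support mentioned above. The count is clamped to the 10-50 range quoted above; whether the API itself accepts smaller counts is not something this sketch checks.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration; not part of any Yahoo library.
final class BossUrlBuilder {

    // Endpoint as used in the full example in this post
    private static final String ENDPOINT =
            "http://boss.yahooapis.com/ysearch/web/v1/";

    // Builds a request URL, clamping count to the 10-50 range.
    // format is "json" (shown in this post) or "xml" (assumed value).
    static String build(String query, String apiKey, int count, String format) {
        int clamped = Math.max(10, Math.min(50, count));
        try {
            return ENDPOINT + URLEncoder.encode(query, "UTF-8")
                    + "?appid=" + apiKey
                    + "&count=" + clamped
                    + "&format=" + format;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "KEY" stands in for a real API key
        System.out.println(build("questio verum", "KEY", 50, "xml"));
    }
}
```

The helper only assembles the string; fetching and parsing the response works the same way as in the full example.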

There is one item that appears to be missing without explanation: the cached URL of each search result. This URL is useful to the user when the result's live URL is not responding. The old Yahoo web search API did provide this, so I'm not sure why it was dropped in BOSS.

One thing that makes me a little nervous about the API from a researcher's perspective is the prohibition in their Terms of Service against analyzing their search results:
You will not, will not attempt, or will not permit or take actions designed to enable other third parties to: ... perform any analysis, reverse engineering or processing of the Web Search Results
Analyzing the Yahoo search results is exactly what I did in my paper Agreeing to Disagree: Search Engines and their Public Interfaces. Well, better to do and ask forgiveness than get permission up front. ;-)

So here's a simple example in Java using the new BOSS API to search for the title of my blog "questio verum", the index status of my blog's root page, and all the pages indexed for my blog. To make this example work for you, simply put your Yahoo API key in API_KEY.

Note that this example is very similar to the Google AJAX example in Java from last month.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;
import org.json.JSONArray; // JSON library from http://www.json.org/java/
import org.json.JSONObject;

public class YahooQuery {

    // Yahoo API key
    private static final String API_KEY = "Your Key Here";

    public YahooQuery() {
        makeQuery("questio verum");
        makeQuery("url:http://frankmccown.blogspot.com/");
        makeQuery("site:frankmccown.blogspot.com");
    }

    private void makeQuery(String query) {

        System.out.println("\nQuerying for " + query);

        try {
            // Convert spaces to +, etc. to make a valid URL
            query = URLEncoder.encode(query, "UTF-8");

            // Give me back 10 results in JSON format
            URL url = new URL("http://boss.yahooapis.com/ysearch/web/v1/" + query +
                    "?appid=" + API_KEY + "&count=10&format=json");
            URLConnection connection = url.openConnection();

            // Read the entire response body into one string
            String line;
            StringBuilder builder = new StringBuilder();
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(connection.getInputStream()));
            try {
                while ((line = reader.readLine()) != null) {
                    builder.append(line);
                }
            } finally {
                // Always release the connection's stream
                reader.close();
            }

            String response = builder.toString();

            JSONObject json = new JSONObject(response);

            // Estimated number of matching pages
            System.out.println("\nTotal results = " +
                    json.getJSONObject("ysearchresponse")
                        .getString("deephits"));

            JSONArray ja = json.getJSONObject("ysearchresponse")
                    .getJSONArray("resultset_web");

            // Print the title and URL of each result
            System.out.println("\nResults:");
            for (int i = 0; i < ja.length(); i++) {
                System.out.print((i + 1) + ". ");
                JSONObject j = ja.getJSONObject(i);
                System.out.println(j.getString("title"));
                System.out.println(j.getString("url"));
            }
        }
        catch (Exception e) {
            System.err.println("Something went wrong...");
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        new YahooQuery();
    }
}


Running this program produces the following results:


Querying for questio verum

Total results = 13600

Results:
1. Questio Verum
http://frankmccown.blogspot.com/
2. WikiAnswers - What does questio verum mean
http://wiki.answers.com/Q/What_does_questio_verum_mean
3. Questio Verum: URL Canonicalization
http://frankmccown.blogspot.com/2006/04/url-canonicalization.html
4. Questio Verum: WIDM 2006
http://frankmccown.blogspot.com/2006/11/widm.html
5. Questio Verum: Fav5
http://frankmccown.blogspot.com/2007/09/fav5_29.html
6. Questio Verum: Fav5
http://frankmccown.blogspot.com/2007/12/fav5.html
7. Questio Verum: August 2006
http://frankmccown.blogspot.com/2006_08_01_archive.html
8. Amazon.com: Profile for Questio Verum
http://www.amazon.com/gp/pdp/profile/A2Q6CLLQPXG55A
9. Questio Verum: JCDL 2007 - day 2
http://frankmccown.blogspot.com/2007/06/jcdl-2007-day-2.html
10. Questio Verum: OA debate - Eysenbach and Harnad
http://frankmccown.blogspot.com/2006/05/oa-debate-eysenbach-and-harnad.html


Querying for url:http://frankmccown.blogspot.com/

Total results = 1

Results:
1. Questio Verum
http://frankmccown.blogspot.com/


Querying for site:frankmccown.blogspot.com

Total results = 4080

Results:
1. Questio Verum
http://frankmccown.blogspot.com/
2. Questio Verum: OA debate - Eysenbach and Harnad
http://frankmccown.blogspot.com/2006/05/oa-debate-eysenbach-and-harnad.html
3. Questio Verum: JCDL 2007 - day 2
http://frankmccown.blogspot.com/2007/06/jcdl-2007-day-2.html
4. Questio Verum: No singles here
http://frankmccown.blogspot.com/2007/08/no-single-here.html
5. Questio Verum: Pledge Week and Insults
http://frankmccown.blogspot.com/2007/10/pledge-week-and-insults.html
6. Questio Verum: WIDM 2006
http://frankmccown.blogspot.com/2006/11/widm.html
7. Questio Verum: Fav5
http://frankmccown.blogspot.com/2007/09/fav5_29.html
8. Questio Verum: August 2006
http://frankmccown.blogspot.com/2006_08_01_archive.html
9. Questio Verum: Fav5
http://frankmccown.blogspot.com/2007/06/fav5.html
10. Questio Verum: Fav5
http://frankmccown.blogspot.com/2007/12/fav5.html


Thanks, Martin, for the heads-up on this.

Update on 7/28/2008:

The missing cached URL feature is apparently coming soon.

Thursday, July 03, 2008

Restart applet in Firefox

I was doing some debugging on a Java applet this morning using Firefox 3, and I couldn't figure out how to restart my applet. I was rebuilding my applet and then hitting the refresh button on Firefox, and the old version of my applet was still being executed.

In IE you can force a reload by holding the Ctrl key while clicking refresh (or by pressing Ctrl-F5), but this did not work in Firefox.

I finally figured it out: Open the Java Console (available from the Tools menu) and press x, which runs the "clear classloader cache" option. Then press refresh, and the newest version of your applet will load.

Friday, December 28, 2007

Fav5

My pick of the week's top 5 items of interest:
  1. If you use MySpace, Facebook, or other social networking systems, you will appreciate this diagram of your social network. :-)

  2. Kevin Newcomb at Search Engine Watch has compiled a list of the most significant search-related events of 2007. Any of my students looking for a seminar topic for the spring... here's a great place to start!

  3. According to a recent article in InfoWorld, Java is the next COBOL... used in many businesses internally but outdated by newer web technologies like Ruby on Rails, PHP, AJAX, and Microsoft .NET.

  4. Perl 5.10 has just been released. It's the first major upgrade to the language in 5 years. Some new items: smart match operator, switch statement, more powerful regexes, state variables, and more.

  5. And finally, be careful next time you hit reply-all on that email. It's been estimated that interruptions at work and information overload in general cost the US economy $650 billion in 2006.

Thursday, February 02, 2006

Updated C#, VB.NET, Java Comparisons

Today I updated my C# comparison pages for Java and VB.NET.

Maintaining these pages is really time-consuming, but I get so much positive feedback that it makes it worth it. A look at the server logs shows that the C# vs. VB.NET page is especially popular. In any given month it’s typically the 4th most requested URL from the harding.edu website.



When filtering for just pages produced by faculty members, the C# vs. VB.NET page is first by a factor of 4, and the Java 1.5 vs. C# page is around 4th place.



A shocker is my JavaScript vs. VBScript comparison page appearing 7th. I haven’t updated that thing in years, and who is using VBScript anyway? I guess it’s still getting attention from those die-hard ASP programmers. ;)