Saturday, February 28, 2009

My first knol

As I mentioned yesterday, I just wrote my first knol entitled Introduction to Web Search Engines. This article was originally meant for my Internet Development class. I wanted to have them understand some of the technical issues of how search engines work because it affects how a website should be developed to make it Google-friendly. At the same time, I didn't want the article to be overly technical... it needed to convey just enough technical information so my students would get the big picture. Whether I hit that sweet-spot or not is debatable.

I couldn't find any similar articles on the Web (this is close), so I thought it would be useful to put one out there, and I've been eager to try out Google's new knol service. Despite my many frustrations yesterday, it wasn't too bad. I didn't find myself needing to manipulate the raw HTML too much, and the ability to add and manage references was very intuitive.

Yes, the amazing graphics are my own. I know I got skillz. wink

Friday, February 27, 2009

Why I (sometimes) hate cloud computing

Cloud computing offers a lot of benefits, namely access to your data from anywhere.

But today I hate it.

I realize "hate" is a strong word, and I rarely break it out, but today I must.

I decided to write my first Knol yesterday. Google's Knol system uses an online editor and saves your data in their systems (i.e., in the cloud). After I had worked on my article for some time, the system started having problems saving, and a message at the top of the screen warned that Knol would be down for about an hour. So after waiting the hour, I returned to the system and was able to make a lot of progress.

Or so I thought.

This morning when I returned to my article, only the first two paragraphs remained. I accessed the revision system to see if there were previous versions that contained my complete article, but every revision was the same: just two paragraphs. This really jolted me because I had saved repeatedly just so something like this wouldn't happen. I never received a single error message after pressing Save.

Thankfully I had been smart and saved a copy of my article in Microsoft Word... my previous experience with cloud computing told me such a move would be smart. So I didn't lose my cool too much... I just copied and pasted my stuff back into Knol and then worked on some formatting issues.

About 10 minutes after I started editing, I got this error message warning me that my session had expired:


The message suggested I refresh the page. Knowing what I know, this is usually not a good way to repair an "expired" session. But there was nothing else I could do. Sure enough, after refreshing the page, my content was all gone. Back to two paragraphs.

Now I'm hot.

After taking a little walk to calm down, I returned to my office and decided to persist. I think I'm almost done with my Knol now, but I'm still feeling raw. If this is what cloud computing is going to be like, I'd rather stay on the ground. I've never had Word lose my document because my session expired.

Now you may be thinking this is an isolated issue with Knol, but I have had similar losses with Blogger (losing entire blog posts when their system was temporarily inaccessible) and Google Calendar (losing a number of appointments that I had typed in but that were apparently never saved). Maybe this is more a Google problem than a cloud problem, but if Google can't seem to get it right, who will?

Friday, February 20, 2009

Happenings at Harding

There's a lot going on around here at Harding, so a quick post to bring you up to date:
  1. This weekend in the Benson Auditorium, Hal Runkel will be presenting ScreamFree Parenting. Read more about it here.


  2. Jimmy Allen has retired from teaching. If you are a Harding alum, there's a good chance you took his class on Romans. But Dr. Allen is still around... I played basketball with him just a few weeks ago.


  3. Construction of a new Pizza Hut started a few weeks ago, about 50 yards from the Beebe-Capps entrance into campus. Why Harding was unable to purchase the land, I don't know. It's gonna look tacky, but at least it isn't a used car lot.


  4. If you haven't been spammed by the Alumni Office recently, here's your chance to win an all-expense paid Homecoming weekend at Harding. All you have to do is give the Alumni Office the email address of 5 "missing" alumni. The campaign is called Six Degrees of Harding University.


  5. A group of Harding students joined with others in helping storm victims in north-west Arkansas.


  6. The Harding programming team headed by David Farrow smashed the other business programming teams this past weekend in the Axiom programming contest held in Conway, AR. David was actually the lone programmer since his two teammates were just buddies who were there for moral support. David confirmed my suspicion that a CS-trained programmer is 10 times more effective than a business major who knows how to program. wink


  7. Ethan saw his first Harding basketball game last night. He cheered on the Lady Bisons as they won their 6th consecutive game.

Monday, February 16, 2009

How *not* to implement online security

I have an online account with a bank which shall remain nameless. Let's just call them Amgirl Direct. They use a really "sophisticated" security system which they have apparently leased from a third party named Information Technology, Inc.

Here's how I log in to my account:
  1. First I must enter my 9-digit account number, which I have not been able to memorize because I use it only once a month. So I have to search for it in my email each time.

  2. I then am told I need to answer a security question because the bank doesn't recognize my IP address. (Of course it doesn't... my home computer is assigned a new one periodically by my ISP.) The security question is always the same:

    What is your high school mascot?


    I went to two high schools, and I have no idea which mascot I entered originally. But it doesn't matter... if I type in the mascot of either high school, the answer is always wrong.

  3. After I answer the first security question wrong twice, I'm finally asked for my mother's middle name. Thankfully it recognizes my answer to this question.

  4. Next I'm asked to enter my password. But supposedly my password has something to do with an "authentication image" which is always a white vase. I have no idea why. There's no link to an explanation. It's always the same image, and I have only one password, so I'm left wondering what-in-the-world this vase has to do with anything.

    (Note Information Technology, Inc.'s proud declaration of ownership for their system.)

  5. After entering my password, I'm finally logged in (usually). But be careful! If you ever click the back or forward browser button at any time, you are presented with this most unhelpful error message:

    Error

    A Security Error Has Occurred. Your Online Session Has Expired.
    Possible Reasons Include Double Clicking A Link Or Pressing The Browser's Back Forward Or Refresh Buttons.
    Return To The Login Page To Continue Your Session.


    They "expire" my session for using navigation buttons that most users are accustomed to using. And there is no link to a login page... you just have to re-type Amgirl Direct's original URL and proceed through the steps above once again.

I keep asking myself, is using an online bank with a system this lousy really worth the 2.25% APY?

Specifying canonical URLs

Last week the big three search engines (Google, Yahoo, and Live Search) announced their support for a new HTML attribute value which will help prevent search engines from indexing duplicate content. Search engines naturally want to avoid crawling and indexing duplicate content because it lessens the quality of search result pages. Google's Webmaster Central Blog has a good write-up about the new rel="canonical" attribute value.

Essentially, the new attribute value will allow a webmaster to tell a web crawler to ignore a page if it is accessible from another URL. So if I have a single page that is accessible at URLs A, B, and C, I can tell the web crawler that URLs B and C are pointing to the same content as A by placing the following code in the head element of the page:

<link rel="canonical" href="http://foo.com/A" />

When the web crawler grabs the page using URL B or C, it will find the canonical URL A in the head element and can therefore ignore the content, since it duplicates page A.
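To make the crawler's side of this concrete, here is a minimal sketch in Python of how a crawler might check a fetched page for a canonical URL. This is only an illustration, not how any real search engine does it: the names CanonicalFinder, find_canonical, and should_index are made up for this example, and comparing the two URLs as exact strings is a simplifying assumption (a real crawler would normalize URLs and would probably treat the canonical link as a strong hint rather than a command).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head and self.canonical is None:
            attr = dict(attrs)
            # rel may hold several space-separated values, e.g. rel="canonical foo"
            rel_values = (attr.get("rel") or "").lower().split()
            if "canonical" in rel_values and attr.get("href"):
                self.canonical = attr["href"]

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonical(html):
    """Return the canonical URL declared in the page's head, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def should_index(fetched_url, html):
    """Hypothetical policy: index the page unless it names a different canonical URL."""
    canonical = find_canonical(html)
    return canonical is None or canonical == fetched_url
```

With the link element shown above in the page's head, should_index("http://foo.com/A", page) would be true, while the same page fetched as http://foo.com/B would be skipped as a duplicate of A.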

Of course the entire mechanism requires a willing and competent webmaster to implement it. Webmasters who are very concerned about SEO are likely to use it since it will help bolster the PageRank of certain pages. But the rest of us who aren't concerned about our rankings can safely ignore this new functionality.

See also rel="nofollow".

Friday, February 13, 2009

Feel the Nutch burn...

This week I spent all 3 hours of class time showing my Search Engine students how to install, run, and modify Nutch, an open-source search engine written in Java. Since Nutch is new to me as well, I spent several hours last week trying to get familiar enough to walk my students through the time-consuming, error-prone, and laborious process of getting Nutch to run on Windows and in Eclipse.

I have labeled my newbie experience the "Nutch burn." And boy does it.

I followed a couple of tutorials that were pretty helpful, but I ran into several problems that required me to scour the Web looking for solutions. After much trial and error, I was able to overcome them and make some modifications to Nutch in Eclipse. I also got my modifications to run from the command line.

The barrier to entry is so high and the learning curve so steep that it makes me wonder... there's got to be a better way. The goal is for my class to make a major contribution to Nutch. Maybe our contribution could be to make the initial install/edit process just a little easier.

Tuesday, February 10, 2009

Ben Stein speaking at Harding University tonight

Ben Stein will be presenting his thoughts on the economy, etc. tonight at Harding University (7:30pm in the Benson Auditorium). Stein is well-known as an author, entertainer, and humorist. He is especially well-known for the hilarious "Bueller...? Bueller...?" scene in Ferris Bueller's Day Off and more recently for the controversial movie Expelled.

In an ironic twist, Stein was recently uninvited as the commencement speaker at the University of Vermont (technically he uninvited himself). Apparently strong opposition arose from some in the UVM academic community because of Stein's stance in Expelled. The theme of Expelled is that the academic community will shut you out for offering an opinion that differs from the status quo.

Disclaimer: I have not seen Expelled and have no opinion for or against the movie.

Update:

After seeing the talk, here are my impressions: Smart & witty. Loves Sonic. Extremely conservative. Fiercely loyal to Nixon. Not scientifically inclined.

Thursday, February 05, 2009

Why is Facebook hiding network statistics?

A while back I compared the Harding University and Old Dominion University networks on Facebook. But in the last several months, I've noticed that the ability to view network statistics in Facebook seems to have been turned off. I can't seem to find any discussions on the Web about this issue, so either I'm the first one to write about it or I've been over-dosing on crazy pills.

Several months ago, you could see your network's statistics (% male/female, top interests, etc.) by clicking the "Network Statistics" link while browsing your network. The URL looked like this:

http://www.facebook.com/networks/16777927/Harding/

where the number is the network ID assigned by Facebook and Harding is the network's name. When I try to access this URL now, I am redirected to

http://www.facebook.com/editaccount.php?networks

which only allows me to view the networks I'm a part of or join a new one (pictured below).



The interesting thing is that when I searched Google for links to network statistics pages, Google apparently has hundreds of them indexed and cached as shown below. But when you click on any of the links, you are routed back to the screen above, and when you click on a cached link, you are told your search "did not match any documents".



It doesn't look like the Internet Archive has any of these pages archived.

So is anyone else able to view their Facebook network statistics? And what would be Facebook's motivation for hiding this information?


Update later today:

Somehow I missed it... last May Facebook placed an announcement on all network pages:
Network Pages will be discontinued soon
I was able to view a number of cached network pages with Live Search. Although they didn't have Harding's page cached, they had a number of pages from various US cities. Below is a snapshot of Washington DC's page from 10/14/2008 which includes the warning:


Once Live attempts to re-crawl this page, it will disappear into the bit bucket in the sky. All the user comments will also disappear.

It's really a shame Facebook got rid of these pages as they provided an interesting summary of each network.

Monday, February 02, 2009

Is my desktop too cluttered?

I have students stopping by my office all the time saying hello, asking questions, etc. Almost all of them freak out when they see my computer desktop as pictured below.


What really gets them is my task bar: two rows of icons, many of them browser-related. In each browser I usually have 2-4 tabs open.

So how do I find anything?

I try to keep my left-most browser icon set to my email, calendar, and RSS/Atom reader. The next browser holds my blog, Blackboard, and Easel (class grading system). The other browser windows are open to a number of web pages related to my work. I realize that I could group multiple instances of browsers into a single button on the task bar, but I hate doing that. Beyond that, the rest is a hodge-podge collection of whatever app I need open: Visual Studio, Eclipse, Putty, Firefox, Chrome, IE, Notepad, Word, Excel, Adobe Reader, WinEdt, and iTunes.

Because I teach 4 classes and work on research, I am constantly having to switch from one task to another. I've tried virtual desktop software, but I often need to bounce between unrelated windows which I can't have open in multiple desktops. Then I'm left bouncing from one virtual desktop to the next, trying to find what I'm looking for. It usually ends up being more trouble than it's worth.

So this is how I operate. It looks ridiculous, but I make it work. By the way, this is also close to how my real desktop looks.

So am I alone on this, or do others have an equally cluttered computer desktop?

P.S. Dear Microsoft- Why can't I move the icons in my task bar? It would be quite helpful if I could reorder them at times.