Monthly Archives: October 2006

Fixing the geocoder gem.

I wanted to use the ruby geocoder library on the windows machine, but the installation of the gem failed due to some weird error. I checked the rubyforge project page to see if someone else had a similar problems and someone actually had, but the bug was open for a long time. I decided to fix this issue and found that the problem was due to the fact that windows platform does not allow characters '?' and '&' in the filename with any escaping, period. The said files were used (in a very innovative way I must say!) to test the library by modifying http.rb to return the test datafile contents instead of fetching the URL from the net. (yay open classes in ruby!). The way I fixed the problem was to change the filenames to use '_' instead of '?' and '__' instead of '&'.

I wrote to the developer, but there was no response. Anyway I managed to create a new GEM with the changed files so that this should be installable on windows now. Here are the files if you want to try installing the gem. (Also including the tgz because... it got generated anyway!)
geocoder-0.1.1.gem
geocoder-0.1.1.tgz

More about Phishtank API

Here is what will be good-to-have from phishtank.com API:

  • Good documentation about each interface e.g. how is callback_url used by auth.frob.request API ?
  • Description of all possible fields in return response (all possible XML elements and their possible values)
  • Some test URL's and emails which will return known responses (i.e. phishy URL, good URL, not in the database etc.)
  • Developer mailing list/wiki
  • Response should always honor the responseformat parameter if specified and valid

Phish Tank

http://www.phishtank.com is a new service which aims to help weed out phishing URLs and email addresses using wisdom of the crowds. Users can submit emails/URLs which they suspect of fraud and others can vote if they really are fraudulent or not. I think it is a great concept. There is a REST API using which applications can embed this webservice within them. So for example, there could be a outlook plugin which will display "phishy" email addresses in a special way in order to alert the user immediately. Same for web browsers which can render phishing websites in a special stylesheet. The applications can also add interface for the user to submit suspect pages and email easily without using web browsers.

I checked out the API and it does not feel like it is fully baked! There are interfaces for authorization and checking email/url status and submitting new emails/urls. Some things that stand out immediately are:

  • Exclusive use of SSL for the API access.
  • Parameter authentication (i.e. including cryptographic digest of all the parameters to ensure that parameters are not changed using man-in-the-middle attack)
  • Choice of xml or php output.

The api calling sequence works like this:

  1. User registers on the web for API access and gets api key and shared secret
  2. Using the API, application gets a frob (what is behind the name ?) and authorization url using auth.frob.request
  3. User has to authorize the frob using the authorization url specified in the response. (optionally you can specify callback url which the server will call for authorization, I will need to check this from home when I have access to a server -- the docs are very thin about the mechanism)
  4. Once authorized, app uses the frob and gets a token for short time API access (30 minutes in my tests) (auth.token.request)
  5. App can check token status which tells remaining time on token.(auth.token.status)
  6. App can revoke the token when it is done using it. (auth.token.revoke)
  7. The APIs for check.url, check.email, submit.url, submit.email then use the token.

I did not understand why there is a need for FROB in this, why can't you just get the token from api key and shared secret ? What problem are they solving by this indirection ?

Anyway, here is the ruby script that I used for testing this... I am planning to turn this into a module, but providing it here for early access...
phishtank.rb
config.yml

P.S. the check_url interface is not working, I am getting invalid token error. and the same token can be revoked successfully.
P.P.S. The API uses SSL (no cleartext api available) and ruby's open-uri library insists on checking the server SSL certificate which always fails (probably because signer needs to be trusted by openssl), I had to change it locally to ignore ssl verification in order to proceed.

Update (Oct/12/06): the check.url interface is finally working. For this API, the signature needs to be calculated before escaping the url. I refactored the ruby script a bit to remove redundant code and moved the configuration to a seperate file. I still need to work with the response parser and make it general for all types of responses. XML parsing gets so ugly so fast, it's amazing!