Internet


I was really looking forward to getting a chance to play with Google Wave. If you have not heard about it, Google is experimenting with a new means of communication and collaboration (to destroy any remains of your unused time!). Watch the video of the demo from the Google I/O conference to get a feel for the technology.

The catch line is: what would email look like if it were designed today?

Being Google, they are fairly open about the entire technology. Google Wave is a collection of several components which work in concert to bring us this amazing way to collaborate and communicate:

  • the wave server, which hosts the waves (Google provides an implementation and others are free to run their own, under their own control)
  • the federation protocol, an open specification that allows the servers to talk to each other
  • the client, typically the web browser you use to interact with the wave server, though a sample text client and an Emacs-based client are in development as well!
  • the gadgets, small pieces of code that are embedded in documents and provide a rich look and feel and additional functionality to the wave
  • the robots, automated participants in the wave which can do cool things like correct spelling as you type, syntax-highlight code as it is pasted into the wave, translate between languages etc.

I have spent some time developing a robot called Nokar (meaning assistant or servant in Marathi/Hindi) which can do several things when invited to a conversation: insert images based on specified keywords and translate text between a set of 20 languages, among some other geeky functions. The intention was to learn about the robot protocol. I also created some pages which use the embed API, which allows any web page to embed a wave conversation (or a subset of it). I am also going to experiment with gadgets in the next few weeks, and I will try to document my progress in the next few posts.


Bing (the Microsoft way of googling information) has gone live today! Seems nice and googly! Had some fun with the search suggestions: type linux in the search box (but don’t press enter) and watch all the suggestions :)

It seems they make an AJAX request to http://api.search.live.com/qson.aspx?query=linux every second or so for whatever has been typed, and it returns those wonderful unbiased suggestions.

It sounds like they are trying to do more than just run the search query: adding their own interpretation to the query, organizing results, and paying you money (if you buy something using the search links). Except for the cash-back bait, I haven’t seen anything here that Google does not already have (or cannot implement very quickly). So let’s see how this goes.


Frustrated by Indian Express’s inability to provide individual syndication feeds for its columnists, I have written scripts to parse the HTML pages and generate the feeds myself.

Here are the feeds for
Shekhar Gupta
Tavleen Singh
R. Jagannathan (DNA India)
Arun Shourie
Sudheendra Kulkarni
Ila Patnaik

If you want this for another columnist, let me know and I will add that too. This is very easy to do for Indian Express columnists as I already have the script, but I can also help with other websites.

P.S. The script is in Ruby and I will release the source after I fix some things and clean it up some more.
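
Until the real script is released, here is a minimal sketch of the general idea: pull the column links out of a page and wrap them in RSS 2.0. The markup pattern (links carrying class="column"), the helper names and the example URLs are all hypothetical; the actual Indian Express pages need their own pattern.

```ruby
# Hypothetical markup: assumes each column is linked as
# <a class="column" href="...">Title</a>. Adjust the pattern for the real pages.
COLUMN_LINK = %r{<a\s+class="column"\s+href="([^"]+)"\s*>([^<]+)</a>}

XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escape the characters that are special in XML text.
def xml_escape(text)
  text.gsub(/[&<>"]/, XML_ESCAPES)
end

# Return [url, title] pairs for every column link found in the page.
def extract_columns(html)
  html.scan(COLUMN_LINK).map { |url, title| [url, title.strip] }
end

# Build a bare-bones RSS 2.0 document from the extracted pairs.
def build_feed(feed_title, feed_link, columns)
  lines = ['<?xml version="1.0" encoding="UTF-8"?>',
           '<rss version="2.0">',
           '<channel>',
           "<title>#{xml_escape(feed_title)}</title>",
           "<link>#{xml_escape(feed_link)}</link>",
           "<description>#{xml_escape(feed_title)}</description>"]
  columns.each do |url, title|
    lines << '<item>'
    lines << "<title>#{xml_escape(title)}</title>"
    lines << "<link>#{xml_escape(url)}</link>"
    lines << '</item>'
  end
  lines << '</channel>' << '</rss>'
  lines.join("\n")
end

page = '<li><a class="column" href="http://example.com/col/1">First column</a></li>'
puts build_feed('Example columnist', 'http://example.com/', extract_columns(page))
```

A regular expression is fragile against markup changes; a proper HTML parser would be the sturdier choice once the page structure is known.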

Update: May 29, 2009
Added new feeds for Arun Shourie, Sudheendra Kulkarni and Ila Patnaik

Update: June 22, 2009 – Added columns feed for R. Jagannathan of DNA India

Update: August 11, 2009 – Added feeds for C. Rajamohan, Harsha Bhogle and Shailaja Bajpai.

Update: Sept. 10, 2009 – You can use my shared page from Google Reader to see all the new posts from all of these columnists on a single page.


Some notes from experimenting with my new toy!

The device (DMX) plugs into an HDMI port of the TV and has its own HDMI input, allowing pass-through. The DMX is controlled over a USB cable which runs from the TV to the DMX (I wonder why they do not use HDMI for that!). The DMX has an ethernet port which can be autoconfigured using DHCP (or configured manually from the TV).

Sony has licensed content from a lot of providers (Slacker for radio and tons of video providers like YouTube, Amazon, Yahoo, Blip.tv). They have a portal (http://internet.sony.tv/) which allows you to add your own video links, and some browser extensions which let you link your favorite videos for the DMX to play.

The Firefox extension unfortunately requires an older (2.x) version of Firefox and does not work with newer (3.x) versions. I tweaked the extension and, without any significant changes, it now works okay. Here is a link to it. The extension adds a context menu to Firefox which you can use to direct the DMX to start playing the linked video, bookmark it etc. The caveat is that the link needs to be a raw video file (mp4, mov, avi, divx etc.) and not an HTML page or any other kind of file.

Here is what I learned from reverse engineering the extension code.

The DMX runs a primitive web server on port 9784, which reportedly identifies itself as “Callisto Debug Server v0.2”. A PHP script on it responds to URLs like the following, which work as a REST API for sending commands:

http://192.168.0.101:9784/renderer.php?method=play&url={encoded URL}

The Firefox and Internet Explorer extensions generate these URLs.

Here is what I gathered from the extension code about the API:

Available commands and arguments:

  • play (url => encoded URL to mp4, AVI, MOV file to play)
  • pause
  • stop
  • addbookmark (title=> encoded bookmark title, description=>encoded description, icon => ???, source =>encoded URL)
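
As a rough sketch, the command URLs can be assembled in Ruby like this. The host address is just the one from the example URL above (the DMX gets its address from DHCP, so yours will likely differ), and dmx_command_url is my own helper name, not something from the extension.

```ruby
require 'uri'

# Address taken from the example URL above; an assumption for any other network.
DMX_HOST = '192.168.0.101'
DMX_PORT = 9784

# Build a renderer.php URL for one of the commands listed above.
# params holds the command's arguments, e.g. url:, title:, description:, source:.
def dmx_command_url(method, params = {})
  query = URI.encode_www_form({ method: method }.merge(params))
  "http://#{DMX_HOST}:#{DMX_PORT}/renderer.php?#{query}"
end

puts dmx_command_url('play', url: 'http://example.com/clip.mp4')
puts dmx_command_url('pause')
puts dmx_command_url('stop')
```

Assuming the DMX accepts a plain GET (which is what the URL form suggests), Net::HTTP.get(URI(dmx_command_url('pause'))) should be enough to send a command. Note that encode_www_form writes spaces as +, and I have not checked whether the DMX decodes those in bookmark titles.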

It seems to play mp4 files quite well (I had good success with HD mp4 links from the YouTube and Dailymotion websites, which have a lot of Bollywood movies). The streaming is very smooth and the video quality is acceptable. You can use a website like http://keepvid.com/ to see the hidden video links. These links are then directly playable by the DMX. I will experiment with AVI, MOV and DIVX files next…

Next step for me is to create a simple webservice which does all of these steps and posts the link to the DMX! Can’t wait till I can manage to do that…


I upgraded the WordPress software for this blog to version 2.5.2. After that, none of the posts that had Devanagari (Unicode) text looked okay. After comparing the configuration files, I discovered that the troublesome variable is DB_CHARSET. The default config setting is 'utf8'. But if you have been upgrading from older versions, your database tables are probably in the 'latin1' charset, even though WordPress has been saving Unicode data to them. Once you remove the DB_CHARSET setting (or set it to 'latin1' or ''), things return to normal.

The setting is in the file wp-config.php in the WordPress install directory.
The value before the change:
define('DB_CHARSET', 'utf8');

The value after the change:
//define('DB_CHARSET', 'utf8');
define('DB_CHARSET', '');

So maybe I should try to recreate the tables with correct charset defined in mysql some day ?
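
My guess at what goes wrong can be reproduced outside MySQL. The sketch below takes UTF-8 bytes, misreads them as latin1 characters and converts those to UTF-8 again, which is roughly what I assume happens when a utf8 connection reads a latin1 table. ISO-8859-1 stands in for MySQL's latin1 here, and the helper names are mine.

```ruby
original = 'देवनागरी' # UTF-8 text, as WordPress stored it

# Treat each UTF-8 byte as one latin1 character, then convert those to UTF-8.
# This is the assumed double encoding that garbles the posts.
def misread_as_latin1(text)
  text.dup.force_encoding('ISO-8859-1').encode('UTF-8')
end

# Reverse the damage: map the characters back to single bytes and
# relabel the byte sequence as UTF-8.
def undo_misread(garbled)
  garbled.encode('ISO-8859-1').force_encoding('UTF-8')
end

garbled = misread_as_latin1(original)
puts "original: #{original.length} characters, #{original.bytesize} bytes"
puts "garbled:  #{garbled.length} characters, #{garbled.bytesize} bytes"
puts "restored: #{undo_misread(garbled)}"
```

Every byte of the original turns into a character of its own, so the text grows and becomes unreadable, but nothing is lost: the bytes in the table are still valid UTF-8, which is why turning the charset conversion off brings the posts back.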


Google has launched App Engine, which provides developers with a platform SDK (Python based!) and hosting, with access to their own Google BigTable database! This competes with Amazon.com’s SQS (queuing), S3 (storage) and EC2 (hosting) services, which are used by many startups… The applications get Google’s massively scalable infrastructure and failover. Apps can also easily use Google’s user authentication, analytics and other Google APIs.

The applications gallery points to some cool goodies… The applications would get a subdomain under appspot.com. So it is possible to run a search on google to find the existing applications.

Here is a python shell web app. You can see the loaded modules, enter and run some small programs…

Both companies are trying to entice developers from hot startups into using their infrastructure, so just in case they start getting bigger, it is easier to assimilate them! Let there be competition!


I just upgraded my Django tree, which recently merged in Unicode support. This immediately broke the Django templates for Venus. Here is what you need to change in planet/shell/dj.py to account for the new Django changes:

43c43,46
< f.write(t.render(context))
---
> ss = t.render(context)
> if isinstance(ss,unicode):
>     ss=ss.encode('utf-8')
> f.write(ss)

This is probably because render now returns unicode strings, which need to be encoded to byte strings before they are written out.

Update: I found out that my changes broke it for people using older versions of Django. I have updated the patch above to account for that.


I created a small form with which you can search the web for Unicode Devanagari words. It is very cumbersome to enter Unicode Devanagari characters on QWERTY keyboards, so I have adopted the phonetic transliteration scheme from the Manogat website. Do give it a try:

http://amit.chakradeo.net/search/ (Link now removed, please see update below)

Start typing Devanagari words phonetically and you will see the Unicode characters in the input area. When you hit enter, the phrase will be submitted to Google.

Update: (2009-07-14) I have taken down my link now, as there are many more effective alternatives to do this. Check out the following pages from google:
Google Indic Transliteration
Bookmarklets to transliterate any text element on any webpage.
A simple form for transliterating any text.


Here is what would be good to have from the phishtank.com API:

  • Good documentation about each interface e.g. how is callback_url used by auth.frob.request API ?
  • Description of all possible fields in return response (all possible XML elements and their possible values)
  • Some test URL’s and emails which will return known responses (i.e. phishy URL, good URL, not in the database etc.)
  • Developer mailing list/wiki
  • Response should always honor the responseformat parameter if specified and valid

http://www.phishtank.com is a new service which aims to help weed out phishing URLs and email addresses using the wisdom of the crowds. Users can submit emails/URLs which they suspect of fraud, and others can vote on whether they really are fraudulent. I think it is a great concept. There is a REST API through which applications can embed this webservice. So, for example, there could be an Outlook plugin which displays “phishy” email addresses in a special way in order to alert the user immediately. The same goes for web browsers, which could render phishing websites with a special stylesheet. Applications can also add an interface for the user to submit suspect pages and emails easily without using a web browser.

I checked out the API and it does not feel like it is fully baked! There are interfaces for authorization and checking email/url status and submitting new emails/urls. Some things that stand out immediately are:

  • Exclusive use of SSL for the API access.
  • Parameter authentication (i.e. including cryptographic digest of all the parameters to ensure that parameters are not changed using man-in-the-middle attack)
  • Choice of xml or php output.

The api calling sequence works like this:

  1. User registers on the web for API access and gets api key and shared secret
  2. Using the API, application gets a frob (what is behind the name ?) and authorization url using auth.frob.request
  3. User has to authorize the frob using the authorization url specified in the response. (optionally you can specify callback url which the server will call for authorization, I will need to check this from home when I have access to a server — the docs are very thin about the mechanism)
  4. Once authorized, app uses the frob and gets a token for short time API access (30 minutes in my tests) (auth.token.request)
  5. App can check token status which tells remaining time on token.(auth.token.status)
  6. App can revoke the token when it is done using it. (auth.token.revoke)
  7. The APIs for check.url, check.email, submit.url, submit.email then use the token.

I did not understand why there is a need for FROB in this, why can’t you just get the token from api key and shared secret ? What problem are they solving by this indirection ?

Anyway, here is the ruby script that I used for testing this… I am planning to turn this into a module, but providing it here for early access…
phishtank.rb
config.yml

P.S. The check_url interface is not working: I am getting an invalid token error, even though the same token can be revoked successfully.
P.P.S. The API uses SSL (no cleartext API available) and Ruby’s open-uri library insists on verifying the server’s SSL certificate, which always fails (probably because the signer needs to be trusted by OpenSSL). I had to change it locally to ignore SSL verification in order to proceed.

Update (Oct/12/06): The check.url interface is finally working. For this API, the signature needs to be calculated before escaping the URL. I refactored the Ruby script a bit to remove redundant code and moved the configuration to a separate file. I still need to work on the response parser and make it general for all types of responses. XML parsing gets so ugly so fast, it’s amazing!
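
To illustrate the ordering that tripped me up, here is a small sketch of sign-then-escape. Be warned that the signing recipe itself is an assumption: it is the Flickr-style scheme (MD5 over the shared secret followed by the sorted parameter names and values), and the parameter names app_key and sig are placeholders. The real PhishTank scheme may well differ.

```ruby
require 'digest/md5'
require 'uri'

# ASSUMPTION: Flickr-style signature, i.e. MD5 of the shared secret followed
# by every parameter name and raw value, sorted by name. Not taken from the
# PhishTank documentation.
def sign_params(params, shared_secret)
  base = params.sort_by { |key, _| key.to_s }
               .map { |key, value| "#{key}#{value}" }
               .join
  Digest::MD5.hexdigest(shared_secret + base)
end

# Sign the raw, unescaped values first; escape only when building the query.
def signed_query(params, shared_secret)
  signature = sign_params(params, shared_secret)
  URI.encode_www_form(params.merge('sig' => signature)) # 'sig' is a placeholder name
end

params = { 'app_key' => 'my-api-key', 'url' => 'http://example.com/login?next=/home' }
puts signed_query(params, 'my-shared-secret')
```

Signing the already-escaped URL produces a different digest from the one the server computes over the decoded value, which would be one way to end up with the kind of error I was seeing.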


There is a big copyright violation fight going on between copyright holders and Google about what is fair use and what is a violation. I came across a great article by Cory Doctorow on this issue. He is firmly on the side of Google and lists the three main points of contention:

  1. Google should cut copyright holders in for a slice of any revenue that comes from this.
  2. Google should have obtained permission before scanning the GBS books
  3. Although Google only shows excerpts, wily hackers could eventually piece together enough excerpts to reproduce the entire GBS library and then post it on the Internet

He then explains why each of the three points is invalid. As he rightly points out, the biggest threat to an author isn’t piracy, it’s obscurity.

He quotes Tim O’Reilly, who says “Piracy is progressive taxation”. This appears in this article, which is also a must-read. It was written in 2002 and was about the legality of online file sharing.

If I were a copyright holder (I mean a big copyright holder), my own stand would be to let Google scan and index all my work (with possible penalties if any evidence was found that people can hack Google’s system to reconstruct the complete work). The publishing industry will definitely move online and I will gain more if people can find links to my work when they are searching for related content.

The del.icio.us folks have a nifty piece of JavaScript which adds a small play button to all the mp3 links on your webpage. (This in fact embeds a small Shockwave/Flash player for each link.) All you need to do is include the following code in the head section of your webpage:


<script type="text/javascript" src="http://del.icio.us/js/playtagger"></script>

Here is a link to their webpage. (Notice the small play button just before the link to the audio file.)

Here are some popular audio links contributed by del.icio.us users.

Update: And then there is Yahoo! Media player with nicer look and more features. You just need to add
<script type="text/javascript" src="http://mediaplayer.yahoo.com/js"></script>

at the end of the html (just before closing </body>)

Find latitude and longitude of any address

Yahoo just released a new beta of their maps webservice. Here is a small Ruby script (inspired by Rasmus’s PHP code) that I wrote that returns the latitude and longitude of the address provided…

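require 'open-uri'
require 'rexml/document'

# Yahoo Maps geocoding endpoint (V1 beta); appid identifies this script
url = 'http://api.local.yahoo.com/MapsService/V1/geocode?appid=yahoomap.rb&location='
puts 'Enter Location: '
# chomp drops the trailing newline before the address is URL-encoded
address = URI.encode_www_form_component(gets.chomp)
result = URI.open(url + address).read
doc = REXML::Document.new(result)
r = doc.elements['/ResultSet/Result']
puts "Precision: #{r.attributes['precision']}"
# Print each child element of the result with its value
r.elements.each { |c| puts "#{c.name} : #{c.text}" }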

Update: The script is also available on GitHub.


I promised here that I would polish my library lookup script and post it, but I haven’t yet found any time to do that. I am posting it as is so someone could work on it and make it better…

I have tried to look for the ISBN in both SDCL and SDPL, and it inserts the message and the link correctly. However in some of the cases, amazon inserts some more elements inside the same div container, so the links appear much later. (not immediately following the title). I have an idea of how to fix this and am going to fix this very soon…

Get the LibraryLookup Greasemonkey script from here.

(Usage: Right-click on the link and select “Install User Script” from the menu, OR open it in Firefox, go to Tools/Install this user script and click OK. From now on you will see whether any book that you are browsing on amazon.com is in SDCL or SDPL!)

Update: 11/30/05. The new bookburro extension for firefox supports SDPL and SDCL libraries by default, I will not have to update my miserly scripts now…

On the internet nobody knows you are a dog
Unfortunately it’s only true in cartoons! Basically you are leaving a surprisingly easy trail of the websites you visit. Visit Test anonymity if you want to find what web servers can know about you. A determined person can find out about the websites you browsed, what you did on each of them etc.

There are some commercial services like anonymizer that insert a random proxy between you and the destination web server. There are also a number of HTTP/Socks proxies that you can use. But then all of your traffic is subject to monitoring by these people.

The Freenet project takes anonymity to the other extreme: you can access content that you may not be able to access otherwise, and it also provides anti-censorship / anti-banning features. But it has always been very slow and prone to protocol changes (i.e. sites working one day do not work the next because of the release of a new protocol and peer software).

The Tor project takes another approach. The endpoints are still the same, but all your packets are routed through random combinations of Tor routers. The routing technology is called onion routing: the packet is wrapped in one layer of encryption per hop, each hop peels off only its own layer, and no intermediate hop knows both the sender and the contents of the packet. There is also a provision for hidden services (any TCP protocol), which are not accessible from the regular internet and which come close to achieving what Freenet does. I have been using Tor for some time now and have noticed some things:
* The performance is improving a great deal (as more and more Tor nodes are commissioned, performance should keep getting better)
* You can get routed through a completely different continent, so going to Google might open their German page (because they send you the German page if they detect that your IP address is from Germany)
* This service might be easily abused by spammers who will definitely want to route spam through Tor, child pornographers who can host “hidden services”, and downloaders of illegal content (though I believe many Tor nodes block SMTP and peer-to-peer traffic). I guess there is a price to be paid for “really free speech”
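
The layering idea behind onion routing can be shown with a toy example. This is only a sketch of the wrapping and peeling, not of Tor itself: the keys are random bytes handed to both sides, AES-128-CBC is my choice of cipher, and there is no routing information, key negotiation or integrity check.

```ruby
require 'openssl'

CIPHER_NAME = 'aes-128-cbc' # toy choice; not what Tor actually uses
IV_LENGTH = 16

# Add one layer: encrypt the payload for a single hop, prepending a fresh IV.
def encrypt_layer(key, payload)
  cipher = OpenSSL::Cipher.new(CIPHER_NAME)
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # generates the IV and sets it on the cipher
  iv + cipher.update(payload) + cipher.final
end

# Peel one layer: a hop removes its own encryption and nothing else.
def decrypt_layer(key, data)
  cipher = OpenSSL::Cipher.new(CIPHER_NAME)
  cipher.decrypt
  cipher.key = key
  cipher.iv = data.byteslice(0, IV_LENGTH)
  cipher.update(data.byteslice(IV_LENGTH..-1)) + cipher.final
end

# The sender wraps for the last hop first, so the first hop's layer is outermost.
def wrap_onion(hop_keys, message)
  hop_keys.reverse.reduce(message) { |payload, key| encrypt_layer(key, payload) }
end

# Each hop in turn peels its layer; only after the last hop is the message readable.
def route(hop_keys, onion)
  hop_keys.reduce(onion) { |payload, key| decrypt_layer(key, payload) }
end

hop_keys = Array.new(3) { OpenSSL::Random.random_bytes(16) }
onion = wrap_onion(hop_keys, 'GET / HTTP/1.0')
puts "onion is #{onion.bytesize} bytes"
puts route(hop_keys, onion)
```

After the first hop has peeled its layer, what it passes on is still encrypted for the remaining hops, which is why a node in the middle learns nothing about the contents.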

