The only social media metric that really matters.

Social media can be daunting, even overwhelming, if we look at it as a whole.

It can even be a mess…

One of the best ways to keep things under control is to measure it.

As the saying goes, “you can only improve what you measure”, so knowing the math behind social media certainly helps.

This brings me to a whole list of tools which may or may not provide metrics for measuring social media.

Too long a list? No problem, try reading Social Media Metrics by Jim Sterne.

Still overly complex?

Try reading a research paper published by HP Social Computing Labs: “Predicting the Future with Social Media”.

The premise is this: the rate of tweets posted about you/your product/company is the most accurate predictor of success.
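As a toy illustration of that premise, here is how one might compute the raw tweet rate from a list of timestamps. The function name and the numbers are mine, not from the paper:

```python
from datetime import datetime

def tweet_rate(timestamps):
    """Tweets per hour over the observed window."""
    if len(timestamps) < 2:
        return 0.0
    hours = (max(timestamps) - min(timestamps)).total_seconds() / 3600.0
    return len(timestamps) / hours if hours else float(len(timestamps))

# four tweets spread over three hours -> about 1.33 tweets/hour
ts = [datetime(2011, 3, 11, h) for h in (0, 1, 2, 3)]
print(tweet_rate(ts))
```

The paper itself does more than this, of course; the point is just that the input signal is a simple rate.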

Kudos to HP Social Computing Labs.

Scraping and Analyzing Social Data

[Update, Feb 2013: this tool has evolved to become Querybox, which came in first at Startup Weekend Taiwan 2013 in Tainan.]

[Update: the tool is now featured on First Monday! http://www.firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/3937 ]

I’m currently involved in a National Science Council (Taiwan) funded project that investigates public communication in times of crisis.

I’m in charge of building a tool that helps non-technical users mine and analyze data related to disasters, such as the recent Japanese earthquake, on new social media outlets such as Twitter, Facebook and blogs.

I’ve given it a nickname, “Scraper”, because it scrapes data from social media. Not the most creative name, but it delivers. =)

I do have some experience building similar tools, but on a smaller scale, so it’s certainly a challenge to be involved in a project like this.

Here are some of my thoughts about this project:

Architecture

The tool should contain the following components:

  1. A web-based GUI, so that any user can use it
  2. A visualization layer, so that non-technical users can make sense of the data
  3. A mining layer, so that users can keep track of the topics they are interested in
  4. A computation layer, to allow users to perform various analyses, such as searching for keyword frequency over a certain period of time

Potential Challenges

Computation Layer

The computation layer will be required to do massive calculations in very short periods of time. I’m currently tracking data about the Japanese earthquake (using Twitter’s search API) and have about 300,000-odd tweets. Calculating the most frequent keywords and users on a quad-core, 4 GB RAM personal computer can take more than 40 seconds.
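For a sense of what that computation looks like, here is a minimal single-process baseline using Python’s collections.Counter. The sample tweets are made up; the real tool runs this over the scraped corpus:

```python
from collections import Counter

def top_keywords(tweets, n=10):
    """Count word frequency across tweet texts (single-process baseline)."""
    counts = Counter()
    for text in tweets:
        counts.update(text.lower().split())
    return counts.most_common(n)

tweets = ["Earthquake hits Japan", "japan earthquake death toll rises"]
print(top_keywords(tweets, 3))
```

This is fine at small scale, but it is exactly the kind of loop that crawls once the corpus reaches hundreds of thousands of tweets.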

So the solution will most probably make use of Hadoop or some other distributed computing platform to speed up the process.
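The idea can be sketched without a cluster: split the corpus into chunks, count each chunk independently (the “map” step), then merge the partial counts (the “reduce” step). What Hadoop adds is running the map step on many machines at once. All names below are my own:

```python
from collections import Counter

def map_count(texts):
    """'Map' step: count words in one chunk of the corpus."""
    c = Counter()
    for text in texts:
        c.update(text.lower().split())
    return c

def reduce_counts(partials):
    """'Reduce' step: merge the per-chunk counts into one total."""
    return sum(partials, Counter())

tweets = ["nuclear plant fukushima", "fukushima nuclear update"] * 1000
chunks = [tweets[i::4] for i in range(4)]  # split the work into 4 chunks
total = reduce_counts(map_count(chunk) for chunk in chunks)
print(total.most_common(2))
```

Here the chunks are processed sequentially, so there is no speedup yet; the point is that the split/merge structure maps directly onto a distributed word count.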

Visualization Layer

This is fairly subjective. It depends on what kinds of analysis end users would want to perform. In my case, I foresee that users would want to perform time-series analysis.

For example, based on my current data, I noticed that different keywords “bubble up” on different days: on the first day or so of the Japanese earthquake, keywords like “earthquake” and “death” appear.

Then, by 14 March, approximately 3 to 4 days after the earthquake, keywords like “nuclear” begin to bubble up.

So a time-series analysis of the data will show the different topics that concern users at different points in the event.
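A rough sketch of that analysis: bucket the tweets by day and keep a per-day keyword count. The dates and texts below are illustrative:

```python
from collections import Counter, defaultdict
from datetime import date

def keywords_by_day(tweets):
    """tweets: (date, text) pairs -> {date: Counter of words}."""
    daily = defaultdict(Counter)
    for day, text in tweets:
        daily[day].update(text.lower().split())
    return daily

tweets = [
    (date(2011, 3, 11), "earthquake death toll"),
    (date(2011, 3, 11), "earthquake in sendai"),
    (date(2011, 3, 14), "nuclear plant concerns"),
]
daily = keywords_by_day(tweets)
print(daily[date(2011, 3, 11)].most_common(1))  # [('earthquake', 2)]
```

Plotting each keyword’s daily count as a line is then enough to see topics rise and fall over the course of the event.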

I’m considering using the Google Visualization API, as it should do the job. I think the motion chart tool is very cool.

Technologies used

I intend to add more APIs, such as Facebook and Google Buzz, over the next few months.

End Note

I’m currently focused on Twitter and have almost completed the mining portion of the tool.

I’m now looking forward to implementing the computation and visualization components of the tool, and perhaps support for other media outlets like Facebook and Google Buzz.

Any thoughts? Talk to me here using the comments box.

Visualizing Social Data

[UPDATE on 21st March 2011]

I’m now working on a similar but more powerful tool; read more about it here.

Introduction

This is a public post about a CS project which I did in the previous semester for the course “Special Topics on Computer Science (B)”.

The idea behind this project is to visualize data from Twitter, Facebook and Google Buzz in the form of a TreeMap.

Usage

The usage is as follows:

  1. User inputs topics of interest
  2. Program scrapes for data across Facebook, Twitter and Google Buzz
  3. Results from step 2 are aggregated according to their attributes
  4. User selects the various attributes he wants to visualize
  5. After selecting attributes, the TreeMap is created based on the selected data and attributes
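The aggregation in step 3 can be sketched as nesting records under the selected attributes, which is exactly the hierarchy a TreeMap needs. The attribute names below are made up for illustration:

```python
def build_treemap(records, attrs):
    """Nest records by the chosen attributes; leaves are record counts."""
    tree = {}
    for rec in records:
        node = tree
        for attr in attrs[:-1]:
            node = node.setdefault(rec[attr], {})
        leaf = rec[attrs[-1]]
        node[leaf] = node.get(leaf, 0) + 1
    return tree

records = [
    {"source": "twitter", "topic": "earthquake"},
    {"source": "twitter", "topic": "nuclear"},
    {"source": "facebook", "topic": "earthquake"},
]
print(build_treemap(records, ["source", "topic"]))
# {'twitter': {'earthquake': 1, 'nuclear': 1}, 'facebook': {'earthquake': 1}}
```

Each nesting level becomes one level of the TreeMap, and the leaf counts determine the rectangle sizes.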

Video Demonstration

Here’s a video demo : http://www.youtube.com/watch?v=5DNRd4wrLL0

Environment and Tools Used

Here’s the stuff I’ve used:

An easier way to listen to music from YouTube

I’ve just built a small project, TubeFace.me, over the past couple of days as a form of practice and to scratch my own itch.

I usually use YouTube to listen to music; clicking and searching through the results is quite a hassle to me, not to mention opening new tabs.

I wanted an interface where I can perform a search, view the results and play the video within the same page.

And so, I did just that. You can visit TubeFace.me right here.

So here’s a summary about TubeFace.me:

Features

  • instant search
  • view results on the same page
  • listen/watch videos on the same page
  • uses a grid-based layout
  • personalized recommendations if you log in with Facebook
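To give a rough idea of how the instant search talks to YouTube, here is how a search request URL could be assembled. The gdata endpoint below is YouTube’s Data API v2, the version available at the time (it has since been retired), and the parameters shown should be treated as illustrative:

```python
from urllib.parse import urlencode

def youtube_search_url(query, max_results=10):
    """Build a search URL against YouTube's (now-retired) Data API v2."""
    params = urlencode({"q": query, "max-results": max_results, "alt": "json"})
    return "https://gdata.youtube.com/feeds/api/videos?" + params

print(youtube_search_url("jazz piano"))
```

The page fires one of these requests on each keystroke, renders the JSON results into the grid, and embeds the player inline, which is what keeps everything on a single page.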

Technologies

The technologies I’ve used are:

  • Tornado 1.2.1
  • Ubuntu 10.04 Server Edition
  • jQuery 1.5
  • YouTube API
  • Facebook Graph API

Visit TubeFace.me now and give it a try!