Making Twitter a Reliable News Source

When the tragedy (if that’s even a big enough term) of MH17 occurred, I found out through Twitter. It was reported seemingly within minutes of the crash. I followed the trending terms for Malaysia Airlines and then MH17 for up-to-the-second details of the event. It’s not the first time I’ve done it, either. I have used Twitter the same way to follow much less somber subjects, such as World Cup soccer or the Google I/O developer conference. What I didn’t do was turn on my television or go to the advertising-laden web page of a major news source. I didn’t go to those sources because they were behind what I was seeing on Twitter. Sheer logistics made them slower.

I’m a pretty quick typist. I can type about 350 characters in a minute. At 140 characters per tweet, that means, with the right content, I can submit two and a half tweets per minute. Theoretically, I could summarize most of the facts from the MH17 disaster in less than ten minutes of tweeting. It takes Fox News ten minutes to get Shepard Smith’s hair right. They just can’t compete on speed. Where Twitter starts to miss its mark is in the reliability of the information. Just as easily as the facts, I can tweet out a substantial number of lies in ten minutes. Is there a way for Twitter to give you a reliability index for the tweets you’re reading? I think so, and here’s how.

Let’s start with a simple example: a news event that we know occurred, and we know where it occurred. We know the Major League Baseball All-Star Game happened last week on Tuesday. We know it happened in Minneapolis, and, if I took thirty seconds to Google it, I could probably give you the GPS coordinates of the field. So let’s start there. The best and fastest facts will probably come from someone who is tweeting from close geographic proximity to the game. Easy enough: if they’re close to the event, the reliability goes up.

Now let’s say Jim, a fictional guy at the game, tweets “Jeter hits a home run! Way to go out on top! #allstargame”. Since he’s at the game, there is some reliability to that tweet. But what if Jim tweets, “Babe Ruth hits a home run! #allstargame”? Jim still has the proximal reliability, but he is factually wrong. So we can increase or decrease his reliability based on the presence or absence of a corroborating tweet from an independent source. This works for major events that people care about (Jeter’s home run), but it matters much less for inconsequential items: “The hotdogs at the #allstargame are delicious”.

We strengthen the reliability by the degrees of separation between the people tweeting the information. If Jim and his follower Tom both tweet about Jeter’s home run, it is less reliable than if Jim tweets it and Anne, to whom he has only a fifth-degree connection, also tweets it. Since Jim and Anne both tweeted about Jeter’s home run, and both were near the event, we can assign a pretty high level of certainty that it is accurate. The best part is that this can all be done by content matching and some fancy math.
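To make the “fancy math” a little more concrete, here is a rough sketch of how the three ingredients (proximity, corroboration, and degrees of separation) might combine into one score. Everything in it is my own assumption for illustration: the `Tweet` class, the function names, the exponential falloff with distance, the cap of six degrees, the 0.5 baseline for an uncorroborated tweet, and the approximate coordinates for the field. It also assumes the corroborating tweets have already been matched to the claim by some other step.

```python
import math
from dataclasses import dataclass


@dataclass
class Tweet:
    author: str
    lat: float
    lon: float
    text: str


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometers."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def proximity(tweet, event_lat, event_lon, scale_km=1.0):
    """1.0 at the event, falling toward 0 with distance (assumed falloff)."""
    return math.exp(-distance_km(tweet.lat, tweet.lon, event_lat, event_lon) / scale_km)


def separation_weight(degrees, cap=6):
    """More degrees of separation means a more independent source (assumed cap)."""
    return min(max(degrees, 0), cap) / cap


def reliability(tweet, event_lat, event_lon, corroborators):
    """Score in [0, 1]. corroborators is a list of (Tweet, degrees_of_separation)
    pairs that some earlier content-matching step decided make the same claim."""
    base = proximity(tweet, event_lat, event_lon)
    support = sum(
        proximity(other, event_lat, event_lon) * separation_weight(degrees)
        for other, degrees in corroborators
    )
    # An uncorroborated tweet keeps half of its proximity score (assumed baseline);
    # corroboration pushes it toward the full proximity score.
    return base * (0.5 + 0.5 * (1.0 - math.exp(-support)))


# Approximate coordinates for the ballpark, used only for illustration.
FIELD = (44.9817, -93.2776)
jim = Tweet("jim", 44.9818, -93.2775, "Jeter hits a home run! #allstargame")
tom = Tweet("tom", 44.9816, -93.2777, "Jeter home run! #allstargame")
anne = Tweet("anne", 44.9819, -93.2774, "Jeter just homered #allstargame")

alone = reliability(jim, *FIELD, [])
with_tom = reliability(jim, *FIELD, [(tom, 1)])    # Tom follows Jim directly
with_anne = reliability(jim, *FIELD, [(anne, 5)])  # Anne is five degrees away
```

With these made-up weights, Jim’s tweet scores lowest on its own, a bit higher once Tom backs it up, and highest when Anne does, which is the ordering described above. The specific numbers mean nothing; only the ordering matters.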

Got it?  OK, let’s move on.

Now for a substantially more complicated event: a car crash on the DC Beltway at rush hour that has traffic backed up for miles. Bill tweets, “Guess I’m not getting home early #wreck #dctraffic.” This tweet has no reliability at this point. It’s something Bill put out, but he could have been commenting that he thinks DC traffic, in general, is a wreck. Here’s how it gets reliability AND applicability (a two-for-one, folks). Natalie also tweets out “Accident on the beltway has traffic backed up for miles! #stayaway.” Assuming they are in the same vicinity, her tweet increases the reliability of the report. As in the Jim and Anne example, the less they know each other, the better. More corroborating tweets are better still, and they increase the applicability.
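Here is a rough sketch of what the content-matching step might look like for Bill and Natalie. It is only an illustration under my own assumptions: the stopword list, the tiny hand-built `ASSOCIATIONS` table, and the use of Jaccard overlap as the similarity measure are all stand-ins, not anything Twitter is known to do.

```python
import re

# Small assumed stopword list, just enough for this example.
STOPWORDS = {"i'm", "not", "on", "the", "has", "up", "for", "a", "an", "is", "of", "to", "in"}

# Hypothetical hand-built table mapping related words to one shared term.
ASSOCIATIONS = {"wreck": "accident", "crash": "accident", "dctraffic": "traffic"}


def tokens(text, associations=None):
    """Lowercase the text, drop punctuation and stopwords, map associated words."""
    associations = associations or {}
    words = re.findall(r"[a-z']+", text.lower())
    return {associations.get(word, word) for word in words if word not in STOPWORDS}


def similarity(text_a, text_b, associations=None):
    """Jaccard overlap of the two token sets, from 0.0 (nothing shared) to 1.0."""
    a, b = tokens(text_a, associations), tokens(text_b, associations)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


bill = "Guess I'm not getting home early #wreck #dctraffic."
natalie = "Accident on the beltway has traffic backed up for miles! #stayaway."

naive = similarity(bill, natalie)                  # plain word overlap
linked = similarity(bill, natalie, ASSOCIATIONS)   # overlap after mapping related words
```

On plain word overlap the two tweets share nothing, even though any human can see they describe the same wreck. They only start to match once “wreck” is tied to “accident” and “#dctraffic” to “traffic”, and a hand-built table like this one obviously won’t scale.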

There are a number of hurdles to overcome for this to work. Word association is probably the largest. I can envision a path for machine learning on this, but this post is long enough. I’ll leave it there and hope someone else has some ideas. You’re welcome for the million-dollar idea, Twitter.



