September 30, 2009

First time so far that I’ve reached out for help (other than for some illustration which I haven’t pushed live yet): @arniehernandez is helping me set up and migrate to a new hosting provider. It will probably be done and ready by tomorrow, but I have a lot going on late this week AND this weekend, plus some possible press on the horizon this Sunday, so I’m reluctant to push live ’til things have settled and we’ve had time to really put the new environment through its paces.

Another potential complication is my long-planned shift to my new Twitter user data retrieval and caching system. In the long run it’ll scale better, but for right now it’s going to seriously tax my current infrastructure (the one we’re moving away from). So I feel stuck!


Wonder Why

September 16, 2009

Great little blurb about GraphEdge today, by Scott Kirsner, on Boston.com

One thing Scott wrote really caught my eye.

Over the past week, I’d somehow attracted 123 new followers, but 21 people chose to stop following me. (Why, I wonder? It can’t be for a lack of wit or trenchant 140-character insights.)

That “wonder why” part is the thing I want to address in upcoming releases. I’ll hopefully be tying growth/decline/churn to tweeting rates, and perhaps to individual tweets. Would be nice to know what you tweeted just before 40 people dumped you, right?! Even better – what’s the profile of the users who dumped you? Maybe you don’t care that 40 spam-like accounts dropped you because of something you tweeted. Or maybe the dumps are from accounts that are right in the heart of your target audience. Either way, it would be nice to know. Today we show you the names, but we need to profile the group as well.

This would be a part of a larger effort to make the report data more actionable. It’s great to look at the stats, but we’ll be working hard in coming weeks to tie-in the “why’s” and the “what now’s” that the data inevitably will surface.
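The churn attribution idea above — tying unfollows back to individual tweets — could be sketched like this. All the tweets, timestamps, and events here are made up for illustration; the idea is just to map each unfollow event to the most recent tweet that preceded it:

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical data for illustration only.
tweet_times = [  # (posted_at, tweet_label), sorted by time
    (datetime(2009, 9, 14, 9, 0), "morning link roundup"),
    (datetime(2009, 9, 14, 13, 30), "opinionated rant"),
    (datetime(2009, 9, 15, 8, 15), "product announcement"),
]
unfollow_events = [
    datetime(2009, 9, 14, 14, 0),
    datetime(2009, 9, 14, 15, 45),
    datetime(2009, 9, 15, 9, 0),
]

def attribute_unfollows(tweets, events):
    """Credit each unfollow to the most recent tweet that preceded it."""
    times = [t for t, _ in tweets]
    counts = {}
    for when in events:
        i = bisect_right(times, when) - 1  # last tweet posted before the event
        if i >= 0:
            label = tweets[i][1]
            counts[label] = counts.get(label, 0) + 1
    return counts

print(attribute_unfollows(tweet_times, unfollow_events))
# {'opinionated rant': 2, 'product announcement': 1}
```

It’s a blunt instrument (the nearest-preceding tweet isn’t necessarily the cause), but as a first cut at the “why’s” it would at least surface suspicious spikes.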

Update – New API Features

September 16, 2009

Just to follow up on the previous post: it looks like the API team has released—not a fix, but a new implementation to replace the old API call. I’ll implement that ASAP to get the app collecting data again for my high-follower accounts.

Twitter API – Big Trouble

September 15, 2009

Very sad to me that, right now, the biggest threat I think my new business is facing isn’t competition from other Twitter apps, or Google, or any other development: it’s the Twitter platform API itself.

For more than a week, one of their primary API calls has been failing. Many apps aren’t aware of it because it only affects API calls that fetch friends/followers for Twitter accounts with large numbers of followers. If all you need is the number (the count), you can get that. If you need to know specifically who is following, you CAN’T get that, even though it’s a basic part of the API.
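For large accounts, the who-is-following call has to be paged with cursors, which is exactly where things break. A minimal sketch of the cursoring pattern, with a stubbed-out fetch function standing in for the real HTTP call (page contents and cursor values here are invented):

```python
def fetch_all_follower_ids(user, fetch_page):
    """Walk cursored pages until the API signals the end with next_cursor == 0."""
    ids, cursor = [], -1  # cursor -1 requests the first page
    while cursor != 0:
        page = fetch_page(user, cursor)
        ids.extend(page["ids"])
        cursor = page["next_cursor"]
    return ids

# Stub standing in for the real HTTP request; data is hypothetical.
_FAKE_PAGES = {
    -1: {"ids": [101, 102, 103], "next_cursor": 5000},
    5000: {"ids": [104, 105], "next_cursor": 0},
}

def fake_fetch(user, cursor):
    return _FAKE_PAGES[cursor]

print(fetch_all_follower_ids("starbucks", fake_fetch))
# [101, 102, 103, 104, 105]
```

The count is cheap because it’s a single field; the full ID list requires every one of these pages to succeed, so a server-side failure partway through the walk leaves you with nothing usable.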

It’s an accepted bug, given high priority, and I still can’t get anyone on the API team to tell me they’re even working on it. I’ve been crying and pleading and lately shouting, both in the bug tracker and the discussion group, and still I’m losing data every day. Worse, my business development efforts are at a virtual standstill, because I cannot show a broken product, and all my most promising leads are (logically) companies with large numbers of followers.

What now?? Is this what I get for building on a young platform? I knew Twitter’s consumer experience was notoriously flaky, but I wouldn’t mind the outages if they’d just support their development community by keeping the building blocks of their API up and running.


September 12, 2009

With the week-long CodeFrenzy over, I’ve finally cleaned up and released all the new features I’ve been developing, including the very cool “more info” report page for followers. We now have a stripe on the bars which shows the GraphEdge average % of followers for each bucket – the relevant-followers stuff. Also, lots of behind-the-scenes improvements and fixes… more than 7M unique Twitter accounts to know about and keep fresh… that took a lot of planning and work! And it’s growing.

Stay tuned for more info about next steps. I’m moving into a marketing/outreach phase, so I don’t know what the next development features will be, but I think it’s time to start getting into retweet analysis. I’m open to suggestions, too, so let me know.

CodeFrenzy: Day 5

August 30, 2009

Only one day ahead of schedule, when I had hoped to be three or more days ahead by now. However, not to fret: this remote task processing thing has completely changed the scale at which the app can handle new accounts with huge sets of followers.

Moving on now to the second of my big tasks, and one that’s far more tangible for the service’s users: an overhaul of the Report. Features I’m weighing right now (please weigh in!):

  • Profiling on followers: histograms on how many other people they’re following, how many followers they have, how recently they’ve been active on Twitter (measured by time since last tweet).
  • Ability to explore a list of my account’s illegitimate followers (including the reason they’re classified as illegit.).
  • “Most relevant” followers’ followees. Currently I do most popular. For Starbucks, for example, about 11,200 of their 290K followers follow Barack Obama in addition to Starbucks. He’s the most popular “peer” to Starbucks, in Starbucks’ set of followers. But everyone follows Obama (2M followers and counting), so that doesn’t really help Starbucks distinguish their followers from any other Twitter users. So I’ll create a ratio of also-follows to the total number of followers for the popular user. This gets us closer to the heart of the matter: who else really matters to Starbucks’ followers?
  • Ability to explore the complete list of follow and un-follow events, by day, w/ paging (I currently limit the list to 100 follows and 300 unfollows).
  • Comparisons of each individual account’s data with the aggregated data from all GraphEdge accounts. Example: The histogram of how many friends your followers have sits next to the same analysis for ALL followers in all GraphEdge accounts. This tells you how you’re doing compared to some other benchmark.
  • Other stuff, some minor, some less minor. Bug fixes, etc.
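The relevance ratio in the “most relevant” bullet above works out like this. The Obama numbers are the ones from the post; the niche “coffeegeek” account and its numbers are purely hypothetical, to show why the ratio reranks things:

```python
# relevance = (followers in common with Starbucks) / (peer's total followers)
peers = {
    # peer: (also_follows, peer_total_followers)
    "barackobama": (11_200, 2_000_000),  # numbers from the post
    "coffeegeek": (3_400, 25_000),       # hypothetical niche account
}

relevance = {name: common / total for name, (common, total) in peers.items()}
ranked = sorted(relevance.items(), key=lambda kv: kv[1], reverse=True)

print(ranked)
# [('coffeegeek', 0.136), ('barackobama', 0.0056)]
```

By raw also-follows count, Obama wins easily (11,200 vs 3,400). By ratio, the small account that 13.6% of its own audience shares with Starbucks jumps to the top, which is exactly the “who else really matters to these followers” question.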

I don’t think I have time to do all that, plus break the report presentation into multiple pages, which is what it needs. But these are the things I’m looking at. As a roadmap, it’s not bad.

I think I’ll focus on the stuff that’ll make the biggest immediate impact on prospective new accounts. That’s going to be charts. Why don’t I use this as my punch-list:

  • Follower Profiles with histograms for friends, followers, and activity.
  • Most-relevant peers. This may be ambitious because I can’t generate this report without having data for all followers’ friends, at crawl-time. Right now I get only followers themselves, not their networks as well, and only pick up data on most-popular follows after I know who they are. In this case I don’t know who they are until after I have their data (because their data helps me determine who will be on the list). My “rollers” (as I’ve started calling my remote processing agents) may be able to help me here, but there may be side effects to increasing my Twitter data table by 10–20X.
  • Plus organization of report page presentation changes needed to accommodate these new features.
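For the histogram item on that punch-list, the bucketing might look something like this. The bucket boundaries are an assumption on my part — log-scale buckets are one reasonable choice, since a handful of huge accounts would otherwise flatten the chart:

```python
# Hypothetical log-scale buckets (lower bounds) for a follower-count histogram.
BUCKETS = [0, 10, 100, 1_000, 10_000, 100_000]

def bucket_label(i):
    """Human-readable range label for bucket i, e.g. '10-99' or '100000+'."""
    lo = BUCKETS[i]
    if i + 1 < len(BUCKETS):
        return f"{lo}-{BUCKETS[i + 1] - 1}"
    return f"{lo}+"

def histogram(follower_counts):
    counts = {bucket_label(i): 0 for i in range(len(BUCKETS))}
    for n in follower_counts:
        # last bucket whose lower bound the count reaches
        i = max(j for j, lo in enumerate(BUCKETS) if n >= lo)
        counts[bucket_label(i)] += 1
    return counts

print(histogram([3, 42, 42, 7_500, 250_000]))
# {'0-9': 1, '10-99': 2, '100-999': 0, '1000-9999': 1, '10000-99999': 0, '100000+': 1}
```

The same bucketing would apply to friends counts, and time-since-last-tweet just swaps the bucket bounds for day ranges.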
[2009:08:30 08:09:02] – – – – – Completed Tasks – – – – –
[2009:08:30 08:09:02] — 1 tasks of pri 201 – findOldAndNewAccounts [ID 1]
[2009:08:30 08:09:02] — 1 tasks of pri 190 – updateVoxBotFollowers [ID 2]
[2009:08:30 08:09:02] — 1 tasks of pri 187 – clearOrphanedTaskReservations [ID 31]
[2009:08:30 08:09:02] — 79 tasks of pri 182 – getAcctTwitterUserDataForPendingReport [ID 27]
[2009:08:30 08:09:02] — 117 tasks of pri 180 – getTwitterUserDataForPendingReport [ID 3]
[2009:08:30 08:09:02] — 79 tasks of pri 170 – findUserAddDropsForTwitterID [ID 4]
[2009:08:30 08:09:02] — 79 tasks of pri 160 – getUpdatedFollowersForTwitterID [ID 5]
[2009:08:30 08:09:02] — 1 tasks of pri 145 – findUserAddDropsForAllAccounts [ID 13]
[2009:08:30 08:09:02] — 12 tasks of pri 70 – processTaskResults [ID 32]
[2009:08:30 08:09:02] — 1 tasks of pri 64 – verifyTwitterDataInternally [ID 33]
[2009:08:30 08:09:02] — 1 tasks of pri 62 – queueTwitterUsersForDataVerification [ID 24]
[2009:08:30 08:09:02] — 36 tasks of pri 59 – collapseNetworkConnectionsToSummary [ID 25]
[2009:08:30 08:09:02] — 37 tasks of pri 58 – crawlSecondLevelConnections [ID 18]
[2009:08:30 08:09:02] — 1 tasks of pri 57 – runSecondLevelAnalysisForAcct [ID 19]
[2009:08:30 08:09:02] — 1 tasks of pri 55 – crawlAcctNetworkUserConnections [ID 17]
[2009:08:30 08:09:02] — 1 tasks of pri 50 – crawlAnotherAcctNetworkConnections [ID 16]
[2009:08:30 08:09:02] — 1882 tasks of pri 35 – verifyTwitterData [ID 23]

[2009:08:30 08:09:02] — 2330 total completed, of 17 types.

CodeFrenzy: Day 3

August 28, 2009

Parallel remote processing of atomic GraphEdge Task Queue tasks: done.

It took 50% more time than I (secretly) wanted to spend on it. But it absorbed only 60% of the time I budgeted, so I’m calling it a victory. That’s two extra found days for CodeFrenzy 2009… good thing, because I’m going to need at least one of them to clean up the house after this little party!
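The core of the system is a shared priority queue of atomic tasks that remote workers claim and run — the same pattern visible in the completed-tasks log above (priorities like 201 and 35, plus a clearOrphanedTaskReservations job to recover abandoned claims). Here’s a minimal in-process sketch of that claim-and-run pattern; the real system presumably reserves rows in a database rather than popping from a heap, and everything below the class is invented sample data:

```python
import heapq
import itertools
import threading

class TaskQueue:
    """Shared priority queue: workers atomically reserve the top task."""

    def __init__(self):
        self._heap = []
        self._lock = threading.Lock()
        self._seq = itertools.count()  # tie-breaker for equal priorities

    def add(self, priority, name, work):
        with self._lock:
            # Negate priority so higher numbers pop first, as in the log.
            heapq.heappush(self._heap, (-priority, next(self._seq), name, work))

    def reserve(self):
        """Claim the highest-priority task, or None if the queue is empty."""
        with self._lock:
            if not self._heap:
                return None
            _, _, name, work = heapq.heappop(self._heap)
            return name, work

# Task names echo the log; the work functions are stand-ins.
q = TaskQueue()
q.add(35, "verifyTwitterData", lambda: "verified")
q.add(201, "findOldAndNewAccounts", lambda: "scanned")

done = []
while (task := q.reserve()) is not None:
    name, work = task
    done.append((name, work()))

print(done)
# [('findOldAndNewAccounts', 'scanned'), ('verifyTwitterData', 'verified')]
```

Because reservation happens under a lock (or a transactional row claim, in a database-backed version), many rollers can pull from one queue without two of them grabbing the same task — and orphaned reservations from crashed workers can be swept back into the queue, which is presumably what that clearOrphanedTaskReservations task is for.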