The Statistics Behind Digg Submissions

Analysis,Digg by on June 4, 2007 at 8:51 pm

Ever since Digg announced their API, I’ve been eager to see what stats I could generate. Since my wife is out at Book Club tonight, I spent a bit of time with Digg’s API. All of the analysis below was conducted on all of the stories submitted in May:

How long does it take for stories to get promoted?

after-submission2.png

Very few stories get promoted within 2 hours of submission, and very few get promoted after 24 hours. There is definitely a window of opportunity that lasts for 24 hours after submission.

Introducing ‘Promote Rate’

To date, the most interesting studies done on Digg have involved basic analysis of already promoted stories. Pronet Advertising has a good look at the top 10 brands on Digg, and SEOMoz has a YouMoz article on Digg that talks about the best time to submit a story.

While both of these articles are quite interesting, I think the greatest indicator of success on Digg is something I’ve been calling ‘Promote Rate’. Basically, it is the percentage of stories of a given set of characteristics that were promoted to the first page.
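Here is a minimal sketch of how a promote rate could be computed. The record layout (a dict with a "promoted" flag) and the sample stories are made up for illustration; they are not the data or the code behind the charts in this post.

```python
from collections import defaultdict

def promote_rate(stories, key):
    """Return {group: percentage of stories in that group that were promoted}."""
    submitted = defaultdict(int)
    promoted = defaultdict(int)
    for story in stories:
        group = key(story)
        submitted[group] += 1
        if story["promoted"]:
            promoted[group] += 1
    return {group: 100.0 * promoted[group] / submitted[group] for group in submitted}

# Made-up records for illustration only (hypothetical field names).
stories = [
    {"category": "Linux/Unix", "promoted": True},
    {"category": "Linux/Unix", "promoted": False},
    {"category": "Business & Finance", "promoted": False},
    {"category": "Business & Finance", "promoted": False},
]
rates = promote_rate(stories, key=lambda s: s["category"])
```

The key function decides which characteristic the stories are grouped by, so the same helper could in principle be reused for category, hour of day, or whether the submitter has a user image.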

Best Time of Day to Submit to Digg:

by-hour.png

Promote rates are higher on the weekends and in the evenings. A story submitted around 9PM on a weekday enjoys a 66% higher promotion rate than an 8 AM post.
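A rough sketch of the bucketing and the "X% higher" arithmetic follows. It assumes each story carries a Unix submit timestamp and that hours are read in UTC; both are assumptions for illustration, and the function names are hypothetical.

```python
from datetime import datetime, timezone

def time_bucket(submit_ts):
    """Map a Unix timestamp to a ("weekday" or "weekend", hour) bucket, in UTC."""
    dt = datetime.fromtimestamp(submit_ts, tz=timezone.utc)
    day_type = "weekend" if dt.weekday() >= 5 else "weekday"  # 5 = Saturday, 6 = Sunday
    return day_type, dt.hour

def relative_lift(rate, baseline):
    """How much higher (in percent) one promote rate is than a baseline rate."""
    return 100.0 * (rate - baseline) / baseline

# Monday, June 4, 2007 at 9 PM UTC, built here rather than hard-coded as an epoch value.
example_ts = datetime(2007, 6, 4, 21, 0, tzinfo=timezone.utc).timestamp()
bucket = time_bucket(example_ts)
```

With made-up rates of 5% and 3%, relative_lift(5.0, 3.0) comes out at roughly 66.7, which is the kind of comparison the 9 PM versus 8 AM figure expresses.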

Best Category to Submit to:

category.png

OK, so submitting an article to “Linux/Unix” looks to be 16x more likely to get promoted than if you submitted an article to “Business & Finance”. Certainly Diggers prefer Linux stories to the latest TPG buyout.

How much of this preference is topical vs. the category of the article? I looked at all of the stories submitted with the word ‘Linux’ in the title inside and outside the “Linux/Unix” category:

linux.png

Articles with the word ‘Linux’ in the title are promoted 9x more frequently if they are submitted in the “Linux/Unix” category.

Does having a user image matter?

user-image2.png

Users with images have more stories promoted than users without images. I would posit that a user image may indicate an active user with more friends, but submit stories without an image at your own risk =).

Anyway, that’s all for this evening. I’m looking at a few more things and will post a follow up in a little while.

Notes:

  • Be careful with causality. While I think some of the conclusions are reasonable, I haven’t always gone to the extent necessary to prove causality - we may just be seeing correlation.
  • I experienced XML errors with a small fraction of the calls to the Digg API - I didn’t try to recover these records, so the dataset is not 100% complete.
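For the curious, skipping unparseable responses could look something like the sketch below. The element names in the sample strings are hypothetical and this is not the script I ran; it only illustrates counting and dropping responses that fail to parse.

```python
import xml.etree.ElementTree as ET

def parse_responses(raw_responses):
    """Parse each XML string, skipping (and counting) the ones that fail to parse."""
    parsed = []
    failed = 0
    for raw in raw_responses:
        try:
            parsed.append(ET.fromstring(raw))
        except ET.ParseError:
            failed += 1  # record is dropped, so the dataset ends up incomplete
    return parsed, failed

# Made-up responses: the second one is truncated and therefore malformed.
samples = [
    "<stories><story id='1'/></stories>",
    "<stories><story id='2'>",
]
parsed, failed = parse_responses(samples)
```

Keeping the failure count around makes it easy to report what fraction of the calls were lost.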

Digg unbans nearly all domains

Digg by on February 23, 2007 at 2:32 pm

Less than 4 days after my post about banned websites at Digg made the front page of Digg (and was subsequently buried), Neil Patel points out that Digg has unbanned a significant number of those websites.

Maybe this was just a random coincidence, but I like to think that publicizing the list of banned domains actually helped bring about the change.

Thanks Rahul for pointing this out!

Update
I reran the list of 183 banned domains and found just about all of them to be unbanned. The only remaining banned sites on the list are:

neogaf.com
thevideosense.com
blinklist.com
geocities.com
digg.com
idontlikeyouinthatway.com

The first 4 sites were the ‘temporary ban’ type. Digg is digg. In fact, idontlikeyouinthatway is the only site from the original list that was permanently banned. It would seem likely that all the permanent bans were lifted (but not the temporary ones), and that idontlikeyouinthatway was rebanned. Or someone accidentally deleted a db table.

The Secret List of Sites Banned by Digg

Analysis,Digg by on February 19, 2007 at 9:56 am




Update

Nearly every one of these sites has been unbanned. There are only 6 sites on this list that remain banned.

neogaf.com
thevideosense.com
blinklist.com
geocities.com
digg.com
idontlikeyouinthatway.com

The first 4 sites were the ‘temporary ban’ type. Digg is digg and will likely remain banned. In fact, idontlikeyouinthatway is the only site from the original list that was permanently banned. It would seem likely that all the permanent bans were lifted (but not the temporary ones), and that idontlikeyouinthatway was rebanned (or maybe it was extra banned to begin with)…

Original Post:

Ever wonder which sites are banned by Digg? Who would have thought that 3 of the top 10 Alexa sites and sites like CareerBuilder, DHL and 43Things would be banned? To develop as complete a list as possible, I tested the top 10,000 Alexa domains and top 1,000 Blogshares blogs to see which were banned. Overall, I found 183 banned sites.

The banned sites fell into several categories:

  • User Generated Content sites without subdomains. One bad actor on these sites can ruin it for everyone. Popular UGC sites like Myspace, Squidoo, 43Things, Geocities are all banned, whereas sites like Typepad, Blogspot, WordPress do just fine because their per-user subdomains make it easy to ban one bad actor. If I were Seth Godin, I’d give Squidoo lenses their own subdomains pronto - there is good content on Squidoo that will never see the light of Digg.
  • Sites about SEO & Affiliate Marketing. These include TopRankBlog, DigitalPoint, Revenews, John Chow, Paula Mooney, etc. There is some great content that’s been banned … and plenty of poor content as well (theRichJerk).
  • International Sites, particularly Asian sites (Baidu, Sohu, Sina, Yandex, etc.). I can’t speak to the quality of these sites, but four of them are in Alexa’s top 20 and others are very popular. Digg and Digg users would certainly benefit from international versions of its site. (Hint, follow the Google model, not the Yahoo model).
  • Scummy sites. There are plenty of sites here that I’m not surprised to find banned. Gossip Sites (perezhilton), Adult-themed sites (pornotube), adware/spyware sites (smileycentral), etc.

I’m sure that plenty of sites were banned due to attempts at gaming Digg, but I obviously can’t distinguish those from the sites on the list above.

The big list of banned domains:

Domain (Alexa)

baidu.com (4)
myspace.com (6)
sina.com.cn (10)
sohu.com (16)
163.com (17)
rapidshare.com (26)
wretch.cc (32)
yandex.ru (43)
rapidshare.de (65)
geocities.com (69)
digg.com (75)
digitalpoint.com (103)
126.com (105)
pornotube.com (188)
ynet.co.il (192)
21cn.com (194)
elmundo.es (248)
smileycentral.com (300)
libero.it (329)
livejasmin.com (330)
freewebs.com (339)
careerbuilder.com (388)
o2.pl (393)
sina.com (397)
juggcrew.com (404)
anonym.to (435)
startimes2.com (446)
ezinearticles.com (453)
forumer.com (469)
bangbros.com (512)
fishki.net (526)
donews.com (562)
6rooms.com (605)
yoqoo.com (617)
cjb.net (630)
myfreepaysite.com (637)
tvix.cn (666)
nichedsites.com (712)
tinyurl.com (727)
surfjunky.com (780)
as.com (785)
bolaa.com (819)
iwebtool.com (824)
perezhilton.com (832)
askjolene.com (835)
text-link-ads.com (949)
ce.cn (984)
getafreelancer.com (1053)
douban.com (1168)
thesuperficial.com (1210)
tiscali.it (1218)
1shoppingcart.com (1358)
katz.ws (1376)
clubic.com (1386)
segundamano.es (1580)
porkolt.com (1628)
indiafm.com (1656)
43things.com (1694)
wikimapia.org (1724)
ecademy.com (1749)
dreamhost.com (1819)
clickbank.net (1827)
thumblogger.com (1857)
hidebehind.com (1916)
oneindia.in (2004)
directtrack.com (2008)
egotastic.com (2019)
globes.co.il (2197)
tlen.pl (2228)
globe7.com (2263)
javimoya.com (2349)
wwtdd.com (2395)
serials.ws (2414)
sexyclips.org (2444)
techweb.com.cn (2504)
goarticles.com (2654)
furl.net (2662)
lix.in (2695)
care2.com (2747)
consumptionjunction.com (2825)
box.net (2879)
usfreeads.com (2923)
lynxtrack.com (2986)
dhl-usa.com (3010)
newsnow.co.uk (3051)
mojoflix.com (3063)
blueyonder.co.uk (3119)
fleshbot.com (3159)
freepay.com (3180)
lunarpages.com (3187)
9down.com (3289)
blinklist.com (3319)
bigpond.com (3382)
jajah.com (3596)
xpeeps.com (3603)
zooloo.co.il (3689)
m90.org (3696)
infos-du-net.com (3743)
agloco.com (3755)
johnchow.com (3887)
idontlikeyouinthatway.com (3898)
nothingtoxic.com (4007)
brinkster.com (4076)
blingo.com (4216)
earnersforum.com (4219)
6x.to (4260)
cheapflights.co.uk (4300)
naughtyathome.com (4333)
microsiervos.com (4335)
stubhub.com (4353)
justjared.com (4382)
petitiononline.com (4544)
assisass.com (4683)
ebags.com (4714)
ffshrine.org (4751)
planetnana.co.il (4769)
searchwarp.com (4912)
pimpmyspace.org (4954)
pokernews.com (4970)
totallycrap.com (5052)
giveawayoftheday.com (5089)
vbseo.com (5322)
dlisted.com (5323)
suite101.com (5361)
blogmarks.net (5436)
exploitedbabysitters.com (5480)
wierdporno.com (5537)
webworkshop.net (5846)
netidentity.com (5871)
neogaf.com (5932)
nforce.nl (5982)
parisexposed.com (6053)
defamer.com (6182)
therichjerk.com (6218)
yigg.de (6325)
ebooksclub.org (6371)
rs6.net (6400)
articlesbase.com (6445)
weakgame.com (6450)
podomatic.com (6524)
humornsex.com (6615)
vidaextra.com (6738)
clixgalore.com (6852)
todaysfreevideo.com (7001)
freeworldgroup.com (7022)
steakandcheese.com (7081)
webgains.com (7150)
crackserver.com (7159)
spankwire.com (7294)
funnyinside.com (7295)
bastardly.com (7403)
bildirgec.org (7417)
softsearch.ru (7442)
koreus.com (7560)
toprankblog.com (7568)
kingsofchaos.com (7642)
mihd.net (7977)
nastyboards.com (8118)
serialz.to (8121)
azjmp.com (8155)
totallynsfw.com (8260)
gambling911.com (8265)
shoutwire.com (8374)
poosieflix.com (8387)
stormpay.com (8475)
revenews.com (8703)
knuttz.net (8765)
gamereplays.org (8816)
indianpad.com (8867)
stormfront.org (8874)
habrahabr.ru (8900)
jkonline.cn (8976)
presseportal.de (9295)
thevideosense.com (9320)
bet365.com (9826)
offtopic.com (9841)
sweetnjuicey.com (9938)
fishki.ne (blogshares)
geeksmakemehot.com (blogshares)
mess.be (blogshares)
microsiervos.co (blogshares)
sfoxes.blogspot.com (blogshares)
popbytes.com (blogshares)
theundersigned.net (blogshares)

Methodology:

  • How to test a domain on Digg. Digg performs several validation checks when a URL is submitted. After these checks, Digg takes you to a page to enter the title and description. The checks occur in this order:
    • Is the URL valid?
    • Has the URL been submitted before?
    • Is the domain banned? Digg has three types of banning:
      • url is on the banned submit list. This seems to be a permanent ban.
      • This URL has been reported by users and cannot be submitted at this time. Perhaps a temporary ban? Sites previously listed with this tag don’t appear to be currently banned.
      • Please link directly to the story source. This URL has been reported as a news middle-man, it will remain blocked for 0 days. It looks like the bans start at 300 days or so…
  • Getting the top 10,000 domains. I used Ruby to query Amazon’s Alexa Top Sites web service and get the list of the top 10,000 sites. Five minutes later, I was $25 poorer and 10,000 domains richer.
  • Constructing queryable URLs. Alexa doesn’t provide subdomain information, so I added a “www” to the front of every domain, and a fake parameter to the back of each domain, thus creating a valid, unique URL for testing. So, 43things.com became www.43things.com?a13=1
  • I then tested all 10,000 URLs (in the middle of the night so as to not load Digg’s servers) to see if they passed all three tests. The ones that failed the ‘banned domain’ test are those I included in the list above.
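The steps above can be sketched roughly as follows. I did the original work in Ruby and this is not that code; the function names are hypothetical, and matching the ban messages by substring is an assumption. How the response page is actually fetched from Digg is left out.

```python
# Ban messages as quoted in the methodology above, lowercased.
# Matching them as substrings of the response page is an assumption.
BAN_MESSAGES = {
    "url is on the banned submit list": "permanent ban",
    "has been reported by users and cannot be submitted": "temporary ban",
    "has been reported as a news middle-man": "middle-man ban",
}

def build_probe_url(domain, param="a13=1"):
    """Prefix 'www.' and append a throwaway query parameter to get a unique URL."""
    return f"www.{domain}?{param}"

def classify_response(page_text):
    """Return the ban type suggested by the response text, or None if no ban message appears."""
    text = page_text.lower()
    for message, ban_type in BAN_MESSAGES.items():
        if message in text:
            return ban_type
    return None

probe = build_probe_url("43things.com")
```

A domain would go on the banned list only when classify_response returns something other than None for its probe URL; URLs that fail the earlier validity or duplicate checks never reach this step.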

Known Flaws:

  • Digg blocks at the subdomain level. I didn’t have the data to query subdomains. So, I added a www at the front of every domain. I missed all subdomains such as mydiggspamblog.blogspot.com or ww2.myspamsite.com
  • Not all websites accepted my fake parameter. These domains failed the valid URL test. 6% of websites didn’t return a valid page when presented with the parameter - most commonly because they perform some redirect when a user types the domain root. Check out the diamond retailer www.tiffany.com for an example.
  • Of course, I missed many, many websites that were banned by Digg.


The social side of Digg

Digg by on December 12, 2006 at 1:27 am

I’ve been a member of Digg for close to 6 months, a regular reader/digger for close to 3 months and a submitter for a few weeks. Only now am I starting to understand the social side of Digg (it’s huge).

If you’re looking to just understand Digg, read this great post and the references he provides. I wanted to share a few thoughts and screenshots on the social side of Digg, which is largely invisible to most new users.

The introduction most users see to the social side of Digg is this innocuous-looking friends tab on their profile page and the Friends activity box above the fold on the left nav rail (note the stories listed by each friend, and the sort order of the friends – more on this at the end):

friendspage1.JPG

A user can completely overlook this aspect of the site. Friend association is all 1-way. Anyone can call you their friend, and you can call anyone your friend. There are no notifications when this happens, other than the appearance of a new friend on your page. The only way that you’ll know if someone has declared you their friend is if you visit that page.

Once you’ve declared friends, your experience on Digg changes radically. These changes are most visible in two locations:

1. Any list of stories: Stories voted on by your friends are tagged with a green ribbon. When you mouse over the “# of diggs” button, you see which friends dugg the article. Your eyes are automatically drawn to these stories and I found myself much more likely to vote on a friend’s story.

mainpage.JPG

2. Any comment thread: Much like the story lists, comment threads take on a new look. Comments by your friends are boxed in a shaded green, and bright yellow stars are placed next to the posts that they dugg up or down. You can even see what they thought about your comments.

comments.JPG

Important changes also happen to your Friend’s History tab:

friendshistory.JPG

The friend’s history and the profile pages of your friends play an important role in how people digg. On the friend’s tab, many friends select their current #1 story. This provides extra visibility to stories that they are particularly excited about, and it certainly results in extra diggs.

I was also surprised that Digg’s default sort mechanism was alphabetical. Friends with numbers at the front of their names are always at the top. This seems a little surprising, given the extra exposure that top placement gives those users and their stories. It is only a matter of time before digg spammers figure out the same things that locksmiths did years ago in the Yellow Pages: 1AAA Locksmith is ordered alphabetically above AAA locksmith.
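To see why a leading digit wins, here is a tiny sketch of a plain alphabetical sort. The usernames are made up, and the case-insensitive comparison is my assumption about how such a list might be ordered, not something Digg has documented.

```python
# Hypothetical usernames; digits compare lower than letters in a character-by-character sort.
friends = ["digger", "AAAfan", "1AAAfan"]

# Case-insensitive alphabetical order (an assumed sort rule).
ordered = sorted(friends, key=str.lower)
```

Here ordered puts "1AAAfan" first, ahead of "AAAfan", because the character "1" sorts before any letter.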

The story I submitted that reached the front page and was then buried has received 53 diggs from these pages since it was buried (in 48 hrs). That’s almost enough to make the front page again, twice. It is no wonder that rumors are swirling about promoters buying diggs.

Dugg and then Buried

Digg,SEO by on December 10, 2006 at 6:12 pm

I had a blog post reach the front page of Digg and then, after 630 Diggs, get buried. At the time it was buried, it had the most votes on the front page (I should have taken a screenshot). The post hit the front page in under an hour, requiring only about 30 Diggs (it was a Saturday morning).

The post stirred some controversy as it implied non-altruistic motives of some Top Diggers. I didn’t think the post was that controversial, but it definitely hit a few nerves. I followed the buried post up with a profile of one Digger and the impact he had on a site that he Dugg 147 times over the last 60 days. That post was on its way to the front page (15 Diggs in 2 hours), but was also buried.

A few interesting tidbits:

  • 39 Diggs after the bury. After the post was buried, it received 39 Diggs in the subsequent 24 hours – enough to hit the first page (on a weekend). This truly speaks to the power of the friend effect. The only way that Diggers could find the story was through the profile pages of their friends.
  • 14,097 Unique users. No doubt this would have been higher had the post not been buried – the traffic from Digg mostly disappeared after the story was buried.
  • 630 new feed subscribers recorded by Feedburner. These don’t seem to actually represent subscribers though. We’ll see how that number looks several days later.
  • 32 new inbound blog links were picked up by Technorati. I’m sure the actual number of inbound links is higher. I’ll check that number in the search engines in a few weeks.
  • 4 new Digg friends. I’m just understanding the social side of Digg and I have to admit it is pretty cool. I love that Digg stands alone without the social side, but it becomes even more compelling as you get sucked in.

My biggest takeaway is that controversy may work in the blogosphere, but it doesn’t work on Digg.

Update 1:
Technorati inbound links updated from 20 to 32.

What your Alexa stats would look like after 20 diggs in 30 days

Analysis,Digg by on December 9, 2006 at 3:03 pm

nwfdailynews.com (Northwest Florida) has had 20 stories dugg in the last 30 days (including 5 on 1 day alone). The graph below shows Alexa’s traffic stats along with the dates that the stories were dugg. Top Digger giantapplecore (ranked #39) submitted all 20 of those stories, and 147 stories from NWFDailyNews over the last two months.

Just about all the articles submitted by giantapplecore are syndicated AP articles. Given the tremendous success giantapplecore has had with Digg, I would guess that NWFDailyNews is a model for monetizing Digg traffic. Take a look at the AdSense placements on this page: I would expect that the upper right box has been moved around a bunch and been fairly well optimized.

alexa2.PNG

Although NWFDailyNews seems to be oriented towards conservatives, it doesn’t look like giantapplecore’s submissions have a particular political bias. He is very good at writing compelling articles and headlines and I would guess that he has found a profitable system for making money off of Digg and is just repeating it over and over.

Syndicated news stories aren’t what Digg is about (kind of like blog spam), but news.yahoo.com is the third most submitted site and they are largely a news syndicator.

What’s even more impressive is how giantapplecore has used Digg to build nwfdailynews from nothing 6 months ago to a top 20,000 site on Alexa:

graph.png


Which sites do the Top Diggers Read?

Analysis,Digg by on December 9, 2006 at 12:30 am

How do you influence the influencers?

I’ve become slightly obsessed with figuring out Digg. I’m fairly confident that the top Diggers represent the key to understanding Digg, so I scraped the most recent 150 submitted stories from the top 100 Diggers (just under 15,000 stories) with the goal of finding out what they read.

A few observations:

  • Although the list largely consisted of several tech-centric sites, there were a few surprises on the list – several domains that I’d never heard of before
  • Many Top Diggers have a pretty narrow reading list. However, there are those that are pretty adventurous – I’ll put together a separate post about those diggers
  • Several sites have benefited tremendously from “Patron Diggers” – one or two diggers that religiously read and submit from the site (in some cases they may be site owners, employees, etc.) See this post on nwfdailynews.com and GiantAppleCore.

The first output from my research is the top 50 domains that Top Diggers are reading. I’ve highlighted the domains where a single Digger accounted for an inordinate share of Diggs.

#   Domain               Total Submits from Top 100 Diggers  # Unique Diggers  Patron Digger (# Submits)
1   youtube.com          514                                 73                Foenetik (105)
2   news.com.com         358                                 57
3   news.yahoo.com       298                                 56
4   nytimes.com          249                                 49                Parislemon (72)
5   news.bbc.co.uk       210                                 53                iFelix (71)
6   today.reuters.com    179                                 44
7   thinkprogress.org    175                                 11                jlegum (119)
8   physorg.com          166                                 28
9   livescience.com      157                                 16                starexplorer (104)
10  arstechnica.com      154                                 35
11  nwfdailynews.com     147                                 1                 giantAppleCore (147)
12  washingtonpost.com   142                                 35
13  engadget.com         119                                 41
14  breitbart.com        117                                 10                elebrio (87)
15  abcnews.go.com       114                                 33
16  cnn.com              105                                 33
17  wired.com            102                                 40
18  video.google.com     92                                  37
19  gearlive.com         91                                  2                 andru (90)
20  theinquirer.net      91                                  26
21  businessweek.com     89                                  29
22  gizmodo.com          86                                  26
23  msnbc.msn.com        85                                  45
24  forbes.com           79                                  34
25  linuxdevices.com     79                                  7                 deviceguru (50)
26  howtoforge.com       76                                  5                 hausmasta (72)
27  eweek.com            71                                  23
28  money.cnn.com        68                                  27
29  eurekalert.org       66                                  11
30  lewrockwell.com      66                                  2                 Rhiannon1214 (50)
31  betanews.com         65                                  10
32  informationweek.com  63                                  20
33  kotaku.com           61                                  18
34  space.com            60                                  10
35  usatoday.com         56                                  31
36  joystiq.com          55                                  18
37  theregister.co.uk    54                                  21
38  sciencedaily.com     53                                  12
39  sports.espn.go.com   53                                  14
40  timesonline.co.uk    53                                  31
41  latimes.com          52                                  21
42  tgdaily.com          51                                  11                Bleek-II (30)
43  newscientist.com     50                                  19
44  news.zdnet.com       48                                  16
45  mediamatters.org     47                                  6                 snipehack (38)
46  time.com             47                                  25
47  blogs.zdnet.com      46                                  20
48  dailytech.com        45                                  16                Bleek-II (20)
49  guardian.co.uk       45                                  27
50  sfgate.com           45                                  22                Tomboy501 (16)
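A table like the one above could be tallied from (digger, domain) pairs along these lines. This is a sketch, not the script I used: the sample submissions are made up, and the patron_share cutoff is a hypothetical stand-in for "an inordinate share", which I never defined as a fixed number.

```python
from collections import Counter, defaultdict

def domain_stats(submissions, patron_share=0.5):
    """Return rows of (domain, total submits, unique diggers, patron or None), biggest first.

    patron_share is a hypothetical threshold: the top submitter counts as a
    patron digger when they account for at least that fraction of the submits.
    """
    per_domain = defaultdict(Counter)
    for digger, domain in submissions:
        per_domain[domain][digger] += 1

    rows = []
    for domain, counts in per_domain.items():
        total = sum(counts.values())
        top_digger, top_count = counts.most_common(1)[0]
        patron = (top_digger, top_count) if top_count / total >= patron_share else None
        rows.append((domain, total, len(counts), patron))

    # Sort by total submits (descending), then by domain name for stable ties.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows

# Made-up (digger, domain) pairs for illustration only.
submissions = [
    ("alice", "example.com"), ("alice", "example.com"), ("alice", "example.com"),
    ("bob", "example.com"),
    ("bob", "example.org"), ("carol", "example.org"), ("dave", "example.org"),
]
rows = domain_stats(submissions)
```

In this toy data, example.com has a patron (alice, with 3 of 4 submits) while example.org does not, since its submits are spread evenly across three diggers.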

nwfdailynews.com is the Northwest Florida News. Go figure.

Update

I’ve uploaded the source data I used for the analysis. Use that file to answer questions and comments along the lines of “You forgot ‘insert favorite site here’. I see that site on Digg all the time”. You’ll feel better knowing that I didn’t forget it – this was a study of the top diggers, not a study of all the sites submitted to Digg.

What I learned from my first attempt to get Dugg

Analysis,Digg by on December 3, 2006 at 10:01 pm

I submitted a post to digg on the Google Holiday gifts, hoping that it would be Dugg. As far as I can tell, I was the first person on the net to post a picture of the Google present, although someone had already posted the specs on a Webmasterworld thread. My post didn’t hit the front page, but several days later a very similar one did. My instincts were right (that the content was interesting to Digg users), but my execution was off.

Here is my submission:

Google’s 2006 Holiday Gift to Publishers – See a Photo (11 diggs)
submitted by davenaff 5 days ago (via www.naffziger.net)
Google is sending out LCD photo frames to Publishers. Check out a picture of the Google Holiday package – it looks like they are preparing to send this out worldwide.

Here is the submission that hit the front page (submitted two days later):

Here it is: ‘The 2006 Google Christmas Card’ (747 Diggs)
submitted by CLIFFosakaJAPAN 3 days ago (via www.neatorama.com)
What do you have to do to get on Google’s Christmas card mailing list? Shawn Hogan describes the gift he received, a digital photo frame. I feel a little jealous…

I learned a few things:

  • Write titles that appeal to a broad audience – I narrowed the audience by using the words ‘to Publishers’. The successful post title was unclear about who would receive a Google holiday card (or that it was even something physical). Heck, for all the reader knew, they too could receive the Google holiday card.
  • Ask a question – Asking a question in the description encourages interaction and user comments – especially if the user knows the answer (and thinks the submitter doesn’t). It also makes users think about the story more. My most successful AdWords campaigns would combine a question with a call to action. I wonder if the same thing applies here.
  • Be active – Ideally, get a top Digger to submit. CLIFFosakaJAPAN is the #5 Digg user. The top Digg users must be getting pitched stories, and if they aren’t I’m sure they will in the future. A story submitted by a top user has a much greater chance of getting to page 1. I wouldn’t be surprised if these pitches aren’t that sophisticated and I highly doubt that many PR agencies have established relationships with them (they should).

There is a ton of good content on the web about submitting to Digg, but these were my first takeaways and I felt it was useful to highlight them with a real-world successful/unsuccessful example.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License. | Dave Naffziger’s Blog | Dave & Iva Naffziger