As the social web has grown, we find ourselves cross-posting content to different services more and more. For example, when I post a photo to Instagram, I often cross-post it to Twitter, Facebook, Foursquare, and Flickr. In the process of developing this latest version of JeffCroft.com (which pulls in my content from several social networks), I found myself wishing there were a way to identify the same content in multiple places.
Each place that photo goes adds some metadata to it. I wanted to collect all of this metadata and display the photo as one item on JeffCroft.com that, for example, included both how many Flickr comments the photo got and how many Instagram likes it got, along with links to this piece of content on all the networks it exists on.
Unfortunately, there is usually not a good, reliable way to do this. It’s difficult — if not impossible — to mash up most services’ APIs in a way that says, “this photo on Instagram is the same as this photo on Flickr and it’s also attached to this tweet on Twitter, and it was taken at the venue from this Foursquare check-in.”
I don’t have a solution for this — I’m just identifying it as a problem. Feels like something someone could solve, although I haven’t the slightest clue where to start.
This is exactly what has kept me from using features or services that require me to post duplicate content. For instance, I resisted using Instagram because I publish all the photos I care about to Flickr, and I’ve generally not shared the same photos on Facebook or Twitter that I plan to share on Flickr. What I’ve found myself doing is posting throw-away photos on Twitter and saving the photos I think are good for Flickr. It would be great if there were a generic way to tie all these features and services together in the way you describe.
The first “solution” that comes to my mind would be to have a single service responsible for posting the content to all those other sites. Then it’d collect the primary identifier each service gives the content for future reference. Doesn’t seem very elegant though.
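A minimal sketch of that "single service" idea, assuming each network can be wrapped in a function that posts the content and returns whatever identifier the network assigned. The `Poster` type, `CrossPostRecord`, and the stand-in posters are all hypothetical; real services would each need their own API client.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical: a poster takes the content and returns the ID the
# remote service assigned to it.
Poster = Callable[[str], str]


@dataclass
class CrossPostRecord:
    """One piece of content plus the ID each service gave it."""
    content: str
    remote_ids: Dict[str, str] = field(default_factory=dict)


def cross_post(content: str, posters: Dict[str, Poster]) -> CrossPostRecord:
    """Post content everywhere and remember each service's identifier."""
    record = CrossPostRecord(content)
    for service, post in posters.items():
        record.remote_ids[service] = post(content)
    return record


# Stand-in posters used purely for illustration; they do not call any API.
fake_posters = {
    "flickr": lambda content: "flickr-1",
    "twitter": lambda content: "tweet-1",
}

record = cross_post("Sunset photo", fake_posters)
```

With the identifiers stored side by side, pulling comments or likes back from each network later becomes a lookup by service name rather than a guessing game.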
Couldn’t agree more. This is a real problem on today’s web. The good news is that there are a couple of solutions being worked on: PubSubHubbub and the Salmon Protocol, which aims to unify the conversation.
Unfortunately, support is pretty sparse—at least last I checked. It’s a shame. There is so much great conversation taking place online, just not in any sort of unified way.
This ties into something I’ve been thinking about recently: we’re too dependent on third-party services. Our content lives elsewhere, and it’s easier to use pull APIs to retrieve it and link to it from our own domain than it is to publish once and push to many channels. Last time I checked, neither Forrst nor Google+ had write APIs. Facebook does, but it changes too frequently.
Hey Chris: Not entirely sure I agree, but I definitely see the point you’re making. People who see my site often ask why I post to services and then pull that data into here, instead of the other way around (posting here, and then pushing it out to services). The answer is pretty simple: tooling.
Take a simple link. I could post a link here and push it to a network like Delicious. But if I did that, I’d have to write a tool for posting links to this site. When I’m browsing my feedreader and I want to post a story I see to my site, I’d need to copy the URL, launch Safari (or whatever tool I’ve built), enter the URL to my tool, and fill out the form. If I use Delicious, well, my feedreader already has a button for “post to Delicious.” All I have to do is click it. Bam!
Take a Tweet, for instance. If I post these short updates to my site and then push them to Twitter, I have to build and maintain the tool that I post into. If I use Twitter, instead, I just grab any of the countless Twitter clients out there and go to work.
I don’t feel overly dependent on a third-party service as long as they have a read API that lets me get my data out and make a copy of it in my own database, like I do here. Instagram, Flickr, Delicious, Pinboard, Twitter, and Foursquare (the services I’m currently importing content into this site from) all have that. So, I’m comfortable enough.
Seriously. You just give away amazing start-up ideas?
I dove into this a little while ago as an aggregator for my personal site, which then turned into a prototype hosted service that didn’t pan out. (And then I got tired of maintaining a server for my own site.) I cross-post from Twitter to Facebook, so having the same thing doubled up nearly every time was annoying. It wasn’t true of all content, though, so I couldn’t just exclude Facebook.
Text-based content was pretty straightforward, using a combination of timestamp deltas and diffs to find matches. I used timestamps in case I posted something very similar previously*, and to constrain the comparison content. Unless there was an unusual lag in the cross-posting, it worked pretty well.
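A rough sketch of that kind of matching, not the commenter's actual code: it assumes each post carries a service name, text, and timestamp, uses the standard library's `difflib.SequenceMatcher` as the "diff," and picks a 10-minute window and 0.8 similarity threshold arbitrarily. The `Post` class and `is_cross_post` function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher


@dataclass
class Post:
    service: str
    text: str
    posted_at: datetime


def is_cross_post(a: Post, b: Post,
                  max_lag: timedelta = timedelta(minutes=10),
                  min_ratio: float = 0.8) -> bool:
    """Guess whether two posts on different services are the same item."""
    if a.service == b.service:
        return False
    # The timestamp delta constrains the comparison, so an older post
    # with similar wording is not mistaken for a cross-post.
    if abs(a.posted_at - b.posted_at) > max_lag:
        return False
    # Text similarity, tolerant of small edits such as an appended link.
    ratio = SequenceMatcher(None, a.text.lower(), b.text.lower()).ratio()
    return ratio >= min_ratio
```

As the comment notes, an unusually long cross-posting lag would fall outside `max_lag` and be missed, so the window is a trade-off rather than a fixed rule.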
Images were something I thought about but never got around to. Content comparison is a lot trickier since the images get resized and recompressed. Some ideas I had for doing that were comparing the originals if possible, and failing that, using timestamp, location, and caption information to judge the similarity. I had daydreamed some sophisticated location-/time-based clustering for automatic “life-story” generation, so using them for deduplication seemed promising.
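One way the metadata fallback could look, as a sketch under stated assumptions: each photo record has a capture time, a caption, and optionally a latitude/longitude pair. The `Photo` class, `probably_same_photo`, and all thresholds are hypothetical, and nothing here compares pixels.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from difflib import SequenceMatcher
from typing import Optional, Tuple


@dataclass
class Photo:
    service: str
    taken_at: datetime
    caption: str
    location: Optional[Tuple[float, float]] = None  # (lat, lon) in degrees


def distance_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def probably_same_photo(a: Photo, b: Photo,
                        max_lag: timedelta = timedelta(minutes=15),
                        max_km: float = 0.5,
                        min_caption_ratio: float = 0.7) -> bool:
    """Judge similarity from timestamp, location, and caption only."""
    if abs(a.taken_at - b.taken_at) > max_lag:
        return False
    # Location is only checked when both copies carry one.
    if a.location and b.location and distance_km(a.location, b.location) > max_km:
        return False
    ratio = SequenceMatcher(None, a.caption.lower(), b.caption.lower()).ratio()
    return ratio >= min_caption_ratio
```

Comparing the originals, where they are available, would be more reliable; this only covers the case where resizing and recompression make that impractical.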
*Looking back at all my Twitter posts, I discovered I’ve actually posted the exact same thought a few times, each time thinking it was the first time I had thought of it. My poor followers.
I forgot to mention The Locker Project (and Singly, a hosted version of it + API). It does aggregation and some duplication checking that’s probably worth looking at: lockerproject.org
To me, the most obvious solution would be doing the posting from one place and getting all the data back in that same place. It would make life much easier, and it would benefit the different platforms involved, since I also tend to refrain from posting the same things with all the tools I use. So who is going to grab this business opportunity?
what tool did you use to code this website?
was it django?
sometimes i get dizzy with all the cross-posting lol, but if i don’t i will forget who saw what :O