How To Back Up Your Adult Tumblr Blog
As promised in my last post, this is the post in which I tell you how to make a full and complete backup of your porn Tumblr blog (or, really, any Tumblr). My goal for you is a set of files on your own hard drive that contains all the text and all the links and all the pictures (even the full-sized high-res click through ones) that you’ve got on your current Tumblr blog, all linked together in a way that you can open the site in your browser and browse through it just like you would online. This can be done. It’s not even very hard. And once you’ve got it done, you’ll have all the raw material you would need to re-create your Tumblr blog on some other hosting, if anything should happen to Tumblr or to your porn blog on Tumblr.
But why should you worry about that? Why might you need a Tumblr backup?
Well, as I write this, the news is official: Yahoo has purchased Tumblr for more than a billion dollars, cash. (Tumblr shareholders did not want any stinky Yahoo stock, which should tell you something.) The business press has been pointing out for a while that Yahoo will need to deal with what the suits in the corporate/financial/advertising world consider to be Tumblr’s “porn problem”. And Yahoo itself has a terrible reputation for buying cool, trendy, successful websites, running them into the ground or neglecting them to death, and then shuttering them. (Remember Geocities? Shaddup, it was cool once. A long time ago…) As Violet Blue puts it in her Sex Tech column at ZDNet:
Yahoo! is well-known for misunderstanding the user base of properties it acquires and ruining – then scrapping – once-active and beloved properties.
…
But if Flickr’s rep [under Yahoo’s ownership] with poorly policing ‘art nudes’ is any hint of Tumblr’s fate, then we’re likely to see lots of once-happy users forced into confusing self-rating protocols, having their accounts banned and years of content deleted with no recourse, and a new content policy practically written by trolls who want the easiest path to shut down people they don’t like.
I, myself, have been speculating for a couple of weeks that Tumblr would soon start cracking down on its “porn problem”, starting with an idle prediction in my The Pornocalypse Comes For Us All post and expanding on it when I discovered (apparently before pretty much anybody else noticed) that Tumblr had started trying to hide all the porn blogs from Google. At first the specific reason was not clear, but in the last few days the drumbeat of anticipatory news about the Yahoo purchase began to make the pieces fall into place. It’s safe to speculate that Tumblr began trying to minimize its “porn problem” while the sale was being negotiated, and there’s a strong basis for concern that (swiftly or eventually) Yahoo will continue that process and attempt to rid the Tumblr ecosystem of porn blogs. Even if they don’t, their track record of failure with acquisitions is such that there’s a good chance that all of Tumblr will have failed or shut down within a few years. And, for people who aren’t following Bacchus’s First Law of The Internet, backups are really important.
Enough nattering. You want tools and instructions.
I’m going to show you two ways to do this, a best-but-somewhat-complex way and an easy-but-somewhat-incomplete way.
Complete Tumblr Backup Solution:
First, the good way, the one recommended by my friend and prolific Tumblr-user Dr. Faustus. I’ve tested this and it works. When you’re done, you’ll have a complete copy of your Tumblr site on your own hard drive that you could navigate with your internet unplugged.
The program you want is: HTTrack/WinHTTrack Website Copier. It’s an open-source free-software general utility for copying and mirroring websites, available for most current versions of Windows as well as for a wide variety of Linux/Unix flavors. The Windows version presents a fairly old-fashioned interface with a bunch of cryptic options, but most of them come pre-set with sensible defaults that you actually don’t need to mess with. Plus, there’s good documentation. (Note well: there are many other programs out there that can accomplish this job. I’m recommending this one because it works and because I’m aware of it; I’m not claiming it’s the best or the easiest.)
Download the software and install it, then run it. You’ll be presented with a welcome screen where you need to click “Next”. Then, this screen:
The arrows show you the two fields that need your attention. All you really need to do is give this backup project a name and tell the software where to save the backup. Then hit “Next”:
On this screen you need to type in the URL for the Tumblr you want to back up. It will be something like: http://yourtumblr.tumblr.com — and there’s one vital reason you need to press the “Set options” button. When you do, you’ll see this:
I’ve pointed arrows at three optional settings tabs that you may want to adjust, and at the one mandatory options tab where you must change a setting. I’m going to ignore the optional ones for now, except to say that you would tinker with these if you wanted to change the sorts of media files you’re saving beyond the basic .gif, .jpg, and .png (you’d need to do this if you were saving a Tumblr that had .wav files or .zip files or .mp3s), or if you need to limit this program from slamming your internet connection too hard. It’s the mandatory “Spider” tab you really need to click:
See the box where it says “follow robots.txt rules”? That robots.txt they’re talking about is the very same unwelcome bugger that got us into this mess in the first place. As a general proposition, one should usually instruct one’s electronic robot spider minions to follow robots.txt rules; unruly robot spiders are a menace to the internet and to web servers everywhere. But this principle of polite internet behavior assumes that you haven’t had your own data locked behind the hostile barbed wire of some corporate data-silo forced-labor camp where the robots.txt has been put in place to hide your porny visage so that the corporate camp commissars will look prettier in the pages of Forbes Magazine and the Wall Street Journal. When it’s your data, you’re perfectly within your moral rights to ignore the robots.txt in order to extricate it; and so that’s what we’re going to do here. Change it using the drop-down menu to say “no robots.txt rules”.
Yay! We’re almost done. Hit the “OK” button, hit “Next”, hit “Finish”, and your site copying should begin.
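If you would rather script this than click through the wizard, HTTrack also comes with a command-line program. Here is a minimal Python sketch that assembles what I believe is the equivalent command. The flags are my reading of HTTrack's command-line documentation (-O for the output folder, -s0 for never following robots.txt), so treat them as an assumption and check them against "httrack --help" on your own install. The function names, the folder name, and the example URL are placeholders made up for illustration.

```python
import shutil
import subprocess


def build_httrack_command(blog_url, output_dir, ignore_robots=True):
    """Return the httrack argument list for mirroring one blog (hypothetical helper)."""
    command = ["httrack", blog_url, "-O", output_dir]
    if ignore_robots:
        # Assumption: -s0 matches the GUI's "no robots.txt rules" setting.
        command.append("-s0")
    return command


def run_backup(blog_url, output_dir):
    """Run the mirror if httrack is installed; otherwise just show the command."""
    command = build_httrack_command(blog_url, output_dir)
    if shutil.which("httrack") is None:
        print("httrack not found; would have run:", " ".join(command))
        return None
    return subprocess.run(command, check=True)
```

Calling run_backup("http://yourtumblr.tumblr.com", "my-tumblr-backup") would start the copy, or print the command it would have used if the program is not on your path.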
How long it will take to finish depends on the available bandwidth of your net connection, the memory and processing speed of your computer, and on whether you tweaked any of the options that control things like how many simultaneous connections your computer is making and how many files it’s trying to download in parallel. It also depends on how many pages there are on the Tumblr blog you are backing up, and on how big the images are. The default settings seem to be fairly gentle about not maxing out your internet connection or putting an unruly amount of strain on the server at the site you are trying to copy. Using default settings and a fairly crappy internet connection, I downloaded a test adult Tumblr blog (with permission of the blogger) that had roughly a thousand posts; it took about two hours and used about three-quarters of a gigabyte of room on my hard drive. Your mileage may, and probably will, vary.
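To get a rough feel for how this scales, you can extrapolate from my one test run. This little Python sketch assumes that time and disk space grow in a straight line with the number of posts, which is a guess rather than a measurement. The function name is made up, and the reference numbers are just the ones from my test (about a thousand posts, 0.75 GB, two hours).

```python
# Reference figures from the single test run described above.
REFERENCE_POSTS = 1000
REFERENCE_GIGABYTES = 0.75
REFERENCE_HOURS = 2.0


def estimate_backup(post_count):
    """Linearly scale the test run to another blog size (assumption: linear growth)."""
    scale = post_count / REFERENCE_POSTS
    return {
        "gigabytes": scale * REFERENCE_GIGABYTES,
        "hours": scale * REFERENCE_HOURS,
    }


estimate = estimate_backup(4000)
print(f"About {estimate['gigabytes']:.1f} GB and {estimate['hours']:.0f} hours")
```

Image-heavy blogs or slower connections will blow past these numbers, so use the result only to decide whether you have enough free disk space and patience before you start.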
What does success look like? You’ll have a folder on your hard drive with the name you provided on the first options screen. If you open it, you will find many sub-folders, and much that may seem mysterious. You should also find a file called “index.html” — and if you click on it, it should open in a new browser window where you’ll be looking at your backed up Tumblr site, using nothing but the files on your hard drive.
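If you want a quicker sanity check than clicking around, a few lines of Python can confirm that the index.html file is there and tell you how much was saved. This is only a sketch: the function name is hypothetical, and the folder you pass in is whatever project folder you chose on the first options screen.

```python
from pathlib import Path


def summarize_backup(backup_dir):
    """Report whether index.html exists and how much was saved (hypothetical helper)."""
    root = Path(backup_dir)
    # Walk every sub-folder and keep only regular files.
    files = [path for path in root.rglob("*") if path.is_file()]
    return {
        "has_index": (root / "index.html").is_file(),
        "file_count": len(files),
        "total_bytes": sum(path.stat().st_size for path in files),
    }
```

If has_index comes back False, or the file count is suspiciously small, the copy probably did not finish and is worth running again.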
What have we not accomplished? Well, you’ve made what should be a full and true copy, but it’s not a nice clean export in some standard format that you could use to easily import all your posts into another content management system or blogging tool. HTML files and related images are scattered through a system of directories and subdirectories that, while logical, may not be the simplest thing to work with. Using the data you’ve got, a clever computer person could generate an XHTML document (or something similar) that could be semi-automatically imported into (say) WordPress. But it would take parsing; it would take work. Figuring out how to take the copy you just made and turn it back into a non-Tumblr website is a solvable problem, but how easy or hard it might be to actually do it depends on your access to computer expertise and tools. For now, you’re safe in the knowledge that you’ve got all the posts you’ve made this past however-many years. You’ve got the images, you’ve got their metadata (any tags you set for them and any credits you may have reblogged or included) and you’ve got the clever things you said about them, all, safe on your hard drive.
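To give a flavor of what that parsing might involve, here is a small Python sketch that pulls the image references out of one saved page using only the standard library. It is a starting point under stated assumptions, not an export tool: the class and function names are made up, the sample markup is invented for the demo, and a real Tumblr theme will have its own page structure that you would need to inspect first.

```python
from html.parser import HTMLParser


class ImageLinkCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in one saved page."""

    def __init__(self):
        super().__init__()
        self.image_sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            source = dict(attrs).get("src")
            if source:
                self.image_sources.append(source)


def list_images(html_text):
    """Return the image paths found in a chunk of HTML (hypothetical helper)."""
    collector = ImageLinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.image_sources


# Invented sample markup; a real mirrored page will look different.
sample_page = '<div class="post"><img src="images/photo1.jpg"><p>Caption</p></div>'
print(list_images(sample_page))
```

The same approach could be extended to collect captions, tags, and post dates, which is most of what an import file for another blogging tool would need.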
Now would be a good time to back up your hard drive. I’m just sayin’.
Partial/Easier Tumblr Backup Solution:
Perhaps all the above is too involved or too complex for you. Or maybe you tried, and failed. For you, there’s a simple little web tool called Backup Jammy where you just type your desired URL into the box and press “Go”. That’s it. A single huge web page appears on your screen with all your Tumblr post content in a simplified format. Then you can use your browser’s “Save as web page” function to save it to your hard disk.
I don’t really recommend this tool. It doesn’t save nearly as much data as HTTrack/WinHTTrack does. In particular, all you seem to get is the standardized Tumblr 500-pixel versions of your images, and none of the higher-res versions that you may have posted. And if you have more posts than will fit in the memory of your computer at one time, you will have to do this in chunks, and save the chunks with appropriate names so you don’t overwrite one with another. It’s a less-complete solution. However, it’s also much easier, especially if your Tumblr blog only has a couple of hundred posts. And it might be enough for you. Certainly it’s better than nothing.
Conclusion
Given the existential threat that the adult Tumblr ecosystem is facing, I hope that smarter people than me will soon take some of the many fine website copying/mirroring tools that are out there, and meld them with friendly idiot-resistant interfaces and powerful parsing tools in a way that provides a seamless Tumblr export in a standardized format that’s ready for import into other blogging tools and posting on other social media platforms. I very much hope so, anyway. But that won’t happen today. A crufty backup you make today is worth a thousand times more than a perfect backup you never make before the platform goes down or is nerfed into uselessness or puts up filters to prevent the users from spidering their own content.
I’m painfully aware that the adult Tumblr backup solutions I’m offering here are messy, imperfect, and incomplete. All I can say in my defense is that they are the very best that I could find and test and describe and put up on the web in a single working day. For many of you, the Tumblr backup options listed here won’t be satisfactory or sufficient, and I apologize in advance for that. But if even a few of the great porn Tumblrs that went dark to public searching in the last few weeks are saved now and preserved on a hard drive and someday returned to the public web because of today’s effort, I’ll count it a day very well spent.
Shorter URL for sharing: https://www.erosblog.com/?p=9919
Thank you for this! I’ve been worried since I heard that Yahoo wanted to buy Tumblr. But, I’m on a mac … and though I was routed to a MacPorts page from the website you suggested, it’s all greek to me. Do you have any suggestions or links of pages that would describe how I can back-up my page using a mac? I would be oh so grateful. Thank you!!! xoLC
LadyCheeky, I looked hard for a Mac solution, but I didn’t find one. The one export utility Tumblr ever released was for Macs, but then they changed the API and broke it, never releasing an update.
Since modern Macs have a flavor of Unix “under the hood” (or so I’m told, I don’t have one) it shouldn’t be that hard to get HTTrack up and running on your machine. Presumably that’s what the MacPorts page you found is trying to help with. But it’s all Greek to me, too, and I don’t have an Apple machine to play with. So, regretfully, I don’t have much to offer you.
However, tools to mirror and copy websites are pretty basic building blocks — every major computing platform has them. I’m sure there’s something good for the Mac that will do the job, and hopefully somebody will chime in here with suggestions or links.
Hi there, since Mac is a flavor of Unix, you can use wget to back up the whole site :), there are many examples if you search.
hope it helps, i’m also using Mac.
Our Tumblr account (hosting multiple blogs) has been deleted twice, so we can appreciate the value of this information. We have been able to recover most lost content through cached RSS feeds but the tools described here look very helpful. Thanks for making these instructions (and warning) available, we share your hope for a unified tool to migrate gracefully away from Tumblr (if need be).
Geo, I saw some of those wget examples when I was searching the web for solutions. But all the ones I saw were pretty impossible to understand for people who are not unix heads. If anybody out there has written a “how to use wget to backup your tumblr blog for people who never heard of wget or Unix before today and didn’t even know that sneaky powerful shit was hiding on their Mac” tutorial, I’d love to include a link to it here.
You might consider contacting the Archive Team (http://archivet....org/ ) about the missing pieces here — they have the technical chops and the interest to make something good happen.
I’ve already been harassing their Jason Scott (@textfiles) to put Tumblr on the ArchiveTeam Deathwatch list. I’d *love* it if ArchiveTeam started downloading all of Tumblr *now* to beat the rush when Yahoo announces the shutdown. But one thing I don’t know is whether ArchiveTeam shares my disrespect for inappropriate robot exclusions. Assuming yes, but don’t know. There’s no way to respect the robots.txt and get the adult blogs at this point.
For Mac, SiteSucker (sitesucker.us/mac/) works well.
Fred
[…] you Windows users Eros Blog has published a handy […]
Here’s a tool that claims to create an export file for your tumblr in a format suitable for WordPress import. No endorsement from me, I haven’t tried it or checked it out in any way, but FYI: http://tumblr2w....net/
Bacchus, thank you so much for this – especially the walk-through! You’ve made it very easy to save my history, and that’s massively appreciated :)
xx Dee
@Bacchus I don’t know if this tool works well or not when you import, but it’s important to note that it doesn’t backup any media files; it just keeps the links pointing to the media on Tumblr.
So, one would have to find some way to backup the media as well. Instead of using that I think it’s better to use WordPress’s Tumblr importer. It’s far from perfect, it doesn’t import ALL your files, but at least most of them.
Also, for some reason it’s importing all the image posts as gallery posts (which screws up the post display), but again, at least the backup includes most of the media files.
I have installed WP in my local hd and now have a (somewhat working) backup.
Dee, you’re most welcome!
Lilitthd, in your #13 it sounds like you are talking about the tool I posted in #11, right? I didn’t even know WordPress had a Tumblr importer, so thanks for that.
@Bacchus Yep, I was talking about the http://tumblr2w....net/
WordPress have an “official” importer. You can get it in the WP dashboard going to “Tools” -> “Import” and selecting “Tumblr”. It will install the importer and guide you through the steps – it’s simple, really.
If the blasted thing didn’t mess up the image posts it would be almost perfect. Sigh.
Mac users may want to check out web devil. I used to use it back in the day for this very purpose but I don’t know when it was last updated.
http://www.mech....html
If the new Tumblr decides to continue to crank down on its erotica clientele, it may well find itself being deserted for a better venue, just like MySpace started to disappear, when Facebook came along.
Some enterprising young man (or woman), may just come along and decide to launch something called “ErosRoller”, that is much friendlier to its base…
Thanks for the heads up – I have over 36.000 posts – Do I have to undertake the backup in one go, or will WinHTTrack remember how far I got and continue from where it was stopped?
It is able to resume an interrupted copy to an extent, but there seems to be a lot of duplication — it doesn’t re-download everything but it does re-download a lot. There are probably settings to tweak that I haven’t explored.
My advice, though, would be to tweak the bandwidth settings to something low enough that it can run in the background without bothering your other work, and then just let it run undisturbed for however many days it takes.
We tried to follow the instructions to the “t”, but this error message appeared:
No ‘index.html’ file in C:\My Web Sites\My View from the Country Life! What do I do now? Thanks!
Thank you Bacchus, great job you’ve done. I am warned now, though the fact that I can’t know in advance how much space will be required on my backup medium still holds me back a bit. I now have 2000 pages in my main blog and 300 GB left on my hard disk. I am afraid of hours of work (by my laptop) and then the message: “disk is full.”
XOX Xela
I think you’ll be fine — you’ve got more than 300 times the free space that I used in backing up the tumblr I tested the software with, which was at least half the size of yours.
Thank you so much , Bacchus. WinHTTrack is backing up my main blog (19,000 photo posts) as I type. After that, I will sic it on my secondary blogs. This is incredibly valuable information. I have gladly reblogged, in more than one place.
Sweet! I’m so glad this has been helpful.
I downloaded and am copying my blog right now. This blog is relatively young but I have two others that I’ve had much longer. I really don’t want to lose everything I’ve worked so hard on. Thank you for this useful info. >^.^<
[…] found it on an adult Tumblr, but it hardly seems worth linking to those any more now that Tumblr has sold out to Yahoo and locked all the porn away out of sight of the open […]
Mac users may also check out Webgrabber… I’ve used it forever, and it appears to have many of the same options as WinHTTrack. http://www.epic....html
Very prescient. You called it, yahoo killing tumblr, 5 years before it happened.
I’d rather have been wrong, honestly.