Override Robots.txt With wget

I find myself downloading lots of files from the web when converting sites into my company’s CMS. Whether the source is a static site or another CMS, doing this manually sucks. But thanks to wget’s recursive download feature, I can rip through a site and grab all of the images I need, while even keeping the folder structure.

One thing I found out was that wget respects robots.txt files, so if the site you are trying to copy has one with restrictive settings, wget will only get what is allowed. Fortunately, this can be overridden with a few tweaks. I gladly used it and decided to pass it along. See the instructions at the site below.
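The details are at the linked site, but the gist is wget’s `-e` option, which executes a `.wgetrc`-style command before the download starts. A sketch of the kind of recursive grab described above (the URL and file types are placeholders, not from the original post):

```shell
# -e robots=off : run the wgetrc command "robots = off" so wget
#                 ignores the site's robots.txt restrictions
# --recursive   : follow links and download what they point to
# --no-parent   : never ascend above the starting directory
# -A jpg,gif,png: only keep image files (placeholder extensions)
# --wait=1      : pause a second between requests, to be polite
# http://example.com/ stands in for the site being converted
wget -e robots=off --recursive --no-parent -A jpg,gif,png --wait=1 http://example.com/
```

Because `--recursive` preserves the server’s directory layout on disk, the downloaded images land in the same folder structure as the original site.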
Continue reading “Override Robots.txt With wget”

It’s time to be a doer!

I just got invited to Forrst and realized that I spend too much time consuming web content.

Not the typical MySpace and Facebook updates or YouTube crap, but lots of web “stuff”: languages, tutorials, stock resource sites, Twitter posts. I spend way more time consuming content than I do creating it or participating.
Continue reading “It’s time to be a doer!”

Time Machine Only Restores Folders?

So. While venturing into ColdFusion Builder, I managed to delete the entire contents of my htdoc directory. 11 gigs…gone…poof. My first thought was, “I can’t believe that just happened.” Then, “Damn, with that new 7200 RPM hard drive, they disappeared 50% faster.” Not to worry, I’ve got a Time Machine backup at home. I’ll just restore it from last night and we’ll be all good.

It wasn’t that simple.

Continue reading “Time Machine Only Restores Folders?”