Evil Javascript and snopes.com

As someone who likes to select text as they read it, I find snopes.com’s use of javascript to disable text selection rather annoying.

yes I know about noscript/etc… but they shouldn’t be disabling text selection in the first place, it does nothing to protect content

…to prove it, i cloned all of snopes.com and disabled that javascript

eg: (if you have JS enabled) http://www.snopes.com/holidays/christmas/humor/check.asp no text selection, lame!

http://198.27.92.150/snopes.com/holidays/christmas/humor/check.asp yay, can has text selection

full site: http://198.27.92.150/snopes.com/

—-
Stats:

du -sh ./* | grep snopes
144M ./snopes.com
13M ./snopes.js.tar.bz2
 
(it always impresses me how well text compresses)
 
# find snopes.com/|wc -l
6145
 
# find snopes.com/ -name "*.html"| wc -l
5328
 
that means: 6145 total entries (find counts directories as well as files), 5328 of which are html pages for the stories
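A quick sketch of how those counts could be split up, since a bare find lists directories too. The tiny tree below is a made-up stand-in for the mirror (the names under it are only for illustration), so the numbers are for the demo, not for snopes.com:

```shell
# Hypothetical stand-in for the mirrored site: 3 directories, 3 files
tree=$(mktemp -d)
mkdir -p "$tree/humor/lists"
touch "$tree/index.html" "$tree/humor/lists/childsup.asp.html" "$tree/style.css"

# A bare find counts every entry, directories included
entries=$(find "$tree" | wc -l | tr -d ' ')
# -type f / -type d split that total into files and directories
files=$(find "$tree" -type f | wc -l | tr -d ' ')
dirs=$(find "$tree" -type d | wc -l | tr -d ' ')
# Only the html pages
html=$(find "$tree" -type f -name "*.html" | wc -l | tr -d ' ')

echo "entries=$entries files=$files dirs=$dirs html=$html"
# prints: entries=6 files=3 dirs=3 html=2

rm -rf "$tree"
```

Pointing the same find commands at snopes.com/ instead of the temp dir would give the real split.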

—-

you may find yourself asking “how the hell?”
simple! wget + find + xargs + sed + bored

wget \
--recursive \
--no-clobber \
--page-requisites \
--html-extension \
--convert-links \
--restrict-file-names=windows \
--no-parent \
-D snopes.com \
--limit-rate=500k \
--user-agent="Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1" \
http://snopes.com/

then to disable that javascript:

find . -name "*.html" -print0 | xargs -0 sed -i 's/var omitformtags/#var omitformtags/g'

^ what that does: find lists every file ending in “.html”, which makes one massive list, kinda like:

# find snopes.com/ -name "*.html"| tail
snopes.com/humor/lists/childsup.asp.html
snopes.com/humor/lists/olympics.asp.html
snopes.com/humor/lists/insurance.asp.html
snopes.com/humor/misxlate/engliz.asp.html

then it passes that list off to xargs, which runs the sed command on those files to stick a # in front of “var omitformtags”. # isn’t a real JS comment, so that line turns into a syntax error, which in turn breaks the JS that disables text selection.
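Here is a small sketch of that find + xargs + sed pipeline running against a throwaway directory, so it can be tried without mirroring anything. The file names and contents are invented for the demo, and sed -i with no suffix assumes GNU sed (the usual one on Linux):

```shell
# Hypothetical mini "site": two html pages with the script line, one unrelated file
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/humor/lists"
printf '<script>\nvar omitformtags=["input"]\n</script>\n' > "$demo_dir/humor/lists/a.asp.html"
printf '<script>\nvar omitformtags=["input"]\n</script>\n' > "$demo_dir/b.asp.html"
printf 'var omitformtags stays here\n' > "$demo_dir/notes.txt"

# find emits NUL-separated paths, xargs hands them to sed in batches,
# and sed edits each file in place
find "$demo_dir" -name "*.html" -print0 | xargs -0 sed -i 's/var omitformtags/#var omitformtags/g'

# Count the files that now carry the patched line
patched=$(grep -rl '#var omitformtags' "$demo_dir" | wc -l | tr -d ' ')
# notes.txt does not end in .html, so it should be untouched
untouched=$(grep -c '^var omitformtags' "$demo_dir/notes.txt")

echo "patched files: $patched"
# prints: patched files: 2

rm -rf "$demo_dir"
```

The -print0 / -0 pair is only there so that paths with spaces or odd characters survive the trip through xargs.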

took all of ~20 minutes to grab every file on snopes.com that way, then a few seconds to disable that javascript in 5328 html files

this is not only a lesson in don’t annoy linux geeks, but also in automation and how to edit 5000+ files in seconds
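When editing thousands of files in one go, a dry run and a backup are cheap insurance. This is not what was done above, just a hedged sketch of one way to do it: list the files that would change first, then let sed keep a .bak copy of everything it touches. The temp dir and file names are made up for the example:

```shell
# Hypothetical test files: one page with the script line, one without
work=$(mktemp -d)
printf 'var omitformtags=["input"]\n' > "$work/page.asp.html"
printf '<p>no script here</p>\n' > "$work/plain.html"

# Dry run: only list the html files that actually contain the pattern
would_change=$(find "$work" -name "*.html" -exec grep -l 'var omitformtags' {} + | wc -l | tr -d ' ')

# Real run: -i.bak edits in place and leaves the original as <name>.bak
find "$work" -name "*.html" -print0 | xargs -0 sed -i.bak 's/var omitformtags/#var omitformtags/g'

# sed writes a backup for every file it processes, changed or not
backups=$(find "$work" -name "*.html.bak" | wc -l | tr -d ' ')
first_line=$(head -n 1 "$work/page.asp.html")

echo "would change: $would_change, backups: $backups"
echo "$first_line"

rm -rf "$work"
```

If the edit goes wrong, the .bak files can be moved back over the originals; once it looks right they can be deleted with another find.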
