Dupe Finders and One-Liners

Welcome back, dear readers. This week your chef had to take his act out on the road again, despite his notable lack of wanderlust. Thus your chef was faced today with the choice of strolling around his hotel’s serene Japanese garden and exploring the surroundings in a new city, or spending many hours evaluating new utility software and writing his recommendations. Maybe this will give you a clue about his choice:

Eliminate Double Vision

A reader mailed in to ask about software for detecting duplicate digital photos on your disk. Such programs try to identify duplicate pictures by examining their sizes or even the actual images. I suggested that he try these programs:

DupDetector (free; compares images by content; apparently orphaned by its developer, and not certified for Vista)

VisiPics (free; compares images by content)

Image Comparer ($35; compares images by content)

Image File DeDuper (also known as JPeg DeDuper; free; compares by file size first, then by content). Another reader commented that JPeg DeDuper is very fast with very few false positives, but misses some duplicates.

DoubleKiller Pro ($20; a free version is also available)

Md5sums (free; command-line tool that only compares file checksums)

I have not tried any of these programs, though I know I need to use one on my overgrown photo collection. I welcome your further comments on these or any others you know of.
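
For the curious, here is a minimal sketch of what a checksum-only comparison amounts to. It is not how any of the programs above work internally. It assumes a POSIX shell, uses the standard cksum utility as a stand-in for an MD5 tool, and runs against three made-up files in a scratch directory.

```shell
# Build a scratch directory with two identical files and one different file.
# All file names here are invented for the demonstration.
photos=$(mktemp -d)
printf 'photo A' > "$photos/a.jpg"
printf 'photo A' > "$photos/copy_of_a.jpg"
printf 'photo B' > "$photos/b.jpg"

# cksum prints "checksum size name" for each file. After sorting, files with
# the same checksum and size sit next to each other, and awk reports each
# file whose checksum and size match the line before it.
# Assumption: file names contain no spaces (awk reads the name from field 3).
dupes=$(cd "$photos" && cksum *.jpg | sort |
  awk '{ key = $1 " " $2 }
       key == prev { print prevname " = " $3 }
       { prev = key; prevname = $3 }')

echo "$dupes"
```

A matching checksum is strong evidence that two files are byte-for-byte identical, but it will not catch the same picture saved at a different size or quality. That is the gap the content-comparing programs aim to fill.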

Update on Free Firewalls

Lifehacker recently asked its readers to name the best free software firewall. The results generally confirm my preconceptions: Approximately 36% chose Comodo Personal Firewall. ZoneAlarm Free garnered about 23% of the votes, closely followed by Windows’ built-in firewall with 22%. Sygate Personal Firewall took 7%, and various others got about 1% of the votes each.

The big surprise for me in the Lifehacker survey was the 9% vote for “Fire-what? Don’t use one.” I doubt that any of those 9% are Tool Bar readers too. But if you are among them, I’m telling you now: Get a firewall!

Even the simple built-in Windows firewall helps prevent hackers and malware from breaking into your computer (and it might be working without your even being aware of it). More sophisticated firewalls, such as Comodo (which I use – see my remarks in posts #6, #47, #51, and #57), also stop malware that somehow gets into your computer from reaching out to the Internet and doing more harm. If you use a firewall in conjunction with good antivirus, antispyware scanning, and host intrusion protection programs (the last of these is also included with the Comodo firewall), you can rest assured that your computer is reasonably well protected.

I continue to recommend Comodo for its excellent performance in testing and its generally good interface. And Comodo continues to reward me with safety, but also to punish me with some very irritating habits – particularly the way it lays its messages right on top of each other so you can’t read or click on them. And yesterday for unknown reasons, Comodo apparently forgot many of the programs I trained it to recognize and accept, resulting in a blizzard of new pop-up questions that really tried my patience. (Come on, Comodo, you don’t recognize Windows Media Player any more?) So watch this space in coming weeks for my assessments of other firewalls.

And now let’s see what insights Linux grandmaster Mark Lautman has for us today….

Did You Hear the One About…

by Mark Lautman

A computer analyst said to a programmer, “You start coding. I'll go find out what the customer wants.”

“I haven't lost my mind; it's backed up on tape somewhere.”

This form of humor is called the “one-liner.” I got these examples from the fabulous collection at http://www.oneliners-and-proverbs.com/.

One-liners are great in Linux, too. For several posts I've been describing the “command line” and the “terminal window,” but I haven't exactly said what you can do with those things. For the next few weeks, I'll introduce some one-line commands that show what the terminal window can do for you.

If you maintain a Web site, you've probably come across the situation where you need to change one little thing in 100 HTML files. I've done my share of changes to relative directories, or even just replacing one word with another. One way of doing this is to open each cute little HTML file in a text editor, and do a find and replace. This works fine for the first 10 cute little HTML files, but after that they don't look so cute or little any more. You could write a Word macro, which cuts down the time quite a bit, but you'll need at least two lines to open and close the files.

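In Linux you can change all 100 files with a single command. For example, the following command replaces all instances of “Tool” with “Bar” in all HTML files in a directory. The -i switch tells Perl to edit the files in place rather than print the result to the screen, and the i after the final slash makes the match ignore case:

perl -p -i -e 's/Tool/Bar/ig' *.html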

I took the above example from Rice University’s “Edit Your HTML Files with a One-Line Perl Program.” You can find variations on this theme at that site. If there were a contest for the most valuable one-line command, this would be a sure winner.

My son tells me that real mammals have hair. I tell him that real HTML files start with some type of document declaration. Nobody does this, certainly not the big retail sites, but it’s a good practice. Below is an example of adding a document type to the first line of all HTML files in a directory. (This example is based on a collection of one-liners at Perl One Liners.)

perl -i -ple 'print q{<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">} if $. == 1; close ARGV if eof' *.html

Can you find the error in in this sentence? A posting at UNIX for Dummies Questions & Answers has a few single-line commands to find lines containing duplicate words, for example:

perl -ne 'print "$.: doubled $_" if /\b(\w+)\b\s+\b\1\b/' filename.txt

The previous examples used Perl commands. Perl is my favorite language for abusing text files, but there are other Linux utilities as well. Linux’s sed (Stream EDitor) is very popular. For example, have you ever received a file with annoying empty lines between each paragraph? If you have sed, you can eliminate all those lines with an amazingly short command (suggested by Eric Pement). It deletes every second line, so it assumes the empty lines alternate with the text:

sed 'n;d' filename.txt
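
The following sketch shows the command on made-up input and, for comparison, the pattern-based form sed '/^$/d', which deletes only lines that are truly empty. The sample text is invented; only a POSIX shell and sed are assumed.

```shell
# Text and blank lines alternate: 'n;d' prints a line, then deletes the next one.
undoubled=$(printf 'First paragraph.\n\nSecond paragraph.\n\nThird paragraph.\n' | sed 'n;d')
echo "$undoubled"

# When the blank lines do not alternate, 'n;d' throws away real text:
# here it deletes "Two" and "Three" and keeps the blank line.
naive=$(printf 'One\nTwo\n\nThree\n' | sed 'n;d')

# '/^$/d' looks at the content instead of the position, so only the
# empty line is removed.
by_pattern=$(printf 'One\nTwo\n\nThree\n' | sed '/^$/d')
echo "$by_pattern"
```

The short form is fine for files that are strictly double-spaced. The pattern form is the safer choice when you are not sure.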

If you haven't received annoying empty lines, you've probably received files with annoying leading spaces and tabs. The following sed command, from the sed one-liners collection, is just for you:

sed 's/^[ \t]*//' filename.txt
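
One caveat: reading \t inside the brackets as a tab is a GNU sed feature, and other versions of sed may treat it as a literal backslash and the letter t. The sketch below swaps in the POSIX character class [[:blank:]], which matches a space or a tab, and runs it on three invented lines.

```shell
# One line indented with spaces, one with a tab, one not indented at all.
# ^[[:blank:]]* matches any run of spaces and tabs at the start of a line.
trimmed=$(printf '   three spaces\n\tone tab\nno indent\n' | sed 's/^[[:blank:]]*//')

echo "$trimmed"
```

Lines that have no leading whitespace pass through unchanged, because the * lets the pattern match zero characters.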

[You’ll find a good introductory sed reference guide here. —JP]

Consultants, as we all know, get paid by the amount of work they do, not the quality. If you want to compare the number of words your consultants are giving you in text files, use the wc (word count) command:

wc -w `find . -name "*.txt"`
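
The backtick form hands the file names to wc as plain words, so a name containing a space is split in two. The sketch below is one way around that, assuming a POSIX find: the -exec ... {} + form passes each name to wc intact. The two text files are made up for the demonstration.

```shell
# Two invented reports, one with a space in its file name.
demo=$(mktemp -d)
printf 'one two three\n' > "$demo/short.txt"
printf 'one two three four five\n' > "$demo/long report.txt"

# find hands all matching files to a single wc call, which prints one count
# per file plus a total. sort -n orders the lines from fewest words to most.
counts=$(cd "$demo" && find . -name '*.txt' -exec wc -w {} + | sort -n)

echo "$counts"
```

Sorting numerically puts the wordiest consultant just above the total line at the bottom.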

Below is a list of word counts from the Tool Bar's recent posts.

Next week we'll look at some Perl modules that are available for special tasks.  —Mark Lautman

That wraps up another great Tool Bar. Do come back next week and every week for more great tips and software recommendations, and don’t forget to bring all your friends. And please help keep this blog going by visiting our advertisers.

