Mac Check Text File For Duplicate Lines

Active 6 years, 6 months ago
I just need a handy little tool that checks a text file for duplicate lines and deletes those duplicates. So if the file said:

it will turn into:

Nice and simple. But the text file will be large and full of long file locations, and I need to ensure that there is no more than ONE of any file. It does not matter which of the duplicates is deleted, as long as only one remains. So I would be okay with something like:

Our text file is constructed like a database table, with each line in the text file representing a single field in a single record. If we run a Select DISTINCT query against this text file, we'll get back only the unique lines.

This command would have been more logical if it had an '-a' option to print all the lines in a file with the duplicates 'squashed' into a single line. Sorry Richard. The option below, "only print unique lines", will omit lines that have duplicates.
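The two behaviours mentioned above are easy to confuse, so here is a minimal Python sketch of the difference. It is only an illustration: the function names `distinct_lines` and `only_unique_lines` and the sample data are made up for this example and do not come from any of the tools discussed.

```python
from collections import Counter


def distinct_lines(lines):
    """Keep the first occurrence of every line (duplicates squashed into one)."""
    seen = set()
    result = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            result.append(line)
    return result


def only_unique_lines(lines):
    """Keep only lines that occur exactly once; duplicated lines vanish entirely."""
    counts = Counter(lines)
    return [line for line in lines if counts[line] == 1]


# Hypothetical sample data, for illustration only.
sample = ["a.txt", "b.txt", "a.txt", "c.txt"]
print(distinct_lines(sample))     # ['a.txt', 'b.txt', 'c.txt']
print(only_unique_lines(sample))  # ['b.txt', 'c.txt']
```

The question asks for the first behaviour: one copy of every line survives.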

Find identical files in one or more directory trees with the free sfk dupfind command for the Windows, Mac OS X, Linux and Raspberry Pi command line. To get started, download the free Swiss File Knife Base from SourceForge, then open the Windows CMD command line, Mac OS X Terminal or Linux shell.
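How sfk dupfind works internally is not described here. As a rough idea of what finding identical files across directory trees involves, the following Python sketch groups files by size and then by SHA-256 digest. The function names `file_digest` and `find_identical_files` are hypothetical, and this is one common approach rather than the tool's actual method.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_identical_files(*roots):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Step 1: group by size, since files of different sizes cannot match.
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files that share a size with another file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[(size, file_digest(path))].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_identical_files("some/dir", "other/dir")` returns one sorted list of paths per group of identical files; nothing is deleted.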

Here's all that I have so far:

I have no idea where to begin on making the loop to test all the possible duplicates.

BBMAN225

2 Answers

Create a cmd file uniqeline.cmd with this content:

Call it from the command line:

rene

Your code to store the lines in an 'array' is broken. You should be incrementing v instead of var.

The code to check for duplicates is simple, but slow. Simply loop through the existing values to see if any of them matches the current line. Only echo and store the current line if no match was found. The higher the number of unique lines, the slower it gets.
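To make the cost of that approach concrete, here is a minimal Python sketch of the same linear-scan idea. It is not the batch script from this answer; the function name `dedupe_linear_scan` is made up for the illustration.

```python
def dedupe_linear_scan(lines):
    """Keep a line only if it does not match any line already kept."""
    kept = []
    for line in lines:
        found = False
        # Compare against every stored line: this inner loop is what
        # makes the approach slow as the number of unique lines grows.
        for earlier in kept:
            if earlier == line:
                found = True
                break
        if not found:
            kept.append(line)
    return kept


# Hypothetical sample data, for illustration only.
print(dedupe_linear_scan(["one", "two", "one", "three", "two"]))
# ['one', 'two', 'three']
```

With n unique lines, each new line is compared against up to n stored lines, so the total work grows roughly with the square of the file size.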

The script below expects the name of the file as its first and only parameter.

The above will fail if a line begins with ; because the default FOR /F EOL option will skip lines that begin with ;. That can be fixed with some awkward syntax that sets both EOL and DELIMS to nothing: usebackq^ delims^=^ eol^=

The above will also fail if any line contains ! because delayed expansion will corrupt the value of the line when the FOR /F variable is expanded. That can be fixed by carefully enabling and disabling delayed expansion as needed.

But there are much faster and simpler solutions.

The fastest and simplest possible pure batch solution is to incorporate the line content into the name of a variable. To check for duplicates, simply check if the variable is already defined.

There are 2 major limitations with the above solution.

  • The duplicate comparison is case insensitive because variable names are case insensitive.

  • The solution will not properly detect duplicates containing = because = cannot be included in a variable name.
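The variable-name trick is essentially a hash lookup. As a point of comparison only, here is a Python sketch of the same idea using a set; it is not the batch code from this answer, and `dedupe_with_set` and its `ignore_case` parameter are hypothetical names. A Python set compares strings exactly, so it is case sensitive by default and has no trouble with `=`. Passing `ignore_case=True` imitates the case-insensitive behaviour described above.

```python
def dedupe_with_set(lines, ignore_case=False):
    """Keep the first occurrence of each line, using a set for fast lookups."""
    seen = set()
    kept = []
    for line in lines:
        # The key plays the role of the variable name in the batch version.
        key = line.casefold() if ignore_case else line
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


# Hypothetical sample data, for illustration only.
sample = ["A=1", "a=1", "A=1", "B"]
print(dedupe_with_set(sample))                    # ['A=1', 'a=1', 'B']
print(dedupe_with_set(sample, ignore_case=True))  # ['A=1', 'B']
```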


I believe rene's solution using SORT is the best generally applicable approach, although rene's code has the following shortcomings:

  • The use of CALL significantly slows performance (noticeable with large files)

  • Lines beginning with ; are skipped

  • Special characters like &|<>^ cause problems

  • The script assumes there is only one space delimited token

The shortcomings are easily fixed:

All batch solutions are limited to a maximum line length of ~8191 characters.

Also, all solutions above will strip empty lines.
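For comparison, the idea behind the SORT approach (sort the lines so duplicates end up adjacent, then keep a line only when it differs from its predecessor) can be sketched in Python. This is an illustration under that assumption, not a translation of either batch script, and `dedupe_sorted` is a made-up name. Unlike the batch solutions, this sketch keeps empty lines and has no line-length limit.

```python
def dedupe_sorted(lines):
    """Sort the lines, then drop every line equal to the one before it."""
    result = []
    previous = None  # None never equals a string, so the first line is kept
    for line in sorted(lines):
        if line != previous:
            result.append(line)
        previous = line
    return result


# Hypothetical sample data, for illustration only.
print(dedupe_sorted(["b", "", "a", "b", "", "a=1"]))
# ['', 'a', 'a=1', 'b']
```

The output comes back in sorted order rather than the original order, which is acceptable here because the question does not care which duplicate survives.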

dbenham

Duplicate files are a waste of disk space, consuming that precious SSD space on a modern Mac and cluttering your Time Machine backups. Remove them to free up space on your Mac.

There are many polished Mac apps for this — but they’re mostly paid software. Those shiny apps in the Mac app store will probably work well, but we have some good options if you don’t want to whip out your credit card.

Gemini and Other Paid Apps

If you do want to spend money on a duplicate-file-finder app, Gemini looks like one of the best options, with one of the slickest interfaces. The trial version worked well for us, and the interface certainly stands out from barebones, free applications like dupeGuru. Gemini can also scan your iTunes and iPhoto library for duplicates. If you’re willing to pay $10 for a better interface, Gemini seems like a good bet.

There are other, similarly polished duplicate-file-finders in the Mac App Store, too — but Apple flags this one as an Editors’ Choice, and we can see why.

As a bonus, the demo version of Gemini allows you to search for and find duplicates, but not remove them. So, if you really wanted, you could use the demo to find duplicates on your Mac, locate them in Finder, and then remove them by hand. Other paid duplicate-file-finder apps have demos that function in a similar way, so this may be convenient if you just want to run an occasional scan and you don’t mind deleting a handful of duplicates by hand.

There are many good-quality, paid duplicate-file-finding apps for Mac. You can find them with a quick trip to the Mac App Store.

dupeGuru, dupeGuru Music Edition, and dupeGuru Pictures Edition

We also recommended dupeGuru for finding duplicate files on Windows. This application is both open-source and cross-platform. It’s simple to use — open the application, add one or more folders to scan, and click Scan. You’ll see a list of duplicate files, and you can select them and easily move them to the Trash or another folder. You can also preview them, verifying that they actually are duplicates before tossing them away.

dupeGuru is available in three different flavors — a standard edition, an edition designed for finding duplicate music files, and an edition designed for finding duplicate pictures. These tools won’t just find exact duplicates, but should find the same songs encoded at different bitrates and the same picture resized, rotated, or edited.

This application is utilitarian, but it does its job well. You don’t get the shiny interface that you do with the paid Mac apps, but it’s a good free tool for finding and clearing duplicate files. If you want a free application for finding and removing duplicate files on a Mac, this is the one to use.

iTunes

iTunes has a built-in feature that can find duplicate music and video files in your iTunes library. It won’t help with other types of files or media files not in iTunes, but it can be a quick way to free up some space if you have a big media library with duplicate files.

To use this feature, open iTunes, click the View menu, and select Show Duplicate Items. You can also hold the Option key on your keyboard and then click the Show Exact Duplicate Items option. This will only show duplicates with the exact same name, artist, and album.

After you click this, iTunes will show you a sorted list of duplicates next to each other. You can go through the list and delete any duplicates from your computer if they actually are duplicates you want to delete. When you’re done, click View > Show All Items to get back to the default list of media.

That’s it? Yup, that’s it. We didn’t want to recommend potentially confusing Terminal commands that output a list of duplicates to a text file, awkward methods that involve scrolling through a list of all the files on your Mac in the Finder, or applications that require disabling the Mac’s Gatekeeper feature to run untrusted binaries. The tools above will do the job, whether you want a barebones-and-free utility or a polished-but-paid application.
