Comparing HD Drive contents?

Post by jbloggs » Tue Sep 25, 2018 12:36 am

I have an 8TB hdd on which I am storing data only.

Earlier data drives are external USB.

How can I check that specific files on the USB HDD are also on the 8TB drive, given that they may have been copied to a different directory/folder?

I guess that I need a command-line process; otherwise I'd have to search the 8TB drive one file at a time.

Thanks in advance.

Re: Comparing HD Drive contents?


Post by m_pav » Tue Sep 25, 2018 5:42 am

Using the MX Package Installer, install the package meld. It's in the Development section, or you can simply search for it.

Once Meld is installed, open it and click on Directory; a new row of buttons appears. With the first button, select one of the directories, and with the second, select the directory you want to compare it with, then click the Compare button. The rest should be self-explanatory.
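
If you prefer the terminal, a rough equivalent of a directory comparison can be sketched with diff. This is only a sketch: compare_trees is a made-up helper name and the paths in the example call are placeholders. Note that diff matches files by their relative path, so it will not spot a file that was copied into a differently named folder.

```shell
# Hypothetical helper: report differences between two directory trees.
# -r recurses into subdirectories, -q only reports which files differ
# or exist on one side only (it does not print the differing lines).
compare_trees() {
    diff -rq "$1" "$2"
}

# Example call (both paths are placeholders):
# compare_trees /media/usb-drive/photos /media/hdd-8tb/photos
```

diff exits with a non-zero status when it finds differences, which is expected here and not an error.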

If you intend to back up a directory to another drive, my suggestion is to use luckybackup. This is a front end to a command-line tool called rsync, which can give you identical copies. When a file changes, it copies only the changes where the file type allows that, and otherwise replaces the file on the destination drive. You can even set it to delete content on the destination that has been removed from the source, giving you an exact replica.
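
For reference, the kind of rsync call that sits behind such a mirror job can be sketched as below. This is an assumption about a typical setup, not what luckybackup literally runs; mirror_preview is a made-up name, and the sketch only previews the changes because of --dry-run.

```shell
# Hypothetical helper: show what an exact mirror of SRC onto DST would change.
# -a                 archive mode (recursive, keeps times, permissions, links)
# --delete           remove files from DST that no longer exist in SRC
# --dry-run          only report; nothing is copied or deleted
# --itemize-changes  print one line per file that would be changed
mirror_preview() {
    rsync -a --delete --dry-run --itemize-changes "$1"/ "$2"/
}

# Example call (both paths are placeholders):
# mirror_preview /media/usb-drive/data /media/hdd-8tb/data
```

Dropping --dry-run would perform the mirror for real, including the deletions, so it is worth reading the preview first.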
Re: Comparing HD Drive contents?


Post by fehlix » Tue Sep 25, 2018 9:05 am

jbloggs wrote:
Tue Sep 25, 2018 12:36 am
How can I check that specific files on the USB HDD are also on the 8tb drive when they may have been copied to a different directory/folder.
This question implies different goals:
To find specific files by name, you can use the file search from the Whisker menu, or right-click within the Thunar file manager to search for files.
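
On the command line, a name search could look roughly like this. find_by_name is a made-up helper name, and the path and pattern in the example call are placeholders.

```shell
# Hypothetical helper: search a directory tree for files matching a name pattern.
# -type f restricts the search to regular files,
# -iname matches the file name case-insensitively.
find_by_name() {
    find "$1" -type f -iname "$2"
}

# Example call (path and pattern are placeholders):
# find_by_name /media/hdd-8tb 'holiday*.jpg'
```

This only matches on names, so it will miss a file that was renamed after copying.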

If your question is more about how to make sure that all files from the USB drive are also available on the HDD, given that identical files may reside in different directories, I would use the following strategy. There are certainly other ways to accomplish this.

* Create a new directory in the root of the HDD, e.g. USB-data.
* Copy all files from the USB drive into this newly created USB-data directory on the HDD, e.g. by using Thunar.
* In the next step, remove all duplicate files found on the HDD, including the just-copied files within the USB-data directory. fdupes compares files by size and MD5 signature, followed by a byte-by-byte comparison, and the oldest copy is kept.
This way you make sure that all files are on the HDD, with no duplicates.
1st: Install fdupes, e.g. with

Code:

sudo apt-get install fdupes
2nd: Remove duplicates, keeping the old ones.
To quickly remove all newer identical (duplicate) files on the HDD and keep only the oldest one, without being prompted for confirmation:

Code:

fdupes --noprompt --recurse --order=time --delete Path-to-HDD-drive
# or, in short form:
fdupes -Nrd -o time Path-to-HDD-drive
The files left in the USB-data directory after the duplicates have been removed
are the ones you had not yet copied from USB to HDD.
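
If you would rather check first without copying or deleting anything, the same idea can be sketched with checksums alone. This is a rough sketch under a few assumptions: missing_on_target is a made-up name, it uses sha256sum from GNU coreutils rather than the MD5 signatures fdupes uses, and it assumes file names contain no newline or backslash characters.

```shell
# Hypothetical helper: print files under SRC whose content is not found
# anywhere under DST, regardless of file name or directory.
missing_on_target() {
    src=$1
    dst=$2
    hashes=$(mktemp)
    # Hash every regular file on the target; keep only the 64-character checksum.
    find "$dst" -type f -exec sha256sum {} + | cut -c1-64 | sort -u > "$hashes"
    # Hash every file on the source; print the path when its checksum is unknown.
    # The path starts at column 67 (64 hex digits plus a two-character separator).
    find "$src" -type f -exec sha256sum {} + |
        awk -v list="$hashes" '
            BEGIN { while ((getline h < list) > 0) seen[h] = 1 }
            !(substr($0, 1, 64) in seen) { print substr($0, 67) }
        '
    rm -f "$hashes"
}

# Example call (both paths are placeholders):
# missing_on_target /media/usb-drive /media/hdd-8tb
```

Every file on both drives gets read once, so on an 8TB drive this will take a while, but nothing is modified.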

You can read more here:

Code:

FDUPES(1)                          General Commands Manual                          FDUPES(1)

       fdupes - finds duplicate files in a given set of directories

       fdupes [ options ] DIRECTORY ...

       Searches  the  given  path for duplicate files. Such files are found by comparing file
       sizes and MD5 signatures, followed by a byte-by-byte comparison.

       -r --recurse
              for every directory given follow subdirectories encountered within

       -R --recurse:
              for each directory given after this option  follow  subdirectories  encountered
              within  (note  the ':' at the end of option; see the Examples section below for
              further explanation)

       -s --symlinks
              follow symlinked directories

       -H --hardlinks
              normally, when two or more files point to the same disk area they  are  treated
              as non-duplicates; this option will change this behavior

       -n --noempty
              exclude zero-length files from consideration

       -f --omitfirst
              omit the first file in each set of matches

       -A --nohidden
              exclude hidden files from consideration

       -1 --sameline
              list each set of matches on a single line

       -S --size
              show size of duplicate files

       -m --summarize
              summarize duplicate files information

       -q --quiet
              hide progress indicator

       -d --delete
              prompt user for files to preserve, deleting all others (see CAVEATS below)

       -N --noprompt
              when used together with --delete, preserve the first file in each set of dupli‐
              cates and delete the others without prompting the user

       -I --immediate
              delete duplicates as they are encountered, without grouping into sets;  implies
              --noprompt

       -p --permissions
              don't  consider  files  with different owner/group or permission bits as duplicates

       -o --order=WORD
              order files according to WORD: time - sort by mtime, name - sort by filename

       -i --reverse
              reverse order while sorting

       -v --version
              display fdupes version

       -h --help
              displays help


       Unless -1 or --sameline is specified, duplicate files are listed together  in  groups,
       each  file displayed on a separate line. The groups are then separated from each other
       by blank lines.

       When -1 or --sameline is specified, spaces and backslash characters  (\) appearing  in
       a filename are preceded by a backslash character.

       fdupes a --recurse: b
              will follow subdirectories under b, but not those under a.

       fdupes a --recurse b
              will follow subdirectories under both a and b.

       If fdupes returns with an error message such as fdupes: error invoking md5sum it means
       the program has been compiled to use an external program to calculate  MD5  signatures
       (otherwise, fdupes uses internal routines for this purpose), and an error has occurred
       while attempting to execute it. If this is the case, the specified program  should  be
       properly installed prior to running fdupes.

       When  using  -d  or  --delete,  care should be taken to insure against accidental data
       loss.

       When used together with options -s or --symlink, a user could accidentally preserve  a
       symlink while deleting the file it points to.

       Furthermore,  when  specifying a particular directory more than once, all files within
       that directory will be listed as their own duplicates, leading to data loss  should  a
       user preserve a file without its "duplicate" (the file itself!).

       Adrian Lopez <adrian2@caribe.net>
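
Given those caveats, it may be worth listing the duplicate sets before letting fdupes delete anything. A minimal sketch, using only options from the man page above; list_duplicates is a made-up helper name and the path in the example call is a placeholder.

```shell
# Hypothetical helper: list duplicate sets under a directory without deleting.
# -r recurses into subdirectories, -S shows the size of the duplicates,
# -o time orders the files in each set by modification time.
list_duplicates() {
    fdupes -r -S -o time "$1"
}

# Example call (path is a placeholder):
# list_duplicates /media/hdd-8tb
```

With --noprompt the first file in each set is the one preserved, so this listing shows which copy would survive the delete run.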


Re: Comparing HD Drive contents?


Post by gonzo01 » Wed Sep 26, 2018 7:18 pm

thanks guys, much appreciated.

fehlix, just what I needed. Will give instructions a go asap.

m_pav - very useful for future - wasn't aware of meld. Looks good.

