Duplicate Files

Martin · 3 years

A week ago, my phone's storage maxed out, and I needed to offload my photos and videos to free up space.

Actually, scratch that - my photos and videos were already backed up via OneDrive, but somewhere along the line, the app lost the ability to automatically delete media that's already been uploaded. I wanted to make sure I had everything backed up before deleting it all from my phone, so I had to find a solution to compare the phone's storage to my OneDrive copy. I have a Samsung Galaxy S10e, and it lets you plug a USB cable in and browse the files just like an external hard drive.

After some research, I settled on a program called Beyond Compare. The interface was easy enough to figure out, and it quickly churned through all of my media. The demo also let me use all the features without paying. After an hour or two of working through my pictures, I was done! Beyond Compare aligns duplicate (or similar) files by name, so it was simple to find the discrepancies. I was able to free up over 30GB of space!
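For anyone curious what "aligning by name" boils down to, here is a rough Python sketch of the idea. It is not how Beyond Compare works internally; it only lists files on the phone side that have no same-named file on the backup side. The function names and the case-insensitive, name-only matching are my own assumptions for illustration.

```python
from pathlib import Path


def file_names(root: Path) -> set[str]:
    """Collect the lowercase file name of every file under root (folders ignored)."""
    return {p.name.lower() for p in root.rglob("*") if p.is_file()}


def missing_from_backup(phone_root: Path, backup_root: Path) -> list[str]:
    """Return names found under phone_root with no same-named file under backup_root.

    Assumption: two files count as "the same" if their names match,
    ignoring case and ignoring which subfolder they live in.
    """
    return sorted(file_names(phone_root) - file_names(backup_root))
```

Calling `missing_from_backup(Path("E:/DCIM"), Path("C:/Users/me/OneDrive/Pictures"))` (both paths made up) would give the list of photos that still need backing up, as long as the names really do match on both sides.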

I purchased a copy of Beyond Compare after I finished, since I felt the tool might come in handy in the future and I appreciated what it had done for me.

Yesterday, my wife had a similar issue, but with slightly different circumstances. She needed to upgrade her phone, which was also almost out of space, and wanted to make sure everything was backed up before swapping over to the new device. I've got her on OneDrive as well, so the backup has been happening, but she has an iPhone and, weirdly, the files on the phone were not named the same as the files backed up in the cloud.

I loaded up Beyond Compare, thinking I might solve the problem as easily as my own, but I forgot that without the file names being the same there was no practical way for the software to show duplicates side-by-side.

I spent the next hour writing a PowerShell script that would go through all the images from the phone and rename them to match the files on OneDrive, whose names appeared to be just the date the photo was taken, in this format: "YearMonthDay_HourMinuteSecond_iOS.jpg". Once I got the script working, however, I realized two things: first, the hours were off by 8 - an easy fix - and second, the "date taken" property in Windows Explorer did not include seconds, so I couldn't recreate the file names exactly as they were on OneDrive. So Beyond Compare would, sadly, be useless here.
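The naming pattern itself is easy to reproduce once you have a timestamp with seconds. Below is a small Python sketch (not my original PowerShell script) that builds a name in that pattern. It assumes the timestamp arrives as text in the EXIF style "YYYY:MM:DD HH:MM:SS", which does carry seconds, and that the pattern means a four-digit year, two-digit fields, and a 24-hour clock. Reading the EXIF value out of the photo needs a separate library and is not shown. The function name and the `hour_offset` parameter are made up for illustration, and whether the 8-hour correction should be added or subtracted is a guess you would need to check against real files.

```python
from datetime import datetime, timedelta


def onedrive_style_name(exif_timestamp: str, hour_offset: int = 0) -> str:
    """Build a 'YearMonthDay_HourMinuteSecond_iOS.jpg' style name.

    exif_timestamp: capture time as text, e.g. "2020:03:14 09:26:53".
    hour_offset:    hours to add (negative to subtract) to line up with
                    the names in the backup; the sign is an assumption.
    """
    taken = datetime.strptime(exif_timestamp, "%Y:%m:%d %H:%M:%S")
    shifted = taken + timedelta(hours=hour_offset)
    return shifted.strftime("%Y%m%d_%H%M%S") + "_iOS.jpg"


print(onedrive_style_name("2020:03:14 09:26:53", hour_offset=8))
# prints 20200314_172653_iOS.jpg
```

I have not tested whether names generated this way match the OneDrive copies exactly, so treat it as a starting point.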

After spending a bit more time looking at other duplicate file checkers, I tried out dupeGuru. It took a long time to analyze all the photos from the phone and the OneDrive backup - about an hour to get through 18,000 photos. Once that was done, however, it was pretty easy to sort out which files we wanted to keep and which were duplicates.

In addition to the 3,000 duplicates it found between the phone and the OneDrive storage, it also found 2,000 duplicates within OneDrive - a happy surprise! It took me a few hours to go through everything, but it wasn't too difficult - just a little tedious (and with a few random hiccups, probably because OneDrive didn't like the software trying to delete hundreds of items at once).

We still have to get all the media she has in her iMessages out somehow, but that shouldn't be too hard.

In the meantime, I'm wondering why Windows (or OneDrive) doesn't have built-in tools for this sort of thing. We're almost all digital hoarders in some capacity, with the inevitable duplicate file here and there, and Windows has tools for comparing and hashing files included in the Command Prompt/PowerShell. It seems like an easy, obvious thing to have.
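To show how little is needed for the basic case, here is a minimal Python sketch of a hash-based duplicate finder. This is not how dupeGuru does it, and it only catches byte-for-byte identical files, so a photo that was resized or re-encoded during upload would slip through. The function names are my own; SHA-256 and the size-first shortcut are just one reasonable choice.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(*roots: Path) -> list[list[Path]]:
    """Group byte-identical files found under any of the given folders."""
    # Step 1: bucket by size, since files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for root in roots:
        for path in root.rglob("*"):
            if path.is_file():
                by_size[path.stat().st_size].append(path)

    # Step 2: hash only the files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[file_digest(path)].append(path)

    # Keep only groups that contain more than one file.
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Passing both the phone folder and the OneDrive folder to `find_duplicates` returns groups of matching files regardless of what they are named, which is exactly the case that name-based comparison could not handle.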